path: root/zookeeper
Commit message | Author | Age | Files | Lines
* Upgrade to jmx-exporter 0.1.0 | Staffan Olsson | 2017-11-03 | 2 | -2/+2
|
* Zookeeper metrics conf contributed by @yacut #61 [metrics-jmx-zookeeper] | Staffan Olsson | 2017-11-03 | 1 | -2/+21
|
* Adds directives from kafka's rules, now for pzoo too. | Staffan Olsson | 2017-11-03 | 2 | -1/+6
|   But before this, how did the metrics container know which port to connect to?
* Still not getting anything zookeeper-specific | Staffan Olsson | 2017-11-03 | 1 | -16/+1
|
* Gets you JVM metrics from zoo, lots and lots of it | Staffan Olsson | 2017-11-03 | 1 | -0/+1
|
* Uses JMX config from config map, so we can experiment | Staffan Olsson | 2017-11-03 | 2 | -1/+23
|
* Had 10 OOMKilled/hour with 100Mi so let's increase request, | Staffan Olsson | 2017-11-03 | 2 | -4/+4
|   and with 150Mi limit I got zero restarts in 48 hours.
* Uses prometheus/jmx_exporter parent-0.10 tag | Staffan Olsson | 2017-11-03 | 2 | -2/+2
|
* Reverse order of containers to benefit from "Defaulting container name to" | Staffan Olsson | 2017-11-03 | 2 | -36/+36
|
* Endpoint works again, with similar scrape times as Xmx=80m without Metaspace limit | Staffan Olsson | 2017-11-03 | 2 | -4/+4
* Let's focus on the two numbers that seem to matter | Staffan Olsson | 2017-11-03 | 1 | -1/+0
|
* Don't touch Xss as it has <1MB defaults according to docs | Staffan Olsson | 2017-11-03 | 2 | -2/+0
|
* Adapts to Java 8+, but still guessing the numbers | Staffan Olsson | 2017-11-03 | 2 | -3/+2
|
* The 1s response time from kafka might be due to ... | Staffan Olsson | 2017-11-03 | 1 | -2/+4
|   that unlike zoo pods it actually exposes interesting data
* zoo is fast now, <0.02s compared to >1s for the others | Staffan Olsson | 2017-11-03 | 1 | -1/+3
|
* Same base image as kafka and latest exporter source | Staffan Olsson | 2017-11-03 | 2 | -2/+2
|
* CPU limit on metrics export won't actually save any cycles | Staffan Olsson | 2017-11-03 | 2 | -2/+0
|   It'll just make the requests slower.
|   Dreadfully slow on Minikube (>30s even when limit is increased to 100m).
* Exposes /metrics endpoints for Prometheus scraping | Staffan Olsson | 2017-11-03 | 2 | -0/+46
|   This reverts commit 22a314ac161d3d203881eaf4b1a44ea8bf028a27.
* Runs Kafka 1.0.0 [v2.1.0] | Staffan Olsson | 2017-11-01 | 2 | -4/+4
|
* Shares debian with kafka-initutils -> faster first start | Staffan Olsson | 2017-10-15 | 2 | -4/+4
|
* - fix #65: wait for reply | Elmar Weber | 2017-10-13 | 2 | -2/+2
|
* ./update-kafka-image.sh to 0.11.0.1 | Staffan Olsson | 2017-10-02 | 2 | -4/+4
|
* We prefer Ready:False status instead of restarted pods, | Staffan Olsson | 2017-08-05 | 2 | -12/+0
|   at least for now, as it allows exec into the pods to investigate.
|   We've been having frequent restarts that are not due to OOMKilled (i.e. not #49).
|   Now failed probes will lead to unready pods, which we can monitor for using #60.
* Makes /metrics export opt-in (through addon branch coming up) | Staffan Olsson | 2017-07-28 | 2 | -46/+0
|
* Stops logs from growing when zookeeper is idle [config-init] | Staffan Olsson | 2017-07-27 | 1 | -0/+4
|
* Places the myid magic number where replicas are | Staffan Olsson | 2017-07-27 | 2 | -5/+5
|
* Employs the init script concept for zookeeper too, reducing duplication | Staffan Olsson | 2017-07-26 | 3 | -21/+39
|
* Default shell on Debian shows the same symptom ... | Staffan Olsson | 2017-07-26 | 2 | -2/+2
|   of not forwarding signals as Alpine did. Kafka logs say nothing,
|   and after 30s the container is terminated. With /bin/bash instead
|   the log indicates shutdown behavior.
|
|   This reverts commit c188f43cb8a252cd685a4944d35577ebc17a3668.
* Tagged with the policy from https://github.com/solsson/dockerfiles/pull/11 | Staffan Olsson | 2017-07-26 | 2 | -2/+2
|
* Clarifies a gotcha: to mount config with log4j.properties ... | Staffan Olsson | 2017-07-26 | 2 | -0/+4
|   you must use /opt/kafka/config, due to how log4j.properties
|   (sometimes tools- or connect-) are resolved by the ./bin scripts.
|
|   See https://github.com/solsson/dockerfiles/pull/10
* New build at commit 0314080 | Staffan Olsson | 2017-07-26 | 2 | -2/+2
|
* New build with https://github.com/solsson/dockerfiles/pull/9 | Staffan Olsson | 2017-07-25 | 2 | -2/+2
|
* Default shell on debian should forward signals properly | Staffan Olsson | 2017-07-23 | 2 | -2/+2
|
* solsson/kafka on debian restores installation path to /opt/kafka | Staffan Olsson | 2017-07-23 | 2 | -2/+2
|
* Upgrades to current https://github.com/solsson/dockerfiles/pull/5 | Staffan Olsson | 2017-07-23 | 2 | -2/+2
|
* Fixes posix compatibility for probes | Staffan Olsson | 2017-07-23 | 2 | -4/+4
|
* Same startup as 51zoo | Staffan Olsson | 2017-06-28 | 1 | -2/+1
|
* Applies the limit to persistent zookeeper pods too. They seem more prone to restarts than 51zoo. | Staffan Olsson | 2017-06-28 | 1 | -0/+2
* Upgrades to latest build from https://github.com/solsson/dockerfiles/pull/4, with plain logging>=INFO config | Staffan Olsson | 2017-06-28 | 2 | -2/+2
* Limiting metrics' JVM to match resource limits. Still getting OOMKilled though, but maybe half as often. | Staffan Olsson | 2017-06-28 | 1 | -2/+3
* Raises memory limit for metrics; got 10 OOMKilled per pod in the last 3 hours | Staffan Olsson | 2017-06-27 | 2 | -2/+2
|
* Reduces termination grace period for zookeeper because I fail to trigger termination by signal | Staffan Olsson | 2017-06-27 | 2 | -2/+2
* Adds probes, but for Kafka I don't think it indicates readiness... | Staffan Olsson | 2017-06-27 | 2 | -0/+24
|   which might not matter because we no longer have a loadbalancing service.
|
|   These probes won't catch all failure modes,
|   but if they fail we're pretty sure the container is malfunctioning.
|
|   I found some sources recommending ./bin/kafka-topics.sh for probes
|   but to me it looks risky to introduce a dependency to some other service for such things.
|   One such source is https://github.com/kubernetes/charts/pull/144
|
|   The zookeeper probe is from
|   https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
|   An issue is that zookeeper's logs are quite verbose for every probe.
* Reverts to default termination period, and uses bash for "shell form"... | Staffan Olsson | 2017-06-27 | 2 | -4/+4
|   as Alpine's /bin/busybox (ash) does not forward signals, according to
|   https://pracucci.com/graceful-shutdown-of-kubernetes-pods.html
|
|   The reason for the termination period change is that we haven't observed
|   any termination behavior yet so we can't know how slow it might be.
* Got quite repeatable OOMKilled on pzoo pods, so I figured it must be... [resource-limits] | Staffan Olsson | 2017-06-27 | 2 | -2/+2
|   in metrics because neither zoo nor kafka has limits
* s | Staffan Olsson | 2017-06-27 | 1 | -3/+3
|
* A monitoring-only pod uses 0m / ~32Mi resources | Staffan Olsson | 2017-06-27 | 2 | -5/+19
|
* Adds tentative resource requests, based on what idle pods use (though this includes monitoring) | Staffan Olsson | 2017-06-27 | 2 | -0/+8
* Forks can tweak storage classes, but here we want setup to be simple... [zookeeper-data] | Staffan Olsson | 2017-06-26 | 1 | -3/+1
|   and with the mix of PV and emptyDir there's no reason to make PVs faster than host disks.
|   Use 10GB as it is the minimum for standard disks on GKE.
* A cluster in three availability zones now gets one persistent zk each, and two that can move automatically at node failures [zookeeper-availability-zones] | Staffan Olsson | 2017-06-26 | 4 | -23/+15