kubernetes-kafka - Kafka cluster as Kubernetes StatefulSet, plain manifests and config

	Commit message	Author	Age	Files	Lines
*	Upgrade to jmx-exporter 0.1.0	Staffan Olsson	2017-11-03	2	-2/+2
\|
*	Zookeeper metrics conf contributed by @yacut #61 [metrics-jmx-zookeeper]	Staffan Olsson	2017-11-03	1	-2/+21
\|
*	Adds directives from kafka's rules, now for pzoo too.	Staffan Olsson	2017-11-03	2	-1/+6
\| \| \| \|	But before this, how did the metrics container know which port to connect to?
*	Still not getting anything zookeeper-specific	Staffan Olsson	2017-11-03	1	-16/+1
\|
*	Gets you JVM metrics from zoo, lots and lots of it	Staffan Olsson	2017-11-03	1	-0/+1
\|
*	Uses JMX config from config map, so we can experiment	Staffan Olsson	2017-11-03	2	-1/+23
\|
*	Had 10 OOMKilled/hour with 100Mi so let's increase request,	Staffan Olsson	2017-11-03	2	-4/+4
\| \| \| \|	and with 150Mi limit I got zero restarts in 48 hours.
*	Uses prometheus/jmx_exporter parent-0.10 tag	Staffan Olsson	2017-11-03	2	-2/+2
\|
*	Reverse order of containers to benefit from "Defaulting container name to"	Staffan Olsson	2017-11-03	2	-36/+36
\|
*	Endpoint works again, with similar scrape times as Xmx=80m without Metaspace limit	Staffan Olsson	2017-11-03	2	-4/+4
*	Let's focus on the two numbers that seem to matter	Staffan Olsson	2017-11-03	1	-1/+0
\|
*	Don't touch Xss as it has <1MB defaults according to docs	Staffan Olsson	2017-11-03	2	-2/+0
\|
*	Adapts to Java 8+, but still guessing the numbers	Staffan Olsson	2017-11-03	2	-3/+2
\|
*	The 1s response time from kafka might be due to ...	Staffan Olsson	2017-11-03	1	-2/+4
\| \| \| \|	that unlike zoo pods it actually exposes interesting data
*	zoo is fast now, <0.02s compared to >1s for the others	Staffan Olsson	2017-11-03	1	-1/+3
\|
*	Same base image as kafka and latest exporter source	Staffan Olsson	2017-11-03	2	-2/+2
\|
*	CPU limit on metrics export won't actually save any cycles	Staffan Olsson	2017-11-03	2	-2/+0
\| \| \| \| \|	It'll just make the requests slower. Dreadfully slow on Minikube (>30s even when limit is increased to 100m).
*	Exposes /metrics endpoints for Prometheus scraping	Staffan Olsson	2017-11-03	2	-0/+46
\| \| \| \|	This reverts commit 22a314ac161d3d203881eaf4b1a44ea8bf028a27.
*	Runs Kafka 1.0.0 [v2.1.0]	Staffan Olsson	2017-11-01	2	-4/+4
\|
*	Shares debian with kafka-initutils -> faster first start	Staffan Olsson	2017-10-15	2	-4/+4
\|
*	- fix #65: wait for reply	Elmar Weber	2017-10-13	2	-2/+2
\|
*	./update-kafka-image.sh to 0.11.0.1	Staffan Olsson	2017-10-02	2	-4/+4
\|
*	We prefer Ready:False status instead of restarted pods,	Staffan Olsson	2017-08-05	2	-12/+0
\| \| \| \| \| \|	at least for now, as it allows exec into the pods to investigate. We've been having frequent restarts that are not due to OOMKilled (i.e. not #49). Now failed probes will lead to unready pods, which we can monitor for using #60.
*	Makes /metrics export opt-in (through addon branch coming up)	Staffan Olsson	2017-07-28	2	-46/+0
\|
*	Stops logs from growing when zookeeper is idle [config-init]	Staffan Olsson	2017-07-27	1	-0/+4
\|
*	Places the myid magic number where replicas are	Staffan Olsson	2017-07-27	2	-5/+5
\|
*	Employs the init script concept for zookeeper too, reducing duplication	Staffan Olsson	2017-07-26	3	-21/+39
\|
*	Default shell on Debian shows the same symptom ...	Staffan Olsson	2017-07-26	2	-2/+2
\| \| \| \| \| \| \| \|	of not forwarding signals as Alpine did. Kafka logs say nothing, and after 30s the container is terminated. With /bin/bash instead the log indicates shutdown behavior. This reverts commit c188f43cb8a252cd685a4944d35577ebc17a3668.
*	Tagged with the policy from https://github.com/solsson/dockerfiles/pull/11	Staffan Olsson	2017-07-26	2	-2/+2
\|
*	Clarifies a gotcha: to mount config with log4j.properties ...	Staffan Olsson	2017-07-26	2	-0/+4
\| \| \| \| \| \| \|	you must use /opt/kafka/config, due to how log4j.properties (sometimes tools- or connect-) are resolved by the ./bin scripts. See https://github.com/solsson/dockerfiles/pull/10
*	New build at commit 0314080	Staffan Olsson	2017-07-26	2	-2/+2
\|
*	New build with https://github.com/solsson/dockerfiles/pull/9	Staffan Olsson	2017-07-25	2	-2/+2
\|
*	Default shell on debian should forward signals properly	Staffan Olsson	2017-07-23	2	-2/+2
\|
*	solsson/kafka on debian restores installation path to /opt/kafka	Staffan Olsson	2017-07-23	2	-2/+2
\|
*	Upgrades to current https://github.com/solsson/dockerfiles/pull/5	Staffan Olsson	2017-07-23	2	-2/+2
\|
*	Fixes posix compatibility for probes	Staffan Olsson	2017-07-23	2	-4/+4
\|
*	Same startup as 51zoo	Staffan Olsson	2017-06-28	1	-2/+1
\|
*	Applies the limit to persistent zookeeper pods too. They seem more prone to restarts than 51zoo.	Staffan Olsson	2017-06-28	1	-0/+2
*	Upgrades to latest build from https://github.com/solsson/dockerfiles/pull/4, with plain logging>=INFO config	Staffan Olsson	2017-06-28	2	-2/+2
*	Limiting metrics' JVM to match resource limits. Still getting OOMKilled though, but maybe half as often.	Staffan Olsson	2017-06-28	1	-2/+3
*	Raises memory limit for metrics; got 10 OOMKilled per pod in the last 3 hours	Staffan Olsson	2017-06-27	2	-2/+2
\|
*	Reduces termination grace period for zookeeper because I fail to trigger termination by signal	Staffan Olsson	2017-06-27	2	-2/+2
*	Adds probes, but for Kafka I don't think it indicates readiness...	Staffan Olsson	2017-06-27	2	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	which might not matter because we no longer have a loadbalancing service. These probes won't catch all failure modes, but if they fail we're pretty sure the container is malfunctioning. I found some sources recommending ./bin/kafka-topics.sh for probes, but to me it looks risky to introduce a dependency on some other service for such things. One such source is https://github.com/kubernetes/charts/pull/144. The zookeeper probe is from https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/. An issue is that zookeeper's logs are quite verbose for every probe.
*	Reverts to default termination period, and uses bash for "shell form"...	Staffan Olsson	2017-06-27	2	-4/+4
\| \| \| \| \| \| \| \|	as Alpine's /bin/busybox (ash) does not forward signals, according to https://pracucci.com/graceful-shutdown-of-kubernetes-pods.html. The reason for the termination period change is that we haven't observed any termination behavior yet so we can't know how slow it might be.
*	Got quite repeatable OOMKilled on pzoo pods, so I figured it must be... [resource-limits]	Staffan Olsson	2017-06-27	2	-2/+2
\| \| \| \|	in metrics because neither zoo nor kafka has limits
*	s	Staffan Olsson	2017-06-27	1	-3/+3
\|
*	A monitoring-only pod uses 0m / ~32Mi resources	Staffan Olsson	2017-06-27	2	-5/+19
\|
*	Adds tentative resource requests, based on what idle pods use (though this includes monitoring)	Staffan Olsson	2017-06-27	2	-0/+8
*	Forks can tweak storage classes, but here we want setup to be simple... [zookeeper-data]	Staffan Olsson	2017-06-26	1	-3/+1
\| \| \| \| \| \|	and with the mix of PV and emptyDir there's no reason to make PVs faster than host disks. Use 10GB as it is the minimum for standard disks on GKE.
*	A cluster in three availability zones now gets one persistent zk each, and two that can move automatically at node failures [zookeeper-availability-zones]	Staffan Olsson	2017-06-26	4	-23/+15
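
Several commits above (2017-06-27, 2017-07-23, 2017-07-26) turn on whether the container's shell forwards termination signals to the Kafka or ZooKeeper process. As a rough illustration of what "forwarding" means, here is a minimal sketch in Python rather than shell. The function name run_forwarding is hypothetical, and this is not how the images in this repository start their processes; it only shows the behavior the commit messages say /bin/bash provided and the default shells did not.

```python
import signal
import subprocess


def run_forwarding(cmd):
    """Start cmd as a child process, relay SIGTERM and SIGINT to it,
    and return the child's exit code once it has stopped."""
    child = subprocess.Popen(cmd)

    def forward(signum, _frame):
        # Relay the signal so the child can run its own shutdown logic,
        # instead of the wrapper absorbing it and the child never noticing.
        child.send_signal(signum)

    signal.signal(signal.SIGTERM, forward)
    signal.signal(signal.SIGINT, forward)

    # A negative return code means the child was terminated by that signal.
    return child.wait()
```

A wrapper that does not relay the signal leaves the child running until the termination grace period expires and the container is killed, which matches the "after 30s the container is terminated" symptom described in the 2017-07-26 revert.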