Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support kafka 2.1, and misc improvements. #6

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

Fiveside
Copy link

@Fiveside Fiveside commented May 17, 2019

Greetings. This is a bit of a beefy pull request, so I'd like to outline the changes that we made and why, and hopefully we can discuss any of the issues in detail.

  • Upgraded the supported Kafka version to 2.1. This is the biggest change and the driver for most of the rest of the work. Kafka 2 no longer stores consumer group information in zookeeper. Instead it stores it in Kafka proper. With this, users no longer need to provide zookeeper info, as everything is queryable from Kafka itself. Right now its hardcoded to expect kafka 2.1, but that's really because that's the only version I tested it on (and coded against).

  • Swapped out the statsd library. The statsd format for tagged stats differs between influxdb and datadog. The tagging format used previously ended up sending garbage to datadog. The new statsd library properly serializes tagged stats for those 2 providers and the tagless fallback.

  • Removed the onbuild docker dependency. Onbuild images have been deprecated (see [Proposal] Deprecate onbuild tags docker-library/official-images#2076). Instead we're using a normal golang image and a multi-stage build to keep the image size and layer count down.

  • Exposed cli options as environment variables. This was specifically an issue because we launched this statsd daemon on kubernetes. In our kubernetes cluster, each worker has a datadog agent running locally. We wanted this kafka-statsd to talk to the local datadog agent, and you do this in kubernetes via an environment variable definition. This isn't possible when specifying parameters as command line options. After that fix it only made sense to expose the rest of the options as environment variables.

  • Recording absolute partition and consumer group offsets. In addition to the consumer lag, we're now also reporting partition and group offsets. It was useful for debugging purposes. In theory we don't actually need to record consumer group lag anymore since that can be calculated from the partition and group offsets, however that would be breaking backwards compatibility.

  • Kafka calls made concurrent. We need to make a lot of calls to kafka to pull stats information. We just made them concurrent to reduce the chance of falling behind the sampling frequency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants