Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support seed brokers' hostname with multiple addresses #877

Merged

Conversation

abicky
Copy link
Contributor

@abicky abicky commented Dec 13, 2020

By default, the Java implementation of a Kafka client can use each returned IP address from a hostname.
cf. https://kafka.apache.org/documentation/#client.dns.lookup

librdkafka also does:

If host resolves to multiple addresses librdkafka will
round-robin the addresses for each connection attempt.

cf. https:/edenhill/librdkafka/blob/v1.5.3/INTRODUCTION.md#brokers

Such a feature helps us to manage seed brokers' IP addresses easily, so this PR implements it.

@dasch
Copy link
Contributor

dasch commented Dec 30, 2020

This seems like a risky change, and I haven't seen other requests for this functionality. Would you consider using rdkafka-ruby instead? I'm wary of introducing risky changes to ruby-kafka unless there's broad demand.

@abicky
Copy link
Contributor Author

abicky commented Dec 30, 2020

What kind of risk do you think there is? I think this change makes ruby-kafka easier to use without any risk.
Not only other Kafka client implementations but the ruby driver of Cassandra also has a similar feature to specify the seed nodes.
cf. https:/datastax/ruby-driver/blob/v3.2.5/lib/cassandra.rb#L727-L741

@stale = false

return cluster_info
Resolv.getaddresses(node.hostname).select { |ip| ip =~ Resolv::IPv4::Regex }.shuffle.each do |ip|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about IPv6 addresses?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think currently ruby-kafka doesn't support IPv6 addresses because Kafka::BrokerUri.parse can't parse IPv6 addresses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that a hostname, a DNS record, could have a IPv6 address, and I didn’t think of such a case...

Copy link

@sodabrew sodabrew Feb 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to have this IPv4 regex at all? Why not simply:

Resolv.getaddresses(node.hostname).shuffle.each do |ip|

If an IPv6 address causes an exception further down the stack, better to solve that further down the stack, right? I don't actually follow what the issue would be with Kafka::BrokerUri.parse -- that's further up the stack and would be handling the text hostname before passing it down to this method.

If someone wanted to pass an IPv6 address in natively, the format should be like Kafka.new("kafka://[::1]:9092")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the code as follows:

(@resolve_seed_brokers ? Resolv.getaddresses(node.hostname).shuffle : [node.hostname]).each

I think currently ruby-kafka doesn't support IPv6 addresses because Kafka::BrokerUri.parse can't parse IPv6 addresses.

This is my misunderstanding. Although I don't remember well, I might have tried Kafka.new("kafka://::1:9092")...

Is there any reason to have this IPv4 regex at all?

At first, I thought it was meaningless to try IPv6 addresses because both IPv4 addresses and IPv6 addresses might belong to the same broker. However we cannot know whether they belong to the same broker or not, and the DNS record might have only IPv6 addresses.
I also checked the behavior of librdkafka and found that it used both IPv4 addresses and IPv6 addresses even if they belonged to the same broker.
cf. https:/edenhill/librdkafka/blob/v1.6.1/src/rdaddr.c#L162-L232

For the above reasons, I decided not to exclude IPv6 addresses.

@dasch
Copy link
Contributor

dasch commented Dec 30, 2020

If this new behavior can be made opt-in, i.e. disabled by default, then I'd be open to it.

@abicky
Copy link
Contributor Author

abicky commented Dec 30, 2020

OK, I'll think about another option or something like that.

@@ -7,6 +7,7 @@ jobs:
LOG_LEVEL: DEBUG
steps:
- checkout
- run: sudo apt-get update && sudo apt-get install -y cmake # For installing snappy
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this line relevant to the multiple address support?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line was needed to run the tests. I've removed it because the current master branch includes the change.

@sodabrew
Copy link

sodabrew commented Feb 10, 2021

I strongly agree with this PR! In my experience it is a common practice to use a single hostname that expands to multiple IP address records, and in this case (i.e. using Kafka in the first place because HA is important) it would be helpful to explicitly iterate over the list of records.

I think the PR itself has two issues: 1) unrelated change in circleci.yml and 2) IPv4 limitation for some reason, and would ask if @abicky would remove those lines if that would make this eligible to land the PR?

<soapbox>
The current behavior, if you pass in a hostname that resolves to multiple addresses, falls back on a lot of system-implicit-dependencies. Somewhere down the call chain, the hostname must be resolved to an IP address in order to make a TCP connection. The DNS resolver may randomize the return order of IPs such that different servers in the same network talk to different seed brokers, but what's twiddly is if one instance does the DNS lookup, uses the first address, and that address is dead... now what? If the client code (i.e. my code that uses Kafka) loops on, say, first N failures, and tries Kafka.new("server:9192") again, there's no way to know if you'll get the same dead address from cache, or get a fresh lookup with the same order of records, or get a fresh lookup with a randomized order of records and a chance at connecting to a live server. And if you did know that, you'd either have carefully configured your DNS server to handle this problem, or you'd have a lucky default setting and set a timebomb for your next OS upgrade if the defaults change.
</soapbox>

Thank you!

@wamaral
Copy link
Contributor

wamaral commented Apr 9, 2021

Looking forward to this, it matches our use case: we run a kubernetes cluster with the Kafka brokers exposed through a single DNS entry. Whenever one of the brokers go down (rare but can happen due to our k8s setup), there's a chance our services will try to reach the IP of the unavailable broker due to local DNS caching, and when it can't connect to that particular IP the service will just crash, thinking there are no brokers available

@abicky , are you still working on this? Let me know if I can be of help

@abicky abicky force-pushed the support-seed-broker-with-multiple-addresses branch 2 times, most recently from 14e9990 to 99fcabb Compare April 14, 2021 19:28
By default, the Java implementation of a Kafka client can use
each returned IP address from a hostname.
cf. https://kafka.apache.org/documentation/#client.dns.lookup

librdkafka also does:
> If host resolves to multiple addresses librdkafka will
> round-robin the addresses for each connection attempt.

cf. https:/edenhill/librdkafka/blob/v1.5.3/INTRODUCTION.md#brokers

Such a feature helps us to manage seed brokers' IP addresses
easily, so this commit implements it.
@abicky abicky force-pushed the support-seed-broker-with-multiple-addresses branch from 99fcabb to 2cf27e4 Compare April 14, 2021 19:54
@abicky
Copy link
Contributor Author

abicky commented Apr 14, 2021

@dasch I've added the option resolve_seed_brokers. Could you review the changes? Although some tests failed, I don't think they are related to these changes.

@sodabrew @wamaral Thank you for your comments and I'm sorry for my late response. I've pushed the latest code.

(@resolve_seed_brokers ? Resolv.getaddresses(node.hostname).shuffle : [node.hostname]).each do |hostname_or_ip|
node_info = node.to_s
node_info << " (#{hostname_or_ip})" if node.hostname != hostname_or_ip
@logger.info "Fetching cluster metadata from #{node_info}"
Copy link

@sodabrew sodabrew Apr 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@logger.info "Fetching cluster metadata from #{node_info}"
@logger.info "Fetching cluster metadata from #{node.hostname}#{node_info}"

This would show both the hostname and the resolved IP address that's being attempted.

Copy link
Contributor Author

@abicky abicky Apr 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

node.hostname is not necessary because node.to_s has the information:

node_info = node.to_s # e.g. "kafka://localhost:9092"
# e.g. "kafka://localhost:9092" => "kafka://localhost:9092 (127.0.0.1)"
node_info << " (#{hostname_or_ip})" if node.hostname != hostname_or_ip

Here is an example output:

/Users/arabiki/ghq/src/github.com/zendesk/ruby-kafka/lib/kafka/cluster.rb:454:in `fetch_cluster_info': Could not connect to any of the seed brokers: (Kafka::ConnectionError)
- kafka://localhost0:9092 (10.0.0.1): Operation timed out

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right on, that looks good already!

Copy link
Contributor

@dasch dasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 looks good!

Can you also add an entry to the changelog?

@abicky abicky force-pushed the support-seed-broker-with-multiple-addresses branch from a5ce0d7 to 4602e1f Compare April 16, 2021 14:05
@abicky abicky force-pushed the support-seed-broker-with-multiple-addresses branch from 4602e1f to ef7c21c Compare April 16, 2021 14:06
@abicky
Copy link
Contributor Author

abicky commented Apr 16, 2021

@dasch Thank you for reviewing the PR! I've added the entry to the changelog in the commit ef7c21c.

Copy link
Contributor

@dasch dasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@dasch dasch merged commit e766e2c into zendesk:master Apr 19, 2021
@abicky abicky deleted the support-seed-broker-with-multiple-addresses branch November 19, 2021 09:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants