hono constantly fails to publish messages to the kafka broker without being able to recover #3544
Comments
@calohmn would you mind taking a look? I believe this falls into your area of expertise :-)
I think the exceptions being checked in the [...] It would be better, though, to have an idea of what went wrong here (also because I haven't seen any cases yet where a Hono Kafka producer stopped working and the pod had to be restarted). @JeffreyThijs You also don't have any tracing data for this case? Have you set any Kafka producer config properties in your protocol adapter config?
Sorry for the late reply. Unfortunately, our logging stack was not in place when this problem occurred, so we don't have any tracing data from the incident. It might indeed be a good idea to also treat non-Kafka errors as fatal errors (or to explicitly list the errors that should not be handled as fatal), so that the adapter does not keep discarding messages because of an unrecoverable error.
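To illustrate the second option mentioned above (listing only the errors that should not be treated as fatal), here is a minimal sketch. It is not Hono's implementation: `FatalErrorPolicySketch`, `TransientSendError` and the contents of the `RECOVERABLE` set are hypothetical names chosen for illustration, and which real Kafka exceptions would belong on such a list is left open.

```java
import java.util.Set;

/** Hypothetical stand-in for a transient, retriable error; not a real Kafka class. */
class TransientSendError extends RuntimeException { }

final class FatalErrorPolicySketch {

    // Assumed allow-list: errors believed to be recoverable without creating a new producer.
    private static final Set<Class<? extends Throwable>> RECOVERABLE =
            Set.of(TransientSendError.class);

    /** Treats every error as fatal unless its type is explicitly listed as recoverable. */
    static boolean isFatal(final Throwable error) {
        return RECOVERABLE.stream().noneMatch(type -> type.isInstance(error));
    }
}
```

With this inverted check, an unexpected error type (for example an `IllegalStateException`) would lead to the producer being replaced instead of being reused indefinitely. The trade-off is that producers may be recreated more often than strictly necessary.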
@JeffreyThijs is this still an issue?
Hi,
We recently stumbled upon an occurrence where Hono suddenly started failing to publish messages to our Kafka broker. The broker was working perfectly fine when we discovered the issue, but we do not know for sure whether a small disturbance of the broker in the meantime might have caused it. Nevertheless, we would expect Hono to be able to recover from this, since dropping messages is really not desirable. After we restarted the pod, the adapter started working as before and the issue resolved itself.
Sadly, we did not save any logs of this issue and have not been able to reproduce it. However, when I looked into the code, I encountered the following, which might be the reason why the system was not able to recover:
The following condition determines whether the cached KafkaProducer will be closed or not:
hono/clients/kafka-common/src/main/java/org/eclipse/hono/client/kafka/producer/CachingKafkaProducerFactory.java
Line 210 in 02b0a91
This condition is determined by:
https://github.com/eclipse-hono/hono/blob/master/clients/kafka-common/src/main/java/org/eclipse/hono/client/kafka/producer/CachingKafkaProducerFactory.java#L259
So the cached KafkaProducer is only invalidated (which might be essential for recovering from failed publishes to the Kafka broker) if one of those errors is thrown, which raises some questions:
On the other hand, maybe I have tunnel vision and this is caused by something else?
Any comments are much appreciated!
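For readers unfamiliar with the pattern discussed above, the following is a minimal, self-contained sketch of a producer cache that only evicts and closes a producer when a send fails with an error from a fixed "fatal" list. It is an illustration under assumptions, not the code of `CachingKafkaProducerFactory`: `ProducerCacheSketch`, `FakeProducer`, `FencedError`, `TimeoutError` and the contents of the `FATAL` set are all hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-ins for client exceptions; not the real Kafka classes. */
class FencedError extends RuntimeException { }
class TimeoutError extends RuntimeException { }

/** Hypothetical stand-in for a Kafka producer; it only tracks whether it was closed. */
final class FakeProducer {
    private boolean closed;
    void close() { closed = true; }
    boolean isClosed() { return closed; }
}

final class ProducerCacheSketch {

    // Assumed list of errors after which the cached producer is considered unusable.
    private static final Set<Class<? extends Throwable>> FATAL = Set.of(FencedError.class);

    private final Map<String, FakeProducer> cache = new ConcurrentHashMap<>();

    /** Returns the cached producer for the given name, creating it on first use. */
    FakeProducer getOrCreate(final String name) {
        return cache.computeIfAbsent(name, n -> new FakeProducer());
    }

    static boolean isFatal(final Throwable error) {
        return FATAL.stream().anyMatch(type -> type.isInstance(error));
    }

    /** Called when a send fails: evicts and closes the producer only for fatal errors. */
    boolean onSendFailure(final String name, final Throwable error) {
        if (!isFatal(error)) {
            return false; // producer stays cached and is reused for the next send
        }
        Optional.ofNullable(cache.remove(name)).ifPresent(FakeProducer::close);
        return true;
    }
}
```

In this sketch, a failure that is not on the list (here `TimeoutError`) leaves the same producer instance in the cache. If that instance were in a broken state for a reason the list does not cover, every later send would reuse it, which matches the concern raised in this issue.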