Consul plugin doesn't work since 1.6.4 with prometheus #5522

Closed
ekbfh opened this issue Mar 4, 2019 · 8 comments
Labels: area/prometheus, bug (unexpected problem or unintended behavior)

@ekbfh

ekbfh commented Mar 4, 2019

System info:

RHEL OS 3.10.0-862.11.6.el7.x86_64

Bug:

Telegraf 1.6.4 works well; any later Telegraf version (1.7.0+) breaks Prometheus scraping of Telegraf.

My case overview:

Custom processes register in Consul with their unique tags (Tags: noc, 0193068a-3ddb-44e7-81c1-0528d28419e0).
default

Steps to reproduce:

On 1.6.4 the output looks normal and doesn't break Prometheus:

consul_health_checks,check_id=service:0193068a-3ddb-44e7-81c1-0528d28419e0,host=sstage-db,node=sstage-db,service_name=discovery-default check_name="Service 'discovery-default' check",service_id="0193068a-3ddb-44e7-81c1-0528d28419e0",status="passing",passing=1i,critical=0i,warning=0i 1551639934000000000

On 1.9.5 it looks like this:

> consul_health_checks,0193068a-3ddb-44e7-81c1-0528d28419e0=0193068a-3ddb-44e7-81c1-0528d28419e0,check_id=service:0193068a-3ddb-44e7-81c1-0528d28419e0,host=sstage-db,noc=noc,node=sstage-db,service_name=discovery-default check_name="Service 'discovery-default' check",critical=0i,passing=1i,service_id="0193068a-3ddb-44e7-81c1-0528d28419e0",status="passing",warning=0i 1551640092000000000

With Prometheus error:

msg="append failed" err="expected label name, got \"EQUAL\""

So I get additional tags like:

0193068a-3ddb-44e7-81c1-0528d28419e0=0193068a-3ddb-44e7-81c1-0528d28419e0
noc=noc
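The extra tags can also be found mechanically by comparing the tag keys of the two outputs. The following is a minimal sketch, not part of Telegraf: `tag_keys` is a hypothetical helper, the field sets are shortened for brevity, and the parsing assumes no escaped commas, spaces, or equals signs in the tag set (which holds for the two lines quoted in this report).

```python
def tag_keys(line: str) -> set:
    """Return the tag keys of one line-protocol line.

    Deliberately naive: assumes the measurement name and tag set contain
    no escaped commas, spaces, or equals signs.
    """
    line = line.lstrip("> ")       # drop the "> " prefix seen in the 1.9.5 output
    head = line.split(" ", 1)[0]   # "measurement,tag=value,..." before the first space
    pairs = head.split(",")[1:]    # skip the measurement name
    return {pair.split("=", 1)[0] for pair in pairs}


# Tag sets copied from the report; field sets shortened to a single field.
old = ("consul_health_checks,"
       "check_id=service:0193068a-3ddb-44e7-81c1-0528d28419e0,"
       "host=sstage-db,node=sstage-db,service_name=discovery-default "
       "passing=1i 1551639934000000000")
new = ("> consul_health_checks,"
       "0193068a-3ddb-44e7-81c1-0528d28419e0=0193068a-3ddb-44e7-81c1-0528d28419e0,"
       "check_id=service:0193068a-3ddb-44e7-81c1-0528d28419e0,"
       "host=sstage-db,noc=noc,node=sstage-db,service_name=discovery-default "
       "passing=1i 1551640092000000000")

# Tag keys present in the 1.9.5 output but not in the 1.6.4 output.
extra = sorted(tag_keys(new) - tag_keys(old))
print(extra)
```

The difference is exactly the two Consul service tags, each used as both tag key and tag value.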

Additional info:

I also found this discussion:
#4155 (comment)

@danielnelson
Contributor

Try using tagexclude to remove the unwanted tags as a workaround:

[[inputs.consul]]
  tagexclude = ["[0-9]*", "noc"]

The situation is slightly different in 1.10, where Telegraf itself reports an error:

2019-03-04T20:22:09Z E! Error creating prometheus metric, key: consul_health_checks_warning, labels: [discovery-default 0193068a-3ddb-44e7-81c1-0528d28419e0 service:0193068a-3ddb-44e7-81c1-0528d28419e0 sstage-db noc sstage-db],
err: "0193068a_3ddb_44e7_81c1_0528d28419e0" is not a valid label name for metric "consul_health_checks_warning"

We need to ensure that metric names and labels conform to the naming requirements:

Label names may contain ASCII letters, numbers, as well as underscores. They must match the regex [a-zA-Z_][a-zA-Z0-9_]*. Label names beginning with __ are reserved for internal use.
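To see why the UUID tag is rejected even after its hyphens become underscores, the quoted regex can be checked directly. This is a minimal sketch: `is_valid_label_name` and `replace_invalid_chars` are hypothetical helpers, and the replacement step only mimics what the error message suggests happened to the hyphens; it is not Telegraf's actual sanitizer.

```python
import re

# Regex quoted from the Prometheus naming requirements above.
LABEL_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_valid_label_name(name: str) -> bool:
    """True if the whole string matches the label-name regex."""
    return LABEL_NAME.fullmatch(name) is not None


def replace_invalid_chars(name: str) -> str:
    """Hypothetical helper: replace every character outside [a-zA-Z0-9_] with "_"."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", name)


for tag_key in ["noc", "service_name", "0193068a-3ddb-44e7-81c1-0528d28419e0", ""]:
    cleaned = replace_invalid_chars(tag_key)
    print(repr(cleaned), is_valid_label_name(cleaned))
```

Under this check, `noc` and `service_name` pass, while the UUID still fails because its first character is a digit, and the empty string fails because the regex requires at least one character.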

@ekbfh
Author

ekbfh commented Mar 6, 2019

> Try using tagexclude to remove the unwanted tags as a workaround:

On 1.10.0 I get: err: "" is not a valid label name for metric "consul_health_checks_critical"

And on 1.9.5 the workaround doesn't work.

@danielnelson
Contributor

danielnelson commented Mar 6, 2019

Try adding the empty string to the list of tagexclude:

[[inputs.consul]]
  tagexclude = ["[0-9]*", "noc", ""]
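As a rough illustration of what this tagexclude list drops, the patterns can be read as shell-style globs matched against tag keys. The sketch below uses Python's `fnmatchcase` as a stand-in; Telegraf's own glob implementation may differ in details, and `drop_excluded` is a hypothetical helper, not a Telegraf function.

```python
from fnmatch import fnmatchcase

# Patterns from the workaround above.
TAGEXCLUDE = ["[0-9]*", "noc", ""]


def drop_excluded(tags: dict, patterns: list) -> dict:
    """Return a copy of tags without the keys that match any glob pattern."""
    return {
        key: value
        for key, value in tags.items()
        if not any(fnmatchcase(key, pattern) for pattern in patterns)
    }


tags = {
    "0193068a-3ddb-44e7-81c1-0528d28419e0": "0193068a-3ddb-44e7-81c1-0528d28419e0",
    "noc": "noc",
    "": "",
    "host": "sstage-db",
    "service_name": "discovery-default",
}
print(drop_excluded(tags, TAGEXCLUDE))
```

Under this reading, only `host` and `service_name` survive. Note that `"[0-9]*"` matches keys that start with a digit, so a UUID that starts with a hex letter (for example `c5a08358-83ce-4709-af9c-aa517dbd0bcf`) would not be matched and might need an additional pattern.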

@ekbfh
Author

ekbfh commented Mar 10, 2019

> Try adding the empty string to the list of tagexclude:

Yes, that works. But only as a workaround :)

@glinton
Contributor

glinton commented Mar 13, 2019

@ekbfh Please try the following build. If you still have issues, please provide the relevant sections from your telegraf.conf and provide actual steps to reproduce this issue, thanks.

@ekbfh
Author

ekbfh commented Mar 16, 2019

@glinton Hello!
As far as I can see, you just added a workaround for empty tags, but not for tags whose label name equals the tag value:

0193068a-3ddb-44e7-81c1-0528d28419e0=0193068a-3ddb-44e7-81c1-0528d28419e0
noc=noc

This behavior was added in #4155, and I would like to be able to disable it.
Your build works and there are no empty tags, but if I delete tagexclude, I get:

telegraf[3933]: err: "84345916_32e6_40b9_8d3e_763288067b7b" is not a valid label name for metric "consul_health_checks_critical"
telegraf[3933]: 2019-03-16T22:33:49Z E! Error creating prometheus metric, key: consul_health_checks_critical, labels: [          c5a08358-83ce-4709-af9c-aa517dbd0bcf service:c5a08358-83ce-4709-af9c-aa517dbd0bcf      sstage-db            Service 'ping-default' check c5a08358-83ce-4709-af9c-aa517dbd0bcf passing                               noc                             ping-default                                          sstage-wrk                                ],
telegraf[3933]: err: "84345916_32e6_40b9_8d3e_763288067b7b" is not a valid label name for metric "consul_health_checks_critical"
telegraf[3933]: 2019-03-16T22:33:49Z E! Error creating prometheus metric, key: consul_health_checks_critical, labels: [           service:16d758cc-534f-4c6f-b110-3151c2932c46      sstage-db            Service 'card' check 16d758cc-534f-4c6f-b110-3151c2932c46 passing                               noc 16d758cc-534f-4c6f-b110-3151c2932c46                            card                                          sstage-db                                ],
telegraf[3933]: err: "84345916_32e6_40b9_8d3e_763288067b7b" is not a valid label name for metric "consul_health_checks_critical"
telegraf[3933]: 2019-03-16T22:33:49Z E! Error creating prometheus metric, key: consul_health_checks_critical, labels: [           service:42d69010-bc02-4a77-a2d2-391d5f27e3e0      sstage-db  42d69010-bc02-4a77-a2d2-391d5f27e3e0          Service 'discovery-TMSK' check 42d69010-bc02-4a77-a2d2-391d5f27e3e0 passing                               noc                             discovery-TMSK                                          sstage-db                                ],
telegraf[3933]: err: "84345916_32e6_40b9_8d3e_763288067b7b" is not a valid label name for metric "consul_health_checks_critical"
telegraf[3933]: 2019-03-16T22:33:49Z E! Error creating prometheus metric, key: consul_health_checks_critical, labels: [           service:12c31da0-f0b7-9178-10b2-cbeab0971248      sstage-db            Service 'correlator-KMRV' check 12c31da0-f0b7-9178-10b2-cbeab0971248 passing                               noc                             correlator-KMRV                                          sstage-db              12c31da0-f0b7-9178-10b2-cbeab0971248                  ],
telegraf[3933]: err: "84345916_32e6_40b9_8d3e_763288067b7b" is not a valid label name for metric "consul_health_checks_critical"

@danielnelson
Contributor

We will merge the consul empty tag fix for 1.10.1, but push this over to 1.10.2 for the prometheus fix.

@danielmotaleite

danielmotaleite commented Aug 21, 2019

For anyone arriving here from Google: if you hit this issue, you may want to set tag_delimiter in inputs.consul instead of using tagexclude, especially if your tags use the format label:value (e.g. environment:live).

Example:

[[inputs.consul]]
 tag_delimiter = ":"
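The effect of the delimiter on tag mapping can be sketched as follows. This is an illustration of the behavior described in this thread, not Telegraf's source: `consul_tags_to_telegraf` is a hypothetical function, and it assumes that a tag without the delimiter is still used as both key and value.

```python
def consul_tags_to_telegraf(service_tags, tag_delimiter=""):
    """Map Consul service tags to Telegraf tag key/value pairs (illustrative only)."""
    tags = {}
    for service_tag in service_tags:
        if tag_delimiter and tag_delimiter in service_tag:
            # "environment:live" becomes key "environment", value "live".
            key, value = service_tag.split(tag_delimiter, 1)
            tags[key] = value
        else:
            # Assumption: with no delimiter configured, or none present in
            # the tag, the tag is used as both key and value (as in "noc=noc").
            tags[service_tag] = service_tag
    return tags


without_delimiter = consul_tags_to_telegraf(["noc", "environment:live"])
with_delimiter = consul_tags_to_telegraf(["noc", "environment:live"], tag_delimiter=":")
print(without_delimiter)
print(with_delimiter)
```

Under these assumptions, the delimiter helps with tags of the form label:value, but a bare UUID tag would still end up as a label name that starts with a digit, so tagexclude may still be needed for those.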
