Error in plugin [inputs.netstat] #3486

Closed
therealsputnik opened this issue Nov 17, 2017 · 10 comments · Fixed by #3513

Labels: bug (unexpected problem or unintended behavior), regression (something that used to work, but is now broken), upstream (bug or issues that rely on dependency fixes)

@therealsputnik

therealsputnik commented Nov 17, 2017

Bug report

Relevant telegraf.conf:

System info:

Telegraf v1.4.4
Ubuntu Server 17.10 (fresh installation)

Steps to reproduce:

  1. Install influxdb (V1.3.7)
  2. Install grafana (V4.6.2)
  3. Import grafana dashboard .JSON from here: https://grafana.com/dashboards/928
  4. Use the telegraf.conf from the same link as in step 3 (change the url to point to your InfluxDB instance).
  5. Restart telegraf.

Expected behavior:

All fields on the dashboard should populate with the relevant data collected by Telegraf and stored in the InfluxDB database named "telegraf".

Actual behavior:

All fields on the dashboard EXCEPT those relating to [inputs.netstat] populate fine.

Additional info:

telegraf log reports:

2017-11-17T14:17:56Z E! Error in plugin [inputs.netstat]: error getting net connections info: cound not get pid(s), 0: open /proc/1593/fd: no such file or directory
2017-11-17T14:18:06Z E! Error in plugin [inputs.netstat]: error getting net connections info: cound not get pid(s), 0: open /proc/1603/fd: no such file or directory
2017-11-17T14:18:31Z E! Error in plugin [inputs.netstat]: error getting net connections info: cound not get pid(s), 0: open /proc/1634/fd: no such file or directory
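
To see how often the plugin fails and which PIDs are involved, log lines like the three above can be tallied with a short script. The following is a rough Python sketch and not part of Telegraf: the regular expressions are inferred only from the sample lines in this report, and `summarize_errors` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines in this report (an assumption,
# not a documented Telegraf log format).
ERROR_LINE = re.compile(
    r"^(?P<ts>\S+) E! Error in plugin \[(?P<plugin>[^\]]+)\]: (?P<msg>.*)$"
)
# Matches the trailing "open /proc/<pid>/fd" failure when it is present.
MISSING_PID = re.compile(r"open /proc/(?P<pid>\d+)/fd: no such file or directory")


def summarize_errors(lines):
    """Count error lines per plugin and collect PIDs whose fd directory was missing."""
    per_plugin = Counter()
    missing_pids = []
    for line in lines:
        match = ERROR_LINE.match(line.strip())
        if match is None:
            continue  # not an error line in the expected shape
        per_plugin[match.group("plugin")] += 1
        pid = MISSING_PID.search(match.group("msg"))
        if pid is not None:
            missing_pids.append(int(pid.group("pid")))
    return per_plugin, missing_pids
```

In the three lines above the PID differs each time (1593, 1603, 1634), so the failures do not seem tied to one particular process.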

Config used:

# Global tags can be specified here in key="value" format.
[global_tags]
  # dc = "us-east-1" # will tag all metrics with dc=us-east-1
  # rack = "1a"
  ## Environment variables can be used as tags, and throughout the config file
  # user = "$USER"

# Configuration for telegraf agent
[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = false
  hostname = ""
  omit_hostname = false

### OUTPUT

# Configuration for influxdb server to send metrics to
[[outputs.influxdb]]
  urls = ["http://192.168.2.180:8086"]
  database = "telegraf"

  ## Retention policy to write to. Empty string writes to the default rp.
  retention_policy = ""
  ## Write consistency (clusters only), can be: "any", "one", "quorum", "all"
  write_consistency = "any"

  ## Write timeout (for the InfluxDB client), formatted as a string.
  ## If not provided, will default to 5s. 0s means no timeout (not recommended).
  timeout = "5s"
  # username = "telegraf"
  # password = "2bmpiIeSWd63a7ew"
  ## Set the user agent for HTTP POSTs (can be useful for log differentiation)
  # user_agent = "telegraf"
  ## Set UDP payload size, defaults to InfluxDB UDP Client default (512 bytes)
  # udp_payload = 512

# Read metrics about cpu usage
[[inputs.cpu]]
  ## Whether to report per-cpu stats or not
  percpu = true
  ## Whether to report total system cpu stats or not
  totalcpu = true
  ## Comment this line if you want the raw CPU time metrics
  fielddrop = ["time_*"]

# Read metrics about disk usage by mount point
[[inputs.disk]]
  ## By default, telegraf gather stats for all mountpoints.
  ## Setting mountpoints will restrict the stats to the specified mountpoints.
  # mount_points = ["/"]

  ## Ignore some mountpoints by filesystem type. For example (dev)tmpfs (usually
  ## present on /run, /var/run, /dev/shm or /dev).
  ignore_fs = ["tmpfs", "devtmpfs"]

# Read metrics about disk IO by device
[[inputs.diskio]]
  ## By default, telegraf will gather stats for all devices including
  ## disk partitions.
  ## Setting devices will restrict the stats to the specified devices.
  # devices = ["sda", "sdb"]
  ## Uncomment the following line if you need disk serial numbers.
  # skip_serial_number = false

# Get kernel statistics from /proc/stat
[[inputs.kernel]]
  # no configuration

# Read metrics about memory usage
[[inputs.mem]]
  # no configuration

# Get the number of processes and group them by status
[[inputs.processes]]
  # no configuration

# Read metrics about swap memory usage
[[inputs.swap]]
  # no configuration

# Read metrics about system load & uptime
[[inputs.system]]
  # no configuration

# Read metrics about network interface usage
[[inputs.net]]
  # collect data only about specific interfaces
  # interfaces = ["eth0"]

[[inputs.netstat]]
  # no configuration

[[inputs.interrupts]]
  # no configuration

[[inputs.linux_sysctl_fs]]
  # no configuration


@danielnelson
Contributor

I think this may happen if a process exits during the collection. Is it ever able to collect the metrics successfully?

I opened this pull request which will skip over processes that have exited: shirou/gopsutil#458
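
The idea of skipping over exited processes can be illustrated as follows. This is only a Python sketch of the general technique; the actual change lives in gopsutil, which is written in Go, and may look quite different. `open_fd_counts` and its `proc_root` parameter are hypothetical names chosen for the example.

```python
import os


def open_fd_counts(proc_root="/proc"):
    """Return {pid: number of open fds}, skipping processes that vanish mid-scan."""
    counts = {}
    for entry in os.listdir(proc_root):
        if not entry.isdecimal():
            continue  # only numeric directory names are process IDs
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            counts[int(entry)] = len(os.listdir(fd_dir))
        except (FileNotFoundError, ProcessLookupError):
            # The process exited between listing proc_root and opening its
            # fd directory; skip it instead of failing the whole collection.
            continue
        except PermissionError:
            # Without sufficient privileges, other users' fd directories
            # cannot be read; skip those as well.
            continue
    return counts
```

The point of the design is that one short-lived process no longer aborts the scan: the error is handled per PID, and the remaining processes are still counted.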

@danielnelson added the bug, regression, and upstream labels on Nov 17, 2017
@therealsputnik
Author

"I think this may happen if a process exits during the collection. Is it ever able to collect the metrics successfully?"

No, I don't believe so. The series/measurements that don't show up on the graphs (TcpExtTCPAbortOnClose, TcpExtSyncookiesFailed, gather_errors, etc.) are never written to InfluxDB.
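
One way to confirm what actually reached the database is to ask InfluxDB which measurements exist. The sketch below uses the InfluxDB 1.x HTTP `/query` endpoint with `SHOW MEASUREMENTS`. It assumes no authentication is enabled, and the helper names are hypothetical. Which measurement names to expect depends on the enabled plugins; `netstat` in the usage note is an assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def show_measurements_url(base_url, database):
    """Build an InfluxDB 1.x /query URL that lists the measurements in a database."""
    query = urlencode({"db": database, "q": "SHOW MEASUREMENTS"})
    return f"{base_url.rstrip('/')}/query?{query}"


def measurement_names(response):
    """Extract measurement names from a decoded /query JSON response."""
    names = []
    for result in response.get("results", []):
        # A result without a "series" key means the statement returned no rows.
        for series in result.get("series", []):
            names.extend(row[0] for row in series.get("values", []))
    return names


def missing_measurements(base_url, database, expected):
    """Return the expected measurement names that the database does not contain."""
    with urlopen(show_measurements_url(base_url, database), timeout=5) as reply:
        present = set(measurement_names(json.load(reply)))
    return [name for name in expected if name not in present]
```

For the setup in this report, a call such as `missing_measurements("http://192.168.2.180:8086", "telegraf", ["netstat"])` would return the names that are absent, provided the server is reachable from where the script runs.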

@pawal

pawal commented Nov 19, 2017

I get the same issue now:

2017-11-19T13:53:50Z E! Error in plugin [inputs.netstat]: error getting net connections info: cound not get pid(s), 0

Using Telegraf v1.5.0~112955a9 (git: master 112955a)

(I am new to Telegraf)

@danielnelson
Contributor

@pawal Are you able to compile with gopsutil 384a55110aa5ae052eb93ea94940548c1e305a99 and check if the error remains?

@Suvitruf

Suvitruf commented Nov 22, 2017

Same problem with Telegraf v1.4.4 (git: release-1.4 ddcb931) on Ubuntu 16.04.3 LTS

@damm

damm commented Nov 23, 2017

After downgrading to 1.4.3, the error is gone.

@Suvitruf

Suvitruf commented Nov 24, 2017

Hm, interesting. We have different versions of Ubuntu on our servers (Ubuntu 16.04.1 LTS, Ubuntu 16.04.2 LTS, Ubuntu 16.04.3 LTS).

The problem occurs only on Ubuntu 16.04.3 LTS. Don't know if it helps you.

@therealsputnik
Author

I have two Ubuntu servers, one running 17.10 and the other 16.04.3, and I have this issue on both. It is still unresolved. The only other reference to this error message I could find is #3311. It didn't help me, but it might give someone else some insight into a fix.

@gaetanquentin

gaetanquentin commented Feb 15, 2018

Same with Red Hat 6.4 and telegraf-1.4.2-1.x86_64:
2018-02-15T15:55:14Z E! Error in plugin [inputs.netstat]: error getting net connections info: cound not get pid(s), 0
2018-02-15T15:55:16Z E! Error in plugin [inputs.netstat]: error getting net connections info: cound not get pid(s), 0
2018-02-15T15:55:18Z E! Error in plugin [inputs.netstat]: error getting net connections info: cound not get pid(s), 0

--edit:
works fine with 1.5.2

@danielnelson
Contributor

@gaetanquentin This is already fixed in 1.4.5 and newer if you can upgrade.


6 participants