Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datadog agent error to find FFI provider causes an application to crash #4083

Closed
achever1 opened this issue Oct 24, 2022 · 5 comments
Closed

Comments

@achever1
Copy link

achever1 commented Oct 24, 2022

Hello,

We encountered an issue with our application (failed to start) and it seemed linked to the datadog agent that did not find a native library (FFI).
Surprisingly this error is not handled and we could not start our app, is it normal ?

Context:

  • docker
  • java 17 (temurin)
  • arch x86_64
  • DD agent version: 0.110.0

ERROR datadog.trace.bootstrap.Agent - Throwable thrown while starting JmxFetch

OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
	... 28 more
	at jnr.unixsocket.SockAddrUnix$DefaultSockAddrUnix.<init>(SockAddrUnix.java:208)
	at jnr.unixsocket.SockAddrUnix.<init>(SockAddrUnix.java:46)
	at jnr.ffi.Runtime.getSystemRuntime(Runtime.java:67)
	at jnr.ffi.Runtime$SingletonHolder.<clinit>(Runtime.java:82)
	at jnr.ffi.provider.FFIProvider.getSystemProvider(FFIProvider.java:35)
	at jnr.ffi.provider.FFIProvider$SystemProviderSingletonHolder.<clinit>(FFIProvider.java:57)
	at jnr.ffi.provider.FFIProvider$SystemProviderSingletonHolder.getInstance(FFIProvider.java:68)
	at java.base/java.lang.Class.newInstance(Class.java:645)
	at java.base/jdk.internal.reflect.ReflectionFactory.newInstance(ReflectionFactory.java:347)
	at java.base/java.lang.reflect.ReflectAccess.newInstance(ReflectAccess.java:128)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at jnr.ffi.provider.jffi.Provider.<init>(Provider.java:29)
	at jnr.ffi.provider.jffi.NativeRuntime.getInstance(NativeRuntime.java:49)
	at jnr.ffi.provider.jffi.NativeRuntime$SingletonHolder.<clinit>(NativeRuntime.java:53)
	at jnr.ffi.provider.jffi.NativeRuntime.<init>(NativeRuntime.java:41)
	at jnr.ffi.provider.jffi.NativeRuntime.<init>(NativeRuntime.java:57)
	at jnr.ffi.provider.AbstractRuntime.<init>(AbstractRuntime.java:48)
	at jnr.ffi.provider.jffi.NativeRuntime$TypeDelegate.size(NativeRuntime.java:178)
	at com.kenai.jffi.Type.size(Type.java:138)
	at com.kenai.jffi.Type.resolveSize(Type.java:155)
	at com.kenai.jffi.Type$Builtin.getTypeInfo(Type.java:237)
	at com.kenai.jffi.Type$Builtin.lookupTypeInfo(Type.java:253)
	... 61 more
	at com.kenai.jffi.internal.StubLoader.loadFromJar(StubLoader.java:392)
	at java.base/sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:797)
	at java.base/sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:395)
	at java.base/sun.nio.ch.FileChannelImpl.endBlocking(FileChannelImpl.java:171)
	at java.base/java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:199)
Caused by: java.nio.channels.ClosedByInterruptException
	at java.base/java.lang.Thread.run(Thread.java:833)
	at datadog.trace.util.AgentTaskScheduler$Worker.run(AgentTaskScheduler.java:254)
	at datadog.trace.util.AgentTaskScheduler$PeriodicTask.run(AgentTaskScheduler.java:299)
	at datadog.trace.util.AgentTaskScheduler$RunnableTask.run(AgentTaskScheduler.java:36)
	at datadog.trace.util.AgentTaskScheduler$RunnableTask.run(AgentTaskScheduler.java:41)
	at datadog.trace.bootstrap.Agent$JmxStartTask.run(Agent.java:489)
	at datadog.trace.bootstrap.Agent.access$1100(Agent.java:48)
	at datadog.trace.bootstrap.Agent.startJmx(Agent.java:501)
	at datadog.trace.bootstrap.Agent.startJmxFetch(Agent.java:561)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at datadog.trace.agent.jmxfetch.JMXFetch.run(JMXFetch.java:37)
	at datadog.trace.agent.jmxfetch.JMXFetch.run(JMXFetch.java:90)
	at datadog.communication.monitor.DDAgentStatsDClientManager.statsDClient(DDAgentStatsDClientManager.java:62)
	at datadog.communication.monitor.DDAgentStatsDClient.<init>(DDAgentStatsDClient.java:20)
	at datadog.communication.monitor.DDAgentStatsDConnection.acquire(DDAgentStatsDConnection.java:62)
	at datadog.communication.monitor.DDAgentStatsDConnection.scheduleConnect(DDAgentStatsDConnection.java:90)
	at datadog.communication.monitor.DDAgentStatsDConnection.doConnect(DDAgentStatsDConnection.java:118)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder.build(NonBlockingStatsDClientBuilder.java:181)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder.resolve(NonBlockingStatsDClientBuilder.java:205)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder.staticStatsDAddressResolution(NonBlockingStatsDClientBuilder.java:287)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder.staticAddressResolution(NonBlockingStatsDClientBuilder.java:271)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder$1.call(NonBlockingStatsDClientBuilder.java:247)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder$1.call(NonBlockingStatsDClientBuilder.java:249)
	at jnr.unixsocket.UnixSocketAddress.<init>(UnixSocketAddress.java:53)
	at jnr.unixsocket.SockAddrUnix.create(SockAddrUnix.java:174)
	at jnr.unixsocket.SockAddrUnix$DefaultSockAddrUnix.<init>(SockAddrUnix.java:208)
	at jnr.unixsocket.SockAddrUnix.<init>(SockAddrUnix.java:46)
	at jnr.ffi.Runtime.getSystemRuntime(Runtime.java:67)
	at jnr.ffi.Runtime$SingletonHolder.<clinit>(Runtime.java:82)
	at jnr.ffi.provider.FFIProvider.getSystemProvider(FFIProvider.java:35)
	at jnr.ffi.provider.FFIProvider$SystemProviderSingletonHolder.<clinit>(FFIProvider.java:57)
	at jnr.ffi.provider.FFIProvider$SystemProviderSingletonHolder.getInstance(FFIProvider.java:68)
	at java.base/java.lang.Class.newInstance(Class.java:645)
	at java.base/jdk.internal.reflect.ReflectionFactory.newInstance(ReflectionFactory.java:347)
	at java.base/java.lang.reflect.ReflectAccess.newInstance(ReflectAccess.java:128)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at jnr.ffi.provider.jffi.Provider.<init>(Provider.java:29)
	at jnr.ffi.provider.jffi.NativeRuntime.getInstance(NativeRuntime.java:49)
	at jnr.ffi.provider.jffi.NativeRuntime$SingletonHolder.<clinit>(NativeRuntime.java:53)
	at jnr.ffi.provider.jffi.NativeRuntime.<init>(NativeRuntime.java:41)
	at jnr.ffi.provider.jffi.NativeRuntime.<init>(NativeRuntime.java:57)
	at jnr.ffi.provider.AbstractRuntime.<init>(AbstractRuntime.java:48)
	at jnr.ffi.provider.jffi.NativeRuntime$TypeDelegate.size(NativeRuntime.java:178)
	at com.kenai.jffi.Type.size(Type.java:138)
	at com.kenai.jffi.Type.resolveSize(Type.java:155)
	at com.kenai.jffi.Type$Builtin.getTypeInfo(Type.java:237)
	at com.kenai.jffi.Type$Builtin.lookupTypeInfo(Type.java:242)
	at com.kenai.jffi.Foreign.getInstance(Foreign.java:103)
	at com.kenai.jffi.Foreign$InstanceHolder.<clinit>(Foreign.java:45)
	at com.kenai.jffi.Foreign$InstanceHolder.getInstanceHolder(Foreign.java:49)
	at com.kenai.jffi.Init.load(Init.java:68)
	at java.base/java.lang.Class.forName(Class.java:467)
	at java.base/java.lang.Class.forName0(Native Method)
	at com.kenai.jffi.internal.StubLoader.<clinit>(StubLoader.java:487)
	at com.kenai.jffi.internal.StubLoader.load(StubLoader.java:278)
	at com.kenai.jffi.internal.StubLoader.loadFromJar(StubLoader.java:399)
	at com.kenai.jffi.internal.StubLoader.tempReadonlyError(StubLoader.java:414)
Caused by: java.lang.UnsatisfiedLinkError: could not get native definition for type `POINTER`, original error message follows: java.io.IOException: Unable to write jffi binary stub to `/tmp`. Set `TMPDIR` or Java property `java.io.tmpdir` to a read/write path that is not mounted "noexec".
	... 13 more
	at datadog.trace.agent.jmxfetch.JMXFetch.run(JMXFetch.java:37)
	at datadog.trace.agent.jmxfetch.JMXFetch.run(JMXFetch.java:90)
	at datadog.communication.monitor.DDAgentStatsDClientManager.statsDClient(DDAgentStatsDClientManager.java:62)
	at datadog.communication.monitor.DDAgentStatsDClient.<init>(DDAgentStatsDClient.java:20)
	at datadog.communication.monitor.DDAgentStatsDConnection.acquire(DDAgentStatsDConnection.java:62)
	at datadog.communication.monitor.DDAgentStatsDConnection.scheduleConnect(DDAgentStatsDConnection.java:90)
	at datadog.communication.monitor.DDAgentStatsDConnection.doConnect(DDAgentStatsDConnection.java:118)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder.build(NonBlockingStatsDClientBuilder.java:181)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder.resolve(NonBlockingStatsDClientBuilder.java:205)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder.staticStatsDAddressResolution(NonBlockingStatsDClientBuilder.java:287)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder.staticAddressResolution(NonBlockingStatsDClientBuilder.java:271)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder$1.call(NonBlockingStatsDClientBuilder.java:247)
	at com.timgroup.statsd.NonBlockingStatsDClientBuilder$1.call(NonBlockingStatsDClientBuilder.java:249)
	at jnr.unixsocket.UnixSocketAddress.<init>(UnixSocketAddress.java:53)
	at jnr.unixsocket.SockAddrUnix.create(SockAddrUnix.java:174)
	at jnr.unixsocket.SockAddrUnix$DefaultSockAddrUnix.<init>(SockAddrUnix.java:209)
	at jnr.ffi.Struct$Unsigned16.<init>(Struct.java:1240)
	at jnr.ffi.Struct$NumberField.<init>(Struct.java:872)
	at jnr.ffi.provider.InvalidRuntime.findType(InvalidRuntime.java:42)
	at jnr.ffi.provider.InvalidRuntime.newLoadError(InvalidRuntime.java:101)
Caused by: java.lang.UnsatisfiedLinkError: could not load FFI provider jnr.ffi.provider.jffi.Provider
	at java.base/java.lang.Thread.run(Thread.java:833)
	at datadog.trace.util.AgentTaskScheduler$Worker.run(AgentTaskScheduler.java:254)
	at datadog.trace.util.AgentTaskScheduler$PeriodicTask.run(AgentTaskScheduler.java:299)
	at datadog.trace.util.AgentTaskScheduler$RunnableTask.run(AgentTaskScheduler.java:36)
	at datadog.trace.util.AgentTaskScheduler$RunnableTask.run(AgentTaskScheduler.java:41)
	at datadog.trace.bootstrap.Agent$JmxStartTask.run(Agent.java:489)
	at datadog.trace.bootstrap.Agent.access$1100(Agent.java:48)
	at datadog.trace.bootstrap.Agent.startJmx(Agent.java:501)
	at datadog.trace.bootstrap.Agent.startJmxFetch(Agent.java:561)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

@richardstartin
Copy link
Member

Hi @achever1 - this error shouldn't prevent application startup - startJmxFetch catches Throwable. Are there any other logs which might point to why the application failed to start?

This isn't a component I work on so I will pass this on to the team which maintains JMXFetch.

@carlosroman
Copy link
Contributor

carlosroman commented Oct 25, 2022

This looks a bit suspicious to me:

Caused by: java.lang.UnsatisfiedLinkError: could not get native definition for type `POINTER`, original error message follows: java.io.IOException: Unable to write jffi binary stub to `/tmp`. Set `TMPDIR` or Java property `java.io.tmpdir` to a read/write path that is not mounted "noexec".
	... 13 more

Has this worked before in the past? If so has there been any OS changes or changes to the way the Agent is run?

@achever1
Copy link
Author

It was the 1st time it failed like this and it worked again half an hour later.
No OS updates I am aware of at this moment.

@carlosroman
Copy link
Contributor

DD agent version: 0.110.0

This doesn't look like a version of the Datadog Agent I'd expect to see. I assume this is actually the version of dd-trace-java you are using.

I'd check how you run your final packaged application. It might be worth explicitly setting env var TMPDIR to point to a place where your application has write access to.

@achever1
Copy link
Author

we use https:/guizmaii/sbt-datadog in sbt/scala to integrate the agent to application. The app is packaged by https:/sbt/sbt-native-packager, which should end up like a classical Java app started with the DD agent.
We'll watch out and keep you posted if it happens again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants