Startup impact of application insight on microservices startup #2000

Open
Sachin1O1 opened this issue Dec 10, 2021 · 49 comments
@Sachin1O1

What is the minimum and maximum startup impact of Application Insights on microservice startup?

@ghost ghost added the Needs: Triage 🔍 label Dec 10, 2021
@Sachin1O1
Author

any update?

@trask
Member

trask commented Jan 3, 2022

hi @Sachin1O1, it depends a lot on the application and the compute resources. there's also additional overhead on Java 8 because the agent jar is signed, and the Java JIT compiler is not activated until later on in the startup process on Java 8 (this is not an issue on Java 11).

@Sachin1O1
Author

Hi @trask, I have some stats on how my service startup time is impacted by the Application Insights jar.

Service startup sample

| Sr no | With jar | Without jar |
|-------|----------|-------------|
| 1     | 260.21   | 102.6       |
| 2     | 170      | 103         |
| 3     | 274      | 180         |
| 4     | 165      | 99          |

Container resources

resources:
  limits:
    cpu: 300m
    memory: 700Mi
  requests:
    cpu: 100m
    memory: 400Mi

Java

Java 11 (OpenJ9)

As you can see, it increases startup time by almost 2x.
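To put a number on that claim, the slowdown can be computed per sample. The following is a minimal Java sketch using the four measurements from the table above; the class and method names are illustrative only, and the unit is whatever the measurements were taken in.

```java
import java.util.Locale;

final class StartupOverhead {
    // Slowdown factor: startup time with the agent divided by startup time without it.
    static double ratio(double withJar, double withoutJar) {
        return withJar / withoutJar;
    }

    // Arithmetic mean of the per-sample slowdown factors.
    static double meanRatio(double[] withJar, double[] withoutJar) {
        double sum = 0.0;
        for (int i = 0; i < withJar.length; i++) {
            sum += ratio(withJar[i], withoutJar[i]);
        }
        return sum / withJar.length;
    }

    public static void main(String[] args) {
        // Values copied from the "Service startup sample" table.
        double[] withJar = {260.21, 170, 274, 165};
        double[] withoutJar = {102.6, 103, 180, 99};
        for (int i = 0; i < withJar.length; i++) {
            System.out.printf(Locale.ROOT, "sample %d: %.2fx%n",
                i + 1, ratio(withJar[i], withoutJar[i]));
        }
        System.out.printf(Locale.ROOT, "mean: %.2fx%n", meanRatio(withJar, withoutJar));
    }
}
```

For these samples the per-run slowdown ranges from roughly 1.5x to 2.5x, with a mean of about 1.8x.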

@heyams
Contributor

heyams commented Jan 18, 2022

I'm curious which version of the agent you were testing against. Did you try the latest GA/BETA version?

@Sachin1O1
Author

Hi @heyams,

I am using Application Insights Java 3.2.4 (GA)

@Sachin1O1
Author

@trask @heyams any update on this?

@heyams
Contributor

heyams commented Feb 4, 2022

i'm working on a debug startup profiler so that you can provide a thread dump to us for further debugging. we're not able to repro it.

@heyams
Contributor

heyams commented Feb 7, 2022

@Sachin1O1 can you use this 3.2.6-BETA-SNAPSHOT and help us collect a thread dump from your app?

Please set -Dapplicationinsights.debug.startupProfiling to true in your JVM arguments:
java -javaagent:applicationinsights-agent-3.2.6-BETA-SNAPSHOT.jar -Dapplicationinsights.debug.startupProfiling=true

Let it run till your app has started. Then send me ([email protected]) the file called stacktrace.txt located in your temp folder. The full path is available in the applicationinsights.log. Something like C:\Users\${USER-NAME}\AppData\Local\Temp\applicationinsights

@heyams
Contributor

heyams commented Feb 7, 2022

@Sachin1O1 i updated the snapshot link above. earlier version was no good. please give it a try and get back to us when you can. thanks.

@Sachin1O1
Author

@heyams Thanks for the profiler. I will update you with the relevant data ASAP. :)

@kbjerke

kbjerke commented Feb 11, 2022

Hi @heyams, we did some testing with the StartupProfiler, and we will share the stacktrace.txt with you on e-mail. But we encountered a warning and one exception.

First was the warning that InstrumentationKey is missing.

2022-02-11 13:27:10.002Z WARN c.m.a.a.i.telemetry.ConnectionString - Missing Statsbeat 'InstrumentationKey'

I was looking at the code,

setConnectionString(
    System.getenv("APPLICATIONINSIGHTS_CONNECTION_STRING"),
    System.getenv("APPINSIGHTS_INSTRUMENTATIONKEY"));

And it seems to originate here:

throw new InvalidConnectionStringException("Missing '" + Keywords.INSTRUMENTATION_KEY + "'");

I'm not sure I'm looking in the right place, but we don't use the APPINSIGHTS_INSTRUMENTATIONKEY environment variable; we use APPLICATIONINSIGHTS_CONNECTION_STRING instead. Judging by the documentation for connection strings, the agent should read InstrumentationKey from APPLICATIONINSIGHTS_CONNECTION_STRING when that is present in the environment, rather than warn about a missing variable.
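The expectation described here can be sketched in a few lines of Java. This is not the agent's code: the class and method names are hypothetical, and the only thing taken as given is the documented connection string shape of semicolon-separated `Key=value` pairs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ConnectionStringSketch {
    // Splits "Key1=value1;Key2=value2" into an ordered map of key/value pairs.
    static Map<String, String> parse(String connectionString) {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String part : connectionString.split(";")) {
            int eq = part.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed segments
            }
            pairs.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
        }
        return pairs;
    }

    // Prefers the key embedded in the connection string and only then
    // falls back to the value of the legacy environment variable.
    static String resolveInstrumentationKey(String connectionString, String legacyKey) {
        if (connectionString != null) {
            String key = parse(connectionString).get("InstrumentationKey");
            if (key != null && !key.isEmpty()) {
                return key;
            }
        }
        return legacyKey;
    }

    public static void main(String[] args) {
        String key = resolveInstrumentationKey(
            System.getenv("APPLICATIONINSIGHTS_CONNECTION_STRING"),
            System.getenv("APPINSIGHTS_INSTRUMENTATIONKEY"));
        System.out.println(key == null
            ? "no instrumentation key found"
            : "instrumentation key resolved");
    }
}
```

With this ordering, a missing APPINSIGHTS_INSTRUMENTATIONKEY only matters when the connection string is absent or carries no InstrumentationKey entry.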

The other was an NPE in the captureThreadDump method.

Exception in thread "StartupProfiler" java.lang.NullPointerException
	at com.microsoft.applicationinsights.agent.StartupProfiler$ThreadDump.captureThreadDump(StartupProfiler.java:106)
	at com.microsoft.applicationinsights.agent.StartupProfiler$ThreadDump.run(StartupProfiler.java:90)
	at java.base/java.lang.Thread.run(Thread.java:829)

At first glance I was unable to figure out if this case was handled, so I'm not sure if the thread dump exited prematurely.
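For context, one common way an NPE can arise when dumping threads is that `ThreadMXBean.getThreadInfo` returns a null entry for any thread that has already terminated. Whether that is what happened in StartupProfiler is an assumption; the sketch below is not the agent's code and only shows a null-guarded capture loop under hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class ThreadDumpSketch {
    static String captureThreadDump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // An element is null when the thread ended between
        // getAllThreadIds() and getThreadInfo().
        ThreadInfo[] infos =
            threads.getThreadInfo(threads.getAllThreadIds(), Integer.MAX_VALUE);
        StringBuilder dump = new StringBuilder();
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // thread is gone; skip it instead of dereferencing null
            }
            dump.append('"').append(info.getThreadName()).append("\" ")
                .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                dump.append("\tat ").append(frame).append('\n');
            }
            dump.append('\n');
        }
        return dump.toString();
    }

    public static void main(String[] args) {
        System.out.print(captureThreadDump());
    }
}
```

Because the guard sits inside the loop, one vanished thread does not end the whole dump.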

@heyams
Contributor

heyams commented Feb 11, 2022

we fixed the following warning in 3.2.6 GA. Please try 3.2.6 GA. StartupProfiler is available in that release as well.
2022-02-11 13:27:10.002Z WARN c.m.a.a.i.telemetry.ConnectionString - Missing Statsbeat 'InstrumentationKey'

@heyams
Contributor

heyams commented Feb 11, 2022

i also double checked the NPE in StartupProfiler.. it's inside a for loop. i definitely recommend trying out the 3.2.6 GA version and letting us know how it goes. thanks.

@Sachin1O1
Author

Hi @heyams, we used the latest version, 3.2.6 GA, and the code snippet shared was from the latest code.

@heyams
Contributor

heyams commented Feb 14, 2022

can you share your applicationinsights.log and applicationinsights.json with me at [email protected]?

@Sachin1O1
Author

Sachin1O1 commented Feb 15, 2022

Hi @heyams

I can share it here

mycontainer:$ ls
agent.jar app.jar applicationinsights.json applicationinsights.log tmp
mycontainer:~$ cat applicationinsights.log
2022-02-15 13:20:08.232Z WARN c.m.a.a.i.telemetry.ConnectionString - Missing Statsbeat 'InstrumentationKey'
mycontainer:$ cat applicationinsights.json
{
  "customDimensions": {},
  "instrumentation": {
    "logging": {
      "level": "ALL"
    },
    "micrometer": {
      "enabled": false
    }
  },
  "preview": {
    "sampling": {
      "overrides": [
        {
          "attributes": [
            {
              "key": "http.url",
              "value": "(https?://[^/]+/([a-z]|[a-z]+-[a-z])+/monitor/+[a-z/]+$)|(https?://[^/]+/management/actuator/+[a-zA-Z/]+$)|(grpc.health.v1.Health/Check)",
              "matchType": "regexp"
            }
          ],
          "percentage": 1
        }
      ]
    }
  },
  "selfDiagnostics": {
    "destination": "file+console",
    "level": "WARN",
    "file": {
      "path": "applicationinsights.log",
      "maxSizeMb": 5,
      "maxHistory": 1
    }
  }
}
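To see which requests the sampling override in this configuration targets, the `http.url` expression can be tried against a few URLs. A minimal Java sketch follows; the sample URLs are made up, the class name is illustrative, and the use of a full match (rather than a substring search) is an assumption about how the agent applies `"matchType": "regexp"`.

```java
import java.util.regex.Pattern;

final class SamplingOverrideRegexCheck {
    // Copied verbatim from the "value" field of the sampling override.
    static final Pattern OVERRIDE = Pattern.compile(
        "(https?://[^/]+/([a-z]|[a-z]+-[a-z])+/monitor/+[a-z/]+$)"
            + "|(https?://[^/]+/management/actuator/+[a-zA-Z/]+$)"
            + "|(grpc.health.v1.Health/Check)");

    // Assumption: the whole attribute value has to match the expression.
    static boolean isOverridden(String httpUrl) {
        return OVERRIDE.matcher(httpUrl).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "https://example.com/management/actuator/health",
            "https://example.com/my-service/monitor/health",
            "https://example.com/api/orders"
        };
        for (String url : samples) {
            System.out.println(url + " -> " + isOverridden(url));
        }
    }
}
```

Under that assumption, the monitoring and actuator URLs fall under the override, while an ordinary API URL does not.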

@ghost ghost removed the Needs: Author Feedback label Feb 15, 2022
@heyams
Contributor

heyams commented Feb 15, 2022

@Sachin1O1 do you not have a connection string, or did you omit it on purpose because you are sharing this publicly? can you email me the applicationinsights.log?

@Sachin1O1
Author

Sure, @trask. Can you let me know what type of changes we have submitted in this snapshot to improve startup?

@trask
Member

trask commented Apr 18, 2022

sure!

primarily: open-telemetry/opentelemetry-java-instrumentation#5724

and further, we are disabling jax-rs annotation instrumentation by default in our distro (Application Insights), since we do not capture INTERNAL spans by default, which is all that the jax-rs annotation instrumentation captures. as noted in the opentelemetry PR (open-telemetry/opentelemetry-java-instrumentation#5724), the jax-rs annotation instrumentation is a further type-matching drag on startup performance

@Sachin1O1
Author

Hi @trask,

Are these changes now available in the released version?

@trask
Member

trask commented May 26, 2022

yes, available in 3.3.0-BETA, let us know if you see any improvement

@ghost

ghost commented Jun 27, 2022

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 7 days. It will be closed if no further activity occurs within 7 days of this comment.

@ghost ghost closed this as completed Jul 5, 2022
@Sachin1O1
Author

Hi @trask,

We are still facing high resource usage from the Application Insights jar at startup. We tweaked the resources for the microservices, but a big chunk of our application is still a monolith and demands more resources than the microservices. Earlier the startup time was 2 min, but as soon as we migrated the monolith into a Spring Boot project we saw a large impact on startup from the Application Insights jar.

Startup time observation:

Without jar: 40 to 50 sec
With jar: 9 to 10 min

Allocated resources

"resources": {
  "requests": {
    "memory": "1024M",
    "cpu": "100m"
  },
  "limits": {
    "memory": "2500M",
    "cpu": "1000m"
  }
}

It looks like the agent is CPU-demanding: when I increased the CPU limit to 1500m, startup time dropped by half.
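When comparing CPU limits like these, it can help to check what the JVM itself sees inside the container. The sketch below uses only standard `Runtime` calls; the class name is illustrative. How a limit such as 1000m or 1500m maps to a processor count depends on the JVM vendor, version, and container-support settings, so treat the printed values as something to observe rather than predict.

```java
final class ContainerResourcesSketch {
    // Summarizes the CPU count and maximum heap that this JVM reports.
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long maxHeapMiB = rt.maxMemory() / (1024L * 1024L);
        return String.format("availableProcessors=%d maxHeapMiB=%d",
            rt.availableProcessors(), maxHeapMiB);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running this once per CPU limit setting shows whether a change in the limit actually changes the number of processors available to the JVM and its compiler threads.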

@Sachin1O1
Author

@trask any update on this :)

@trask
Member

trask commented Jul 18, 2022

hi @Sachin1O1! there's been another optimization upstream that may help (open-telemetry/opentelemetry-java-instrumentation#6286), it will be included in the next upstream release that is going out this week, and then we will pull it into Application Insights and we can see what kind of impact it has for you

@Sachin1O1
Author

Hi @trask ,
I see we have the capability to start the agent after application startup, but there is a comment in the code about problematic edge cases:

https://github.com/microsoft/ApplicationInsights-Java/blob/836e26871810a5ee3c3afd869403c2801dda86c7/agent/agent/src/main/java/com/microsoft/applicationinsights/agent/Agent.java#:~:text=/%20there%20are%20many%20problematic%20edge%20cases%20around%20dynamic%20attach%20any%20later%20than%20that

Can you please clarify the impact, and whether we can use dynamic attach of the agent?

@trask
Member

trask commented Jul 19, 2022

yes, this is supported from 3.3.0, check out https://docs.microsoft.com/en-us/azure/azure-monitor/app/java-spring-boot (which also mentions the limitations)

@Sachin1O1
Author

Hi @trask ,
I was thinking of attaching Application Insights after application startup, but the document you shared says: "The invocation must be requested at the beginning of the main method."

@trask trask self-assigned this Jul 20, 2022
@trask
Member

trask commented Jul 20, 2022

oh yes, this is correct:

> The invocation must be requested at the beginning of the main method

@jeanbisutti
Member

Hi @Sachin1O1 and @kbjerke,

We have developed an experimental feature to improve the start-up time of Application Insights and reduce the CPU usage during Application Insights initialization, in the case of a limited number of CPU cores. The feature is documented here. We would like your feedback on this feature if you could test it.

@Sachin1O1
Author

Hi @jeanbisutti ,

Thank you for informing us about this feature. We will surely try it and update you with the results.
FYI, we got some overall service startup improvements by switching our Docker base image to Temurin.

@carlin-q-scott

I'm glad I found this issue. It gave me a way to fix my issue with app insights causing my service, Keycloak, to be unstable when deployed in AKS. It was failing liveness probes at startup after I enabled the agent. Adding a 60 second initial delay to the probe fixed my issue.

@jeanbisutti
Member

jeanbisutti commented Mar 1, 2023

@carlin-q-scott

There is a known issue with Java 8 that can increase the start-up time with Application Insights. If you use Java 8, you can try:

Otherwise, you could try this experimental feature.

@carlin-q-scott

Here's the version info output by the agent:

Application Insights Java Agent 3.4.10 started successfully (PID 1, JVM running for 5.964 s)
Java version: 11.0.18, vendor: Red Hat, Inc., home: /usr/lib/jvm/java-11-openjdk-11.0.18.0.10-2.el8_7.x86_64

Keycloak uses Quarkus, not Spring Boot.

I considered using the experimental feature, but I'm OK with a slow startup for now. If you need beta testing of the feature, I'd be willing to give it a go. I'm also wondering if OpenJDK is the issue.

@jeanbisutti
Member

> I considered using the experimental feature, but I'm OK with a slow startup for now. If you need beta testing of the feature, I'd be willing to give it a go. I'm also wondering if OpenJDK is the issue.

@carlin-q-scott You use AKS, so you may have configured a limited number of CPU cores. The agent initialization can trigger the JVM C2 compiler, which may consume much of that limited CPU during startup. We will be glad to have feedback about this experimental feature!
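One rough way to check how much JIT compilation work a JVM has done by the end of startup is the standard `CompilationMXBean`. The sketch below is only an illustration under hypothetical names. It reports the total for all compilers together, not C2 alone, and the figure is summed across compiler threads, so it can exceed wall-clock time.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

final class JitTimeSketch {
    // Accumulated JIT compilation time in milliseconds,
    // or -1 if this JVM does not report it.
    static long totalCompilationMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    public static void main(String[] args) {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.println("JVM uptime ms: " + uptimeMillis);
        System.out.println("JIT compilation ms: " + totalCompilationMillis());
    }
}
```

Logging these two values once the application reports that it has started, with and without the agent attached, gives a first indication of whether compilation accounts for the extra startup time.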
