GC reporting / usage mismatch #1682
Comments
I get the GC values (the reported time each collector ran) from the formal GC stats of the JVM. Maybe you can enable GC logging and see if it helps?
I think my natural curiosity on this issue has reached its end :) If you're happy you're making the right API call to the JVM, then I don't think there's anything left to say. For future reference, though: if people complain about CPU usage, get them to use "top -H" (to show threads) and cross-reference against jstack. Those 2 GC times (or at least the Concurrent Mark-Sweep one) don't seem reliable.
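For readers unfamiliar with the "formal GC stats", here is a minimal sketch of how a per-collector count and time can be read from the JVM through the standard `java.lang.management` API. Whether Elasticsearch reads them in exactly this way is an assumption; the class name `GcStats` and the method `snapshot` are hypothetical names used only for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not taken from the Elasticsearch code base.
final class GcStats {
    /** Maps each collector name to {collection count, accumulated collection time in ms}. */
    static Map<String, long[]> snapshot() {
        Map<String, long[]> stats = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined for that collector.
            stats.put(gc.getName(), new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return stats;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, long[]> e : snapshot().entrySet()) {
            System.out.printf("%s: count=%d, time=%d ms%n",
                    e.getKey(), e.getValue()[0], e.getValue()[1]);
        }
    }
}
```

Note that `getCollectionTime()` is documented as an approximate accumulated elapsed time per collector, so it is not a measure of CPU time consumed by individual GC threads and need not match per-thread figures from "top -H".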
In https://groups.google.com/group/elasticsearch/browse_thread/thread/31d87c84dd387367#, I reported unusual CPU activity (continuous 40-60% for at least several days even under idle conditions) after creating 12GB of field cache (across 3 nodes) via a facet on a multi-valued field.
Last night I ran jstack while 2 of the nodes were in this state. All of the user threads were WAITING, THREAD_WAITING, or RUNNABLE (in some poll).
Cross-referencing the process ids from "top -H" with the nids reported in jstack, the following threads were actively using the CPU:
~19 minutes in "Concurrent Mark-Sweep GC Thread"
~2 minutes in each of the 4 "Gang worker#N (Parallel GC Threads)" threads (worker#0, worker#1, worker#2, worker#3)
~3 minutes in "VM Thread" prio=10 tid=0x00002aad1c6d0000 nid=0x7041 runnable
Empirically, ongoing CPU time seemed to be split between either Mark-Sweep and 1 of the GC threads, or Mark-Sweep and the VM thread.
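The cross-referencing step above relies on the fact that, on Linux with a HotSpot JVM, "top -H" lists native thread ids in decimal while jstack prints the same id as a hexadecimal `nid`. A minimal sketch of the conversion follows; the class `NidConverter` and its methods are hypothetical helpers, and the only value taken from the report is the VM thread's `nid=0x7041`.

```java
// Hypothetical helper for matching "top -H" thread ids against jstack output.
final class NidConverter {
    /** Formats a decimal thread id from "top -H" the way jstack prints a nid. */
    static String toNid(int threadId) {
        return "0x" + Integer.toHexString(threadId);
    }

    /** Parses a jstack nid such as "0x7041" back into the decimal id shown by "top -H". */
    static int fromNid(String nid) {
        return Integer.decode(nid); // Integer.decode understands the 0x prefix
    }

    public static void main(String[] args) {
        System.out.println(fromNid("0x7041")); // prints 28737
        System.out.println(toNid(28737));      // prints 0x7041
    }
}
```

So the "VM Thread" line in the jstack dump would correspond to thread id 28737 in the "top -H" listing.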
However, checking both bigdesk and head for the node in question, the following GC counts and times were reported:
ParNew: count 12, time 5 seconds and 61 milliseconds
ConcurrentMarkSweep: count 4, time 5 seconds and 884 milliseconds
Presumably there's nothing that can be done about the CPU activity itself; that's just what the VM needs to do when I allocate a large amount of (multi-valued) field cache. That's obviously fine if so.
Presumably you also just read some standard stats to get the GC usage out, but I thought the fact that the "actual values" weren't being reported was issue-worthy.