Segfault when creating many pods simultaneously #219
Comments
I think we should limit the number of concurrently booting VMs on each worker to some number, say 10 (because why not 10), to avoid the disk contention that is likely causing this problem. As a fix, I suggest a global semaphore (i.e., a bounded channel) that limits the number of concurrent StartVM calls. The limit for snapshot-based starts (i.e., LoadVM calls) should be higher than the limit for boot-based starts. In my experience, however, these limits have to be adjusted according to the host's storage performance, so this should be a new knob for vHive.
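To make the suggestion concrete, here is a minimal sketch of a semaphore built on a buffered channel. It is not vHive code: `vmStartLimiter`, `newVMStartLimiter`, `do`, and `peakConcurrency` are hypothetical names, and the closure passed to `do` stands in for the real StartVM or LoadVM call. The limit of 10 is just the number floated above.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// vmStartLimiter is a counting semaphore backed by a buffered channel.
// Hypothetical type for illustration; not part of vHive.
type vmStartLimiter struct {
	slots chan struct{}
}

// newVMStartLimiter allows at most maxConcurrent starts in flight.
func newVMStartLimiter(maxConcurrent int) *vmStartLimiter {
	return &vmStartLimiter{slots: make(chan struct{}, maxConcurrent)}
}

// do blocks until a slot is free, runs start, then releases the slot.
func (l *vmStartLimiter) do(start func() error) error {
	l.slots <- struct{}{}        // acquire: blocks when the channel is full
	defer func() { <-l.slots }() // release, even if start returns an error
	return start()
}

// peakConcurrency pushes n simulated starts through the limiter and
// returns the highest number observed running at the same time.
func peakConcurrency(l *vmStartLimiter, n int) int64 {
	var inFlight, peak int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = l.do(func() error {
				cur := atomic.AddInt64(&inFlight, 1)
				// Record a new maximum if this start raised it.
				for {
					old := atomic.LoadInt64(&peak)
					if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
						break
					}
				}
				time.Sleep(time.Millisecond) // stand-in for the real VM start
				atomic.AddInt64(&inFlight, -1)
				return nil
			})
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	bootLimiter := newVMStartLimiter(10) // assumed limit for boot-based starts
	fmt.Println("peak concurrent starts:", peakConcurrency(bootLimiter, 100))
}
```

A second limiter with a larger capacity could guard the snapshot-based path, with both capacities exposed as configuration so they can be tuned to the host's storage.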
I agree that disk contention from booting lots of VMs concurrently is a genuine concern. However, for the "too many open files" error, which seems to be the underlying cause of the segfault, closing the SSH connection and logging in again after running
Describe the bug
When lots of pods are created simultaneously, vHive segfaults in createUserContainer.
To Reproduce
1. Edit examples/deployer/functions.json such that only helloworld will be deployed.
2. Run: go run examples/deployer/client.go
3. Run: watch kubectl get pods, and wait until all pods of the helloworld deployment get deleted.
4. Run: go run examples/invoker/client.go -rps 100
Logs