Problem: Vm execution failed due to network interface #596

Merged
merged 2 commits into from
Aug 20, 2024

Conversation

olethanh
Collaborator

Occasionally, VM creation failed because the assigned tap network interface already existed, probably because it was not properly torn down after a previous execution or because of a concurrency issue.

The error displayed was:
OSError: [Errno 16] Device or resource busy

The failed creation was then retried in a loop, which blocked everything else.

Solution:
When assigning a vm_id, check that the network interface for that VM doesn't already exist. This acts as a double check against a variety of issues.
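
The check described above could look roughly like the sketch below. It is not the code from this PR: the function names `interface_exists` and `pick_vm_id` are hypothetical, and the `vmtap<vm_id>` naming is an assumption based on the interface name used in the manual test command. It relies on the fact that Linux exposes each network interface as an entry under `/sys/class/net`.

```python
from pathlib import Path
from typing import Callable, Iterable, Optional


def interface_exists(name: str, sys_net: Path = Path("/sys/class/net")) -> bool:
    """Return True if a network interface with this name is present.

    On Linux every interface has an entry under /sys/class/net.
    """
    return (sys_net / name).exists()


def pick_vm_id(
    candidates: Iterable[int],
    exists: Callable[[str], bool] = interface_exists,
) -> Optional[int]:
    """Return the first candidate vm_id whose tap interface is not present.

    Assumption: the tap interface for a VM is named "vmtap<vm_id>".
    Returns None if every candidate already has an interface.
    """
    for vm_id in candidates:
        if not exists(f"vmtap{vm_id}"):
            return vm_id
    return None
```

Skipping an id whose interface is still present avoids the busy-device error at creation time, whatever left the interface behind.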


codecov bot commented Apr 15, 2024

Codecov Report

Attention: Patch coverage is 25.00000% with 6 lines in your changes missing coverage. Please review.

Project coverage is 43.80%. Comparing base (0f77070) to head (4b173d6).
Report is 82 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| src/aleph/vm/network/hostnetwork.py | 40.00% | 3 Missing ⚠️ |
| src/aleph/vm/pool.py | 0.00% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #596      +/-   ##
==========================================
- Coverage   43.83%   43.80%   -0.04%     
==========================================
  Files          55       55              
  Lines        4944     4952       +8     
  Branches      585      587       +2     
==========================================
+ Hits         2167     2169       +2     
- Misses       2658     2664       +6     
  Partials      119      119              


@hoh
Member

hoh commented Apr 16, 2024

Can we easily get this code tested?

@olethanh
Collaborator Author

Not really automatically, as it is quite low level, but if you want to test it manually you can create a network interface before launching, using:

sudo ip tuntap add mode tap vmtap4

@hoh hoh merged commit ff6c119 into main Aug 20, 2024
18 of 20 checks passed
@hoh hoh deleted the ol-check-existing-network-interface branch August 20, 2024 07:45