Skip to content
This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

Features / Cleanup for MP Frontend #387

Merged
merged 16 commits into from
Jul 31, 2024

Conversation

robertgshaw2-neuralmagic
Copy link
Collaborator

@robertgshaw2-neuralmagic robertgshaw2-neuralmagic commented Jul 31, 2024

SUMMARY:

  • refactor to use single socket
  • cleanup comments / logging
  • add do_log_stats
  • add abort

Copy link

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which consists a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of default ones by unblocking the steps in your fast-check build on Buildkite UI.

Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge).

To run full CI, you can do one of these:

  • Comment /ready on the PR
  • Add ready label to the PR
  • Enable auto-merge.

🚀

@robertgshaw2-neuralmagic robertgshaw2-neuralmagic changed the title Add abort Features / Cleanup for MP Frontend Jul 31, 2024
# task.add_done_callback(_running_tasks.remove)
if not engine_args.disable_log_stats:
task = asyncio.create_task(_force_log())
_running_tasks.add(task)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is here just to be a hard ref to the task so it doesn't get GC'ed, then I think it would work fine as a local variable inside lifespan instead of a module-scope variable

@@ -82,6 +82,7 @@ async def _force_log():
async def build_backend(args) -> AsyncIterator[VLLMBackend]:
# Context manager to handle backend lifecycle
# Ensures everything is shutdown and cleaned up on error/exit
global engine_args
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🌶️ , thanks!

# Wait for server process to join
rpc_server_process.join()
# Wait for server process to join
rpc_server_process.join()
Copy link
Collaborator

@joerunde joerunde Jul 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually meant this to be in the finally statement so that it will run on exit when there's an unhandled exception, which can happen if there's an exception before we get to the guarded

try:
  await server_task

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was a typo, will fix

self.get_data_socket.send(pickle.dumps(GetDataRequest.MODEL_CONFIG))
model_config = await self.get_data_socket.recv()
return pickle.loads(model_config)
# Await acknowledgement from RPCServer that it aborted.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy/paste comment error?

@robertgshaw2-neuralmagic robertgshaw2-neuralmagic merged commit 1f33286 into isolate-oai-server-process Jul 31, 2024
1 check was pending
@robertgshaw2-neuralmagic robertgshaw2-neuralmagic deleted the add-abort branch July 31, 2024 16:05
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants