Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-461] Capture exceptions at top level of threads and cli #4357

Closed
1 task done
nathaniel-may opened this issue Nov 29, 2021 · 4 comments
Closed
1 task done

[CT-461] Capture exceptions at top level of threads and cli #4357

nathaniel-may opened this issue Nov 29, 2021 · 4 comments
Labels
jira spike tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality

Comments

@nathaniel-may
Copy link
Contributor

nathaniel-may commented Nov 29, 2021

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

discovered while coding up #4326. Somewhere in here when an exception is raised, the thread just hangs and you have to step through a debugger to read the exception message. Exceptions should bubble all the way to the top even in threaded environments.

you can recreate this by making fields_to_json throw.

@nathaniel-may nathaniel-may added enhancement New feature or request triage tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality labels Nov 29, 2021
@gshank
Copy link
Contributor

gshank commented Nov 29, 2021

I saw a similar problem when adding the event_status dictionary to ParsedNodeDefaults. It hung executing a test in test/integration/059... because the SourceDefinition didn't have that attribute. Did not see anywhere the actual error, but since I had just added that attribute I guessed, added it to SourceDefinition, and it fixed it.

@jtcohen6 jtcohen6 removed the triage label Nov 30, 2021
@leahwicz leahwicz added the jira label Apr 6, 2022
@github-actions github-actions bot changed the title thrown Exceptions in threads hang instead of bubbling up [CT-461] thrown Exceptions in threads hang instead of bubbling up Apr 6, 2022
@nathaniel-may
Copy link
Contributor Author

Estimation:

  • guess is that when an exception is thrown inside a thread, it's not caught and handled correctly
  • rpc server would benefit from this fix as well. Possibly fix existing tickets?
  • This is basically one big unknown unknown
  • Might be a better fit for the execution team
  • suggesting a 2 day spike instead of estimating.

@jtcohen6 jtcohen6 removed the enhancement New feature or request label May 10, 2022
@gshank gshank self-assigned this Jul 25, 2022
@gshank gshank changed the title [CT-461] thrown Exceptions in threads hang instead of bubbling up [CT-461] Capture exceptions at top level of threads and cli Jul 25, 2022
@gshank
Copy link
Contributor

gshank commented Jul 29, 2022

Note: this failure was experienced while running tests. On a particular type of test failure, the tests would hang and not complete. I was able to recreate this failure by removing "NodeInfoMixin" from the ParsedSourceDefinition class and executing: ".tox/py-integration/bin/python -m pytest -nauto --full-trace tests/functional/sources". (The "--full-trace" was suggested by pytest.) The resulting stacktrace started in /_pytest/main.py, and ended in threading.py, in the wait function, on the line "gotit = waiter.acquire(True, timeout)".

I was able to reduce it to a single test by running pytest -v --full-trace tests/functional/sources/test_source_fresher_state.py::TestSourceFresherNothingToDo::test_source_fresher_nothing_to_do. The stack trace went from the run_queue function in task/runnable.py, "self.job_queue.join()" to the join function in "graph/queue.py", line "self.inner.join()", to the joing function in the python queue.py file, "self.all_testsk_done.wait()", and then in the wait function in threading.py "waiter.acquire()".

Since this is deep in the execution team area, it might make more sense for them to take a look at it.

@gshank
Copy link
Contributor

gshank commented Aug 3, 2022

Closing this issue in favor of new ticket #5602, as requested.

@gshank gshank closed this as completed Aug 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
jira spike tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality
Projects
None yet
Development

No branches or pull requests

4 participants