Fix incorrect output from prints originating from different processes #604

Merged (4 commits) on Sep 19, 2024
5 changes: 4 additions & 1 deletion myst_nb/core/utils.py
@@ -25,8 +25,11 @@ def coalesce_streams(outputs: list[NotebookNode]) -> list[NotebookNode]:
     for output in outputs:
         if output["output_type"] == "stream":
             if output["name"] in streams:
-                streams[output["name"]]["text"] += output["text"]
+                out = output["text"].strip()
+                if out:
+                    streams[output["name"]]["text"] += f"{out}\n"
             else:
+                output["text"] = output["text"].strip() + "\n"
Collaborator

Is this line necessary?

Contributor Author (@basnijholt, Sep 19, 2024)

Yes, perhaps it would be clearer to move it one line up. We're mutating the dict we just added to new_outputs.

edit: done

Collaborator

Oh, I was just nitpicking that you strip off whitespace and then add a newline; that is practically the same, so is this really needed?

Contributor Author

Oh, this is in case there are multiple newlines at the end, which can happen after merging the cells.

See the example in my first post.

I am hard at work trying to write a test. It turns out that the issue does not exist on macOS; it took me a good while to realize that.
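For readers without the first post at hand, the situation described in this comment can be sketched roughly as follows. This is a simplified illustration, not the actual `coalesce_streams` implementation: the helper names `merge_verbatim` and `merge_stripped` are hypothetical, outputs are reduced to plain strings, and the chunk order is the one recorded in the `merge_streams_parallel.ipynb` fixture added in this PR.

```python
# Simplified sketch; the helper names are hypothetical and not part of myst_nb.


def merge_verbatim(chunks: list[str]) -> str:
    """Behaviour before this PR: concatenate stream texts exactly as received."""
    return "".join(chunks)


def merge_stripped(chunks: list[str]) -> str:
    """Behaviour in this PR: strip each chunk, skip empty ones, end each with one newline."""
    merged = ""
    for index, text in enumerate(chunks):
        out = text.strip()
        if index == 0:
            # The first chunk of a stream is always kept, even if it is empty.
            merged = out + "\n"
        elif out:
            merged += f"{out}\n"
    return merged


# Chunk order as recorded in the test notebook: nine "0", one newline,
# one more "0", then nine newlines.
chunks = ["0"] * 9 + ["\n", "0"] + ["\n"] * 9

print(repr(merge_verbatim(chunks)))
print(repr(merge_stripped(chunks)))
```

The verbatim merge glues nine zeros onto one line and leaves a run of trailing newlines, while the stripped merge yields ten lines that each contain a single `0`, which is what the regression file in this PR expects.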

Contributor Author

@bsipocz, just added a test!

See my comment here #604 (comment)

Member

@basnijholt @bsipocz just a hint: you should be using rstrip, not strip, because now you have removed any possible indentation at the start of the streams 🤷
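A small sketch of the difference being pointed out here. The sample string is an assumption made up for illustration and is not taken from the PR.

```python
# Hypothetical stream text with leading indentation and extra trailing newlines.
text = "    indented output\n\n\n"

with_strip = text.strip() + "\n"    # removes whitespace at both ends
with_rstrip = text.rstrip() + "\n"  # removes trailing whitespace only

print(repr(with_strip))
print(repr(with_rstrip))
```

With `strip` the four leading spaces are lost, while `rstrip` keeps them and still collapses the trailing newlines to one.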

Collaborator

Thanks! I'll open a follow-up to fix that.

Contributor Author

Good point! I just opened #640

                 new_outputs.append(output)
                 streams[output["name"]] = output
         else:
186 changes: 186 additions & 0 deletions tests/notebooks/merge_streams_parallel.ipynb
@@ -0,0 +1,186 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-19T21:44:29.809012Z",
"iopub.status.busy": "2024-09-19T21:44:29.808809Z",
"iopub.status.idle": "2024-09-19T21:44:29.978481Z",
"shell.execute_reply": "2024-09-19T21:44:29.977891Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"0"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"0"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"0"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"0"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"0"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"0"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"0"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"0"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"0"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
}
],
"source": [
"from concurrent.futures import ProcessPoolExecutor\n",
"\n",
"with ProcessPoolExecutor() as executor:\n",
"    for i in executor.map(print, [0] * 10):\n",
"        pass"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
13 changes: 13 additions & 0 deletions tests/test_render_outputs.py
@@ -1,4 +1,5 @@
 """Tests for rendering code cell outputs."""
+
 import pytest

 from myst_nb.core.render import EntryPointError, load_renderer
@@ -103,6 +104,18 @@ def test_merge_streams(sphinx_run, file_regression):
     file_regression.check(doctree.pformat(), extension=".xml", encoding="utf-8")


+@pytest.mark.sphinx_params(
+    "merge_streams_parallel.ipynb",
+    conf={"nb_execution_mode": "off", "nb_merge_streams": True},
+)
+def test_merge_streams_parallel(sphinx_run, file_regression):
+    """Test configuring multiple concurrent stdout/stderr outputs to be merged."""
+    sphinx_run.build()
+    assert sphinx_run.warnings() == ""
+    doctree = sphinx_run.get_resolved_doctree("merge_streams_parallel")
+    file_regression.check(doctree.pformat(), extension=".xml", encoding="utf-8")
+
+
 @pytest.mark.sphinx_params(
     "metadata_image.ipynb",
     conf={"nb_execution_mode": "off", "nb_cell_metadata_key": "myst"},
21 changes: 21 additions & 0 deletions tests/test_render_outputs/test_merge_streams_parallel.xml
@@ -0,0 +1,21 @@
<document source="merge_streams_parallel">
    <container cell_index="0" cell_metadata="{'execution': {'iopub.execute_input': '2024-09-19T21:44:29.809012Z', 'iopub.status.busy': '2024-09-19T21:44:29.808809Z', 'iopub.status.idle': '2024-09-19T21:44:29.978481Z', 'shell.execute_reply': '2024-09-19T21:44:29.977891Z'}}" classes="cell" exec_count="1" nb_element="cell_code">
        <container classes="cell_input" nb_element="cell_code_source">
            <literal_block language="ipython3" linenos="False" xml:space="preserve">
                from concurrent.futures import ProcessPoolExecutor

                with ProcessPoolExecutor() as executor:
                    for i in executor.map(print, [0] * 10):
                        pass
        <container classes="cell_output" nb_element="cell_code_output">
            <literal_block classes="output stream" language="myst-ansi" linenos="False" xml:space="preserve">
                0
                0
                0
                0
                0
                0
                0
                0
                0
                0