Always use LF instead of CRLF for *.ipynb files regardless of platform and when sending code to the kernel #4576

Closed
claforte opened this issue Jan 30, 2021 · 28 comments
Labels: bug (Issue identified by VS Code Team member as probable bug), notebook-workflow (Issues that interrupt expected or desirable behavior), verified (Verification succeeded)

Comments

@claforte

Environment data

  • VS Code version: 1.53.0-insider (2021-01-29T05:13:48.533Z)
  • Jupyter Extension version (available under the Extensions sidebar): v2020.12.414227025
  • Python Extension version (available under the Extensions sidebar): 2021.2.518958983-dev
  • OS (Windows | Mac | Linux distro) and version: Windows 10
  • Python and/or Anaconda version: Python 3.8.5 using Anaconda
  • Type of virtual environment used (N/A | venv | virtualenv | conda | ...): conda
  • Jupyter server running: Local

Expected behaviour

When editing a notebook, lines ending in "\n" shouldn't be modified to "\r\n"

Actual behaviour

It appears that as soon as I modify the content of a cell, the editor replaces every "\n" with "\r\n". I don't understand why, and downstream tools (e.g. nbdev/nbparse) can then no longer parse the .ipynb properly.

Steps to reproduce:

This seems to happen for any .ipynb file on my PC. I did a git clone of nbdev (https://github.com/fastai/nbdev/blob/master/nbs/00_export.ipynb) and ran the notebook, but chances are you only need to open a file:

  1. download https://github.com/fastai/nbdev/blob/master/nbs/00_export.ipynb
  2. open the file in vscode
  3. modify a cell
  4. do a diff on the cell.
  5. notice that the newlines are messed up.
    image
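
A diff is not the only way to spot the problem in step 5. The Python sketch below lists the cells whose source contains a carriage return. The helper name `cells_with_cr` and the sample notebook are made up for illustration; the sketch assumes only that the file is JSON with a "cells" list whose "source" is a string or a list of strings.

```python
import json

def cells_with_cr(notebook_text):
    """Return the indices of cells whose source contains a carriage return."""
    nb = json.loads(notebook_text)
    hits = []
    for index, cell in enumerate(nb.get("cells", [])):
        source = cell.get("source", [])
        # "source" may be a single string or a list of strings.
        text = source if isinstance(source, str) else "".join(source)
        if "\r" in text:
            hits.append(index)
    return hits

# Hypothetical two-cell notebook: the first cell uses CRLF, the second LF.
sample = json.dumps({
    "cells": [
        {"cell_type": "code", "source": ["print(1)\r\n", "print(2)"]},
        {"cell_type": "code", "source": ["print(3)\n", "print(4)"]},
    ]
})
print(cells_with_cr(sample))
```

For a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to the helper.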

Logs

Output for Jupyter in the Output panel (View > Output, change the drop-down in the upper-right of the Output panel to Jupyter)
User belongs to experiment group 'pythonJoinMailingListVar1'
User belongs to experiment group 'jupyterTest'
User belongs to experiment group 'NativeNotebookEditor'
> D:\Users\claforte\Anaconda3\envs\fastai\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py -c "import jupyter"
> D:\Users\claforte\Anaconda3\envs\fastai\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py -c "import notebook"
> D:\Users\claforte\Anaconda3\envs\fastai\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py jupyter kernelspec --version
> D:\Users\claforte\Anaconda3\envs\fastai\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py vscode_datascience_helpers.daemon --daemon-module=vscode_datascience_helpers.jupyter_daemon -v
> D:\Users\claforte\Anaconda3\envs\fastai\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py vscode_datascience_helpers.daemon --daemon-module=vscode_datascience_helpers.jupyter_daemon -v
> D:\Users\claforte\Anaconda3\envs\fastai\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py vscode_datascience_helpers.daemon --daemon-module=vscode_datascience_helpers.jupyter_daemon -v
> D:\Users\claforte\Anaconda3\envs\tensorflow\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py -c "import ipykernel"
> D:\Users\claforte\Anaconda3\envs\fastai\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py -c "import ipykernel"
> D:\Users\claforte\Anaconda3\envs\fastai\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py vscode_datascience_helpers.daemon --daemon-module=vscode_datascience_helpers.kernel_launcher_daemon -v
Started kernel Python 3
> D:\Users\claforte\Anaconda3\envs\fastai\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py vscode_datascience_helpers.daemon --daemon-module=vscode_datascience_helpers.kernel_launcher_daemon -v
> D:\Users\claforte\Anaconda3\envs\fastai\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py vscode_datascience_helpers.daemon --daemon-module=vscode_datascience_helpers.kernel_launcher_daemon -v
> D:\Users\claforte\Anaconda3\envs\fastai\python.exe c:\Users\claforte\.vscode-insiders\extensions\ms-toolsai.jupyter-2020.12.414227025\pythonFiles\pyvsc-run-isolated.py vscode_datascience_helpers.daemon --daemon-module=vscode_datascience_helpers.kernel_launcher_daemon -v

XXX

@claforte claforte added the bug Issue identified by VS Code Team member as probable bug label Jan 30, 2021
@joyceerhl
Contributor

Hi @claforte, does toggling the following VS Code setting stop the line ending conversions? You can get to this UI with Ctrl+Shift+P to open the Command Palette, then type 'Open Settings (UI)'.

image

@joyceerhl joyceerhl added the info-needed Issue requires more information from poster label Jan 31, 2021
@claforte
Author

Hi Joyce,

It was set to auto already. I'm not sure how you want me to toggle it, but I just tried setting it to \n and modified a few code cells. It seems some code cells stay stuck on \r\n no matter what files.eol value I set, while other cells stay on \n.
image
image

@joyceerhl
Contributor

If you open a freshly cloned ipynb after setting the files.eol setting to your preferred setting, do the endings still get converted?

The default line ending on Windows is '\r\n' (this is most likely what you are getting with the 'auto' setting), so I believe that is what VSCode is applying to your files. I don't believe changing that setting results in existing '\r\n' being converted to the new '\n', so unfortunately I think you'd need to do a find and replace on the files that have already been converted.
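
The find and replace can also be scripted. The following is a minimal Python sketch, not something the extension provides: the function names are hypothetical, and `indent=1` is an assumption chosen to match the one-space nesting of the notebook excerpts in this thread.

```python
import json

def normalize_cell_sources(nb):
    """Replace CRLF (and any stray CR) with LF in every cell's source, in place."""
    def fix(text):
        return text.replace("\r\n", "\n").replace("\r", "\n")

    for cell in nb.get("cells", []):
        source = cell.get("source", [])
        if isinstance(source, str):
            cell["source"] = fix(source)
        else:
            cell["source"] = [fix(line) for line in source]
    return nb

def normalize_file(path):
    """Rewrite one .ipynb file so its cell sources use LF only."""
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)
    normalize_cell_sources(nb)
    # newline="\n" stops Python from writing CRLF file line endings on Windows.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        json.dump(nb, f, indent=1, ensure_ascii=False)
        f.write("\n")
```

Because the file is re-serialized, key order is preserved but whitespace may differ from what the original editor wrote, so the first diff after running it can be larger than the line-ending change alone.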

@claforte
Author

Hi Joyce, I tried what you suggested (set vscode-insiders eol to \n, manually replaced all \r\n by \n in notepad++, restarted vscode-insiders, modified a cell, saved) and it looks like no new \r\n appeared.

Still, it would be great if vscode-jupyter made sure it never added \r\n at the end of a code line, since Jupyter notebook, AFAIK, never uses \r\n, and a lot of tools probably assume that.

Thanks,

Christian

@joyceerhl
Contributor

Great to hear the setting change stops the conversions.

Thank you for highlighting this. From related issues like jupyter/nbconvert#1062 it does seem that Jupyter notebooks tend to default to LF even on Windows. The files.eol setting is contributed by VS Code, but perhaps one solution here would be for the Jupyter extension to override the default VS Code setting specifically for ipynb files. However, I don't believe there is currently a way to control files.eol based on file type, so this may require upstream changes in VS Code. Our team will discuss this at our weekly triage meeting.

@joyceerhl joyceerhl added upstream-vscode Blocked on upstream VS code and removed info-needed Issue requires more information from poster labels Feb 1, 2021
@joyceerhl joyceerhl changed the title each line within a cell: "\n" get replaced by "\r\n", causing problems in downstream tools Always use LF instead of CRLF for *.ipynb files regardless of platform Feb 1, 2021
@claforte
Author

claforte commented Feb 1, 2021

Thanks a lot for your responsiveness Joyce, I appreciate it!

@DonJayamanne
Contributor

However, I don't believe there is currently a way to control files.eol based on file type, so this may require upstream changes in VS Code. Our team will discuss this at our weekly triage meeting.

The Jupyter extension is responsible for writing out the contents of the file, not the user. Hence we are in full control over whether we use \n or \r\n. I don't see any need for a VS Code update here.

@joyceerhl /cc

@DonJayamanne DonJayamanne removed the upstream-vscode Blocked on upstream VS code label Feb 1, 2021
@msveshnikov

This is honestly very annoying behavior (I mean the conversion of LF to CRLF). After such a notebook is uploaded to a Jupyter Lab server, I start getting incorrect characters typed. This is a separate bug in Jupyter Lab, which unfortunately destroys your dev experience. See here:
jupyterlab/jupyterlab#2951
(the bug reproduces in the latest Jupyter Lab)
So basically, after editing a notebook in VS Code you have a destroyed notebook on any external server!

@orekides

orekides commented Jul 15, 2021

So basically, after editing a notebook in VS Code you have a destroyed notebook on any external server!

exactly true !!!

@mburghart-qualitrol

I have used several different Jupyter notebook editors. When running "jupyter notebook" on both Windows and Linux based systems, the notebook output consistently uses "\n" in the double-quoted lines of each cell, not "\r\n". The only editor that I've seen output "\r\n" is the VS Code Jupyter notebook extension. This inconsistency causes headaches when diff'ing notebook file versions stored in git repositories.

The issue is not the same as the common problem of Windows based systems insisting upon ending text file lines with the control characters \r\n while other systems just use \n. (git has features to help cope with this, of which you are probably aware.) So, in my opinion, the "File:EOL" setting in VS Code should not be used to control the line ending characters of the notebook cell double-quoted lines. I think the simple answer is to just follow the convention of "\n" to represent line end in the double-quoted lines.
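
The distinction can be shown with a short Python sketch. The cell content is made up, and the string replacement only simulates a file-level line-ending conversion of the kind git can perform; it is not a call into git itself.

```python
import json

# Hypothetical cell whose first line was saved with CRLF inside the JSON string.
cell_source = ["print('a')\r\n", "print('b')"]
serialized = json.dumps({"source": cell_source}, indent=1)

# In the file text, the cell's line ending is four ordinary characters
# (backslash, r, backslash, n), not a real carriage return.
has_real_cr = "\r" in serialized
has_escaped_crlf = "\\r\\n" in serialized
print(has_real_cr, has_escaped_crlf)

# Simulate converting the file's own line endings to CRLF and back to LF.
as_crlf_file = serialized.replace("\n", "\r\n")
back_to_lf = as_crlf_file.replace("\r\n", "\n")

# Either way, the cell content still carries its "\r\n".
print(json.loads(as_crlf_file)["source"] == cell_source)
print(json.loads(back_to_lf)["source"] == cell_source)
```

File-level normalization only touches the physical line breaks between JSON tokens, which is why it cannot repair the escaped sequences inside the double-quoted lines.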

@dynamicwebpaige

The only editor that I've seen output \r\n is the VS Code Jupyter notebook extension. This inconsistency causes headaches when diff'ing notebook file versions stored in git repositories.

Is there any way that we could prioritize resolving this bug? Destroying the diff makes it impossible to sanely understand changes to notebook files during code review -- which will be even more important as we encourage folks to use version control for their .ipynb files.

@joyceerhl
Contributor

joyceerhl commented Aug 25, 2021

Users can configure the files.eol setting as a temporary workaround, and we can consider fixing this in the builtin ipynb extension. Note also that Colab always uses LF (they just fixed this behavior) jupyterlab/jupyterlab#9465 (comment) /cc @roblourens @DonJayamanne

@DonJayamanne
Contributor

Sounds like something for debt week.
Basically we need to remove all CR from the text we get in the code cells.
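
As a rough illustration of that idea (in Python, and not the extension's actual code), the hypothetical helper below strips every CR from a cell's text and splits it into the list form that ipynb files use for "source", where each line keeps its trailing "\n" except possibly the last.

```python
def to_source_lines(text):
    """Strip all CR characters and split text into ipynb-style source lines."""
    cleaned = text.replace("\r", "")
    parts = cleaned.split("\n")
    # Every part except the last was followed by a newline in the input.
    lines = [part + "\n" for part in parts[:-1]]
    if parts[-1]:
        lines.append(parts[-1])
    return lines

print(to_source_lines("import os\r\nprint(os.name)"))
```

The cleaned text (before splitting) is also what would be sent to the kernel under this approach. Splitting on "\n" explicitly, rather than using `str.splitlines`, avoids breaking lines on other separator characters that may legitimately appear inside code.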

mrtkp9993 added a commit to mrtkp9993/QuantitaveFinanceExamplesPy that referenced this issue Sep 4, 2021
@DonJayamanne DonJayamanne changed the title Always use LF instead of CRLF for *.ipynb files regardless of platform Always use LF instead of CRLF for *.ipynb files regardless of platform and when sending code to the kernel Sep 16, 2021
@cdeil

cdeil commented Sep 17, 2021

This is causing problems for many Jupyter users: jupyterlab/jupyterlab#2951

Can you please fix this?

@jasongrout

FYI, it looks like microsoft/vscode#133762 fixed the line endings in cell sources in ipynb files.

@DonJayamanne DonJayamanne added verified Verification succeeded and removed verified Verification succeeded labels Sep 29, 2021
@rchiodo rchiodo added the verified Verification succeeded label Sep 30, 2021
@rchiodo rchiodo self-assigned this Sep 30, 2021
@rchiodo
Contributor

rchiodo commented Sep 30, 2021

/verified

Tried with multiple notebooks. Existing and new. All only have LF in them:

image

@mburghart-qualitrol

The verification doesn't appear to be testing the end-of-line representation for each line of a notebook cell. You would need to make a notebook cell with at least two lines to see this. Attached is a small notebook file with two cells, each with multiple lines. (File name extension changed to txt to allow upload.) This was created using Visual Studio Code 1.60.2 with the Jupyter extension v2021.8.2041215044. Note that the representation of all but the final line of each cell ends with "\r\n". This illustrates the undesired behavior.

{
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "This is an example notebook with a markdown cell and python code cell. Each cell\r\n",
    "has multiple lines to illustrate how end-of-line is represented in the `ipynb`\r\n",
    "file for each line in a cell."
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "print( \"Aardvarks are first line.\" )\r\n",
    "print( \"Badgers are next in line.\" )"
   ],
   "outputs": [],
   "metadata": {}
  }
 ],
 ...
}

endofline_test ipynb.txt

@rchiodo
Contributor

rchiodo commented Sep 30, 2021

@mburghart-qualitrol my screenshot was just to show that there was no \r in the file I happened to have open. It works with multiple lines in a cell too:

image

@RoyiAvital

In many cases the file itself looks like one line. Is there a way to tell VS Code to parse it correctly?

@rchiodo
Contributor

rchiodo commented Mar 22, 2022

@RoyiAvital sorry not sure what you're saying. That the json for your ipynb file has no linefeeds in it?

@RoyiAvital

RoyiAvital commented Mar 24, 2022

@rchiodo, indeed. Any way to force VS Code to save it in a multi-line manner?

@rchiodo
Contributor

rchiodo commented Mar 24, 2022

Any way to force VS Code to save it in a multi-line manner?

You mean your code cells all have a single line in them? What does the ipynb look like?

Cells usually have a format like so:

{
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "import os\n",
    "\n",
    "a,b,c = 'hihi', os, tf.__version__\n",
    "print(a, b, c)"
   ]
  },

And I believe you're saying your notebook looks like so?:

{    "cell_type": "code",   "execution_count": null,   "metadata": {},   "outputs": [],   "source": [    "import tensorflow as tf",  "import os",    "",   "a,b,c = 'hihi', os, tf.__version__",  "print(a, b, c)"   ]  },

Or does it have the \n in the source lines?

@RoyiAvital

It is a one-liner like this:

image

In VS Code, though, it looks as expected.

@rchiodo
Contributor

rchiodo commented Mar 24, 2022

Are you saying VS Code can't open that file? It looks fine to me. If it doesn't open then the JSON is likely invalid.

Or are you asking VS Code to format it so that it isn't one line? It shouldn't be saving it as one line. Although that would be an issue on VS Code itself, not the Jupyter extension.

I believe the code is here:
https://github.com/microsoft/vscode/blob/85a33a14cf694e55d85d4d3dda55a2cca1a37980/extensions/ipynb/src/notebookSerializer.ts#L97

@RoyiAvital

@rchiodo, VS Code can open and display it correctly.
It is only when looking at the raw file (using a text editor) that it is a one-liner.

I wonder about 2 things:

  1. How did it happen? Why did VS Code save it like that?
  2. Is there a way to make it save again with all lines separated?
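
On the second question, one possible workaround outside VS Code is to re-serialize the JSON with indentation. This is only a sketch in Python: the function name `reindent` and the sample notebook are made up, and `indent=1` is an assumption that mirrors the layout of the notebook excerpts earlier in this thread.

```python
import json

def reindent(notebook_text):
    """Re-serialize notebook JSON so each key and list item is on its own line."""
    nb = json.loads(notebook_text)
    return json.dumps(nb, indent=1, ensure_ascii=False) + "\n"

# Hypothetical notebook stored as a single physical line.
one_line = '{"cells": [{"cell_type": "code", "source": ["import os\\n", "print(os.name)"]}]}'
pretty = reindent(one_line)
print(pretty)
```

The cell contents are unchanged; only the whitespace between JSON tokens differs. When writing the result back to disk, opening the file with `newline="\n"` keeps the file's own line endings as LF on Windows.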

@rchiodo
Contributor

rchiodo commented Mar 25, 2022

@RoyiAvital what is your version of VS Code and the jupyter extension?

We might have at one time saved the file as a single line, but the code that saves a file now should write it out with linefeeds.

@RoyiAvital

My version is VS Code 1.65.2 with Jupyter Extension 2022.2.103. Both are the latest stable versions.

@rchiodo
Contributor

rchiodo commented Mar 25, 2022

I'm creating a new issue. This shouldn't be happening:
#9491

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 26, 2023