UnicodeDecodeError on Windows when stripping comments #17

Open
kiudee opened this issue Dec 8, 2020 · 0 comments
kiudee commented Dec 8, 2020

My arxiv-collector version is:

0.4.1

Debugging output:

arxiv-collector --debug main.tex
Building main...
.deps already exists...
Running ['latexmk', '-silent', '-pdf', '-deps', '-deps-out=.deps-d', 'main']
External Perl missing or outdated. Please install a recent Perl,
or configure TeX Live to always use the builtin Perl:
tlmgr conf texmf TEXLIVE_WINDOWS_TRY_EXTERNAL_PERL 0
Meanwhile, continuing with built-in Perl...

Latexmk: Run number 1 of rule 'pdflatex'
This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020/W32TeX) (preloaded format=pdflatex)
restricted \write18 enabled.
entering extended mode
Latexmk: Run number 2 of rule 'pdflatex'
This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020/W32TeX) (preloaded format=pdflatex)
restricted \write18 enabled.
entering extended mode

Dependencies in .deps-d
Gathering outputs...
Deps file .deps-d: source main, base name main, output main.pdf, jobname main
Processing c:/texlive/2020/texmf-dist/fonts/type1/public/amsfonts/cm/cmr10.pfb ...
Processing c:/texlive/2020/texmf-dist/tex/latex/base/article.cls ...
Processing c:/texlive/2020/texmf-dist/tex/latex/base/inputenc.sty ...
Processing c:/texlive/2020/texmf-dist/tex/latex/base/size10.clo ...
Processing c:/texlive/2020/texmf-dist/tex/latex/l3backend/l3backend-pdfmode.def ...
Processing c:/texlive/2020/texmf-dist/web2c/texmf.cnf ...
Processing c:/texlive/2020/texmf-var/fonts/map/pdftex/updmap/pdftex.map ...
Processing c:/texlive/2020/texmf-var/web2c/pdftex/pdflatex.fmt ...
Processing c:/texlive/2020/texmf.cnf ...
Processing main.tex ...
Traceback (most recent call last):
  File "c:\users\karlson\anaconda3\envs\arxiv\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\karlson\anaconda3\envs\arxiv\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Karlson\Anaconda3\envs\arxiv\Scripts\arxiv-collector.exe\__main__.py", line 7, in <module>
  File "c:\users\karlson\anaconda3\envs\arxiv\lib\site-packages\arxiv_collector.py", line 491, in main
    collect(
  File "c:\users\karlson\anaconda3\envs\arxiv\lib\site-packages\arxiv_collector.py", line 261, in collect
    for line in f:
  File "c:\users\karlson\anaconda3\envs\arxiv\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 93: character maps to <undefined>

This is a minimal failing example (the file is UTF-8 encoded without a BOM):

\documentclass{article}
\usepackage[utf8]{inputenc}

\begin{document}
“This is a test”
\end{document}
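
The decoding failure can likely be reproduced without LaTeX or arxiv-collector. The following is a minimal sketch, assuming only the standard library: it encodes the quoted line as UTF-8 and then decodes it as cp1252, the codec named in the traceback. The closing quote ” (U+201D) is stored as the bytes E2 80 9D, and 0x9d has no mapping in Python's cp1252 table. If the example file uses CRLF line endings (an assumption), that byte sits at offset 93, which would match the position in the error message.

```python
# The quoted line from the example, as it is stored in a UTF-8 file.
line = "“This is a test”".encode("utf-8")

# The closing quote U+201D ends in byte 0x9d.
assert line.endswith(b"\xe2\x80\x9d")

# Decoding with cp1252 (what the traceback shows Python using) fails,
# because 0x9d is unmapped in that code page.
try:
    line.decode("cp1252")
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    print(exc)

# Decoding with the encoding the file was actually written in works.
decoded = line.decode("utf-8")
```

The opening quote “ (bytes E2 80 9C) does not trigger the error, since 0x9c is mapped in cp1252; it would be silently decoded to the wrong characters instead.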

main.zip
I am compiling this on a Windows machine and it fails here:

with io.open(dep) as f, io.BytesIO() as g:
    tarinfo = tarfile.TarInfo(name=dep)
    for line in f:
        g.write(strip_comment(line).encode("utf-8"))

Judging by the error message, I assume that Python is opening the file as Windows-1252 (cp1252), since no encoding is passed to io.open.
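
One possible fix, sketched here under assumptions, is to pass an explicit encoding when opening each dependency so that the result no longer depends on the platform's default. The `strip_comment` below is a simplified, hypothetical stand-in (the real function in arxiv_collector.py may behave differently), and `collect_stripped` and its `encoding` parameter are names invented for this illustration, not part of the package.

```python
import io
import os
import re
import tempfile


def strip_comment(line):
    # Hypothetical stand-in for arxiv-collector's strip_comment:
    # remove the text after the first unescaped '%', keeping the '%' itself.
    return re.sub(r"(?<!\\)%.*", "%", line)


def collect_stripped(dep, encoding="utf-8"):
    """Return the comment-stripped contents of `dep` as UTF-8 bytes."""
    # The explicit encoding is the only change relative to the failing code.
    with io.open(dep, encoding=encoding) as f, io.BytesIO() as g:
        for line in f:
            g.write(strip_comment(line).encode("utf-8"))
        return g.getvalue()


# Usage: write a small UTF-8 file containing curly quotes, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "main.tex")
    with open(path, "wb") as fh:
        fh.write("“This is a test” % a comment\n".encode("utf-8"))
    data = collect_stripped(path)
```

With the encoding fixed to UTF-8, a source file that is genuinely stored in another encoding would still raise a UnicodeDecodeError, so exposing the encoding as an option might be worth considering.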

@kiudee kiudee changed the title UnicodeDecodeError when stripping comments UnicodeDecodeError on Windows when stripping comments Dec 8, 2020