UTF-8 characters in lstlisting breaks pdf conversion #131
Comments
Hey @rossbar, that's funny, I forgot you had raised an issue here. Thanks for the feedback. I am busy working on the myst parser at the moment, so I won't be able to look into this much in the immediate future, but it will probably end up feeding into that project 😄 @choldgraf, this is related to the conversion of source code to LaTeX, which will obviously be part of |
@rossbar if you like you could open an issue in the sphinx-notebook repo to flag this as a future item to tackle |
Thanks for the reply! This is not a high priority, especially in light of all the fantastic work being done with the ExecutableBookProject. Thanks for the suggestions @choldgraf, I don't think this is a general issue, just a limitation of LaTeX's lstlisting package. The |
Bug Report
Describe the bug
This is not necessarily an IPyPublish bug, but a limitation in the lstlisting LaTeX package causes pdf conversion to fail if unicode characters are used within an lstlisting environment. I stumbled upon this using the %timeit ipython magic in a code cell, as the output of %timeit includes unicode characters (the plus-minus sign, greek characters for second-prefixes, etc.)
To Reproduce
Steps to reproduce the behavior:
Create a file example.Rmd with the following contents
Run nbpublish -f latex_ipypublish_all.exec -pdf example.Rmd
Minimal Notebook Example
timeit_nb.ipynb.txt
Same build instructions as above (with the different filename of course). Note that this issue is downstream in the build process (at the latex -> pdf step), so it is insensitive to whether the input file is .Rmd, .ipynb, etc.
Expected Behaviour
Currently, the conversion fails with errors from pdflatex. The desired behavior is a successful build with unicode characters properly represented in lstlisting environments.
Runtime Information
(please complete the following information)
IPyPublish: 0.10.10
Python: 3.8.1
OS: Arch linux (5.5.2-arch1-1)
Pandoc: 2.8
(optional for pdf issues) texlive: 3.14159265
(optional for pdf issues) latexmk: 4.65
Additional context
The .log file provided by pdflatex is not particularly helpful, as it makes it seem as though the problem is with the utf8x or ucs packages/options. After some digging, I was able to trace the problem back to a limitation with lstlisting. A simple procedure for confirming this:
Open the converted/timeit.tex file generated by the nbpublish process
Find the lstlisting environment around the output from the code cell
Remove the lstlisting environment
Re-run pdflatex: pdflatex timeit.tex
The build will complete without errors and the output from the code cell will be properly rendered, albeit in plain LaTeX.
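As a quicker check than editing the .tex file by hand, a short script can report which lstlisting environments contain non-ASCII characters. This is only a diagnostic sketch: the function name non_ascii_in_listings is hypothetical and is not part of IPyPublish, and the regular expression assumes the environments are not nested and are written literally as \begin{lstlisting} ... \end{lstlisting}.

```python
import re

# Matches the body of each lstlisting environment (non-greedy, across lines).
# Any [options] after \begin{lstlisting} are treated as part of the body.
_LSTLISTING = re.compile(
    r"\\begin\{lstlisting\}(.*?)\\end\{lstlisting\}", re.DOTALL
)


def non_ascii_in_listings(tex_source):
    """Hypothetical helper: for each lstlisting environment that contains
    non-ASCII characters, return the sorted list of those characters."""
    report = []
    for match in _LSTLISTING.finditer(tex_source):
        chars = sorted({ch for ch in match.group(1) if ord(ch) > 127})
        if chars:
            report.append(chars)
    return report
```

To run it against the generated file, read the source with an explicit encoding, e.g. open("converted/timeit.tex", encoding="utf-8").read(), and pass the resulting string to the function. An empty list would suggest the failure lies elsewhere.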
Proposed solution
The limitations of lstlisting with respect to unicode input are documented, and there is a proposed solution in section 2.5 of the documentation. It involves including an escapeinside= parameter in the lstlisting environment to pass the handling of characters in the environment back to latex. For example, here is the original lstlisting in timeit.tex as generated by the build process:
Here is the modified version that includes escapeinside that fixes the issue:
Note that the characters that define the escaped section (*( and )* in my example) are configurable and could be specified for the entire document with \lstset.
If the proposed solution sounds workable to you, I'm happy to attempt to implement it. Some discussion would be required to hammer out details (e.g. appropriate escape characters). I wanted to create an issue first to see if there were any additional insights/ideas.
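One way the escapeinside idea might be applied during conversion is to wrap each run of non-ASCII characters in the escape delimiters before the text is written into the listing. The sketch below is an assumption about how that could look, not IPyPublish's actual code: escape_non_ascii is a hypothetical helper, and it uses the *( and )* delimiters from the example above, which would have to match the escapeinside setting on the environment or in \lstset.

```python
import re

# Hypothetical delimiters; they must match the escapeinside option given to
# lstlisting (per environment or document-wide via \lstset).
ESC_OPEN = "*("
ESC_CLOSE = ")*"

# One or more consecutive characters outside the 7-bit ASCII range.
_NON_ASCII_RUN = re.compile(r"[^\x00-\x7f]+")


def escape_non_ascii(listing_text, esc_open=ESC_OPEN, esc_close=ESC_CLOSE):
    """Wrap each run of non-ASCII characters in the escape delimiters so
    that LaTeX, rather than lstlisting, handles those characters."""
    return _NON_ASCII_RUN.sub(
        lambda m: f"{esc_open}{m.group(0)}{esc_close}", listing_text
    )
```

For instance, escape_non_ascii("1.2 µs ± 3 ns") returns "1.2 *(µ)*s *(±)* 3 ns". A caveat that would need discussion: if the listing text already contains the delimiter sequences literally, they would be misread as escapes, which is one reason the choice of escape characters matters.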
Logging