Working with multiple source files w/ .cb.nb #45

Open
rossbar opened this issue Apr 14, 2021 · 2 comments

rossbar commented Apr 14, 2021

I recently ran into a problem when working with multiple source files that are combined into one larger document, e.g. multiple files representing book chapters. If each file is set up to run individually with the notebook executor (i.e. .cb.nb), execution fails silently when the files are executed and combined into a single document.

Minimal reproducing example

Say you have two source files ch1.md and ch2.md that you want to execute+compile into book.pdf:

Contents of ch1.md:

# Ch. 1 - Uniform distribution

A histogram of uniformly-distributed random numbers.

```{.python .cb.nb jupyter_kernel=python3}
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()
plt.hist(rng.uniform(size=1000))
```

Contents of ch2.md:

# Ch 2. - Normal Distribution

A histogram of normally-distributed random numbers.

```{.python .cb.nb jupyter_kernel=python3}
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()
plt.hist(rng.standard_normal(size=1000))
```

Executing/converting the files individually works as expected:

$ codebraid pandoc --from markdown --to pdf ch1.md --standalone -o book.pdf

However, if you try to compile both documents into a single book, neither document is executed, though no warning or error is given on the command line:

$ codebraid pandoc --from markdown --to pdf ch1.md ch2.md --standalone -o book.pdf

In the latter case, if you look at the output book.pdf you will find an error printed:

SOURCE ERROR in "ch2.md" near line 6:
Some options are only valid for the first code chunk in a session: "jupyter_kernel"

IMO it would be helpful to the user if this error were raised at the command line rather than (or in addition to being) embedded in the output document. In my actual use-case with much larger chapters, it was a very long time before I noticed this in the output book.
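Until such an error is reported on the command line, a small pre-flight check can catch the problem before a long build. The following is a minimal sketch, not part of Codebraid: the names `find_misplaced_kernel_options` and `main` are hypothetical, and it assumes that all files end up in a single default session (it ignores named sessions and multiple languages) and that chunks use the fenced-attribute syntax shown above.

```python
import re
import sys

# Matches the opening line of a fenced chunk with Pandoc-style attributes,
# i.e. a backtick or tilde fence followed by {.python .cb.nb ...}.
CHUNK_OPEN = re.compile(r"^\s*(`{3,}|~{3,})\s*\{(?P<attrs>[^}]*)\}\s*$")


def find_misplaced_kernel_options(sources):
    """Return (name, line_number) for every Codebraid chunk that sets
    jupyter_kernel after the first chunk, treating all sources as one session.

    `sources` is an ordered list of (name, text) pairs.
    """
    problems = []
    seen_first_chunk = False
    for name, text in sources:
        for line_number, line in enumerate(text.splitlines(), start=1):
            match = CHUNK_OPEN.match(line)
            # Only chunks carrying a Codebraid class such as .cb.nb count.
            if match is None or ".cb." not in match.group("attrs"):
                continue
            if seen_first_chunk and "jupyter_kernel" in match.group("attrs"):
                problems.append((name, line_number))
            seen_first_chunk = True
    return problems


def main(paths):
    """Check the given files in order; return 1 if any problem is found."""
    sources = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            sources.append((path, handle.read()))
    problems = find_misplaced_kernel_options(sources)
    for name, line_number in problems:
        print(f"{name}:{line_number}: jupyter_kernel set after the first chunk",
              file=sys.stderr)
    return 1 if problems else 0
```

Saved as a script that ends with `sys.exit(main(sys.argv[1:]))` and given the files in build order, this yields a nonzero exit status that a Makefile or CI job can act on.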

The error in book.pdf seems to suggest that the problem lies with the "special" metadata jupyter_kernel, which is only supposed to be supplied in the first code cell. This suggests that an author would have to modify source file metadata if they wanted to switch between building individual chapters and the entire book. I hadn't noticed this mentioned in the docs before - if it's not there, then it would be an improvement if this behavior were documented.

Perhaps this can be avoided if .cb.run is used instead of .cb.nb? Is there a preferred way of using codebraid to have flexible outputs w/ multiple source files?

Owner

gpoore commented Apr 14, 2021

I need to clarify the documentation on this. By default, when you pass Pandoc multiple files, it treats them all as one. Codebraid does the same thing, so the code from multiple files is treated as all being from one file, and thus all being in one session. Hence the error about first code cell config in the wrong place.

Pandoc has a --file-scope option that treats multiple files as individuals, and then merges the results after parsing. This should cause Codebraid to do the same thing. The test files work with --file-scope. Of course, that means that you can't have shared Markdown between files (things like footnote definitions, etc.). I have an existing way to enable the effects of --file-scope for Codebraid even when it is disabled for Pandoc, but just haven't made it available to users yet...let me know if you need that.
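As an illustration of how one set of sources could serve both single-chapter and whole-book builds, here is a minimal sketch that assembles the command line from the original report and adds `--file-scope` only when several files are combined. The helper name `codebraid_command` is hypothetical; the flags are the ones already used in this thread.

```python
import shlex


def codebraid_command(sources, output, file_scope=True):
    """Build the argument list for a Codebraid/Pandoc PDF build.

    --file-scope is added only when more than one source file is given,
    so each file is parsed (and its code session started) separately.
    """
    command = ["codebraid", "pandoc", "--from", "markdown", "--to", "pdf"]
    if file_scope and len(sources) > 1:
        command.append("--file-scope")
    command += [*sources, "--standalone", "-o", output]
    return command


# Print the command for the two-chapter book from the report.
print(shlex.join(codebraid_command(["ch1.md", "ch2.md"], "book.pdf")))
# prints: codebraid pandoc --from markdown --to pdf --file-scope ch1.md ch2.md --standalone -o book.pdf
```

The resulting list can be passed to `subprocess.run` as is. Note the trade-off mentioned above: with `--file-scope`, Markdown such as footnote definitions is no longer shared between files.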

In terms of better errors: There's #24 for adding exit codes, and I'm referencing that here to remind myself to look into more extensive error messages on the command line as well.

Author

rossbar commented Apr 14, 2021

> Pandoc has a --file-scope option

Thanks, I wasn't aware of this option.

> I have an existing way to enable the effects of --file-scope for Codebraid even when it is disabled for Pandoc, but just haven't made it available to users yet...let me know if you need that.

I'm not sure yet if it's necessary - at this stage it seems there's enough flexibility to put together a sensible workflow without this feature, but I'll keep it in mind as I continue experimenting with multiple files.

> In terms of better errors: There's #24 for adding exit codes, and I'm referencing that here to remind myself to look into more extensive error messages on the command line as well.

👍
