Huge rewrite, many fixes #14

Merged
merged 142 commits into from
Feb 11, 2024
Changes from 71 commits
e753358
Updated gitignore
rm-dr Oct 23, 2023
be3847b
Cleaned up docker files
rm-dr Oct 23, 2023
1ada098
Reworked build script
rm-dr Oct 23, 2023
2aa3396
Re-organized bundles
rm-dr Oct 23, 2023
806cf39
Minor path fixes
rm-dr Oct 23, 2023
ebe309e
Upgraded build script
rm-dr Oct 23, 2023
c4a8d6e
Whitespace
rm-dr Oct 23, 2023
94bdcfb
Added editorconfig
rm-dr Oct 23, 2023
d62d23e
Whitespace
rm-dr Oct 23, 2023
e2d1217
Removed extra dependency
rm-dr Oct 23, 2023
9117428
Updated README
rm-dr Oct 23, 2023
24d5d66
Cleaned up paths
rm-dr Oct 23, 2023
4ce0e9a
Updated README
rm-dr Oct 23, 2023
26f6f34
Python cleanup
rm-dr Oct 23, 2023
74f6775
Removed unnecessary file
rm-dr Oct 23, 2023
d75989f
Major refactor of make-zipfile.py
rm-dr Oct 23, 2023
7662397
Renamed bundles
rm-dr Oct 23, 2023
0923a41
Updated README.md
rm-dr Oct 23, 2023
02ea7fe
Fixed paths
rm-dr Oct 23, 2023
9ed1338
Fixed paths & added bonus checks
rm-dr Oct 23, 2023
20758b2
Comments
rm-dr Oct 23, 2023
1cb15ba
Minor cleanup
rm-dr Oct 23, 2023
edef933
Minor cleanup
rm-dr Oct 23, 2023
ae3287d
Fixed a minor bug
rm-dr Oct 23, 2023
06b4c97
Improved extra file logic
rm-dr Oct 23, 2023
3e82c15
Minor bug
rm-dr Oct 23, 2023
55a16d2
Clean up old files
rm-dr Oct 23, 2023
5fcb594
Safer rm
rm-dr Oct 23, 2023
9c4e6cf
Minor edits
rm-dr Oct 23, 2023
9df8a2d
Cleanup
rm-dr Oct 23, 2023
b1e2cf6
Renamed docker folder
rm-dr Oct 23, 2023
01a7624
Added iso hash check
rm-dr Oct 23, 2023
afc6587
Better interface
rm-dr Oct 23, 2023
410726a
Minor fixes
rm-dr Oct 23, 2023
4ec48e5
Minor fixes
rm-dr Oct 23, 2023
f0fc8a8
Whitespace
rm-dr Oct 23, 2023
084ac4d
Added output hash check
rm-dr Oct 28, 2023
f2cd109
Removed unused file
rm-dr Oct 29, 2023
4369d12
Major refactor: bugfixes and improved reproducibility
rm-dr Oct 29, 2023
de31187
Added documentation
rm-dr Oct 29, 2023
d4cbcac
Temporary fix
rm-dr Oct 29, 2023
684821d
Renamed texlive2023 bundle
rm-dr Nov 4, 2023
2e0bfde
Fixed ignore patterns
rm-dr Nov 5, 2023
9ef4356
Added `select` build step
rm-dr Nov 10, 2023
03c1d08
We can now build itar bundles without a zip bundle
rm-dr Nov 10, 2023
d4cc909
Minor logging cleanup
rm-dr Nov 18, 2023
e63cf4f
Removed docker-buildx dependency
rm-dr Nov 19, 2023
337e2b4
Removed VOLUME directives
rm-dr Nov 19, 2023
1d9b4f9
Renamed "select" to "content"
rm-dr Nov 19, 2023
9ebe38b
Minor organization
rm-dr Nov 19, 2023
faade55
Updated README
rm-dr Nov 19, 2023
4915cba
Added diff support in include/
rm-dr Nov 19, 2023
9a0e578
Better error on missing bundle dir
rm-dr Nov 19, 2023
a32a34b
Minor edits
rm-dr Nov 19, 2023
b800b31
Added fontawesome patch
rm-dr Nov 19, 2023
a76c3e8
Migrated more patches from texlive2022 bundle
rm-dr Nov 19, 2023
4b1d897
Improved content select speed
rm-dr Nov 19, 2023
5cca50b
Added final diff to tl2023 bundle
rm-dr Nov 19, 2023
bda50fd
Migrated tl2022 bundle to diffs
rm-dr Nov 19, 2023
045ac06
Updated README
rm-dr Nov 19, 2023
20972cd
Minor re-organization
rm-dr Nov 19, 2023
c000074
Updated tl2022 hash
rm-dr Nov 19, 2023
773bf57
Moved zip2tarindex
rm-dr Nov 19, 2023
279c089
Reverted needless change
rm-dr Nov 19, 2023
3433727
Minor diff adjustment
rm-dr Nov 19, 2023
855baff
Fixed patch hashing
rm-dr Nov 19, 2023
28b3e14
Minor docs
rm-dr Nov 23, 2023
c8ae511
Added shell.nix for reproducibility
rm-dr Nov 23, 2023
1f5fff5
Minor cleanup
rm-dr Nov 23, 2023
541f383
Minor docker cleanup
rm-dr Nov 23, 2023
189e633
Ignored all conflicting file
rm-dr Nov 23, 2023
70dd740
Minor README fix
rm-dr Nov 23, 2023
8e1f137
Grammar
rm-dr Nov 23, 2023
0df1944
Removed ignore patterns
rm-dr Nov 25, 2023
0a5b5c5
Removed a few patches
rm-dr Nov 25, 2023
eab8a5d
Patched fithesis
rm-dr Nov 25, 2023
5f25225
make_zip no longer flattens paths
rm-dr Nov 25, 2023
365dc5e
Added basic search paths implementation
rm-dr Nov 25, 2023
f781c7d
Documentation
rm-dr Nov 25, 2023
e4f6961
Build from tarballs instead of isos
rm-dr Nov 27, 2023
aad54b5
Updated .gitignore
rm-dr Nov 27, 2023
e4a62a1
Documentation
rm-dr Nov 27, 2023
1e45224
Updated result hash
rm-dr Nov 27, 2023
619eda8
Minor edit
rm-dr Nov 27, 2023
3bdd993
Merge changes from 'rework'
rm-dr Nov 29, 2023
2981204
Updated gitignore
rm-dr Nov 30, 2023
5950fd8
Added hash to index
rm-dr Nov 30, 2023
8e6783f
Renamed search-paths
rm-dr Dec 1, 2023
c2e3521
Rewrote file selector in Rust
rm-dr Dec 1, 2023
3d5646a
Updated bundle hash
rm-dr Dec 1, 2023
9c57a56
Path fix
rm-dr Dec 1, 2023
989ebc5
Added search path expansion
rm-dr Dec 1, 2023
4db50b0
Minor optimization
rm-dr Dec 2, 2023
3f06235
Slightly better error handling
rm-dr Dec 2, 2023
06a1b66
Added diff targets
rm-dr Dec 2, 2023
57cb4e7
Removed file-hashes debug file
rm-dr Dec 2, 2023
0938e0d
Removed file name from index
rm-dr Dec 2, 2023
f9d129a
Documentation
rm-dr Dec 2, 2023
0579083
Removed dependencies
rm-dr Dec 2, 2023
ad2e314
Updated hash
rm-dr Dec 2, 2023
da4068f
Index now has root slash
rm-dr Dec 2, 2023
7109c06
Formatting
rm-dr Dec 2, 2023
1a3def9
Improved search-report
rm-dr Dec 2, 2023
62344d7
Prettier search-report
rm-dr Dec 2, 2023
bf68ee5
Cleanup
rm-dr Dec 2, 2023
c649cc8
Added search paths
rm-dr Dec 2, 2023
96ebc1e
Added search paths
rm-dr Dec 2, 2023
9ff5971
Updated bundle hash
rm-dr Dec 2, 2023
1a09aa3
Swap hash library
rm-dr Dec 2, 2023
cbb2648
Added ttbv1 bundle format
rm-dr Dec 3, 2023
f2445f1
Minor cleanup
rm-dr Dec 3, 2023
8451811
Removed index from index
rm-dr Dec 3, 2023
3e91fcc
Removed zip and itar
rm-dr Dec 3, 2023
c1be5ca
Updated search paths
rm-dr Dec 5, 2023
3986122
Removed nix files
rm-dr Dec 5, 2023
08f5b17
Updated README
rm-dr Dec 5, 2023
77a2e17
Improved ttbv1 format
rm-dr Dec 8, 2023
38f1ed2
Added INDEX to INDEX
rm-dr Dec 8, 2023
1d19fd5
Minor sizing tweaks
rm-dr Dec 8, 2023
72bee94
Added a note to README
rm-dr Dec 9, 2023
ccd662c
Added unified index
rm-dr Dec 9, 2023
eff4480
Removed unused script
rm-dr Dec 9, 2023
5538b29
Renamed INDEX to FILELIST
rm-dr Dec 9, 2023
491a753
Added real_len fields
rm-dr Dec 11, 2023
17782c5
Comments
rm-dr Dec 11, 2023
e2efc8c
Added configurable search orders
rm-dr Dec 11, 2023
111c98a
Documentation
rm-dr Dec 11, 2023
994e237
Clean out stash
rm-dr Dec 12, 2023
9c4a7eb
Removed ignore patterns
rm-dr Nov 25, 2023
04a892f
Removed a few patches
rm-dr Nov 25, 2023
74ff195
Patched fithesis
rm-dr Nov 25, 2023
5b431ab
make_zip no longer flattens paths
rm-dr Nov 25, 2023
2d6e8ba
Added basic search paths implementation
rm-dr Nov 25, 2023
f269aa0
Documentation
rm-dr Nov 25, 2023
20e4b10
Move tests from stash
rm-dr Dec 12, 2023
f7336a1
Merged tests from 'rework'
rm-dr Dec 12, 2023
b28f4b2
Removed old bundle
rm-dr Dec 12, 2023
5c95e7c
Added test files
rm-dr Dec 12, 2023
aed833f
Added a few more test files
rm-dr Dec 13, 2023
6feacd7
Path fix
rm-dr Dec 13, 2023
bf809d6
Minor edits
rm-dr Dec 15, 2023
ce2e4c4
Fixed editorconfig
rm-dr Feb 10, 2024
18 changes: 18 additions & 0 deletions .editorconfig
@@ -0,0 +1,18 @@
# EditorConfig is awesome: https://EditorConfig.org

# top-most EditorConfig file
root = true

[*]
indent_style = tab
indent_size = 4
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = false

[*.py]
indent_style = space

[*.nix]
indent_style = space
9 changes: 7 additions & 2 deletions .gitignore
@@ -1,3 +1,8 @@
__pycache__/
/state
/zip2tarindex/target/
/scripts/zip2tarindex/target/

/build
/tests/build

*.ignore
*.iso
185 changes: 94 additions & 91 deletions README.md
@@ -1,142 +1,145 @@
# Tectonic TeXLive Bundle Builder
# Tectonic Bundle Builder

This repository contains scripts for building bundles for
[Tectonic](https://tectonic-typesetting.github.io), each of which is a complete TeX distribution.

**You do not need this repository to build Tectonic.** \
You only need these scripts if you want to make your own bundles of TeX files.

**Warning:** The scripts in `./tests` do not work yet; they still need to be reworked for the new bundle spec.





This repository contains scripts for building “bundles” for
[Tectonic](https://tectonic-typesetting.github.io) based on [Norbert Preining’s
Git mirror](http://git.texlive.info/texlive/) of [the TeXLive Subversion
repository](http://tug.org/svn/texlive/).

*You do not need this repository to build Tectonic.* You only need these scripts
if you want to make your own bundle of TeX files based on the TeXLive sources.


## Prerequisites

To use these tools, you will need:

- Bash
- Python 3.11 & Python standard packages
- GNU `patch` and `diff`
- An installation of [Docker](https://www.docker.com/).
- A checkout of the Preining TeXLive Git repository
(`git://git.texlive.info/texlive.git`), placed or symlinked in a subdirectory
named `state/repo` below this file. Be aware that this repository currently
weighs in at **40 gigabytes**!
- A Rust toolchain if you want to create “indexed tar” bundles. (So, you don’t
need Rust if you want to create a bundle and test it locally.)
- A [TeXlive iso](https://tug.org/texlive/acquire-iso.html). Different bundles need different TeXlive versions.
- A Rust toolchain if you want to create “indexed tar” bundles. You don’t
need Rust if you want to create a bundle and test it locally.

This repo also contains a `shell.nix` with pinned versions that contains all dependencies.








## Bundles:
Each directory in `./bundles` is a *bundle specification* which contains everything we need to reproducibly build a bundle.\
See [`./bundles/README.md`](./bundles/README.md) for details.

The following bundles are available:
- `texlive/2022.0r0`: directly copied from the bundle in `master`. \
Uses `texlive-2022.0r0`, and is probably broken.

Data files associated with the staging process will land in other subdirectories
of `state/`.
- `texlive2023-nopatch`: based on `texlive2023-20230313`.


## Getting started: creating the bundler image

The first step is to create a Docker container that will host most of the
computations — this promotes reproducibility and portability, regardless of what
kind of system you are using. To create this container, run:

```
./driver.sh build-image
```


## Creating TeXLive containers

The next step is to create TeXLive “containers” — which are different than
Docker containers. A *Docker* container is an encapsulated Linux machine that
provides a reproducible build environment. *TeXLive* containers are archives
containing the files associated with the various TeXLive packages.

To create TeXLive container files for all of the packages associated with your
TeXLive checkout, run:

```
./driver.sh update-containers
```
## Build Process:
Before building any bundles, acquire a [TeXlive iso](https://tug.org/texlive/acquire-iso.html) with a version that matches the bundle you want to build. `build.sh` checks the hash of this file when you run `install`.
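The hash check described above can be sketched in Python. This is only an illustration of the idea, not the code `build.sh` runs: the function names `sha256_of_file` and `iso_matches` are hypothetical, and the sketch assumes the expected hash is a hex-encoded SHA-256 digest.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks.

    Chunked reads keep memory use flat even for multi-gigabyte images.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iso_matches(path, expected_hex):
    """Compare a file's digest with an expected hex digest (hypothetical helper)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A caller would pass the path of the downloaded image and the hash recorded for the bundle, and refuse to install when `iso_matches` returns `False`.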

This will use the Docker container to generate TeXLive container files in
`state/containers`. *The results of this step will depend on what version of the
TeXLive tree you currently have checked out in `state/repo`.*
To build a bundle, run the following jobs. These **must** be run in order!

- `./build container`: builds the docker container from `./docker`
- `./build <bundle> install <iso>`: installs TeXLive to `./build/install/`
- `./build <bundle> content`: assembles all files into a bundle at `./build/output/content`.\
This will delete all bundles in `output/<bundle>/`; move them elsewhere first if you still need them.

## Creating a TeXLive installation tree
Once `./build/output/content` has been created, run any of the following commands to package the bundle:

Run:
- `./build <bundle> zip`: creates a zip bundle from the content directory.\
Zip bundles can only be used locally; they may not be hosted on the web.

```
./driver.sh make-installation bundles/tlextras
./driver.sh install-packages bundles/tlextras
```
- `./build <bundle> itar`: creates an indexed tar bundle from the content directory. \
These cannot be used locally; itar bundles must be used as web bundles. \
If you want to host your own, you'll need to put `bundle.tar` and `<bundle>.tar.sha256sum` under the same URL.

(In the future, we might add more specifications to the `bundles` directory for
creating specialized bundles. The `tlextras` bundle is the one-size-fits-all
default bundle.)
`build.sh` also provides a few shortcuts:
- `./build <bundle> all <iso>`: Runs all the above jobs, *including* a full re-install.
- `./build <bundle> most <iso>`: Runs all jobs except `container` and `install`.
- `./build <bundle> package`: Runs `zip` and `itar`. Assumes `content` has already been run.


## Updating patches

As of TeXLive 2021, we have bitten the bullet and decided to maintain some
patches against the TeXLive tree.
### Build Notes & Troubleshooting:
- The `install` job could take a while. `tail -f` its log file to watch progress.
- `install` will fail if your iso hash does not match the hash of the iso the bundle was designed for.\
This may be overridden by replacing `./build.sh <bundle> install <iso>` with `./build.sh <bundle> forceinstall <iso>`.
- The `install` job occasionally throws the following error: `mount: /iso-mount: failed to setup loop device for /iso.img.`\
Run the job again; it should work. We don't yet know why this happens.
- You do not need to run `install` every time you change a bundle. In fact, the contents of TeXlive installations should NEVER change. You only need to install each version of TeXlive once.\
If you're building multiple bundles from the same TeXLive version, you could install once then copy & rename that installation to save time. Automating this would add a bit of needless complexity to the build process, but we may implement it later.

Maintaining long-lived patches is never fun, but Git makes life a lot easier
than it could be. We use a secondary branch named `vendor-pristine` to help
maintain our patches. The way we do that is to copy the “vendor” (TeXLive
original) files into branch, then use `git merge` to update the main branch with
whatever changes have been introduced between TeXLive updates.

First, bump the version of your bundle and run the standard update steps through
the `install-packages` step described above. Make sure that the current branch
is clean with no changes in the working tree or index. Then run:

```
./driver.sh get-vendor-pristine bundles/tlextras
```

Then follow the suggested workflow as printed out by that command. The basic
plan is to commit the vendor files into the bundle’s `patched/` directory *on
the vendor-pristine* branch, then merge them back into the main branch.


## Exporting to a Zip-format bundle
## Output Files

Run:

```
./driver.sh make-zipfile bundles/tlextras
```
**`./build.sh <bundle> content` produces the following:**
- `./build/output/<bundle>/content`: contains all bundle files. This directory also contains some metadata:
- `content/INDEX`: each line of this file maps a filename in the bundle to a relative path.
- `content/SHA256SUM`: a hash of this bundle's contents.
- `content/TEXLIVE-SHA265SUM`: a hash of the TeXlive image used to build this bundle.
- `listing`: a sorted list of all files in the bundle
- `clash-report`: debug file. Lists files that share the same name, if any.
- `file-hashes`: debug file. Indexes the contents of the bundle. Used to find which files differ between two builds.
`file-hashes` and `content/SHA256SUM` are generated in roughly the same way, so the `file-hashes` files from two different bundles should match if and only if the two bundles have the same sha256sum.
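As a rough sketch of how `file-hashes` could be used to find which files differ between two builds: the layout of that file is not specified here, so the code below assumes one `<hash> <path>` pair per line (the `sha256sum` style). Both function names are hypothetical.

```python
def load_hashes(text):
    """Parse '<hash> <path>' lines into a {path: hash} dict.

    The line format is an assumption; adjust the split if the real
    file-hashes layout differs.
    """
    hashes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        digest, path = line.split(maxsplit=1)
        hashes[path] = digest
    return hashes


def differing_files(old_text, new_text):
    """Return sorted paths that were added, removed, or changed between builds."""
    old, new = load_hashes(old_text), load_hashes(new_text)
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

An empty result means the two listings describe the same contents; any returned path is a candidate for the timestamp or UUID problems that break bit-perfect rebuilds.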

This will create a large Zip-format bundle file with a name something like
`state/artifacts/tlextras-2020.0r0/tlextras-2020.0r0.zip`. Such a bundle file
can be used with the `tectonic` command-line program with the `-b` argument.

**`./build.sh <bundle> zip` produces the following:**
- `<bundle>.zip`: the main zip bundle

## Converting to an “indexed tar” bundle

This step is needed to create a bundle that will be hosted on the web. Run:

```
./driver.sh make-itar bundles/tlextras
```
**`./build.sh <bundle> itar` produces the following:**\
Note that both `<bundle>.tar` and `<bundle>.tar.index.gz` are required to host a web bundle.
- `<bundle>.tar`: the tar bundle
- `<bundle>.tar.index.gz`: the (compressed) tar index, with format `<file> <start> <len>`\
This tells us that the first byte of `<file>` is at offset `<start>`, and the last is at `<start> + <len> - 1`.\
You can extract a file from a local bundle using `dd if=file.tar ibs=1 skip=<start> count=<len>`\
Or from a web bundle with `curl -r <start>-<end> https://url.tar`, where `<end>` is `<start> + <len> - 1` (HTTP byte ranges are inclusive).
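The index lookup can also be sketched in Python. This is a minimal illustration that assumes the index is gzip-compressed UTF-8 text with one `<file> <start> <len>` entry per line and that offsets are counted in bytes, as the `dd` command implies. The names `parse_index`, `read_entry`, and `http_range_header` are hypothetical.

```python
import gzip


def parse_index(index_bytes):
    """Decompress a .tar.index.gz payload and map each file name to (start, len)."""
    entries = {}
    text = gzip.decompress(index_bytes).decode("utf-8")
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split from the right so names containing spaces stay intact.
        name, start, length = line.rsplit(" ", 2)
        entries[name] = (int(start), int(length))
    return entries


def read_entry(tar_file, start, length):
    """Read one file's bytes from an open, seekable local tar bundle."""
    tar_file.seek(start)
    return tar_file.read(length)


def http_range_header(start, length):
    """Build the Range header value for the same bytes; the end offset is inclusive."""
    return f"bytes={start}-{start + length - 1}"
```

For a local bundle, open the `.tar` in binary mode and pass it to `read_entry`; for a web bundle, send the value from `http_range_header` as the `Range` header of a GET request.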

This will create both the `.tar` and the `.tar.index.gz` files that need to be
uploaded for use as a web bundle.


## Testing

Bundle definitions come with testing information. To test a bundle, you need the
`tectonic` command-line program to be in your $PATH, as well as a Python 3
interpreter and the [toml] package.

[toml]: https://pypi.org/project/toml/

Test scripts are located in the `tests` directory. Currently available:

- `tests/classes.py`: basic compilation smoketest of the documentclasses in a bundle
- `tests/formats.py`: test generation of the format files defined in the bundle
- `tests/packages.py`: test loading of the package (style) files defined in the
bundle. There are thousands of style files in a typical bundle, so this
program uses a framework to run a random-but-reproducible subset of the tests.
See the header comment in the Python file for more information.
## Reproducibility
The `SHA256HASH` stored in each bundle should stay the same between builds. \
Below is a list of "problem files" that have made bit-perfect rebuilds difficult in the past:

- The following contain a timestamp:
- `fmtutil.cnf`
- `mf.base`
- `updmap.cfg`
- The following contain a UUID: (Most of these have a UUID *and* a timestamp)
- `357744afc7b3a35aafa10e21352f18c5.luc`
- `929f6dbc83f6d3b65dab91f1efa4aacb.luc`
- `b4a1d8ccc0c60e24e909f01c247f0a0f.luc`

#### Copyright and Licensing
Fortunately, installing TeXlive with `faketime -f` seems to pin both UUIDs and timestamps.\
The date each bundle is built on is defined in its specification.

The infrastructure scripts in this repository are licensed under the MIT
License. Their copyright is assigned to the Tectonic Project.