/
relit.tex
1353 lines (991 loc) · 135 KB
/
relit.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
\documentclass[preprint,times]{elsarticle}
\biboptions{semicolon,sort&compress}
\def\numberofauthors#1{}\def\alignauthor{} \def\affaddr#1{#1} \def\email#1{#1}
\pagenumbering{arabic}
\usepackage{balance} % to better equalize the last page
\usepackage{graphics} % for EPS, load graphicx instead
\usepackage{url} % llt: nicely formatted URLs
\usepackage[normalem]{ulem}
\usepackage{xcolor,colortbl}
\usepackage{relsize}
\usepackage{array}
\usepackage{framed}
\definecolor{shadecolor}{rgb}{0.8,0.8,0.8}
\usepackage{parskip} % set document default behaviour to 0pt paragraph indents and add small vertical skip between paragraphs
\newif\ifnew
\newtrue
\ifnew
\long\def\new#1{\textcolor{red}{#1}}
\def\newstart{\color{red}}
\def\newend{\color{black}}
\else
\def\new#1{#1}
\def\newstart{}
\def\newend{}
\fi
\newcount\softwareURLfootnote \softwareURLfootnote=0
\def\softwareURL{\url{github.com/haroldthimbleby/relit}\ifnum \softwareURLfootnote=0
\footnote{\label{urlPageLabel}Note: Elsevier may use their own GitHub URL for the ``accepted for publication'' version of our software.}\global\softwareURLfootnote=\thefootnote\else
\footnote{\emph{See footnote \the\softwareURLfootnote\ on page \pageref{urlPageLabel}.}}%
%\footnotemark[\softwareURLfootnote]
\fi}
% To make various LaTeX processors do the right thing with page size.
\def\pprw{\textwidth}
%\def\pprh{\textheight}
%\special{papersize=\pprw,\pprh}
%\setlength{\paperwidth}{\pprw}
%\setlength{\paperheight}{\pprh}
%\setlength{\pdfpagewidth}{\pprw}
%\setlength{\pdfpageheight}{\pprh}
\newdimen\figurewidth \figurewidth=\pprw
\def\slash{\tt \char'134}
\def\bracket#1{\tt \{#1\}}
\def\breakup{}
\def\plusminus{$\pm$}
\usepackage{fancyvrb}
\DeclareGraphicsRule{.tif}{png}{.png}{`convert #1 `dirname #1`/`basename #1 .tif`.png}
\usepackage{amssymb} % for rightsquiqarrow etc
\usepackage{epstopdf}
\usepackage{siunitx, booktabs, cancel, csquotes} % Allows correct typesetting of SI Units
\usepackage{url} % allows typesetting of web addresses etc.
\def\citenumber#1{\cite{#1}}
\makeindex
\def\name#1{\textbf{#1}}
\RecustomVerbatimEnvironment
{Verbatim}{Verbatim}
{fontsize=\relsize{-1},formatcom=\vskip .2cm,samepage}
%set-tag \seen{}
% because this is a rather artificial demonstration file, we restrict where we use tags
% otherwise we will end up with a lot of unnecessary files!
%set-tag none
\def\listOfFiles{}
% \global\edef\listOfFiles{\listOfFiles \noindent #1 \vskip 1pt
\newif\ifexists
\def\myinput#1{\IfFileExists{./#1}{\existstrue}{\existsfalse}\global \edef\listOfFiles{\listOfFiles \noindent {#1} \ifexists -- \else file MISSING\fi source line~\the\inputlineno\ (typeset on page \thepage)\vskip -6pt}%
\IfFileExists{./#1}{\input #1\relax\unskip}{\textcolor{red}{\bfseries *** Warning: ``#1'' does not exist. Please run make~#1 to create the files that are needed! ***}}}
\def\specialMarker{*}
%generate relit-def.tex ., /ends/-1
\iftrue % if true, then activate TeX interpreting relit \commands
\def\acmIndexer#1{#1}% e.g., ACM Transactions has page numbers of the form 1:pagenum
\makeatletter
\def\@uc#1{\uppercase{#1}}%
\def\relitindex #1 #2 {\immediate\write\@indexfile{\string\indexentry{\@uc Relit #1!#2@{\string\texttt{#2}}|acmIndexer}{\thepage}}}%
\def\manualrelitindex #1 #2 {\immediate\write\@indexfile{\string\indexentry{\@uc Relit #1!#2@{\string\texttt{#2}}|manualrelitpage}{\thepage}}}%
\def\manualrelitpage#1{\acmIndexer{}#1\specialMarker}
\expandafter\DeclareRobustCommand\csname relit\endcsname{\@ifnextchar[{\@@relit}{\@relit}}
\def\@@relit[#1]#2{%
\def\seen{{\tt\char'134}seen\{\}}%
\def\unseen{{\tt\char'134}unseen\{\}}%
{\setbox0=\hbox{\relitindex #2}{\relsize{-1}\par\noindent\tt\textcolor{red}{RELIT: #2, #1}}\vskip 0pt}%
\marginpar{\tt\textcolor{red}{\sf\bfseries\relsize{-2}$\downarrow$RELIT$\downarrow$}}%
}
\def\@relit#1{%
{{\setbox0=\hbox{\relitindex #1 #1 }\par\noindent\tt\textcolor{red}{RELIT: #1}}\vskip 0pt}%
\marginpar{\textcolor{red}{\sf\bfseries\relsize{-2}$\leftarrow$RELIT}}%
}
\makeatother
\else % TeX will ignore relit's \commands
\makeatletter
\expandafter\DeclareRobustCommand\csname relit\endcsname{\@ifnextchar[{\@@relit}{\@relit}}
\def\@@relit[#1]#2{}
\def\@relit#1{}
\def\relitindex #1 #2 {}
\def\manualrelitindex #1 #2 {}
\makeatother
\fi
%ends
\begin{document}
%\markboth{H. Thimbleby \& D. Williams}{A tool for publishing reproducible algorithms}
\title{A tool for publishing reproducible algorithms\\
\&\\
A reproducible, elegant algorithm for sequential~experiments
}
\author[swansea]{Harold Thimbleby\corref{corresponding}}
\ead{harold@thimbleby.net}
\ead[url]{harold.thimbleby.net}
\cortext[corresponding]{Corresponding author}
\author[swansea]{Dave Williams}
\ead{drdjwilliams67@gmail.com} % {davidjwilliams@doctors.org.uk}
\address[swansea]{Swansea University, Wales, SA2 8PP}
\begin{abstract}
Tools to ease the burden of reproducibility are important so computer science does not fall into the trap of ``cargo cult'' science: particularly publishing discussions of algorithms that look like algorithms but which do not work properly when they are copied from the paper.
This paper introduces a tool, called \name{relit}, which makes it very easy to write about and publish correct algorithms, yet without restricting the author's style. In fact, \name{relit} can be used with any material: mathematics, proofs, algorithms or programs. It can be used in papers, in reports and books and, with analogous advantages, in student work --- where examiners may wish to automatically check what the student claims to have written is actually correct.
To demonstrate \name{relit}, this paper presents a new, elegant algorithm for the design of sequential experiments to efficiently control bias, drift, random error, carry-over and other effects. The algorithm is written in C, in a clear style to simplify porting to other languages.
We developed \name{relit} because it was impossible to find simple reproducible code for this problem, and we wanted to do better. Thanks to \name{relit}, the published algorithm is reproducible and works \emph{exactly\/} as published in the present paper. This paper also includes discussion of the problems and opportunities of reproducibility and the essential contributions of \name{relit}-style approaches to improving the reliability of computer science publications.
\end{abstract}
\begin{keyword}
Reproducibility \sep
Publishing algorithms \sep
Literate programming \sep
Euler cycle algorithm \sep
de Bruijn sequence \sep
Combinatorics \sep
Experimental design
\end{keyword}
\maketitle
\section{Introduction}
When we needed a standard ``off the shelf'' algorithm for the design of sequential experiments, we found it was quicker to invent a new one from scratch than get published algorithms to work. It was hard to find a \emph{working\/} solution for the problem in the literature --- even when we had searched both peer reviewed publications and textbooks.
Reproducibility, or rather, lack of reproducibility is a widespread problem affecting many algorithms, especially long, complex or unusual algorithms. Often such algorithms are left as ``exercises for the reader'' or are merely analysed for their properties, rather than presented with correct code.
This paper introduces our elegant new algorithm for sequential experiments and its use and application for sequential experiment design (specifically, it generates $(N,2)$ de~Bruijn sequences). The algorithm is much briefer than theoretically faster algorithms that have been outlined in the literature though not so reproducibly described. The algorithm will be useful for experimental scientists (who can parameterise it for their specific research) --- exactly the sort of people who need working algorithms but rarely have the time or resources to invent correct algorithms from the far-too-often inadequate descriptions in the literature.
We use our reproducible presentation of the algorithm as a full and thorough case study of a new tool-based approach for easily and rigorously explaining and documenting reproducible algorithms in published papers. A key contribution of the paper, then, is our tool-based approach to reproducibility and our discussion around reproducibility.
Finally, this paper relates our new, lightweight tool-based approach to Richard Feynman's exhortation to avoid what he calls ``cargo cult science'' \cite{feynman}: when we have the right tools to help us, his challenge becomes easy. We show how our approach makes publishing correct working software easier and more likely. It helps improve the quality not just of publications but also of the underlying algorithms themselves: authors can now easily improve their algorithms and their papers in a tightly integrated, even enjoyable, process.
\subsection{Reproducible research versus reproducible publication}
Science is based on reproducibility: if an idea or theory is not reproducible, it is not science, or at least the lack of reproducibility uncovers a boundary case that refines the science. A common reason for lack of reproducibility is that a published paper does not disclose enough details or contains mistakes: what is published is inadequate for another scientist to reproduce the work. (Fraud and multiple publication are further reasons --- papers may deliberately fail to disclose enough to reproduce the work.)
Algorithms and programming more generally have the interesting property that in principle everything is reproducible: programs run on computers, and programs are text that can be fully disclosed. There has been recent increasing interest in making reproducibility a criterion for accepting papers in the field (we discuss this further in Part~III).
We further make the distinction between reproducible research and reproducible papers:
\begin{description}\raggedright
\item[Reproducible research] A paper discloses enough for a reader of the paper to reproduce the research. For example, \emph{everything\/} in this paper is available on GitHub, at \softwareURL{}
Some journals and conferences (such as the ACM Symposium on Principles of Programming Languages since 2015, \url{popl.mpi-sws.org/2015}) have started asking or requiring authors for ``artifacts,'' documents and other evidence intended to support the scientific claims made in a paper. Papers with quality artifacts are badged, so that the additional value can be recognised by the community.
\item[Reproducible paper] A paper \emph{as published\/} discloses enough for a reader to reproduce the claims of the paper, with essentially no further effort than copying the details. For example, this paper introduces a new algorithm and that algorithm is completely disclosed in this paper, in fact as \emph{source code\/} in standard programming language that can be copied directly from Part~I of the paper.
\item[Semi-reproducible paper] A paper \emph{as published\/} discloses enough for a reader to reproduce the claims of the paper, but perhaps to reproduce it exactly will involve substantial further work. For example, this paper introduces a new approach to reproducibility that is thoroughly disclosed in this paper. However none of the code to implement the approach is disclosed in the paper itself, although the \emph{idea\/} can be copied from Part~II of the paper. (As it happens, the documentation and complete source code of \name{relit} are available on GitHub, so the research is reproducible.)
\end{description}
While there is an important place for non-reproducible research (for instance in inspirational writing or in papers that expound new research challenges) a serious problem is non-reproducible research masquerading as reproducible. Such research has the potential to mislead readers: it looks like it solves a problem, but it does not. Sometimes this is innocent --- for example, an author publishes an algorithm because they are interested in its complexity, but a reader may want to run the algorithm to solve a problem, however the detail disclosed in the paper may be insufficient to run the algorithm. Sometimes this is an oversight --- for example, some details the author knows were not disclosed, but they are nevertheless critical. Sometimes this is fraud or bad practice --- for example, the author wants to achieve another publication using the additional details.
\subsection{Relit: A tool for reproducibile papers}
We built a tool \name{relit}, so called because it implements ``reverse literate programming.'' \name{Relit} helps write clearly about any algorithm or algorithms, and is intended to be particularly useful to help write reliable, reproducible papers about algorithms.
In conventional literate programming, the literate program source file drives the entire process and generates a program and structured documentation for it; in reverse literate programming the main relationship is reversed, being driven by a paper (such as the one you are currently reading) from which source code is derived. Another contrast is that conventional literate programming aims to produce a program and full internal documentation; reverse literate programming aims to produce a paper (or book or other publication) that describes algorithms or program code, at the same time providing assurance that the code is correct. Figure~\ref{fig:reversed} explains the ``reversed'' concept further, and how it turns the conventional literate programming approach around.
Here is a small but complete, self-contained example:\label{simple-example}
\begin{snugshade}\noindent
The Java \texttt{for} loop \texttt{%
%define java-for-loop .,/%end/-1
for(int i = 0; i <= 9; i++) System.out.print(i)
%end
}will print \texttt{\myinput{java-output}}.
\end{snugshade}
The reverse literate programming approach does not impose any formatting or style conventions on the author, so --- to try and emphasise the flexibility --- we typeset this example in a gray box to help make it stand out as an example. Code does not have to be syntactically complete, and can be written however best suits the author's needs; for instance, in this example, we chose to omit the final semicolon strictly required by Java. Indeed, this paper has several unrelated concrete examples in it (in several programming languages) and \name{relit} imposed no conventions or restrictions on the authors to achieve this.
Our tool \name{relit} extracted the \emph{actual\/} code shown above from the \LaTeX\ \cite{latex} source for this paper, and inserted it into a complete Java program (which also provided the hidden semicolon) that was also in the \LaTeX\ source file. The makefile \cite{make}, also defined in this paper's \LaTeX\ source file, compiled and ran the Java program, saving its output to a temporary file, which was input into the paragraph above. Hence the output of the program shown above really is the output of running the actual Java code shown. In fact, the complete source code for the Java program (as well as the makefile) is in the single \LaTeX\ source file for this paper, though the rest of it is hidden from sight beyond the end of the published paper. (It appears after \LaTeX's \verb!\end{document}!, though \name{relit} could have extracted it from a separate file just as easily.) As an option, \name{relit} can summarise all the invisible (or otherwise) code to help check that nothing critical to the paper has been accidentally hidden from the reader.
As a test, the authors edited the \texttt{for} loop and the example changed as expected; this reassured us that the example above is accurate and not broken by typesetting (e.g., by some idiosyncracy in \LaTeX). In other words, \emph{both\/} the authors of this paper \emph{and\/} therefore the readers of this paper have assurance that the code above does exactly what we say it does.
In a more critical situation than publishing code in a paper, it would be desirable to design the makefile (or other run time script) to ensure that any errors in the program result in visible warnings, perhaps no paper at all, until all errors are corrected --- in our simple example, if we introduced a syntax error in the Java code, the compiler would generate no code at all, so the \emph{last\/} successful executable file would be used instead. Running it would then misleadingly generate obsolete output from some earlier successful compilation. Tools like \name{expect} \cite{expect} can help assure that the right code is used reliability in a systematic way. However, both Java and \LaTeX\ are powerful programming tools, so an author so intent could program any effect they want, but which might accidentally end up being misleading regardless of safeguards like \name{expect}; thus it is critical that tools like \name{relit} are very simple and elegant so that the author is not tempted into trying complex, possibly unreliable, tricks to get the effects they want.
Using \name{relit}, there is no notation at all to get going, and very little notation to learn to start using features as they are needed. Even when all the features are used, everything can be done with only two simple commands that are \LaTeX\ comments, so they have no unintended interaction or side-effects on the meaning of the original document. No writing needs to be structured or re-structured to make use of it (e.g., into an outline or set of sections as in conventional literate programming \cite{litprog,noweb} and in systems like Mathematica \cite{mathematica}, \name{org-mode} \cite{org-mode}, etc).
Another important difference between conventional literate programming and reverse literate programming is that literate programming is motivated by documenting programs or algorithms (even if they end up being book length) whereas reverse literate programming is motivated by helping write reproducible papers (which are usually brief) about programs or algorithms. One focuses on the program, the other focuses on the publication.
\begin{figure*}
\begin{center}
\noindent\includegraphics[width=4.9in]{figures/reversed.pdf}
\end{center}
%generate caption-#.tex .,/%end/-1
\caption{Conventional literate programming tools work as illustrated in Figure \ref{fig:reversed}(a), which is based on a diagram in the paper on the literate programming tool \name{noweb} \cite{noweb}: as shown, the author edits a source file (e.g., \texttt{wc.nw}) and the tool generates two files: a compilable program (\texttt{wc.c}) and a stylised \LaTeX\ documentation file (\texttt{wc.tex}). Contrast this with Figure \ref{fig:reversed}(b), which shows the comparable diagram for \name{relit}. With \name{relit}, the author edits a file (now \texttt{wc.tex}) from which \name{relit} can generate many files, including \texttt{wc.c}, which is compilable. Additional files generated by \name{relit} can be used for any purpose; for instance, in the present paper one of the generated files is a makefile so generated files were compiled and executed with their outputs inserted back into this paper.}
%end
\label{fig:reversed}
\end{figure*}
We used \name{relit} throughout the present paper: the new algorithm we present in this paper works \emph{exactly\/} as shown. The discussion about the new algorithm therefore combines three roles: its intrinsic interest as a useful algorithm, the discussion about the algorithm literature, and the algorithm as a complete worked example presented reproducibly using \name{relit}. Apart from our assurances that the code shown is real, there is nothing unusual in the style or format of the present paper: while giving many advantages to authors, reverse literate programming is invisible to the reader.
It is notable that with the \name{relit} approach we advocate, an \emph{ordinary}, conventionally-written, paper gradually morphed into the present paper that now accurately generates the code it talks about. The original paper was written conventionally, but \name{relit} features were incrementally introduced to ``pull out'' the existing code embedded in the paper's \LaTeX\ source document. Additionally, in the same file, but hidden from the reader as the details are not relevant to the paper, are the the full run-time resources to make the code work. In other words, the code and program results shown in this paper has been generated \emph{from the paper itself}, and it works as described --- this is reproducible research. All other related approaches to improving reproducibility of which we are aware require either some sort of special notation or structure for files, or they generate the paper from a specification. Furthermore, most impose typographic and stylistic constraints on the papers that can be published.
\subsubsection{Relit: Basic use}
Here is an illustrative and basic development cycle using \name{relit}:
\def\contrast#1{\begin{itemize}\raggedright\item\sf #1\end{itemize}}
\begin{enumerate}
\raggedright
\item Write the initial paper \emph{as usual}. (Of course, anticipating using \name{relit} one can do better than just writing an ordinary paper.) \contrast{In conventional literate programming, one has to start with a special ``web'' file. No web files are in formats suitable for conventional publication.}
\item Edit the paper so program code snippets are preceded by simple \name{relit} commands, written as comments so they do not affect the published paper at all. \contrast{In conventional literate programming, the code has to be structured to make it possible to document. The structuring decisions are hard to change later.}
\item Optionally, somewhere that is ignored --- in another file, or beyond the \LaTeX\ \verb|\end{document}| --- add \name{relit} commands to add any support code or housekeeping that is needed. \contrast{Conventional literate programming has no way to hide code that is distracting or needs no explanation.}
\item Running \name{relit} now generates the specified files from the contents in the paper. Compile and run or otherwise test the generated files.
\item Concurrently edit the original paper to improve the code so it works better and edit the text of the paper so it more accurately describes the code. \contrast{You cannot edit the original paper in literate programming; you have to edit the structured web file that generates the paper. This is made clear in Figure~\ref{fig:reversed}.}
\item Repeat until everything works as intended. At all times, the submitted versions can be used to automatically reconstruct the algorithms to ensure they work correctly.
\item Submit to a journal or conference. \contrast{If the publishers have stylistic or formatting requirements, conventional literate programming hits a wall --- the paper will require substantial editing, which undermines one of the key advantages that the code should be correct and coherent with the description. Editing the submitted paper will introduce inconsistencies with the original documents, and hence will compromise reproducibility.}
\end{enumerate}
\subsubsection{Relit: An example of its real use}
At the last moment while writing this paper, our colleague Paul Cairns pointed out a worked example we had used explaining our algorithm erroneously confused wine regions (e.g., Chianti) for grape varieties (e.g., Pinotage and Merlot).
In the conventional approach to writing about algorithms, we would probably have edited the paper alone to fix the error \ldots\ and the paper and the programs we had written about would have diverged, or at least we would have needed to edit several files and do a lot of careful cross-checking --- if we could be bothered to do it. We might have decided it was not worth the trouble. We might have fooled ourselves: the original program works even if it has the names mixed up; we would probably have persuaded ourselves that, surely, the program \emph{used to\/} work, and then we would edit the paper alone to fix the errors. It is then but a short step to justifying to ourselves not doing the cross-checking properly because it more easily imagined than done. If we had gone down this route, the paper and the program would have been different. If we had corrected any other errors or polished --- which is always very tempting --- any aspect of the paper, potentially the published paper and the algorithm would have become very different. If we made any clerical slips editing the paper, the paper and the program could be critically different. Worse, neither we the authors nor the readers would be aware of the bugs and inconsistencies introduced by these well-intentioned ``corrections.''
Instead, because we were using \name{relit}, to fix the wine naming issues we did \emph{a very trivial\/} edit in just \emph{one\/} place in a \emph{single\/} file, that is, in one place in the paper you are reading right now. With no further editing, thanks to using \name{relit}, we then automatically had a new C program that correctly used Shiraz, \emph{and\/} the output from running the actual program was then inserted back into this paper. The example output (used in Section~\ref{winelist} below) also, of course, automatically updated to say varieties instead of regions.
Everything remained consistent with negligible effort. There was no need for any tedious or error-prone double-checking, because all the necessary edits needed were in exactly one place --- the program code that is explicit in the source text of this paper. If there is only one object, it is necessarily consistent! And, finally, everything (all the code examples and the outputs from running the code) is obviously consistent with the code published in the paper because it was generated by running exactly the code in the paper you are now reading.
In fact, doing the change was much easier than explaining how easy it was to do!
\newcount\partNo \partNo=1
\makeatletter
\def\part#1{\begin{center}\bfseries\Large Part \expandafter\@slowromancap\romannumeral \partNo @\relax\global\advance \partNo by 1\\\vskip 2ex\hrule\vskip .3ex\large #1\vskip 1ex\hrule\end{center}}
\makeatother
\part{Case study \& The problems of finding reproducible algorithms}
\section{Algorithms for sequential experiments}
In a sequential experiment, such as generation of sensor calibration curves, a number of random and systematic errors may occur. Errors can include: bias, drift of sensor readings with time, and carry-over effect where the previous reading influences the next reading. Good experimental design therefore uses a sequence of calibration values arranged in such a way the sequence will normalise and cancel out random and systematic errors. Ideally the sequence will be as short as possible in order to minimise unnecessary work and expense in the collection of data.
The problem may be illustrated by a familiar example. We might want to know which of $N$ types of wine tastes best, but, as is well known, the flavour of a wine is affected by the last wine just tasted. We therefore want to design a systematic experiment for wine tasting that tries every sequential combination of pairs of the set of $N$ wine types available in our cellar, and of course we want the shortest such cycle, because experiments with wine are expensive.
For the sake of concreteness, suppose we have $N=3$ types of wine in our cellar, specifically, say, Merlot, Pinotage and Shiraz. If we have just drunk Pinotage, then there are three possible experiments to do next: to drink Merlot next, Shiraz next, or of course to drink Pinotage again. If we drink Pinotage, then the next experiment should probably be to assess Merlot or Shiraz, since we already know what Pinotage after Pinotage tastes like.
With only three wines, the best sequence of experiments is not too hard to work out by hand, but in general with lots of wine it becomes much harder, especially if we start the wine tasting before we have finished working out the right sequence.
For simple cases the problem can be solved by hand by drawing a graph with one vertex for each of the $N$ types of experiment (e.g., testing a type of wine) with $N^2$ arrows to represent each possible sequence of two experiments. This creates a \emph{complete graph}, usually denoted $K_N$\@. Figure~\ref{completeg} (on page~\pageref{completeg}) shows the complete graph for $N=3$. (The background in graph theory is explained in Section~\ref{graphtheory} below.)
A sequence of several experiments forms a path following arrows in this graph. In our problem we want to find a shortest path that follows every arrow at least once. It is a standard graph theory result for a complete graph that a shortest such path exists and only needs to follow each arrow exactly once, and it will also end up where it started --- obviously, if it followed any arrow more than once, it would be repeating an experiment and take longer. An optimal sequence is called an \emph{Euler cycle}, after Leonard Euler who first studied paths in graphs (but not sequential experiments). Our problem therefore reduces to finding an Euler cycle algorithm.
Our problem, then, is to write a computer program to generate an efficient sequence of experiments for the general case. Fortunately, finding Euler cycles is a well-known programming problem, which has been solved many times. Unfortunately, it turns out that programming textbooks typically leave this problem as an exercise for the reader, and that does not help if your aim is to do a sequential experiment, rather than learn the hard way how to program algorithms!
The definitive algorithms textbook by Cormen, Leiserson, Rivest and Stein just says: % [p623]
\begin{quote}\sf
\textbf{Exercise 22-3} [\ldots]
Describe an $O(E)$-time algorithm to find an Euler tour of $G$ if one exists. (\emph{Hint:\/} Merge edge-disjoint cycles.)\hfill\cite{cormen}
\end{quote}
Elsewhere Cormen estimates that writing up the solutions to the book's exercises would run to between 2,000 and 3,000 pages \cite{cormen-web}: it is just not going to happen. Other books do not show answers to exercises in case students might cheat.
But worse than not providing an algorithm, in many textbooks and published papers, the Euler cycle algorithm --- if presented at all --- is described in high-level English, as a sketch, or in a simplifying pseudo-language. It is also easy to find numerous lecture notes online that describe the Euler cycle algorithm in pictures with English annotations. Aguably, this is all cargo cult science --- they look deceptively like programs, but are not.
We will return to the general problems of reproducibility in Part~III, but here we now focus on the computer science problems --- and solutions.
While English may be sufficient to explain some principles or to estimate the time complexity of the algorithm or other properties, and to do other things programmers are interested in, it is completely inadequate to convert into a working program in a real, executable, language such as C or Java. To get a working program, a lot of detail needs careful consideration: how should a graph be represented in a language with type checking, for instance? If vertices are numbered, are they numbered starting from 0 or 1? Or should you use a standard package, and then have to convert the pseudo-program into the programming conventions of that package? Unfortunately, the necessary detail to get anything to work makes explanations unwieldy. In the worst case, the detail never existed, and the presented algorithm is not even an abstraction of anything that was ever executable.
Furthermore, Euler cycle programs assume the graph is arbitrary, and there are different algorithms for directed, undirected and mixed graphs. In our case, the graph happens to be both directed and complete, and we discovered that these facts can be used to greatly simplify the problem. In fact, inventing and implementing a new algorithm for our special case took less time than correctly implementing a standard algorithm.
\newcount\linesofcode
Our complete working program \emph{exactly as described reproducibly in this paper\/} is \myinput{linesofcode.tex}
\newcount\denom \denom=\linesofcode
\ifnum \denom=0 \denom=1 % in case the file doesn't exist and denom is zero
\fi
\the\linesofcode\ lines of code; moreover, the core algorithm is only
\myinput{corelinesofcode.tex} \the\linesofcode\ lines, including comments. (The code is given in~Section~\ref{complete-program}\@.) This compares favourably to a quality reference algorithm \cite{sedgewickcode} that is \myinput{sedgewickeslinesofcode.tex}
\newcount\testcode \testcode=49 \advance\linesofcode by -\testcode
\the\linesofcode\ lines of code (plus \the\testcode\ lines of test code), though it will not work ``out of the box'' without also finding and compiling it with the various files it depends on --- the total code needed runs to
\myinput{allLinesofcode.tex}
\the\linesofcode\ lines,
\newcount\ratio \ratio=\linesofcode \divide\ratio by \denom
\newcount\tmp \tmp=\ratio \multiply\tmp by \denom
\ifnum \tmp=\linesofcode exactly\else about\fi\
\the\ratio\ times longer.\footnote{Line counts are based on the actual code extracted from this paper by our tool \name{relit} (they were calculated by the Unix utility \name{wc} on the program source files and then input in the \LaTeX\ for this paper). Line counts ignore comments and blank lines. No attempt has been made to ``squash'' code to save lines.}
This reference code also requires further work to edit the Java and local Java environment to compile it, as well as code to define a complete graph and print the desired solution (details that are already included in our algorithm). Furthermore, while Java may be a fine language, it is hard to translate algorithms written in it into other languages; whereas our code, written in basic C, is easy to translate into any language that has arrays (Fortran, PHP, JavaScript, Mathematica, Matlab, Java itself~\ldots), which will be a considerable advantage for experimenters who are not familiar with Java.
The Java reference algorithm is sophisticated and no doubt ideal for teaching purposes, but there is an evident separation of the published book text \cite{sedgewickbook} from the published program \cite{sedgewickcode}, a separation that allowed the code to develop into a sophisticated program independently of the published book. Unsurprisingly, the book omits the code altogether: it has to be downloaded from the web instead. Of course, this is entirely defensible as the code available on the web has many details that go far beyond what is needed in a book --- the code includes unit tests and examples, which would be tedious and distracting to explain in a book. Even when code from a working program is published, unknown details in the support code that is not published may be critical to getting it to work properly.
It is interesting to note the polarising force of positive feedback when writing about algorithms:
\begin{itemize}\raggedright
\item When code cannot be seen by the reader, the author (as a good programmer) is under pressure to add features so it can do anything that a reader might be anticipated to want. There is thus a natural tendency to add features, which tends to make the code more complex, which further justifies keeping it out of sight~\ldots
\item When code is brief and visible, the reader can adapt what they can see themselves: the author has no pressure to add features. When code is visible, there is a natural tendency to improve and clarify it, which further justifies keeping it visible~\ldots
\item The third possibility is that the author manages the complexity by abstracting the algorithm: this makes the book or paper concise, which conceals details the author finds irrelevant to their immediate goals.\footnote{Few papers talk about more than one of: complexity, correctness, implementation, security, usability~\ldots} Unfortunately, as ``irrelevance'' is abstracted away, the author thinks less and less about the reader's possible wider needs.
\end{itemize}
It is clear that normal writing and publishing practices combined with the intricacies of managing working programs do not align well with the goal of readers finding reproducible algorithms. The goals and priorities of publishing are not the same as the goals of science; the pressures of publishing, whether papers or books, conspire to compromise reproducibility.
\subsection{Graph theory background and applications}\label{graphtheory}
%Graph theory can be applied to our problem of experimental design to generate a sequence of experiments which is efficient and normalises the effects of random and systematic error.
Leonard Euler effectively invented graph theory in 1736 with his solution to the famous K\"onigsberg\footnote{K\"onigsberg was in Prussia and is now Kaliningrad in Russia.} bridge problem \cite{euler,wilson}: is it possible to walk a cycle across each of the seven bridges in K\"onigsberg exactly once? In modern graph terminology, a bridge is an \emph{edge\/} and the land a bridge ends on is a \emph{vertex\/}; if bridges are one-way to traffic, then the graph is a \emph{directed graph\/} as opposed to an \emph{undirected graph\/}. An \emph{Euler\/} (or \emph{Eulerian\/}) \emph{cycle\/} is a walk that traverses each edge of a graph exactly once, starting and ending at the same vertex.%\footnote{cf.~an \emph{Euler(ian) path\/} is a walk that traverses every edge of the graph exactly once, but need not start and end at the same vertex.}
The example algorithm in this paper is concerned specifically with directed graphs. A \emph{complete\/} directed graph is a graph in which each ordered pair of vertices (including repetitions) is connected by a directed edge. In other words, every pair of distinct vertices is connected by two edges, one in each direction, and each vertex has a single self-edge from itself back to itself. A complete graph of 3 vertices is shown in Figure~\ref{completeg}.
\begin{quote}\raggedright\emph{Notation}.
We use $u \rightarrow v$ for the directed edge (arrow) connecting vertex $u$ to vertex $v$, and $u \rightsquigarrow v$ for a directed path, consisting of one or more edges connecting $u$ to $v$. For example, if $u \rightarrow v, v \rightarrow w$ and $w \rightarrow x$ (also written $u \rightarrow v \rightarrow w \rightarrow x$) then $u \rightsquigarrow v, u \rightsquigarrow w, u \rightsquigarrow x$ and $v \rightsquigarrow x$, etc.
\end{quote}
A \emph{cycle\/} is a path that starts and ends at the same vertex.
A directed graph is \emph{(strongly) connected\/} if it contains a directed path $u \rightsquigarrow v$ and a directed path $v \rightsquigarrow u$ for every pair of vertices $u,v$. Since edges are paths of length 1, complete graphs are strongly connected.
A \emph{bridge} in graph theory is an edge that if it was deleted then the graph would no longer be connected.
%Graphs represent relations. A complete graph relates every vertex to every other vertex, and therefore represents an \emph{equivalence\/} relation. When every vertex in a graph is connected to itself via an edge, this represents a \emph{reflexive\/} relation; and when all pairs of adjacent vertices $u \rightarrow v$ are linked by reciprocally directed edges $v \rightarrow u$, this represents a \emph{symmetric\/} relation. If for all paths $u \rightsquigarrow v$ there is an edge $u \rightarrow v$, then the graph represents a \emph{transitive\/} relation.
A $(k,n)$ \emph{de Bruijn sequence\/} is a cyclical list of length $k^n$ which contains $k$ unique symbols, arranged so that every permutation of overlapping sublists of length $n$ occurs exactly once. For example, 0011 is a $(2,2)$ de~Bruijn sequence using 0 and 1 as the 2 symbols, and where all the length 2 sublists, namely 00, 01, 11, and 10 (wrapping around), each occur exactly once, and in this order.
%There are ${k!^{k^{n-1}}}/{k^n}$ distinct de Bruijn sequences which satisfy the above criteria.
Although the sequences were named after Dutch mathematician Nicolaas Govert de~Bruijn \cite{debruijn}, %in 1946
they had been previously described in the 19th century \cite{fleury,sainte-marie}.
In fact, the Sanskrit poet Pingala employed a $(2,3)$ de~Bruijn sequence over $1,000$ years ago to permute poetic meters and drum rhythms \cite{hall,knuth4a}.
Modern applications of de Bruijn sequences include: magic tricks, optimal strategies for opening combination locks, and cryptography (including generation of one-time pads, and the Data Encryption Standard algorithm). They have been used in gene-sequencing to construct genomes from billions of short sequences \cite{genome}. Two-dimensional de~Bruijn arrays may be used to identify the position of industrial robots on a warehouse floor, or the position of an infrared-sensing pen on specially marked paper \cite{diaconis}. Randomised de~Bruijn sequences have applications in experimental design in many areas of scientific and statistical research, and have been used in signal processing to improve the signal to noise ratio, for instance in Magnetic Resonance Imaging (MRI) scanning \cite{aguirre}.
Fisher proposed the application of randomised Latin Squares to balance and equalise sampling error and bias in the design of experiments into soil fertility \cite{fisher}.
%Similarly, a randomised $(N,2)$ de Bruijn sequence may be used to generate an optimal sequence of experimental tests which minimises unnecessary tests and balances the effects of random and systematic error.
In a similar way, if a sequence of experimental tests is prescribed by a randomised $(N,2)$ de~Bruijn sequence, every element in the series will be tested $N$ times in random order, and will be immediately preceded by every other element in the series in the set exactly once. This approach allows the most efficient means of testing multiple items, balancing and equalising the effects of systematic bias, drift, serial carry-over effects, hysteresis and random error. We ourselves used de~Bruijn sequences for experimentally evaluating automatic drug administration data from syringes \cite{newdavid}.
de~Bruijn sequences can be generated from Eulerian cycles \cite{fleury,hierholzer}. An Eulerian cycle of a complete graph with $N$ vertices follows in order every edge $u \rightarrow v \rightarrow\ldots$ exactly once for all $u$ and $v$, and this is equivalent to the definition of an $(N,2)$ de~Bruijn sequence. More generally, a \emph{de~Bruijn graph\/} is a graph whose Euler cycle generates the corresponding de~Bruijn sequence. As a special case, the complete graph is a de~Bruijn graph; Good \cite{good} gives further examples. Other approaches for generating de~Bruijn sequences include feedback shift registers and genetic algorithms \cite{turan,knuth4a}.%\cite{lempel}
\subsection{Classic Euler cycle algorithms}\label{new-algorithm}
Many deceptively simple algorithms to generate Eulerian cycles have been described. These two classic algorithms were invented well before modern computers:
\begin{itemize}\raggedright
\item \textbf{Hierholzer's algorithm} \cite{hierholzer} first finds all cycles in the graph then merges them together.
\item \textbf{Fluery's algorithm} \cite{fleury} follows a path successively choosing edges to delete at each vertex by first choosing any edge that is not a bridge, and finally choosing the bridge when there is no other choice.
\end{itemize}
However their simple descriptions in English belie their relatively complex implementations in software --- just saying ``find cycles'' and ``merge'' assumes all sorts of implementation details. Should you use depth first search to find cycles, and what are the exact details for identifying bridges (especially when deleting edges changes the bridges)?
This common but deceptive simplicity of many published algorithms is a problem we also noted in our previous discussion of the Chinese Postman Tour \cite{cpp}.
\subsection{A new Euler cycle algorithm for complete directed graphs}
\def\pair#1#2{\ifx #1#2
#1\rightarrow #2 %{\rcirclearrowleft} %
\else #1\rightarrow#2\rightarrow#1\fi}
Inspired by Hierholzer's algorithm, we make five observations for an Euler Cycle algorithm for complete graphs:
\begin{enumerate}\raggedright
\item
Instead of using an algorithm to find cycles, since the graph is complete we \emph{already\/} know a decomposition into cycles: namely, for every vertex $u$ there is a cycle of length 1, $\pair uu$, and for every pair of vertices $u,v$ $(u \neq v)$ there is a cycle of length 2, $\pair uv$. We call these \emph{trivial cycles}.
\item
We use a recursive algorithm to merge cycles. It starts walking any cycle, and if it crosses another cycle (that has not already been walked) it recursively walks that cycle, then resumes the cycle it was walking. This approach does not need any explicit operations or data structures to merge cycles.
\item
Once a cycle has been walked, it is not walked again. The algorithm therefore starts by initialising every trivial cycle as unwalked, and as it walks a cycle it marks it as walked, and hence will not walk it again.
\item
Marking trivial cycles can be represented by a Boolean matrix as follows: $\mbox{\it walked}_{uv}$ is true if the cycle $\pair uv$ has been marked as walked (if $u=v$ then the cycle $\pair uu$ has been walked). We call this representation of a graph a \emph{cycle matrix}.
\end{enumerate}
\begin{figure*}
\begin{center}\large
\newdimen \wid
\begin{tabular}{@{}c@{}c@{}}
\wid=\figurewidth \divide \wid by 5 \multiply \wid by 3
\setbox0=\hbox{\includegraphics[width=\wid]{figures/K3.pdf}}
\setbox0=\hbox{\hskip -1cm\lower 1.25in\copy0}
\dp0=1cm
\copy0&\multicolumn{1}{c}{\raise 1cm\hbox{$
{}\hskip 0cm\left(\begin{array}{ccc}
1 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 1 \\
\end{array}\right)
$}}\\
&$
{}\hskip -2.2cm\left(\begin{array}{ccc}
\pair AA & \cdots & \cdots \\
\pair BA & \pair BB & \cdots \\
\pair CA & ~~~~\pair CB ~~~~& \pair CC \\
\end{array}\right)
$
\end{tabular}
\vskip .2ex
%generate caption-#.tex .,/%end/-1
\caption{Representations of the complete directed graph $K_3$, where there is an edge $u\rightarrow v$ for each directed pair of the $3$ vertices $u,v \in \{A,B,C\}$. As can be seen (left), the graph is composed from $6$ trivial cycles: 3 single-arrow cycles (e.g., $\pair AA$), and 3 double-arrow cycles (e.g., $\pair AB$). The corresponding cycle matrix is shown top right. The cycles represented by it are shown explicitly in the larger matrix (bottom right), which has one entry shown explicitly for each trivial cycle. The omitted entries in the matrix are implied by symmetry, as, for example, the top right would be $\pair AC$ but this is the same cycle as $\pair CA$, which is shown bottom left.}
%end
\label{completeg}
\end{center}
\end{figure*}
Hence we suggest the following Euler cycle algorithm. Using C, vertices are numbered $0$ to $N-1$.
% note that we use generate rather than define below. Relit copes - the namespace is the same for generate and define! But this allows
% us to run wc on the file generated so we can say how long the code is
\label{normal-version}
%generate define-non-randomised-cycle /Verbatim/+1, /Verbatim/-1
\vbox{\begin{Verbatim}
void cycle(int u, int v) // walk a cycle from u to u via v
{ if( !walked[u][v] )
{ // we will have walked the cycle u to u via v
walked[u][v] = walked[v][u] = 1; // keep cycle matrix symmetric
recordEdge(u, v); // record walking edge from u to v
for( int w = 0; w < N; w++ )
cycle(v, w); // walk any unwalked cycles from v
if( v != u ) // get back from v to u if not already at u
recordEdge(v, u);
}
}
\end{Verbatim}
}
As shown in Figure~\ref{completeg}, the cycle matrix is symmetric, and might therefore be represented more compactly as a triangular matrix. However, the algorithm implements \texttt{walked} as a square symmetric matrix (i.e., $\mbox{\texttt{walked[u][v]}}=\mbox{\texttt{walked[v][u]}}$ is invariant) as this simplifies the implementation. It is possible to use a more sophisticated data structure than a matrix to avoid the \texttt{for}-loop and to avoid calling \texttt{cycle} when it will immediately return: such an optimisation would not change the algorithm, but would significantly obscure its implementation.
The function \texttt{recordEdge} can be any way of recording the next edge along the Euler cycle; just printing it is easiest:
%define define-basic-recordEdge /Verbatim/+1,/Verbatim/-1
\begin{Verbatim}
void recordEdge(int u, int v) // print an edge walked
{ (void) printf("%d --> %d\n", u, v);
}
\end{Verbatim}
The initial call of \texttt{cycle} to generate a solution is now simply:
%define main-body /Verbatim/+1,/Verbatim/-1
\begin{Verbatim}
cycle(0, 0); // Euler cycle starting at 0 returning to 0
\end{Verbatim}
which generates an Euler cycle that starts and ends at 0, with $\pair 00$ as its first edge (because \texttt{cycle} starts off by going via 0, the second parameter). We discuss alternatives in the next section, below, to randomise the cycle and to avoid always starting with the same edge.
Finally, the \texttt{walked} cycle matrix needs declaring and initialising:
%define declare-initialise /initialise/, /end{Verbatim/-1
%define declare-walked .+1, .+1
\begin{Verbatim}
// represent cycles of a complete graph with N vertices
int walked[N][N];
// initialisation; all cycles initially unwalked
void initialise()
{ for( int u = 0; u < N; u++ )
for( int v = 0; v < N; v++ )
walked[u][v] = 0;
}
\end{Verbatim}
In C, arrays may be initialised to zero when declared, so explicit initialisation is not strictly required, but we provided explicit initialisation here in case the code needs to be re-entrant or is to be translated into another language. Of course, in many languages \texttt{True} and \texttt{False} will be used instead of C's convention of 1 and 0.
\subsection{Randomised sequences}\label{randomise}
Performing a sequence of experiments in random order controls the effects of drift and random error, and adding the constraint that every calibration value in the set must be preceded exactly once by every other calibration value in the set normalises any carry-over effects. Karl Popper called such sequences ``shortest random-like sequences'' \cite{popper}. Therefore, for experimental design, we need to modify our algorithm so that the search for an unwalked cycle to recursively follow is randomised. The easiest way to do this is to use a random permutation in \texttt{cycle}'s \texttt{for}-loop, as follows:
\label{randomised-version}
%define define-randomised-cycle /Verbatim/+1,/Verbatim/-1
\begin{Verbatim}
void cycle(int u, int v) // follow a cycle from u to u via v
{ if( !walked[u][v] )
{ // keep the cycle matrix symmetric
walked[u][v] = walked[v][u] = 1;
recordEdge(u, v); // record edge from u to v
for( int w = 0; w < N; w++ )
// cycles from v via a randomly permuted vertex
cycle(v, permutation[v][w]);
// get back if not already at u
if( v != u ) recordEdge(v, u);
}
}
\end{Verbatim}
There are $N$ independent random permutations (indexed by \texttt{v}) to avoid dependencies between between vertices: \texttt{w} is mapped by \texttt{permutation[v][w]} to a random value in the range $0$ to $N-1$.
%Unfortunately, only the outgoing edge from $v$ is randomised: if $\pair uv$ $(u \neq v)$ is the first entrance to $v$, then $\pair vu$ will be the last exit from $v$. For most applications, this is at most a technical problem; in particular, because an Euler cycle is a cycle, it can be rotated and started at any point, and this trick can be used to change the ``last'' exits for any vertices.
The Knuth-Fisher-Yates shuffle \cite[p145--146]{knuth2} is a standard way to initialise such a permutation matrix:\label{shuffle}
%define declare-permutation /int/, /int/
%define initialise-permutation /for/, /end/-1
\begin{Verbatim}
int permutation[N][N];
...
for( int u = 0; u < N; u++ )
{ for( int v = 0; v < N; v++ )
{ int randomv = randInt(v+1);
permutation[u][v] = permutation[u][randomv];
permutation[u][randomv] = v;
}
}
\end{Verbatim}
To avoid always starting an Euler cycle from the same vertex and always starting with the same trivial cycle, the original base call \texttt{cycle(0, 0)} must be randomised too:
%define randomised-main-body /Verbatim/+1,/Verbatim/-1
\begin{Verbatim}
cycle(randInt(N), randInt(N));
\end{Verbatim}
Since the Knuth-Fisher-Yates shuffle is carefully designed to generate a uniform distribution of permutations, the choices made in the modified \texttt{cycle} will be uniformly random. %Therefore the series of randomised walks generated by the above code will be uniform samples of all possible Eulerian cycles.
The function call \texttt{randInt(N)} above returns a uniformly distributed integer from $0$ to $N-1$ inclusive; it is not standard C, but can be implemented using the standard \texttt{rand()} function which produces a pseudo-random integer \texttt{0}..\texttt{RAND\_MAX}.
%define define-randInt /Verbatim/+1,/Verbatim/-1
\begin{Verbatim}
// return random integer 0..n-1 inclusive
int randInt(int n)
{ return floor(n*((double)rand())/((double)RAND_MAX));
}
\end{Verbatim}
This function assumes $n <\!\!<\mbox{\tt RAND\_MAX}\approx 2^{31}$ \cite[p119]{knuth2}, and it is called in a context where $n\leq N$ (where $N$ is the order of the graph).
The length of an Euler cycle is $N^2$, and it is implausible that (even robotic) experimenters have the time to perform sequential experiment runs approaching $N^2=2^{62}$ (i.e., over $10^{18}$) steps, especially as randomisation is only relevant when repeated sequential experiments are run. Hence the assumption is readily met in all realistic applications.\footnote{An alternative would be to use the function \texttt{arc4random\_uniform()} available in some implementations of C\@.}
Finally, to ensure different random sequences each time the program is run, the random number generator must be seeded differently during initialisation. This may be done in C from the current time:
%define initialise-Random /Verbatim/+1,/Verbatim/-1
\begin{Verbatim}
time_t t;
srand((unsigned) time(&t));
\end{Verbatim}
As is good practice, the seed for the random number is stored in a variable, so if any bug is found, a debugger can recover the value of the seed to repeat exactly the same sequence of random numbers for a subsequent test run.
\subsection{Solving the wine tasting problem}\label{winetasting}
The basic algorithm, described above, uses integers as the names of vertices. For many applications it may be more appropriate to use strings. We can define vertex name strings:\footnote{Not shown here, but our wine sequence generating program, which is generated automatically from this paper using our tool \name{relit} (see Section~\ref{relit-intro}), checks at run time that \texttt{N} correctly matches the number of wines declared. (It is a shame that this cannot be performed with a static check in C.)}
%define wine-names /Verbatim/+1, /Verbatim/-1
\begin{Verbatim}
// assuming N = 3
// (C permits N > 3, which will make the next line not fully initialise the array!)
char *wines[N] = { "Merlot", "Pinotage", "Shiraz" };
\end{Verbatim}
and then convert the edge recording by changing \texttt{recordEdge} to print the names of vertices rather than their numbers:
%define define-wine-recordEdge /Verbatim/+1,/Verbatim/-1
\begin{Verbatim}
void recordEdge(int u, int v)
{ (void) printf("%s $\\rightarrow$\n", wines[u]);
}
\end{Verbatim}
\label{winelist}
Note how we used \LaTeX's \texttt{\char'134rightarrow} to generate a ``$\rightarrow$'' symbol to make the experiment sequence look a bit neater when typeset by \LaTeX\@.
Finally, we need to finish with a final explicit \texttt{printf("\%s\char'134n", wines[0])}, because printing the second vertex was otherwise lost in changing the \verb|recordEdge| to only print the first vertex rather than pairs of vertices.
With these changes, the code generates the following sequence: \myinput{winelist.tex}\unskip. (This actual sequence was typeset here by \LaTeX\ inputting a file generated by running the C program, which itself was generated automatically by \name{relit} from the code explicitly published in this paper.)
It is easy to confirm by hand that this is indeed an Euler cycle (perhaps by mapping wines to the letters $A,B,C$ and ticking off the edges in Figure \ref{completeg}). Note that, as an Eulerian cycle ends at the starting vertex, an additional drink of \myinput{lastwine.tex}\unskip\ (in this worked example)\footnote{The wine experiment sequence was generated automatically by running the randomised version of the code shown in this paper; it was run and the results saved to a file \texttt{winelist.tex}, the last line of which was extracted (several times) for this paragraph by using Unix's \texttt{tail -n 1 winelist.tex}. Each run of the program will generate a different random example for this paper.} is required: thus a balanced scientific experiment would repeat sequential experiments with randomly selected starting vertices each time, using the ideas of Section~\ref{randomise}\@. Randomisation avoids the potential bias caused by drinking too much \myinput{lastwine.tex}\unskip\ --- in the sequence above \myinput{lastwine.tex}\unskip\ happens to be drunk both first and last, and hence once more than any other wine. Randomisation also avoids possible conscious or unconscious experimenter bias caused by the researcher choosing, say, to always start or finish with their favourite wine. After enough randomised sequential experiments, these biases will be evened out.
%An Euler cycle of a complete graph gives us all pairs or 2-tuples of vertices. We note that the generalisation to all sequences of $k$ varieties of wine requires $k$-tipples.
Wine buffs will be tempted to go to a city like K\"onigsberg, drink some wine, cross a bridge, consider the wine on the other side of the river, cross another bridge, and so on --- and recording the results. However, trying it for larger $N$ may result in a new meaning for ``drunken walk.''
\part{Reproducible algorithms}
\section{Finding reproducible algorithms}\label{new-approach}
While we continue to take it for granted that algorithms are described vaguely --- in English, in pseudo-code, with illustrative fragments of uncompilable code, or left as exercises for the reader --- unnecessary errors often persist in their real-life implementations. (Some of the problems are reviewed in \cite{jmlr,heedless}.) Often, of course, publications are not always aiming to describe the algorithm as such, but to do something else --- like teach students, analyse complexity, prove some theorems, discuss how to optimise them, and so on. This ``dual use'' creates things that look very much \emph{like\/} algorithms but which cannot be reliably used in practice \emph{as\/} algorithms; the likely confusion calls to mind Feynman's critique of cargo cult science~\cite{feynman}. In fact Feynman was devastatingly critical of work that was not reproducible because it wasted everyone's time who tried to continue the work; Mlodinow describes an incident where Feynman comments on lack of reproducibility due to fraudulence~\cite{mlod}.
Ironically, then, in one of the very few areas --- programming --- where we could be completely explicit about our work (e.g., if an algorithm works, it is text that can surely be published explicitly) there is a default culture of vagueness and ``abstraction'' that undermines scientific reproducibility if not progress. English, pseudo-code, fragments, exercises~\ldots\ none are scientific statements: they are irrefutable~\cite{popper}.
Often software practice and experience emphasises the efficiency of running programs, but we should also be concerned with the end-to-end time to develop a program, confirm it is correctly implemented, and to run it. In many cases, the development time dominates the run time. Unlike the bulk of the algorithms textbook literature, our goal was to have a reliable program quicker, not a faster program later. As we shall argue, literate programming and its variants are powerful approaches to be more scientific when publishing algorithms, and hence to end up with more reliable programs working in the wider world. In this paper, we developed a new variant of literate programming for this very reason.
Programs that can be compiled and run involve details that are generally not relevant to discussions of algorithms. Furthermore, writing a paper or book is a human process, so transcription errors may creep into any algorithms presented.
There is even the danger that publications may be negligent: sometimes, authors do not check their published code adequately and referees take the correctness of the code on faith (partly because it is too hard to reconstruct the code from the paper, and too hard to disentangle whether problems are due to the published code or the referee's own errors in the reconstruction of it). ``Negligent'' is a harsh word, but it covers a wide range of common problems ranging from deliberate fraud, unintentional exaggeration, accidental uncorrected errors, and well-intentioned aspirational comments --- like, the program would \emph{obviously\/} work like this (even if it does not quite work yet). Even trivial and excusable errors, like typos and spelling errors, undermine the reproducibility of program code.
All this means that finding an algorithm in the literature that can be used to solve a real problem is fraught with difficulties. In our case, having developed an algorithm to solve our problem --- because it is a very common problem for experimenters who may not have sufficient programming expertise --- we wanted to take care that what we published (i.e., this paper) would be both clear and able to be copied from the paper as real, executable code without problem. We wanted to make it readily --- and reliably --- available.
At the start, we developed our algorithm first in Mathematica. Mathematica is excellent for presenting the \emph{results\/} of programs in papers, as it allows documentation, code and results to be interleaved in a polished typeset document where Mathematica itself manages all the editing. There is a \emph{Mathematica Journal\/} full of papers constructed with it. However, Mathematica it is not good for writing papers \emph{about\/} algorithms, as it requires code to be runnable exactly as and in the order as written; although you can hide complete blocks of code from the reader, the constraints on expression are onerous (for example, you cannot show an interesting line from within a block of otherwise hidden code). In a paper you generally want to discuss fragments or lines of an algorithm in the order that suits the exposition --- you do not want to be constrained by the declaration or block syntax of the implementation language. Notwithstanding these concerns with Mathematica, we were excited by our algorithm and ported it to C (which is simpler, better known and non-proprietary). We then started to write in \LaTeX\@. \LaTeX\ \cite{latex} of course is a very flexible typesetting system, accepted by many journals. We were aiming to publish and share our findings as widely as possible. Finally, we ended up with the present paper.
As we wrote, we reviewed and revised this paper. We naturally made many changes to the original Mathematica program code. For example, we had initially used variables \texttt{i}, \texttt{j}\ldots\ in the program, but for writing the paper we decided to use \texttt{u}, \texttt{v}\ldots\ as these are conventional names for graph vertices.\footnote{Unfortunately wine names starting with u, v, w, are not commonly recognisable.} So, over time, the original code and its description in the manuscript drifted far apart; yet, in principle, it is essential that the manuscript was synchronised with the source code so that it described the source code without error. Ideally, the paper should be automatically changing to reflect updates to the source code --- but we were doing it by hand.
In a word, we did not want to be sloppy, yet our initial approach to writing the paper was making life hard. It was tempting to take an easy approach, and only describe our algorithm in words or pseudo-code, being a bit vague about the details. Conventionally, neither the readers of the paper nor we, the authors, would worry about slight discrepancies, because they would be invisible and unknown.
An obvious way to help ensure that executable code in a manuscript is correct is to cut and paste the relevant code in the paper to construct a program. This is very awkward if the code is spread out in the paper. Sometimes \LaTeX\ commands will mean that the typeset code will differ from the intended correct source code. For many reasons, then, the code in the paper is not complete and usually has to be edited to make it work. In short, a ``cut and paste'' approach is not reliable.
But why do something by hand repeatedly when you can design a tool to do it with far more generality and reliability? Once a tool is written to do this chore automatically, the authors are freed from worrying about maintaining and checking the code in the document as it is repeatedly edited and revised: it should be done automatically. Reproducible authoring then becomes easier and more reliable. A win win.
This idea is very similar to Knuth's \emph{literate programming}~\cite{litprog} which combines source code with explanatory documentation; however the format of the documentation generated by literate programming is not suitable for a journal manuscript, as it introduces named sections and other paraphenalia --- which may be very convenient for programming, but is quite arbitrary notation to impose on a paper intended for a general readership. However, \name{relit} retains the section naming convention (using names like \texttt{<this>}), but by design none of them would normally appear in a paper. (Such names only occur in the present paper due to our explaining \name{relit}; there are none and none needed in Part I of this paper where we explain the example algorithm.)
Literate programming has the disadvantage for us (and for many authors) that the author has to start with the literate program, and in this instance we had already drafted the manuscript as a \LaTeX\ document. \name{Warp\/} is a type of literate programming that extracts code from a normal program commented in XML, thus avoiding the separate processing that literate programming normally requires to generate the executable program~\cite{warp}. Our earlier paper on the related Chinese Postman Tour \cite{cpp} used \name{warp\/} to present accurate Java code.\footnote{The Chinese Postman finds the shortest cycle in a weighted graph that is not necessarily Eulerian (i.e., some edges may need to be walked more than once); it is a non-trivial generalisation of the problem discussed in this paper.} \name{Loom\/} \cite{loom} (originally written by Janet Incerpi and Robert Sedgewick for their classic algorithms book \cite{sedgeoriginalbook}) is another approach, similar to \name{warp\/}, but allows the use of Unix filters to perform arbitrary transformations of code (e.g., to handle special symbols) that is then inserted into arbitrary~documents.
In the present paper, however, we had already been working on the code \emph{in\/} the paper, in the usual informal way. To avoid this becoming increasingly sloppy (or, conversely, a huge burden to manage), we therefore developed a novel ``reversed'' literate programming approach: using it, the code is exactly as written in this paper (i.e., what you are now reading) \emph{and\/} it can be automatically extracted to generate a program that is directly executable. Readers of this paper can be assured the code is reproducible, and if they wish they can start with the paper, use our tool, and automatically extract all the code themselves and run it or develop their own programs directly from it.
The point is: we know that the code shown in this paper works, and moreover, we have a lightweight, fully automatic process that goes directly from this paper to executable code. What you now see as published may not be all of the code,\footnote{There is exactly one line missing from the main section of the published paper --- as our tool can tell you; see Section \ref{invisible-code}.} but the code shown \emph{does work\/} as shown.
%there is nothing in the approach that ties it to programs; the method can be used to generate any type of text from a document.)
Figure \ref{fig:lp} compares the main forms of literate programming, including our new approach. There are of course many other related approaches, ranging from the very simple such as our reverse literate programming to the highly sophisticated, such as Pre\TeX\ \cite{pretex} that are designed for large complex projects and have commensurate learning curves: for a review see~\cite{warp} and~\cite{wiki} for up-to-date summaries of available tools.
\begin{figure*}
\begin{center}
\noindent\includegraphics[width=4.9in]{figures/literateProgramming.pdf}
\end{center}
%generate caption-#.tex .,/%end/-1
\caption{Comparing various ways to write about programming. Note that in the conventional approach (a) there is no guarantee that the published paper faithfully represents the source code, as the paper and source code can be (and will be) edited independently: what is published has no automatic connection to the source code. (Although not made clear in the diagram, typically source code and \LaTeX\ documents will be split into multiple files for convenience. In principle, all methods can handle multiple files.)}
%end
\label{fig:lp}
\end{figure*}
\subsection{Reverse literate programming --- the details}\label{relit-intro}
Code in \LaTeX\ documents is often written in ``snippets'' in the following conventional style:
%define demo /verbatim/+1,/verbatim/-1
\begin{Verbatim}[commandchars=\\\{\}]
\slash{begin}\bracket{verbatim}
printf("example code");
\slash{end}\bracket{verbatim}
\end{Verbatim}
When typeset, this will appear as ``{\tt \myinput{nameDemo.c}}'' somewhere in the typeset document --- in fact, as it does in this very sentence.
However, the code the reader can see above might be --- and usually would be --- just text manually written in the original \LaTeX\ document: it may have errors, it may not be real program code, and it may not have been checked by compiling it --- it may have been manually copied and pasted from some program, and probably errors will have been introduced while trying to follow \LaTeX\ conventions (such as handling the \texttt{\%} and \verb|\| characters, which have different meanings in \LaTeX\ and in program code).
To assure code is correct, we need to be able to generate a file from the code snippets like this, then the file and the code it is made up from can be checked. Unfortunately, the snippets may be scattered in any order throughout the paper, and perhaps the author does not want to burden the reader with all the details of the entire program so often additional code will also be required to make a compilable program.
In our approach, we name snippets, and then use our tool \name{relit} to collect the named snippets to assemble them into one or more source code files that can then be compiled and run in the usual way. We allow snippets to be hidden from the reader --- for instance, placing them beyond the ``end'' of the document, that is placing them after the \LaTeX's \verb|\end{document}| command which signals the end of the typeset document. \name{Relit} instructions can also be placed in separate files that are not processed by \LaTeX\ at all. Hiding snippets allows the \LaTeX\ source file to define \emph{all\/} the code required, even when not all of it is relevant to the reader (for example, the reader does not always need to see standard declarations).
A \name{relit} name is defined by preceding any part of the \LaTeX\ document with a special comment:
% the %{} below stops the literate programming thing recognising %{}define!
\begin{Verbatim}[commandchars=\\\{\}]
%\breakup{}define \emph{name} /\emph{start pattern}/\plusminus\emph{offset}, /\emph{end pattern}/\plusminus\emph{offset}
\end{Verbatim}
The pattern style (general regular expressions are permitted) is deliberately reminiscent of the form used by the Unix utility \name{ed}, and the entire line (starting with the standard \LaTeX\ comment symbol \texttt{\%}) is a \LaTeX\ comment so it is ignored completely in the typesetting of the document. The effect is that \texttt{\emph{name}} is defined to be the text in the file over the specified range of lines. For example, the \LaTeX\ for the code snippet above was preceded by a definition of the name \texttt{demo} as follows:
\manualrelitindex define demo
\vskip .2cm
{\renewcommand{\baselinestretch}{0.9}
\tt\relsize{-1}
\noindent\%\breakup{}define demo /verbatim/+1, /verbatim/-1\\
\slash{begin}\bracket{verbatim}\\
\myinput{nameDemo.c} \\
\noindent\slash{end}\bracket{verbatim}
\vskip .2cm}
Here, the \texttt{\%{}define} command defines the name \texttt{demo} to be all the text --- as it happens, only one line in this case --- between the line after one containing \texttt{verbatim} (which happens to be the line \verb|\begin{verbatim}|) upto the line before the next line containing \texttt{verbatim} (which happens to be the line \verb|\end{verbatim}|).
To create this example, for it to be typeset as shown above, we used \name{relit} to generate a file, and then that file was input (several times) into the paper to show the actual value of \texttt{<demo>} in each place where the text was needed. Therefore, the original text actually occurs only once in the source file, and if it is edited there, \emph{all\/} of the examples and discussion get automatically updated. \name{Relit} thus ensures that all the examples are consistent; if we decide at a later date to improve anything there is only one place to edit --- and it is impossible to forget to update each example, as updating happens automatically. Of course, repeatedly using exactly the same code into a paper may be unusual, but the point is \name{relit} has no problem, and in fact it was helpful for the present paper to have the flexibility to be able to do so.
\name{Relit} generates source files by writing file definitions analogously to name definitions, such as:
\begin{Verbatim}[commandchars=\\\{\}]
%\breakup{}generate \emph{filename} ., /%end/-1
\emph{text} \ldots
%end
\end{Verbatim}
However, to help assemble snippets together, within \texttt{\emph{text}} any occurrence of \texttt{<\emph{name}>} is replaced by its definition, and so on recursively. For example,
\vbox{\manualrelitindex generate hello.c
\begin{Verbatim}
%generate hello.c ., /%end/-1
int main(int argc, const char *argv[])
{ <demo>
return 0;
}
%end
\end{Verbatim}
}
\label{printf-section-warning}
will generate a file called \texttt{hello.c} that should compile and do whatever \texttt{<demo>} does. In fact, with our example, the C compiler will complain ``{\tt include the header <stdio.h> or explicitly provide a declaration for printf}'' because in fact \texttt{hello.c} does not provide the needed declaration for \texttt{printf}. That is the point: the approach enables us to write a paper \emph{and\/} easily check whether the code we are writing is valid. Note that \name{relit} can also generate makefiles \cite{make} as well, so the entire code, compiling and testing protocols and indeed arbitrary transformations to generated files can be conveniently combined into a single \LaTeX\ source document and maintained in one place.
Illustrating \name{relit}'s commands out of context (i.e., not showing the surrounding English descriptions of the code), as above, does not explicitly show the powerful leverage the approach provides for reflection during the authoring process. The description and explanation of code embeds the code in the same place; it is thus easy to improve the description \emph{and\/} simultaneously improve the code, and conversely. The author can think in either language, and slip in and out of either description without losing track of their thoughts.
As a matter of fact, we realised we needed to add some explicit code to the paper only after we had implemented \name{relit} and found some oversights in our original description. Reverse literate programming helped us find the problems: various declarations and a bit of wider context were needed to make the algorithm described in this paper properly compile and run without error. The following code generation, creating a file \texttt{euler.c}, works correctly once \name{relit} substitutes the values for the various names, which have been defined elsewhere in this paper:\footnote{The code shown here \emph{is\/} the actual code that \name{relit} used --- we just placed the \name{relit} \texttt{\%} comment inside the \LaTeX\ \texttt{verbatim} environment so it could be seen.}
\manualrelitindex generate euler.c
\begin{Verbatim}
%set-tag \seen{}
%generate euler.c ., /%end/-1
#define N 3 // for a graph with N vertices
<common-declarations>
<declare-walked>
<define-basic-recordEdge>
<define-non-randomised-cycle>
int main(int argc, const char *argv[])
{ <main-body>
return 0;
}
%end
\end{Verbatim}
%set-tag none
This code was written exactly as shown, and it is easy for the authors to check it actually works: with \name{relit} it generates the file \texttt{euler.c} which can be compiled and run, and will get the expected results. This implies the code \verb|<main-body>| and so on recursively also compiles and works. Using \name{make} \cite{make} with a makefile (also generated by \name{relit} from the same document) made it very easy to automatically update the code files and unit test them repeatedly as the publication evolved.\footnote{By default \name{relit} only updates files when their content changes, which makes using \name{make} efficient, as most edits to a document do not update any files generated from it.}
Of course it is unlikely that a normal paper (that is, one not explaining \name{relit}) would present the code shown explicitly above: usually, the \texttt{\%{}generate} command would have appeared after \LaTeX's \verb|\end{document}| so it would disappear and not be typeset as part of the published paper. Similarly, any details that need not be part of a typical published paper can be hidden. As we will discuss in Section~\ref{invisible-code}, \name{relit} can tell the author what code is visible and what code has been hidden to avoid any uncertainty.
Notice that \name{relit} allows the generated program code to be put in the correct order required for a compiler, but the code visible in the paper can be presented to the reader in whatever order is best for the narrative of the publication. For example, although \verb|<main-body>| appears near the end of the code, it might be explained to the reader before showing increasing levels of detail of the implementation.
Name definitions can occur freely before or after they are used. For example, in this paper the name \verb|<common-declarations>| (used above) is defined later, after the end of the \LaTeX\ document: the headers are a detail we feel readers of the present paper do not need to see written out in full, but they are necessary to be able to generate executable code. There are no restrictions on the order of generating files: they, too, can be specified in any order that best suits the needs of the author.
This complete freedom of order distinguishes \name{relit} from other approaches such as used in Jupyter \cite{jupyter} and Mathematica \cite{mathematica}, which both impose an order on the document to suit the compiler rather than the reader or author.\footnote{Mathematica is interesting in that Mathematica notebooks are first class Mathematica expressions, and therefore can be processed in arbitrary ways by the notebook itself. Thimbleby \cite{ansim} gives an example of a published paper generated by Mathematica deleting the author's explicit Mathematica in the notebook to leave just the final paper without the underlying program it is based on.} \name{Org-mode} \cite{org-mode} is an EMACS editing environment that manages stuctured documents, like notebooks, that can generate other files, including code and \LaTeX\ documents. Indeed, all of these approaches use special or essentially proprietary tools to edit and maintain the documents (whereas \name{relit} does not). In addition, these and similar tools typically require units of code to be syntactically complete, which is not required by \name{relit}, as made clear with the simple opening example in Section~\ref{simple-example} and with the more complex Knuth-Fisher-Yates shuffle example in Section~\ref{shuffle}.
As well as in the document, as shown above, names can also be defined in \name{relit}'s command line parameters, so any textual information can be imported when the tool is run --- such as a version number, the date, or even output from Unix tools such as \name{expect} \cite{expect} to refer to unit test diagnostics.
Our tool \name{relit} provides all the normal checks, such as reporting if names are defined and never used, used recursively, multiply defined, etc. Indeed, some authors have used deliberate ``errors'' as a way of providing metacomments: defining an unused name results in its name and value being reported to the user --- so the text is highlighted, which can be used as a reminder to fix an issue with the document.
\name{Relit} is a Unix tool, written as a short C program, \myinput{linesofrelit.tex}\ lines long and, together with this paper and documentation, is available from \softwareURL{}
\subsection{Generating any text, not just program code}
In general, authors of papers may want any text generated, not just program source code; for example, we showed the text generated by running our algorithm on a selection of wines in Section~\ref{winetasting}. In general the generated text may require further processing or testing.
There are many ways to do this: \name{loom} for example, generalises \LaTeX's \texttt{\char'134input} command to allow arbitrary processing, but this has the disadvantage that the approach requires an intermediate \LaTeX\ file to be generated. Instead, our simple approach is to use \texttt{\%{}generate} to create a makefile or any shell script: then any processing whatsoever can be performed, and of course it will typically generate files that are then included in the \LaTeX\ paper. Indeed, this is how we generated the sample list of wine tasting --- using the makefile generated by \name{relit}, Unix's \name{make} then generated a C source program \texttt{wine.c} from the paper you are reading, which was then compiled and run (in the same run of \name{make}), obtaining results saved to a file. Finally, that file was read in at the appropriate point when this paper was typeset (in Section \ref{winelist}), using \texttt{\char'134input} to read in the program's saved output to insert it into the paragraph where it was needed.
Of course, files created by \name{relit} programming can be processed by other tools in arbitrary ways, for example stream editors to decode \LaTeX\ typesetting conventions (e.g., in normal \LaTeX, the common programming symbol \texttt{\&} has to be written \texttt{\char'134\&}).
\subsection{Checking hidden material --- using tags}\label{invisible-code}
\def\seen{\color{black}\ttfamily\fontseries{l}\selectfont\global\def\endline{}}
\definecolor{highlight-color}{rgb}{1,0.2,0.2}
\def\hidden{\color{highlight-color}\ttfamily\fontseries{b}\selectfont\emph{$\ast\ast\ast$ hidden code $\ast\ast\ast$}}
\def\unseen{\color{highlight-color}\ttfamily\fontseries{b}\selectfont\global\def\endline{\hfill~~\hidden}}
The authors of a paper should be able to check that the explicit code shown in their paper is exactly what they want the reader to see and that nothing critical has been omitted. On the other hand, a published paper should conceal details that are distracting or irrelevant to its core message: it need not show all of the code a runnable program requires, but just enough to get the idea across. The tension between these, being explicit and being concise, is a recipe for error: what is hidden, by definition, cannot be seen, yet some omitted information may be required for achieving a complete program. Worse, the reader of the paper may not be certain what is missing, and they may not have the skills or time needed to reconstruct it correctly. Compounding the problem is the so-called ``curse of knowledge'' \cite{pinker}: the authors of the paper have privileged knowledge (in principle they know everything about what they are talking about) and they may therefore be unaware that some things have not been explicitly mentioned in the paper --- it is very hard to distinguish between what they know in general about what they are writing and what they think they know (perhaps inaccurately) is in the paper.
Tools like \name{warp} and \name{loom} help the authors ensure that code published has been obtained directly from the working programs, but it is still possible to write code in the paper that has never been tested or compiled, and also to leave a lot of essential context in the program that \name{warp} or \name{loom} do not draw into the document. Figure \ref{fig:lp2} summarises the challenges.
\begin{figure*}
\def\ast{\hskip 0.1pt{\lower .5ex\hbox{\large$\star$}\vphantom{\Large y}}}
\begin{center}\sf\small
\newdimen\cola \setbox0=\hbox{programming} \cola=\wd0 \advance\cola by .25em
\newdimen\colc \colc=\textwidth \advance\colc by -\cola \advance \colc by -3em
\noindent\begin{tabular}{|p{\cola}|p{\colc}|} \hline
\bfseries Relit & All source code and published paper in the \LaTeX\ source file or files. Code may be hidden, but only with deliberate effort. Intended for peer reviewed publications. Diagnostics for both missing and defined but unused code, etc. \\ \hline
\bfseries \hbox{Loom} and Warp&
Some code is visible, but most is hidden in separate files. Good for publications.\\ \hline
\bfseries Literate \hbox{programming}&
No code is hidden and all the code and document is in one place. Ideal for internal documentation rather than journal publications. \\ \hline
\bfseries Conventional approach& No connection between published paper and program source code;
there is no link between the source code and the published paper. Errors and inconsistencies are easy to introduce; no diagnostics possible. \\ \hline
\end{tabular}\end{center}
%generate caption-#.tex .,/%end/-1
\caption{What the author writes and what the reader sees should be closely aligned, ideally with the reader able to deduce full working code from the published paper. However, setup and other code is often hidden from the reader, and probably held in files separate from the published paper. The more code that is hidden, the more likely it will drift into complexity; critical details may be accidentally concealed from the reader. Conventional literate programming hides nothing, but typically makes the result too large to be convenient as a journal paper. \name{Loom} and \name{Warp} help ensure code published is correct, but it may be incomplete. The present paper's approach, \name{relit}, ensures that what the authors may choose not to publish can easily be checked with simple diagnostics (see Section~\ref{complete-program}) --- and what is not published remains in the original \LaTeX\ files, so remaining visible at least to the author.}
%end
\label{fig:lp2}
\end{figure*}
Unfortunately, in reverse literate programming, generated code can include names that are defined anywhere in the source documents, and the defined values themselves may or may not be visible in the published paper, and so on recursively. It is therefore impractical to manually determine what code is visible in the published paper and what, if any, is not visible.
Our tool, \name{relit}, allows code to be tagged, and although the tags can be used for any purpose, a useful application is to keep track of what code is visible and what is not. The approach is simple; the \texttt{define} and \texttt{generate} commands can be followed by optional tags:
\def\FancyVerbFormatLine#1{\%#1}
\begin{Verbatim}[commandchars=\\\{\}]
define \emph{name} \emph{start}, \emph{end} [, \emph{tag}]
generate \emph{filename} \emph{start}, \emph{end} [, \emph{tag}]
\end{Verbatim}
\def\FancyVerbFormatLine#1{#1}
For every file generated, \name{relit} additionally generates a duplicate file but marked up with the tags, with the relevant tags output as each \texttt{\emph{name}} (or \texttt{\emph{filename}}) is expanded. The tags are arbitrary text as supplied by the author, but will typically be \LaTeX\ macro names that can be defined to highlight text in any way that the author chooses. Of course if the tags include names like \texttt{<\emph{stuff}>} then they will be expanded as normal (the names can be defined anywhere, generally after the end of the document). This feature is useful if the tag is complicated (e.g., writing \texttt{<tag>} is easier and more reliable than writing out a tag in full every time it is needed) or if the author wants a tag to have many lines of text.
With tagging, we can readily obtain typeset text showing where code has come from.
However, since tagging each \texttt{\emph{name}} definition is a bit tedious --- and therefore itself error-prone --- a default tag can be defined to apply to all future \name{relit} commands:
\def\FancyVerbFormatLine#1{\%#1}
\begin{Verbatim}[commandchars=\\\{\}]
set-tag \emph{tag}
\end{Verbatim}
\def\FancyVerbFormatLine#1{#1}
That \texttt{\emph{tag}} is then automatically applied to all subsequent definitions (and files) until it is superseded, or overridden by explicit tags. Hence, typically a \LaTeX\ document will start:
\begin{Verbatim}[commandchars=\\\{\}]
\ldots
\slash{begin}\bracket{document}
%\breakup{}set-tag \slash{}seen\bracket{}
\ldots
\emph{published document including visible definitions}
\end{Verbatim}
and then have the following at its end:
\begin{Verbatim}[commandchars=\\\{\}]
\ldots
\slash{end}\bracket{document}
%\breakup{}set-tag \slash{}unseen\bracket{}
\ldots
\emph{hidden definitions and files}
\end{Verbatim}
Here, the \LaTeX\ code \texttt{\char'134end\{document\}} signals the end of the published \LaTeX\ document, and all subsequent text will be hidden from view in the published document. Normally this part of a \LaTeX\ document is empty, or has accumulated ``junk'' text and thoughts the authors cannot steal themselves to \emph{really\/} delete --- the space has a useful role in co-authored documents, where one author wants to delete text from the published document, but does not want another author to lose some idea without a chance to reconsider it before it is deleted. In our case, with reverse literate programming, the hidden space can also be used for defining program code that is needed for compiling and testing, but is considered too much detail for visible inclusion in the published document. Although there is an advantage having everything in one file to simplify editing, if desired, \name{relit} allows the code to be placed in separate files if ``hiding'' it beyond the end of the \LaTeX\ seems contrived.
The illustrative tags \texttt{\char'134seen\{\}} and \texttt{\char'134unseen\{\}} used above are arbitrary; one might choose to use \texttt{\char'134color\{black\}} and \texttt{\char'134color\{red\}} instead, say. In the complete example shown below (having defined \texttt{\char'134unseen\{\}} appropriately) the ``{\hidden}'' marker is provided automatically and therefore correctly.\footnote{Correctness here depends on the author not cheating! \LaTeX\ is programmable, so a determined author could defeat the reliability of the tagging mechanism if they were so inclined.}
The code shown next, below, is one of the example algorithms already discussed in this paper: the highlighted text generated by \name{relit} reveals code that was not previously shown in the paper but which was generated for the compiled test programs. In other words, the highlighted code is included below, but it as used elsewhere in this paper that line was hidden from sight from readers (by being placed after \texttt{\char'134end\{document\}}). To be clear, the code shown below was generated by \name{relit}, including the highlighting, and it was then inserted into this \LaTeX\ document --- essentially, the program code was typeset by writing \texttt{\char'134input\{{\em file}\}}.
{\label{complete-program}\tt\vskip .3cm \setlength{\leftmargini}{0em} % set the left margin to zero
\begin{verse}\relsize{-2}
\myinput{demo-tagged-euler.txt}
\end{verse}
}
The authors of this paper are happy that these standard declarations are not taking up space in the published paper --- in any case, if a reader of this paper faithfully copies the published code and omits these lines, good compilers will provide error messages that help solve their problems (as we discussed above, in Section~\ref{printf-section-warning}). Alternatively, a reader of this paper can download the source code of this paper, run \name{relit} on it, and then they will have all the executable files for all of the examples.
To summarise: using tags lets an author assure themselves that readers of their paper have the right details to reproduce the algorithm as exactly as they wish --- balancing detail with verbosity and pedantry. As shown above, highlighted in the diagnostic output from \name{relit}, the present paper fails to disclose one line of code. However we, the authors, consider this omission an unnecessary, obvious and distracting implementation detail --- in fact, we know if this line was omitted by a reader, their C compiler would help them correct it, so it is not a serious problem.
Apart from the original literate programming approach that tells the reader everything, which may be overwhelming, we know of no other approach that gives the benefits of concise algorithm publication combined with such strong assurances of reproducibility.
\subsection{A more flexible \TeX-mode}
As described above, \name{relit} uses \texttt{\%} to introduce commands, like \texttt{\%{}generate} and \texttt{\%{}define}; the advantage of this approach is that \LaTeX\ completely ignores \name{relit} commands because they are in the form of \LaTeX\ comments. However, in addition, \name{relit} provides an integrated ``\TeX-mode'' that makes the syntax it uses real \TeX\ or \LaTeX\ code; this is more sophisticated, requires a little \TeX\ programming, but has some significant advantages that some authors may appreciate. In \TeX-mode, \name{relit} becomes user-programmable, which of course is impossible with the \texttt{\%}-comment form of commands.
In \TeX-mode, the author defines \name{relit} commands to do arbitrary typesetting (as well as continuing supporting \name{relit}), such as printing where they are used in the document to help prepare the paper or keep track of how \name{relit} is being used, or an index of \name{relit} names can be constructed, which would help in large projects. An example (from this paper) is shown in Figure~\ref{fig:index}.
% Figure moved here:
\newbox\helpcaption
{\def\thefootnote{\lower 1ex\hbox{\specialMarker}}
\global\setbox\helpcaption=\hbox{\footnotemark\relax}
\begin{figure*}
\renewenvironment{theindex}{\begin{list}{\relax}{}\item \hskip0pt\item \hskip 0pt \vdots\vskip .15cm}{\item \hskip0pt \vdots\end{list}}
\begin{verse}\small
\def\subitem#1{\\\hskip .5in{#1}}
\myinput{relit.ind}
\end{verse}
%generate caption-#.tex .,/%end/-1
\caption{An extract from a simple \name{relit} \TeX-mode index, as made for this paper. (It is possible page numbers are incorrect in this printing, depending on how the publisher handles \LaTeX\ indexing.) Only names and files explicitly shown in this paper have been included (i.e., definitions where the normal \name{relit} process has not shown names to the reader, along with all hidden material after the {\tt \char'134end\{document\}}, have been excluded).\copy\helpcaption}
%end
\label{fig:index}
\end{figure*}
\footnotetext{This paper, which is both a paper explaining an algorithm in the usual way and a paper explaining \name{relit} in detail, is complicated by unusually having to write some \name{relit} commands \emph{inside\/} \LaTeX\ \texttt{verbatim} environments, so the reader can see them but where simultaneous indexing is impossible. (In normal use, \name{relit} commands would be outside \texttt{verbatim} and similar commands and hence they would be processed by \LaTeX\ and would normally be invisible to the reader.) Thus, index entries marked~* in Figure~\ref{fig:index} were provided by using duplicated \name{relit} commands written just outside the \texttt{verbatim} environments so \LaTeX\ could evaluate them to make the starred index entries as shown.}
}
The \TeX-mode syntax corresponds directly to the original syntax, and is summarised in Figure~\ref{fig:syntax}.
% Figure moved here:
\begin{figure*}
\begin{center}\tt\small\begin{tabular}{ll}
\rm\bf Normal mode&\rm\bf \TeX-mode \\ \hline
\% de{}fine \emph{name} $\mbox{\tt \emph{re}}_1$, $\mbox{\tt \emph{re}}_2$ & \char'134relit\{define \emph{name} $\mbox{\tt \emph{re}}_1$, $\mbox{\tt \emph{re}}_2$\} \\
\% de{}fine \emph{name} $\mbox{\tt \emph{re}}_1$, $\mbox{\tt \emph{re}}_2$, \emph{tag} & \char'134relit[\emph{tag}]\{define \emph{name} $\mbox{\tt \emph{re}}_1$, $\mbox{\tt \emph{re}}_2$\} \\
\% gen{}erate \emph{name} $\mbox{\tt \emph{re}}_1$, $\mbox{\tt \emph{re}}_2$ & \char'134relit\{generate \emph{name} $\mbox{\tt \emph{re}}_1$, $\mbox{\tt \emph{re}}_2$\} \\
\% gener{}ate \emph{name} $\mbox{\tt \emph{re}}_1$, $\mbox{\tt \emph{re}}_2$, \emph{tag} & \char'134relit[\emph{tag}]\{generate \emph{name} $\mbox{\tt \emph{re}}_1$, $\mbox{\tt \emph{re}}_2$\} \\
\% se{}t-tag \emph{tag} & \char'134relit\{set-tag \emph{tag}\} \\
\% \emph{any comments permitted\/} & \char'134relit\{ends\} \\
\end{tabular}\end{center}
%generate caption-#.tex .,/%end/-1
\caption{Corresponding normal and \TeX-mode syntax for \name{relit} commands. The final case in the table allows \texttt{\char'134relit\{ends\}} to mark the end of definitions (anything convenient, such as text in a comment can also be used).}
%end
\label{fig:syntax}
\end{figure*}
Here is an example of use the \TeX-mode so the normally invisible \name{relit} commands used to save text have been made visible by defining {\tt \char'134relit} appropriately in \LaTeX\@. Defining \texttt{{\char'134}relit} like this can be useful when reviewing a document.
\relit{generate TeX-mode-demo.tex /This/, /relit.ends/-1}
This is a simple example that generates a file \texttt{TeX-mode-demo.tex}, which will contain the original \LaTeX\ source for this paragraph after \name{relit} is run.
\relit{ends}
%This example is admittedly brief, contrived to make how it works clear. In normal use, the text would probably be further processed elsewhere, rather than recycled inside \LaTeX\ immediately! However, to show it works, here is the generated file straight-forwardly input and recycled back into this paper followed with an automatically generated line giving the date from a Unix command just to suggest that, if desired, far more powerful processing is also possible. Of course, throughout this paper we have generated many files (such as C code) which is processed further but elsewhere we have not burdened the paper with this sort of diagnostic information which is easy to generate as needed.
For completness, the generated text (input from the file \name{relit} generated) is shown below inside a \LaTeX\ quote environment:
\begin{quote}\small\sf
\myinput{TeX-mode-demo.tex}
\end{quote}
%The \texttt{{\char'134}relit} definitions used in this paper are readily available: in the same way that \name{relit} is run on this paper to generate compilable programs, an additional \texttt{\%{}generate} command simply saved the actual definition of \texttt{{\char'134}relit} used to a file so it can be reused in other documents.
\name{Relit} warns if \TeX-mode and non-\TeX-mode commands are mixed in the same files: the point of \TeX-mode is that authors can use \TeX\ to keep track of \emph{all\/} \name{relit} commands, but using the \texttt{\%} \name{relit} notation as well (which of course \TeX\ completely ignores) would undermine reliable tracking.
The reason that \name{relit} makes \TeX-mode an option, rather than being the only method of use, is that the feature introduces a conceptual layer of complexity the standard approach does not require. Using \TeX-mode requires not just defining some \TeX\ commands (a matter of copying them from somewhere that already works) but also understanding how to control \TeX's syntax with regular expressions, which is harder as you can no longer use {\tt \char'134}, {\tt \%}, \{, \}, etc, in the same way. If you make mistakes in \TeX\-mode strange things can happen that are hard to debug because \TeX/\LaTeX\ are complex; in contrast if mistakes are made in the normal \texttt{\%}-mode, nothing strange happens in \LaTeX\ (because \name{relit} commands are ignored by \LaTeX) and \name{relit} reports any errors.
\subsection{Name subscripts}\label{subscript-section}
\name{Relit} allows authors to name and combine snippets of code, but surprisingly often the author has trouble thinking of names. For example, the publisher Elsevier wants to have a list of figure captions, and if \name{relit} is used to collect the figure captions, then we either need the author to invent many names and keep track of them, or we need a way of subscripting names --- then the author only needs to invent one name for each class of things (such as figure captions) that they want to keep track of.
Introducing programming into \name{relit} would make it more complex than seems warranted, but subscripts are very useful even with no arbitrary arithmetic allowed on them.
The approach \name{relit} takes is that any name can have a \texttt{\#} symbol in it, and this represents the subscript. Each time a name is used with a \texttt{\#}, the subscript for that name increments, having started at 1. Hence,
\begin{Verbatim}[commandchars=\\\{\}]
%\breakup{}generate fig#-caption.txt ...
%\breakup{}generate fig#-caption.txt ...
%\breakup{}generate fig#-caption.txt ...
\end{Verbatim}
will generate three files called, respectively, fig1-caption.txt, fig2-caption.txt and fig3-caption.txt. If the author inserts another similar \texttt{\%{}generate}, say between Figures 1 and 2, then the caption subscripting simply introduces a new numbered name in the sequence. The author does not have to keep inventing and keeping track of any new names --- they can just copy-and-paste one subscripted name.
Using subscripted names in \texttt{\%{}generate} is thus quite straight forward. When subscripted names are used in \texttt{\%{}define} commands, the effect is identical: a subscripted name is defined, and the subscript increments each time it is redefined. Somewhere, unlike with file names, the author will of course want to use these defined names. To use a name, the author writes \texttt{<name1>}, \texttt{<name12>}, or whatever, as usual. \name{Relit} will report errors if incorrect subscripts are used, if a defined subscript is not used, and if an explicitly subscripted name is defined (like \texttt{name4} when \texttt{name\#} has also been defined and has counted, in this case, to 4).
In fact, in \name{relit} names and file names work completely identically (they are in the same namespace), except that \texttt{\%{}generate} creates a file as a side-effect as well as defining the name. Thus, if an author generates a file, the file name can also be used as an ordinary \name{relit} name, so that its value can be used anywhere else in the file, for instance to generate a set of related files. (This feature is used in the present paper: by generating such ``nested'' files, we were easily able to run Unix's \name{wc} on snippets of code within larger files.) The way subscripts work is also identical, as described above.
\section{Further work on relit}
The experience of developing and using a tool drives new ideas, creating a tension between polishing the legacy or generalising and extending the scope and reach of the ideas. Here are several key potential developments:
\begin{enumerate}\raggedright
\item
The syntax of \name{relit} could be improved. For example, instead of defining names or generating files, \name{relit} commands could be considered more generally as specifying arbitrary text from an expression, which is then processed. Like \name{loom}, commands could pipe their output to files or other Unix processes. This would generalise \name{relit}. At present, the same effect can be achieved, albeit after one extra step, by using makefiles and other external processes filtering the files that are generated --- but this is ``bad practice'' as it involves creating an arbitrary namespace, specifically the names of the temporary files.
\item
An unnecessary restriction in the current version of \name{relit} is that regular expressions specify \emph{lines\/} of text rather than strings. If an author wants to write code like \verb!\texttt{sin(x)}! then we currently cannot pull out the value ``\texttt{sin(x)}'' unless the \texttt{sin(x)} is written on a line on its own, which is an awkward convention and introduces unnecessary blanks in the author's text.
\item
\name{Relit}'s syntax is fixed, and must be adapted if it is to be applied to international languages other than English or to programming languages other than \LaTeX\@. At present, \name{relit} is open source and available for any such improvements, and adapting its (currently built-in) syntax would be trivial as the parser is uses a standard regular expression engine.
\item
\name{Relit} is an example of a ``parallel language'': a language grafted into an existing language, in this case, \name{relit}'s simple commands are squeezed into \LaTeX. Parallel languages are very common, but all of them are \emph{ad hoc\/} and are compromises. Think of HTML, CSS, PHP and JavaScript that have developed conventions so that they can co-exist; or Java and JavaDoc; C and its \texttt{\#define} macro-language which defies C's own syntax; or XML and its metadata; even \LaTeX\ and \TeX\@. Conversely, JSON has no parallel language. More research is needed on parallel languages, so that when new languages are introduced they can be extended in the future without unnecessary and perhaps fragile compromises.
\item
\name{Relit} could be extended with its own parallel languages. \name{Relit} allows \LaTeX\ and code to be interleaved, but an author may want additional information or annotations, such as program specifications, test cases, invariants, contracts, author names, versioning information, etc: these might be neither program code the author wants the reader to see, nor program code that should be compiled.
\item
Although \name{relit} readily allows a single \LaTeX\ document to generate many programs (e.g., in the present paper we generated a Euler cycle and a randomised Euler cycle), it does not provide any built-in way to keep track of the evolution of code. Of course, existing tools can do this; for example, running \name{diff} on the generated programs from this paper shows several lines are different, for instance that the two comments below are different:
{\tt\relsize{-1}
\begin{quote} \myinput{diff.txt}
\end{quote}}
--- thus revealing a minor sloppy change between the two versions of the code given originally in Sections \ref{normal-version} and \ref{randomised-version} respectively. (As usual, the lines above were generated automatically, and editing the \emph{original\/} lines in this paper in Sections \ref{normal-version} and \ref{randomised-version} will change the lines above correspondingly.)
\item
In this paper we have argued for the value of tools like \name{relit}. For good reasons, we believe it enormously helps reproducibility and dependability of publications. Whether it \emph{actually\/} helps other scientists, and whether it can be improved to help more authors to achieve their publication goals, is an open question.
\item
Finally, a harder question, which is more useful to answer is: of several \name{relit}-like systems (\name{org-mode} \cite{org-mode} and others), which are more effective and why? We currently lack the theories and principles of doing and disseminating computer science so these critical questions are hard. Our final section, next, explores these bigger issues.
\end{enumerate}
\part{Reproducibility: discussion and conclusions}
\section{The benefits of automatic honesty}\label{benefits}
\def\startquote{\sf\raggedright\setbox0=\hbox{``}\hskip -\wd0 \copy0}
\def\quoteend{''}
Richard Feynman's 1965 Physics Nobel Prize lecture (shared with Sin-Itiro Tomonaga and Julian Schwinger) reminds us that we polish our scientific papers to convey what we have achieved rather than how we got there. As we refine and improve our articles, their connection with the original work --- our programs --- can become tenuous and ultimately deceptive.
%\begin{quote}\startquote We have a habit in writing articles published in scientific journals to make the work as finished as possible, to cover all the tracks, to not worry about the blind alleys or to describe how you had the wrong idea first, and so on.\quoteend\hfill\cite{feynmanprize}\end{quote}
This sloppiness creates a misleadingly ``productive'' culture which favours quantity over quality of written output.
% more of it: people and their promotion committees want longer CVs, not better papers.
It becomes accepted practice, because everyone seems to benefit from taking the easiest route to publish more. It gives an advantage for the scientists who do it \cite{evolution}: cargo cult science is easier to publish. Indeed, what authors even read all the papers they cite \cite{nocite}?
It is noteworthy and arguably a symptom of this cargo cult that the ACM Computing Classification System (CCS, \url{dl.acm.org/ccs}) does not even classify the core activity
%that defines our discipline:
publishing --- yet publishing (including \LaTeX\ itself) is clearly a very active field within computer science research. Evidently, thinking seriously about publishing (which involves thinking about uncomfortable subjects like quality in publishing) is not as exciting as just publishing!
As discussed in Part~I of this paper, before we had invented \name{relit}, we ourselves encountered the problem of discrepancies between source material and published material while were searching the literature to help us in writing early drafts of this paper: improving the English explanations in the paper naturally led us to talk about variables $u,v$ yet the program we were writing about still had variables called $i$ and $j$ in it. This is an example of how it is very easy to slip without noticing from clarifying your writing to simply making it~up --- here, we fell into the trap of talking about $u$ and $v$ because those are standard ways of talking about graph vertices, but the program's original $i,j$ were out of sight (at that time, before we had developed \name{relit}), and therefore incorrectly described \emph{and we did not notice}.
There is a legitimate stage in drafting papers where authors can freely write about what they hope and intend will be true as placeholders for future work \cite{writing}. ``Our program \emph{will\/} work like this~\ldots'' But if what they write has no rigorous connection with reality it will never be straightforward to know when the aspirational gap has been closed, and the authors thus risk eventually publishing misleading and time-wasting papers. Readers generally have no way to distinguish sketches and visionary ideas from actual achievements.
Feynman warned about the utter honesty --- the special kind of integrity --- essential to do good science \citenumber{feynman}. But with computer tools to help we can go further than conventional scientists: we can make this honesty automatic. And when honesty is automatic, it becomes easy and reliable.
%In 1974 he said,
%\begin{quote}\startquote It's a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty --- a kind of leaning over backwards.~[\ldots] In summary, the idea is to try to give all of the information to help others to judge the value of your contribution; not just the information that leads to judgment in one particular direction or another.\quoteend\hfill\citenumber{feynman}\end{quote}
%As Feynman said, \begin{quote}\startquote The first principle is that you must not fool yourself --- and you are the easiest person to fool. So you have to be very careful about that. After you've not fooled yourself, it's easy not to fool other scientists. You just have to be honest in a conventional way after that.\quoteend\hfill\citenumber{feynman} \end{quote}
So automatic tools and techniques to encourage reproducibility, like \name{relit}, can help you help yourself --- and help your readers. Your readers should not have to work out things for themselves you did not know you were not telling them. Our scientific writing should not rely on readers having to use their insight to interpret and clarify what we write; different readers may have different insights, and then it is not clear what our science is communicating.
When describing something that is complicated, there is a temptation to simplify and take short cuts. In the worst case, this results in publications that describe what ought to happen, what we hope happens, but there are omissions that mean the claims are not easily reproducible. Somehow it is too easy to reframe our programs so that it sounds as if they work; all the details are too hard to check even for the original authors --- and as we convince themselves what we write is correct, then there seems to be less and less need to go to the trouble of checking our code. When we conceal errors --- and, worse, when we conceal errors from ourselves --- although our work may look good, it holds back progress~\cite{blackbox}.
If a tool like \name{relit} is used, the code in the paper is exactly what works. If describing it gets complicated, this cannot be denied. Instead of simplifying the narrative, the author is encouraged to improve the code so it becomes easier to describe. This means \emph{both\/} the code and the paper improve, and improve together.
It must be pointed out that not everybody agrees under all circumstances. For example, Knuth writes:
\begin{quote}\startquote
The author [Knuth] feels that this technique of deliberate lying will actually make it easier for you to learn the ideas.\quoteend\hfill\cite[p\emph{vii}]{texbook}