Markov text with citations #18

serin-delaunay · 2020-10-31T18:58:36Z

A common criticism of GPT language models is that they plagiarise text from the internet. As an experiment in smoothing over this issue, I will make a Markov chain language model that tags each n-gram observation with the location of the original in the source text.

This means that in the text generation stage, each output token can cite the n-gram it was drawn from in the source text. In the generated novel, I'll put this info in footnotes. This should make the resulting text much better sourced, and give the reader clarity about the true origin of any deep insights found in the novel.
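A minimal sketch of how this could work, assuming a word-level model and a plain token index as the "location". The function names, the `order` parameter, and the choice to leave seed tokens uncited are illustrative assumptions, not a settled design:

```python
import random
from collections import defaultdict


def build_model(tokens, order=2):
    """Map each n-gram context to a list of (next_token, source_position) pairs."""
    model = defaultdict(list)
    for i in range(len(tokens) - order):
        context = tuple(tokens[i:i + order])
        # i + order is the index of the observed next token in the source text.
        model[context].append((tokens[i + order], i + order))
    return model


def generate(model, seed, length, rng=None):
    """Generate up to `length` tokens, each paired with the source position it came from."""
    rng = rng or random.Random()
    context = tuple(seed)
    # Assumption: seed tokens are given by the caller, so they carry no citation.
    output = [(token, None) for token in seed]
    for _ in range(length):
        candidates = model.get(context)
        if not candidates:
            break  # dead end: this context was never followed by anything
        token, position = rng.choice(candidates)
        output.append((token, position))
        context = context[1:] + (token,)
    return output


source = "the cat sat on the mat and the cat ran".split()
model = build_model(source, order=2)
for token, position in generate(model, source[:2], 5, random.Random(0)):
    print(token, position)
```

Keeping one entry per observation (rather than counts) means repeated n-grams are sampled in proportion to their frequency, and every draw still points back to a specific place in the source.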

Haven't decided what source text to use. Maybe Shakespeare (all lines have a standard identifier), GPT research papers, Moby Dick...

Caveats:

- I'll probably need to generate LaTeX to keep the footnotes organised.
- The procedure would be difficult to port into GPT models.
- Most of the 50,000 words would be in the footnotes.
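For the LaTeX step, a rough sketch could look like the following. It assumes the generator yields `(token, position)` pairs with `None` for uncited tokens, and the footnote wording ("source name, token N") is a placeholder format rather than a decided one:

```python
# Characters that need escaping in LaTeX body text.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape LaTeX special characters one at a time, so nothing is escaped twice."""
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)


def render_latex(cited_tokens, source_name):
    """Join tokens with spaces, attaching a \\footnote to each token that has a position."""
    parts = []
    for token, position in cited_tokens:
        word = escape_latex(token)
        if position is not None:
            # Hypothetical citation format: source name plus token index.
            word += r"\footnote{%s, token %d}" % (escape_latex(source_name), position)
        parts.append(word)
    return " ".join(parts)


print(render_latex([("Call", None), ("me", 1), ("Ishmael", 2)], "Moby Dick"))
```

With a source like Shakespeare, the integer position could be swapped for the standard line identifier without changing the structure.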

serin-delaunay · 2020-10-31T19:00:20Z

If there's time I might also do a slightly more serious separate entry that doesn't boil down to "YAMC".

pjfpotter · 2020-10-31T21:48:25Z

Why not write an entire novel of footnotes? Each footnote is a citation of the n-gram that would have been in the novel but then wasn't because it was replaced by its own citation. Let's see how deep this rabbit hole goes.

serin-delaunay · 2020-10-31T22:05:08Z

There's one like that at NaNoGenMo/2019#68; I'd rather keep this one simple. The footnotes will have a pretty well-defined format, so they wouldn't need to be Markov-generated or nested.

greg-kennedy · 2020-11-01T05:00:07Z

This is the one that comes to mind when I think of obsessive footnotes: NaNoGenMo/2019#127

serin-delaunay · 2020-11-01T09:27:00Z

Yeah, that's closer to what I'm going for here. Thanks for the link, I saw that one last year but it had slipped my mind.

verachell · 2020-11-01T15:27:45Z

What a cool idea!
