Animorphs: The Lost Chapters [Completed] #66

Open
ajtran303 opened this issue Nov 27, 2020 · 43 comments

Comments

ajtran303 commented Nov 27, 2020

Content warning

Animorphs is a war story full of tragedy and trauma. There are graphic depictions of bodily injuries and mutations.


Four AI-Generated images of giant feline creatures. Their heads appear familiar, yet unrecognizable. Images generated with Text to Image API

Animorphs: The Lost Chapters

A 50,500 word novel generated with gpt-2-simple finetuned on the entire Animorphs book series (2000 steps).

Excerpt

I focused on the wolf. I focused my mind on the DNA inside my brain. I focused on the emotion that swells up the animal mind.

I felt the changes begin.

I felt a rush of renewed optimism. The world was new. I had seen life before. I’d never seen it so wild and amazing.

And yet, I was still a human girl. I was still in control of my own body, and human thought had experienced the unexpected.

Read it here

Download it

Video - executing the code that generates this novel

Here are some initial statistics

 Details
50,500 Words
5,695 Unique Words
253,136 Characters
202,505 Characters (no spaces)
4,856 Sentences
18,186 Longest Sentence (words)
1 Shortest Sentence (words)
11 Avg. Sentence (words)
53 Avg. Sentence (chars)
4 Avg. word length
4,735 Paragraphs
112.2 Pages
62,534 Syllables
4,867 Lines
95,254 Words (Publisher) 
7-8th Grade Reading Level 
3 hrs 4 mins Reading Time 
4 hrs 41 mins Speaking Time 
62 hrs 2 mins Hand Writing Time 

Keyword Density x1

458 (4%) human
399 (4%) felt
378 (3%) could
276 (2%) saw
250 (2%) see
176 (2%) know
168 (1%) going
155 (1%) down
149 (1%) like
138 (1%) morph
136 (1%) away
134 (1%) begin
123 (1%) all
117 (1%) back
112 (1%) cassie
110 (1%) looked
105 (1%) still
105 (1%) own
102 (1%) just
102 (1%) being

Keyword Density x2

150 (6%) could see 
77 (3%) hork bajir 
70 (3%) change begin 
70 (3%) felt change 
57 (2%) could feel 
55 (2%) felt changes 
54 (2%) saw human 
53 (2%) changes begin 
34 (1%) getting pushed 
30 (1%) time line 
28 (1%) yeerk pool 
28 (1%) away away 
28 (1%) falling getting 
27 (1%) pushed back 
26 (1%) becoming human 
26 (1%) inside stomach 
24 (1%) human being 
24 (1%) felt cold 
24 (1%) felt time 
22 (1%) human again 

Keyword Density x3

69 (7%) felt change begin 
50 (5%) felt changes begin 
28 (3%) falling getting pushed 
24 (3%) getting pushed back 
20 (2%) felt time line 
17 (2%) felt creature moving 
17 (2%) time line reappear 
16 (2%) red tailed hawk 
13 (1%) down down down 
9 (1%) one could see 
9 (1%) becoming human again 
9 (1%) could feel changes 
8 (1%) becoming human becoming 
8 (1%) human becoming human 
8 (1%) saw patterns feathers 
8 (1%) felt cold touched 
8 (1%) creature moving toward 
8 (1%) moving toward away 
8 (1%) creature moving away 
8 (1%) moving away away 

I'll be back next year!

Next I will be working on mining the novel for data visualization. This is a little bit outside the scope of NaNoGenMo, so for now I will just include some initial statistics above. This was a fun project! I've been looking forward to this for months!


Submission [Completed]

I'm making an entry before the month is out! I've been looking forward to this for a long time. :)

I will use the Animorphs corpus of over one million words written by K.A. Applegate and various ghostwriters to train GPT-2 and ask it to write me a novel.

I'm using a mix of JavaScript, Python, and shell scripting to accomplish this task.

High-level design overview

  1. Obtain Animorphs corpus
  2. Fine-tune a GPT-2 deep learning model on the corpus
  3. Generate a raw output of "a bunch of words"
  4. Clean the data and curate a 50,000 word novel!

Read previous drafts here!

First Draft
Second Draft

ajtran303 commented Nov 27, 2020

Dev Diary

Obtaining the corpus

  1. I created a new JavaScript project with npm init
  2. I downloaded the entire series as a PDF from reddit. source
  3. I extracted the PDF files and they came out as a folder called PDF. Great! This folder went into my project folder.
  4. I created another folder called texts, which will contain the converted outputs. I also put it in my project folder.
  5. I installed a package to parse the PDF data to look for its text. npm i -s pdf-parse
  6. I created a script to convert the PDF files into TXT files. This took some research on the pdf-parse documentation and additional knowledge of the node fs module.
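// runner.js
// Convert every PDF in ./PDF into a plain-text file in ./texts.

const fs = require("fs");
const path = require("path");
const pdf = require("pdf-parse");

// readdirSync returns the file names directly; it does not take a callback.
const fileNames = fs
  .readdirSync("./PDF")
  .filter((file) => path.extname(file).toLowerCase() === ".pdf");

fileNames.forEach((file) => {
  const targetInput = path.join("./PDF", file);
  // Swap the .pdf extension for .txt in the output name.
  const baseName = path.basename(file, path.extname(file));
  const targetOutput = path.join("./texts", baseName + ".txt");

  const dataBuffer = fs.readFileSync(targetInput);
  pdf(dataBuffer)
    .then((data) => {
      fs.writeFile(targetOutput, data.text, (err) => {
        if (err) throw err;
        console.log(`${targetOutput} has been created!`);
      });
    })
    .catch((e) => {
      // Report which file failed and carry on with the rest.
      console.error(`Could not parse ${targetInput}:`, e);
    });
});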

Obstacle

The code would not execute from the command prompt $ node runner.js. But opening $ node and then pasting in the program worked.

  7. Finally, a little Unix on the command line to bundle all of these new files into one file.
$ cat texts/*.txt > animorphs-corpus.txt

And that's all the code we have to write for now! Let's get a word count:

$ wc -w animorphs-corpus.txt
 1725941 animorphs-corpus.txt

1.7 million words! The file size is 9.2M. We will now take this and feed it to gpt-2-simple in a Google Colaboratory Notebook. Read this tutorial for more information, and stay tuned for an update on how I fine-tuned GPT-2 with the Animorphs corpus.
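
For anyone who would rather stay in Python, here is a minimal sketch of the same bundle-and-count step. The function name build_corpus is made up for illustration, and the word count is only an approximation of what wc -w reports, since both split on whitespace but may differ on unusual characters.

```python
from pathlib import Path


def build_corpus(text_dir: str, corpus_path: str) -> int:
    """Concatenate every .txt file in text_dir and return the word count.

    Hypothetical helper: roughly `cat texts/*.txt > corpus` plus `wc -w`.
    """
    parts = []
    # Sort the names so the order is deterministic, like a shell glob.
    for txt_file in sorted(Path(text_dir).glob("*.txt")):
        parts.append(txt_file.read_text(encoding="utf-8", errors="replace"))
    corpus = "".join(parts)
    Path(corpus_path).write_text(corpus, encoding="utf-8")
    # str.split() with no arguments splits on runs of whitespace,
    # which is close to how `wc -w` counts words.
    return len(corpus.split())
```

Calling build_corpus("texts", "animorphs-corpus.txt") should write the combined file and return a count in the same ballpark as the wc output above.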

ajtran303 commented Nov 27, 2020

Additional consideration

I think I will split my novel into 20 chapters. 20 chapters of 2500 words would meet the 50,000 word requirement.
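
The arithmetic (20 x 2500 = 50,000) can be sketched as a small splitter. This is only an illustration under simple assumptions: split_into_chapters is a hypothetical name, words are whatever str.split() returns, and the original line breaks are not preserved.

```python
from typing import List


def split_into_chapters(text: str, chapters: int = 20,
                        words_per_chapter: int = 2500) -> List[str]:
    """Cut text into fixed-size chapters (hypothetical helper)."""
    words = text.split()
    needed = chapters * words_per_chapter  # 20 * 2500 = 50,000
    if len(words) < needed:
        raise ValueError(f"need {needed} words, got {len(words)}")
    result = []
    for i in range(chapters):
        chunk = words[i * words_per_chapter:(i + 1) * words_per_chapter]
        # Each chapter gets a heading followed by its slice of words.
        result.append(f"Chapter {i + 1}\n" + " ".join(chunk))
    return result
```

Any words beyond the first 50,000 are simply dropped, so the raw output needs to be at least that long before splitting.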

ajtran303 commented Nov 27, 2020

Fine-tuning GPT-2

This tutorial is pretty spot on, and I've used it before to train on other data sets (taco-related, poetry, Gothic lit). The main thing is that I want my generated text to have the right "flavor", so I chose to fine-tune for 2000 steps, twice the recommended number. Training for 1000 steps takes about 45 minutes, so 2000 should take about an hour and a half.
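
A rough sketch of what the fine-tuning call might look like with gpt-2-simple, plus the time estimate as arithmetic. The model size "124M", the run name, and restore_from="fresh" are assumptions, since the post does not state them. The helper names estimate_minutes, finetune_kwargs, and run_finetune are made up for illustration. The library is imported lazily so the helpers can be tried without TensorFlow installed.

```python
def estimate_minutes(steps: int, minutes_per_1000_steps: float = 45.0) -> float:
    """Scale the observed ~45 minutes per 1000 steps linearly."""
    return steps / 1000 * minutes_per_1000_steps


def finetune_kwargs(steps: int = 2000) -> dict:
    """Keyword arguments for gpt2.finetune; values other than steps
    and the corpus file name are assumptions, not the author's settings."""
    return {
        "dataset": "animorphs-corpus.txt",
        "model_name": "124M",      # assumed model size
        "steps": steps,
        "restore_from": "fresh",   # assumed: start from the base model
        "run_name": "run1",        # the library's default run name
    }


def run_finetune(steps: int = 2000) -> None:
    # Imported here so the helpers above work without the library.
    import gpt_2_simple as gpt2

    kwargs = finetune_kwargs(steps)
    gpt2.download_gpt2(model_name=kwargs["model_name"])
    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, **kwargs)
```

Calling estimate_minutes(2000) gives 90.0, matching the hour and a half mentioned above, assuming training time scales linearly with steps.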

@ajtran303

Sample output from Step 1850

Chapter 4
When it was clear she’d be late for her night shift, Rachel and I set off. We couldn’t wait. We
packed trucks and SUVs and rented one of the most advanced Bug fighters: the Blade
ship. We flew it over the Yeerk city by nightfall.
We flew without any problem, because the chopper that landed directly on the head of the Bug fighter
began dropping Dracon beams. The beams were only visible in the darkness.
“Let’s go,” I said. “See if a helicopter can spot a little bird!”
I started on one side of the cube and went up the ladder a few steps to the right. In the darkness
the light was too dim to see clearly. I was still too far up to see the two guys above.
I looked down at the square cube with my keen eyes. I could see two large buildings
below me. There were a dozen lights from the Yeerk invasion army. There were a dozen Bug fighters on

@ajtran303

gpt-2-simple documentation

Take advantage of the "cloud" resources available while working on this Machine Learning project. Procedural text generation is a wonderful gateway into Natural Language Processing. So check out the documentation - you could possibly use the package elsewhere!

Link

ajtran303 commented Nov 27, 2020

Findings from Documentation

GPT-2 can only generate a maximum of 1024 tokens per request (about 3-4 paragraphs of English text).

GPT-2 cannot stop early upon reaching a specific end token. (workaround: pass the truncate parameter to a generate function to only collect text until a specified end token. You may want to reduce length appropriately.)

When finetuning GPT-2, it has no sense of the beginning or end of a document within a larger text. You'll need to use a bespoke character sequence to indicate the beginning and end of a document. Then while generating, you can specify a prefix targeting the beginning token sequences, and a truncate targeting the end token sequence. You can also set include_prefix=False to discard the prefix token while generating (e.g. if it's something unwanted like <|startoftext|>).

GPT-2 allows you to generate texts in parallel by setting a batch_size that is divisible into nsamples, resulting in much faster generation. Works very well with a GPU (can set batch_size up to 20 on Colaboratory's K80)!

If you have a partially-trained GPT-2 model and want to continue finetuning it, you can set overwrite=True to finetune, which will continue training and remove the previous iteration of the model without creating a duplicate copy. This can be especially useful for transfer learning (e.g. heavily finetune GPT-2 on one dataset, then finetune on another dataset to get a "merging" of both datasets).
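
To make the start/end marker idea concrete, here is a plain-Python sketch. It does not call gpt-2-simple. It only imitates the effect that prefix, truncate, and include_prefix are described as having, and the helper names wrap_documents and truncate_at are hypothetical. The <|endoftext|> marker is an assumed choice for the end sequence.

```python
START = "<|startoftext|>"
END = "<|endoftext|>"  # assumed end marker; any bespoke sequence would do


def wrap_documents(documents):
    """Wrap each document in start/end markers before finetuning."""
    return "\n".join(f"{START}\n{doc.strip()}\n{END}" for doc in documents)


def truncate_at(text, end_token=END, start_token=START, include_prefix=False):
    """Imitate truncate / include_prefix on one generated sample."""
    # Drop the leading start marker unless the caller wants it kept.
    if not include_prefix and text.startswith(start_token):
        text = text[len(start_token):]
    # Keep only the text before the first end marker, if there is one.
    cut = text.find(end_token)
    if cut != -1:
        text = text[:cut]
    return text.strip()
```

With a corpus prepared this way, generation would presumably pass prefix="<|startoftext|>" and truncate="<|endoftext|>" to the generate function shown below.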

generate Method Signature

def generate(sess,
             run_name='run1',
             checkpoint_dir='checkpoint',
             model_name=None,
             model_dir='models',
             sample_dir='samples',
             return_as_list=False,
             truncate=None,
             destination_path=None,
             sample_delim='=' * 20 + '\n',
             prefix=None,
             seed=None,
             nsamples=1,
             batch_size=1,
             length=1023,
             temperature=0.7,
             top_k=0,
             top_p=0.0,
             include_prefix=True):
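
Because one request is capped at roughly 1024 tokens, reaching 50,000 words means calling generate many times. Below is a minimal sketch of such a loop. The name collect_samples is hypothetical, and generate_fn stands in for any callable that returns a list of strings, for example a small wrapper around gpt2.generate with return_as_list=True.

```python
def collect_samples(generate_fn, target_words=50_000, nsamples=5, batch_size=5):
    """Call generate_fn repeatedly until enough words are collected."""
    # Batched generation needs batch_size to divide nsamples evenly.
    if nsamples % batch_size != 0:
        raise ValueError("nsamples must be divisible by batch_size")
    samples, total = [], 0
    while total < target_words:
        batch = generate_fn(nsamples=nsamples, batch_size=batch_size)
        if not batch:
            break  # nothing came back; avoid looping forever
        for text in batch:
            samples.append(text)
            total += len(text.split())
    return samples
```

With the real library, generate_fn could be something like lambda **kw: gpt2.generate(sess, return_as_list=True, temperature=0.7, **kw), assuming sess is an already loaded session.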

ajtran303 commented Nov 27, 2020

First GPT-2 Output

Okay! Let's make some invocations. I will use the opening words of the very first book, "My name is", as the prefix.

Input Parameters

gpt2.generate(sess,
              length=1024,
              temperature=0.7,
              prefix="My name is",
              nsamples=5,
              batch_size=5
              )

Output Results

My name is Aximili-Esgarrouth-Isthill, I believe. I’m the only Andalite-Controller who ...
I saw it all ...
It was a trap!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!
I saw it all!


Chapter 20
<Aaaahhhh!>
<Ax!>
I was in control now!
<Where is he?>
<I have a story to tell.>
<Really?>
<No,> I said. <I have to tell you something.>
<What?>
<You’re the leader, Cassie.>
<Rachel?>
<Yeah. I’m the leader.>
<I know. I know. I’m the leader. I’m the only one who can stop this from happening. I know. I know
Cassie’s right. I know. The only way to stop it is to get involved.>
<Rachel?>
<I’m not totally with you, Cassie. I don’t know what to do. I just want to see you,> I said.
<I ... I ... I don’t know what to do.>
<Rachel, I don’t think you can do it,> I said.
<No, Cassie. I ... I just don’t know. I just want to see you. I don’t know.>
<I know,> I said. <I know. I can’t have you doing this. I know it’s wrong. But it’s not like
I’m going to give up.>
<Okay, but I’m not going to give up.>
<What?>
<Nothing. Jake, if I give up, you’re going to end up like Marco the Magnificent? Like Alloran the Great?
Then you’re going to end up like Tobias?>
<Tobias? Ax?>
<Yeah. I’m in your head.>
<What are you going to do?>
<I’m the leader. I’m the only one who can stop this from happening. I’m not totally with you,
Cassie.>
<What? What are you talking about?>
<What are you talking about? It is happening.>
<What is it?>
<I don’t know. I just want to see you, Cassie. I don’t know.> I looked down at the ground. I looked up. I couldn’t
believe it. I knew it was possible. But I couldn’t say no. I wasn’t done with it. I just couldn’t do it.
<I don’t know,> I said. <I just want to see you. Rachel? What do you want?>
<I don’t know. I just want to see you.>
I felt a tingling up my stomach, like I was afraid someone would judge me.
<What do you want? Do you want me to listen to you?>
Then a terrible, awful sound. It was like someone had struck a blow in my stomach. Like
something had struck me. Like something had struck me deep inside.
A daggerlike pain shot through my stomach. I could feel the bones in my chest collapsing.
<I don’t know, Cassie. I just want to see you.>
I felt a tingling up my stomach, like I was afraid someone would judge me.
<What do you want? Do you want me to listen to you? I can’
====================
My name is Aximili-Esgarrouth-Isthill. I was born and raised in this town. I lived in this valley until I was seventeen. I was a soldier when I
lived, and when I died.
I was an Andalite, a warrior, and a poet, all of it. I was an inspiration to the entire galaxy.
I was not a true Andalite. And I did not have the strength of will and will and
will. I was strong. I was strong.
I was not a true Andalite.
A very sick, ugly, evil, pitiful, no, a sickening creature.
I am not sure what to do. I am not certain of who I am or what I stand for. I do not know if
I will ever be powerful enough to resist the Yeerks.
I know I will be strong. I know that it will be very hard and very painful for me to resist the
Yeerks.
And I know that it will be very lonely.
I know that it will be very difficult.
But I know that it will be very, very difficult.
I know the stark reality of it.
And I know that it is not just a battle. It is an entire war.
And it is a war that will end in horrible torture and terrible death.

Chapter 25 - Nice Rachel
We spent the day watching TV. We were watching Jake and the others, standing up for what they
had won.
We watched Rachel and the others, standing up and fighting back.
Rachel and I were watching Jake and the others, fighting back.
It was an ugly war. A horrible war, with horrible emotions.
We were watching helplessly as our own people, our own people, went to war.
Trying to stop it.
We were watching the war, trying to stop it.
We were fighting to stop it.
“You know, I don’t think we need to spend all of our energy on fighting,” I said. “We need to
have a plan. Prepare for battle. Have the plan ready.”
“We already have,” Marco said.
“Wait a minute,” Jake said. “We have to do this.”
We waited a minute.
And then, we began to make plans. I was sick with rage. I was sick with fear.
“Okay,” Jake said. “We go in. We don’t -”
“No,” I said.
“- ” Jake?” Marco.
“Yeah. We need some firepower. We can’t allow the Yeerks to come in,” I said. “We can’t
allow a Yeerk to go into and take our bodies. We can’t allow a Yeerk to become a human. We
can’t allow a Yeerk to take our minds off us, make us into something different. We can’t
be human.”
“I know,” Marco said. “I feel like my body against the wall. And I feel like my body against the
wall. I feel like my body against the wall. I feel like I’m being made a slave of the Yeerks.”
“You don’t have to be scared,” I said to Jake.
Jake looked at me. “If you’re scared, you’re kind of a slave to the Yeerks. You know, like you’re
lucky, just not all the time, like I’m.”
I looked away, and he nodded his head.
“Yeah,” I agreed. I was glad he was not.
Marco looked at me. “You know, I’m not exactly looking forward to fighting.”
“I am,” I said. “I am,” I lied.
“You know, I feel like I’m going to be a slave of the Yeerks,” Jake said. “I mean, you know,
like I’m doing, like I’m the only one who can make the world a better place.”
Jake was not interested in talking about it. Not yet.
“We go in,” I said. “We don’t let the Yeerks see us. We don’t let them see us. We
fight. We don’t let them see us, but we also help the others. We help each other.”
Marco nodded. “Yeah. So I
====================
My name is Aximili-Esgarrouth-Isthill. I’m a very strange creature. I’m a very strange species. An alien. If you want to know about aliens, ask me.>
“My name is Aximili-Esgarrouth. I’m an Andalite. A very strange android. A very strange species.”
<Yes,> Ax agreed. He was looking at me from all sides. I could feel the fear that came from
him. He was afraid.
“What is this?” the Andalite asked.
“What is this?” the Andalite asked me.
<It’s an Andalite ship. The ship called the Blade ship. I went to this location. We found the Blade ship.
The ship called the Blade ship is very small. You can’t see it, but it is very large. It is more than
an inch long.>
The Andalite smiled. <You see, I can see. They have hidden Dak and the Blade ship. We have
access to a great many sensors. We have a lot of energy.>
He looked at me. <You see, I can see. They have hidden Dak and the Blade ship. We have
access to a great many sensors. We have a lot of energy. But I don’t know what they have hidden.>
“Dak? What is it?” Dak asked.
<Dak’s student, Aximili-Esgarrouth-Iskillion-Falan. A warrior, aristh, and an
Andalite. He led the invasion of Earth.>
“What?”
<Dak was killed by a Garat ship, in space,>
<Yes, Dak was killed. The ship was destroyed. We have not found the ship.>
<He said it was the only way. He said it was because the Andalite ship was hidden. The Andalite
ship was too small to go anywhere.>
“Why?”
<Because the Blade ship was hidden. The Andalite ship is too small to go anywhere.>
The Andalite looked at me and nodded. <Perhaps we should ask you, Prince Elfangor.>

Chapter 21
“What are you doing, Andalite?” I said.
<I am doing what I must. I am taking a prisoner.>
“What is this?”
<This is the Andalite ship. I have hidden it in a hidden location. There is no way to read its name.
It is a Blade ship. It is hidden. I cannot swim. I cannot breathe. It is hidden by a large, open area. I
must go and find it.>
“What do you mean, I’m going to go and find it?”
<I am going to find it.>
“You’re going to find it?”
<Yes. I am going to find it. It is hidden. I am going to activate the sensors. Once you understand what
it is, it will be easy.>
“What do you mean?”
<It is a Blade ship.>
“You underestimate the power of Andalite eyes.”
<Yes.>
The Andalite looked at me. I had not expected this. <May I ask, Prince Elfangor? What is your
plan?>
<We will reach the Blade ship. We will destroy it, using our Andalite senses. It will be easy. It will
be very difficult, but it will be very satisfying.>
I felt like a fool. I didn’t even know what to do. I was a fool. I was a fool. It was a trap.
“What do you mean, “trap” is an Andalite word for “easy?”
<This is the Blade ship. It is a small ship. It is hidden. I cannot go through a door and see
it. The Andalite ship is too small to go through. We have had to destroy the ship. It is a small
ship. It is hidden.>
“What?”
<This is the Blade ship. It is hidden. I cannot go through a door and see it. The Andalite
ship is too small to go through. We have had to destroy the ship. It is a small ship. It is hidden.>
I felt like being a fool. Like I needed to know what was going to happen. I had no time to know.
“All right,
====================
My name is Rachel. And she’s the killer.”
“She scares me. She’s so, so smart. She’s so, so cool.”
“It’s just that I know she’s not just smart. We have to get out of here, or we’ll lose the Pool ship.
That’s the stupidest thing in the whole whole war. I know. I hate it. But you know what? I’m
going to stay.”
“Yes, I can’t leave. I’m going home.”
“You’re not leaving the city. You’re not leaving the people.”
“Oh. That will come sooner or later.”
“If she’s still alive, she’s going to decide to help us,” Rachel said. “If she’s still alive, if she’s
despite all this, I’m going to fight. If she’s still alive, we’ll fight. And we can’t let her get away.
She’s not doing it alone.”
“Rachel, you’re an amazing person. A leader, a person I will always remember. A person I will
always remember. You’ll never remember me.”
“She’s not going to let that happen. She’s not going to let us get away. She’s not going to
let us be captured. She’s not letting us try and force her hand. You are a brilliant, incredible person.
You will always remember me.”
Rachel looked at me like I was crazy. “I know, I know,” she said to Rachel. “I’m not
worrying too much. I’m just worried about you.”
I went to her. She stared at me like she was terrified. “I know,” she said softly. “I’m not
worrying so much. I’m just worried about you.”
I nodded with a grin. “I’m sorry, Rachel.”
Rachel swung her arms around and hugged me tight. She hugged me with her arms, and Rachel hugged me
with her hands.
“You know, Jake, you’re just like me, right?” I said. “You’re the one who said, ‘If she’s
still alive, she’s going to decide to help us.’ And you, right now, right now, right now, right now, she’s
going to decide if Jake likes you.’ She’ll make him care about you. She’ll make him want to cry.
“I know. I know,” she said to me, and I felt the tears start.
I am not Rachel. I am not human. I am human. I am what I am. I don’t care about who I am,
I don’t care what people think I am. I care about you.
And what people think I am.

Chapter 8
I woke up and saw a bright, moving figure staring back at me.
I looked at it. I thought it was Ax.
I knew what it meant. It was Aximili-Esgarrouth-Isthill. He was on the bridge. He was trying to
turn the ship around. He was trying to turn the ship around, and I knew it.
I looked at him with my main eyes. In the normal way, you would see his eyes, but they were dark. He
was staring at me.
I looked away. I could not think about it. I had to think about it. I had to think about that awful
battle, the battle, and the awful things I’d seen.
I was a prisoner. A prisoner of the Yeerks. And I was going to miss it.
I started to go, but I was not fast enough. I was not very fast, and I was not very fast.
The ship was going to crash.
“Marco!” Cassie yelled. “Do you hear me?”
“I’m going to get you out of here, I’m going to demorph,” I told her.
“No!”
“You’re going to be fine,” she said.
I was going to miss the crash.
She was right.
I was going to miss the
====================
My name is Cassie. I’m a girl. I’m an Andalite. Yeah, a pretty girl. People often ask me how I got my name. Because I’m not a girl. A girl
doesn’t have to know what a person is.
But why should I have to be a girl? To be able to go into a movie theater? To go to an
art exhibit? To go to the mall?
I have to go to movies. There’s no other way. There’s no other way to go.
I started to tell my friends. But they couldn’t. And I couldn’t.
So I told my friends about the Yeerk invasion. About the Chee who had come to rescue
me, and about the people who had been friends with me at school. About the Hork-Bajir, the
Yeerks, the Animorphs. About the Andalites, the humans, and the Yeerks. And about the
Andalite band.
I told them everything. I told them about the Yeerks, about the Yeerks and the Andalites, about the
Andalites and the Yeerks, and about Jake.
And I told them about my friends, who were fighting the Yeerks. And I told them about my
friends, my friends, who were my friends.
I’d never been much of a person. I’d never even known my name. I’d never been
conscious. I’d only known that I was different, different.
I just lay there, lying there in the grass, just lying there, and you couldn’t tell. You could
tell that I had ears and a mouth, and that I had a nose and that I had eyes in there. I had teeth. I
had the power.
But I was different.
I had the power to go to movies. I had the power to go to the mall. To the beach.
“I love you, Cassie,” I said.
<There is only one way out,> Tobias said. <A way that will allow you to go to the mall.>
“I think I know what that means,” I said.
“It’s a protocol,” Marco said. “The Yeerks, they had a prisoner. They went to the prison and
took that prisoner.”
“I don’t know,” I said. “I don’t know if I like it. It’s all about the prisoner. The Yeerks will put
him in a cage. I think it’s a little like when I play games with the rules. Like you can’t tell when you
play a game, but it’s a game. My best friend is a prisoner. He’s a prisoner, I’m saying that in
allegory.”
I said it out loud. “I’m saying it out loud because it’s a lie, Marco. They wanted to kill me.”
<Yeah.> Tobias said, <no one’s going to kill me.>
“Yeah,” Marco said. “I am going to kill them.”
<You know, you’re not alone, Rachel.> I said. <I know what this means.>
“No,” she said softly.
“Maybe we should catch a bus,” I suggested.
“Maybe.”
I thought about it, thought about the way I’d look up at the stars and the way I’d look down at the sky, and
just look down at the sky and look up with that awful bird’s eye. I still hadn’t figured out how to fly.
I still hadn’t figured out how to fly.
I was going to get out of that house, and I was going to get out of that house, and I was going to
make it all over that desert island, and maybe
that’s all I needed.
But I wasn’t sure I could really go all the way. Always the way. Always the way just the
way.
I was going to do the only way.
I was going to do the only way.

Chapter 6
I opened my eyes.
It was night. I could see the sky through the trees. I could see the birds in the trees. I
could see the way the water flowed around me. And I could see the way the rain was falling and the
sudden, violent changes in the
====================

ajtran303 commented Nov 27, 2020

Analysis of First GPT-2 Output

Let's grab some statistics from a free online Word Counter Tool

Statistics

Details

3,111 Words
780 Unique Words
15,066 Characters
12,255 Characters (no spaces)
329 Sentences
1,360 Longest Sentence (words)
1 Shortest Sentence (words)
10 Avg. Sentence (words)
46 Avg. Sentence (chars)
3.9 Avg. word length
300 Paragraphs
6.9 Pages
3,876 Syllables
311 Lines
4,976 Words (Publisher) 
7-8th Grade Reading Level 
11 mins 19 sec Reading Time 
17 mins 17 sec Speaking Time 
3 hrs 41 mins Hand Writing Time 

Keyword Density x1

45 (7%) all 
39 (6%) saw 
25 (4%) going 
23 (4%) about 
16 (2%) know 
14 (2%) like 
12 (2%) andalite 
10 (2%) just 
10 (2%) go 
9 (1%) looked

Keyword Density x2

38 (12%) saw all 
4 (1%) still alive 
4 (1%) right now 
4 (1%) could see 
3 (1%) aximili esgarrouth 
3 (1%) going decide 
3 (1%) always remember 
3 (1%) now right 
3 (1%) care about 
3 (1%) think about 

Keyword Density x3

3 (1%) right now right 
3 (1%) now right now 
2 (1%) name aximili esgarrouth 
2 (1%) aximili esgarrouth isthill 
2 (1%) felt tingling stomach 
2 (1%) tingling stomach like 
2 (1%) stomach like afraid 
2 (1%) like afraid someone 
2 (1%) afraid someone judge 
2 (1%) still alive going 
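
The keyword density tables can be approximated locally. This is a sketch only: the online tool's stop-word list and tokenization are unknown, so STOP_WORDS here is a small made-up list and the counts will not match the tool exactly. The function name keyword_counts is hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop-word list; the online tool's list is unknown.
STOP_WORDS = {"i", "the", "a", "it", "was", "to", "and", "of", "in",
              "my", "is", "that", "you", "me"}


def keyword_counts(text, n=1, top=10, stop_words=STOP_WORDS):
    """Return the most common n-word phrases after dropping stop words."""
    # Lowercase, then keep runs of letters and apostrophes as words.
    words = [w for w in re.findall(r"[a-z'’]+", text.lower())
             if w not in stop_words]
    # Slide a window of n words over the filtered list.
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams).most_common(top)
```

For example, keyword_counts(sample_text, n=2) would rank two-word phrases in the way the "Keyword Density x2" table does, which is how a loop like "I saw it all!" shows up as a dominant "saw all" entry.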

ajtran303 commented Nov 27, 2020

Second GPT-2 Output

This time I will not give it a prefix prompt; I will let it write freely on its own.

Input Parameters

gpt2.generate(sess,
              length=1024,
              temperature=0.7,
              # prefix="Chapter ",
              nsamples=5,
              batch_size=5
              )

Output Results

You know, I want to be in the National Guard and I just can't get past the high school drop-in. I don't know why.
I mean, okay, I may not even know the school. But I still’d like to be able to fly.
I want to fly. I wish I knew how crazy the world is. I wish I knew how I had such a
broad, powerful imagination. I wish I knew how I’d done this, and done it so well I’d be part of something
terribly, terribly special.
But it’s like, well, what’s the first thing? The first thing is that I wish I knew? And
what’s the best way to know what’s right and wrong and what’s not?
I’ve been dreaming! I wish I knew. I wish I knew!
I wish I knew!
I wish I knew!
But the world is not a place for crazy, blind, pathetic, brain-stealing, impossible-to-understand,
intellectual, stupid, insane, stupid-yet-surprising-thinking, crazy, weak, weak-yet-bright,
inconspicuous, insane, dumb, dumb-yet-smart, or just plain, dumb.
No, the world is not a place for you, me, or Marco or Rachel or Cassie or Tobias or Ax.
The world is a place where you live your life as a hawk, a bird, or a normal human.
I want to live my life as a human.
I want to live my life as a human.
I want to live as a human.

Chapter 9 - Cassie
Day Zero
“Ax!”
The office door opened. I needed a human voice. The office was empty.
I reached up and pushed it to my ear. Saw the familiar face. The guy who’d just seen me. The
madman I’d just met.
My tail went limp and soft. I heard the familiar voice in my head. The voice of a lost soul,
somewhere deep inside me.
Yo!
I opened the door.
“Hoo-hoo-hoo!”
I was a hawk, in the head.
I moved past the familiar face and into the office. I could see that Marco was staring at me. He
was staring at me, looking embarrassed, maybe. I was embarrassed, as well.
I looked down at my shoulder and saw that it was torn. It was torn.
I knew I shouldn’t move. I knew I had to move. But I couldn’t. I could not move.
The other humans looked at me, then stared back.
I felt the changes begin. I could feel the changes in my body. I was in the park. In the mall.
I saw the familiar face. He looked troubled, but not troubled.
But he was not troubled.
I saw the familiar morph. It was the same morph I’d had since I’d morphed the Chee. It was
different, but different.
It was a human, a grizzly bear.
I could see the bear’s eyes and hear its human mouth. It seemed troubled. I could hear
its hesitation, but it was not frightened.
I moved forward and looked at the bear. I could see the color of his fur. He looked wild. Wild,
wild, wild.
“I have the grizzly bear morph,” I said.
I felt the changes begin. I could see the bear’s vision. It was a flash, a flash that would last
for a second. It was a flash of pure grizzly bear.
I focused my mind on the bear. I felt the changes begin.
I felt the changes begin.
I could see the bear’s mind. It was the bear’s mind. It was still alive. I could see it. It was
complete. It was the bear mind, all that was alive with me. Pure. Utter. Utter. Utter.
I focused my mind. I concentrated on the bear mind. I saw it clearly. I saw the shape and the
power and the cunning and the cunning of the bear. I saw the hunger, the hunger for food. I saw the hunger for
the hunger.
I was in the bear. I was in the bear.
I heard the bear mind. It was in me. I was in the bear. I heard the bear mind. I heard the
bear mind.
I was in the bear. I was in the bear.
I had to run. I had to run! I had to run!
But I
====================
“What?” the captain asked.
“He’s playing dumb,” the woman said. “You got that right. He’s going to cut me off!”
“I don’t think so,” the captain said.
“Yeah, well, you’re not the smartest person I know,” I said. “I mean, maybe you’re not the smartest, but you
still have a lot to learn.”
“We were talking about this guy who recently went missing,” the captain said.
“So maybe he’s a Controller. He’s a friend of his. He’s a friend of his. What do you say?”
“We don’t know,” I said. “We just know that he’s a girl. He’s a friend of his. We know he’s a
older woman. We know he’s a friend of his. I mean, he’s the only person who is going to betray us!”
“What do you think?” the captain asked.
“You think he’s a friend?”
“Oh, he’s still alive. He’s alive.”
I said, “I know I have to go into that bathroom.” I said, “But I’m serious.”
“You know, the captain,” the woman said to the captain, “that’s not the time to be discussing this.
He could be on his way. That’s what we’d have to do.”
“I’ll go in.”
We went in. I told the captain I was going to go in. He said, “You know you’re not going to go in on
any Controller.”
“No,” I said.
“I’m not going in on a Controller,” the captain said.
“You’re right,” I said. “I am.”
The captain looked at me.
“He doesn’t trust you,” I said. “He doesn’t know what we are. He doesn’t trust us.”
I went in. I knew how much I wanted to go in there. I knew how much I wanted to face him.
But I wasn’t going to face him.
“Jake, you know you’re not going in there alone,” Rachel said. “You know it’s our fight. You’ve
seen the way we fight.”
I said, “Yes, I know. That’s why we’re here.”
“I will do whatever you tell me.”
I said, “I will obey you.”

Chapter 8
“Jake, I know you’re smart,” Cassie said. “You know what you’ll say. But do you think I’m better than
I am?”
I answered, “No. I’m not. I’m just trying to help you. I know you’re scared. I know you’re scared. I
know you’re scared. And I know that you have been through more wars than you can count on
anyone.”
I didn’t say anything. I just stood there, silent, and I knew that Cassie was right. She said the word and I
did the same.
I could be scared, I could be scared, I could be scared. But not the way I’d been.
I didn’t want to be scared. I didn’t want to lose. I didn’t want to be alone.
“I’m going in,” I said. “I’m going in.”
I went in, just as Rachel had said. I was a little surprised. I didn’t know if I’d been expecting to find
something very threatening. I was nervous. I was worried about what it would mean for our families.
“Jake,” I said. “Come on. Let’s go.”
I reached into my pouch and took out some water. It was cold, but it was cool. It took a while for
me to get used to it. I could feel the changes. I could feel the changes as they
====================
“Hi,” Rachel said politely. “We’re not going to be getting any
real-time access to the hologram. We’re going to need to leave the room, find a way to get to the outside

of the house, find a way to enter the house through the inside, find the way to the outside.”
“Okay, that was dumb,” Marco said. “I’m not sure we have the right morph.”
“Jake,” Marco said. “We need to think of a way to get to the outside of the house.”
“I’m thinking we should demorph and acquire that Hork-Bajir,” Marco said.
“HrrrEEEEEEE-reeee-ree-ree.”
I looked around the house. I’d have to demorph Rachel. I’d have to acquire her. I didn’t know how
much time had passed. Cassie and I were still too small to acquire. And the morphing cube was still in the
fence.
I was still too small to acquire.
But I was still Tobias, one of the three-foot-long, four-foot-long, six-foot-
alien creatures. And I was going to be a wolf.

Chapter 3
I was trying to think of a way to get to the outside of the house. I’d already done it once when the
first Hork-Bajir came running up and snatched it up. I’d done it twice before.
But this time I didn’t.
“Jake, I’m not sure it is possible to get to the outside of the house,” I said.
“I’m going to take a look.” I looked around the house. I had to keep going. Only then did I see the
Hork-Bajir coming up and grabbing the cube. They were not much bigger than a human. I could not
easily get away through the fence, but I could definitely see the inside of the house. It was a spiral
step. I thought it was a staircase. But I was wrong. Inside the house, as the old saying goes, is the
house with the first to the left.
The stairs ran down the front. And up the main steps. And up the steps. And all around the
house were stairs, until we reached a point where we had to jump.
The stairs were a little too steep, but I could easily drop some stairs. And then the stairs
would come right up to the front of the house.
<I’m going to try to make it,> Jake said in thought-speak. <I’m going to try to find the way to
the outside.>
<I’m going to try to find the way to the inside of the house,> I said. <I’m going to try to find
the way to the inside of the house.>
<Yeah,> Jake agreed. <I think we’re in. I’m going to try to find the way to the outside.>
<That’s it,> I said. I wanted to play that very minute. I wanted to run for my life through the
door. I wanted to find the way to the outside.
Then I spotted a few steps. I ran. I ran to the back of the house. I ran to the front, where I saw
another Hork-Bajir standing at the gate.
I ran.
Tobias came running into the house. I ran.

“Tobias!” Rachel yelled. “They’re coming out! The Hork-Bajir couldn’t see me in time!”
<I can’t run and I can’t stand,> Tobias said.
<I know. I’m going out of my way. I’m going out of my way,> I said. <I’m going to try to find
the way to the inside of the house.>
“You know, Tobias,” I said, “I don’t know what this means. I mean, this is ... this is like
the Great Ellimist once said. We’ve all done the Ellimist.”
<Yeah.>
“Yeah. That’s some of what Tobias said. I guess you said something about how the Ellimist is
going to be great. But we’re not going to be great. We can’t laugh at him. He
====================
My heart stopped beating. I concentrated on the big, powerful heart. I felt the changes begin. I felt the changes begin to occur. I felt the pressure of the air
breathing up from under my feet. I felt the warm blood billowing from the arteries. I felt the changes begin!
The changes begin!
The changes begin!
I felt the changes begin to grow. I felt the changes begin to grow. I felt the changes begin to begin!
I felt the changes begin to grow!
The changes begin!
I felt the changes begin to grow.
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
The air filled the lungs. The changes began!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
The air began to fill my lungs. I felt the changes begin to begin!
The air began to fill my lungs. I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
The air was rushing up from my lungs. I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
The air grew thicker. I felt the changes begin to begin!
I felt the changes begin to begin!
The air grew thicker. I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
The air grew thicker. I felt the changes begin to begin!
I felt the changes begin to begin!
The air grew thinner. I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
The air grew thicker. I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
The air grew thicker. The air grew thicker. I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the changes begin to begin!
I felt the
====================
“Cower in fear, Yeerk. Fear nothing!”
“Yes, I know,” Cassie said. “I’ve been afraid to be brave all this time. It’s not like I can
tell you where you’re going, or what’s happening.”
“It’s all about fear,” I said. “You’ve been afraid to go out into the world. You’re afraid of

being human.”
“Yes,” Cassie agreed. “I’ve been afraid to go out into the world. I’ve done that many times. I
have to defend myself. I’ve done that for many, many days.”
“No,” I said. “I always go out into the world. I want to be brave. I want to be free. I want to
free!”
“You want to be free, Yeerk? You have no right to be afraid.”
“I did not know that,” Cassie said. “I’m not a Controller, I’m a human. I don’t have an
Andalite body.”
“You do?”
“Yes,” I said.
“I would have to respect you, Yeerk. I would have to think, ‘That’s not what I’m doing.’ But you
don’t have an Andalite body, Yeerk.”
I was going to say no. I was going to end up fighting to get out. But a few days later, I was on
the cover of Vogue. I’m a pretty big girl. I’m fighting the Yeerks. If I survive, I’ll be a huge girl, too.
If my dad leaves, I’ll be alone. I’m not going to let that happen to me.
I guess the Yeerks are pretty good at that. They’ve got psychic abilities. They know not
whom to trust. They know what to do. And they know that they can’t win.
They know they can’t win.
They know I’m not a Controller.

Chapter 14
The next day, Rachel was at the mall. And in her shopping cart, she spotted me. I was shopping.
She was shopping.
Then she suddenly stopped and stared at me.
“Hey!” she said. “What is it?”
“She was just coming up behind me when I stopped. I guess she saw the way I was shopping. I’m
poking her with my gun right here.” She looked at me like I was funny.
“Oh, man. Oh, wow!” I said, laughing. I was laughing. “She was looking at me.”
Rachel looked up at me like she was trying to figure out what I was shopping for.
“That was so not normal,” she said. “I don’t even remember what I was shopping for, but I
made it.”
“You made it?”
“I mean, the whole mall scene. I was shopping at the mall. The big department store.”
“The big department store.”
“Oh, yeah. The mall. I mean, I took it from there. I mean, I can’t even remember what I
was shopping for.”
Rachel closed her shopping cart in front of me. She looked down at me and said, “Oh,
this is so not cool. I’m going to buy a suit of armor. I don’t even remember what I was
doing. So, how on Earth did you do?”
“Oh, well, we all knew what we were doing,” I said. I wondered if the little girl who had
come up behind me had been trying to forget what she’d done.
“She looked at me. She said, ‘Oh, man! I’m going to buy a suit of armor.’ I don’t remember
how I did it.”
“That’s not how it was,” I said. I was trying to figure it out. “I guess I’d have been wearing
this gun. I have to go to the mall and find out! I mean, I’d have to explain to my dad, but maybe I
just didn
====================

ajtran303 commented Nov 27, 2020

Analysis of Second GPT-2 Output

Let's grab some statistics again from our favorite free online Word Counter Tool.

Statistics

Details

3,264 Words
731 Unique Words
16,073 Characters
13,119 Characters (no spaces)
358 Sentences
1,087 Longest Sentence (words)
1 Shortest Sentence (words)
10 Avg. Sentence (words)
45 Avg. Sentence (chars)
4 Avg. word length
310 Paragraphs
7.3 Pages
4,050 Syllables
320 Lines
5,120 Words (Publisher) 
7-8th Grade Reading Level 
11 mins 52 sec Reading Time 
18 mins 8 sec Speaking Time 
3 hrs 56 mins Hand Writing Time

Keyword Density x2

107 (17%) changes begin 
104 (17%) felt changes 
93 (15%) begin begin 
8 (1%) wish knew 
7 (1%) air grew 
6 (1%) could see 
6 (1%) grew thicker 
5 (1%) knew how 
5 (1%) bear mind 
5 (1%) hork bajir 

Keyword Density x3

104 (24%) felt changes begin 
93 (22%) changes begin begin 
6 (1%) air grew thicker 
4 (1%) changes begin grow 
3 (1%) wish knew how 
3 (1%) could feel changes 
3 (1%) could see bear's 
3 (1%) get outside house 
2 (0%) want live life 
2 (0%) live life human 
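
For reference, counts like the ones above can be roughly reproduced in Python. This is only a sketch: the stop-word list is a small assumption (the online tool's real list is unknown), the helper name `top_ngrams` is hypothetical, and unlike the tool it lets n-grams run across sentence boundaries.

```python
import re
from collections import Counter

# Assumption: a tiny stop-word list; the online tool's actual list is unknown.
STOP_WORDS = {"i", "the", "to", "a", "of", "and"}

def top_ngrams(text, n, k=10):
    """Return the k most common n-word phrases after dropping stop words."""
    words = [w for w in re.findall(r"[a-z'’]+", text.lower())
             if w not in STOP_WORDS]
    # Slide a window of n words across the filtered word list.
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams).most_common(k)

sample = "I felt the changes begin to begin! " * 3
print(top_ngrams(sample, 3))
```

With the stop words removed, "I felt the changes begin to begin!" collapses to "felt changes begin begin", which is where phrases such as "changes begin begin" come from.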

@ajtran303

Scaling Up

With the previous parameters, asking for nsamples=5 returned about 3,000 words.

50000/3000 = 16.66...

let's round up to 17

17 * 5 = 85

By batch generating 85 nsamples, I will have over 50,000 words. Let's look to adjust some parameters.
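
The same estimate as a small Python sketch. The helper name `samples_for_target` is hypothetical; the numbers are the rounded figures from this comment.

```python
import math

def samples_for_target(target_words, words_per_run, samples_per_run):
    """Round the number of runs up, then convert runs back to samples."""
    runs = math.ceil(target_words / words_per_run)
    return runs * samples_per_run

# 5 samples returned about 3,000 words; the goal is 50,000 words.
print(samples_for_target(50_000, 3_000, 5))
```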

ajtran303 commented Nov 27, 2020

Getting to 50,000 words

GPT-2 has picked up "Chapter X" headings and emits them in its output. So we will add a little snippet that procedurally generates a chapter title and use it as the sample delimiter instead of the default ====================.

And the enemy’s fighters are busy with making sure they don’t get caught.
I know some people’s lives are going to be saved by letting people know about anything.
But I’m not one of those people. In fact, I’m going to tell you, I’m going to tell you that no matter
how bad things get, it’s always a nice day for rain.
A beautiful day.
Somehow, I got to where I was supposed to be.
“Thanks, Rachel,” I said. “But I can’t argue about that.”
I turned my back on Jake and looked at Cassie. She looked at me like she was crazy. But I
knew better than to think about it.
I turned my head slowly, almost sad, and looked away.
“You’re careful what you say, Rachel. You can’t get us all killed,” I said.
I saw Cassie’s eyes jump up at me, her expression troubled. She turned her head to look at me.
“Rachel? I sure hope not.”
I didn’t say anything. I just turned my head and walked away.

Chapter 20
I had a really bad feeling about this. I mean, I can’t even remember what I said to my mom, but I
was sure, it was the way she had said it to me.
“Rachel,” she said with a laugh. “I know it’s hard for you to remember that, but I really don’t want you
to get hurt because I will. I mean, I think it will be okay. I mean, we’re all here. I’m glad to be
here. I’m glad to be a part of something great.”
“It is okay,” she said. Her smile faded. I could just make out her eyes in horror. I guess I
wondered more about the things she said to me, but I couldn’t remember any of it.
I walked up to her and looked away. Then I stepped into the shade. I turned to face her.
She glared at me with her big, evil eyes. “Rachel,” she said softly, “tell me what’s bothering you.”
“It’s nothing,” I said.
“Cassie,” she said. “I know you’re upset over the Yeerks, but it’s not something you want to talk to
about. I mean, it’s not something you want to hear about.”
“It’s not something I want to hear,” I said. “I mean, someday I want to get to know you. I’m
considerately worried about you, Tobias.”
She looked at me with her big, evil eyes. I guess I kind of had a hard time understanding what she
said. I guess she was trying to make me think that I was worried about Tobias.
“I know, Cassie,” she said. “But you don’t have to make that kind of decision. We’ll talk about it.
Now, if it bothers you, you don’t have to make that decision.”
“Yeah. I can’t think about it,” I said. “But ...”
“Yeah. I know, Cassie.”
“Maybe it’s the times we were together,” she said.
I considered that for a while, for a while. Then I realized I wasn’t listening to her. She was looking
at me like I was insane. Like she knew what was going on.
“Okay,” I said. I looked back to the side of the pool. The pool was dark, so I couldn’t see
much of it. But at least I could see the blue box in the water.
I turned my head toward the box and tried to remember what it looked like. It was a normal cube.
It was gray-green with a white center.
I turned my head and looked back to the box again. It was a rectangle.
But it wasn’t a box.
At least it wasn’t a toy.
“What do you think it is?” Cassie asked.
I shrugged. “It looks kind of like a toy.”
She nodded her head. “It is sort of like a toy.”
“What toys?”

Let's modify the batch generation parameters to fit our needs now.

gen_file = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

import random

chapter_title = 'Chapter ' + str(random.randrange(31))

gpt2.generate_to_file(sess,
                      destination_path=gen_file,
                      length=1023,
                      temperature=0.7,
                      nsamples=85,
                      batch_size=20,
                      sample_delim=chapter_title
                      )

ajtran303 commented Nov 27, 2020

Debugging

Error

AssertionError                            Traceback (most recent call last)
<ipython-input-11-de1a1b92f5a6> in <module>()
     11                       nsamples=85,
     12                       batch_size=20,
---> 13                       sample_delim=chapter_title
     14                       )

1 frames
/usr/local/lib/python3.6/dist-packages/gpt_2_simple/gpt_2.py in generate(sess, run_name, checkpoint_dir, model_name, model_dir, sample_dir, return_as_list, truncate, destination_path, sample_delim, prefix, seed, nsamples, batch_size, length, temperature, top_k, top_p, include_prefix)
    426     if batch_size is None:
    427         batch_size = 1
--> 428     assert nsamples % batch_size == 0
    429 
    430     if nsamples == 1:

AssertionError: 

Cause

nsamples must be a multiple of batch_size, that is, nsamples % batch_size must be zero. With nsamples=85 and batch_size=20, 85 % 20 = 5, so the assertion fails. Let's ask for 80 samples first and do a word count.
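
A minimal sketch of picking a valid sample count before calling the generator. The helper `round_to_batch` is hypothetical and not part of gpt-2-simple.

```python
def round_to_batch(nsamples, batch_size, up=False):
    """Return the nearest multiple of batch_size below (or above) nsamples."""
    remainder = nsamples % batch_size
    if remainder == 0:
        return nsamples
    lower = nsamples - remainder
    return lower + batch_size if up else lower

print(85 % 20)                          # the remainder that trips the assertion
print(round_to_batch(85, 20))           # round down
print(round_to_batch(85, 20, up=True))  # round up
```

Rounding down keeps the run short at the cost of fewer words; rounding up overshoots the target instead.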

Solution

gen_file = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

import random

chapter_title = 'Chapter ' + str(random.randrange(31))

gpt2.generate_to_file(sess,
                      destination_path=gen_file,
                      length=1023,
                      temperature=0.7,
                      nsamples=80,
                      batch_size=20,
                      sample_delim=chapter_title
                      )

ajtran303 commented Nov 27, 2020

First Attempt at 50,000 words

Download the batch generation

We fixed the bug! Now for the moment of truth

files.download(gen_file)

Unix command line to find Word Count

How many words do we have?

$ wc -w gpt2_gentext_20201127_064236.txt
49266 gpt2_gentext_20201127_064236.txt

We are just 734 words short! Let's create one more generation of nsamples=1 and see how much closer we can get.

# previous code
                      nsamples=1,
                      batch_size=1,
# then
files.download(gen_file)
$ wc -w gpt2_gentext_20201127_065328.txt
750 gpt2_gentext_20201127_065328.txt

750! Nice. Now we have over 50,000 words! But they are in two files. So let's concatenate them into one file. We used a similar strategy earlier, with a glob. So we'll do it again here.

$ cat gpt2*.txt > animorphs-the-lost-chapters.txt
$ wc -w animorphs-the-lost-chapters.txt
50015 animorphs-the-lost-chapters.txt
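
For anyone without a Unix shell, here is a rough Python equivalent of the cat and wc -w steps. The function names are hypothetical, and `str.split()` only approximates how wc counts words.

```python
import glob

def concatenate(pattern, destination):
    """Join every file matching pattern (sorted by name) into destination."""
    # Resolve the inputs first so the output file is never read back in.
    paths = [p for p in sorted(glob.glob(pattern)) if p != destination]
    with open(destination, "w", encoding="utf-8") as out:
        for path in paths:
            with open(path, encoding="utf-8") as src:
                out.write(src.read())

def word_count(path):
    """Count whitespace-separated words, roughly like `wc -w`."""
    with open(path, encoding="utf-8") as f:
        return len(f.read().split())
```

Calling `concatenate("gpt2*.txt", "animorphs-the-lost-chapters.txt")` followed by `word_count("animorphs-the-lost-chapters.txt")` mirrors the two shell commands. Concatenation adds no separator between files, so if the first file does not end with a newline, its last word fuses with the first word of the next file. That may be why the combined count (50,015) is one less than 49,266 + 750.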

We did it!!!

@ajtran303

Analysis of animorphs-the-lost-chapters.txt

Statistics again from our favorite free online Word Counter Tool

Statistics

Details

50,015 Words
5,107 Unique Words
250,813 Characters
200,692 Characters (no spaces)
4,922 Sentences
22,415 Longest Sentence (words)
1 Shortest Sentence (words)
11 Avg. Sentence (words)
51 Avg. Sentence (chars)
4 Avg. word length
4,802 Paragraphs
111.1 Pages
62,239 Syllables
4,931 Lines
78,896 Words (Publisher) 
7-8th Grade Reading Level 
3 hrs 2 mins Reading Time 
4 hrs 38 mins Speaking Time 
61 hrs 28 mins Hand Writing Time 

Keyword Density x1

351 (3%) saw 
303 (3%) could 
293 (3%) going 
215 (2%) know 
214 (2%) like 
209 (2%) human 
194 (2%) all 
190 (2%) see 
160 (2%) felt 
151 (1%) just 

Keyword Density x2

104 (5%) could see 
49 (3%) long long 
39 (2%) could feel 
36 (2%) heard sound 
34 (2%) ask made 
29 (1%) hork bajir 
28 (1%) yeerk pool 
26 (1%) human mind 
23 (1%) felt yeerks 
23 (1%) yeerks all 

Keyword Density x3

26 (3%) long long long 
23 (3%) felt yeerks all 
20 (2%) ask made enemies 
14 (2%) mind part human 
13 (2%) part human mind 
13 (2%) because felt sorry 
12 (1%) mad because felt 
11 (1%) saw long long 
11 (1%) yeerks all weakness 
11 (1%) yeerks all power 

@ajtran303

And skimming through all of the instances of the word "Chapter", there are 156 chapters!
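
Instead of skimming, the headings could be counted with a short script. This is a sketch, assuming a heading always looks like "Chapter" followed by a number; it matches anywhere in the text rather than only at line starts, because a heading may be fused to the words around it.

```python
import re

# Assumption: every heading is the word "Chapter", a space, then digits.
CHAPTER_RE = re.compile(r"Chapter \d+")

def count_chapters(text):
    """Count every 'Chapter N' occurrence in the text."""
    return len(CHAPTER_RE.findall(text))

print(count_chapters("Chapter 22Jake was\n\nChapter 3\nI ran."))
```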

ajtran303 commented Nov 27, 2020

Read the first draft here!

animorphs-the-lost-chapters.txt

@ajtran303 ajtran303 changed the title Animorphs: The Lost Chapters Animorphs: The Lost Chapters [status: DONE] Nov 27, 2020
@ajtran303 ajtran303 changed the title Animorphs: The Lost Chapters [status: DONE] Animorphs: The Lost Chapters [first draft DONE] Nov 27, 2020
@ajtran303

There are a few more days before this ends, so I will try to think of anything else I'd like to do. One thing that stands out is generating additional text in order to "curate" a better novel. The current first draft has chapters with a lot of repetitive text. Ideally, I could delete those sections, generate more text, and append or edit it in.

@ajtran303

There are almost 5000 lines of text in this novel. It shouldn't take too long to scroll through and "prune" it.

@ajtran303

But I would also be interested in generating more text.

@ajtran303

Maybe running some NLP tools on the original corpus and comparing it to what GPT-2 did.

@ajtran303

Or even have GPT-2 generate a million-word corpus to rival the source material!

1,449,150 words, according to reddit

And 1.7 million by my shell scripts. So around one and a half million words!

At about 600 words per sample (5 samples returned about 3,000 words), 1.5 million words needs an nsamples size of about 2500!!!

@ajtran303

I noticed that I was missing a newline character in my chapter_title delimiter:

chapter_title = 'Chapter ' + str(random.randrange(31)) + '\n'

I will use it to generate the 1 million-some words.

                      nsamples=2500,
                      batch_size=20,

@ajtran303

I would consider re-training GPT-2 to recognize Chapters by labelling each chapter from the Corpus data set.

When finetuning GPT-2, it has no sense of the beginning or end of a document within a larger text. You'll need to use a bespoke character sequence to indicate the beginning and end of a document. Then while generating, you can specify a prefix targeting the beginning token sequences, and a truncate targeting the end token sequence.

ajtran303 commented Nov 27, 2020

Second Draft Considerations

There were a couple of things in the first draft that I think can be improved.

  1. The "Chapter" delimiters were lacking a newline character, so you'd see something like Chapter 22Jake was
  2. There's a lot of weird repetition that can be pruned to make way for more "cohesive" text generations
  3. Basically, a raw corpus was fed into GPT-2 during the fine-tuning stage. Any "patterns" of chapters that it outputs come from its own inference; it was not trained to recognize Chapters in these books. By labeling the corpus and presenting the input as "a series of documents" instead of "one giant document, maybe", the hope is that GPT-2 will gain a more "focused" insight into the composition of a "Chapter". After all, this project is about The Lost Chapters!

Mid Level Design

Preparing the raw data for processing

The biggest task would be to label all of the book chapters with a "bespoke character sequence to indicate the beginning and end of a document."

  1. Remove the Table of Contents from all the texts.
  2. At each section starting with Chapter, add the token sequences.
    • Some books have a Prologue or Introduction and I will treat them the same way as a Chapter
    • Some books have Part 1, 2, 3, etc. And I will ignore these.
    • Some books have an Epilogue and I am noting them here for my reference (22.5)
    • Some books have more labels that should not be ignored (47.5)
    • The last book (54.0) has a section A Letter to the Fans. I am not sure whether to include this.

The token sequences will be <|startoftext|> and <|endoftext|>

<|endoftext|>
<|startoftext|>
Chapter 2
 ...
<|endoftext|>
<|startoftext|>
Chapter 3
...
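
The labeling step could be sketched as below. This is not the script actually used here: the function name `label_chapters` and the heading pattern are assumptions, headings are expected to sit on a line of their own, and anything before the first heading (title page, Table of Contents) is dropped.

```python
import re

START, END = "<|startoftext|>", "<|endoftext|>"

# Assumption: a heading is a whole line such as "Chapter 12" or "Prologue".
HEADING_RE = re.compile(r"^(?:Chapter \d+|Prologue|Introduction)[ \t]*$",
                        re.MULTILINE)

def label_chapters(book_text):
    """Wrap each chapter of one book in start and end token sequences."""
    starts = [m.start() for m in HEADING_RE.finditer(book_text)]
    ends = starts[1:] + [len(book_text)]
    chapters = [book_text[s:e].strip() for s, e in zip(starts, ends)]
    return "\n".join(f"{START}\n{chapter}\n{END}" for chapter in chapters)

book = "Contents\nPrologue\nHello.\nChapter 1\nMy name is Jake.\n"
print(label_chapters(book))
```

Epilogues and the other special labels mentioned above would need their own entries in the pattern.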

After labeling the data, I will start a new instance of gpt-2-simple and then retrain it for 2000 steps, just like in the first draft.

BONUS: Cross-training the deep learning model with fanfiction

According to another reddit post from the Animorphs subreddit, there is a fanfiction that has a word count surpassing the first 13 books.

Animorphs: The Reckoning is a fanfic with 64 chapters and 545,625 words! Over half a million words - that's almost 11 extra novels!

This would be a cool "stretch goal": "cross-train" the main corpus with a giant fanfic and see what the output is!

ajtran303 commented Nov 27, 2020

Easter Egg

A Letter to the Fans by K.A. Applegate

A Letter to the Fans
I know, I know, it’s rotten of me to leave you hanging at the end like that. But I figured the
Animorphs should go out the same way they came in: Fighting.
Well, here it is at long last: the final chapter in the Animorphs story. It began in the summer of
1996. It ends in the summer of 2001. Five years, 54 regular titles, 4 Chronicles, 5 Megamorphs and 2
Alternamorphs. An amazing number of you have read all those books. I am deeply grateful.
I had a lot of fun writing these characters. I know it sounds pretentious to say that I’ll miss them,
but I will. It seems strange to think that I won’t ever again write “My name is ...” It makes me a little
sad to say good-bye to Andalites, Hork-Bajir, Chee, Taxxons, and even Yeerks. It was fun sitting
down every day at my computer to invent that strange universe.
There are a bunch of people to thank. (Hey, what is this, an Academy Awards speech?) First of
all, Scholastic, in particular Jean Feiwel, Tonya Alicia Martin, and Craig Walker. Also the talented
folks who created such great art for the series. And, of course, the people who never get mentioned
but who are responsible for the crucial step from publisher to bookstore: the sales and marketing
force.
Mostly, I want to thank you guys, the readers. You praised, you complained, you extolled, you
demanded, you asked questions that sometimes I couldn’t answer. You told your friends, you started
Web sites, you sent letters and e-mails, and wrote fan fiction. You pointed out every error I made. You
were thoughtful and critical and imaginative. You were loyal.
I want you all to know that it is my choice to end Animorphs. Much as I’ll miss it, the time had
come. Time to say good-bye, Jake. Good-bye, Cassie. You, too, Tobias and Marco and Ax. Goodbye,
Rachel.
And now would be the time for me to say good-bye to you ... but, I’m off to a new series called
Remnants, and I’m hoping I’ll see you over there, in that new universe. If not, thanks from the bottom
of my heart for everything.
If you’re coming along on the next trip, grab onto something because we’re going to start off by
blowing up the entire world. Then the real trouble will start.
You may now demorph.

  • K.A. Applegate

ajtran303 commented Nov 27, 2020

I started training a new gpt-2-simple instance on this labeled data set at 1:00pm. I set it to 2000 steps, and it should be done around 2:30!

ajtran303 commented Nov 27, 2020

Introducing More Variables

The temperature parameter can be specified when generating to let the machine give more "creative" output. The default value is 0.7 and I have generated some additional samples for 0.8, 0.9, and 1.0.
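
To get a feel for what the parameter does, here is a general illustration of temperature scaling as it is commonly applied when sampling from a language model. It is a sketch of the idea, not gpt-2-simple's internal code, and the three scores are made-up values.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0.7, 0.8, 0.9, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures flatten the distribution so unlikely tokens get picked more often.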

temperature=0.7 samples

gpt2-gen-temp-7.txt

Chapter 9 - Jake
I was shocked, because I was.
I had never seen Cassie turn that way, but I just assumed she was following in the
direction of Marco.
I set my frozen jaw to the ground; she bit carefully.

Chapter 22
The hawk looked around. “We’re in a big, open space, right? I mean, most of the house is still
rotted old, so you’d think it was built before. Right?
“Got it,” I said.
I grabbed the first, and last, hooks off of the falcon. Just enough to hang.

Chapter 3
I was there, with the others. We all went into the bathroom. And I was our guide.
<Okay,> Jake said. <Let’s morph.>
I morphed to human and began to demorph.
“Okay,” Marco said. “It’s like a big, dark, underground cavern.”

temperature=0.8 samples

gpt2-gen-temp-8.txt

Chapter 5
I staggered back.
<The shack!> I cried.
<Yes, yes! You’ll be trapped there!>
<The shack!> I screamed again.

Chapter 15
The hawk, no longer blind, began to glow. It moved surprisingly fast, almost as fast as a
human eye could follow.
I heard what sounded like a Dracon beam hit the base of the tree.
A Dracon beam was aimed at the crow and fired.

Chapter 14 - Jake
I clicked and saw the Bug fighter go skyward, just a shred of a few hundred feet above us. I was
bleeding. I was panting.
“Rachel?” I called out, trying out the poison off my tongue.
A moment later, I was in the struggle. The Bug fighter was zooming up, zooming out.

temperature=0.9 samples

gpt2-gen-temp-9.txt

Chapter 9
“What the ... ?” I asked, not daring to make eye contact.
“You do know what?” Frank asked.
“I know all about the Yeerk guerrilla movement,” Jake said. “That’s what Jake, the Andalite military guy,
told us. But even now, the guerrilla movement has been spread out over a large area in the mountains. To

Chapter 13
3:30 P.M.
We were here at the theater. We were in the main studio, but it was built well so that
students could view all the action. I put my hand on the very delicate instrument that was part cowl and part
pocket.

Chapter 12
3:10 P.M.
I was busy with Visser Two. I was busy with it, too. I knew she was sinking deeper in her
deepest thoughts. She was worried about her fellow Andalite. She looked over her shoulder at me,
considering my reaction. I was calculating her reaction.

temperature=1.0 samples

gpt2-gen-temp-10.txt

Chapter 8
As soon as I had squeezed between the clotted permafrost and the low branches, The Blue
Island crept along.
I wasn’t surprised when we came within a foot of a fully formed lodge, some kind of a thermal shelter.
The entire lodge was covered in a layer of whitish, whitish-red vegetation covered by eucalyptus-

Chapter 24 - Nice Rachel
Fluffer eyes squint. He was in his usual spot in the corner of the kitchen near the computer
monitor. He spread his arms wide, the way two humans would do when they had just had a few
excruciating bites.
We all stared at him, then bent forward to stare at the real Richard. We were the gamer

Chapter 2
The boy answered shallowly in three little words. “Arrow” wasn’t wrong. It’s really only
one of the many sounds I know of Joe Louis. With some morphing technology my superior can create. Strange,
seeming.
“Joe?” Campbell repeated in his smallest whisper ever.

Notes for Data cleanup

Some of the output still includes our "labels" <|endoftext|> and <|startoftext|>. So when processing the final novel, they should be removed from the text.
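
A minimal sketch of that cleanup step in Node.js, assuming the generated text is already loaded as a string. The helper name stripTokens is hypothetical; only the two token strings come from the note above.

```javascript
// Hypothetical helper: remove the GPT-2 delimiter tokens left in generated text.
const TOKENS = ["<|startoftext|>", "<|endoftext|>"];

// split/join replaces every occurrence without needing to escape
// the regex-special "|" characters inside the tokens.
const stripTokens = (text) =>
  TOKENS.reduce((out, token) => out.split(token).join(""), text);

const sample = "<|startoftext|>Chapter 9\nI was panting.<|endoftext|>";
console.log(stripTokens(sample));
```

Using split/join rather than a regular expression sidesteps the escaping that the pipe characters would otherwise need.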

@ajtran303
Author

ajtran303 commented Nov 27, 2020

Generating 1,000,000 Words

With this new "Chapter-focused" layout, let's try getting a million words!

gen_file = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

gpt2.generate_to_file(sess,
                      destination_path=gen_file,
                      run_name='ani2',
                      temperature=0.7,
                      prefix='<|startoftext|>',
                      truncate='<|endoftext|>',
                      include_prefix=False,
                      nsamples=2000,
                      batch_size=20,
                      sample_delim=''
                      )

This is going to take a while to run!

@ajtran303
Author

Obstacle

I walked away from my computer to let it work. It went to sleep and dropped its connection to the Google Colaboratory VM. So I'm changing my computer settings to NOT DO THAT and trying again. Asking for 2000 all at once is risky, so I will ask for 4 batches of 500.

@ajtran303
Author

ajtran303 commented Nov 27, 2020

Obstacle

My internet connection has become very spotty and I keep disconnecting from the runtime. I will now try smaller nsamples values. Asking for 80 should get me almost 50,000 words. Do that 20 times and I get a million! Haha.

If I ask for 160, I would only have to repeat 10 times. With nsamples=160, each run takes a little more than five minutes, so this will take about an hour to complete.
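
The arithmetic behind those estimates can be sketched as follows. The function name planRuns is hypothetical, and the words-per-sample figure is only the rough rate implied above (80 samples for about 50,000 words), so treat the results as estimates.

```javascript
// Rough planning arithmetic for batched generation.
// Assumption from the comment above: 80 samples give roughly 50,000 words.
const WORDS_PER_SAMPLE = 50_000 / 80; // about 625 words per sample

// Hypothetical helper: how many runs are needed, and roughly how long they take.
const planRuns = (targetWords, nsamples, minutesPerRun) => {
  const wordsPerRun = nsamples * WORDS_PER_SAMPLE;
  const runs = Math.ceil(targetWords / wordsPerRun);
  return { wordsPerRun, runs, totalMinutes: runs * minutesPerRun };
};

console.log(planRuns(1_000_000, 160, 5));
// { wordsPerRun: 100000, runs: 10, totalMinutes: 50 }
```

Since each run actually takes a little more than five minutes, the 50-minute figure is a lower bound, which is where "about an hour" comes from.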

Approach

I am just going to ask twice. I already have a million words from the first iteration (I didn't post about it earlier), and asking twice has yielded 141,627 words to curate from!

@ajtran303
Author

ajtran303 commented Nov 28, 2020

First Impressions on Data Collection

ani2-raw.txt

We have 141,627 words generated in a 14,870-line document!

$ cat ani2-raw.txt | grep 'Chapter' | wc -l
352

And it looks like there are 352 chapters.

Labeling the Initial Data Produced Better Results

Pretty much this entire document is arranged into chapters! Looking good!

I think a programmatic approach to curation is the next call.

High level design

  1. Clean the data by removing any <|startoftext|> tokens that were generated.
  2. From this one text file, extract each chapter into a separate text file.

I can do step 1 in my text editor, with a find and replace all.
I can do step 2 programmatically, by again using the fs module.

@ajtran303
Author

ajtran303 commented Nov 28, 2020

Organizing the Data

We're going to use Node.js again for file manipulation.

// chapterize.js

const fs = require("fs");

let a;

fs.readFile("ani2-cleaned.txt", "utf8", (err, data) => {
  if (err) throw err;
  a = data;
});

const chapters = a.split("Chapter");

chapters.forEach((chapter, i) => {
  chapterData = `Chapter${chapter}`;
  let fileName = `ani2-${i + 1}.txt`;
  let targetOutput = `./gen_chapters/${fileName}`;

  fs.writeFile(targetOutput, chapterData, "utf8", (err) => {
    if (err) throw err;
    console.log(`${fileName} was created.`);
  });
});

console.log(`${chapters.length} files created.`);

$ mkdir gen_chapters

Obstacle

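The script failed when run from the command prompt with $ node chapterize.js, but opening the $ node REPL and pasting the program in worked. The likely cause is that fs.readFile is asynchronous: in a script, a.split runs before the callback has assigned a, while in the REPL the file read has time to finish before the next pasted statement is evaluated.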
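
A minimal sketch of a variant that should run as a plain script, assuming a synchronous read is acceptable for a file of this size. It swaps fs.readFile for fs.readFileSync so the text is available before split is called. The names splitChapters and chapterize are hypothetical; the split logic and file naming mirror chapterize.js.

```javascript
const fs = require("fs");
const path = require("path");

// Split on the literal word "Chapter" and put it back at the start of each
// piece. The first piece is whatever precedes the first "Chapter", so N
// headings produce N + 1 pieces, as in the original script.
const splitChapters = (text) =>
  text.split("Chapter").map((piece) => `Chapter${piece}`);

// Read the whole file first, then write one numbered file per piece.
const chapterize = (inputPath, outDir) => {
  const text = fs.readFileSync(inputPath, "utf8");
  fs.mkdirSync(outDir, { recursive: true }); // replaces the manual mkdir
  const chapters = splitChapters(text);
  chapters.forEach((chapter, i) => {
    fs.writeFileSync(path.join(outDir, `ani2-${i + 1}.txt`), chapter, "utf8");
  });
  return chapters.length;
};

// Usage: chapterize("ani2-cleaned.txt", "./gen_chapters");
```

Because every call is synchronous, the returned count is only available once all files have been written.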

Results

Now all of the chapters are organized into 353 separate files!

Next Steps

I can programmatically curate 50,000 words from these files. I need to come up with a strategy for doing that.

@ajtran303
Author

Filtering the Data

I want to know the distribution of "Chapter Numbers" in the 353 generated files.

$ cat ani2-unraw.txt | grep 'CHAPTER' | cat > chapters.txt 

After a little text editor action to clean the data, let's use some more JavaScript to analyze the chapter distribution!

Analyzing Generated Chapter Distributions

// chapter-counter.js

const fs = require("fs");

// Read synchronously: fs.readFile is asynchronous and returns undefined,
// so its result cannot be assigned and split directly.
const a = fs.readFileSync("chapters.txt", "utf8");

// One chapter number per line.
const chapterNumbers = a.split("\n");

// Tally how many times each value occurs: { value: count }.
const countOccurrences = (arr) =>
  arr.reduce((prev, curr) => ((prev[curr] = ++prev[curr] || 1), prev), {});

console.log(countOccurrences(chapterNumbers));

$ node chapter-counter.js

Results

{
  '1': 5,
  '2': 1,
  '3': 16,
  '4': 10,
  '5': 29,
  '6': 16,
  '7': 9,
  '8': 18,
  '9': 19,
  '10': 10,
  '11': 11,
  '12': 39,
  '13': 14,
  '14': 17,
  '15': 15,
  '16': 18,
  '17': 12,
  '18': 23,
  '19': 11,
  '20': 8,
  '21': 21,
  '22': 9,
  '23': 7,
  '24': 3,
  '25': 2,
  '26': 3,
  '27': 3,
  '35': 1,
  '39': 2,
  '': 1
}

So there is a nice distribution! It generated Chapters 1-27, and some odd ones at 35 and 39. There's also an "orphan" entry with an empty key, most likely a blank line at the end of chapters.txt. Interestingly, there is exactly ONE Chapter 2!
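
A small sketch of how the same tally could skip blank lines and report which chapter numbers never appear. Both function names are hypothetical, and the input is assumed to be one chapter number per line, as in chapters.txt.

```javascript
// Count chapter numbers, skipping blank lines so no "" key is produced.
const countChapterNumbers = (text) => {
  const counts = new Map();
  for (const line of text.split("\n")) {
    const key = line.trim();
    if (key === "") continue; // e.g. the empty string after a trailing newline
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
};

// Chapter numbers between lo and hi (inclusive) that never appear.
const missingChapters = (counts, lo, hi) => {
  const missing = [];
  for (let n = lo; n <= hi; n++) {
    if (!counts.has(String(n))) missing.push(n);
  }
  return missing;
};

const counts = countChapterNumbers("1\n2\n3\n3\n5\n");
console.log(counts.get("3")); // 2
console.log(missingChapters(counts, 1, 5)); // [ 4 ]
```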

@katstasaph

katstasaph commented Nov 28, 2020

can I submit a pull request to subtitle this 10,000 Bowls of OAT-freaking-MEAL

@ajtran303
Author

can I submit a pull request to subtitle this 10,000 Bowls of OAT-freaking-MEAL

Haha. I had to look up the passage.

<So we try and feed them addictive drugs,> Tobias said with obvious distaste.
"It's OAT-freaking-MEAL!" Marco exploded.
Cassie suddenly laughed.

I think that would be a cool little blurb on the book cover :) Something like

"I give it 10,000 out of 10,000 Bowls of OAT-freaking-MEAL!" - @katstasaph 

@ajtran303
Author

ajtran303 commented Nov 29, 2020

Summarizing the Data

Remember that we have 353 generated chapters organized into a folder, /gen_chapters:

  let targetOutput = `./gen_chapters/${fileName}`;

Get the word count of each chapter

Let's loop over every file in that directory and get a word count from each of them.

$ for i in gen_chapters/*.txt; do wc -w $i; done

     457 gen_chapters/ani2-10.txt
     227 gen_chapters/ani2-100.txt
     278 gen_chapters/ani2-101.txt
...

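Very nice! Let's pipe this into sort! The -r flag reverses the order. Because this wc pads the counts to the same width, the text sort comes out in descending numeric order (sort -rn would make the numeric comparison explicit):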

$ for i in gen_chapters/*.txt; do wc -w $i; done | sort -r

     889 gen_chapters/ani2-259.txt
     849 gen_chapters/ani2-332.txt
     802 gen_chapters/ani2-141.txt
...
       7 gen_chapters/ani2-148.txt
       2 gen_chapters/ani2-241.txt
       1 gen_chapters/ani2-1.txt

Wow! This is a really cool spread! Let's save it into a file.

$ for i in gen_chapters/*.txt; do wc -w $i; done | sort -r | cat > word-count.txt

(Wow, I can't believe that worked!)

Screen Shot of previous shell command

@ajtran303
Author

ajtran303 commented Nov 29, 2020

Getting to 50,000 words

With some text editor find and replace all magic, let's turn our text file into word-count.csv

889,gen_chapters/ani2-259.txt
849,gen_chapters/ani2-332.txt
802,gen_chapters/ani2-141.txt
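
The find-and-replace step could also be scripted. This is a minimal sketch, assuming each line of word-count.txt is a count followed by whitespace and a path, as in the wc output above. The helper name wcToCsv is hypothetical.

```javascript
// Hypothetical helper: turn `wc -w` output lines into "count,path" CSV rows.
const wcToCsv = (wcOutput) =>
  wcOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "")
    .map((line) => {
      // "<count><whitespace><path>"; the path is kept as-is.
      const match = line.match(/^(\d+)\s+(.+)$/);
      if (!match) throw new Error(`Unexpected wc line: ${line}`);
      return `${match[1]},${match[2]}`;
    })
    .join("\n");

const sample =
  "     889 gen_chapters/ani2-259.txt\n     849 gen_chapters/ani2-332.txt\n";
console.log(wcToCsv(sample));
// 889,gen_chapters/ani2-259.txt
// 849,gen_chapters/ani2-332.txt
```

Throwing on an unexpected line makes a stray row, such as a "total" line from a different wc invocation, visible instead of silently producing a bad row.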

Let's write another JavaScript program that will:

  • Read the CSV file
  • Iterate over every row of data
  • Keep a sum of the word counts until sum >= 50_000
  • Make a list of the chapters counted up until that point
  • Make that list useful to us

We will use the csv-parse package. There is a section in this tutorial that will get us started with creating our own parser.

$ npm i -s csv-parse

// word-counter.js

const fs = require("fs");
// csv-parse v4 exports the parser factory directly; from v5 on it is a
// named export: const { parse } = require("csv-parse");
const parse = require("csv-parse");

// Recursive shuffle: pick a random element, then shuffle the rest.
// https://stackoverflow.com/questions/2450954/how-to-randomize-shuffle-a-javascript-array
const getShuffledArr = (arr) => {
  if (arr.length <= 1) {
    return arr;
  }
  const rand = Math.floor(Math.random() * arr.length);
  return [arr[rand], ...getShuffledArr(arr.filter((_, i) => i != rand))];
};

let wordCounter = 0;
const chapters = [];

// Called once with all parsed rows; each row is [wordCount, chapterPath].
const counter = (err, records) => {
  if (err) throw err;
  for (const [wordCount, chapter] of records) {
    wordCounter += parseInt(wordCount, 10);
    chapters.push(chapter);

    // Stop as soon as the running total reaches the 50,000-word goal.
    if (wordCounter >= 50_000) {
      const message =
        `Task complete! ` +
        `There are ${chapters.length} chapters and ${wordCounter} total words.` +
        `\n` +
        `One last step! Get the book by using this command in the terminal: ` +
        `\n` +
        `\n` +
        `cp ${getShuffledArr(chapters).join(" ")} lost_chapters/ ` +
        `&& cat lost_chapters/*.txt animorphs-the-lost-chapters-final.txt`;
      console.log(message);
      break;
    }
  }
};

const parser = parse(counter);

fs.createReadStream("word-count.csv").pipe(parser);

A little bit of meta-programming! I wrote code to write code for me! Every time this script is run, it generates the command with the chapters listed in a different order.

$ node word-counter.js

Task complete! There are 80 chapters and 50500 total words.
One last step! Get the book by using this command in the terminal: 

I definitely ran that script several times to see the different outputs. Woo!
I could have used more JavaScript to write that file automatically, but I wanted to try it this way to see how to write code that writes code for me. Successful experiment!

Let's use that new command now!
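
For comparison, the all-JavaScript route mentioned above could look roughly like this. It is a sketch, not the code used for the novel: the function names are hypothetical, rows are assumed to be [wordCount, path] pairs in the order of word-count.csv, and a Fisher-Yates shuffle is swapped in for the recursive one.

```javascript
const fs = require("fs");

// Fisher-Yates shuffle on a copy of the array.
const shuffle = (arr) => {
  const out = [...arr];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
};

// Take rows in order until the running word total reaches the goal.
const pickChapters = (rows, goal) => {
  const picked = [];
  let total = 0;
  for (const [wordCount, chapterPath] of rows) {
    if (total >= goal) break;
    total += Number(wordCount);
    picked.push(chapterPath);
  }
  return { picked, total };
};

// Concatenate the picked chapters, in shuffled order, into one output file.
const writeBook = (rows, goal, outPath) => {
  const { picked, total } = pickChapters(rows, goal);
  const text = shuffle(picked)
    .map((p) => fs.readFileSync(p, "utf8"))
    .join("\n");
  fs.writeFileSync(outPath, text, "utf8");
  return { chapters: picked.length, total };
};

// Usage, assuming simple "count,path" rows with no quoted commas:
// const rows = fs.readFileSync("word-count.csv", "utf8").trim()
//   .split("\n").map((line) => line.split(","));
// writeBook(rows, 50_000, "animorphs-the-lost-chapters-final.txt");
```

One difference from the shell approach is worth noting: a glob such as lost_chapters/*.txt is expanded by the shell in sorted order, whereas this version keeps the shuffled order in the final file.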

@ajtran303
Author

ajtran303 commented Nov 29, 2020

Running the Code

The input took two minutes to buffer into the prompt! Note to self: the next time I try to generate code for the shell, just write it into a script file.

BUG (fixed)

cat: animorphs-the-lost-chapters-final.txt: No such file or directory

I was missing a redirect operator. Let's fix that and try again. Thanks, Up Arrow Key!!!

nanogenmo

WOW! That was instant!!!

Check out those last sentences!

I was not the monster.
I was the monster.
“I love you,” I whispered.
“Yeah. I love you.”

@ajtran303
Author

Animorphs: The Lost Chapters

It is now complete!

animorphs-the-lost-chapters-final.txt

@ajtran303 ajtran303 changed the title from "Animorphs: The Lost Chapters [first draft DONE]" to "Animorphs: The Lost Chapters [Complete!]" Nov 29, 2020
@hugovk
Member

hugovk commented Nov 29, 2020

Congratulations!

Is the code in the comments here, or in a repo? (Fine either way, just checking :)

@ajtran303 ajtran303 changed the title from "Animorphs: The Lost Chapters [Complete!]" to "Animorphs: The Lost Chapters [Completed]" Nov 29, 2020
@ajtran303
Author

Congratulations!

Is the code in the comments here, or in a repo? (Fine either way, just checking :)

The code is all in here, in the style of "stream of consciousness."

@hugovk
Member

hugovk commented Nov 29, 2020

Good stuff, have a completed label!
