
A Young Person's Encyclopedia of the Land of Clandestine and its Capital Apocrypha #60

summerstay opened this issue Nov 24, 2020 · 3 comments



summerstay commented Nov 24, 2020

A Young Person's Encyclopedia of the Land of Clandestine and its Capital Apocrypha
Douglas Summers Stay

This is a GPT-3 generated work. I wrote an introduction, used it to generate a few articles, and then used those to generate many, many more articles. I then edited them to remove occasional references to working magic or present-day technology (I wanted a lost civilization, not a fantasy or science fiction novel) and to clean up a few ungrammatical sentences. So everything you see was written by GPT-3, though some of what it wrote has been edited out.
GPT-3 can only generate about a page of coherent text at a time, so a full novel was not really practical, but a novel-length collection of short articles works fine. If I do it again, I will have it randomly select from all previously generated articles for context, instead of from a much smaller fixed collection, so that it is less repetitive in topic and gradually builds up over time.
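Something like this is what I have in mind (a rough sketch, not code I have run; it reuses the generate() helper and tokenizer from the script below, and seed_examples is just a placeholder name for the hand-started articles):

# Sketch of the growing-context variant: each new article joins the pool
# that later prompts are sampled from.
pool = list(seed_examples)
pool_tokens = [len(tokenizer.encode(a)) for a in pool]

for _ in range(1000):
    prompt, budget = "", 0
    while budget < 1500:
        i = randint(0, len(pool) - 1)   # sample from the whole pool, old and new
        budget += pool_tokens[i] + 5
        if budget < 1500:
            prompt += pool[i] + '\n\n***\n\n'
    article = generate(prompt=prompt + 'The', max_tokens=512,
                       temperature=0.8, top_p=1, stop=['***'])
    pool.append(article)                # feed each new article back in
    pool_tokens.append(len(tokenizer.encode(article)))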
The most similar work I am aware of is Dictionary of the Khazars by Milorad Pavic. Codex Seraphinianus and Borges's "Tlön, Uqbar, Orbis Tertius" are also related to this idea of fictional nonfiction.
Here is the introduction I wrote:
"The ancient cave city of Apocrypha was rediscovered in 1913 by a brother and sister chasing their lost dog. A recent landslide had reopened a long-hidden entrance. The city had been carved from the desert rock and was large enough to host about 10,000 people and all their sheep and stores. Extensive scrolls detailed the customs and history of the people who lived there. It had been the capital of a minor kingdom known as Clandestine. The Encyclopedia of Clandestine was written based on these scrolls by author Lord Dunsany. It is celebrated for the beauty and depth of the descriptions contained therein, bringing the ancient city to life."

The mention of Lord Dunsany, and of how beautiful the writing is, was meant to shift the output quality upward, a trick that has worked for me in the past with GPT-3, but in this case it doesn't seem to have had much effect. The writing style (in my opinion) is a little amateurish, as if written for a school project, but the content is intriguing, and the mention of Dunsany may have pushed it toward mythology and other topics he was interested in. In any case I removed these lines from the final text. There were a few other places where I intervened, asking for the story behind some of the titles listed in one article, or for the lyrics to a song or poem, but these are rare and concentrated toward the beginning of the text.
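Concretely, the trick is just prompt framing, something like this (a hypothetical sketch, not the exact code; the frame string is abridged from the introduction above):

# The quality-shifting frame goes into the prompt but never into the output file.
frame = ("The Encyclopedia of Clandestine was written based on these scrolls "
         "by author Lord Dunsany. It is celebrated for the beauty and depth "
         "of the descriptions contained therein.")
article = generate(prompt=frame + '\n\n' + prompt + 'The',
                   max_tokens=512, temperature=0.8, top_p=1, stop=['***'])
output_file.write(article + '\n***\n')  # the frame itself is never written out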

My favorite part is all the absurd festivals. The articles constantly contradict each other, but still manage to build up a relatively consistent picture as a whole. I feel like GPT-3 is kind of like the dreaming mind: abundantly creative, but unable to stick to one view of the world.

Here is some code used to generate it:

# -*- coding: utf-8 -*-
"""
Created on Tue Sep 29 09:42:39 2020

@author: Doug
"""


import openai
from random import randint
from transformers import GPT2Tokenizer

entry_file = open("Clandestine.txt", "r")
output_file = open("Clandestine_out2.txt", "w")
openai.api_key = REMOVED

def generate(engine='davinci-beta', prompt='', max_tokens=0,
             temperature=0, top_p=0, frequency_penalty=0, logit_bias={}, stop='@@@@'):
    # Keep the request inside the model's 2048-token window no matter what --
    # though the output may get strange if the prompt has to be cut.
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
    prompt_tokens = len(tokenizer.encode(prompt))
    print(prompt_tokens)
    if prompt_tokens + max_tokens > 2038:
        max_tokens = 2048 - prompt_tokens
        print("[prompt too long, truncating max tokens]")
        if max_tokens < 1:
            # keep only the last 2038 tokens of the prompt, leaving a little
            # room for the completion
            prompt = tokenizer.decode(tokenizer.encode(prompt)[-2038:])
            max_tokens = 10
            print("[prompt still too long, truncating beginning of prompt]")

    response = openai.Completion.create(
            engine=engine, api_key=openai.api_key, prompt=prompt,
            max_tokens=max_tokens, n=1, logprobs=1, stop=stop,
            temperature=temperature, top_p=top_p,
            frequency_penalty=frequency_penalty)
    generated_text = response.choices[0].text
    return generated_text
            
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
lines = entry_file.readlines()

# Each seed article in Clandestine.txt sits on the line just before a '***' separator.
examples = []
for ii in range(1, 276):
    if '***' in lines[ii]:
        examples.append(lines[ii - 1])

# Precompute each example's token length for the prompt budget below.
token_examples = []
for example in examples:
    token_examples.append(len(tokenizer.encode(example)))

for jj in range(1, 1000):
    # Assemble a prompt from randomly chosen examples, up to about 1500 tokens.
    prompt = ""
    prompt_len = 0
    while prompt_len < 1500:
        value = randint(2, len(examples) - 1)
        prompt_len = prompt_len + token_examples[value] + 5  # +5 for the separator
        if prompt_len < 1500:
            prompt = prompt + examples[value] + '\n\n***\n\n'

    prompt = prompt + 'The'  # nudge the model into starting a new article
    description = generate(
            prompt=prompt, max_tokens=512, temperature=0.8, top_p=1,
            stop=['***'])
    print(description)
    output_file.write(description + '\n***\n')

entry_file.close()
output_file.close()
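(As a sanity check on the budget: the assembled prompt stays under 1500 tokens and max_tokens is 512, so a request tops out around 2012 tokens, safely inside davinci's 2048-token window; the truncation branch in generate() shouldn't fire here.)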
summerstay (Author) commented

Admins, please mark this as "completed".

hugovk (Member) commented Nov 25, 2020

Marked "completed"! 🎉

JonathanCRH commented

That a text this absorbing could have been generated so straightforwardly is sorcery as far as I'm concerned. I've been genuinely enjoying reading this. The Pit of Flies is astonishingly bleak...
