
pplm with gpt-neo #39

Open · AJamal27891 opened this issue Jun 4, 2021 · 8 comments

AJamal27891 commented Jun 4, 2021

pert_logits, past, pert_all_hidden = model(last, past=pert_past)

I'm trying to use GPT-Neo with PPLM. However, GPT-Neo requires upgrading the transformers library to transformers>=4.5, where the 'past' argument is replaced by past_key_values. When I change past to past_key_values, I get the following error:

TypeError: forward() got an unexpected keyword argument 'past'

Any ideas how I can solve this?

@dathath @julien-c

lierik commented Aug 16, 2021

I'm running into the same problem. Did you find any way to solve it?

pablonieto0981 commented Aug 16, 2021

Hi there, I've been working on this today.
The main problem is that transformers changed the structure of the past between v3.4 and v4.9.2. See below:

Transformers 3.4 >> list of torch.Tensor of length config.n_layers, each tensor of shape (2, batch_size, num_heads, sequence_length, embed_size_per_head) >> [tensor[2, batch, heads, seq_length, embed], t, t, t, t, ...]
Transformers 4.9.2 >> tuple of length config.n_layers, each element a tuple of two tensors of shape (batch_size, num_heads, sequence_length, embed_size_per_head) >> ((key, value), (key, value), ...)

This is relatively easy to solve for the initial issue raised here: just change pert_logits, past, pert_all_hidden = model(last, past=pert_past) to pass past_key_values=pert_past instead.

Note also that the output of model() is now a dict-like object, so it is best to write output_from_model = model(last, past_key_values=pert_past) and then pert_logits = output_from_model['logits'], etc.
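As a minimal sketch of the updated call (assuming transformers >= 4.x and a GPT-2-style model; output_hidden_states=True is needed to recover the third value the old tuple used to provide):

```python
output = model(last, past_key_values=pert_past, output_hidden_states=True)
pert_logits = output.logits
past = output.past_key_values
pert_all_hidden = output.hidden_states
```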

Still, afterwards, new problems between the old and new structure appear. Some are easy to deal with; others are more challenging, since the script has multiple generator functions that produce the old structure. A good understanding of how tensors are passed around the program is needed to make it work, particularly in the perturb_past function.

A hack workaround could be to translate between the old and new past structures across the script: keep the old format for the in-script functions and convert to the new format when sending the past to the model (transformers / torch libraries). Not sure whether that would work, though. A sketch of such a translation layer follows.
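Going by the documented shapes above, the translation layer might look like this (untested against the PPLM script; old_to_new and new_to_old are illustrative names):

```python
import torch

def old_to_new(past):
    """Old (3.x) format -> new (4.x) format.
    old: list of n_layer tensors, each (2, batch, heads, seq_len, head_dim)
    new: tuple of n_layer (key, value) pairs, each (batch, heads, seq_len, head_dim)
    """
    return tuple(tuple(layer.unbind(dim=0)) for layer in past)

def new_to_old(past_key_values):
    """New (4.x) format -> old (3.x) format: re-stack each (key, value) pair."""
    return [torch.stack(kv, dim=0) for kv in past_key_values]
```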

I might try it out, but for now I am considering just replicating the paper on my own. The mechanism of tweaking the past seems promising.

@callzhang commented:

@pablonieto0981 Thanks for the input. I am new to this repo. From a quick glance at the code, it seems the text-generation code could be replaced by transformers' own generate method, yes? Any success with your replication?

@pablonieto0981 commented:

@callzhang, so basically I went the route of coding a similar approach leveraging the standard transformers library. After analyzing the code, the key mechanism here seems to be tweaking the transformers past values to steer the GPT-2 generation process. Given that, I have implemented a function that perturbs the past values based on the desired steering concepts (text). This is done at the level of predictions = model(tokens_tensor), where model is the basic call for a PyTorch model as implemented in the transformers library.
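For readers looking for a starting point: be_creative_with_perturbation itself was never posted, but a rough sketch of the underlying PPLM-style mechanism (gradient ascent on the cached past toward a bag of words; all names below are illustrative, not the author's code) might look like this:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

def perturb_past_bow(model, past_key_values, last_token, bow_ids,
                     step_size=0.02, num_steps=3):
    """Nudge past_key_values so the next-token distribution favors bow_ids."""
    # Flatten the nested (key, value) structure into one list of leaf tensors.
    leaves = [t.detach() for kv in past_key_values for t in kv]
    deltas = [torch.zeros_like(t, requires_grad=True) for t in leaves]

    for _ in range(num_steps):
        perturbed = [t + d for t, d in zip(leaves, deltas)]
        # Rebuild the nested tuple-of-pairs structure the model expects.
        pkv = tuple(tuple(perturbed[i:i + 2]) for i in range(0, len(perturbed), 2))
        out = model(last_token, past_key_values=pkv)
        log_probs = torch.log_softmax(out.logits[:, -1, :], dim=-1)
        # Maximize the log-probability mass on the bag-of-words token ids.
        loss = -torch.logsumexp(log_probs[:, bow_ids], dim=-1).mean()
        loss.backward()
        with torch.no_grad():
            for d in deltas:
                d -= step_size * d.grad / (d.grad.norm() + 1e-10)
                d.grad.zero_()

    with torch.no_grad():
        final = [t + d for t, d in zip(leaves, deltas)]
    return tuple(tuple(final[i:i + 2]) for i in range(0, len(final), 2))

# Hypothetical usage:
tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
ids = tok("The forest behind my house", return_tensors="pt").input_ids
out = model(ids[:, :-1])  # prime the cache on all but the last token
bow_ids = tok(" dog", return_tensors="pt").input_ids[0]
pert_past = perturb_past_bow(model, out.past_key_values, ids[:, -1:], bow_ids)
```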

pablonieto0981 commented Sep 27, 2021

@callzhang as an example, see below the output achieved by the implementation (just run it):

```python
prompt = "The forest behind my house was full of horrors."
perturbation_text = "dog"
perturbation_score = [0.3]

output = be_creative_with_perturbation(tokenizer, model, prompt, perturbation_text, perturbation_score)

print(output)
```

"Snatches of noisome prowlers, scampering and floundering, would sometimes howl in the dark small hours, and at night when the wind shifted I could hear them howling in the underbrush. There was a daemoniac howling in the underbrush, and when I would whistle a warning cry they would vanish in a frightful swarm. Once I saw a vulture descend from the dark toad-like blasphemy that haunted the upper reaches of that swamp, and I thought of the howling I had heard in the night when I was a boy. There was a daemoniac howling in the underbrush, and when I shot an arrow into the thicket I thought I heard a horrible scurrying as of some gigantic hound. But worst of all was the matter of the prints in the road. Once in a while a great black shape would appear out of the deep woods behind the house and pawed at the moon, but it would always vanish in a frightful swarm."

There is a bit of repetition, but you can see how the generated text brings in concepts related to the perturbation_text, in this case the single word "dog".

@callzhang commented:

@pablonieto0981 Thanks for the reply. Do you have a snippet of your be_creative_with_perturbation function? BTW, I ended up using a prompt-based method, as it is much faster but less controllable.

@tungsontran commented:

Has anyone succeeded in porting this to GPT-Neo?

@yangshuodelove commented:

@pablonieto0981 Thanks for your example. Could you show us your function be_creative_with_perturbation? Is the prompt (e.g., "The forest...") a context? Is the perturbation_text (e.g., "dog") a bag of words? Thanks again.
