
How to turn off generator (make it a pointer only network) #5

Open
Hellisotherpeople opened this issue Feb 12, 2019 · 5 comments
Labels: enhancement (New feature or request)

@Hellisotherpeople

I'm really looking for an effective word-level summarization solution. It isn't clear to me how to turn off the "generator" part of the pointer-generator network.

Let me know if this is possible and how I can achieve this.

@Hellisotherpeople

I'm going to experiment with what looks like a simple hack (setting the chance of using the generator network to 0 when a certain parameter is true), and I will submit a PR if it works; in that case I will likely fork this project.

@Hellisotherpeople

So, I can make it use only the vocabulary available in the source text (by forcing the probability of pointing to 1 and the probability of generating to 0), but the ideal solution would not rearrange my source text. I am trying to train something that will "highlight" the most important parts of a document without rearranging it.

The ideal solution chooses, for each source word, whether to include it or not.

ymfa commented Feb 15, 2019

Yes, you can try to disable the generator in various ways, such as the way you described (setting the pointer probability to 1) or setting the vocabulary size to a very small number (I don't know if setting it to 0 will break anything). You're welcome to submit a PR if anything needs to be fixed in order to disable the generator.
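
As a sketch of the first option (this is not the actual code in this repo; `vocab_dist`, `attn_dist`, and `p_gen` are stand-ins for the generator's vocabulary distribution, the copy distribution from the attention weights, and the generation probability, assumed to already live in the same output space):

```python
import torch

def final_dist(vocab_dist, attn_dist, p_gen, pointer_only=False):
    # Pointer-generator mixture: p_gen weights the generator's vocabulary
    # distribution; (1 - p_gen) weights the copy distribution derived from
    # attention over the source. Both are assumed projected to the same
    # extended vocabulary.
    if pointer_only:
        # Force pure pointing: every output token must be copied
        # from the source text.
        p_gen = torch.zeros_like(p_gen)
    return p_gen * vocab_dist + (1 - p_gen) * attn_dist
```

Zeroing `p_gen` this way avoids touching the embedding and output layers, unlike shrinking the vocabulary.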

However, the task you are working on doesn't seem to require a full seq2seq model. You only need to label each input token as important or not important (i.e. binary classification). This can be achieved by running a bi-GRU or bi-LSTM (multiple layers if needed) over the input (similar to my encoder), and then applying a sigmoid function on the output state of each token to get a score between 0 and 1.
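
For illustration, a minimal PyTorch sketch of what I mean (the class name and the `pretrained_embeddings` argument are placeholders, not code from this repo):

```python
import torch
import torch.nn as nn

class TokenHighlighter(nn.Module):
    """Labels each input token as highlight (1) or not (0)."""
    def __init__(self, pretrained_embeddings, hidden_size=128, num_layers=1):
        super().__init__()
        # pretrained_embeddings: a (vocab_size, embed_dim) tensor of
        # pre-trained word vectors (e.g. GloVe), loaded directly into
        # the embedding layer.
        self.embedding = nn.Embedding.from_pretrained(pretrained_embeddings,
                                                      freeze=False)
        self.rnn = nn.LSTM(pretrained_embeddings.size(1), hidden_size,
                           num_layers=num_layers, batch_first=True,
                           bidirectional=True)
        self.scorer = nn.Linear(2 * hidden_size, 1)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> importance score per token in (0, 1)
        states, _ = self.rnn(self.embedding(token_ids))
        return torch.sigmoid(self.scorer(states)).squeeze(-1)
```

Pre-trained word embeddings drop in via `nn.Embedding.from_pretrained`, and training just minimizes `nn.BCELoss` between the scores and the 0/1 highlight labels.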

ymfa added the enhancement label on Feb 15, 2019

Hellisotherpeople commented Feb 15, 2019

@ymfa
I think you're correct. I originally tried to do this project with a "PoS"-style tagger that would distinguish between "underlined" and "not underlined" words, but I had issues even with a modern one like https://github.com/zalandoresearch/flair, because its tools for dealing with large datasets that don't easily fit in RAM are primitive. Your project works easily on my dataset.

Do you have any recommendations for frameworks or other projects that would be useful to me? I'd like to find a way to get pre-trained word embeddings into whatever tool I use for the token classification.

Hellisotherpeople commented Feb 15, 2019

Also, can you critique another idea I have?

I have a dataset that contains, essentially, a large number of news articles, an extractive "highlighted" version of each article, and an abstractive human-written summary. I think that if I train a model to go from the long news article to the short abstract, I can "tailor" the highlighting, so that an abstract saying "Strawberries taste bad" and one saying "Strawberries are yummy!" highlight the document differently.

Any ideas on doing this highlighting via attention? I've experimented with the idea by summing the attention layers and heads in BERT together and using the top words most attended to (but I think a naive sum of all attention layers is wrong, so it didn't work very well). I'll try to experiment with it using the visualization tool you've included. If it's possible, can you help me work out how to get a list of words and their corresponding attention scores from a trained example? I'd be forever in your debt!
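
For concreteness, here's roughly what my naive BERT experiment looked like, using HuggingFace's transformers (the averaging over layers, heads, and query positions is exactly the part I suspect is wrong):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

inputs = tokenizer("Strawberries are yummy!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer, each of shape
# (batch, num_heads, seq_len, seq_len). Stack the layers, then average
# over layers, heads, and query positions to get one score per token.
attn = torch.stack(outputs.attentions)        # (layers, batch, heads, seq, seq)
scores = attn.mean(dim=(0, 2, 3)).squeeze(0)  # (seq,)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, score in sorted(zip(tokens, scores.tolist()), key=lambda x: -x[1]):
    print(f"{token}\t{score:.4f}")
```

Getting the same kind of list out of this repo would instead mean capturing the decoder attention weights over the source tokens, which I assume is what the visualization tool already reads.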
