"Translating" novels with sentence embeddings #22

Open
1 of 7 tasks
superMDguy opened this issue Oct 29, 2018 · 9 comments

Comments

superMDguy commented Oct 29, 2018

I'm not sure exactly what I'll have time to do, but I have a lot of random ideas. I definitely won't get through all of them, but I'm committing to finishing at least one.

  • Something that uses word or sentence vectors to make new versions of novels
  • Generated poetry (prose to poem would be fun). See Prose to poem #55
  • A "quote extractor" that finds the most quotable parts of famous literature
  • Some sort of anthology-type thing, in the vein of Hard West Turn from last year
  • Something with folk tales, which are interesting because they're sets of variations on a story
  • Something with different translations of the Bible. I was thinking that neural translation could be used to train an "old English" to "modern English" translator, which could then be used on old novels
  • Something with the Marvel API

I also have some vague ideas about using neural networks to extract hierarchical, formulaic story structures, but that's probably overambitious. I've taken a few runs at this problem over the last couple of years, but never got very far.

cantino commented Oct 31, 2018

> Something that uses word or sentence vectors to make new versions of novels

That's interesting, especially if you go with sentence vectors.

@superMDguy
Author

I did some work on "translating" from one book to another with sentence vectors. I'll put up the code and a full 50,000-word novel soon. It didn't turn out as well as I'd hoped, but it's still pretty cool. Here's a sample: Lincoln's Gettysburg Address, composed entirely of sentences from a corpus of works by Winston Churchill:

One November day nearly two years after my admission as junior member of the firm of Watling, Fowndes and Ripon seven gentlemen met at luncheon in the Boyne Club; Mr. Barbour, President of the Railroad, Mr. Scherer, of the Boyne Iron Works and other corporations, Mr. Leonard Dickinson, of the Corn National Bank, Mr. Halsey, a prominent banker from the other great city of the state, Mr. Grunewald, Chairman of the Republican State Committee, and Mr. Frederick Grierson, who had become a very important man in our community.

IV

I come home impressed with the fact that Britain has learned more from this war than any other nation, and will probably gain more by that knowledge. How excellently they would have agreed on the general question of the war!

Yes, the dream of that youth had been to benefit in some way that community in which circumstances had decreed that he should live, and in this connection it might not be out of place to mention a bill then before the Legislature of the state, now in session. To think that you should be reduced to that, and I not know it!"

"Well," said Fowndes, "there's an element of risk in such a proceeding I need not dwell upon." But most men of his type have seen them in despair; and since he was not related to this particular despair, what finer feelings he had were the more easily aroused. "It was mean, not to tell you, but I'd never had anything like this -- what you were giving me -- and I wanted all I could get."

Some faith indeed had given him strength to renounce those things in life I had held dear, driven him on to fight until his exhausted body failed him, and even now that he was physically helpless sustained him. If we should obtain a majority at the next election -- and I have good hopes that if we act with wisdom and with union, and, above all, with courage, we shall undoubtedly obtain an effective majority -- the prize we shall claim will be a final change in the relations of the two Houses of Parliament, of such a character as to enable the House of Commons to make its will supreme within the lifetime of a single Parliament; and except upon that basis, or for the express purpose of effecting that change, we will not accept any responsibility for the conduct of affairs.

@superMDguy
Author

I added a repo here. I'll put most of my updates there, but I'll add samples and full output here as I complete ideas.

@superMDguy
Author

I got a full novel. It's 109,483 words according to wc. I should probably break up my ideas into separate issues as I start implementing them.

superMDguy changed the title (Nov 2, 2018): A lot of ideas, not much time → "Translating" novels with sentence embeddings
@superMDguy
Author

Here's a sample (from Northanger Abbey, with sentences from Sir Arthur Conan Doyle). In each pair, the original sentence comes first and its replacement follows:

CHAPTER 1 No one who had ever seen Catherine Morland in her infancy would have supposed her born to be an heroine.
Everything which the girl said seemed to be meant as an insult to me, and yet I could not imagine how I had ever offended her.

Her situation in life, the character of her father and mother, her own person and disposition, were all equally against her.
Her whole life was a round of devotion and of love, which was divided between her husband and her only son, Harold.

Her father was a clergyman, without being neglected, or poor, and a very respectable man, though his name was Richard-- and he had never been handsome.
`` You must know, Sir Charles, that though my son knew nothing of his parents, we were both alive, and had never lost sight of him.

He had a considerable independence besides two good livings-- and he was not in the least addicted to locking up his daughters.
Now that they had not only ceased to protect him, but had themselves become a source of trouble to him, he began to understand how great the blessing was which he had enjoyed, and to sigh for the happy days before his girls had come under the influence of his neighbor.

Her mother was a woman of useful plain sense, with a good temper, and, what is more remarkable, with a good constitution.
In her pure and earnest mind her mother's memory was enshrined as that of a saint, and the thought that any one should take her place seemed a terrible desecration.

She had three sons before Catherine was born; and instead of dying in bringing the latter into the world, as anybody might expect, she still lived on-- lived to have six children more-- to see them growing up around her, and to enjoy excellent health herself.
He was married to the second daughter of Sir James Ovington; and as I have seen three of his grandchildren within the week, I fancy that if any of Sir Lothian's descendants have their eye upon the property, they are likely to be as disappointed as their ancestor was before them.

enkiv2 commented Nov 2, 2018 via email

@superMDguy
Author

I used sentence embeddings from Facebook's InferSent model. For each sentence in the "to translate" text, it finds the semantically closest sentence in the "source" text, measured by cosine distance between sentence vectors. Because every output sentence is lifted verbatim from the source text, the result is guaranteed to be coherent at the sentence level. You can see the code here (full repo is here, but for some reason GitHub isn't able to render the notebook).
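For anyone who wants the gist without opening the notebook, here is a minimal sketch of the nearest-sentence lookup described above. It is not the code from the repo: `embed` is a hypothetical bag-of-words stand-in for the InferSent encoder, and the names `cosine_similarity` and `translate` are made up for illustration. Picking the highest cosine similarity is the same as picking the lowest cosine distance.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector.

    Assumption: the real project used InferSent vectors. This is NOT that
    model, just something small enough to run without any downloads.
    """
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is treated as similar to nothing
    return dot / (norm_a * norm_b)


def translate(target_sentences, source_sentences):
    """Replace each target sentence with its closest source sentence."""
    source_vectors = [embed(s) for s in source_sentences]
    output = []
    for sentence in target_sentences:
        vec = embed(sentence)
        # Highest similarity == lowest cosine distance.
        best = max(
            range(len(source_sentences)),
            key=lambda i: cosine_similarity(vec, source_vectors[i]),
        )
        output.append(source_sentences[best])
    return output
```

Calling `translate(novel_sentences, corpus_sentences)` returns one corpus sentence per novel sentence, in the novel's order. Swapping `embed` for a real sentence encoder changes the quality of the matches but not the structure of the loop; for a large corpus a vectorised matrix product would likely be much faster than this per-sentence scan.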

@superMDguy
Author

I've been messing around with this sentence "translation" idea some more, and I found something pretty cool. I translated the Sermon on the Mount (Matthew chapters 5-7, from the New Testament) using sentences from the Quran. The results were pretty interesting. Hopefully I'm not offending anyone by doing this.

Sample

In each pair, the Bible verse comes first, with the matched Quran sentence below it.

"Blessed are the poor in spirit, for theirs is the kingdom of heaven.
But verily thy Lord is full of grace to mankind: Yet most of them are ungrateful.

Blessed are those who mourn, for they will be comforted.
Yea, those who believe,- their faith is increased and they do rejoice.

Blessed are the meek, for they will inherit the earth.
It is We Who will inherit the earth, and all beings thereon: to Us will they all be returned.

Blessed are those who hunger and thirst for righteousness, for they will be filled.
That is best for those who seek the Countenance, of Allah, and it is they who will prosper.

Blessed are the merciful, for they will be shown mercy.
If only ye ask Allah for forgiveness, ye may hope to receive mercy.

Blessed are the pure in heart, for they will see God.
Him We chose and rendered pure in this world: And he will be in the Hereafter in the ranks of the Righteous.

Blessed are the peacemakers, for they will be called children of God.
so order me that I may be grateful for Thy favours, which thou hast bestowed on me and on my parents, and that I may work the righteousness that will please Thee: And admit me, by Thy Grace, to the ranks of Thy righteous Servants.''

Blessed are those who are persecuted because of righteousness, for theirs is the kingdom of heaven.
Yea, to Allah belongs all that is in the heavens and on earth: so that He rewards those who do evil, according to their deeds, and He rewards those who do good, with what is best.

"Blessed are you when people insult you, persecute you and falsely say all kinds of evil against you because of me.
If ye (but) eschew the most heinous of the things which ye are forbidden to do, We shall expel out of you all the evil in you, and admit you to a gate of great honour.

Rejoice and be glad, because great is your reward in heaven, for in the same way they persecuted the prophets who were before you.
Those are the ones who will be rewarded with the highest place in heaven, because of their patient constancy: therein shall they be met with salutations and peace, Dwelling therein;- how beautiful an abode and place of rest!

kranzky commented Nov 7, 2018

Careful now... when I first put MegaHAL on the web back in 1997 someone spent all night chatting to it in a futile attempt to have it not say something blasphemous ;)


5 participants