charlesmrice/quotations

The Mechanical Copybook

Quotations from persons of repute or infamy are often deployed to lend weight to ideas and rhetoric. It's a perfectly respectable technique - albeit a logical fallacy, the appeal to authority - and one attested since the earliest days of persuasion. In a time when mass communication was all but unheard of and personal communication was slow, there was time to validate a quotation from an authority and to dispute the argument it supported.

But on the internet, as a New Yorker cartoon once put it, nobody knows you're a dog. And while that may no longer be true for humans, it is still true of quotations. They can be deployed willy-nilly, with scarcely a thought given to accuracy or even validity. Yet by their mere presence and their unverified attribution, they lend unearned weight and substance to ideas and arguments.

There is a dedicated community of people out there who hunt down the sources of some of these spurious quotations and publish their findings. But it is a solitary and thankless task, apart from the joy of the hunt, and the creation of this false wisdom continues at industrial speed.

This project began life as a capstone for the Data Science Immersive course at General Assembly in Washington, DC. I set out to answer whether a computer can be trained, from the patterns latent in a published writer's works, to recognize a quotation attributed to that writer as valid or invalid.

My initial version was, frankly, what you'd expect from a novice data scientist. In this repository, I will attempt to fix the errors of that first version and produce an effective validator.