This repository contains the software used in the paper "The Natural Selection of Words: Finding the Features of Fitness" (see natural-selection-of-words.pdf).

pdturney/natural-selection-of-words

natural-selection-of-words

This repository contains the software used in the paper The Natural Selection of Words: Finding the Features of Fitness.

README - Peter Turney, January 10, 2019

See the paper The Natural Selection of Words: Finding the Features of Fitness for a detailed description of the steps in the algorithm.

Here is a brief description of the subdirectories in this project:

  • /catvar/ -- Categorial Variations 2.1
  • /google-ngrams/ -- Google Ngram files for unigrams (single words)
  • /perl-code/ -- the Perl code that runs the experiments
  • /perl-output/ -- the output text files of the Perl code
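
Before starting the installation, it can help to confirm that the four subdirectories listed above are present in your copy of the project. The following is a minimal sketch in Python, not part of the original Perl code; the function name missing_subdirectories and the idea of running it from the top of the repository are assumptions made for illustration.

```python
from pathlib import Path

# Subdirectory names taken from the list above.
EXPECTED_SUBDIRECTORIES = ["catvar", "google-ngrams", "perl-code", "perl-output"]


def missing_subdirectories(repo_root):
    """Return the expected subdirectories that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_SUBDIRECTORIES if not (root / name).is_dir()]


if __name__ == "__main__":
    # Hypothetical usage: run from the top of the cloned repository.
    missing = missing_subdirectories(".")
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only shows that the folders exist; it does not check that the CatVar or Google Ngram files have been downloaded into them.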

Here are the steps required to install the files and tools. The code was written for Windows 10, but it should also work on Linux and macOS with some minor modifications.

(1) Install Strawberry Perl for Windows 10

(2) Install WordNet 3.0

  • https://wordnet.princeton.edu/download
  • we used WordNet version 3.0 for compatibility with WordNet::QueryData
  • install WordNet 3.0 in the directory "C:\Program Files\WordNet\3.0"
  • any other location will make it difficult to use WordNet::QueryData
  • WordNet 3.1 has not been fully tested with WordNet::QueryData

(3) Install WordNet::QueryData 1.49

  • https://metacpan.org/pod/WordNet::QueryData
  • run the command "cpan WordNet::QueryData" to install WordNet::QueryData
  • "cpan" is included in Strawberry Perl
  • "cpan" will automatically download the required files from the web
  • we used WordNet::QueryData version 1.49
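
Steps (1) and (3) rely on the "perl" and "cpan" commands being available from the command line. As a rough check, the Python sketch below looks both up on the search path. It is an illustration only: the function name locate_tools is hypothetical, and it assumes the Strawberry Perl installer has added its programs to PATH.

```python
import shutil


def locate_tools(names=("perl", "cpan")):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools().items():
        print(f"{name}: {path if path else 'not found on PATH'}")
```

If either command is reported as not found, opening a new command prompt after installing Strawberry Perl is a reasonable first thing to try, since PATH changes usually apply only to new sessions.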

(4) Install Lingua::EN::Syllable 0.30

(5) Install Weka 3.8.2

(6) Install Categorial Variations 2.1

  • https://clipdemos.umiacs.umd.edu/catvar/
  • read the README in the directory /catvar/
  • download the required CatVar files and put them in /catvar/
  • go to the directory /perl-code/ and run the Perl script "wordnet-catvar.pl"
  • in Windows, use the command "perl wordnet-catvar.pl"

(7) Install Google Ngram files

(8) Run the Perl scripts

  • go to the directory /perl-code/
  • the command "perl doit.pl" will run all seven steps of the algorithm
  • "doit.pl" does not take any command line arguments
  • the behaviour of "doit.pl" is controlled by the file "configuration.txt"
  • you can modify the behaviour of "doit.pl" by commenting out code or changing the contents of "configuration.txt"
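
If you prefer to start the run from another program rather than typing the command by hand, the steps above can be sketched in Python as follows. This is a hedged illustration, not part of the project: the function names perl_command and run_script are hypothetical, and the choice to run with /perl-code/ as the working directory simply mirrors the instruction to go to that directory first (it is an assumption that "doit.pl" expects to find "configuration.txt" there).

```python
import subprocess
from pathlib import Path


def perl_command(repo_root, script="doit.pl"):
    """Return (argv, working_directory) for running a script in /perl-code/."""
    # "doit.pl" takes no command line arguments, so argv is just perl + script.
    return ["perl", script], Path(repo_root) / "perl-code"


def run_script(repo_root, script="doit.pl"):
    """Run the script from inside /perl-code/ and raise if Perl reports an error."""
    argv, cwd = perl_command(repo_root, script)
    return subprocess.run(argv, cwd=cwd, check=True)
```

Calling run_script(".") from the top of the repository would then be equivalent to changing to /perl-code/ and entering "perl doit.pl". The same helper could run "wordnet-catvar.pl" from step (6) by passing that name as the script argument.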
