The Road Not Taken

This project builds a language model trained on a dataset of poems written by Robert Frost.

The repository has multiple directories, with each serving a different purpose:

input/: contains the following:
- raw dataset
- input sequences in tokenized format
- output sequences in tokenized format
src/: contains the source code for the project.
- config.py: defines variables that are used throughout the code.
- feature_engg.py: preprocesses the data.
- test_functionalities: sanity checks on the data, written with the pytest module.
- model.py: contains the code implementing the model, including the training and inference stages.
- main.py: brings all the code together and calls specific functions based on command-line arguments.
plots/: contains the plots used for observing model loss and accuracy.
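
As an illustration of the kind of sanity check the pytest file might hold, here is a minimal sketch. The test names and the sample sequences are hypothetical stand-ins, not taken from the repository; real checks would load the tokenized files stored under input/.

```python
# Hypothetical stand-ins for the tokenized sequences stored under input/.
# In this sketch, 1 marks the start of a line and 2 marks its end (assumed ids).
INPUT_SEQS = [[1, 3, 4, 5], [1, 6, 7]]
OUTPUT_SEQS = [[3, 4, 5, 2], [6, 7, 2]]


def test_same_number_of_sequences():
    # Every input sequence needs a matching output sequence.
    assert len(INPUT_SEQS) == len(OUTPUT_SEQS)


def test_pairs_have_equal_length():
    # Each input/output pair should cover the same number of tokens.
    for src, tgt in zip(INPUT_SEQS, OUTPUT_SEQS):
        assert len(src) == len(tgt)


def test_no_empty_sequences():
    # An empty list is falsy, so all() fails if any sequence is empty.
    assert all(INPUT_SEQS) and all(OUTPUT_SEQS)
```

pytest collects functions whose names start with test_, so saving this in a file named test_*.py and running pytest would execute all three checks.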

To preprocess the data and generate embeddings, use the following command:

python main.py --preprocess data
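
The preprocessing step turns raw poem lines into tokenized input and output sequences. The sketch below shows one common way to do this and is not necessarily what feature_engg.py does: the `<sos>`/`<eos>` markers, the function names, and the whitespace tokenizer are all assumptions made for illustration, and embedding generation is left out.

```python
from typing import Dict, List, Tuple

SOS, EOS = "<sos>", "<eos>"  # hypothetical start/end-of-line markers


def build_vocab(lines: List[str]) -> Dict[str, int]:
    """Map each token to an integer id; id 0 is left free for padding."""
    vocab = {SOS: 1, EOS: 2}
    for line in lines:
        for word in line.lower().split():
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def make_sequences(
    lines: List[str], vocab: Dict[str, int]
) -> Tuple[List[List[int]], List[List[int]]]:
    """Build (input, output) pairs where the output is the input shifted by one token."""
    inputs, targets = [], []
    for line in lines:
        ids = [vocab[w] for w in line.lower().split()]
        if not ids:
            continue  # skip blank lines between stanzas
        inputs.append([vocab[SOS]] + ids)
        targets.append(ids + [vocab[EOS]])
    return inputs, targets


poem = [
    "Two roads diverged in a yellow wood",
    "And sorry I could not travel both",
]
vocab = build_vocab(poem)
inputs, targets = make_sequences(poem, vocab)
```

Shifting the output by one token means that at each position the model is asked to predict the next word of the line.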

To train the model, use the following command:

python main.py --train seq2seq

To generate text five lines at a time, use the following command:

python main.py --generate text
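
The commands above suggest that main.py dispatches on its command-line flags. A minimal sketch of such a dispatcher using argparse follows; only the flag names and their values come from the commands in this README, while the function names and return values are hypothetical placeholders for the real preprocessing, training, and generation calls.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the three flags used in this README."""
    parser = argparse.ArgumentParser(description="Robert Frost language model")
    parser.add_argument("--preprocess", type=str, help="use 'data' to preprocess the dataset")
    parser.add_argument("--train", type=str, help="use 'seq2seq' to train the model")
    parser.add_argument("--generate", type=str, help="use 'text' to generate lines")
    return parser


def dispatch(args: argparse.Namespace) -> str:
    """Return the name of the stage to run; a real main.py would call it instead."""
    if args.preprocess == "data":
        return "preprocess"
    if args.train == "seq2seq":
        return "train"
    if args.generate == "text":
        return "generate"
    return "none"


# Parse an explicit argument list, equivalent to: python main.py --train seq2seq
args = build_parser().parse_args(["--train", "seq2seq"])
print(dispatch(args))
```

Flags that are not passed default to None, so only the requested stage matches.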

To install the required packages, use:

pip install -r requirements.txt

debajyoti94/The_road_not_taken
