ClassifyingNamesWithRNN

In this project, I trained a classifier that assigns names (more precisely, surnames) to their likely origin country. The classifier is built using a simple character-based Recurrent Neural Network (RNN) model.

Dataset Used

The dataset used for training the classifier can be downloaded from this link. The dataset contains over 20,000 names labelled across 18 different languages. The data is stored as a separate text file for each language, with the names belonging to that language listed one per line.
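
The layout described above (one text file per language, one name per line) could be read with a small loader like the sketch below. The `.txt` extension, the UTF-8 encoding, the use of the file name as the language label, and the function names `load_names` and `demo` are assumptions made for illustration, not details taken from the project's code.

```python
import os
import tempfile


def load_names(data_dir):
    """Map each language (file name without extension) to its list of names."""
    names_by_language = {}
    for file_name in sorted(os.listdir(data_dir)):
        if not file_name.endswith(".txt"):  # assumed extension
            continue
        language = os.path.splitext(file_name)[0]
        path = os.path.join(data_dir, file_name)
        with open(path, encoding="utf-8") as handle:  # assumed encoding
            # One name per line; blank lines are skipped.
            names = [line.strip() for line in handle if line.strip()]
        names_by_language[language] = names
    return names_by_language


def demo():
    """Build a tiny stand-in dataset so the sketch runs without the real files."""
    samples = {"English": ["Smith", "Taylor"], "Italian": ["Rossi"]}
    with tempfile.TemporaryDirectory() as data_dir:
        for language, names in samples.items():
            path = os.path.join(data_dir, language + ".txt")
            with open(path, "w", encoding="utf-8") as handle:
                handle.write("\n".join(names) + "\n")
        return load_names(data_dir)
```

Calling `load_names` on the real data directory would give a dictionary with one key per language, which can then be flattened into (name, language) pairs for training.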

Model Used

Each character of the name is fed sequentially to the RNN unit along with the previous hidden state. The RNN unit outputs the predicted category along with the new hidden state. The main purpose of the hidden state is to carry information about the part of the sequence that comes before the current character.
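
To make the character-by-character flow concrete, here is a minimal pure-Python sketch of a recurrent step. It is not the model from the tutorial: the class name `TinyRNN`, the lowercase Latin alphabet, the tanh update without bias terms, and the random untrained weights are all assumptions chosen to keep the example short and dependency-free.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed character set


def one_hot(char):
    """Encode a character as a one-hot vector (all zeros if unknown)."""
    vec = [0.0] * len(ALPHABET)
    index = ALPHABET.find(char.lower())
    if index >= 0:
        vec[index] = 1.0
    return vec


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def log_softmax(scores):
    """Numerically stable log of the softmax of the scores."""
    peak = max(scores)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_total for s in scores]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyRNN:
    """Hypothetical untrained RNN used only to show the data flow."""

    def __init__(self, hidden_size, n_categories, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.w_xh = random_matrix(hidden_size, len(ALPHABET), rng)
        self.w_hh = random_matrix(hidden_size, hidden_size, rng)
        self.w_hy = random_matrix(n_categories, hidden_size, rng)

    def step(self, char, hidden):
        """Consume one character and return (log-probabilities, new hidden state)."""
        from_input = matvec(self.w_xh, one_hot(char))
        from_hidden = matvec(self.w_hh, hidden)
        new_hidden = [math.tanh(a + b) for a, b in zip(from_input, from_hidden)]
        log_probs = log_softmax(matvec(self.w_hy, new_hidden))
        return log_probs, new_hidden

    def classify(self, name):
        """Feed the name one character at a time; keep the final output."""
        hidden = [0.0] * self.hidden_size  # initial hidden state is all zeros
        log_probs = None  # stays None for an empty name
        for char in name:
            log_probs, hidden = self.step(char, hidden)
        return log_probs
```

The output after the last character is the one used as the prediction, since by then the hidden state has seen the whole name.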

The model used in training the classifier is taken from this tutorial.

Code

The actual code is well documented and explained here.

Implementation Details

Negative Log Likelihood loss is used to compute the loss between the predictions and the actual labels. The total dataset of about 20,000 names was divided into training (80%) and testing (20%) sets. The model was trained for 5 epochs, and the weights were updated using the gradient descent algorithm.
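
As a rough illustration of the two ingredients named above, the sketch below computes the negative log likelihood for a single example and applies one plain gradient descent update. The function names, the example probabilities, the gradients, and the learning rate of 0.1 are made up for the illustration; the project's actual values are not stated here.

```python
import math


def nll_loss(log_probs, target_index):
    """Negative log likelihood of the true class, given log-probabilities."""
    return -log_probs[target_index]


def sgd_update(weights, gradients, learning_rate):
    """Plain gradient descent: move each weight against its gradient."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]


# Hypothetical model output over three categories.
log_probs = [math.log(0.7), math.log(0.2), math.log(0.1)]

loss_correct = nll_loss(log_probs, 0)  # true class received 70%: small loss
loss_wrong = nll_loss(log_probs, 2)    # true class received 10%: large loss

# One update step with made-up gradients and an assumed learning rate.
updated = sgd_update([0.5, -0.3], [0.2, -0.1], learning_rate=0.1)
```

The loss is small when the model puts high probability on the true category and grows quickly as that probability approaches zero, which is why a few badly misclassified names can produce very large individual losses.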

Running the whole script took 1 min 55 sec on a 5th-generation i5 CPU.

Results and Observations

The accuracy on the dataset was approximately 73%, which is acceptable given the size of the dataset.
Adding extra epochs does not improve the result much, as can be seen from the graph of losses over all five epochs. The losses decrease steadily across successive iterations only in the first epoch. In the remaining epochs the losses fluctuate from iteration to iteration, although the loss still decreases overall within each epoch.
Although an 80:20 training-to-test ratio is commonly recommended, changing the ratio to 90:10 improved the results very slightly (by close to 1%), but it leaves only about 2,000 examples for testing. With a ratio of 70:30, the results degraded by more than one percent. I have therefore used the 80:20 ratio. The results for 70:30 and 90:10 are available in different commits.
Even in the 4th or 5th epoch, some examples have a very high loss. This shows that although the average loss decreases per epoch/iteration, there may still be examples for which the model's prediction is very far from the true label.
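
The split sizes behind the ratio comparison can be reproduced with a short sketch. The function name `split_dataset`, the fixed shuffle seed, and the placeholder list of exactly 20,000 items are assumptions; the real dataset has slightly more than 20,000 names and the project's own splitting code may differ.

```python
import random


def split_dataset(examples, train_fraction, seed=0):
    """Shuffle a copy of the examples and cut it into train and test parts."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Placeholder standing in for roughly 20,000 (name, language) pairs.
examples = list(range(20000))

split_sizes = {}
for train_fraction in (0.7, 0.8, 0.9):
    train, test = split_dataset(examples, train_fraction)
    split_sizes[train_fraction] = (len(train), len(test))
```

With 20,000 examples, the 90:10 split keeps only 2,000 for testing, so a difference of about 1% in accuracy corresponds to roughly 20 names.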

License

See LICENSE

ac-alpha/ClassifyingNamesWithRNN
