The data undergoes preprocessing to enhance its quality:
- Removal of special characters.
- Elimination of single characters and redundant spaces.
- Conversion of all words to lowercase.
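The cleaning steps above can be sketched with simple regular expressions (a minimal illustration, not the project's exact code):

```python
import re

def preprocess(text: str) -> str:
    """Apply the cleaning steps listed above (a minimal sketch)."""
    text = re.sub(r"[^a-zA-Z\s]", " ", text)   # remove special characters
    text = re.sub(r"\b[a-zA-Z]\b", " ", text)  # drop single characters
    text = re.sub(r"\s+", " ", text).strip()   # collapse redundant spaces
    return text.lower()                        # lowercase all words

print(preprocess("I LOVED it!! A 10/10 flight :)"))  # -> "loved it flight"
```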
The project utilizes an LSTM neural network with pre-trained GloVe word embeddings (glove.6B.50d), trained and evaluated on the large Twitter US Airline Sentiment dataset.
- Integration of Dropout layers to prevent overfitting.
- Reduction of the learning rate from 0.1 to 0.001 for fine-tuning.
- A batch size of 200 for efficient training.
- Incorporation of validation data for performance assessment.
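Put together, the architecture described by these points might look like the following Keras sketch (the layer sizes and the three-class softmax output are assumptions, not taken from the list above):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dropout, Dense
from tensorflow.keras.optimizers import Adam

def build_model(vocab_size: int) -> Sequential:
    model = Sequential([
        # 50-d embedding layer; in the project it is initialized
        # with the pre-trained glove.6B.50d vectors
        Embedding(vocab_size, 50),
        LSTM(128),                        # recurrent layer (size assumed)
        Dropout(0.5),                     # Dropout to prevent overfitting
        Dense(3, activation="softmax"),   # assumed 3 sentiment classes
    ])
    # learning rate adjusted down from 0.1 to 0.001
    model.compile(optimizer=Adam(learning_rate=0.001),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Training then uses batch_size=200 with held-out validation data:
# model.fit(X_train, y_train, epochs=10, batch_size=200,
#           validation_data=(X_val, y_val))
```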
These measures collectively optimize the model's performance and facilitate comprehensive result analysis.