Extracts data from semi-structured text files and preprocesses the text into numerical representations.

ricardoariasalazar/Text-Preprocessing

Text-Preprocessing

This repository contains a dataset of 80+ days of COVID-19-related tweets (from late March to mid-July 2020). The Excel file has 80+ sheets, each containing 2000 tweets. The task of this project is to preprocess the tweets and convert them into numerical representations suitable as input for recommender systems and information retrieval algorithms.
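
As a rough illustration of what "numerical representations" can mean here, the sketch below turns a few tweets into TF-IDF vectors using only the Python standard library. It is not taken from this repository: the choice of TF-IDF, the function names `tokenize` and `tfidf_vectors`, the tokenization rules, and the sample tweets are all assumptions made for the example, and loading the sheets from the Excel file is left out.

```python
import math
import re
from collections import Counter

# Hypothetical cleaning rules: strip URLs, keep lowercase alphanumeric tokens.
URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(tweet):
    """Lowercase a tweet, drop URLs, and split it into word tokens."""
    text = URL_RE.sub(" ", tweet.lower())
    return TOKEN_RE.findall(text)


def tfidf_vectors(tweets):
    """Return (vocabulary, vectors), one TF-IDF vector per tweet."""
    docs = [Counter(tokenize(t)) for t in tweets]
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of tweets in which each term occurs.
    df = Counter(term for doc in docs for term in doc)
    idf = {term: math.log(n_docs / df[term]) for term in vocab}
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        # Term frequency is the term's share of the tweet's tokens.
        vectors.append(
            [(doc[term] / total) * idf[term] if total else 0.0 for term in vocab]
        )
    return vocab, vectors


# Made-up sample tweets, standing in for one sheet of the dataset.
sample_tweets = [
    "Stay home and stay safe https://example.com",
    "COVID-19 cases rise",
    "Stay safe",
]
vocab, vectors = tfidf_vectors(sample_tweets)
```

Each row of `vectors` lines up with `vocab`, so the rows can be compared with cosine similarity or passed to a retrieval or recommendation model. In practice a library vectorizer would likely replace this hand-rolled version; the plain-Python form is used only to keep the steps visible.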
