Extracting article text from a given URL (web scraping with goose3) and performing text analysis

Eakta08/Text-Analysis

Data Extraction and NLP

Objective: Extract the article text from the given URL and perform text analysis to compute the following variables:

  1. POSITIVE SCORE
  2. NEGATIVE SCORE
  3. POLARITY SCORE
  4. SUBJECTIVITY SCORE
  5. AVG SENTENCE LENGTH
  6. PERCENTAGE OF COMPLEX WORDS
  7. FOG INDEX
  8. AVG NUMBER OF WORDS PER SENTENCE
  9. COMPLEX WORD COUNT
  10. WORD COUNT
  11. SENTENCE COUNT
  12. SYLLABLE PER WORD
  13. PERSONAL PRONOUNS
  14. AVG WORD LENGTH
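
The variables above can be sketched in a few lines of Python. This is a minimal illustration and not the notebook's actual code: the formulas are assumed from common definitions (Gunning fog index, dictionary-based sentiment counts, "complex" meaning more than two syllables), and the names `text_metrics`, `count_syllables`, and `PERSONAL_PRONOUNS` are hypothetical. 'Text Analysis.docx' remains the authoritative definition. Under this reading, AVG SENTENCE LENGTH and AVG NUMBER OF WORDS PER SENTENCE are the same ratio.

```python
import re

# Assumption: the pronoun list is illustrative; the repository may count others.
PERSONAL_PRONOUNS = {"i", "we", "my", "ours", "us"}


def count_syllables(word):
    """Rough estimate: count runs of vowels, ignoring a trailing 'es'/'ed'."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    if lowered.endswith(("es", "ed")) and count > 1:
        count -= 1  # assumed heuristic: treat the ending as silent
    return max(count, 1)


def text_metrics(text, positive_words, negative_words):
    """Compute the 14 variables for one article (assumed definitions)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    n_words = max(len(words), 1)          # guard against empty input
    n_sentences = max(len(sentences), 1)

    syllables = [count_syllables(w) for w in words]
    complex_count = sum(1 for s in syllables if s > 2)

    lowered = [w.lower() for w in words]
    pos = sum(1 for w in lowered if w in positive_words)
    neg = sum(1 for w in lowered if w in negative_words)

    avg_sentence_length = len(words) / n_sentences
    pct_complex = 100 * complex_count / n_words

    return {
        "positive_score": pos,
        "negative_score": neg,
        # The small constant avoids division by zero.
        "polarity_score": (pos - neg) / (pos + neg + 1e-6),
        "subjectivity_score": (pos + neg) / (len(words) + 1e-6),
        "avg_sentence_length": avg_sentence_length,
        "percentage_complex_words": pct_complex,
        "fog_index": 0.4 * (avg_sentence_length + pct_complex),
        "avg_words_per_sentence": avg_sentence_length,
        "complex_word_count": complex_count,
        "word_count": len(words),
        "sentence_count": len(sentences),
        "syllables_per_word": sum(syllables) / n_words,
        # Upper-case "US" is assumed to be the country, not the pronoun.
        "personal_pronouns": sum(
            1 for w in words if w.lower() in PERSONAL_PRONOUNS and w != "US"
        ),
        "avg_word_length": sum(len(w) for w in words) / n_words,
    }


if __name__ == "__main__":
    sample = "The results are good. The analysis was complicated and bad."
    # Tiny stand-in word lists; a real run would load full sentiment dictionaries.
    for name, value in text_metrics(sample, {"good"}, {"bad"}).items():
        print(f"{name}: {value}")
```

The vowel-run syllable counter is deliberately crude; a pronunciation dictionary would be more accurate if precision matters.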

Google Drive Link: https://drive.google.com/drive/folders/1Wyg9Z5iMgv6cRX17kYjA44QdCElNQv3j?usp=sharing

See 'Objective.docx' for the aim and the tasks to be performed, and 'Text Analysis.docx' for details of the text analysis to be done. The work is in 'nlp_text_analysis.ipynb' and the output is in 'Output.xlsx'.
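
For orientation, the goose3 extraction step named in the description might look like the sketch below. It is an assumption about how the notebook uses the library, not a copy of it; `extract_article` is a hypothetical helper and the URL is a placeholder.

```python
def extract_article(url):
    """Return (title, cleaned_text) for the article at `url` using goose3."""
    # Imported lazily so this module loads even where goose3 is not installed.
    from goose3 import Goose  # third-party: pip install goose3

    g = Goose()
    try:
        article = g.extract(url=url)
        # cleaned_text is the article body with markup and page chrome removed.
        return article.title, article.cleaned_text
    finally:
        g.close()  # release the resources held by the extractor


if __name__ == "__main__":
    # Placeholder URL; replace with a real article link.
    title, text = extract_article("https://example.com/some-article")
    print(title)
    print(text[:200])
```

The returned text can then be passed to whatever analysis step computes the variables listed above.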
