
Summarizing YouTube Video Transcripts & Blog Posts with HuggingFace Transformers.


kunal-bhadra/Summarize-Text-Video-Transformers


Summarizing YouTube Transcripts & Blog Posts

This project summarizes YouTube video transcripts and blog posts with state-of-the-art models. Video transcripts are summarized with the facebook/bart-large-cnn model from HuggingFace Transformers. Blog posts are fetched, scraped, and summarized with the sshleifer/distilbart-cnn model. Lastly, the latest Pegasus model from Google produces an abstractive summary of a given text corpus: it generates new sentences of its own rather than copying sentences from the corpus. All three approaches rely on pretrained transformer models for text summarization.

🚀 The Result

The BART models, whose summaries stay close to the source wording, reduced the text to 20-30% of its original size while still retaining its meaning. The more abstractive Pegasus model often returned only a couple of sentences, even when given a very large corpus.
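The 20-30% figure can be checked for any input and summary pair with a small helper. This is a minimal sketch, not code from the repository: the name `compression_ratio` is hypothetical, and it assumes size is measured in whitespace-separated words, which the README does not specify.

```python
def compression_ratio(original: str, summary: str) -> float:
    """Return the summary length as a fraction of the original length.

    Assumption: length is counted in whitespace-separated words.
    """
    original_words = len(original.split())
    if original_words == 0:
        raise ValueError("original text is empty")
    return len(summary.split()) / original_words
```

A value between 0.2 and 0.3 corresponds to the reduction reported for the BART models.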

✏ Tech Stack for Project Development

  • Python
  • 🤗 Transformers
  • YouTube Transcript API
  • BeautifulSoup
  • TensorFlow
  • PyTorch

🧠 Approach taken

  1. YouTube Video Transcription & Summarization with the facebook/bart-large-cnn model.
  2. Blog Post Summarization with the default sshleifer/distilbart-cnn model from the HuggingFace library.
  3. Abstractive Summary of a text corpus with the latest Pegasus model from Google.
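The three steps above might look roughly like the sketch below. It is an illustration under stated assumptions, not the repository's actual code. All function names are hypothetical. The 400-word chunk size and the `max_length`/`min_length` values are arbitrary choices. `requests` is assumed for the download, since it is not in the listed tech stack. `google/pegasus-xsum` is an assumed checkpoint, because the README does not say which Pegasus model was used. The transcript call assumes an older youtube-transcript-api release that still provides `get_transcript`.

```python
def chunk_words(text, max_words=400):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def fetch_transcript_text(video_id):
    """Step 1 input: join a video's caption segments into one string."""
    # Assumption: an older youtube-transcript-api release that still exposes
    # the static get_transcript call; newer releases changed this interface.
    from youtube_transcript_api import YouTubeTranscriptApi
    segments = YouTubeTranscriptApi.get_transcript(video_id)
    return " ".join(segment["text"] for segment in segments)


def fetch_blog_text(url):
    """Step 2 input: download a blog post and keep heading and paragraph text."""
    # Assumption: the post body lives in <h1> and <p> tags.
    import requests
    from bs4 import BeautifulSoup
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    pieces = [tag.get_text(" ", strip=True) for tag in soup.find_all(["h1", "p"])]
    return " ".join(piece for piece in pieces if piece)


def summarize_with_pipeline(text, model_name=None, max_words=400):
    """Steps 1 and 2: summarize each chunk, then join the partial summaries."""
    from transformers import pipeline
    # model_name=None lets the library choose its default summarization model.
    summarizer = pipeline("summarization", model=model_name)
    partial = []
    for chunk in chunk_words(text, max_words):
        result = summarizer(chunk, max_length=120, min_length=30,
                            do_sample=False, truncation=True)
        partial.append(result[0]["summary_text"])
    return " ".join(partial)


def summarize_with_pegasus(text, model_name="google/pegasus-xsum"):
    """Step 3: abstractive summary with a Pegasus checkpoint (assumed name)."""
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer
    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name)
    # truncation=True cuts the input to the model's maximum length.
    tokens = tokenizer(text, truncation=True, padding="longest", return_tensors="pt")
    summary_ids = model.generate(**tokens)
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

Step 1 would then be `summarize_with_pipeline(fetch_transcript_text(video_id), model_name="facebook/bart-large-cnn")`, and step 2 would be `summarize_with_pipeline(fetch_blog_text(url))` with the library default. Chunking is used because these models accept only a limited input length. The truncation in the Pegasus sketch is one plausible reason a very large corpus can still yield only a couple of sentences.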

🔗 Connect with me:

portfolio linkedin twitter
