arv-anshul/yt-watch-history

YouTube Watch History Analysis

This project analyzes a user's YouTube watch history data downloaded from Google Takeout. It provides insights into watch patterns, content preferences, and overall YouTube consumption.

Important

This was my first project exploring MLOps tooling such as FastAPI, Docker, and MLflow. As the project grew in complexity, I found it challenging to keep development organized.

Therefore, I've decided to archive this version and rebuild it from scratch with a renewed focus on organization and maintainability.

New project repo: @arv-anshul/yt-watch-history-v2

Getting Your YouTube History Data

  1. Go to the Google Takeout website.
  2. Sign in with your Google account.
  3. Select "YouTube History" under "Choose data to export".
  4. Choose the JSON file type and your delivery options.
  5. Click "Create export".
  6. Wait for the export process to complete and download the file.
Or refer to this blog at dev.to.
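
Once the export is downloaded, a quick sanity check is to load it and count videos per channel. The following is a minimal sketch, not the project's own code. It assumes the export is a JSON list of records and that each record may carry a "subtitles" list whose first item has a "name" key holding the channel name. The function names load_history and top_channels are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def load_history(path):
    """Read the Takeout export and return its list of watch records."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def top_channels(records, n=5):
    """Count watched videos per channel and return the n most common.

    Assumption: the channel name lives at record["subtitles"][0]["name"].
    Records without that field (for example removed videos) are skipped.
    """
    counts = Counter()
    for record in records:
        subtitles = record.get("subtitles") or []
        if subtitles and "name" in subtitles[0]:
            counts[subtitles[0]["name"]] += 1
    return counts.most_common(n)
```

For example, top_channels(load_history("watch-history.json")) would return a list of (channel, count) pairs, where "watch-history.json" stands in for the path of your own export.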

Benefits

  • Gain valuable insights into your YouTube viewing habits.
  • Discover your content preferences and identify areas of interest.
  • Track your progress towards achieving your YouTube goals.
  • Make informed decisions about your YouTube consumption.

Project's Notebooks

The 📓 notebooks containing my analysis of the datasets used in this project are available in my @arv-anshul/notebooks GitHub repository.

Tech Stack

  • Docker
  • FastAPI
  • MLflow
  • MongoDB
  • NLTK
  • Plotly
  • Polars
  • Pydantic
  • Ruff
  • scikit-learn
  • Streamlit
  • YouTube

Project Setup Guide

This guide helps you set up and run this project using Docker Compose. The project consists of a frontend and backend service.

Prerequisites

  • Docker with Docker Compose (the steps below use the docker-compose command).
  • Git, to clone the repository.

Steps to Set Up

  1. Clone the Repository:

    git clone https://github.com/arv-anshul/yt-watch-history
  2. Configuration:

    • Open the docker-compose.yml file in the project root.

    • Set the following environment variables in the frontend service:

      • YT_API_KEY: Replace null with your YouTube API key.
      • API_HOST: Should match the name of the backend service (backend in this case).
      • API_PORT: Port number for the backend service (default is 8001).
      • LOG_LEVEL: Logging level (default is INFO).
    • Set the following environment variables in the backend service:

      • MONGODB_URL: Replace null with your MongoDB URL.
      • API_PORT: Port number for the backend service (default is 8001).
      • API_HOST: Set to "0.0.0.0".
      • LOG_LEVEL: Logging level (default is INFO).
  3. Build and Run:

    docker-compose up --build
  4. Access the application:

    • Frontend: Open a browser and go to http://localhost:8501.
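    • Backend: The frontend reaches it internally through the configured API_HOST and API_PORT. From the host machine it is available at http://localhost:8001 (0.0.0.0 is the address the service binds to, not one to browse to).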
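
To illustrate how the variables from step 2 fit together, here is a minimal sketch of reading them in Python. It is not the project's actual configuration code. The helpers backend_base_url and missing_settings are hypothetical, and the defaults ("backend", 8001) are taken from the guide above.

```python
import os


def backend_base_url(env=None):
    """Build the backend base URL from API_HOST and API_PORT.

    Falls back to the defaults named in the setup guide when a
    variable is not set. How the project reads them is an assumption.
    """
    env = os.environ if env is None else env
    host = env.get("API_HOST", "backend")
    port = int(env.get("API_PORT", "8001"))
    return f"http://{host}:{port}"


def missing_settings(required, env=None):
    """Return the names in `required` that are unset, empty, or "null".

    Treating the literal string "null" as missing is a defensive
    choice for placeholders that were never replaced.
    """
    env = os.environ if env is None else env
    return [name for name in required if env.get(name) in (None, "", "null")]
```

Inside the frontend container, missing_settings(["YT_API_KEY"]) would flag a key that was never filled in. The backend equivalent would check MONGODB_URL.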

Note

  • Frontend service runs on port 8501 locally.
  • Backend service runs on port 8001 locally.
  • Make sure no other services are running on these ports.
  • /frontend and /backend directories are mounted as volumes for the respective services.
  • /frontend/data and /backend/ml_models directories are mounted for persistent data storage.
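
Before running docker-compose up, you may want to confirm that ports 8501 and 8001 are free. The sketch below is one way to check with the Python standard library. The helper port_is_free is hypothetical and only tests whether something accepts TCP connections on the local loopback address.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds,
        # which means another service is already listening.
        return sock.connect_ex((host, port)) != 0
```

Calling port_is_free(8501) and port_is_free(8001) should both return True on a machine where neither service is running yet.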