Real-Time Stock Price Analysis Pipeline
This project demonstrates a scalable data engineering pipeline designed to collect, process, and visualize real-time stock market data. The pipeline integrates APIs, processes data in real time using Apache Kafka and Python, and presents insights through visual dashboards and Python-generated graphs. It is suited to applications such as live market analysis, trading strategies, and financial data exploration.
- Real-Time Data Ingestion: Collects live stock market data using APIs.
- Streaming Processing: Utilizes Apache Kafka for data streaming and processing.
- Data Transformation: Transforms raw data into structured, analyzable formats.
- Data Storage: Stores processed data in CSV files for easy access and further analysis.
- Visualizations: Provides dashboards using Tableau/Power BI and Python-generated graphs.
- Automated Pipelines: Includes Airflow DAGs for scheduling and managing ETL processes.
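The transformation and storage features listed above can be sketched roughly as follows. This is a minimal illustration, not the code in src/process_data.py or src/load_data.py: the raw field names (symbol, price, volume, ts), the helper names transform_quote and write_rows, and the sample values are all assumptions made for the example.

```python
import csv
import io
from datetime import datetime, timezone

# Column order for the CSV output (hypothetical schema).
FIELDNAMES = ["symbol", "price", "volume", "timestamp"]


def transform_quote(raw: dict) -> dict:
    """Turn one raw API payload into a flat, typed row."""
    return {
        "symbol": raw["symbol"].strip().upper(),
        "price": round(float(raw["price"]), 4),
        "volume": int(raw["volume"]),
        # Epoch seconds converted to an ISO 8601 string in UTC.
        "timestamp": datetime.fromtimestamp(raw["ts"], tz=timezone.utc).isoformat(),
    }


def write_rows(rows, out_file) -> None:
    """Write transformed rows, with a header line, to any text file object."""
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Illustrative sample payload, not real market data.
    raw_quotes = [{"symbol": " aapl", "price": "189.5", "volume": "1200", "ts": 1700000000}]
    buffer = io.StringIO()
    write_rows([transform_quote(q) for q in raw_quotes], buffer)
    print(buffer.getvalue())
```

Writing to a file object rather than a fixed path keeps the sketch easy to test; in practice the same function could be handed an open handle to a CSV file under data/.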
real-time-stock-analysis-pipeline/
│
├── README.md                      # Project overview (you're reading this!)
├── requirements.txt               # Python dependencies
├── data/                          # Sample and processed data
│   ├── sample_data.csv
│   └── processed_data.csv
├── src/                           # Core Python scripts
│   ├── fetch_data.py              # Fetches real-time data from APIs
│   ├── process_data.py            # Processes and transforms raw data
│   ├── load_data.py               # Loads data to storage
│   ├── visualize_data.py          # Generates visualizations
│   └── airflow_dag.py             # Automates ETL process using Airflow
├── dashboards/                    # Tableau/Power BI dashboards
│   ├── tableau_dashboard.twb
│   └── power_bi_dashboard.pbix
├── scripts/                       # Kafka producer/consumer scripts
│   ├── kafka_producer.py
│   └── kafka_consumer.py
├── config/                        # Configuration files
│   ├── api_keys.json
│   └── db_config.yaml
└── docs/                          # Documentation and presentations
    ├── architecture_diagram.png
    ├── dataset_description.md
    └── presentation.pdf
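The Kafka producer and consumer scripts under scripts/ need to agree on how a stock tick is encoded on the wire. The sketch below shows one common approach, JSON encoded as UTF-8 bytes. It is an assumption for illustration, not a description of what kafka_producer.py and kafka_consumer.py actually do; the helper names, the topic name stock-ticks, and the broker address are hypothetical.

```python
import json


def serialize_tick(tick: dict) -> bytes:
    """Encode a tick as UTF-8 JSON; sorted keys keep the output deterministic."""
    return json.dumps(tick, sort_keys=True).encode("utf-8")


def deserialize_tick(payload: bytes) -> dict:
    """Decode the bytes produced by serialize_tick back into a dict."""
    return json.loads(payload.decode("utf-8"))


# Assuming the kafka-python package and a reachable broker, these helpers
# could be passed to the client constructors like this:
#
#   from kafka import KafkaProducer, KafkaConsumer
#   producer = KafkaProducer(bootstrap_servers="localhost:9092",
#                            value_serializer=serialize_tick)
#   producer.send("stock-ticks", {"symbol": "AAPL", "price": 189.5})
#
#   consumer = KafkaConsumer("stock-ticks",
#                            bootstrap_servers="localhost:9092",
#                            value_deserializer=deserialize_tick)

if __name__ == "__main__":
    # Illustrative sample tick, not real market data.
    tick = {"symbol": "AAPL", "price": 189.5}
    wire = serialize_tick(tick)
    print(wire)
    print(deserialize_tick(wire) == tick)
```

Keeping the encoding in two small functions means the producer and consumer cannot drift apart, and the round trip can be checked without a running broker.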
- Clone the repository:
git clone https://github.com/evans25575/real-time-stock-analysis-pipeline.git
cd real-time-stock-analysis-pipeline
- Install dependencies:
pip install -r requirements.txt
- Fetch Data:
python src/fetch_data.py
- Process Data:
python src/process_data.py
- Visualize Data:
python src/visualize_data.py
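The three steps above can also be chained from a single script so that a failure in one stage stops the rest. The following is a hypothetical helper, not a file in the repository; it assumes the script paths listed above and is run from the repository root.

```python
import subprocess
import sys

# Pipeline stages in execution order, as listed in the usage steps.
PIPELINE_STEPS = [
    "src/fetch_data.py",
    "src/process_data.py",
    "src/visualize_data.py",
]


def build_commands(python: str = sys.executable) -> list:
    """Return one argv list per pipeline stage."""
    return [[python, step] for step in PIPELINE_STEPS]


def run_pipeline() -> None:
    """Run each stage in order; stop at the first failing stage."""
    for command in build_commands():
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)


if __name__ == "__main__":
    if "--run" in sys.argv[1:]:
        run_pipeline()
    else:
        # Dry run by default: only show what would be executed.
        for command in build_commands():
            print(" ".join(command))
```

Without arguments the script only prints the commands; passing --run executes them. For scheduled runs, the Airflow DAG in src/airflow_dag.py remains the intended mechanism.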
Visual dashboards are created using:
- Tableau: dashboards/tableau_dashboard.twb
- Power BI: dashboards/power_bi_dashboard.pbix
Python-generated graphs are saved in the data/ folder.
Detailed documentation is available in the docs/ folder, including:
- architecture_diagram.png: Visual representation of the pipeline.
- dataset_description.md: Description of the datasets used.
- presentation.pdf: Project presentation for stakeholders.
This project is licensed under the MIT License. See the LICENSE file for more details.
Contributions are welcome! Feel free to submit issues, fork the repository, and make pull requests.
For questions or suggestions, feel free to reach out at: kiplaevans2018@gmail.com