This project is a framework for sentiment analysis on stock market data. It downloads stock data from Yahoo Finance with yfinance, fetches stock market news through the Alpaca APIs, scores news sentiment with generative AI models (OpenAI, Llama, or a Transformers pipeline), preprocesses the sentiment data, and runs backtests with customizable strategies using Backtrader. The language models score the news text so that sentiment can feed a strategy alongside technical indicators.
The goal is a streamlined workflow for analyzing stock market sentiment and backtesting trading strategies.
- Download stock data from Yahoo Finance.
- Preprocess sentiment data and merge it with stock data.
- Run backtests with customizable strategies using Backtrader.
- Analyze backtest results with various performance metrics.
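The merge step in the list above can be pictured as a join on the trading date. The following is a minimal sketch using only the standard library, not the project's actual processor: the column names `date`, `close`, and `sentiment`, the sample rows, and the choice to treat days without news as neutral (0.0) are all assumptions made for illustration.

```python
import csv
import io


def merge_by_date(price_rows, sentiment_rows):
    """Left-join daily prices with sentiment scores on the 'date' column.

    Days without a sentiment row get 0.0 (neutral); this default is an
    assumption of the sketch, not something the project specifies.
    """
    sentiment_by_date = {
        row["date"]: float(row["sentiment"]) for row in sentiment_rows
    }
    merged = []
    for row in price_rows:
        merged.append({
            "date": row["date"],
            "close": float(row["close"]),
            "sentiment": sentiment_by_date.get(row["date"], 0.0),
        })
    return merged


# Tiny made-up inputs, for illustration only.
PRICES_CSV = "date,close\n2022-03-21,100.0\n2022-03-22,101.5\n2022-03-23,99.0\n"
SENTIMENT_CSV = "date,sentiment\n2022-03-21,0.8\n2022-03-23,-0.4\n"

prices = list(csv.DictReader(io.StringIO(PRICES_CSV)))
sentiment = list(csv.DictReader(io.StringIO(SENTIMENT_CSV)))
merged = merge_by_date(prices, sentiment)

for row in merged:
    print(row)
```

A left join keeps every trading day even when no news was published, which is usually what a backtest needs. With pandas available, the same idea would likely be expressed as a `DataFrame.merge` with `how="left"`.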
- Python 3.6+
- Dependencies listed in requirements.txt
- Clone the repository:
    git clone https://github.com/your-username/stock-sentiment-backtesting.git
    cd stock-sentiment-backtesting
- Create a virtual environment (optional but recommended):
    python -m venv venv
- Activate the virtual environment:
  - On Windows:
      venv\Scripts\activate
  - On macOS/Linux:
      source venv/bin/activate
- Install dependencies:
    pip install -r requirements.txt
Edit the configuration parameters in main.py to customize the stock, date range, and other settings for your analysis:
STOCK_TICKER = 'AAPL'
START_DATE = '2022-03-21'
END_DATE = '2022-12-31'
SENTIMENT_DATA_PATH = 'data/stock_sentiment_data.csv'
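Because these values are edited by hand, it can help to check them before a backtest starts. The snippet below is a sketch under that assumption; `validate_config` is a hypothetical helper and is not part of main.py.

```python
from datetime import datetime

STOCK_TICKER = 'AAPL'
START_DATE = '2022-03-21'
END_DATE = '2022-12-31'
SENTIMENT_DATA_PATH = 'data/stock_sentiment_data.csv'


def validate_config(ticker, start, end):
    """Return (start_date, end_date) as date objects, or raise ValueError."""
    if not ticker or not ticker.strip():
        raise ValueError("STOCK_TICKER must be a non-empty string")
    # strptime raises ValueError if the string is not in YYYY-MM-DD form.
    start_date = datetime.strptime(start, "%Y-%m-%d").date()
    end_date = datetime.strptime(end, "%Y-%m-%d").date()
    if start_date >= end_date:
        raise ValueError(
            f"START_DATE {start} must be earlier than END_DATE {end}"
        )
    return start_date, end_date


start_date, end_date = validate_config(STOCK_TICKER, START_DATE, END_DATE)
span_days = (end_date - start_date).days
print(f"Backtesting {STOCK_TICKER} over {span_days} calendar days")
```

Failing early on a malformed or reversed date range is cheaper than discovering it after the data download has already run.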
Execute the main script to run the backtest:
python main.py
The backtest results, including performance metrics, will be displayed in the console.
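This README does not list which metrics are reported. As an illustration of the kind of figures a backtest typically produces, the sketch below computes total return and maximum drawdown from a series of portfolio values. The function names and the sample equity curve are made up for this example; Backtrader ships its own analyzers, which the project may use instead.

```python
def total_return(values):
    """Fractional gain from the first to the last portfolio value."""
    return values[-1] / values[0] - 1.0


def max_drawdown(values):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Made-up equity curve, for illustration only.
equity = [100.0, 110.0, 99.0, 120.0]

print(f"Total return: {total_return(equity):.1%}")
print(f"Max drawdown: {max_drawdown(equity):.1%}")
```

Drawdown is measured against the running peak rather than the starting value, so a strategy that ends up profitable can still show a sizeable drawdown along the way.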
algotrading-sentimentanalysis-genai/
├── alpaca/
│   └── client.py
├── data/
│   ├── stock_sentiment_data.csv
│   └── ...
├── llms/
│   ├── llama_llm.py
│   └── openai_llm.py
├── processor/
│   └── stock_data_processor.py
├── runner/
│   └── backtest_runner.py
├── sentiment_analysis/
│   └── sentiment_analysis_pipeline.py
├── strategies/
│   ├── technical_only_strategy/
│   └── technical_with_sentiment_strategy/
├── output/
│   └── ...
├── .gitignore
├── README.md
├── requirements.txt
├── main.py
└── venv/
    └── ...
- Python 3.7 or later
- Alpaca account
- API key ID and secret (available in your Alpaca account dashboard)
- Install the Alpaca Python library:
    pip install alpaca-trade-api
- Use the API key in your code:
    from alpaca_trade_api import REST
    alpaca_api_key = "YOUR_API_KEY"
    alpaca_secret_key = "YOUR_SECRET_KEY"
    rest_client = REST(alpaca_api_key, alpaca_secret_key)
- Python 3.6 or later
- OpenAI account
- API key (available in your OpenAI account dashboard)
- Install the official OpenAI library:
    pip install openai
- Set your API key as an environment variable:
    export OPENAI_API_KEY="YOUR_API_KEY"
  Alternatively, provide it directly in your code:
    import openai
    openai.api_key = "YOUR_API_KEY"
- Python 3.7 or later
- Hugging Face account with an access token
- Token with access to the desired Llama model
- Install necessary libraries:
    pip install transformers
- Set your Hugging Face token as an environment variable:
    export HF_ACCESS_TOKEN="YOUR_TOKEN"
Use caution when handling API keys and tokens. Avoid exposing them in public repositories or sharing them without proper security measures.
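One way to follow this advice is to read every credential from the environment and never print it in full. The sketch below uses only the standard library. `require_env` and `mask` are hypothetical helpers, not part of this repository; the variable names `OPENAI_API_KEY` and `HF_ACCESS_TOKEN` are the ones used in the setup steps above.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def mask(secret, visible=4):
    """Hide all but the last `visible` characters, for safe logging."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


# Report which credentials are available without revealing them.
for name in ("OPENAI_API_KEY", "HF_ACCESS_TOKEN"):
    try:
        print(f"{name}: {mask(require_env(name))}")
    except RuntimeError as exc:
        print(exc)
```

Keeping keys out of source files means they cannot be committed by accident. If a `.env` file is used to hold them locally, it should be listed in .gitignore.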
- main.py: Contains the main script for running backtests and the strategy definition.
- data: Directory for storing data files, including stock and sentiment data.
- output: Directory for saving backtest results and plots.
- llms: Contains the OpenAI and Llama clients for sentiment analysis.
- processor: Contains the stock data processor for preprocessing stock news and sentiment data.
- runner: Contains the backtest runner class, which runs backtests with Backtrader's Cerebro engine.
- sentiment_analysis: Contains the Transformers pipeline for sentiment analysis on news data.
- strategies: Contains the technical-only strategy and the technical-with-sentiment strategy.
- .gitignore: Specifies files and directories to be ignored by version control.
- README.md: Project documentation.
- requirements.txt: List of Python dependencies.
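The rules used by the two strategies are not described in this README. To show how a sentiment score could gate a technical signal, here is a minimal sketch in plain Python. The moving-average crossover, the window sizes, and the 0.2 sentiment threshold are assumptions chosen for illustration and are not taken from the code in strategies.

```python
def sma(values, window):
    """Simple moving average of the last `window` values, or None if too few."""
    if len(values) < window:
        return None
    return sum(values[-window:]) / window


def signal(closes, sentiment, fast=3, slow=5, threshold=0.2):
    """Return 'buy', 'sell', or 'hold' for the most recent bar.

    A buy needs the fast average above the slow one AND positive sentiment;
    a sell needs the fast average below the slow one AND negative sentiment.
    """
    fast_ma = sma(closes, fast)
    slow_ma = sma(closes, slow)
    if fast_ma is None or slow_ma is None:
        return "hold"  # not enough history yet
    if fast_ma > slow_ma and sentiment >= threshold:
        return "buy"
    if fast_ma < slow_ma and sentiment <= -threshold:
        return "sell"
    return "hold"


# Made-up closing prices, for illustration only.
rising = [10.0, 10.0, 10.0, 11.0, 12.0]
print(signal(rising, sentiment=0.5))
print(signal(rising, sentiment=0.0))
```

Dropping the sentiment conditions from `signal` gives a technical-only variant, which is one way to compare the two approaches on the same data.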
Contributions are welcome! Please follow the Contribution Guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.