Bitcoin Price - A Predictive Analysis

Using the Binance cryptocurrency exchange API client BTC data were obtained and examined, culminating in the construction of five distinct regression models to predict next day's close price as a function of its inferred financial market indicator statistics. Three were built using linear regression algorithms: Ordinary Least Squares (OLS), Lasso, and Ridge. One was built using the Random Forest Regression (RFR) algorithm. Then last, one was built using a Long Short Term Memory(LSTM) recurrent neural network.

Results

We found that the RFR model produced the most accurate predictions regarding BTCs closing price. However, the R2 score of the training set for the RFR was substantially lower (0.9172). The concern is solved by computing its OOB (out-of-bag) score, which was 0.9136. Comparing that score to the train R2 (0.9172) we can see how similar they are. Since the OOB is an unbiased estimate of a model’s performance on a test set the same size as the training, we can confirm that the model produces relatively "true" predictions, solidifying its test R2 score of 0.98845.

That being said, the LSTM produced the second best results according to R2 scores (0.9395). Although, because we split the train/test sets differently (80/20) than regression models (75/25) the R2 metric might not be useful for a comparitive analysis between the two. Using MAPE proves more reliable in that regard, showing us that the LSTM prediction errors were smaller by 0.0247 compared to RFR (0.48569).

The OLS and Lasso frameworks produced nearly identical results. Their respective R2 testing scores were 0.93821 and 0.93038, differing by only 0.00783 points. However, when looking at their MAPEs we can see that the OLS received a much better score than Lasso by 0.23988 meaning the OLS model produced more accurate predictions compared to the actual values. Lastly, the Ridge model performed the worst across the board, only receiving an R2 score of 0.75385 and a terrible MAPE in comparison to 1.51219. Meaning that this model's predictions differed from the actual values during testing by over 1% even with the data normalized and scaled.

Future Work

Perform deeper analysis into Tweet sentiment
Train/test models on more data
Fine tune the hyperparameters of models
Perform EDA with other indicators
Implement signal generator
- Back test
- Compute prospective Sharpe Ratio
Test for replicable results on different assets (equities, ETFs, etc.)

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
Data Wrangling		Data Wrangling
Exploratory Data Analysis (EDA)		Exploratory Data Analysis (EDA)
Modeling		Modeling
Preprocessing-Training		Preprocessing-Training
data		data
BTC_PricePred_Presentation.pdf		BTC_PricePred_Presentation.pdf
BTC_PricePred_Report.pdf		BTC_PricePred_Report.pdf
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Data Wrangling

Data Wrangling

Exploratory Data Analysis (EDA)

Exploratory Data Analysis (EDA)

Modeling

Modeling

Preprocessing-Training

Preprocessing-Training

data

data

BTC_PricePred_Presentation.pdf

BTC_PricePred_Presentation.pdf

BTC_PricePred_Report.pdf

BTC_PricePred_Report.pdf

README.md

README.md

Repository files navigation

Bitcoin Price - A Predictive Analysis

Table of Contents:

Results

Future Work

You can read the in-depth report here and view all model prediction visualizations here.

About

Languages

jra333/BTC-Price-Prediction

Folders and files

Latest commit

History

Repository files navigation

Bitcoin Price - A Predictive Analysis

Table of Contents:

Results

Future Work

You can read the in-depth report here and view all model prediction visualizations here.

About

Topics

Resources

Stars

Watchers

Forks

Languages