Cryptocurrencies

Unsupervised Learning algorithm to discover unknown patterns

Overview:

The purpose of this analysis was to use data on currently traded cryptocurrencies to produce a report and visualizations showing how those cryptocurrencies can be grouped into a new classification system. The report would help Accountability Accounting offer its customers a new investment portfolio in the exciting world of cryptocurrency.

Because the data has no known outcome (it is unlabeled), we preprocessed it to fit an unsupervised Machine Learning model and then ran a clustering algorithm to group the cryptocurrencies.

In this analysis we learned and applied:

Differences between supervised and unsupervised learning.
Data Preprocessing (Selection, Transformation, Scaling) - the process of preparing data for Machine Learning algorithms.
Elbow Curve - a method for choosing the number of clusters the algorithm should group the data into.
Principal Component Analysis (PCA) - a statistical technique that reduces the number of features, which speeds up Machine Learning algorithms when there are too many.
Clustering Algorithms (KMeans) - the process of grouping similar objects or data points into clusters.
Visualization (hvPlot, Plotly) - graphics libraries that allow us to create 2D and 3D graphs such as scatter plots.

The code for the challenge can be found in the ModuleChallenge folder.

Results:

Preprocessing the Data for PCA

The original dataset contained 1,252 records, of which only 1,144 cryptocurrencies were currently trading.
All rows with no coins mined were removed.
Only the required columns were kept; the others were dropped.
The final result was 532 tradable cryptocurrencies, which are displayed below.
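The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on a made-up six-row table, not the notebook's actual code: the IsTrading column name, the toy coins, and the choice of one-hot encoding plus StandardScaler are assumptions made for the example.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy stand-in for the raw dataset (hypothetical rows and tickers).
raw_df = pd.DataFrame(
    {
        "CoinName": ["AlphaCoin", "BetaCoin", "GammaCoin", "DeltaCoin", "EpsilonCoin", "ZetaCoin"],
        "Algorithm": ["SHA-256", "Scrypt", "X11", "SHA-256", "Scrypt", "X11"],
        "IsTrading": [True, True, False, True, True, True],  # assumed column name
        "ProofType": ["PoW", "PoW/PoS", "PoW", "PoS", "PoW", "PoW"],
        "TotalCoinsMined": [1.0e6, 0.0, 5.0e5, 2.5e7, None, 4.2e7],
        "TotalCoinSupply": [2.1e7, 1.0e8, 1.0e6, 5.0e7, 1.0e9, 8.4e7],
    },
    index=["ALP", "BET", "GAM", "DEL", "EPS", "ZET"],
)

# Keep only coins that are currently trading, then drop the flag column.
crypto_df = raw_df[raw_df["IsTrading"]].drop(columns="IsTrading")

# Drop rows with missing values and rows where no coins have been mined.
crypto_df = crypto_df.dropna()
crypto_df = crypto_df[crypto_df["TotalCoinsMined"] > 0]

# Set the coin names aside: they are labels, not features.
coin_names = crypto_df[["CoinName"]]
crypto_df = crypto_df.drop(columns="CoinName")

# One-hot encode the text columns and standardize every feature.
X = pd.get_dummies(crypto_df, columns=["Algorithm", "ProofType"], dtype=float)
X_scaled = StandardScaler().fit_transform(X)
```

Standardizing matters here because TotalCoinsMined and TotalCoinSupply are many orders of magnitude larger than the 0/1 indicator columns and would otherwise dominate both PCA and K-Means.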

Reducing Data Dimensions Using PCA

A new DataFrame named pcs_df was created with the columns PC 1, PC 2, and PC 3, using the index of the crypto_df DataFrame as its index. It is shown below.
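As a rough sketch of this step, the snippet below reduces a feature matrix to three principal components and builds pcs_df. The random matrix and the COIN0..COIN19 index are hypothetical placeholders for the scaled features and the crypto_df index.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the crypto_df index and the scaled feature matrix.
index = [f"COIN{i}" for i in range(20)]
X_scaled = rng.normal(size=(20, 8))

# Project the features onto the first three principal components.
pca = PCA(n_components=3)
principal_components = pca.fit_transform(X_scaled)

# Build pcs_df with the column names used in this README.
pcs_df = pd.DataFrame(
    principal_components,
    columns=["PC 1", "PC 2", "PC 3"],
    index=index,
)

# Fraction of the total variance retained by the three components.
explained = pca.explained_variance_ratio_.sum()
```

Checking explained is a quick way to judge how much information is lost by keeping only three components.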

Clustering Cryptocurrencies Using K-Means

An elbow curve was created using hvPlot to find the best value for K. From the diagram, the best value is K = 4.
Predictions were made for the K clusters of the cryptocurrency data.
A new DataFrame was created with the same index as the crypto_df DataFrame and the following columns: Algorithm, ProofType, TotalCoinsMined, TotalCoinSupply, PC 1, PC 2, PC 3, CoinName, and Class.
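The elbow curve and clustering steps above might look like the sketch below. It runs on four synthetic, well-separated blobs rather than the real principal components, so the data, the range of K values tried, and the random_state and n_init settings are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Four synthetic blobs standing in for the real principal components.
centers = np.array([[0, 0, 0], [10, 0, 0], [0, 10, 0], [0, 0, 10]], dtype=float)
points = np.vstack([c + rng.normal(scale=0.5, size=(25, 3)) for c in centers])
pcs_df = pd.DataFrame(points, columns=["PC 1", "PC 2", "PC 3"])

# Elbow curve data: fit K-Means for each candidate K and record the inertia.
k_values = list(range(1, 11))
inertia = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(pcs_df)
    inertia.append(model.inertia_)
elbow_df = pd.DataFrame({"k": k_values, "inertia": inertia})
# With hvplot.pandas imported, elbow_df.hvplot.line(x="k", y="inertia")
# would draw the curve; the bend marks the K to use.

# Fit the final model with K = 4 and attach the predicted class to each row.
model = KMeans(n_clusters=4, n_init=10, random_state=0)
predictions = model.fit_predict(pcs_df)
clustered_df = pcs_df.copy()
clustered_df["Class"] = predictions
```

In the full analysis, clustered_df would also carry the original crypto_df columns and CoinName, joined on the shared index.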

Visualizing Cryptocurrencies Results

The clusters were plotted using a 3D scatter plot, in which each data point shows the CoinName and Algorithm on hover.
A table of the tradable cryptocurrencies was created using the hvplot.table() function.
A DataFrame was created that contains the clustered_df DataFrame index, the scaled data, and the CoinName and Class columns.
An hvPlot scatter plot was created where the X-axis is "TotalCoinsMined", the Y-axis is "TotalCoinSupply", the data is grouped by "Class", and the CoinName is shown when you hover over a data point.
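The plotting DataFrame and the two charts described above could be put together along these lines. The README does not say which scaler was used, so MinMaxScaler is an assumption, as are the toy rows, the plot_df name, and the two helper functions (which import Plotly and hvPlot lazily so the data preparation runs without them).

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical miniature clustered_df with the columns named in this README.
clustered_df = pd.DataFrame(
    {
        "Algorithm": ["SHA-256", "Scrypt", "X11", "SHA-256"],
        "ProofType": ["PoW", "PoW/PoS", "PoW", "PoS"],
        "TotalCoinsMined": [1.0e6, 2.5e7, 4.2e7, 9.0e8],
        "TotalCoinSupply": [2.1e7, 5.0e7, 8.4e7, 1.0e9],
        "PC 1": [-0.4, 0.1, 0.3, 2.0],
        "PC 2": [1.1, -1.2, 0.2, 0.5],
        "PC 3": [0.0, 0.3, -0.6, 0.9],
        "CoinName": ["AlphaCoin", "BetaCoin", "GammaCoin", "DeltaCoin"],
        "Class": [0, 1, 0, 2],
    },
    index=["ALP", "BET", "GAM", "DEL"],
)

# Scale the two supply columns to the 0-1 range (assumed scaler: MinMaxScaler).
cols = ["TotalCoinSupply", "TotalCoinsMined"]
scaled = MinMaxScaler().fit_transform(clustered_df[cols])

# Plotting DataFrame: same index, scaled data, plus CoinName and Class.
plot_df = pd.DataFrame(scaled, columns=cols, index=clustered_df.index)
plot_df["CoinName"] = clustered_df["CoinName"]
plot_df["Class"] = clustered_df["Class"]


def plot_clusters_3d(df):
    """3D scatter of the principal components, colored by Class."""
    import plotly.express as px  # imported lazily; requires plotly

    return px.scatter_3d(
        df, x="PC 1", y="PC 2", z="PC 3",
        color="Class", hover_name="CoinName", hover_data=["Algorithm"],
    )


def plot_supply_vs_mined(df):
    """2D scatter of scaled supply against scaled coins mined, grouped by Class."""
    import hvplot.pandas  # noqa: F401  (registers the .hvplot accessor)

    return df.hvplot.scatter(
        x="TotalCoinsMined", y="TotalCoinSupply",
        by="Class", hover_cols=["CoinName"],
    )
```

In a notebook, plot_clusters_3d(clustered_df) and plot_supply_vs_mined(plot_df) would render the two figures.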
