Turing-Machines-PESU/stock_market_analysis

stock_market_analysis

Created: 25 September 2018 | Completed: 24 November 2018

A brief overview of the project is available at an alternate site.

To run the project

python GUI/GUI.py

Data Acquisition Sources:

  1. https://www.kaggle.com/borismarjanovic/price-volume-data-for-all-us-stocks-etfs
  2. https://trendogate.com/ [Scraped the required data]
  3. https://www.nasdaq.com/screening/company-list.aspx [Details on companies and Symbol list]

Preprocessed datasets:

  1. Compressed Dataset
  2. Uncompressed CSV

Progress:

  1. 25 September 2018: Created the repository. Added datasets and scripts.
  2. 27 September 2018: Literature survey on the required Python packages and preprocessing of the text data from the trending topics
  3. 9 October 2018: Analysis of Twitter data
  4. 13 October 2018: Added working prototype of newly designed graph "TextoGram" [to branch textogram_prototype]
  5. 14 October 2018: Advanced analysis of Xerox Inc. and comparison with Google Inc. Fitted an ARIMA model. The attempt to develop interesting insights failed.
  6. 14 October 2018: Preprocessed the Company data and Filtered out irrelevant data
  7. 14 October 2018: Basic Summary Statistics on Company Data
  8. 10 November 2018: Segmentation of the hashtags and basic text preprocessing.
  9. 12 November 2018: Basic work on GUI. (Testing phase as of 13 Nov)
  10. 13 November 2018: Basic Forecasting. LSTM, Linear Regression.
  11. 15 November 2018: Design GUI Screens. Finalise Pipeline.
  12. 18 November 2018: Modularization and Integration.
  13. 20 November 2018: Bug Fixing and Debugging.
  14. 21 November 2018: GUI Integration. Video Recording. Editing.
  15. 22 November 2018: First draft of the Report.
  16. 23 November 2018: Finalize report. Testing and Debugging.

To install the required packages

  pip install -r requirements.txt

** Please note that not all packages can be installed with the above command. If you encounter an error, please install the failing package manually from the list in requirements.txt.
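When a single bad entry makes the whole install fail, it can help to try each requirement on its own and collect the ones that need manual attention. The following is a minimal sketch of that idea, not a script shipped with this repository; the helper names `parse_requirements` and `install_each` are hypothetical, and it assumes requirements.txt contains one plain specifier per line.

```python
import subprocess
import sys


def parse_requirements(text):
    """Return requirement specifiers, skipping blanks, comments and pip options."""
    specs = []
    for line in text.splitlines():
        line = line.split(" #", 1)[0].strip()  # drop trailing inline comments
        if not line or line.startswith("#") or line.startswith("-"):
            continue  # blank line, full-line comment, or an option such as "-r"
        specs.append(line)
    return specs


def install_each(specs):
    """Try to install every specifier separately; return the ones that failed."""
    failed = []
    for spec in specs:
        # Use the current interpreter's pip so packages land in the active environment.
        result = subprocess.run([sys.executable, "-m", "pip", "install", spec])
        if result.returncode != 0:
            failed.append(spec)
    return failed
```

From the repository root, passing the contents of requirements.txt through `parse_requirements` and then `install_each` returns the list of specifiers that still have to be installed by hand.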

Project Structure

Please maintain the project structure as follows [after downloading the dataset]:

   .
   ├── _config.yml
   ├── datasets
   │   ├── Companies
   │   │   ├── companylist (1).csv
   │   │   ├── companylist (2).csv
   │   │   └── companylist.csv
   │   ├── companies_stocks.csv
   │   ├── filtered_companies.csv
   │   ├── hashtags.csv
   │   ├── regions.csv
   │   ├── segmented_tags.csv
   │   ├── twitter.csv
   │   ├── words_dates_list_cw.csv
   │   └── words_dates_list_gnrl.csv
   ├── Dickey Fuller Test and Filters.ipynb
   ├── ETFs
   ├── Forecasting
   │   ├── advanced_analysis_xerox.ipynb
   │   ├── Basic_prediction_with_lstm.ipynb
   │   ├── company_preprocessed_data.csv
   │   ├── Forecasting using Auto Arima.ipynb
   │   ├── LinearRegressionModel.py
   │   ├── lstm.py
   │   ├── preprocess_data.py
   │   ├── stock_data.py
   │   └── visualize.py
   ├── GUI
   │   ├── graph_images
   │   ├── GUI.ipynb
   │   ├── GUI.py
   │   └── loading.jpg
   ├── index.md
   ├── LICENSE
   ├── modules
   │   ├── basic.py
   │   └── forecast.py
   ├── packages.txt
   ├── README.md
   ├── requirements.txt
   ├── scripts
   │   ├── hashtags_segmentation.py
   │   ├── mergestocks.py
   │   ├── process_companies.py
   │   ├── regions_scrape.py
   │   ├── seg_tags_preprocess.py
   │   ├── twitterscrape.py
   │   └── update_companies.py
   ├── stock_notes.txt
   ├── Stocks
   ├── stocks.csv
   ├── Understanding_companies_stocks.py
   ├── Understanding_Dataset.ipynb
   ├── Understanding_Dataset.py
   ├── Understanding_stocks.ipynb
   ├── understanding_twitter_dataset.ipynb
   └── understanding_twitter_dataset.py

The folders ./ETFs and ./Stocks are not strictly necessary, but a few modules may not work in their absence.
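A quick way to confirm the layout before launching the GUI is to check for the expected paths. The sketch below is an illustration only and is not part of the repository; the split into `REQUIRED` and `OPTIONAL` is an assumption based on the tree above, and the function names are hypothetical.

```python
from pathlib import Path

# Assumed split: which entries are truly required is a guess based on the tree above.
REQUIRED = ["datasets", "Forecasting", "GUI/GUI.py", "modules", "scripts", "requirements.txt"]
OPTIONAL = ["ETFs", "Stocks"]


def missing_paths(root, relative_paths):
    """Return the entries of relative_paths that do not exist under root."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).exists()]


def report(root="."):
    """Print missing paths; return True when every required path is present."""
    required = missing_paths(root, REQUIRED)
    optional = missing_paths(root, OPTIONAL)
    for p in required:
        print(f"missing (required): {p}")
    for p in optional:
        print(f"missing (optional, some modules may not work): {p}")
    return not required
```

Calling `report()` from the repository root lists anything that is absent and returns `False` if a required path is missing.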

License

This project is made available under the MIT License.

Credits

The project is created and maintained by Vinayak R Kamath, Nikhil V Revankar and Vikram G.
