Gurudev Ilangovan 2018-07-29
The `IndianStocksR` package downloads the end-of-day data for all stocks on the two primary Indian stock exchanges, NSE and BSE. The end-of-day data is provided free by the two exchanges on their websites and includes the open, high, low and close, among other fields, for each scrip traded on them.
The data can be retrieved from the exchanges' websites based on a date formatted in a particular way. The R-Bloggers article was the source of inspiration for the package. However, the package modularizes the code, tweaks a lot of things and exposes a more accessible API that hides the complexity from the user.
It is advisable to create a folder and set the working directory to that folder before starting. Even better, if you're working in RStudio, create an R project for downloading the data and doing your analysis.
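If you are not using an RStudio project, the folder setup can be sketched in base R as below. The folder name `indian-stocks` and the use of `tempdir()` are assumptions made purely for illustration; pick any location you like.

```r
# Hypothetical project folder; tempdir() is used only so the sketch
# does not clutter your home directory.
project_dir <- file.path(tempdir(), "indian-stocks")

# Create the folder if it does not exist yet.
dir.create(project_dir, recursive = TRUE, showWarnings = FALSE)

# Make it the working directory; setwd() returns the previous one,
# so you can switch back later with setwd(old_wd).
old_wd <- setwd(project_dir)
print(getwd())
```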
The package is currently being submitted to CRAN, after which a simple `install.packages("IndianStocksR")` will install it. For now, it is available on GitHub.

```r
install.packages("devtools")
devtools::install_github("ilangurudev/IndianStocksR")
```
After installation, we load the package. The package creates data frames and hence plays well with the concepts of tidy data and the `tidyverse`, so loading that package as well is highly encouraged.

```r
library(IndianStocksR)
library(tidyverse)
```
The workhorse of the package is the function `download_stocks`. However, you will rarely have to use it directly. It still pays to understand its parameters, as it is the basis of the other functions that you will probably use.

```r
download_stocks(date = "2018-07-20", exchange = c("nse", "bse"), dest_path = "./data", quiet = FALSE)
## Downloading from 'nse' as exchange not clearly specified.
## Dowloaded stocks data from NSE on 20 JUL 2018
```
- The `date` parameter can be a date object (and defaults to today). It can also be a string (yyyy-mm-dd) or a number that can be parsed as a date by `lubridate::as_date()`. For instance, `"2018-05-21"` is a valid date.
- The `exchange` can either be "nse" or "bse".
- The `dest_path` specifies where you want the data files to be downloaded. It defaults to the data folder in the current working directory (which it will create if not found). This is why it is advisable to work in a project: it keeps all the data files of a project organized. If the path you specify is not found, an error is thrown.
- The `quiet` parameter controls whether you want the download status messages or not.
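To make the accepted `date` formats concrete, here is a small sketch that calls `lubridate::as_date()` directly on the three kinds of input mentioned above. It only illustrates the parsing step and does not call the package itself.

```r
library(lubridate)

# Three ways of writing the same day, each parsed by lubridate::as_date().
from_string <- as_date("2018-05-21")           # yyyy-mm-dd string
from_date   <- as_date(as.Date("2018-05-21"))  # already a Date object
from_number <- as_date(17672)                  # days since 1970-01-01

# All three should refer to the same calendar day.
all_same <- from_string == from_date && from_date == from_number
print(all_same)
```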
The main purpose of this function is to download data from the specified exchange on the mentioned date. If data is not available for the date you specified, you will get an error.
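Because a date without data raises an error, a script that loops over dates may want to catch it. The sketch below shows one way to do that with `tryCatch()`. The names `safe_download` and `fake_downloader` are hypothetical helpers written for this example, not part of the package; `fake_downloader` merely stands in for `download_stocks` so the sketch runs offline.

```r
# Wrap any single-date downloader so that a failing date yields NULL
# instead of stopping the whole script.
safe_download <- function(date, downloader, ...) {
  tryCatch(
    downloader(date = date, ...),
    error = function(e) {
      message("Skipping ", date, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Hypothetical stand-in for download_stocks(): it fails on weekends
# and otherwise returns a one-row data frame.
fake_downloader <- function(date, ...) {
  d <- as.Date(date)
  if (format(d, "%u") %in% c("6", "7")) stop("no data for ", d)
  data.frame(date = d, symbol = "EXAMPLE")
}

weekday_result <- safe_download("2018-07-20", fake_downloader)  # a Friday
weekend_result <- safe_download("2018-07-21", fake_downloader)  # a Saturday
```

In real use you would pass `IndianStocksR::download_stocks` as the `downloader` argument, along with `exchange` and `dest_path`.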
The function you'll probably use first is `download_stocks_period`.

```r
df_period <-
  download_stocks_period(start = "2018-07-21",
                         end = "2018-07-26",
                         exchange = c("both", "nse", "bse"),
                         dest_path = "./data",
                         compile = TRUE,
                         delete_component_files = TRUE,
                         quiet = FALSE)
```
- `start` and `end`: `download_stocks_period` downloads data for all the dates in the date range specified by `start` and `end`. `start` defaults to today - 8 days and `end` defaults to today. If today is 2018-07-30, then `end` takes that value and `start` takes the value 2018-07-22. However, it makes sense to set `start` to today - 365 or to the actual date from which you want the data. You can change the `end` value too if you want data for a specific date range. The `start` and `end` values follow the same rules as the `date` parameter in `download_stocks`.
- The `exchange` parameter's behavior is straightforward: it downloads data for the date range from NSE if "nse", from BSE if "bse", or from both if "both". It defaults to "both".
- The `dest_path` does the same job as it does in `download_stocks`.
- The `compile` parameter compiles all the downloaded files into one file (if `exchange` is "both", one compiled file for "nse", one for "bse" and one combined). This option is on by default, as compiled files are much more tractable for analysis.
- The `delete_component_files` parameter deletes everything apart from the compiled files. This keeps the workspace clean and makes updating more efficient.
- The `quiet` does the same job as it does in `download_stocks`.
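The default date window described above can be reproduced in a few lines of base R. This is only a sketch of the arithmetic, with "today" pinned to 2018-07-30 to match the example in the text. The weekend filter at the end is an assumption about when the exchanges usually trade, not a description of what the package does internally.

```r
# "Today" is fixed so the result matches the example in the text;
# in practice this would be Sys.Date().
today <- as.Date("2018-07-30")

end   <- today       # default end: today
start <- today - 8   # default start: today - 8 days

# Every calendar day in the window, both ends included.
dates <- seq(start, end, by = "day")

# Assumption: no regular trading on Saturday ("6") or Sunday ("7"),
# so only these days would normally have data.
weekdays_only <- dates[!format(dates, "%u") %in% c("6", "7")]

print(start)
print(weekdays_only)
```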
Let's take a look at `df_period`.

```r
df_period %>% slice(1:200)
```
| exchange | date | symbol | isin | open | high | low | close | volume | series | last | prevclose | tottrdval | timestamp | totaltrades | sc_group | no_trades | net_turnov | tdcloindi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| nse | 2018-07-20 | 20MICRONS | INE144J01027 | 34.90 | 35.20 | 33.90 | 34.40 | 42383 | EQ | 34.75 | 34.60 | 1456395.4 | 20-JUL-2018 | 607 | NA | NA | NA | NA |
| nse | 2018-07-20 | 21STCENMGM | INE253B01015 | 34.90 | 34.90 | 34.90 | 34.90 | 1202 | EQ | 34.90 | 34.25 | 41949.8 | 20-JUL-2018 | 4 | NA | NA | NA | NA |
| nse | 2018-07-20 | 3IINFOTECH | INE748C01020 | 3.55 | 3.70 | 3.50 | 3.50 | 2998992 | EQ | 3.50 | 3.60 | 10717093.0 | 20-JUL-2018 | 1137 | NA | NA | NA | NA |
| nse | 2018-07-20 | 3MINDIA | INE470A01017 | 21541.00 | 23490.00 | 21318.55 | 23338.75 | 9813 | EQ | 23100.00 | 21722.15 | 222218068.8 | 20-JUL-2018 | 4330 | NA | NA | NA | NA |
| nse | 2018-07-20 | 3PLAND | INE105C01023 | 15.00 | 15.00 | 12.65 | 13.15 | 2517 | EQ | 13.10 | 13.45 | 33735.6 | 20-JUL-2018 | 28 | NA | NA | NA | NA |
| nse | 2018-07-20 | 5PAISA | INE618L01018 | 294.75 | 304.95 | 294.75 | 300.85 | 2925 | EQ | 302.00 | 304.05 | 875384.8 | 20-JUL-2018 | 241 | NA | NA | NA | NA |
| nse | 2018-07-20 | 63MOONS | INE111B01023 | 65.25 | 66.80 | 63.10 | 65.55 | 216988 | EQ | 65.20 | 65.20 | 14102114.2 | 20-JUL-2018 | 2403 | NA | NA | NA | NA |
| nse | 2018-07-20 | 8KMILES | INE650K01021 | 335.00 | 339.00 | 307.50 | 307.50 | 1422247 | EQ | 307.50 | 341.65 | 444749983.2 | 20-JUL-2018 | 21142 | NA | NA | NA | NA |
| nse | 2018-07-20 | A2ZINFRA | INE619I01012 | 19.65 | 20.10 | 19.30 | 19.80 | 168417 | EQ | 19.70 | 19.80 | 3304622.9 | 20-JUL-2018 | 740 | NA | NA | NA | NA |
| nse | 2018-07-20 | AARTIDRUGS | INE767A01016 | 520.70 | 527.00 | 520.00 | 521.75 | 5263 | EQ | 520.00 | 520.95 | 2757494.0 | 20-JUL-2018 | 538 | NA | NA | NA | NA |
| nse | 2018-07-20 | AARTIIND | INE769A01020 | 1197.15 | 1229.80 | 1197.15 | 1215.45 | 7544 | EQ | 1211.00 | 1204.40 | 9197825.1 | 20-JUL-2018 | 1295 | NA | NA | NA | NA |
| nse | 2018-07-20 | AARVEEDEN | INE273D01019 | 29.60 | 30.40 | 28.80 | 29.10 | 8583 | EQ | 29.10 | 29.55 | 250922.4 | 20-JUL-2018 | 63 | NA | NA | NA | NA |
| nse | 2018-07-20 | ABAN | INE421A01028 | 101.80 | 103.95 | 100.15 | 101.90 | 462821 | EQ | 102.30 | 102.20 | 47025017.7 | 20-JUL-2018 | 5859 | NA | NA | NA | NA |
| nse | 2018-07-20 | ABB | INE117A01022 | 1164.90 | 1179.95 | 1129.00 | 1134.95 | 186787 | EQ | 1132.00 | 1157.60 | 215143192.6 | 20-JUL-2018 | 14947 | NA | NA | NA | NA |
| nse | 2018-07-20 | ABBOTINDIA | INE358A01014 | 7286.00 | 7348.75 | 7193.60 | 7311.95 | 3185 | EQ | 7325.00 | 7266.20 | 23211751.7 | 20-JUL-2018 | 813 | NA | NA | NA | NA |
Besides writing the compiled data out as CSV, the function also returns it as a data frame.
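Since the result is a data frame, the usual `dplyr` verbs apply. The sketch below rebuilds three rows from the output shown above (only the `symbol`, `close` and `prevclose` columns) and computes the day's percentage change. The `pct_change` column is an illustrative addition, not something the package provides.

```r
library(dplyr)

# Three rows copied from the table above, reduced to the columns we need.
sample_rows <- tibble::tribble(
  ~symbol,     ~close,   ~prevclose,
  "20MICRONS", 34.40,    34.60,
  "3MINDIA",   23338.75, 21722.15,
  "8KMILES",   307.50,   341.65
)

# Percentage change of the close relative to the previous close,
# sorted from the biggest gainer to the biggest loser.
daily_change <- sample_rows %>%
  mutate(pct_change = 100 * (close - prevclose) / prevclose) %>%
  arrange(desc(pct_change))

print(daily_change)
```

On the full data you would run the same pipeline on `df_period` directly.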
Once you have run `download_stocks_period`, you can update the database later by running `update_stocks`.

```r
df_updated <-
  update_stocks(data_path = "./data",
                till = lubridate::today(),
                exchange = c("both", "nse", "bse"),
                compile = TRUE,
                delete_component_files = TRUE)
```
Most of these parameters have been discussed before. This function scans all the files in the directory, finds the latest date for which there is data, and downloads data from the day after that up to the date given by `till`. If there are no files inside the specified folder, it downloads data from today - 8 up to the date given by `till`. You rarely have to tweak the `till` parameter. It's primarily used to update up to the current day.
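The update rule described above can be sketched as a small date calculation. The helper `next_download_range` is hypothetical and written only to illustrate the logic. It is not the package's implementation, and it measures the empty-folder fallback of 8 days back from `till`, which coincides with today - 8 when `till` is left at its default.

```r
# Given the dates already on disk, work out which dates an update
# would still need to fetch.
next_download_range <- function(existing_dates, till = Sys.Date()) {
  till <- as.Date(till)
  if (length(existing_dates) == 0) {
    # Nothing downloaded yet: fall back to the 8-day window.
    start <- till - 8
  } else {
    # Resume from the day after the latest date already present.
    start <- max(as.Date(existing_dates)) + 1
  }
  # Already up to date: return an empty Date vector.
  if (start > till) return(as.Date(character(0)))
  seq(start, till, by = "day")
}

have_some <- next_download_range(c("2018-07-25", "2018-07-26"), till = "2018-07-30")
have_none <- next_download_range(character(0), till = "2018-07-30")
print(have_some)
```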
Let's take a look at `df_updated`.

```r
df_updated %>% slice(1:200)
```
| exchange | date | symbol | isin | open | high | low | close | volume | series | last | prevclose | tottrdval | timestamp | totaltrades | sc_group | no_trades | net_turnov | tdcloindi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| nse | 2018-07-27 | 20MICRONS | INE144J01027 | 39.00 | 40.30 | 37.50 | 39.90 | 92698 | EQ | 39.85 | 38.85 | 3658998.65 | 27-JUL-2018 | 649 | NA | NA | NA | NA |
| nse | 2018-07-27 | 21STCENMGM | INE253B01015 | 37.10 | 37.10 | 36.50 | 37.10 | 542 | EQ | 37.10 | 36.40 | 20101.10 | 27-JUL-2018 | 10 | NA | NA | NA | NA |
| nse | 2018-07-27 | 3IINFOTECH | INE748C01020 | 3.60 | 3.60 | 3.50 | 3.55 | 2067721 | EQ | 3.55 | 3.55 | 7328844.40 | 27-JUL-2018 | 2013 | NA | NA | NA | NA |
| nse | 2018-07-27 | 3MINDIA | INE470A01017 | 23610.00 | 23749.95 | 23400.00 | 23609.80 | 924 | EQ | 23500.00 | 23676.55 | 21850559.30 | 27-JUL-2018 | 504 | NA | NA | NA | NA |
| nse | 2018-07-27 | 3PLAND | INE105C01023 | 13.50 | 13.95 | 11.55 | 12.15 | 1887 | EQ | 13.90 | 13.85 | 24549.25 | 27-JUL-2018 | 57 | NA | NA | NA | NA |
| nse | 2018-07-27 | 5PAISA | INE618L01018 | 346.00 | 352.85 | 337.00 | 344.55 | 3191 | EQ | 341.00 | 345.50 | 1095565.05 | 27-JUL-2018 | 301 | NA | NA | NA | NA |
| nse | 2018-07-27 | 63MOONS | INE111B01023 | 70.00 | 71.00 | 69.50 | 69.80 | 93732 | EQ | 69.60 | 69.40 | 6571384.30 | 27-JUL-2018 | 1215 | NA | NA | NA | NA |
| nse | 2018-07-27 | 8KMILES | INE650K01021 | 260.85 | 260.85 | 260.85 | 260.85 | 25045 | EQ | 260.85 | 248.45 | 6532988.25 | 27-JUL-2018 | 353 | NA | NA | NA | NA |
| nse | 2018-07-27 | A2ZINFRA | INE619I01012 | 21.40 | 25.50 | 21.30 | 24.65 | 2830527 | EQ | 24.25 | 21.25 | 68796473.80 | 27-JUL-2018 | 4056 | NA | NA | NA | NA |
| nse | 2018-07-27 | AAKASH | INE087Z01016 | 34.50 | 34.50 | 34.50 | 34.50 | 2000 | SM | 34.50 | 35.00 | 69000.00 | 27-JUL-2018 | 1 | NA | NA | NA | NA |
| nse | 2018-07-27 | AARTIDRUGS | INE767A01016 | 548.60 | 567.00 | 545.95 | 552.70 | 12211 | EQ | 551.10 | 545.30 | 6823408.25 | 27-JUL-2018 | 920 | NA | NA | NA | NA |
| nse | 2018-07-27 | AARTIIND | INE769A01020 | 1264.55 | 1274.00 | 1245.00 | 1252.40 | 14181 | EQ | 1258.70 | 1260.25 | 17933400.20 | 27-JUL-2018 | 1765 | NA | NA | NA | NA |
| nse | 2018-07-27 | AARVEEDEN | INE273D01019 | 32.00 | 32.75 | 31.35 | 32.40 | 16656 | EQ | 32.25 | 31.50 | 535642.45 | 27-JUL-2018 | 93 | NA | NA | NA | NA |
| nse | 2018-07-27 | ABAN | INE421A01028 | 106.95 | 110.80 | 105.75 | 108.35 | 662013 | EQ | 108.00 | 105.65 | 72033538.90 | 27-JUL-2018 | 7728 | NA | NA | NA | NA |
| nse | 2018-07-27 | ABB | INE117A01022 | 1188.00 | 1194.00 | 1175.00 | 1188.40 | 33749 | EQ | 1185.00 | 1175.85 | 40011149.80 | 27-JUL-2018 | 2781 | NA | NA | NA | NA |
Apart from the date parameters, you rarely have to change the defaults, which are designed to work well out of the box.
This is just an initial version of the package and I expect to see a few bugs. I'd be very happy if you create GitHub issues when you run into anything. Suggestions and feature requests are welcome. Feel free to comment with what you think of the package.
Thanks for reading! Cheers!