Fixed reference in EventStudy and tidied examples
saeedamen committed Oct 2, 2020
1 parent d3c3084 commit 08505e0
Showing 9 changed files with 91 additions and 71 deletions.
19 changes: 12 additions & 7 deletions INSTALL.md
@@ -15,7 +15,7 @@ which will help you to write your own trading strategies and Python scripts for
* This is the most used Python distribution for data science
  * As well as the core Python libraries, it installs the SciPy stack (which contains NumPy, pandas etc.) and other
useful libraries such as matplotlib, which are dependencies for the various Cuemacro libraries
* Recommend installing latest version of Python 3.6 (by running in command line `conda install python=3.6`) as some of the multiprocessing libraries have issues with Python 3.6 at present when I've tried it
* Recommend installing latest version of Python 3.7 (by running in command line `conda install python=3.7`) as some of the multiprocessing libraries have issues with Python 3.6 at present when I've tried it
* findatapy, chartpy and finmarketpy should be compatible with the dependencies in Anaconda (eg. version of pandas, numpy etc.)
* Microsoft Visual Studio 2017 Community Edition - [download](https://www.visualstudio.com/downloads/) or Visual C++ 2015 build tools - Windows
  * Make sure to do a custom installation and tick Visual C++ in the Visual Studio 2017 installer
@@ -67,21 +67,26 @@ which will help you to write your own trading strategies and Python scripts for
Open up the Anaconda Command Prompt (accessible via the Start Menu) to run the various "conda" and "pip" commands to install the
various Python libraries. The Cuemacro libraries will install most Python dependencies, but some need to be installed separately.

* Install finmarketpy, findatapy and chartpy the easy way...
* You can install some Python data science conda environments that I use for teaching
which include finmarketpy, findatapy and chartpy
* Instructions on how to install Anaconda and the py37class conda environment at
[https://github.com/cuemacro/teaching/blob/master/pythoncourse/installation/installing_anaconda_and_pycharm.ipynb](https://github.com/cuemacro/teaching/blob/master/pythoncourse/installation/installing_anaconda_and_pycharm.ipynb)

* Python libraries (open source)
* arctic - `pip install git+https://github.com/manahl/arctic.git`
* arctic - `pip install arctic`
* Wrapper for MongoDB (also installs pymongo)
* Allows us to easily save and load pandas DataFrames in MongoDB
* Also compresses data contents
* blpapi - https://www.bloomberglabs.com/api/libraries/ (both C++ and Python libraries)
* Interact with Bloomberg programmatically via Python to download historical and live data
* Note that this requires a C++ compiler to build the Python library (at present Bloomberg doesn't have binaries for Python 3.5,
hence you need to build them yourself)
  * Note that you may need a C++ compiler to build the Python library
* Follow instructions at [https://github.com/cuemacro/findatapy/blob/master/BLOOMBERG.md](https://github.com/cuemacro/findatapy/blob/master/BLOOMBERG.md) for all the steps necessary to install blpapi
* Whilst Anaconda has most of the dependencies below, and pip will install all additional ones needed by the Cuemacro Python
libraries, it is possible to install them manually via pip; below is a list of the dependencies
* all libraries
* numpy - matrix algebra (Anaconda)
* pandas - time series (Anaconda) - older versions of pandas could have issues due to deprecated methods - recommend 0.24.2
* pandas - time series (Anaconda) - older versions of pandas could have issues due to deprecated methods - recommend 1.0.5
* pytz - timezone management (Anaconda)
* requests - accessing URLs (Anaconda)
        * multiprocess - multitasking
@@ -112,14 +117,14 @@ various Python libraries. The Cuemacro libraries will install most Python depend
* Typically this is in folders like:
* C:\Anaconda3\Lib\site-packages
* C:\Program Files\Anaconda\Lib\site-packages
* chartpy - `pip install git+https://github.com/cuemacro/chartpy.git`
* chartpy - `pip install chartpy`
* Check constants file configuration [chartpy/chartpy/util/chartconstants.py](https://github.com/cuemacro/finmarketpy/blob/master/chartpy/util/chartconstants.py) for
* Adding your own API keys for Plotly, Twitter etc
* Changing the default size of plots
* Changing the default chart engine (eg. Plotly, Bokeh or Matplotlib)
        * Alternatively you can create a chartcred.py file in the same folder to put your own API keys
* This has the benefit of not being overwritten each time you upgrade the project
* findatapy - `pip install git+https://github.com/cuemacro/findatapy.git`
* findatapy - `pip install findatapy`
* Check constants file configuration [findatapy/findatapy/util/dataconstants.py](https://github.com/cuemacro/finmarketpy/blob/master/findatatpy/util/dataconstants.py) for
* adding your own API keys for Quandl etc.
* changing path of your local data source (change `folder_time_series_data` attribute)
5 changes: 4 additions & 1 deletion README.md
@@ -58,7 +58,7 @@ Calculate event study around events for asset (see examples/events_examples.py)
# Requirements

Major requirements
* Required: Python 3.6
* Required: Python 3.7
* Required: pandas 0.24.2, numpy etc.
* Required: findatapy for downloading market data (https://github.com/cuemacro/findatapy)
* Required: chartpy for funky interactive plots (https://github.com/cuemacro/chartpy)
@@ -104,6 +104,7 @@ In finmarketpy/examples you will find several examples, including some simple tr

# Release Notes

* 0.11.6 - finmarketpy (02 Oct 2020)
* 0.11.5 - finmarketpy (24 Aug 2020)
* 0.11.4 - finmarketpy (06 May 2020)
* 0.11.3 - finmarketpy (04 Dec 2019)
@@ -115,6 +116,8 @@ In finmarketpy/examples you will find several examples, including some simple tr

# finmarketpy log

* 02 Oct 2020
* Fixed vol surface calculation
* 24 Aug 2020
* Replaced .ix to work with later versions of pandas
* 07 May 2020
39 changes: 17 additions & 22 deletions finmarketpy/economics/eventstudy.py
@@ -200,14 +200,8 @@ def get_surprise_against_intraday_moves_over_custom_event(
from findatapy.util import ConfigManager
from findatapy.market import SpeedCache

try:
from numbapro import autojit
except:
pass

marketconstants = MarketConstants()


class EventsFactory(EventStudy):
"""Provides methods to fetch data on economic data events and to perform basic event studies for market data around
these events. Note, requires a file of input of the following (transposed as columns!) - we give an example for
@@ -231,12 +225,13 @@ class EventsFactory(EventStudy):

# _econ_data_frame = None

# where your HDF5 file is stored with economic data
# Where your HDF5 file is stored with economic data
# TODO integrate with on the fly downloading!
_hdf5_file_econ_file = MarketConstants().hdf5_file_econ_file
_db_database_econ_file = MarketConstants().db_database_econ_file

### manual offset for certain events where Bloomberg/data vendor displays the wrong date (usually because of time differences)
### Manual offset for certain events where Bloomberg/data vendor displays the wrong date (usually because of time differences)
# You may need to add to this list
_offset_events = {'AUD-Australia Labor Force Employment Change SA.release-dt': 1}

def __init__(self, df=None):
@@ -360,7 +355,7 @@ def get_economic_event_date_time_fields(self, fields, name, event=None):
data_frame = data_frame[pandas.notnull(data_frame.index)] # eliminate any NaN dates (artifact of Excel)
ind_dt = data_frame.index

# convert yyyymmdd format to datetime
# Convert yyyymmdd format to datetime
data_frame.index = [datetime.datetime(
int((ind_dt[x] - (ind_dt[x] % 10000)) / 10000),
int(((ind_dt[x] % 10000) - (ind_dt[x] % 100)) / 100),
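The snippet above splits a yyyymmdd integer index into year/month/day components element by element. A minimal vectorized sketch of the same conversion with pandas (toy index values, not taken from the actual data file):

```python
import pandas as pd

# Hypothetical yyyymmdd integer index, as produced by the Excel/HDF5 input
idx = pd.Index([20200102, 20201002])

# Vectorized conversion: render as strings, then parse with an explicit format
dates = pd.to_datetime(idx.astype(str), format="%Y%m%d")
```

This avoids the per-element arithmetic and tends to be faster on long indices.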
@@ -414,7 +409,7 @@ def get_economic_event_date(self, name, event=None):
def get_economic_event_ret_over_custom_event_day(self, data_frame_in, name, event, start, end, lagged=False,
NYC_cutoff=10):

# get the times of events
# Get the times of events
event_dates = self.get_economic_event_date_time(name, event)

return super(EventsFactory, self).get_economic_event_ret_over_custom_event_day(data_frame_in, event_dates, name,
@@ -468,7 +463,7 @@ def get_surprise_against_intraday_moves_over_event(self, data_frame_cross_orig,
HistEconDataFactory
Provides functions for getting historical economic data. Uses aliases for tickers, to make it relatively easy to use,
rather than having to remember all the underlying vendor tickers. Can use Fred, Quandl or Bloomberg.
rather than having to remember all the underlying vendor tickers. Can use alfred, quandl or bloomberg.
The files below, contain default tickers and country groups. However, you can add whichever tickers you'd like.
- conf/all_econ_tickers.csv
@@ -500,22 +495,22 @@ def __init__(self, market_data_generator=None):
self.market_data_generator = market_data_generator

def get_economic_data_history(self, start_date, finish_date, country_group, data_type,
source='fred', cache_algo="internet_load_return"):
source='alfred', cache_algo="internet_load_return"):

# vendor_country_codes = self.fred_country_codes[country_group]
# vendor_pretty_country = self.fred_nice_country_codes[country_group]

if isinstance(country_group, list):
pretty_country_names = country_group
else:
# get all the country names in the country_group
# Get all the country names in the country_group
pretty_country_names = list(self._econ_country_groups[
self._econ_country_groups["Country Group"] == country_group]['Country'])

# construct the pretty tickers
# Construct the pretty tickers
pretty_tickers = [x + '-' + data_type for x in pretty_country_names]

# get vendor tickers
# Get vendor tickers
vendor_tickers = []

for pretty_ticker in pretty_tickers:
@@ -535,17 +530,17 @@ def get_economic_data_history(self, start_date, finish_date, country_group, data
if source == 'bloomberg': vendor_fields = ['PX_LAST']

md_request = MarketDataRequest(
start_date=start_date, # start date
finish_date=finish_date, # finish date
start_date=start_date, # start date
finish_date=finish_date, # finish date
category='economic',
freq='daily', # intraday data
data_source=source, # use Bloomberg as data source
freq='daily', # daily data
data_source=source, # use Bloomberg as data source
cut='LOC',
tickers=pretty_tickers,
fields=['close'], # which fields to download
fields=['close'], # which fields to download
vendor_tickers=vendor_tickers,
vendor_fields=vendor_fields, # which Bloomberg fields to download
cache_algo=cache_algo) # how to return data
vendor_fields=vendor_fields, # which Bloomberg/data vendor fields to download
cache_algo=cache_algo) # how to return data

return self.market_data_generator.fetch_market_data(md_request)

7 changes: 4 additions & 3 deletions finmarketpy/economics/marketliquidity.py
@@ -26,7 +26,7 @@ def __init__(self):
self.logger = LoggerManager().getLogger(__name__)
return

def calculate_spreads(self, data_frame, asset, bid_field = 'bid', ask_field = 'ask'):
def calculate_spreads(self, data_frame, asset, bid_field='bid', ask_field='ask'):
if isinstance(asset, str): asset = [asset]

cols = [x + '.spread' for x in asset]
@@ -38,7 +38,7 @@ def calculate_spreads(self, data_frame, asset, bid_field = 'bid', ask_field = 'a

return data_frame_spreads
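The spread calculation behind calculate_spreads is simply ask minus bid per asset, stored under a '.spread' suffix. A minimal sketch with toy quotes (column names and values assumed):

```python
import pandas as pd

# Toy bid/ask quotes for a single hypothetical asset
df = pd.DataFrame({"EURUSD.bid": [1.1000, 1.1002],
                   "EURUSD.ask": [1.1001, 1.1004]})

# Spread = ask - bid, written back under a '.spread' suffix
df["EURUSD.spread"] = df["EURUSD.ask"] - df["EURUSD.bid"]
```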

def calculate_tick_count(self, data_frame, asset, freq = '1h'):
def calculate_tick_count(self, data_frame, asset, freq='1h'):
if isinstance(asset, str): asset = [asset]

data_frame_tick_count = data_frame.resample(freq, how='count').dropna()
@@ -48,6 +48,7 @@ def calculate_tick_count(self, data_frame, asset, freq = '1h'):

return data_frame_tick_count
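Note that `resample(freq, how='count')` used above is the old pandas spelling; later pandas versions use `.resample(freq).count()`. A sketch of the hourly tick-count idea with toy timestamps (values assumed):

```python
import pandas as pd

# Hypothetical tick timestamps falling into two hourly buckets
idx = pd.to_datetime(["2020-01-01 09:00:05",
                      "2020-01-01 09:30:00",
                      "2020-01-01 10:15:00"])
ticks = pd.Series(1.0, index=idx)

# Count the number of ticks per hour (modern pandas resample API)
tick_count = ticks.resample("1h").count()
```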


if __name__ == '__main__':
# see examples
pass
pass
42 changes: 23 additions & 19 deletions finmarketpy/economics/seasonality.py
@@ -31,7 +31,7 @@ def __init__(self):
self.logger = LoggerManager().getLogger(__name__)
return

def time_of_day_seasonality(self, data_frame, years = False):
def time_of_day_seasonality(self, data_frame, years=False):

calculations = Calculations()

@@ -46,7 +46,8 @@ def time_of_day_seasonality(self, data_frame, years = False):
commonman = CommonMan()

for i in year:
temp_seasonality = calculations.average_by_hour_min_of_day_pretty_output(data_frame[data_frame.index.year == i])
temp_seasonality = calculations.average_by_hour_min_of_day_pretty_output(
data_frame[data_frame.index.year == i])

temp_seasonality.columns = commonman.postfix_list(temp_seasonality.columns.values, " " + str(i))

@@ -58,29 +59,31 @@
return intraday_seasonality
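The averaging that underlies time_of_day_seasonality can be sketched with a plain hour-of-day groupby; this is a toy illustration of the idea, not the internals of `average_by_hour_min_of_day_pretty_output`:

```python
import numpy as np
import pandas as pd

# Two days of hourly toy "returns"
rng = pd.date_range("2020-01-01", periods=48, freq="h")
df = pd.DataFrame({"ret": np.arange(48.0)}, index=rng)

# Average value for each hour of the day across the whole sample
by_hour = df.groupby(df.index.hour).mean()
```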

def bus_day_of_month_seasonality_from_prices(self, data_frame,
month_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], cum = True,
cal = "FX", partition_by_month = True, add_average = False, resample_freq = 'B'):
month_list=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], cum=True,
cal="FX", partition_by_month=True, add_average=False,
resample_freq='B'):

return self.bus_day_of_month_seasonality(self, data_frame,
month_list = month_list, cum = cum,
cal = cal, partition_by_month = partition_by_month,
add_average = add_average, price_index = True, resample_freq=resample_freq)
month_list=month_list, cum=cum,
cal=cal, partition_by_month=partition_by_month,
add_average=add_average, price_index=True, resample_freq=resample_freq)

def bus_day_of_month_seasonality(self, data_frame,
month_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], cum = True,
cal = "FX", partition_by_month = True, add_average = False, price_index = False, resample_freq = 'B'):
month_list=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], cum=True,
cal="FX", partition_by_month=True, add_average=False, price_index=False,
resample_freq='B'):

calculations = Calculations()
filter = Filter()

if price_index:
data_frame = data_frame.resample(resample_freq).mean() # resample into business days
data_frame = data_frame.resample(resample_freq).mean() # resample into business days
data_frame = calculations.calculate_returns(data_frame)

data_frame.index = pandas.to_datetime(data_frame.index)
data_frame = filter.filter_time_series_by_holidays(data_frame, cal)

if resample_freq == 'B': # business days
if resample_freq == 'B': # business days
monthly_seasonality = calculations.average_by_month_day_by_bus_day(data_frame, cal)
elif resample_freq == 'D': # calendar days
monthly_seasonality = calculations.average_by_month_day_by_day(data_frame)
@@ -91,7 +94,7 @@ def bus_day_of_month_seasonality(self, data_frame,
monthly_seasonality = monthly_seasonality.unstack(level=0)

if add_average:
monthly_seasonality['Avg'] = monthly_seasonality.mean(axis=1)
monthly_seasonality['Avg'] = monthly_seasonality.mean(axis=1)

if cum is True:
if partition_by_month:
@@ -103,17 +106,17 @@

return monthly_seasonality

def monthly_seasonality_from_prices(self, data_frame, cum = True, add_average = False):
def monthly_seasonality_from_prices(self, data_frame, cum=True, add_average=False):
return self.monthly_seasonality(data_frame, cum, add_average, price_index=True)

def monthly_seasonality(self, data_frame,
cum = True,
add_average = False, price_index = False):
cum=True,
add_average=False, price_index=False):

calculations = Calculations()

if price_index:
data_frame = data_frame.resample('BM').mean() # resample into month end
data_frame = data_frame.resample('BM').mean() # resample into month end
data_frame = calculations.calculate_returns(data_frame)

data_frame.index = pandas.to_datetime(data_frame.index)
Expand All @@ -130,8 +133,8 @@ def monthly_seasonality(self, data_frame,
monthly_seasonality = calculations.create_mult_index(monthly_seasonality)

return monthly_seasonality
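The `create_mult_index` call above presumably compounds the averaged returns into a cumulative index; a sketch of that compounding idea (assuming simple multiplicative compounding, not the actual Calculations internals):

```python
import pandas as pd

# Toy average monthly returns
rets = pd.Series([0.01, -0.02, 0.03])

# Compound into a cumulative multiplicative index starting from the first period
index = (1 + rets).cumprod()
```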
def adjust_rolling_seasonality(self, data_frame, window = None, likely_period = None):

def adjust_rolling_seasonality(self, data_frame, window=None, likely_period=None):
"""Adjusted time series which exhibit strong seasonality. If time series do not exhibit any seasonality will return
NaN values.
@@ -157,7 +160,7 @@

return data_frame

def _remove_seasonality(self, series, likely_period = None):
def _remove_seasonality(self, series, likely_period=None):
from seasonal import fit_seasons, adjust_seasons

# detrend and deseasonalize
@@ -169,6 +172,7 @@ def _remove_seasonality(self, series, likely_period = None):

return adjusted[-1]


if __name__ == '__main__':
# see seasonality_examples
pass
