Adding Google Analytics tracking
drylks committed May 26, 2020
1 parent 2943194 commit e9abef9
Showing 12 changed files with 45 additions and 18 deletions.
7 changes: 5 additions & 2 deletions docs/conf.py
@@ -17,7 +17,7 @@

# -- Project information -----------------------------------------------------

project = 'Model-Free & Serverless ML'
project = 'KXY (Model-Free & Serverless AutoML)'
copyright = '2020, KXY Technologies, Inc.'
author = 'Dr. Yves-Laurent Kom Samo'
version = 'latest'
@@ -32,7 +32,7 @@
extensions = ['sphinx.ext.autodoc', 'sphinx.ext.coverage', 'sphinx.ext.napoleon', \
'sphinx.ext.todo', 'sphinx.ext.githubpages', 'sphinxcontrib.bibtex', \
'sphinx.ext.mathjax', 'sphinx.ext.autosectionlabel', 'nbsphinx', \
'sphinx_copybutton']
'sphinx_copybutton', 'sphinxcontrib.googleanalytics']

# imgmath_image_format = 'svg'
# imgmath_font_size = 13
@@ -78,4 +78,7 @@
nbsphinx_output_prompt = 'Out[%s]:'
source_suffix = ['.rst', '.md', '.ipynb']

# Google Analytics
googleanalytics_id = 'UA-167632834-1'
googleanalytics_enabled = True

3 changes: 3 additions & 0 deletions docs/index.rst
@@ -1,3 +1,6 @@
.. meta::
:description: The Python API to KXY, the first and only AutoML platform for pre-learning and post-learning
:keywords: AutoML, Pre-Learning, Post-Learning, KXY API, KXY Technologies


Table of Contents
4 changes: 3 additions & 1 deletion docs/latest/classification/post_learning/index.rst
@@ -1,4 +1,6 @@

.. meta::
:description: Python API for post-learning (e.g. evaluating whether a trained classification model can be improved without adding more inputs, explaining the decisions of a trained model, quantifying bias in a trained classification model etc.), using information theory.
:keywords: Classification Suboptimality, Model Explanation, Bias Quantification, Dataset Valuation, AutoML, Pre-Learning, KXY API, KXY Technologies.

=============
Post-Learning
3 changes: 3 additions & 0 deletions docs/latest/classification/pre_learning/index.rst
@@ -1,3 +1,6 @@
.. meta::
:description: Python API for evaluating the marginal value added of a dataset, input marginal importance scores, and how feasible a classification problem is, using information theory.
:keywords: Classification Feasibility, Input Importance, Dataset Valuation, AutoML, Pre-Learning, KXY API, KXY Technologies.

============
Pre-Learning
3 changes: 3 additions & 0 deletions docs/latest/finance/beta/index.rst
@@ -1,3 +1,6 @@
.. meta::
:description: Python API for nonlinear and memoryful factor analysis.
:keywords: Nonlinear Factors, Memoryful Factors, Information-Adjusted Beta, Robust Beta, KXY API, KXY Technologies.

===============
Factor Analysis
3 changes: 3 additions & 0 deletions docs/latest/finance/corr/index.rst
@@ -1,3 +1,6 @@
.. meta::
:description: Python API for nonlinear and memoryful risk analysis.
:keywords: Nonlinear Risk, Memoryful Risk, Information-Adjusted Correlation, Robust Correlation, KXY API, KXY Technologies.

=============
Risk Analysis
3 changes: 3 additions & 0 deletions docs/latest/introduction/memoryless/index.rst
@@ -1,3 +1,6 @@
.. meta::
:description: The theoretical foundation of the KXY solution to pre-learning and post-learning problems
:keywords: Pre-Learning, Post-Learning, Maximum-Entropy Principle, Input Importance, Feature Importance, KXY API, KXY Technologies, Model Explanation, Dataset Valuation, Input Importance, Feature Importance, Model Suboptimality, Model Optimality

***********************
Memoryless Observations
@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Input Importance Explained\n",
"# Detecting Useless Inputs\n",
"\n",
"In the Getting Started section, we briefly illustrated how pre-learning and post-learning classification problems can be solved using the `kxy` package. \n",
"\n",
@@ -485,7 +485,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The `kxy` package selected inputs in the same order as our qualitative analysis, and the relative magnitudes of marginal usefulness scores (`incremental_importance`) seem consistent with the impact we would expect each input to have on reducing ambiguity in each stage of our qualitative analysis above. "
"The `kxy` package selected inputs in the same order as our qualitative analysis, and the relative magnitudes of marginal usefulness scores (`Incremental Importance`) seem consistent with the impact we would expect each input to have on reducing ambiguity in each stage of our qualitative analysis above. "
]
}
],
3 changes: 3 additions & 0 deletions docs/latest/regression/post_learning/index.rst
@@ -1,3 +1,6 @@
.. meta::
:description: Python API for post-learning (e.g. evaluating whether a trained regression model can be improved without adding more inputs, explaining the decisions of a trained model, quantifying bias in a trained regression model etc.), using information theory.
:keywords: Regression Suboptimality, Model Explanation, Bias Quantification, Dataset Valuation, AutoML, Pre-Learning, KXY API, KXY Technologies.

=============
Post-Learning
4 changes: 4 additions & 0 deletions docs/latest/regression/pre_learning/index.rst
@@ -1,3 +1,7 @@
.. meta::
:description: Python API for pre-learning (e.g. evaluating the marginal value added of a dataset, input marginal importance scores, and how feasible a regression problem is) using information theory.
:keywords: Regression Feasibility, Input Importance, Dataset Valuation, AutoML, Pre-Learning, KXY API, KXY Technologies.


============
Pre-Learning
24 changes: 12 additions & 12 deletions kxy/data/dataframe.py
@@ -242,7 +242,7 @@ def classification_suboptimality(self, prediction_column, label_column, discrete



def individual_input_importance(self, label_column, input_columns=(), problem=None, space='dual', score=False):
def individual_input_importance(self, label_column, input_columns=(), problem=None, space='dual', correlation_scale=False):
"""
.. _dataframe-input-importance:
Calculates the importance of each input in the input set at solving the supervised
@@ -274,7 +274,7 @@ def individual_input_importance(self, label_column, input_columns=(), problem=No
The type of supervised learning problem. One of None (default), 'classification'
or 'regression'. When problem is None, the supervised learning problem is inferred
based on whether labels are numeric and the percentage of distinct labels.
score : bool
correlation_scale : bool
If True, then input importance scores are scaled using the transformation :math:`i \\to \\sqrt{1-e^{-2i}}`
so as to give importance scores the same scale as correlations, and provide developers with a more intuitive
understanding of the magnitude of input importance scores. This transformation is inspired by the relation
@@ -309,7 +309,7 @@ def individual_input_importance(self, label_column, input_columns=(), problem=No
for imp in p.map(self.__individual_input_importance, args):
importance.update(imp)

if score:
if correlation_scale:
importance = {col: np.sqrt(1.-min(np.exp(-2.*importance[col]), 1.)) for col in importance.keys()}

total_importance = np.sum([importance[col] for col in importance.keys() if importance[col]])
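The rescaling applied above, :math:`i \to \sqrt{1-e^{-2i}}`, inverts the bivariate-Gaussian identity :math:`I = -\frac{1}{2}\log(1-\rho^2)`, so a mutual-information score lands on the familiar scale of a correlation coefficient. A minimal standalone sketch of the transformation (illustrative only; the diff applies it inline over a dict of scores):

```python
import numpy as np

def correlation_scale(importance):
    """Map a mutual-information importance score (in nats) onto the
    scale of a Pearson correlation by inverting I = -0.5*log(1 - rho**2).
    The min(..., 1.0) guard keeps tiny negative scores from producing NaN."""
    return np.sqrt(1.0 - min(np.exp(-2.0 * importance), 1.0))

# Round trip: a Gaussian pair with correlation 0.9 carries mutual
# information I = -0.5*log(1 - 0.81), which maps back to 0.9.
i = -0.5 * np.log(1.0 - 0.9 ** 2)
print(correlation_scale(i))
```

The transformation is monotone, sends a zero score to 0, and saturates toward 1 as the score grows, which is what makes the rescaled values easy to read as correlation-like magnitudes.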
@@ -650,7 +650,7 @@ def _pre_solve(self):



def regression_input_incremental_importance(self, label_column, input_columns=(), space='dual', greedy=True, score=False):
def regression_input_incremental_importance(self, label_column, input_columns=(), space='dual', greedy=True, correlation_scale=False):
"""
Quantifies how important each input is at solving a regression problem,
taking into account possible information redundancy between inputs.
@@ -691,7 +691,7 @@ def regression_input_incremental_importance(self, label_column, input_columns=()
The type of supervised learning problem. One of None (default), 'classification'
or 'regression'. When problem is None, the supervised learning problem is inferred
based on whether labels are numeric and the percentage of distinct labels.
score : bool
correlation_scale : bool
If True, then input importance scores are scaled using the transformation :math:`i \\to \\sqrt{1-e^{-2i}}`
so as to give importance scores the same scale as correlations, and provide developers with a more intuitive
understanding of the magnitude of input importance scores. This transformation is inspired by the relation
@@ -740,7 +740,7 @@ def regression_input_incremental_importance(self, label_column, input_columns=()
remaining_columns.remove(column)

# Normalize and format as a dataframe.
if score:
if correlation_scale:
res = {col: np.sqrt(1.-min(np.exp(-2.*res[col]), 1.)) for col in res.keys()}

total_importance = np.sum([res[col] for col in res.keys() if res[col]])
@@ -757,7 +757,7 @@ def regression_input_incremental_importance(self, label_column, input_columns=()



def classification_input_incremental_importance(self, label_column, input_columns=(), space='dual', score=False):
def classification_input_incremental_importance(self, label_column, input_columns=(), space='dual', correlation_scale=False):
"""
Quantifies how important each input is at solving a classification problem,
taking into account possible information redundancy between inputs.
@@ -788,7 +788,7 @@ def classification_input_incremental_importance(self, label_column, input_column
input_columns : set, optional
The set of columns to use as inputs. When input_columns is the empty set,
all columns except for label_column are used as inputs.
score : bool
correlation_scale : bool
If True, then input importance scores are scaled using the transformation :math:`i \\to \\sqrt{1-e^{-2i}}`
so as to give importance scores the same scale as correlations, and provide developers with a more intuitive
understanding of the magnitude of input importance scores. This transformation is inspired by the relation
@@ -836,7 +836,7 @@ def classification_input_incremental_importance(self, label_column, input_column
break

# Step 3: Normalize and format as a dataframe.
if score:
if correlation_scale:
res = {col: np.sqrt(1.-min(np.exp(-2.*res[col]), 1.)) for col in res.keys()}

total_importance = np.sum([res[col] for col in res.keys() if res[col]])
@@ -875,7 +875,7 @@ def __classification_input_incremental_importance(self, args):



def incremental_input_importance(self, label_column, input_columns=(), space='dual', greedy=True, score=False):
def incremental_input_importance(self, label_column, input_columns=(), space='dual', greedy=True, correlation_scale=False):
"""
Returns :code:`DataFrame.classification_input_incremental_importance` or
:code:`DataFrame.regression_input_incremental_importance` depending on whether the label
@@ -886,11 +886,11 @@ def incremental_input_importance(self, label_column, input_columns=(), space='du
if problem == 'classification':
self.adjust_quantized_values()
return self.classification_input_incremental_importance(label_column, input_columns=input_columns, \
space=space, score=score)
space=space, correlation_scale=correlation_scale)

else:
return self.regression_input_incremental_importance(label_column, input_columns=input_columns, \
space=space, greedy=greedy, score=score)
space=space, greedy=greedy, correlation_scale=correlation_scale)
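The dispatch above hinges on the inferred problem type; per the docstrings, it is guessed from whether the labels are numeric and the percentage of distinct labels. A hedged sketch of one such heuristic (the threshold and exact rule here are illustrative, not the package's actual logic):

```python
import numpy as np

def infer_problem_type(labels, max_distinct_ratio=0.05):
    """Guess 'classification' vs. 'regression' from a 1-D array of labels.
    Heuristic only: non-numeric labels, or numeric labels with few distinct
    values relative to the sample size, are treated as class labels."""
    labels = np.asarray(labels)
    if not np.issubdtype(labels.dtype, np.number):
        return 'classification'
    distinct_ratio = len(np.unique(labels)) / len(labels)
    return 'classification' if distinct_ratio <= max_distinct_ratio else 'regression'

print(infer_problem_type(['cat', 'dog', 'cat', 'dog']))  # classification
print(infer_problem_type(np.linspace(0.0, 1.0, 1000)))   # regression
```

With the type inferred, the method simply forwards to the classification or regression variant, threading the renamed `correlation_scale` keyword through unchanged.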



2 changes: 1 addition & 1 deletion setup.py
@@ -8,7 +8,7 @@
import sys
sys.path.append('.')
from setuptools import setup, find_packages
version = "0.0.17"
version = "0.0.18"

setup(name="kxy",
version=version,
