
Commit

Update notebooks and toc
j1c committed Apr 26, 2023
1 parent cf1c2a6 commit 18b7dc4
Showing 4 changed files with 282 additions and 32 deletions.
14 changes: 14 additions & 0 deletions README.md
@@ -1,3 +1,17 @@
# Independence Testing for Multivariate Time Series

Code accompanying the publication: [Independence Testing for Multivariate Time Series](https://arxiv.org/abs/1908.06486).

## Abstract

Complex data structures such as time series are increasingly prevalent in modern data science problems. A fundamental question is whether two such time series have a statistically significant relationship. Many current approaches rely on parametric assumptions about the random processes, detect only linear associations, require multiple tests, or sacrifice power in high-dimensional and nonlinear settings. The distribution of any test statistic under the null hypothesis is challenging to estimate, as the standard permutation test is typically invalid. This study combines distance correlation (Dcorr) and multiscale graph correlation (MGC) from the independence testing literature with block permutation from time series analysis to address these challenges. The proposed nonparametric procedure is asymptotically valid, consistent for dependence testing under stationary time series, and able to estimate the optimal lag that maximizes the dependence. It eliminates the need for multiple testing and exhibits superior power in high-dimensional, low-sample-size, and nonlinear settings. An analysis of neural connectivity with fMRI data reveals linear dependence of signals within the visual network and the default mode network, and nonlinear relationships in other networks. This work provides a primary data analysis tool with open-source code, impacting a wide range of scientific disciplines.
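The block permutation idea mentioned above can be sketched in a few lines. This is an illustrative numpy version, not the repository's implementation; the block size is an arbitrary choice here, and in practice it would be tuned to the dependence length of the series:

```python
import numpy as np

def block_permute(y, block_size, rng=None):
    """Permute a time series in contiguous blocks, preserving
    short-range dependence within each block while breaking
    the association between the two series being tested."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(y)
    # start indices of consecutive blocks covering the series
    starts = np.arange(0, n, block_size)
    order = rng.permutation(len(starts))
    permuted = np.concatenate([y[s : s + block_size] for s in starts[order]])
    return permuted[:n]

# toy usage: shuffle blocks of 5 within a series of length 20
y = np.arange(20)
yp = block_permute(y, block_size=5, rng=np.random.default_rng(0))
```

Recomputing the test statistic over many such permutations of one series yields a null distribution that respects the autocorrelation structure, which is why the naive element-wise permutation test is invalid here.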

## Repo Structure

## Guide to using the repository

- Navigate to a directory where you want to store the project, and clone this repo:

```
git clone https://github.com/neurodata/bilateral-connectome
```
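The `data.ipynb` notebook generates its simulation settings with `hyppo.tools.ts_sim`. As a rough illustration of the first setting (two mutually independent AR(1) processes), here is a minimal numpy sketch, assuming the standard AR(1) recursion with coefficient `phi` rather than the repository's exact parameterization:

```python
import numpy as np

def indep_ar(n, phi=0.5, rng=None):
    """Two mutually independent AR(1) series:
    x_t = phi * x_{t-1} + eps_t, and likewise for y
    with its own independent noise."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.zeros(n)
    y = np.zeros(n)
    eps_x = rng.standard_normal(n)
    eps_y = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps_x[t]
        y[t] = phi * y[t - 1] + eps_y[t]
    return x, y

x, y = indep_ar(200, phi=0.5, rng=np.random.default_rng(1))
```

Each series is autocorrelated, but the two are independent of each other, so a valid test should fail to reject the null at roughly the nominal rate in this setting.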
3 changes: 3 additions & 0 deletions coverpage.md
@@ -0,0 +1,3 @@
# Abstract

Complex data structures such as time series are increasingly prevalent in modern data science problems. A fundamental question is whether two such time series have a statistically significant relationship. Many current approaches rely on parametric assumptions about the random processes, detect only linear associations, require multiple tests, or sacrifice power in high-dimensional and nonlinear settings. The distribution of any test statistic under the null hypothesis is challenging to estimate, as the standard permutation test is typically invalid. This study combines distance correlation (Dcorr) and multiscale graph correlation (MGC) from the independence testing literature with block permutation from time series analysis to address these challenges. The proposed nonparametric procedure is asymptotically valid, consistent for dependence testing under stationary time series, and able to estimate the optimal lag that maximizes the dependence. It eliminates the need for multiple testing and exhibits superior power in high-dimensional, low-sample-size, and nonlinear settings. An analysis of neural connectivity with fMRI data reveals linear dependence of signals within the visual network and the default mode network, and nonlinear relationships in other networks. This work provides a primary data analysis tool with open-source code, impacting a wide range of scientific disciplines.
255 changes: 255 additions & 0 deletions data.ipynb
@@ -0,0 +1,255 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "746578a4-88dd-4cec-b437-d6f1d0865ef6",
"metadata": {},
"source": [
"# Generating data for analysis"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "f12ed4de-a7b6-4090-9eea-13fefdce9d1e",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import scipy as sp\n",
"import scipy.io  # 'import scipy' alone does not load the io submodule used below\n",
"from hyppo.tools import ts_sim\n",
"\n",
"TS_SIMS = [\n",
" \"indep_ar\",\n",
" \"cross_corr_ar\",\n",
" \"nonlinear_process\",\n",
" \"extinct_gaussian_process\",\n",
"]\n",
"\n",
"p = \"./data/\""
]
},
{
"cell_type": "markdown",
"id": "91de8db0-7809-4dd8-89a3-d3b7c098aab6",
"metadata": {},
"source": [
"## Generate Experiment 1 - Independent AR(1) with increasing sample size."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "03d96d14-c068-471b-b7c1-80e17621f73e",
"metadata": {},
"outputs": [],
"source": [
"fname = \"1-independent_ar_n\"\n",
"\n",
"n = 200\n",
"reps = 300\n",
"\n",
"np.random.seed(1)\n",
"\n",
"datas = [ts_sim(\"indep_ar\", n) for _ in range(reps)]\n",
"\n",
"X = np.hstack([data[0] for data in datas])\n",
"Y = np.hstack([data[1] for data in datas])\n",
"\n",
"savedict = {\n",
" 'X' : X,\n",
" 'Y' : Y,\n",
"}\n",
"\n",
"# save to disk\n",
"sp.io.savemat(f'{p}{fname}.mat', savedict, do_compression=True)"
]
},
{
"cell_type": "markdown",
"id": "8ba61b56-eecf-4ea2-9ea1-cf58b704257d",
"metadata": {},
"source": [
"## Generate Experiment 2 - Independent AR(1) with increasing phi."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d6a02659-109a-451f-9d06-773e0726e9b9",
"metadata": {},
"outputs": [],
"source": [
"fname = \"2-independent_ar_phi\"\n",
"\n",
"n = 1200\n",
"reps = 300\n",
"phis = np.arange(0.2, 1, 0.025)\n",
"\n",
"np.random.seed(1)\n",
"\n",
"Xs = []\n",
"Ys = []\n",
"\n",
"for phi in phis:\n",
" datas = [ts_sim(\"indep_ar\", n, phi=float(phi)) for _ in range(reps)]\n",
" Xs.append(np.hstack([data[0] for data in datas]))\n",
" Ys.append(np.hstack([data[1] for data in datas]))\n",
"\n",
"\n",
"X = np.stack(Xs)\n",
"Y = np.stack(Ys)\n",
"\n",
"savedict = {\n",
" 'X' : X,\n",
" 'Y' : Y,\n",
" 'phi': phis\n",
"}\n",
"\n",
"# save to disk\n",
"sp.io.savemat(f'{p}{fname}.mat', savedict, do_compression=True)"
]
},
{
"cell_type": "markdown",
"id": "1d804f8d-a21c-4c3d-a5d0-dc918ebea269",
"metadata": {},
"source": [
"## Generate Experiment 3 - Linear cross correlated AR(1) with increasing sample size"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "4cc407eb-14a6-4441-b8da-32afca1f6465",
"metadata": {},
"outputs": [],
"source": [
"fname = \"3-linear_ar\"\n",
"\n",
"n = 200\n",
"reps = 300\n",
"\n",
"np.random.seed(1)\n",
"\n",
"datas = [ts_sim(\"cross_corr_ar\", n) for _ in range(reps)]\n",
"\n",
"X = np.hstack([data[0] for data in datas])\n",
"Y = np.hstack([data[1] for data in datas])\n",
"\n",
"savedict = {\n",
" 'X' : X,\n",
" 'Y' : Y,\n",
"}\n",
"\n",
"# save to disk\n",
"sp.io.savemat(f'{p}{fname}.mat', savedict, do_compression=True)"
]
},
{
"cell_type": "markdown",
"id": "a46bc6e9-a0dc-458a-bb24-e0453e1ab000",
"metadata": {},
"source": [
"## Generate Experiment 4 - Non-linearly cross correlated AR(1) with increasing sample size"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "4c232a9d-5c05-4ed9-8a51-1daef8efb0f1",
"metadata": {},
"outputs": [],
"source": [
"fname = \"4-nonlinear_ar\"\n",
"\n",
"n = 200\n",
"reps = 300\n",
"\n",
"np.random.seed(1)\n",
"\n",
"datas = [ts_sim(\"nonlinear_process\", n) for _ in range(reps)]\n",
"\n",
"X = np.hstack([data[0] for data in datas])\n",
"Y = np.hstack([data[1] for data in datas])\n",
"\n",
"savedict = {\n",
" 'X' : X,\n",
" 'Y' : Y,\n",
"}\n",
"\n",
"# save to disk\n",
"sp.io.savemat(f'{p}{fname}.mat', savedict, do_compression=True)"
]
},
{
"cell_type": "markdown",
"id": "5d60065b-fc9e-4dbf-958c-137558de83fb",
"metadata": {},
"source": [
"## Generate Experiment 5 - Extinct Gaussian process with increasing sample size"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "32aaef0f-8d47-4b8c-9a29-e1a749996063",
"metadata": {},
"outputs": [],
"source": [
"fname = \"5-extinct_gaussian\"\n",
"\n",
"n = 200\n",
"reps = 300\n",
"\n",
"np.random.seed(1)\n",
"\n",
"datas = [ts_sim(\"extinct_gaussian_process\", n) for _ in range(reps)]\n",
"\n",
"X = np.hstack([data[0] for data in datas])\n",
"Y = np.hstack([data[1] for data in datas])\n",
"\n",
"savedict = {\n",
" 'X' : X,\n",
" 'Y' : Y,\n",
"}\n",
"\n",
"# save to disk\n",
"sp.io.savemat(f'{p}{fname}.mat', savedict, do_compression=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "98cf8254-3a4e-4421-a272-5a0eb81d2c37",
"metadata": {},
"outputs": [],
"source": [
"# Generate Experiment 6 - optimal lag estimation\n",
"\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
42 changes: 10 additions & 32 deletions figure1.ipynb
@@ -2,44 +2,22 @@
"cells": [
{
"cell_type": "code",
"execution_count": 2,
"id": "3522aa07-8839-4272-93b5-c8ed76e934e1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Obtaining file:///Users/j1c/git/mgcx/hyppo\n",
" Preparing metadata (setup.py) ... \u001b[?25ldone\n",
"\u001b[?25hRequirement already satisfied: numpy>=1.17 in /Users/j1c/miniconda3/envs/mgcx/lib/python3.9/site-packages (from hyppo==0.3.2) (1.23.5)\n",
"Requirement already satisfied: scipy>=1.4.0 in /Users/j1c/miniconda3/envs/mgcx/lib/python3.9/site-packages (from hyppo==0.3.2) (1.10.1)\n",
"Requirement already satisfied: numba>=0.46 in /Users/j1c/miniconda3/envs/mgcx/lib/python3.9/site-packages (from hyppo==0.3.2) (0.56.4)\n",
"Requirement already satisfied: scikit-learn>=0.19.1 in /Users/j1c/miniconda3/envs/mgcx/lib/python3.9/site-packages (from hyppo==0.3.2) (1.2.2)\n",
"Requirement already satisfied: autograd>=1.3 in /Users/j1c/miniconda3/envs/mgcx/lib/python3.9/site-packages (from hyppo==0.3.2) (1.5)\n",
"Requirement already satisfied: future>=0.15.2 in /Users/j1c/miniconda3/envs/mgcx/lib/python3.9/site-packages (from autograd>=1.3->hyppo==0.3.2) (0.18.3)\n",
"Requirement already satisfied: llvmlite<0.40,>=0.39.0dev0 in /Users/j1c/miniconda3/envs/mgcx/lib/python3.9/site-packages (from numba>=0.46->hyppo==0.3.2) (0.39.1)\n",
"Requirement already satisfied: setuptools in /Users/j1c/miniconda3/envs/mgcx/lib/python3.9/site-packages (from numba>=0.46->hyppo==0.3.2) (66.0.0)\n",
"Requirement already satisfied: threadpoolctl>=2.0.0 in /Users/j1c/miniconda3/envs/mgcx/lib/python3.9/site-packages (from scikit-learn>=0.19.1->hyppo==0.3.2) (2.2.0)\n",
"Requirement already satisfied: joblib>=1.1.1 in /Users/j1c/miniconda3/envs/mgcx/lib/python3.9/site-packages (from scikit-learn>=0.19.1->hyppo==0.3.2) (1.1.1)\n",
"Installing collected packages: hyppo\n",
" Attempting uninstall: hyppo\n",
" Found existing installation: hyppo 0.3.2\n",
" Uninstalling hyppo-0.3.2:\n",
" Successfully uninstalled hyppo-0.3.2\n",
" Running setup.py develop for hyppo\n",
"Successfully installed hyppo-0.3.2\n"
]
}
],
"execution_count": 1,
"id": "86756329-8d2b-4794-998c-c7cc3174d8d9",
"metadata": {
"tags": [
"hide-cell"
]
},
"outputs": [],
"source": [
"!pip install -e ./hyppo"
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3b289764-53f8-4385-b763-1f7880070cbe",
"id": "0f8f0546-9cb7-4b0c-9cf4-3c4043cfada2",
"metadata": {},
"outputs": [],
"source": []
