{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Pipeline-based processing in pypillometry"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.path.insert(0,\"..\")\n",
    "import pypillometry as pp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`pypillometry` implements a pipeline-like approach in which each operation executed on a `PupilData` object returns a modified copy of that object. This enables the \"chaining\" of commands as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "d=pp.PupilData.from_file(\"../data/test.pd\")\\\n",
    "    .blinks_detect()\\\n",
    "    .blinks_merge()\\\n",
    "    .lowpass_filter(3)\\\n",
    "    .downsample(50)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This command loads a data file (`test.pd`), detects blinks in the signal, merges short, successive blinks, applies a 3 Hz low-pass filter and finally downsamples the signal to 50 Hz. The final result of this processing pipeline is stored in the object `d`.\n",
    "\n",
    "Here, for better readability, each operation is placed on a separate line. For that to work, Python must be told that the statement continues on the next line, which is done by putting a backslash `\\` at the end of each (non-final) line.\n",
    "\n",
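    "Trailing backslashes are not the only option: Python also continues a statement across lines inside an open parenthesis, so the whole chain can be wrapped in `(...)` instead. A minimal sketch of that style, using a built-in `str` as a stand-in for a `PupilData` object (an assumption made so that the example runs without any data file):\n",
    "\n",
    "```python\n",
    "# Parentheses allow a method chain to span several lines without backslashes.\n",
    "# A plain string stands in for a PupilData object here (illustration only).\n",
    "cleaned = (\n",
    "    '  Pupil Size  '\n",
    "    .strip()            # remove surrounding whitespace\n",
    "    .lower()            # convert to lower case\n",
    "    .replace(' ', '_')  # replace the remaining space with an underscore\n",
    ")\n",
    "print(cleaned)\n",
    "```\n",
    "\n",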
    "We can get a useful summary of the dataset and the operations applied to it by simply printing it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "PupilData(test_ro_ka_si_hu_re_vu_vi_be, 331.3KiB):\n",
      " n                 : 6001\n",
      " nmiss             : 117.2\n",
      " perc_miss         : 1.9530078320279955\n",
      " nevents           : 56\n",
      " nblinks           : 24\n",
      " ninterpolated     : 0.0\n",
      " blinks_per_min    : 11.998000333277787\n",
      " fs                : 50\n",
      " duration_minutes  : 2.0003333333333333\n",
      " start_min         : 4.00015\n",
      " end_min           : 6.0\n",
      " baseline_estimated: False\n",
      " response_estimated: False\n",
      " History:\n",
      " *\n",
      " └ reset_time()\n",
      "  └ blinks_detect()\n",
      "   └ sub_slice(4,6,units=min)\n",
      "    └ drop_original()\n",
      "     └ blinks_detect()\n",
      "      └ blinks_merge()\n",
      "       └ lowpass_filter(3)\n",
      "        └ downsample(50)\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(d)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We see that the sampling rate, the number of datapoints and more are printed automatically, along with the history of all operations applied to the dataset. This information can also be retrieved separately, in a form useful for further processing, with the function `summary()`, which returns it as a `dict`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'name': 'test_ro_ka_si_hu_re_vu_vi_be',\n",
       " 'n': 6001,\n",
       " 'nmiss': 117.2,\n",
       " 'perc_miss': 1.9530078320279955,\n",
       " 'nevents': 56,\n",
       " 'nblinks': 24,\n",
       " 'ninterpolated': 0.0,\n",
       " 'blinks_per_min': 11.998000333277787,\n",
       " 'fs': 50,\n",
       " 'duration_minutes': 2.0003333333333333,\n",
       " 'start_min': 4.00015,\n",
       " 'end_min': 6.0,\n",
       " 'baseline_estimated': False,\n",
       " 'response_estimated': False}"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The history is internally stored in `PupilData`'s `history` member and can be applied to another object for convenience. That way, a pipeline can be developed on a single dataset and later be transferred to a whole folder of other (similar) datasets.\n",
    "\n",
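    "The idea of recording operations and replaying them on another object can be sketched with a toy class. `ToySignal`, its `scale()` and `shift()` methods and its `apply_history()` are hypothetical stand-ins written for this illustration; they are not how `pypillometry` implements `PupilData` internally.\n",
    "\n",
    "```python\n",
    "import copy\n",
    "\n",
    "class ToySignal:\n",
    "    # Hypothetical stand-in for PupilData: every operation returns a modified copy.\n",
    "    def __init__(self, values):\n",
    "        self.values = list(values)\n",
    "        self.history = []  # (method name, args) tuples in the order applied\n",
    "\n",
    "    def _derive(self, name, args, values):\n",
    "        # Copy the object, store the new values and log the operation on the copy.\n",
    "        new = copy.deepcopy(self)\n",
    "        new.values = values\n",
    "        new.history.append((name, args))\n",
    "        return new\n",
    "\n",
    "    def scale(self, factor):\n",
    "        return self._derive('scale', (factor,), [v * factor for v in self.values])\n",
    "\n",
    "    def shift(self, offset):\n",
    "        return self._derive('shift', (offset,), [v + offset for v in self.values])\n",
    "\n",
    "    def apply_history(self, other):\n",
    "        # Replay the operations recorded on this object on another object.\n",
    "        for name, args in self.history:\n",
    "            other = getattr(other, name)(*args)\n",
    "        return other\n",
    "\n",
    "template = ToySignal([1, 2, 3]).scale(2).shift(1)\n",
    "replayed = template.apply_history(ToySignal([10, 20]))\n",
    "print(replayed.values)   # [21, 41]\n",
    "print(replayed.history)  # same two steps as template.history\n",
    "```\n",
    "\n",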
    "As an example, we create several \"fake\" datasets representing data from several subjects (each with 10 trials):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "nsubj=10 # number of subjects\n",
    "data={k:pp.create_fake_pupildata(ntrials=10, fs=500) for k in range(1,nsubj+1)}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `dict` `data` now contains ten `PupilData` datasets. We will now use the data from the first subject to create a pipeline of processing operations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "* fake_bomitime_ni_fu\n",
      "└ lowpass_filter(5)\n",
      " └ downsample(100)\n"
     ]
    }
   ],
   "source": [
    "template=data[1].lowpass_filter(5).downsample(100)\n",
    "template.print_history()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have stored the result of these operations in a new dataset, `template`, which contains a record of the operations applied. We can now easily apply identical operations to all the datasets using the `apply_history()` function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "* fake_kowelale_wu_ni\n",
      "└ lowpass_filter(5)\n",
      " └ downsample(100)\n"
     ]
    }
   ],
   "source": [
    "preproc_data={k:template.apply_history(d) for k,d in data.items()}\n",
    "preproc_data[5].print_history()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}