Hauke Jürgen Mönck edited this page Mar 19, 2018 · 7 revisions

Analyzing Data

Along with the BioTracker comes a Python tool named "Data Analysis Module" for evaluating the CSV output of the trackers. Its simple user interface provides an easy way to calculate a number of metrics for arbitrary CSV files using user-annotated columns; for BioTracker-conformant output the annotation is done automatically. The tool can calculate metrics such as speed, inter-individual distance and transfer entropy, for all individuals or for pairs of individuals respectively, and write them to new CSV files. There is also the option to filter arbitrary columns.

File selection


Loading a CSV file for analysis

  • "Browse" opens a file selection dialogue which allows the user to choose a raw data file. The only accepted file type is CSV.
  • "File delimiter" sets the column separator of the CSV file. By default this is a comma. Other accepted separators are semicolon and tab.
  • With "Skip rows" the user can choose to skip the first n rows of the raw data file. By default this value is set to 0 (no rows skipped). Lines beginning with # are interpreted as comments and are always skipped. Column names or headers are ignored, as are rows with missing values. A row has missing values if it contains fewer values than the header indicates. An empty field, i.e. no text between two separators, is not considered missing.
  • If "view" is unchecked, the data display is skipped and the data is loaded automatically using the most recently defined column numbers/names. This option should only be used when analyzing CSV files with the same column order, because loading a CSV file with the wrong column order may cause errors. It is generally recommended to keep "view" checked and not skip the column selection.
  • "Load" opens a display of the raw data if "view" is checked, where the user can assign names to the columns (see 1.1). If "view" is unchecked, the data is loaded directly into the interface.
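
The loading rules above can be sketched in a few lines of Python. This is only an illustration and not the module's actual code: the function name `load_rows` is hypothetical, and the sketch assumes that "Skip rows" is applied before comment lines are dropped and that the first remaining line is the header.

```python
import csv


def load_rows(text, delimiter=",", skip_rows=0):
    """Parse CSV text roughly following the loading rules described above."""
    lines = text.splitlines()[skip_rows:]                    # "Skip rows": drop the first n lines
    lines = [ln for ln in lines if not ln.startswith("#")]   # comment lines are always skipped
    rows = list(csv.reader(lines, delimiter=delimiter))
    if not rows:
        return [], []
    header, body = rows[0], rows[1:]                         # assumption: first remaining line is the header
    # A row has "missing values" only if it has fewer fields than the header.
    # An empty field between two separators is kept as an empty string.
    complete = [row for row in body if len(row) >= len(header)]
    return header, complete


sample = "# tracker output\nframe;time;x;y\n0;0;1.5;2.0\n1;40;;2.1\n2;80;1.7\n"
header, rows = load_rows(sample, delimiter=";")
for row in rows:
    print(row)
```

In this sample the row `1;40;;2.1` is kept because its empty field still counts as a value, while `2;80;1.7` is dropped because it has fewer fields than the header.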


Selecting the format of the CSV file.

1.1. Display section: shows the raw data, excluding comment lines, headers, and rows the user decided to skip. Only the first N lines will be shown to avoid delays.

1.2. Column selection area: assigns columns of the original data file to the relevant variables. In the current example, the frame numbers are in column 1, the time stamps in milliseconds in column 2, the x-component of the tracked animal's position in column 7, etc. To change these settings see 1.3.1 and 1.3.2.

1.3. Change and Finalize area:

  • "Add parameters" opens a menu where the time and angle format can be set (see 1.3.1).
  • "Add/Remove Agents" opens a menu where agents can be added, removed and renamed (see 1.3.2).
  • "OK" loads the selected columns for analysis, saves the settings and returns to the main window.


Setting miscellaneous options for the CSV file to be loaded

1.3.1
Check boxes set the desired time format (datetime, milliseconds or seconds) and angle format (degrees (deg) or radians (rad)). Additional categories can also be selected but are currently not processed. Clicking "Apply" saves these settings and returns to the table view.
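
As a rough illustration of what the time and angle formats imply, the following sketch normalizes raw column values. It is not taken from the module: the function names are hypothetical, datetime values are assumed to be ISO 8601 strings, and datetime stamps are assumed to be expressed as seconds elapsed since the first sample.

```python
import math
from datetime import datetime


def to_seconds(values, time_format):
    """Convert raw time stamps (strings) to seconds."""
    if time_format == "seconds":
        return [float(v) for v in values]
    if time_format == "milliseconds":
        return [float(v) / 1000.0 for v in values]
    if time_format == "datetime":
        # Assumption: ISO 8601 strings, measured relative to the first sample.
        stamps = [datetime.fromisoformat(v) for v in values]
        return [(s - stamps[0]).total_seconds() for s in stamps]
    raise ValueError(f"unknown time format: {time_format}")


def to_radians(values, angle_format):
    """Convert raw angles (strings) to radians."""
    if angle_format == "rad":
        return [float(v) for v in values]
    if angle_format == "deg":
        return [math.radians(float(v)) for v in values]
    raise ValueError(f"unknown angle format: {angle_format}")


print(to_seconds(["0", "500", "1000"], "milliseconds"))
print(to_seconds(["2018-02-22 10:00:00", "2018-02-22 10:00:01.500"], "datetime"))
print(to_radians(["0", "180"], "deg"))
```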


Selecting amount and names of agents.

1.3.2
The line called "Agents" specifies the number of agents to be analysed. Clicking the "Change" button lets the user give meaningful names to these agents; otherwise they will have the default names agent0, agent1, etc. Clicking "OK" saves the changes and returns to the table window.

Track Time Information


View for changing start and stop time of observed data.

This window displays the start time, stop time and duration of the experiment (in seconds and frames) and allows the user to change the relevant time range. Independent of the selected time format, the display here only uses seconds (s).

"Change" opens a window where start and stop time can be adjusted with a slider:


Example of plotting an agent's speed over time

World Boundaries

Displays the minimum and maximum x and y values of the raw data and allows the user to change the relevant region. All displayed values are interpreted as cm.
"Change" allows the user to edit the boundary values, e.g. to perform the analysis in a subregion.
"Add Subregions" opens a window where subregions of the main area can be defined. For those areas the same analysis is performed as for the main region; the results appear as additional rows in info.csv (see Results section).

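Restricting the data to a time range and to world boundaries can be pictured as a simple filter over the samples. The sketch below is an assumption-laden illustration and not the module's code: the names `in_range` and `crop`, the dictionary layout of a sample, and the inclusive bounds are all hypothetical, and the tool may treat samples outside the region differently (for example by masking rather than dropping them).

```python
def in_range(sample, start, stop, bounds):
    """True if the sample lies inside the time range and the rectangular region."""
    x_min, x_max, y_min, y_max = bounds
    return (start <= sample["seconds"] <= stop
            and x_min <= sample["x"] <= x_max
            and y_min <= sample["y"] <= y_max)


def crop(samples, start, stop, bounds):
    """Keep only the samples inside the selected time range and region."""
    return [s for s in samples if in_range(s, start, stop, bounds)]


samples = [
    {"seconds": 0.0, "x": 10.0, "y": 10.0},
    {"seconds": 1.0, "x": 60.0, "y": 20.0},
    {"seconds": 2.0, "x": 30.0, "y": 25.0},
    {"seconds": 9.0, "x": 30.0, "y": 25.0},
]
# Main region plus one subregion, each given as (x_min, x_max, y_min, y_max) in cm.
regions = {"main": (0.0, 100.0, 0.0, 50.0), "left_half": (0.0, 50.0, 0.0, 50.0)}
selected = {name: crop(samples, 0.0, 5.0, b) for name, b in regions.items()}
for name, kept in selected.items():
    print(name, len(kept))
```

Running the same function once per region mirrors the idea that subregions receive the same analysis as the main region.
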
Filtering and Smoothing

"Select Filter" selects a filter (currently only a median filter with k = 5).
"Apply Smoothing" applies the selected filter to the x and y components of each agent.

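For readers unfamiliar with median filtering, here is a minimal sketch of a sliding median with window size k = 5. It is not the module's implementation: the function name is hypothetical, and shrinking the window at the start and end of the series is an assumed edge-handling choice.

```python
from statistics import median


def median_filter(values, k=5):
    """Replace each value by the median of the k values centred on it."""
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd number")
    half = k // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)                  # assumption: window shrinks at the edges
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


x = [1.0, 1.0, 9.0, 1.0, 1.0, 1.0, 1.0]       # one outlier, e.g. a tracking glitch
print(median_filter(x))
```

The single outlier at index 2 is removed because it never forms the median of any window. The same function would be applied separately to the x and y series of each agent.
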
Inspect Data

Generates plots of the data with respect to the currently selected spatial and temporal boundaries. The plots are shown in a separate window and can be edited and saved from there.
The first dropdown menu selects the type of plot to be generated. Available options are: "Trajectory", "Timeline", "Histogram" and "Boxplot".
The options in the second dropdown menu depend on the selection in the first. For example, timelines are available for the parameters "speed", "distance" and "angle".
"Inspect" opens a display window with the desired plot.

Finalize

"Options" opens a window with three tabs: “Plots”, “Folders” and “Transfer Entropy”.

  • “Plots” allows the user to select plots that will be automatically saved with the results files. Options are all, none or an individual selection.
  • “Folders” provides an overview of the currently used default folders, e.g. for data or results.
  • “Transfer Entropy” allows the user to select whether to calculate transfer entropy for the currently selected data. Since this calculation is only applicable to two agents and can require considerable time and computing power, it must be explicitly selected by the user.

"Save" opens a folder selection dialogue allowing the user to specify a results directory. A folder called "BioTrackerAnalysis" + current date / number (e.g. BioTrackerAnalysis_2018_02_22/008) will be created. The files in this folder are (1) timelines.csv, (2) info.csv and (3) plots. For further description see below.

Results

Info.csv contains basic data and parameters concerning the experiment setup, single agents and pairs of agents. In the following these results are described in more detail:

1.a. General information

Source Name of the original datafile
x_min Minimum value of all x-positions
x_max Maximum value of all x-positions
y_min Minimum value of all y-positions
y_max Maximum value of all y-positions
start Start time in seconds
stop Stop time in seconds
filtered ‘True’ or ‘False’ depending on whether filtering was performed

1.b. Agent-specific information

trajectory_length Total length of trajectory covered
speed_mean Mean of agent’s speed
speed_var Variance of agent’s speed
speed_min Minimum value of agent’s speed
speed_25% 25th percentile of agent’s speed
speed_median Median of agent’s speed (i.e. 50th percentile)
speed_75% 75th percentile of agent’s speed
speed_max Maximum value of agent’s speed
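
The agent-specific values can be reproduced approximately with Python's standard library. This is a sketch under assumptions, not the module's code: the function names are hypothetical, the variance is taken as the sample variance, and the percentiles use linear interpolation (`method="inclusive"`); the tool may use different conventions.

```python
import math
from statistics import mean, median, quantiles, variance


def trajectory_length(xs, ys):
    """Sum of the straight-line distances between consecutive positions."""
    return sum(math.hypot(x1 - x0, y1 - y0)
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


def speed_summary(speeds):
    """Summary statistics using the key names listed in info.csv."""
    q25, _, q75 = quantiles(speeds, n=4, method="inclusive")
    return {
        "speed_mean": mean(speeds),
        "speed_var": variance(speeds),    # assumption: sample variance
        "speed_min": min(speeds),
        "speed_25%": q25,
        "speed_median": median(speeds),
        "speed_75%": q75,
        "speed_max": max(speeds),
    }


summary = speed_summary([1.0, 2.0, 3.0, 4.0, 5.0])
length = trajectory_length([0.0, 3.0, 3.0], [0.0, 4.0, 4.0])
print(summary)
print(length)
```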

1.c. Information about pairs of agents

Key Interpretation
dist_mean Mean of distance
dist_var Variance of distance
dist_min Minimum of distance
dist_25% 25th percentile of distance
dist_median Median of distance (i.e. 50th percentile)
dist_75% 75th percentile of distance
dist_max Maximum value of distance
closer_5cm_(s) Time in seconds the agents spent closer to each other than a threshold distance (default thresholds: 5, 10, 15 and 20 cm)
closer_5cm_(%) Percentage of the selected time range that the agents spent closer to each other than a threshold distance (same default thresholds as above)
Correlation of speeds Not yet implemented
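
The threshold-based proximity values can be sketched as follows. This is an illustration rather than the module's code: the function name is hypothetical, each sample is assumed to count for the time until the next sample, and the key names for the 10, 15 and 20 cm thresholds are inferred from the `closer_5cm_(s)` pattern.

```python
def proximity_summary(seconds, distances, thresholds=(5, 10, 15, 20)):
    """Time (s) and share (%) of the time range spent closer than each threshold (cm)."""
    total = seconds[-1] - seconds[0]
    result = {}
    for limit in thresholds:
        # Assumption: sample i is representative of the interval up to sample i + 1.
        close = sum(t1 - t0
                    for t0, t1, d in zip(seconds, seconds[1:], distances)
                    if d < limit)
        result[f"closer_{limit}cm_(s)"] = close
        result[f"closer_{limit}cm_(%)"] = 100.0 * close / total if total > 0 else 0.0
    return result


seconds = [0.0, 1.0, 2.0, 3.0, 4.0]
distances = [3.0, 8.0, 12.0, 30.0, 4.0]   # pairwise distance in cm per frame
proximity = proximity_summary(seconds, distances)
for key, value in proximity.items():
    print(key, value)
```
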
Timelines.csv contains the following values for each frame in the selected spatial and temporal range (as determined by x_min, x_max, y_min, y_max, start and stop in info.csv):
frame Number of current frame
time Timestamp of current frame in the original format
seconds Timestamp of current frame in seconds
agent_x X-coordinate of the agent's position (the column uses the agent's given name)
agent_y Y-coordinate of the agent's position (the column uses the agent's given name)
agent_angle Current angle of the agent in the selected format, i.e. rad or deg
agent_vx X-component of the agent's velocity
agent_vy Y-component of the agent's velocity
agent_speed Agent's speed, calculated as sqrt(agent_vx² + agent_vy²)
agent1/agent2_dist Distance between two agents, calculated as sqrt((agent1_x - agent2_x)² + (agent1_y - agent2_y)²)
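
The derived columns can be illustrated with a short sketch. It is not the module's code and rests on assumptions: velocity is taken as a backward difference between consecutive frames with 0.0 for the first frame, speed is taken as the magnitude of the velocity vector, and distance is taken as the Euclidean distance between the two agents. All function names are hypothetical.

```python
import math


def finite_difference(values, seconds):
    """Rate of change per second between consecutive frames."""
    if not values:
        return []
    rates = [0.0]                          # assumption: no velocity for the first frame
    for i in range(1, len(values)):
        dt = seconds[i] - seconds[i - 1]
        rates.append((values[i] - values[i - 1]) / dt if dt > 0 else 0.0)
    return rates


def speed(vx, vy):
    """Magnitude of the velocity vector."""
    return math.hypot(vx, vy)


def distance(x1, y1, x2, y2):
    """Euclidean distance between two positions."""
    return math.hypot(x1 - x2, y1 - y2)


seconds = [0.0, 1.0, 2.0]
a_x, a_y = [0.0, 3.0, 3.0], [0.0, 4.0, 4.0]    # positions of agent A in cm
b_x, b_y = [3.0, 3.0, 3.0], [0.0, 0.0, 0.0]    # positions of agent B in cm

a_vx = finite_difference(a_x, seconds)
a_vy = finite_difference(a_y, seconds)
a_speed = [speed(vx, vy) for vx, vy in zip(a_vx, a_vy)]
ab_dist = [distance(x1, y1, x2, y2) for x1, y1, x2, y2 in zip(a_x, a_y, b_x, b_y)]
print(a_speed)
print(ab_dist)
```

With positions in cm and time in seconds, the resulting speeds are in cm/s.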