Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ft_selectdata failure #2405

Open
schoffelen opened this issue Apr 10, 2024 · 4 comments · May be fixed by #2406
Open

ft_selectdata failure #2405

schoffelen opened this issue Apr 10, 2024 · 4 comments · May be fixed by #2406

Comments

@schoffelen
Copy link
Contributor

Describe the bug
ft_selectdata fails on raw data with an unluckily chosen cfg.latency, and in presence of a sampleinfo field

To Reproduce
Steps to reproduce the behavior:

data.time{1}=1:20;
data.time{2}= 11:20;
data.trial{1}=randn(1,20);
data.trial{2}=randn(1,10);
data.label={'chan01'};
data.fsample=1;

cfgsel.latency = [1 8];
data2=ft_selectdata(cfgsel,data); % runs OK, yet returns a data structure with an empty second trial

data.sampleinfo=[1 20;21 30];
data3=ft_selectdata(cfgsel,data); % crashes, when attempting to update the sampleinfo, based on an empty time axis for the second trial

Expected behavior
No crash

Environment and versions (please complete the following information):

  • Stone age CentOS
  • MATLAB version 2023b
  • FieldTrip version master branch
@schoffelen
Copy link
Contributor Author

In case no sampleinfo field exists, I wonder what happens downstream in the analysis if some of the trial/time cells are empty.

Just a quick check on the data2 structure from the previous comment: it causes a crash in ft_timelockanalysis

So, I would be up for adjusting ft_selectdata to only return the non-empty trials, (and adjust the sampleinfo/trialinfo accordingly)

@schoffelen schoffelen linked a pull request Apr 11, 2024 that will close this issue
@robertoostenveld
Copy link
Contributor

If anyone happens to be reading along here: I discussed this with Jan-Mathijs over a coffee. We identified that if a raw data structure has no sampleinfo field, then it is technically not a problem to represent an empty trial that is internally consistent like this:

data = [];
data.label = {'1', '2', '3'};
data.trial{1} = zeros(3,0);
data.time{1} = zeros(1,0);
data.trialinfo = [1];

The problem appears if sampleinfo is present, as we don't have a way of coding the sample indices of an empty trial. Solutions might be to code them as [nan nan], but [-1 -1] would IMHO be just as valid.

@schoffelen
Copy link
Contributor Author

schoffelen commented Apr 14, 2024

I don't mind either, unless there's a technical reason to choose one above the other, it sounds like it's to be a majority vote? Currently, the PR fills in with a nan. This can be easily swapped with a [-1 -1].

Yet, taking at face value that the first index is the begin, and the second index is the end, [-1:-1] will yield a one-element sample-vector, which might be inconsistent with the dimensionality of the trial/time (I didn't check what [nan:nan] would yield, but I suspect this will lead to an error.

@robertoostenveld
Copy link
Contributor

The [-1 -1] was not thought through well, and might not be a good choice. And you are correct that it would indeed result in an (invalid) one-sample trial.

I think what we want downstream is graceful errors, rather than mishap at random places in the code that make debugging hard. What about first extending ft_datatype_raw to check on validity of the input, for as far we can define "valid" at the moment? Should we also do that for timelock and freq structures? And source and volume structures?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants