
ENH: Enable models for sparsely sampled fMRI series #414

Open
wants to merge 2 commits into base: master

Conversation

effigies (Collaborator)

Initial commit is all but the Convolve portions of #376 squashed onto master @ 4315865. There may be some parts that are no longer appropriate.

Supersedes and closes #376 (leaving open for now for reference).
Closes #252.

codecov bot commented Mar 11, 2019

Codecov Report

Merging #414 into master will decrease coverage by 0.29%.
The diff coverage is 44.23%.


@@            Coverage Diff            @@
##           master     #414     +/-   ##
=========================================
- Coverage   62.37%   62.08%   -0.3%     
=========================================
  Files          27       27             
  Lines        4564     4602     +38     
  Branches     1174     1185     +11     
=========================================
+ Hits         2847     2857     +10     
- Misses       1433     1455     +22     
- Partials      284      290      +6
Flag         Coverage Δ
#unittests   62.08% <44.23%> (-0.3%) ⬇️

Impacted Files                   Coverage Δ
bids/variables/kollekshuns.py    83.57% <100%> (ø) ⬆️
bids/variables/entities.py       87.77% <100%> (+0.13%) ⬆️
bids/variables/variables.py      83.54% <36.36%> (-4.85%) ⬇️
bids/variables/io.py             72.24% <37.5%> (-3.01%) ⬇️
bids/analysis/analysis.py        86.91% <50%> (-1.87%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4315865...0729028.

tyarkoni (Collaborator) left a comment


I left some comments. I think it might be worth scheduling a meeting to chat about the more general issue of how to handle dense variables internally in pybids (notably, whether we should switch to using pandas timeseries everywhere).

slicetimes = sorted(img_md['SliceTiming'])
# Sorted onsets a, b, ..., z: z is the final slice onset and
# b - a is the slice duration, so TA = z + (b - a)
ta = np.round(slicetimes[-1] + slicetimes[1] - slicetimes[0], 3)
Collaborator

Is it worth testing for non-uniform slice onsets (and raising a not-supported error)? I don't know if that ever happens in practice, but if it does, we should probably fail...

Collaborator (Author)

That feels like a validation problem. While we can put those checks in ad hoc, I think it would make sense either to fail on load for validation problems or to be able to insert a little boilerplate check like: validate(slicetimes, 'SliceTiming')
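
A minimal sketch of what such a boilerplate check could look like, assuming uniformly spaced slice onsets are the only supported case (the helper name, tolerance, and error type are illustrative, not part of this PR):

import numpy as np

def validate_slice_timing(slicetimes, atol=1e-3):
    # Hypothetical stand-in for validate(slicetimes, 'SliceTiming'):
    # reject non-uniform slice onsets rather than computing a bogus TA.
    diffs = np.diff(sorted(slicetimes))
    if len(diffs) and not np.allclose(diffs, diffs[0], atol=atol):
        raise NotImplementedError("Non-uniform slice onsets are not supported")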

    else:
        ta = tr
elif 'VolumeTiming' in img_md:
    return NotImplemented
Collaborator

Should this be NotImplementedError? I think this will fail as written because of the tuple unpacking when _get_timing_info is called, but the user won't have any idea what went wrong.
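
As a sketch, the suggested fix is to raise the exception rather than return the NotImplemented singleton, so the failure carries a message instead of surfacing as an opaque unpacking error (message text is illustrative):

elif 'VolumeTiming' in img_md:
    raise NotImplementedError(
        "Sparse acquisitions specified via VolumeTiming are not yet supported")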

@@ -453,18 +453,36 @@ def resample(self, sampling_rate, inplace=False, kind='linear'):

self.index = self._build_entity_index(self.run_info, sampling_rate)

x = np.arange(n)
num = len(self.index)
Collaborator

This is on me, I think, but we could probably use less confusing names than n and num for the old and new lengths.

Collaborator (Author)

Yeah, I can try to rewrite; I was mostly aiming for a minimal diff.
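
For example, a purely cosmetic sketch of the rename (no behavior change; n_old and n_new are just suggested names):

n_old = n                 # number of samples before resampling
n_new = len(self.index)   # number of samples after resampling (was: num)
x = np.arange(n_old)
x_new = np.linspace(0, n_old - 1, num=n_new)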

f = interp1d(x, self.values.values.ravel(), kind=kind)
x_new = np.linspace(0, n - 1, num=num)
self.values = pd.DataFrame(f(x_new))
if integration_window is not None:
Collaborator

Rather than determine which approach to use based on whether integration_window is passed, I think we should probably always use interpolation for upsampling and decimation for downsampling (we should definitely avoid downsampling via interpolation—that was a pretty bad decision on my part). Actually, I'm not sure integration/averaging is a great solution, as it's sort of a half-assed way to do temporal filtering. If we're going to go down this road, maybe we should just temporally filter out frequencies above half the target sampling rate and then decimate.

In general, I wonder if it makes sense to implement the resampling operations ourselves, or if we should just do it via pandas (or something like traces, which wraps pandas). I would expect that resampling in pandas is probably going to be more efficient than rolling our own solution, while relieving us of the burden of extensive testing.
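
A sketch of the filter-then-decimate idea for integer downsampling factors (the wrapper is illustrative; scipy.signal.decimate applies an anti-aliasing low-pass below half the target rate before discarding samples):

import numpy as np
from scipy.signal import decimate

def downsample(values, old_rate, new_rate):
    # Low-pass filter below new_rate / 2, then keep every q-th sample.
    q = int(round(old_rate / new_rate))
    return decimate(np.asarray(values, dtype=float), q,
                    ftype='fir', zero_phase=True)

The pandas route would instead put the values in a Series with a TimedeltaIndex and use Series.resample, which downsamples by aggregation (e.g. .resample('2s').mean()) rather than by filtering.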

integrator = lil_matrix((num, n), dtype=np.uint8)
count = None
for i, new_time in enumerate(new_times):
    cols = (old_times >= new_time) & (old_times < new_time + integration_window)
Collaborator

Should the integration window be centered on new_time, rather than using it as the lower boundary? Otherwise the value at each downsampled point is basically reading the future, which is probably not what we want.
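
A centered variant could look like this (same variable names as the diff above; purely a sketch of the suggestion):

# Center the window on new_time instead of using it as the lower bound,
# so each downsampled point averages samples symmetrically around it.
half = integration_window / 2
cols = (old_times >= new_time - half) & (old_times < new_time + half)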

effigies (Collaborator, Author)

Ah, I saw you said we should set up a call. Looking at this again, I agree. Hacking sparse sampling into (or next to) the current approach is probably not worth it, and we should work out a strategy for resampling more broadly, with discontinuous integration windows in mind. It might also be worth thinking through the clustered-acquisition approach at the same time, to avoid settling on a solution that will need to be refactored again when we get around to that.

@satra Do you think you or anyone in your group should be on the call?

effigies (Collaborator, Author) commented Aug 7, 2019

@satra Bump.

tyarkoni (Collaborator)

Any status update on this, post-resampling changes? (Fine if not; this is low priority from my perspective.)

satra (Contributor) commented Jan 10, 2020

@tyarkoni - I haven't had a look at this in a while. Can you point me to where the design matrix is being built, and I can backtrack from there?

Successfully merging this pull request may close these issues:
Support sparse acquisition