Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Fix z statistic computation in computefeats2 #644

Draft
wants to merge 26 commits into
base: main
Choose a base branch
from

Conversation

smoia
Copy link
Collaborator

@smoia smoia commented Jan 13, 2021

Closes #178 .

@tsalo, apologies for the delay! The branch is incredibly behind ME-ICA/tedana/master, but at first sight I didn't see any conflict related to lack of updates. Joke, this PR does have issues related to update. I'll take care of it.

Also, let me know if instead of a [FIX] this should be a [REF] or a [ENH].

Also, it's still a draft because I see it's missing some final tweaks.

Changes proposed in this pull request:

  • Main change: removed the possibility of normalising the output of computecoeffs2 because with the new Z-score implementation it doesn't make much sense anymore.
  • Main change: added the optional steps of computing z values from beta scores in get_coeffs if an argument of the function is flagged as True. However, the way the output is returned might not be so pythonic - what do you think?
  • Main change: Rename get_coeffs into get_ls_coeffs to make the least square approach more explicit, and update the related API documentation as well as its calls throughout the code. If it's an issue, though, we can revert to the former name.
  • Add t_to_z function to tedana's stats.py, from Vanessa Sochat's TtoZ package, add its reference for duecredit (but please do check the docstring for copyright). However, this could be avoided by adding the package as dependency and importing it. What do you all think? (Now that I'm not a python newbie, I think I should!) Or shouldn't, given that the package works with nifti files. Maybe we could ask to modify the original script, adding an extra function, and import that? Also, this might have been someone else's work during DCCC2019 - if you know who was the author, let me know in order to add acknowledgements.
  • Update test_get_coeffs to test_get_ls_coeffs and the calls to get_coeffs. However, it's not testing the new part of get_ls_coeffs, only what was previously tested. If you have any suggestion on how to test the new part of the code, let me know.

Things left to do and I might or might not (if not required to merge PR) do [and by extension, question for the reviewer to see if they expect them]:

  • Update branch to latest master and fix conflicts!
  • Import Kindly propose a PR to TtoZ to be able to import TtoZ as dependency rather than merely copying code.
  • Add an empty commit to acknowledge @CesarCaballeroGaudes as co-author of this PR since we worked on it together.
  • Think about the output of get_ls_coeffs and whether it should always compute z scores or not
  • Update documentation beside API
  • Test the z score computation (any idea for that?)

… normalisation of z_score happening inside get_ls_coeffs
Changed default compute_zvalues of get_coeffs to False
Modified get_coeffs into get_ls_coeffs, corrected its use in the code.
@smoia smoia marked this pull request as draft January 13, 2021 22:39
@smoia smoia requested a review from tsalo January 13, 2021 22:40
# R-to-Z transform
data_Z = np.arctanh(data_R)
# get betas and z-values of `data`~`mmix`
# mmix is normalized internally
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do it here anyway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we use add_const=True while calling get_ls_coeffs instead, now that there's the option? I think it would make more sense given how the program runs.

In fact, I would set add_const default as True, given that if the data is not demeaned, you need the intercept, and if it's demeaned, the intercept doesn't bother you (it's 0).

@CesarCaballeroGaudes is there any other difference between normalising before computing LS and adding the intercept in the model?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like including the intercept by default, but isn't variance normalization required to get standardized parameter estimates?

@tsalo tsalo changed the title [FIX] Fix z statistic computaiton in computefeats2 [FIX] Fix z statistic computation in computefeats2 Jan 14, 2021
tedana/stats.py Outdated Show resolved Hide resolved
Base automatically changed from master to main February 1, 2021 23:57
Comment on lines 97 to 98
"""
Converts `data` to component space using `mmix`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""
Converts `data` to component space using `mmix`
"""Fits mmix to data with OLS and calculates standardized beta values and z-statistics.

# sigma = RSS / Degrees_of_Freedom
sigma = np.sum(np.power(mdata - np.dot(X, betas.T), 2), axis=0) / df
sigma = sigma[:, np.newaxis]
# Copmute std of betas:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Copmute std of betas:
# compute std of betas:

Comment on lines +37 to +41
"""
From Vanessa Sochat's TtoZ package.
Copyright (c) 2015 Vanessa Sochat
MIT Licensed
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just recommending the docstring I use in NiMARE.

Suggested change
"""
From Vanessa Sochat's TtoZ package.
Copyright (c) 2015 Vanessa Sochat
MIT Licensed
"""
"""Convert t-statistics to z-statistics.
An implementation of [1]_ from Vanessa Sochat's TtoZ package [2]_.
Parameters
----------
t_values : array_like
T-statistics
dof : int
Degrees of freedom
Returns
-------
z_values : array_like
Z-statistics
References
----------
.. [1] Hughett, P. (2007). Accurate Computation of the F-to-z and t-to-z
Transforms for Large Arguments. Journal of Statistical Software,
23(1), 1-5.
.. [2] Sochat, V. (2015, October 21). TtoZ Original Release. Zenodo.
http://doi.org/10.5281/zenodo.32508
"""

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual code from NiMARE might also be good to use since I improved the internal variable names.

@handwerkerd handwerkerd added bug issues describing a bug or error found in the project effort: high More than 40h total work enhancement issues describing possible enhancements to the project impact: high Enhancement or functionality improvement that will affect most users priority: medium Should get addressed soon labels Mar 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug issues describing a bug or error found in the project effort: high More than 40h total work enhancement issues describing possible enhancements to the project impact: high Enhancement or functionality improvement that will affect most users priority: medium Should get addressed soon
Projects
None yet
Development

Successfully merging this pull request may close these issues.

computefeats2 does not calculate z-statistics accurately
4 participants