Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Examples dataset for Phenotype BEP #393

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Remi-Gau
Copy link
Contributor

@Remi-Gau Remi-Gau commented Aug 8, 2023

No description provided.

@@ -0,0 +1,73 @@
"""Utility script to remove subjects from the phenotype dataset.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Utility script to remove subjects from the phenotype dataset.
"""
Utility script to remove subjects from the phenotype dataset.

Comment on lines +18 to +19
if "sub-ON01016" not in subjects_to_keep:
subjects_to_keep.append('sub-ON01016')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please comment for reasoning?

print(f'Removing {subject_dir}')
shutil.rmtree(subject_dir)

# remove subject from participants.tsv

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# remove subject from participants.tsv
# remove subjects from participants.tsv

@Remi-Gau
Copy link
Contributor Author

by the way I am not sure we will want to keep the script in the end, but I put it in the PR in case we want to regenerate those datasets slightly differently

print(participants_df)
participants_df.to_csv(participants_tsv, sep='\t', index=False)

# remove subject from all tsv in phenotype folder

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# remove subject from all tsv in phenotype folder
# remove subjects from all tsv's in phenotype folder

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a great utility! Thanks.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe delete the unpaired JSONs like this one (meaning there's no TSV to go with the JSON)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I think I did that manually for a couple of things and that would have to be scripted.

@Remi-Gau
Copy link
Contributor Author

another todo in the scrip:

  • truncate the data.

@@ -0,0 +1,2 @@
participant_id bai01_r03, bai1_1 bai02_r03, bai1_2 bai03_r03, bai1_3 bai04_r03, bai1_4 bai05_r03, bai1_5 bai06_r03, bai1_6 bai07_r03, bai2_7 bai08_r03, bai2_8 bai09_r03, bai2_9 bai10_r03, bai2_10 bai11_r03, bai2_11 bai12_r03, bai3_12 bai13_r03, bai3_13 bai14_r03, bai3_14 bai15_r03, bai3_15 bai16_r03, bai3_16 bai17_r03, bai4_1 bai18_r03, bai4_2 bai19_r03, bai4_3 bai20_r03, bai4_4 bai21_r03, bai4_5
sub-ON66199 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember if we decided since this file is nested under a participant_id folder that we don't need the participant_id column, or if maybe it just always needs to be there. Open to commentary...

Copy link

@ericearl ericearl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a wonderful bunch of examples and utilities! Address those few mini-suggestions and the couple slightly larger comments and I approve!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants