ModelManager.predict_all() modifies dataframe in place #163

Open
GeorgWa opened this issue May 7, 2024 · 1 comment

GeorgWa commented May 7, 2024

Describe the bug
This is not really a bug but rather unexpected behaviour. The use case is a filtered spectral library whose precursors and fragments are no longer in the order that predict_all() expects. It would be good to know what the expected precursor and fragment order is.

To Reproduce

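# Imports assumed; they were not shown in the original report.
import pandas as pd
from alphabase.spectral_library.base import SpecLibBase
from peptdeep.pretrained_models import ModelManager

speclib = SpecLibBase()
speclib.precursor_df = pd.DataFrame([
    {'sequence': 'PEPTIDEK', 'charge': 2, 'mods': '', 'mod_sites': ''},
    {'sequence': 'MYCMENK', 'charge': 2, 'mods': '', 'mod_sites': ''},
    {'sequence': 'IDEK', 'charge': 3, 'mods': '', 'mod_sites': ''},
    {'sequence': 'PELLPTIDEK', 'charge': 3, 'mods': '', 'mod_sites': ''},
])
# Adds frag_start_idx / frag_stop_idx for all four precursors.
speclib.calc_fragment_mz_df()

# Filter the library; the remaining rows keep their old fragment indices.
speclib.precursor_df = speclib.precursor_df[speclib.precursor_df['charge'] == 3]

print('before: ', speclib.precursor_df['frag_start_idx'].values)
model_manager = ModelManager(
    device="mps",  # device used in the report; adjust for your hardware
)
# The return value is discarded on purpose; only the side effect is inspected.
_ = model_manager.predict_all(speclib.precursor_df, predict_items=['ms2'])
print('after: ', speclib.precursor_df['frag_start_idx'].values)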

Results

before:  [ 0 16]
2024-05-07 09:55:17> Predicting MS2 ...
100%|██████████| 2/2 [00:00<00:00, 94.23it/s]
after:  [0 3]
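
The two sets of numbers can be reproduced by hand, which may help when reading the output. The sketch below assumes that each precursor of length nAA gets nAA - 1 contiguous fragment rows and that rows are laid out in order of increasing nAA. That layout is an assumption, but it is consistent with the values printed above. `contiguous_frag_idx` is a hypothetical helper written for this illustration, not a function of the library.

```python
import pandas as pd

def contiguous_frag_idx(sequences):
    """Hypothetical helper: assign contiguous fragment row ranges,
    ordered by peptide length, with (nAA - 1) rows per precursor."""
    df = pd.DataFrame({"sequence": sequences})
    df["nAA"] = df["sequence"].str.len()
    df = df.sort_values("nAA", kind="stable").reset_index(drop=True)
    n_rows = df["nAA"] - 1
    df["frag_stop_idx"] = n_rows.cumsum()
    df["frag_start_idx"] = df["frag_stop_idx"] - n_rows
    return df

# Indices computed on the full four-precursor library, then filtered.
full = contiguous_frag_idx(["PEPTIDEK", "MYCMENK", "IDEK", "PELLPTIDEK"])
kept = full[full["sequence"].isin(["IDEK", "PELLPTIDEK"])]
before = kept["frag_start_idx"].tolist()
print(before)  # [0, 16]

# Indices recomputed from scratch on the two remaining precursors.
filtered = contiguous_frag_idx(["IDEK", "PELLPTIDEK"])
after = filtered["frag_start_idx"].tolist()
print(after)  # [0, 3]
```

Under these assumptions, "before" holds indices into the fragment rows of the full library, while "after" holds indices that were recomputed for the two remaining precursors only.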

Expected behavior
Since ModelManager.predict_all() returns its predictions, one would expect the precursor_df that is passed in to be left unchanged.
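
Until this is resolved, a possible workaround is to pass a copy, so that any in-place changes land on the copy rather than on the library's own dataframe. The sketch below uses `mock_predict`, a stand-in that only mimics the observed side effect (overwriting the fragment index columns); it is not the peptdeep implementation.

```python
import pandas as pd

def mock_predict(precursor_df: pd.DataFrame) -> pd.DataFrame:
    """Stand-in predictor that overwrites index columns on its input."""
    n_rows = precursor_df["sequence"].str.len() - 1
    precursor_df["frag_stop_idx"] = n_rows.cumsum()
    precursor_df["frag_start_idx"] = precursor_df["frag_stop_idx"] - n_rows
    return precursor_df

precursor_df = pd.DataFrame({
    "sequence": ["IDEK", "PELLPTIDEK"],
    "frag_start_idx": [0, 16],
    "frag_stop_idx": [3, 25],
})

# Passing a copy keeps the caller's dataframe intact.
result = mock_predict(precursor_df.copy())

print(precursor_df["frag_start_idx"].tolist())  # [0, 16]
print(result["frag_start_idx"].tolist())        # [0, 3]
```

With the real API the equivalent call would be `model_manager.predict_all(speclib.precursor_df.copy(), predict_items=['ms2'])`, at the cost of one extra copy of the precursor table.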

jalew188 self-assigned this May 13, 2024

stratomaster31 commented May 21, 2024

The cause is that precursor_df is sorted in place by nAA (peptide length). I assume this is done for computational speed.

Besides altering precursor_df, this also means the predictions are not returned in the original row order.
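
One way to recover the original order on the caller's side is to tag each row with its position before predicting and sort by that tag afterwards. This is only a sketch: `mock_predict_sorted` is a stand-in that sorts by nAA as described above, and the `orig_order` and `dummy_pred` column names are made up for the example.

```python
import pandas as pd

def mock_predict_sorted(df: pd.DataFrame) -> pd.DataFrame:
    """Stand-in predictor that returns its rows sorted by peptide length."""
    out = df.copy()
    out["nAA"] = out["sequence"].str.len()
    out = out.sort_values("nAA", kind="stable").reset_index(drop=True)
    out["dummy_pred"] = out["nAA"]  # placeholder for a real prediction
    return out

precursor_df = pd.DataFrame(
    {"sequence": ["PEPTIDEK", "MYCMENK", "IDEK", "PELLPTIDEK"]}
)

# Tag every row with its original position on a copy of the input.
tagged = precursor_df.copy()
tagged["orig_order"] = range(len(tagged))

predicted = mock_predict_sorted(tagged)

# Sort back by the tag and drop it again.
restored = (
    predicted.sort_values("orig_order")
    .drop(columns="orig_order")
    .reset_index(drop=True)
)

print(predicted["sequence"].tolist())
print(restored["sequence"].tolist())
```

Reordering the precursor rows this way should not by itself invalidate frag_start_idx and frag_stop_idx, as long as they are used with the fragment dataframes returned by the same call.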
