PROPOSAL: diffs #103

lskatz · 2021-08-24T17:34:24Z

Hi, I am finding one aspect of ChewBBACA problematic: it adds alleles to the database in the same command that performs the analysis. This leads to several problems, including:

Automatic errors if the database is on a read-only drive. It will err as soon as it tries to write. This has happened if I mount read-only with Singularity, for example. Or if there is a central read-only MLST database on our high performance computer (HPC) that everyone uses.
Pollution of the database. I queried with some bad assemblies and now the database is ruined. The only way to backtrack is to delete and recreate the database. If there is a central MLST database on our HPC, then it is problematic if one user's mistakes lead to the pollution of the database which affects all users.

I would like to propose that the AlleleCall step produces something like diff or patch files. I would also like to propose an additional step that can accept a patch file to update the database. The most efficient way to accept a patch might be through git commands but that is just a suggestion.
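To make the patch-file idea concrete, here is a minimal sketch of how novel alleles for one locus could be expressed as a unified diff against that locus's FASTA file, without touching the database. This is only an illustration of the proposal, not how chewBBACA works: the function name `allele_patch`, the file name, and the allele identifiers are all hypothetical, and it uses Python's standard `difflib` rather than git.

```python
import difflib


def allele_patch(locus_file, current_alleles, novel_alleles):
    """Return a unified diff that would append novel alleles to a locus FASTA.

    Hypothetical helper for illustration only. `current_alleles` and
    `novel_alleles` are lists of (allele_id, sequence) tuples.
    """
    def to_fasta_lines(alleles):
        lines = []
        for allele_id, sequence in alleles:
            lines.append(f">{allele_id}\n")
            lines.append(f"{sequence}\n")
        return lines

    before = to_fasta_lines(current_alleles)
    after = to_fasta_lines(current_alleles + novel_alleles)
    # unified_diff yields nothing when the two versions are identical,
    # so an analysis without novel alleles produces an empty patch.
    diff = difflib.unified_diff(
        before, after,
        fromfile=f"a/{locus_file}",
        tofile=f"b/{locus_file}",
    )
    return "".join(diff)


if __name__ == "__main__":
    # Made-up locus with one known allele and one newly observed allele.
    patch = allele_patch(
        "locus1.fasta",
        [("locus1_1", "ATGAAATAA")],
        [("locus1_2", "ATGAAGTAA")],
    )
    print(patch, end="")
```

A separate, explicit step could then review such a patch and apply it to a writable copy of the database, for example with `patch` or `git apply`.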

Having patch files might also be helpful for compatibility with any current or future MLST callers like STing, if they decide to accept patches. It would also help in communicating between labs using ChewBBACA. For example, if I discover a new allele, it would be a standardized approach to communicating it to chewbbaca.online.

Thank you for your consideration on this topic.


lskatz · 2021-08-24T17:41:53Z

The standard patch format: https://www.oreilly.com/library/view/git-pocket-guide/9781449327507/ch11.html

ramirma · 2021-08-25T14:03:30Z

Thanks for the suggestions, @lskatz. Some of the points you raised have been under discussion in the group for some time, so your comments are an excellent starting point to think more seriously about this. I see @rfm-targa has already self-assigned this. I would just like to highlight that communication with the chewie name server at chewbbaca.online is already automated in chewBBACA, including the submission of new alleles identified for the first time locally. You can see more on this at https://chewie-ns.readthedocs.io/en/latest/user/synchronize_api.html.

lskatz · 2021-08-25T16:24:11Z

Thank you @ramirma and @rfm-targa for having already thought about this! Thank you for considering this topic!

lskatz · 2023-02-07T18:04:06Z

Hi, has all this been fixed in version 3?

rfm-targa · 2023-02-08T11:09:38Z

Hello @lskatz! We've added the --no-inferred parameter to allow users to decide if they want to add novel alleles to the schemas. If you use that parameter, chewBBACA will still classify novel alleles but will not add them to the schema (intermediate files are created in a separate directory). This should help prevent database pollution.
Since it does not add novel alleles to the schema if you pass the --no-inferred parameter, it should also be possible to perform allele calling if the schema is read-only. The exception is the first time you use a schema to perform allele calling (whether it was created with chewBBACA v3 or comes from chewBBACA <= 2.8.5), because chewBBACA v3 creates files with pre-computed values that are used to speed up execution. After the first AlleleCall execution, you can use or copy the schema and run it in read-only mode with the --no-inferred parameter. The pre-computed files are only updated when novel alleles are added to the schema.
Let us know if you run the latest version and if any of these issues are not fixed. We'll gladly add changes to make it work under both scenarios you've described.
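As a rough illustration of the workflow described above, the sketch below builds an AlleleCall command line and appends `--no-inferred` whenever novel alleles should not be added or the schema directory is not writable. It is a sketch under assumptions: the helper names are hypothetical, the `-i`, `-g` and `-o` options are taken to be the input, schema and output arguments and should be checked against the chewBBACA documentation, and it does not handle the first-run case in which the pre-computed files still need to be written.

```python
import os


def schema_is_writable(schema_dir):
    """Return True if schema_dir exists and the current user can write to it."""
    return os.path.isdir(schema_dir) and os.access(schema_dir, os.W_OK)


def build_allelecall_command(input_dir, schema_dir, output_dir,
                             add_novel_alleles=False):
    """Build an AlleleCall argument list (hypothetical helper).

    Assumes -i, -g and -o are the input, schema and output options;
    only --no-inferred is confirmed in this thread.
    """
    command = [
        "chewBBACA.py", "AlleleCall",
        "-i", input_dir,
        "-g", schema_dir,
        "-o", output_dir,
    ]
    # Fall back to --no-inferred when the schema cannot be modified,
    # even if the caller asked for novel alleles to be added.
    if not add_novel_alleles or not schema_is_writable(schema_dir):
        command.append("--no-inferred")
    return command


if __name__ == "__main__":
    # Placeholder paths; nothing is executed, the command is only printed.
    print(" ".join(build_allelecall_command("assemblies", "schema", "results")))
```

The returned list can be passed to `subprocess.run` once the paths and options have been verified for the installed version.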

ramirma · 2023-02-10T08:59:15Z

@lskatz, I hope @rfm-targa's answer clarifies the points you raised. Also, please note that chewBBACA may now run in 3 different modes that may also be of use to you. For more information on this, please have a look at the documentation. Do let us know if the solutions implemented fully address the issues you raised.

rfm-targa self-assigned this Aug 24, 2021

rfm-targa added the "Status: In Progress" (has been assigned and is being worked on) and "Type: Enhancement" labels Aug 24, 2021

rfm-targa added this to the v3.0 - New AlleleCall implementation milestone Aug 28, 2021

rfm-targa added the "Status: On Hold" label (assigned but not being worked on at the moment) and removed the "Status: In Progress" label Jan 31, 2022

rfm-targa removed this from the v3.0 - New AlleleCall implementation milestone Feb 9, 2023
