Add detailed batch stats (for individual samples/records) #20

Open
karel-brinda opened this issue Aug 2, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

@karel-brinda
Owner

Example:

  • {batch}.samples.tsv

    Example from previous pipelines:

    sample	asm_fn	asm_ns	asm_cl	asm_fa_bytes	pre_fn	pre_ns	pre_cl	pre_kmers	pre_fa_bytes	baps_order	mashtree_order	mashquicktree_order	mashquicktree_nonlad_order	random_order
    GCGS0001	assemblies/GCGS0001.fa	74	2147327	2184850	simplitigs/GCGS0001.fa	912	2133948	2106588	2175339	1041	10	16	1101	302
    GCGS0002	assemblies/GCGS0002.fa	70	2150247	2187721	simplitigs/GCGS0002.fa	902	2137553	2110493	2178952	703	328	312	154	901
    GCGS0003	assemblies/GCGS0003.fa	77	2138373	2175816	simplitigs/GCGS0003.fa	846	2127038	2101658	2167894	832	249	86	1065	178
    GCGS0004	assemblies/GCGS0004.fa	78	2152720	2190421	simplitigs/GCGS0004.fa	895	2137401	2110551	2178733	701	331	316	152	783
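
    A row like the ones above could be assembled roughly as follows. This is only a minimal sketch: it assumes that the `_ns` columns mean "number of sequences", the `_cl` columns mean "cumulative sequence length", and the `_fa_bytes` columns mean the FASTA file size, none of which the issue spells out. The function names `fasta_stats` and `sample_row` are hypothetical.

    ```python
    import os


    def fasta_stats(path):
        """Return (number of sequences, cumulative length, file size in bytes).

        Assumed meanings of the *_ns, *_cl and *_fa_bytes columns.
        """
        ns = 0
        cl = 0
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                if line.startswith(">"):
                    ns += 1          # a header line starts a new sequence
                else:
                    cl += len(line)  # sequence lines add to the total length
        return ns, cl, os.path.getsize(path)


    def sample_row(sample, path):
        """Format one tab-separated row: sample, file name, ns, cl, bytes."""
        ns, cl, nbytes = fasta_stats(path)
        return "\t".join([sample, path, str(ns), str(cl), str(nbytes)])
    ```

    Calling `sample_row("GCGS0001", "assemblies/GCGS0001.fa")` would yield the first five columns of a row; the k-mer count and ordering columns are outside the scope of this sketch.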
    
    
@karel-brinda karel-brinda changed the title from "Add local stats aggregation" to "Add detail batch stats" Nov 2, 2023
@karel-brinda karel-brinda changed the title from "Add detail batch stats" to "Add detailed batch stats (for individual samples/records)" Nov 2, 2023
@karel-brinda
Owner Author

karel-brinda commented Nov 2, 2023

Specification:

  • Output file: one line per record (asm, dbg, propagated k-mer set)
  • k-mer counting should use parameters sized for a single genome (e.g., a hash table size of 10M)
  • Counting should be streamed via GNU Parallel, collecting the output lines and deduplicating them with awk so that only one header line is kept
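
The header deduplication in the last point can be illustrated as follows. The issue asks for GNU Parallel plus awk; the Python below is a swapped-in sketch of the filtering logic only, under the assumption that every per-record job prints the same header line followed by its data line. The function name `merge_keep_one_header` is hypothetical.

```python
def merge_keep_one_header(lines):
    """Yield lines, keeping the first line as the header and
    dropping every later line identical to it."""
    header = None
    for line in lines:
        if header is None:
            header = line   # the first line seen is taken as the header
            yield line
        elif line != header:
            yield line      # data lines pass through unchanged


# Example: concatenated output of two jobs, each with its own header.
merged = list(merge_keep_one_header([
    "sample\tasm_ns",
    "GCGS0001\t74",
    "sample\tasm_ns",
    "GCGS0002\t70",
]))
```

Because the filter only compares each line against the first one, it works on a stream and does not need to hold the collected output in memory.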

@karel-brinda karel-brinda added the enhancement New feature or request label Mar 5, 2024