Error with samplesheet.csv with two column headers that have prefix in common #249
Comments
I had this issue too. It can probably be fixed by changing this:
column_number=\$(cat $samples | head -n 1 | tr '$separator' "\\n" | grep -En "^$variable" | awk -F':' '{print \$1}')
to this:
column_number=\$(cat $samples | head -n 1 | tr '$separator' "\\n" | grep -En "^$variable\$" | awk -F':' '{print \$1}')
The trailing \$ anchors the pattern to the end of the header name, so condition no longer also matches condition2. But the file could probably use some fixing up in general, e.g. for that line:
column_number=\$(head -n 1 $samples | tr '$separator' "\\n" | grep -En "^$variable\$" | cut -d: -f1)
I'm also unsure how many backslashes it needs.
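To make the difference between the two patterns concrete, here is a minimal sketch in plain shell, outside the Nextflow template (so no backslash escaping of `$` is needed). The header line and the `variable` value are made-up stand-ins for the reporter's sample sheet, not taken from it.

```shell
# Hypothetical header reproducing the clash; not the reporter's actual file.
header='sample,condition,condition2'
variable='condition'

# Prefix match: "^condition" hits both "condition" and "condition2".
prefix_hits=$(printf '%s\n' "$header" | tr ',' '\n' \
    | grep -En "^${variable}" | cut -d: -f1 | paste -sd ' ' -)

# Anchored match: "^condition$" hits only the exact header name.
exact_hits=$(printf '%s\n' "$header" | tr ',' '\n' \
    | grep -En "^${variable}\$" | cut -d: -f1 | paste -sd ' ' -)

echo "prefix match columns: ${prefix_hits}"
echo "exact match columns:  ${exact_hits}"
```

With the prefix pattern the column number comes back as two values, which is presumably what breaks the downstream step. If header names might contain regex metacharacters, `grep -nFx -- "$variable"` (fixed string, whole line) would be a stricter alternative to the anchored regex.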
Alternatively, you can replace lines 30 and 31 with this:
classes=\$(awk -F '$separator' 'NR==1 { for (i=1; i<=NF; i++) if (\$i == "$variable") {lnum = i; next}} 1 {print \$lnum}' $samples)
A one-liner with no piping.
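The awk-only idea can be tried in isolation with a sketch like the one below. It is written for a plain shell rather than the Nextflow template, passes the header name in with `-v` instead of interpolating it into the program, and uses a made-up sample sheet; the variable names mirror the thread, but the file contents are assumptions.

```shell
# Hypothetical sample sheet; names and values are invented for illustration.
samples=$(mktemp)
cat > "$samples" <<'EOF'
sample,condition,condition2
s1,treated,batchA
s2,control,batchB
EOF

separator=','
variable='condition'

# Header row: remember the field whose name equals the requested header
# exactly, stop with a non-zero status if it is missing, then skip the row.
# Every other row: print the remembered field.
classes=$(awk -F "$separator" -v name="$variable" '
    NR == 1 {
        for (i = 1; i <= NF; i++) if ($i == name) col = i
        if (!col) exit 1
        next
    }
    { print $col }
' "$samples")

echo "$classes"
rm -f "$samples"
```

Because the comparison is string equality rather than a regex, `condition` and `condition2` cannot be confused, and no anchoring or escaping of the pattern is needed.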
Which branch was this error observed on? Is there a simple test nf-cmd to trigger the error?
To come back to @asp8200's question: on which branch did you observe the error? I could only reproduce it on the main branch. Pulling the pipeline from the dev branch should solve the issue.
Looks like it's already fixed here: https://github.com/nf-core/differentialabundance/blob/dev/modules%2Fnf-core%2Fcustom%2Ftabulartogseacls%2Fmain.nf#L30
But my suggested awk-only replacement of that line and the next might still be more robust. Should I bother with a pull request or nah?
Hmm, I'm not familiar with the run time of awk. Could we run into problems with the for-loop in your one-line awk option? Especially with large datasets.
It just loops over the fields on the first line of the file, so it's
unlikely to be an issue. I suppose I could make a test file with thousands
of columns to check, but it seems unnecessary.
A tool called Miller is actually much better for this, but it's best to not
add any dependencies.
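For anyone who does want to check, a wide test file is quick to generate. The sketch below builds a synthetic sheet with 5000 columns (the column count and the `c1`..`c5000` naming are arbitrary assumptions) and extracts one column with the same header-loop approach; the loop runs only on the first line, once per field.

```shell
# Build a synthetic sheet: header c1..c5000 plus two data rows (made-up data).
wide=$(mktemp)
awk 'BEGIN {
    n = 5000
    for (r = 0; r <= 2; r++) {
        line = ""
        for (i = 1; i <= n; i++) {
            sep  = (i > 1) ? "," : ""
            cell = (r == 0) ? "c" i : "r" r "_" i
            line = line sep cell
        }
        print line
    }
}' > "$wide"

# Look up a column near the end of the header, then print it for each data row.
picked=$(awk -F ',' -v name='c4999' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) col = i; next }
    { print $col }
' "$wide" | paste -sd ' ' -)

echo "$picked"
rm -f "$wide"
```

Prefixing the second awk call with `time` gives a rough measurement on a given machine; no timing is claimed here.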
On Mon, Mar 18, 2024, 9:27 AM Jennifer Müller wrote:
Hmm, I'm not familiar with the run time of awk. Could we run into
problems with the for-loop in your one-line awk option? Especially with
large datasets.
Description of the bug
The pipeline works fine with a samplesheet.csv whose two columns have the headers condition and Xcondition2, but it throws an error when the headers are condition and condition2. All other input files and parameters were identical. Error shown below.
Command used and terminal output
Relevant files
No response
System information