compseq: add page #12713

Merged
merged 7 commits into from May 15, 2024

Conversation

kamurani
Contributor

@kamurani kamurani commented May 2, 2024

  • The page(s) are in the correct platform directories: common, linux, osx, windows, sunos, android, etc.
  • The page(s) have at most 8 examples.
  • The page description(s) have links to documentation or a homepage.
  • The page(s) follow the content guidelines.
  • The PR title conforms to the recommended templates.
  • Version of the command being documented (if known): EMBOSS:6.6.0.0

@kamurani kamurani requested a review from cyqsimon as a code owner May 2, 2024 08:16
CLAassistant commented May 2, 2024

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added the new command Issues requesting creation of a new page. label May 2, 2024
Contributor Author

kamurani commented May 2, 2024

Alternative documentation link (to identical webpage content):

https://emboss.sourceforge.net/apps/cvs/emboss/apps/compseq.html

Member

@Magrid0 Magrid0 left a comment

Looks good to me, thanks for your contribution.
The only thing is that there are perhaps a few too many examples, but since there are no more than 8, it's fine.


- Count observed frequencies of words in a FASTA file, providing parameter values with interactive prompt:

`compseq {{example.fasta}}`
Member

Suggested change
`compseq {{example.fasta}}`
`compseq {{path/to/file.fasta}}`


- Count observed frequencies of amino acid pairs from a FASTA file, save output to a text file:

`compseq {{example_protein.fasta}} -word 2 {{result1.comp}}`
Member

Suggested change
`compseq {{example_protein.fasta}} -word 2 {{result1.comp}}`
`compseq {{path/to/input_file.fasta}} -word 2 {{path/to/output_file.comp}}`
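
As a rough illustration of what "observed frequencies of amino acid pairs" means, the sketch below counts every overlapping two-letter word in a toy sequence. It is plain Python rather than EMBOSS code: `count_words` and the sequence `MKKLLKK` are made up for illustration, and the output format of `compseq` itself is not reproduced.

```python
from collections import Counter


def count_words(sequence: str, word: int) -> Counter:
    """Count every overlapping window of length `word` in `sequence`.

    Hypothetical helper for illustration; not part of EMBOSS.
    """
    seq = sequence.upper()
    # Slide the window one position at a time, so adjacent words overlap.
    return Counter(seq[i:i + word] for i in range(len(seq) - word + 1))


# Toy protein fragment (made up); -word 2 corresponds to word=2 here.
counts = count_words("MKKLLKK", 2)
total = sum(counts.values())

for pair, n in sorted(counts.items()):
    # Observed frequency = count of this word / total number of windows.
    print(pair, n, round(n / total, 4))
```

A sequence of length 7 yields 6 overlapping pairs, so each observed frequency here is a count divided by 6.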


- Count observed frequencies of hexanucleotides from a FASTA file, save output to a text file and ignore zero counts:

`compseq {{example_dna.fasta}} -word 6 {{result2.comp}} -nozero`
Member

Suggested change
`compseq {{example_dna.fasta}} -word 6 {{result2.comp}} -nozero`
`compseq {{path/to/input_file.fasta}} -word 6 {{path/to/output_file.comp}} -nozero`


- Count observed frequencies of codons in a particular reading frame; ignoring any overlapping counts (i.e. move window across by word-length 3):

`compseq -sequence {{example_rna.fasta}} -word 3 {{result3.comp}} -nozero -frame {{1}}`
Member

Suggested change
`compseq -sequence {{example_rna.fasta}} -word 3 {{result3.comp}} -nozero -frame {{1}}`
`compseq -sequence {{path/to/input_file.fasta}} -word 3 {{path/to/output_file.comp}} -nozero -frame {{1}}`
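
The difference between overlapping counts and counting in a single reading frame can be sketched as follows. This is an illustrative Python model, not the `compseq` implementation: `count_words_in_frame` is a hypothetical helper, and it assumes that frame 1 starts at the first position of the sequence and that the window then advances by the full word length.

```python
from collections import Counter


def count_words_in_frame(sequence: str, word: int, frame: int) -> Counter:
    """Count non-overlapping words of length `word` in one reading frame.

    Assumption (not taken from EMBOSS source): frame 1 starts at the first
    character, frame 2 at the second, and so on.
    """
    seq = sequence.upper()
    start = frame - 1
    # Step by the word length so that consecutive windows never overlap.
    return Counter(
        seq[i:i + word] for i in range(start, len(seq) - word + 1, word)
    )


# Toy RNA sequence (made up) of four codons.
rna = "AUGGCCAUGUAA"
frame1 = count_words_in_frame(rna, 3, 1)  # windows: AUG, GCC, AUG, UAA
frame2 = count_words_in_frame(rna, 3, 2)  # windows: UGG, CCA, UGU

print(dict(frame1))
print(dict(frame2))
```

Under these assumptions, a 12-base sequence gives four codons in frame 1 and three in frame 2, compared with ten windows when every overlapping position is counted.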


- Count observed frequencies of codons frame-shifted by 3 positions; ignoring any overlapping counts (should report all codons except the first one):

`compseq -sequence {{example_rna.fasta}} -word 3 {{result4.comp}} -nozero -frame 3`
Member

Suggested change
`compseq -sequence {{example_rna.fasta}} -word 3 {{result4.comp}} -nozero -frame 3`
`compseq -sequence {{path/to/input_file.fasta}} -word 3 {{path/to/output_file.comp}} -nozero -frame 3`


- Count amino acid triplets in a FASTA file and compare to a previous run of `compseq` to calculate expected and normalised frequency values:

`compseq -sequence {{human_proteome.fasta}} -word 3 {{result5.comp}} -nozero -infile {{prev.comp}}`
Member

Suggested change
`compseq -sequence {{human_proteome.fasta}} -word 3 {{result5.comp}} -nozero -infile {{prev.comp}}`
`compseq -sequence {{path/to/input_file.fasta}} -word 3 {{path/to/output_file1.comp}} -nozero -infile {{path/to/output_file2.comp}}`


- Approximate the above command without a previously prepared file, by calculating expected frequencies using the single base/residue frequencies in the supplied input sequence(s):

`compseq -sequence {{human_proteome.fasta}} -word 3 {{result6.comp}} -nozero -calcfreq`
Member

Suggested change
`compseq -sequence {{human_proteome.fasta}} -word 3 {{result6.comp}} -nozero -calcfreq`
`compseq -sequence {{path/to/input_file.fasta}} -word 3 {{path/to/output_file.comp}} -nozero -calcfreq`
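
The idea behind calculating expected frequencies from single base/residue frequencies can be sketched in a few lines. The example assumes that positions are independent, so the expected frequency of a word is the product of the frequencies of its letters; the helper names and the toy sequence are hypothetical, and the exact calculation in `compseq` may differ in detail.

```python
from collections import Counter
from math import prod


def observed_word_frequency(sequence: str, target: str) -> float:
    """Fraction of overlapping windows in `sequence` that equal `target`."""
    seq, target = sequence.upper(), target.upper()
    k = len(target)
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return windows.count(target) / len(windows)


def expected_word_frequency(sequence: str, target: str) -> float:
    """Product of single-letter frequencies (independence assumption)."""
    seq = sequence.upper()
    single = Counter(seq)
    return prod(single[ch] / len(seq) for ch in target.upper())


seq = "AAAC"  # toy sequence (made up): A occurs 3/4 of the time, C 1/4
obs = observed_word_frequency(seq, "AAC")   # 1 of the 2 windows AAA, AAC
exp = expected_word_frequency(seq, "AAC")   # 0.75 * 0.75 * 0.25
ratio = obs / exp                           # observed / expected

print(obs, exp, round(ratio, 4))
```

A ratio above 1 means the word occurs more often than its letter composition alone would suggest; a ratio below 1 means it is under-represented.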

Contributor Author

Thanks for your suggestions, @sebastiaanspeck. I agree that showing it accepts a path is clearer.

However, would it still be good to have different example filenames (as opposed to always using path/to/input_file.fasta)? For example, I want to highlight that the program works equally well for amino acid and nucleotide sequences.

Member

> However, would it still be good to have different example filenames (as opposed to always using path/to/input_file.fasta)? For example, I want to highlight that the program works equally well for amino acid and nucleotide sequences.

If that clarifies the example, that is a good thing to do.

Member

@sebastiaanspeck If possible, can you update your suggestions to use the example names as the author suggests?

Member

@sebastiaanspeck sebastiaanspeck left a comment

LGTM, after the suggestions are applied

Member

@kbdharun kbdharun left a comment

LGTM, Thanks for your contribution.

Member

@sebastiaanspeck sebastiaanspeck left a comment

LGTM, after the suggestions are applied

@kamurani
Contributor Author

> LGTM, after the suggestions are applied

@sebastiaanspeck I added your suggestions, but modified them to retain information such as the sequence type (amino acid / nucleotide) in the input filenames, so that they match the examples' descriptions.

@sebastiaanspeck sebastiaanspeck merged commit 60254aa into tldr-pages:main May 15, 2024
4 checks passed
5 participants