## v5.7.1 July 16, 2023
- Minor updates to better support Docker-based execution.
- The output directory (--output_dir | -O) can be set by the user, but the working directory name is fixed within the output directory and is based on the name of the target transcriptome.
## v5.7.0 Jan 27, 2023
- compatible with hmmsearch or hmmscan output
- cleaner organization of outputs and checkpoints
- TransDecoder.LongOrfs includes option for --complete_orfs_only (as requested)
- GFF3 files no longer have the uri-encoding, so they are easier to read directly.
- misc bugfixes - corrected length in fasta header for start-refined orfs
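
For GFF3 files written by earlier releases, the uri-encoded attribute values can be decoded after the fact. The following is a minimal sketch, not part of TransDecoder: the function name and the example attribute string are made up for illustration, and it assumes the encoding is standard percent-encoding as handled by Python's `urllib.parse.unquote`.

```python
from urllib.parse import unquote


def decode_gff3_attributes(column9: str) -> dict:
    """Split a GFF3 attributes column into a dict, percent-decoding each value.

    Hypothetical helper for illustration; assumes 'key=value' pairs
    separated by ';' as in the GFF3 attributes column.
    """
    attributes = {}
    for field in column9.strip().split(";"):
        if not field:
            continue  # skip empty fields left by a trailing ';'
        key, _, value = field.partition("=")
        attributes[key] = unquote(value)
    return attributes


# Made-up attribute string, only to show the decoding step.
example = "ID=cds.t1.p1;Name=ORF%20type%3Acomplete%20len%3A120"
print(decode_gff3_attributes(example))
```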
## v5.6.0, Apr 07, 2022
- genome propagation of orfs resets orientation for unspliced transcripts with opposite strand orfs.
- speed update from Yanick Paco Hagemeijer
- added option --output_dir | -O to both TransDecoder.LongOrfs and TransDecoder.Predict, as per request, so users can point to specific output directories rather than relying on the default ( basename(target.fasta) + ".transdecoder_dir/").
- removing track name from bed output
- if selecting a single best orf, do the selection before removing overlapping preds
- updated get_longest_ORF_per_transcript.pl to match current header formatting
- really retain all blast hits, ignoring overlaps to previously selected entries, unless single_best_orf is indicated
- updated genetic code options in help menu
## v5.5.0, Oct 22, 2018
added option --output_dir | -O to both TransDecoder.LongOrfs and TransDecoder.Predict, as per request, so users can point to specific output directories rather than relying on the default ( basename(target.fasta) + ".transdecoder_dir/").
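
The default directory naming described above can be sketched as follows. This is an illustration only: the function name is hypothetical, and the logic simply mirrors the rule stated in this entry (the user-supplied directory if given, otherwise basename(target.fasta) + ".transdecoder_dir/").

```python
import os
from typing import Optional


def choose_output_dir(target_fasta: str, output_dir: Optional[str] = None) -> str:
    """Return the directory to write to, following the rule in this entry.

    Hypothetical helper: if the user supplied --output_dir | -O, use it;
    otherwise fall back to basename(target.fasta) + ".transdecoder_dir/".
    """
    if output_dir is not None:
        return output_dir
    return os.path.basename(target_fasta) + ".transdecoder_dir/"


print(choose_output_dir("/data/run1/target.fasta"))
print(choose_output_dir("/data/run1/target.fasta", output_dir="results/"))
```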
## v5.4.0, Oct 17, 2018
bugfix - earlier version was inadvertently reporting a single best orf per transcript when --single_best_orf was not invoked. Now fixed and behaves as advertised.
Also, less verbose.
Added tests for runmode validation.
Other minor updates
## v5.3.0, May 11, 2018
Removed DB_File requirement
## v5.2.0, April 21, 2018
Allows rerunning with blast and pfam results, re-executing only the steps needed to take those data into account.
The test runner now includes a small pfam and blast on-the-fly execution step as well.
## v5.1.0, Mar 27, 2018
Better support for different genetic codes as described here:
https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi
Also, UTR records should no longer be reported for cases where the coding region starts at a partial codon.
## v5.0.2, Oct 20, 2017
protein identifiers are more manageable (ex. ${transcript_acc}.p1, .p2, ..., .pn)
added example for use w/ supertranscripts
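
The identifier scheme can be illustrated with a short sketch. The function name and the example accession are hypothetical; only the `${transcript_acc}.pN` pattern comes from this entry.

```python
def protein_ids(transcript_acc: str, n_orfs: int) -> list:
    """Build identifiers of the form <transcript_acc>.p1 ... <transcript_acc>.pn."""
    return [f"{transcript_acc}.p{i}" for i in range(1, n_orfs + 1)]


# Made-up transcript accession, used only to show the pattern.
print(protein_ids("transcript_0001", 3))
```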
## v5.0.1, Sept 12, 2017
the start-adjustment can now cope with sequences containing non-{GATC} characters.
## v5.0.0 August 26, 2017
-algorithm updates: an orf must have a frame[0] score > 0 that is also the max across the first 3 reading frames (instead of all 6), and the orf with the highest frame[0] score is chosen, allowing for minimal overlap among selected predictions.
-option --single_best_only provides the single longest of the selected orfs per contig.
-long orfs unlikely to appear in random sequence are automatically selected as candidates with this minimal long orf length set dynamically according to GC content.
-orf score and blast or pfam info is propagated to gff3 output
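
One way to read the selection rule above is sketched below. This is not the TransDecoder implementation: the `Orf` type, the function names, the treatment of ties, and the `max_overlap` threshold are all assumptions made for illustration. The changelog only says that frame[0] must score > 0 and be the max of the first 3 frames, and that the highest-scoring orf is chosen with minimal overlap.

```python
from typing import List, NamedTuple, Sequence


class Orf(NamedTuple):
    """Hypothetical ORF record; frame_scores[0] is the ORF's own frame."""
    name: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    frame_scores: Sequence[float]  # six scores; only the first three are used


def passes_frame_test(orf: Orf) -> bool:
    """frame[0] score must be > 0 and the max among the first 3 frames.

    Assumption: a tie with another frame still counts as the max.
    """
    first_three = orf.frame_scores[:3]
    return first_three[0] > 0 and first_three[0] == max(first_three)


def overlap(a: Orf, b: Orf) -> int:
    """Number of positions shared by two ORFs on the same transcript."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def select_orfs(orfs: List[Orf], max_overlap: int = 0) -> List[Orf]:
    """Greedy pick by descending frame[0] score.

    max_overlap is an assumed stand-in for "minimal overlap".
    """
    candidates = sorted(
        (o for o in orfs if passes_frame_test(o)),
        key=lambda o: o.frame_scores[0],
        reverse=True,
    )
    chosen: List[Orf] = []
    for orf in candidates:
        if all(overlap(orf, kept) <= max_overlap for kept in chosen):
            chosen.append(orf)
    return chosen
```

Note that in this sketch a high score in frames 4 to 6 does not disqualify an orf, which reflects the "first 3 reading frames (instead of all 6)" wording.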
## v4.1.0
-single best orf now selected by default. If more than the single best orf is wanted, use the --all_good_orfs parameter.
-start codon refinement is now done by default. To turn it off and get the original behavior of extending to the longest orf position, use parameter: --no_refine_starts
-cdhit has been removed and replaced by our own fast method for removing redundancies.
-selection of coding regions is strictly governed by Markov-based likelihood scores across reading frames. No auto-retention of long orfs by default, but can be activated by parameter: --retain_long_orfs_length
** all v4 releases pre-v4.1 were fairly quickly retracted due to bugs and insufficient benchmarking **
## v3.0.2 release Oct 31, 2016
minor bugfix release - when checking for required utilities to be installed, doesn't require ^/ in path
## v3.0.0 release April 26, 2016
TransDecoder v3.0.0 includes the following changes:
TransDecoder.LongOrfs now includes parameter '--gene_trans_map ' as a way to retain the gene identifier information. In the case of Cufflinks and Trinity, the gene identifiers will automatically be recognized and retained. For PASA and other inputs, it is necessary to provide the gene-to-transcript identifier mappings in order to generate isoform-clustered output files (gff3).
TransDecoder.Predict now includes flag ' --single_best_orf ' to retain only the single 'best' ORF per transcript. ORFs are prioritized according to homology information (if given the blast and pfam results) and by sequence length, with longer ORFs preferred.
Codon phase information is now included in the GFF3 output files.
The .mRNA files that were generated by default for genome-free TransDecoder runs are now deprecated, but of course the .cds and .pep files are provided.
The sample data sets include examples for running TransDecoder in a few different contexts, including starting from Trinity, PASA, or Cufflinks data.
More useful logging information is provided so it's clearer how many orfs are being retained and which are being eliminated along the way.
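
The prioritization described for ' --single_best_orf ' (homology information first, then length) could be sketched as below. The `CandidateOrf` type and its fields are hypothetical, and treating homology as a simple yes/no that always outranks length is an assumption; the changelog does not say exactly how the two criteria are combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateOrf:
    """Hypothetical record for one ORF on a transcript."""
    name: str
    length: int
    has_homology: bool  # True if a blast or pfam hit was provided for it


def pick_single_best(orfs: List[CandidateOrf]) -> CandidateOrf:
    """Prefer ORFs with homology support, then the longer ORF.

    Assumption: homology always outranks length.
    """
    return max(orfs, key=lambda o: (o.has_homology, o.length))


candidates = [
    CandidateOrf("p1", length=900, has_homology=False),
    CandidateOrf("p2", length=450, has_homology=True),
    CandidateOrf("p3", length=300, has_homology=True),
]
print(pick_single_best(candidates).name)
```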
## 2016-03-11 v2.1 release
-added cpu parameter to TransDecoder.Predict
-retaining gene identifier information from cufflinks output
-added sample data and examples for the various use-cases.
## 2015-01-26 v2.0 release
-overhauled the build
-removed the active searching of Pfam and all MPI-related functionality
-runs in 2 phases:
    -TransDecoder.LongOrfs : extracts the long ORFs
    -TransDecoder.Predict : predicts the likely coding regions among the ORFs
        -step can use Pfam and blastp search results (blast support is a new addition)
        -run Pfam and/or BlastP searches directly or try using "HPC GridRunner" (http://HpcGridRunner.github.io)
-moved to github
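
The two-phase layout can be sketched as a pair of command lines built in order. This is an illustration under assumptions: the `-t` flag for the transcripts file is not stated in this changelog and should be confirmed against each tool's help output, and the function name is hypothetical. The sketch only builds the commands; it does not run them.

```python
from typing import List


def two_phase_commands(transcripts_fasta: str) -> List[List[str]]:
    """Return the two commands in the order they would be run.

    Assumption: both tools take the transcripts file via '-t'
    (not stated in this changelog; check each tool's help).
    """
    return [
        # Phase 1: extract the long ORFs.
        ["TransDecoder.LongOrfs", "-t", transcripts_fasta],
        # Phase 2: predict the likely coding regions among those ORFs.
        ["TransDecoder.Predict", "-t", transcripts_fasta],
    ]


for command in two_phase_commands("target.fasta"):
    print(" ".join(command))
```

Any Pfam or blastp searches would be run between the two phases, with their results handed to the second phase.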
## 2014-07-04
-added 'make simple' to build just the essential components involving parafly and cdhit
-removed the 'cds.' prefix from the pep and cds sequence accessions.