Inflated TE counts and masked bp in EDTA annotation after removal of part of the genome #434

Open
Nyasita opened this issue Feb 20, 2024 · 3 comments

Comments

@Nyasita

Nyasita commented Feb 20, 2024

Hi,
I used EDTA to annotate TEs in my plant genome. I first ran EDTA on all 12 chromosomes, then removed one chromosome that we suspect is something else and re-ran EDTA on the remaining 11. My concern is that the counts and the number of bp masked are inflated for the 'smaller' genome. With 11 chromosomes, I would expect the counts of particular TEs and the bp masked to be lower than what was observed with 12 chromosomes.

I ran the code with --sensitive 1 --anno 1 --evaluate 1

Below is a table showing the counts. The values in brackets are from the run with 12 chromosomes, and the highlighted values are where I observed inflated values in the smaller genome.

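[image: table of TE counts and masked bp from the 11-chromosome run, with 12-chromosome values in brackets]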

I also removed the 'suspect' chromosome from the GFF file obtained from the first run (12 chromosomes) and recomputed the results using the protocol here. These results are very different from the actual 11-chromosome run (table below; again, the values in brackets are from the run with 12 chromosomes, and the difference I mean is between the unbracketed values in the table below and those in the table above). What would explain this disparity?

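[image: table of results recomputed from the filtered 12-chromosome GFF, with 12-chromosome values in brackets]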

@oushujun
Owner

Hello,

Sorry for the delay. For the manual computation, you may want to remove the extra chromosome from the FASTA file, the GFF, and the stat file. Rerunning EDTA directly may also give you peace of mind. Please let me know if you have other thoughts.

Thanks,
Shujun
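
For readers following along, the manual computation described above (drop one sequence from the FASTA and the GFF, then recount) can be sketched in Python. This is only an illustrative sketch and not EDTA's own summary procedure: the function names are hypothetical, the stat file is not handled, and bp are counted by merging overlapping intervals within each feature type, which is an assumption that may differ from how EDTA resolves overlaps.

```python
from collections import defaultdict


def drop_seq_from_fasta(lines, seqid):
    """Yield FASTA lines, skipping the record whose ID equals seqid."""
    keep = True
    for line in lines:
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token after '>'.
            parts = line[1:].split()
            keep = (parts[0] if parts else "") != seqid
        if keep:
            yield line


def drop_seq_from_gff(lines, seqid):
    """Yield GFF lines, skipping features located on seqid (exact match)."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line  # keep comments and blank lines untouched
        elif line.split("\t", 1)[0] != seqid:
            yield line


def summarize_gff(lines):
    """Return {feature_type: (count, bp)} with overlaps merged per type."""
    intervals = defaultdict(list)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a feature line
        # Group by (type, sequence) so intervals only merge on one sequence.
        intervals[(fields[2], fields[0])].append((int(fields[3]), int(fields[4])))

    counts = defaultdict(int)
    bp = defaultdict(int)
    for (ftype, _seq), ivs in intervals.items():
        counts[ftype] += len(ivs)
        ivs.sort()
        cur_start, cur_end = ivs[0]
        for start, end in ivs[1:]:
            if start <= cur_end:  # overlaps the current merged block
                cur_end = max(cur_end, end)
            else:
                bp[ftype] += cur_end - cur_start + 1  # GFF is 1-based, inclusive
                cur_start, cur_end = start, end
        bp[ftype] += cur_end - cur_start + 1
    return {t: (counts[t], bp[t]) for t in counts}
```

Running `summarize_gff` on the original GFF and on the output of `drop_seq_from_gff` gives two summaries that can be compared against a fresh run on the reduced genome. Note that the sequence ID comparison is exact, so removing `chr12` does not touch `chr1`.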

@Nyasita
Author

Nyasita commented Mar 15, 2024

I did remove the extra chromosome from the FASTA, GFF, and stat files, and the results are as I shared above. I then reran EDTA directly, and those results are also shown in my original post. My concern is simply why the two sets of results differ.

@oushujun
Owner

I have no idea. Can you create a reproducible example for me to check on my end?
