Thanks for your workflow, it is a really useful guide for performing genome annotation.
Unfortunately, I got stuck at step "4c. Create CDS-only annotation bed file", because I only get empty files. I think I know the reason, so here is my hypothesis.
I'm using genomes from RefSeq. In most of the examples I have found, if not all, the genome annotation relies on genomes from Ensembl, which use a different notation for the sequence IDs.
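One way to check this hypothesis would be to compare the sequence IDs in the FASTA headers with the IDs in the first column of the annotation file. The following is only a minimal sketch, not part of the workflow: the function names are hypothetical, the annotation is assumed to be tab-separated GFF/GTF, and the example lines are illustrative rather than my actual data.

```python
def fasta_ids(lines):
    """Collect sequence IDs: the first token after '>' on each FASTA header line."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:  # skip headers that carry no ID at all
                ids.add(tokens[0])
    return ids


def annotation_seqids(lines):
    """Collect column 1 of a tab-separated GFF/GTF file, skipping comments and blanks."""
    ids = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        ids.add(line.split("\t")[0].strip())
    return ids


def compare_ids(fasta_lines, annotation_lines):
    """Report which sequence IDs are shared and which appear on one side only."""
    f = fasta_ids(fasta_lines)
    a = annotation_seqids(annotation_lines)
    return {"shared": f & a, "fasta_only": f - a, "annotation_only": a - f}


# Illustrative input only: a RefSeq-style header and an Ensembl-style annotation line.
example_fasta = [">NC_000001.11 example chromosome 1", "ACGTACGT"]
example_gff = ["##gff-version 3", "1\tsource\tCDS\t10\t20\t.\t+\t0\tID=cds1"]

result = compare_ids(example_fasta, example_gff)
print(result)
```

If "shared" comes back empty, the two files name the same sequences differently, and any step that joins them on the ID would be expected to produce empty output. With real files, the open file handles can be passed in place of the example lists.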
So I started by running the script WriteChromLengthBedFromFasta.py on a *_genomic.fna file from the following link. For example, that FASTA file contains only the sequence of chromosome 1. After the conversion, my new bed file looks like this:
The second column contains only 0. This output made me think that the empty files I get later come from incorrect information at this step. I say this because the other examples available contain 0 and other numbers. So my questions are:
What do you think is happening?
Has anyone tried running this with RefSeq genomes too?
Do I need to use another .fna file?
I really want to avoid switching to Ensembl genomes, because I have already done a lot of work with RefSeq data.
Any comment about what could be the source of the issue is more than welcome!