Annotation track not showing, if locus is set #1923

Open
gernophil opened this issue Dec 7, 2024 · 22 comments

Comments

@gernophil

According to the docs (and from what I've tried), filterTypes is only available for GFF3 files. Unfortunately, I have to use a GTF file. It would be great if this could be implemented for GTF as well. Alternatively, colorBy could also accept the feature type.

@jrobinso
Contributor

jrobinso commented Dec 8, 2024

OK, noted. Leave this open until I can get to it. If you could zip and post a small example GTF, along with a description of what you want to filter or color by (or both?), that would help.

@gernophil
Author

Thanks for the swift reply. Will do tomorrow. My computer is already switched off :).

@gernophil
Author

I am using this GTF file and I prepare it using these commands. Since "normal" IGV seems to have trouble with unsorted GTFs, I figured this might be the case for igvjs too:

bgzip -dc Homo_sapiens.GRCh38.112.gtf.gz > Homo_sapiens.GRCh38.112.gtf
sort -k1,1 -k4,4n -s Homo_sapiens.GRCh38.112.gtf > Homo_sapiens.GRCh38.112.sorted.gtf
bgzip -c Homo_sapiens.GRCh38.112.sorted.gtf > Homo_sapiens.GRCh38.112.sorted.gtf.gz
tabix Homo_sapiens.GRCh38.112.sorted.gtf.gz

With this I use this HTML code:

<div id="igv_div">

    <script type="module">
    
        import igv from "https://cdn.jsdelivr.net/npm/[email protected]/dist/igv.esm.min.js"
    
        const div = document.getElementById("igv_div")
    
        const options =
        {
        reference:
            {
            "id": "hg38",
            "fastaURL": "{refgenome}",
            "indexURL": "{refgenemefai}",
            "locus": "{locus}",
            "tracks":
                [
                    {
                    "type": "annotation",
                    "format": "gtf",
                    "name": "annotations",
                    "url": "{gtffile}",
                    "indexURL": "{gtffiletbi}",
                    "colorBy": "transcript_biotype",
                    "colorTable":
                        {
                        "protein_coding": "magenta",
                        "*": "lightgrey",
                        }
                    }
                ]
            }
        }
    
        const browser = await igv.createBrowser(div, options)
    
    </script>
</div>

This highlights all protein_coding transcripts, but what I actually want is to show only CDS features (column 3, the feature type, == "CDS"). Ideally, I would show only those that also have a gene_name (HGNC gene symbol), since some CDS entries lack one because they come from a different source.

@gernophil
Author

Any ideas?

@jrobinso jrobinso transferred this issue from igvteam/igv.js-docs Dec 12, 2024
@jrobinso
Contributor

This sort of filtering isn't possible in igv.js. The "filterTypes" is a list of types to filter out, that is the only filtering available.
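Since "filterTypes" lists the types to hide, showing only one type means listing every other type present in the file. The following is a minimal sketch of that idea, not a confirmed recipe: the helper name `filterTypesExcept` is hypothetical, the list of column 3 values is an assumption about the Ensembl GTF, and whether igv.js still draws CDS rows once their parent transcript and exon rows are hidden is not verified here. The gene_name condition cannot be expressed this way.

```javascript
// Hypothetical helper: "filterTypes" hides the listed types, so to show only
// some types we pass every other type that occurs in the file.
function filterTypesExcept(allTypes, keepTypes) {
    const keep = new Set(keepTypes)
    return allTypes.filter(type => !keep.has(type))
}

// Assumed list of column 3 values. Check your own file, for example with:
//   grep -v '^#' file.gtf | cut -f3 | sort -u
const ensemblTypes = ["gene", "transcript", "exon", "CDS", "start_codon",
    "stop_codon", "five_prime_utr", "three_prime_utr", "Selenocysteine"]

// Track definition using the placeholders from the HTML posted above.
const track = {
    type: "annotation",
    format: "gtf",
    name: "annotations",
    url: "{gtffile}",
    indexURL: "{gtffiletbi}",
    filterTypes: filterTypesExcept(ensemblTypes, ["CDS"])
}

console.log(track.filterTypes)
```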

@gernophil
Author

Thanks for the info. Guess the easiest would then maybe be to simply filter the GTF file to only contain what I want to see :).
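Pre-filtering the GTF as suggested here could look roughly like the sketch below. It keeps comment lines plus rows whose column 3 is "CDS" and whose column 9 contains a gene_name attribute. The function names `keepLine` and `filterGtf` are hypothetical, and the sample rows are invented for illustration. For the full Ensembl file, reading line by line rather than loading the whole text would be more practical, and the sort, bgzip, and tabix steps still apply to the filtered output.

```javascript
// Decide whether a single GTF line should be kept.
function keepLine(line) {
    if (line.startsWith("#")) return true        // header/comment lines
    const cols = line.split("\t")
    if (cols.length < 9) return false            // not a complete GTF row
    const isCds = cols[2] === "CDS"              // column 3: feature type
    const hasGeneName = /(^|;)\s*gene_name "[^"]+"/.test(cols[8])
    return isCds && hasGeneName
}

// Filter GTF text held in memory and return the kept lines.
function filterGtf(text) {
    return text
        .split("\n")
        .filter(line => line.length > 0 && keepLine(line))
        .join("\n") + "\n"
}

// Invented sample rows (tab-separated), for illustration only.
const sample = [
    "#!genome-build GRCh38",
    '17\thavana\tCDS\t100\t200\t.\t-\t0\tgene_id "G1"; gene_name "TP53";',
    '17\thavana\texon\t100\t200\t.\t-\t.\tgene_id "G1"; gene_name "TP53";',
    '17\tensembl\tCDS\t300\t400\t.\t+\t0\tgene_id "G2";'
].join("\n")

console.log(filterGtf(sample))
```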

By the way, I am seeing somewhat inconsistent behavior when loading the annotation track with the above code. At some loci (like B2M) it loads, but at others (like TP53 or KRAS) it doesn't. Any idea why that is happening? It can't be a general code issue, since it works in some cases. Could it be a caching issue somehow?

@jrobinso
Contributor

It's probably either a problem with the file or the index.

@jrobinso jrobinso reopened this Dec 12, 2024
@gernophil
Author

Hmm it's the official Ensembl file. Maybe I'll try indexing without sorting, but I think sorting is necessary for indexing.

@jrobinso
Contributor

I can't reproduce the original issue posted; "filterTypes" should work for both GFF and GTF. Do you have a reproducible test case?

@gernophil
Author

I'll send one tomorrow (it's midnight here right now :D)

@gernophil
Author

gernophil commented Dec 13, 2024

It seems to depend on the starting locus. I'm not sure if you are familiar with shiny, but that's the context I use. I made a little script showing this. Just put the app.py and the two HTML files in the same folder, and put the resource files in the folder defined by resources_dir = Path(__file__).parent / "resources" (and make sure the paths are correct). If you run the py-shiny app with shiny run app.py, it should open two igvjs views: one without a starting location and one with. The one with a starting location doesn't show the annotation track; the one without shows it if you manually navigate to the same location. I provide the location like this: "locus": "17:7661921-7676610" (using "locus": "17:7,661,921-7,676,610" also doesn't work).

Archiv.zip

@gernophil gernophil changed the title Use filterTypes for GTF files Annotation track not showing, if locus is set Dec 13, 2024
@gernophil
Author

Here are some screenshots:
directly after starting the app:
Bildschirmfoto 2024-12-13 um 12 46 58
and after manually navigating:
Bildschirmfoto 2024-12-13 um 12 47 09

@jrobinso
Contributor

Hi, sorry I don't have time to debug your app, or learn shiny. Also, you didn't attach any files. If you can create a zip with the gtf data file and index I will look into it further with the igv.js html you posted. I shouldn't need the fastas.

All of our example files set an initial locus without issue. Also, the current release is 3.1.0.

@gernophil
Author

No need to debug the app; I just use it as a workaround to start a web server, since JavaScript doesn't load local files. I just wanted to make it easy. You would just need to put the three files in a folder and run the shiny app. However, you can simply check the screenshots :). The problem is not that the locus is not set. Setting the locus works, but in the version where I set the locus, the annotation track is not shown. If I don't set the locus but navigate there manually, the annotation track is shown (see screenshot). So the GTF and fai seem to be fine, but I'll generate a zip file later.

@jrobinso
Contributor

The screenshots don't tell me anything. I don't understand the steps you are taking. I have access to GTF files if you are confident it's not the file. What I need is exact steps to reproduce, as in: (1) start with this html, (2) navigate..., whatever.

@jrobinso
Contributor

If you don't set an initial locus, it should start at whole genome view. You won't see any annotations there; you see the message "Zoom in to see features". That is normal. Your screenshot from "after manually navigating" shows 2 igv views, so I'm not sure what is being illustrated there.

@jrobinso
Contributor

And what happened to the "filterTypes" issue? That is what I was referring to with "cannot reproduce". We seem to be discussing something else now.

@jrobinso
Contributor

Here's a session that is equivalent to yours, it starts with no locus set. What steps do I take from here to reproduce what you are seeing? https://tinyurl.com/2d4vltgq

The session json follows

{
	"version": "3.1.1",
	"reference": {
		"id": "hg38",
		"name": "Human (GRCh38/hg38)",
		"cytobandURL": "https://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/cytoBandIdeo.txt.gz",
		"aliasURL": "https://igv.org/genomes/data/hg38/hg38_alias.tab",
		"twoBitURL": "https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.2bit",
		"chromSizesURL": "https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes",
		"chromosomeOrder": "chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY"
	},
	"tracks": [
		{
			"name": "Gencode v44",
			"type": "annotation",
			"format": "gff3",
			"url": "https://igv-genepattern-org.s3.amazonaws.com/genomes/hg38/gencode.v44.basic.annotation.gff3.gz",
			"indexURL": "https://igv-genepattern-org.s3.amazonaws.com/genomes/hg38/gencode.v44.basic.annotation.gff3.gz.tbi"
		}
	]
}

@gernophil
Author

Ok, first let's go back to the screenshots:
Both screenshots show two instances of igvjs. The upper one is created using the HTML code from igvjs_minimal_locus.html; the lower one is created using the HTML code from igvjs_minimal.html. igvjs_minimal_locus.html is exactly the same code as igvjs_minimal.html, just with an added line "locus": "17:7661921-7676610", between indexURL and tracks.

First screenshot: Initially the upper panel (with locus) shows the correct locus (but no annotations track is loaded and the track stays empty). The lower panel (without locus) shows the whole genome and of course also no annotation track.

Second screenshot: If I now navigate in the lower panel (initially without locus) to the exact same locus of the upper panel the lower panel shows the same region as the upper, but it also shows the annotation track that the upper panel (unchanged) fails to load.
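One difference worth checking, offered as a sketch and not as a confirmed fix: the HTML posted earlier in this thread nests "locus" and "tracks" inside "reference", whereas the maintainer's session JSON keeps "tracks" at the top level, and the igv.js documentation describes "locus" as a browser-level option. The helper below, with the hypothetical name `hoistLocusAndTracks`, moves both keys up one level without modifying the input. Whether this changes the behaviour reported here is not established in the thread.

```javascript
// Hypothetical helper: return a copy of the options with "locus" and "tracks"
// moved from inside "reference" to the top level. The input is not modified.
function hoistLocusAndTracks(options) {
    const { locus, tracks, ...reference } = options.reference
    const result = { ...options, reference }
    if (locus !== undefined) result.locus = locus
    if (tracks !== undefined) {
        // Keep any tracks that were already at the top level.
        result.tracks = [...(options.tracks || []), ...tracks]
    }
    return result
}

// Shape of the posted configuration, with the placeholders from the thread.
const posted = {
    reference: {
        id: "hg38",
        fastaURL: "{refgenome}",
        indexURL: "{refgenemefai}",
        locus: "17:7661921-7676610",
        tracks: [{ type: "annotation", format: "gtf", name: "annotations",
                   url: "{gtffile}", indexURL: "{gtffiletbi}" }]
    }
}

const adjusted = hoistLocusAndTracks(posted)
console.log(JSON.stringify(adjusted, null, 2))
```

The adjusted object could then be passed to igv.createBrowser(div, adjusted) in place of the original options.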

To reproduce it, I put the three files from the code blocks above into a zip file containing an app.py, an igvjs_minimal.html, and an igvjs_minimal_locus.html. You can find this file here:
Archiv.zip

If you want to try shiny (I totally understand if that's too inconvenient for you), just put those in a folder, install Python and the shiny Python package, and download these FASTA and GTF files:
FASTA: ftp://ftp.ensembl.org/pub/release-112/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
GTF: ftp://ftp.ensembl.org/pub/release-112/gtf/homo_sapiens/Homo_sapiens.GRCh38.112.gtf.gz

# downloading and unzipping commands
mkdir -p "resources/Ensembl/release-112/fasta/homo_sapiens/dna/"
curl --output-dir "resources/Ensembl/release-112/fasta/homo_sapiens/dna/" -O "ftp://ftp.ensembl.org/pub/release-112/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz"
gunzip "resources/Ensembl/release-112/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz"

mkdir -p "resources/Ensembl/release-112/gtf/homo_sapiens/"
curl --output-dir "resources/Ensembl/release-112/gtf/homo_sapiens/" -O "ftp://ftp.ensembl.org/pub/release-112/gtf/homo_sapiens/Homo_sapiens.GRCh38.112.gtf.gz"
cd "resources/Ensembl/release-112/gtf/homo_sapiens/"
bgzip -dc Homo_sapiens.GRCh38.112.gtf.gz > Homo_sapiens.GRCh38.112.gtf
sort -k1,1 -k4,4n -s Homo_sapiens.GRCh38.112.gtf > Homo_sapiens.GRCh38.112.sorted.gtf
bgzip -c Homo_sapiens.GRCh38.112.sorted.gtf > Homo_sapiens.GRCh38.112.sorted.gtf.gz
tabix Homo_sapiens.GRCh38.112.sorted.gtf.gz

After that you can run it with shiny run app.py. I totally understand if that's too inconvenient. It should also work if you simply load the HTML files, but then you would need to host the data files somewhere. Unfortunately, they are too big to upload here.

Your example is using a GFF3 file, and I cannot reproduce the problem using this GFF3. Do you know of an online-hosted GTF with an index, maybe?

I am sorry about the originally intended "filterTypes" issue. After you mentioned that this type of filtering isn't possible, I dropped the idea. I'm not sure if I still have a code snippet for it. And sorry, I should have started another thread instead of simply continuing here. I'd say let's keep this thread to the annotations issue, and I'll try to find the code for the filtering and come back to it in another thread once we have figured this one out :).

@jrobinso
Contributor

OK, I'll download that GTF and host it somewhere temporarily. This will take a little time. I am starting to suspect your server; it is highly unlikely that there will be any difference between GFF3 and GTF.

@jrobinso
Contributor

Also, be sure you are using the latest igv.js; otherwise we might be wasting our time. The latest version is 3.1.1.

@jrobinso
Contributor

I can't reproduce your issue with the GTF. This is not surprising, since there is no real difference between GTF and GFF other than how column 9 is parsed. The session json is below; the GTF is hosted on Dropbox.

{
	"version": "3.1.1",
	"reference": {
		"id": "hg38",
		"name": "Human (GRCh38/hg38)",
		"cytobandURL": "https://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/cytoBandIdeo.txt.gz",
		"aliasURL": "https://igv.org/genomes/data/hg38/hg38_alias.tab",
		"twoBitURL": "https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.2bit",
		"chromSizesURL": "https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes",
		"chromosomeOrder": "chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY"
	},
	"tracks": [
		{
			"url": "https://www.dropbox.com/scl/fi/lrf4e7i9pxm4q8beemege/Homo_sapiens.GRCh38.112.sorted.gtf.gz?rlkey=s549yiir2p7ufggtyiqy8sqmg&dl=0",
			"indexURL": "https://www.dropbox.com/scl/fi/1i73prk4hj9g1ksvpsugj/Homo_sapiens.GRCh38.112.sorted.gtf.gz.tbi?rlkey=aq1dk1b683wulvhiu5wsdneew&dl=0",
			"name": "Homo_sapiens.GRCh38.112.sorted.gtf.gz",
			"format": "gtf",
			"type": "annotation"
		}
	]
}
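The column 9 difference mentioned in this comment can be sketched as follows. GTF writes attributes as `key "value";` pairs, while GFF3 writes `key=value` pairs separated by semicolons with URL-style escaping. The function names `parseGtfAttributes` and `parseGff3Attributes` are hypothetical and this is not igv.js's own parser; repeated keys simply keep their last value, and malformed percent escapes in GFF3 input are not handled.

```javascript
// GTF column 9, e.g.: gene_id "ENSG1"; gene_name "TP53"; exon_number 1;
function parseGtfAttributes(column9) {
    const attributes = {}
    // A key, whitespace, then either a quoted value or a bare token.
    const pattern = /([^\s;]+)\s+(?:"([^"]*)"|([^\s;]+))/g
    for (const match of column9.matchAll(pattern)) {
        attributes[match[1]] = match[2] !== undefined ? match[2] : match[3]
    }
    return attributes
}

// GFF3 column 9, e.g.: ID=gene:ENSG1;Name=TP53;biotype=protein_coding
function parseGff3Attributes(column9) {
    const attributes = {}
    for (const pair of column9.split(";")) {
        const idx = pair.indexOf("=")
        if (idx < 0) continue                     // skip empty or malformed pairs
        const key = pair.slice(0, idx).trim()
        attributes[key] = decodeURIComponent(pair.slice(idx + 1).trim())
    }
    return attributes
}

console.log(parseGtfAttributes('gene_id "ENSG1"; gene_name "TP53"; exon_number 1;'))
console.log(parseGff3Attributes("ID=gene:ENSG1;Name=TP53;biotype=protein_coding"))
```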
