Pros!

I have a visualization project that renders biological data to canvas charts, for which I use a JavaScript framework called igv.js (see its API docs) to generate the canvas.

Here’s a simple config demo:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>IGV Data Vis</title>
    <link rel="stylesheet" href="source/jquery-ui.css">
    <link rel="stylesheet" href="source/font-awesome.min.css">
    <link rel="stylesheet" href="source/igv-1.0.1.css">
    <script src="source/jquery.min.js"></script>
    <script src="source/jquery-ui.min.js"></script>
    <script src="source/igv-1.0.1.js"></script>
</head>
<body>
    <div id="container"></div>

    <script>
        let options = {
                palette: ["#00A0B0", "#6A4A3C", "#CC333F", "#EB6841"],
                locus: "7:55,085,725-55,276,031",

                reference: {
                    id: "hg19",
                    fastaURL: "//igv.broadinstitute.org/genomes/seq/1kg_v37/human_g1k_v37_decoy.fasta",
                    cytobandURL: "//igv.broadinstitute.org/genomes/seq/b37/b37_cytoband.txt"
                },

                trackDefaults: {
                    bam: {
                        coverageThreshold: 0.2,
                        coverageQualityWeight: true
                    }
                },

                tracks: [
                    {
                        name: "Genes",
                        url: "//igv.broadinstitute.org/annotations/hg19/genes/gencode.v18.collapsed.bed",
                        index: "//igv.broadinstitute.org/annotations/hg19/genes/gencode.v18.collapsed.bed.idx",
                        displayMode: "EXPANDED",
                        height: 350,
                        color: '#ff0000'
                    }
                ]
            };

        let browser = igv.createBrowser(document.getElementById('container'), options);
    </script>
</body>
</html>

Each item of tracks in the code above points to a bioinformatics data file, which can be either a plain-text file or a binary file (*.bam).

The problem is that these files are so terribly large that I cannot load them directly, let alone send them to clients. For example:

  • .bam: approximately 3 GB
  • .vcf: approximately 1 GB

So, are there any back-end solutions that make those files accessible piece by piece, the way AJAX loads data on demand?

Any suggestions will be appreciated!

Timur Shtatland
1Cr18Ni9

3 Answers

1

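It depends on what you mean by 'piece by piece'.

BAM files and bgzip-compressed VCF files use the BGZF (bgzip) format, which supports random access when the file is accompanied by an index (.bai for BAM, .tbi for tabix-indexed files). This works even over the web, as long as the hosting server supports HTTP range requests (the 'Range:' request header).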

$ tabix "http://igv.broadinstitute.org/annotations/hg19/genes/gencode.v18.collapsed.bed.gz" "1:40723778-40759856"

1   40723778    40759856    ZMPSTE24    1000.0  +   40723778    40759856    .   17  288,159,156,183,147,72,87,51,117,153,142,185,105,353,144,1740,177,  0,129,132,1243,2732,4727,9679,9679,10312,11868,13787,23236,27818,32538,32747,34338,34338,
1   40728343    40728656    RP1-39G22.4 1000.0  -   40728343    40728656    .   1   313,    0,
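The same kind of partial access can be requested from JavaScript. The following is only a minimal sketch, assuming the server honours range requests; `buildRangeHeader` and `fetchSlice` are hypothetical helper names for illustration, not part of igv.js or tabix, and the offsets would normally come from the index file.

```javascript
// Hypothetical helper: build the value of an HTTP Range header
// for `length` bytes starting at byte offset `start`.
function buildRangeHeader(start, length) {
  if (!Number.isInteger(start) || !Number.isInteger(length) || start < 0 || length <= 0) {
    throw new RangeError("start must be an integer >= 0 and length an integer > 0");
  }
  // The end offset of a byte range is inclusive.
  return `bytes=${start}-${start + length - 1}`;
}

// Hypothetical helper: fetch one slice of a remote file.
// `url` is a placeholder for wherever the .bam or .vcf.gz file is hosted.
async function fetchSlice(url, start, length) {
  const response = await fetch(url, {
    headers: { Range: buildRangeHeader(start, length) },
  });
  // 206 Partial Content means the range was honoured;
  // 200 would mean the server ignored it and is sending the whole file.
  if (response.status !== 206) {
    throw new Error(`Range request not honoured (status ${response.status})`);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

Checking for status 206 matters here: a server that ignores the header would otherwise quietly start streaming the full 3 GB file.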

For bioinformatics questions, you can also ask on biostars.org.

Pierre
0

This question is too broad. There are many ways to read a file in pieces. PHP has many functions for working with files, such as fseek or fgets. Rather than transferring 3 GB of data to the user, do the necessary calculations on your back end.

Using an image library (gd2?), you can render the image from the genome file on your server, so there is no need to transfer a huge amount of data to the client.
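To stay with the JavaScript used in the question, here is a rough Node.js equivalent of the seek-and-read idea. It is a sketch only: `readSlice` is a hypothetical helper name, and in practice the offset and length would come from an index or from the client's request.

```javascript
const fs = require("fs");

// Hypothetical helper: read up to `length` bytes starting at byte offset
// `start`, without loading the rest of the file into memory.
function readSlice(filePath, start, length) {
  const fd = fs.openSync(filePath, "r");
  try {
    const buffer = Buffer.alloc(length);
    // The last argument is the file position, which plays the role of fseek.
    const bytesRead = fs.readSync(fd, buffer, 0, length, start);
    // Fewer bytes come back if the slice runs past the end of the file.
    return buffer.subarray(0, bytesRead);
  } finally {
    fs.closeSync(fd);
  }
}
```

Only the requested bytes are read from disk, so the cost depends on the slice size rather than on the size of the file.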

shukshin.ivan
0

Yes. The BAM format describes the read alignment details for the whole genome, so it is very large. The VCF format describes the SNPs across the whole genome and their respective annotations.