The package provides utilities to parse and query commonly used genomic file formats. Genomic files are usually indexed (BigWig, BigBed, Tabix, etc.), so the library reads only the bytes of the file needed to answer a query. The library also works with remotely hosted files, provided the server hosting the file supports HTTP byte-range requests.
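To illustrate what an HTTP byte-range request looks like, here is a minimal sketch using only the Python standard library. It is not taken from this package: the helper names `make_range_request` and `read_bytes` are hypothetical, and the sketch only assumes that the server answers a `Range` header with status 206 (Partial Content).

```python
import urllib.request


def make_range_request(url, offset, length):
    """Build a GET request asking for `length` bytes starting at `offset`.

    Hypothetical helper for illustration; not part of this package.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    last_byte = offset + length - 1  # byte positions in a Range header are inclusive
    return urllib.request.Request(
        url, headers={"Range": f"bytes={offset}-{last_byte}"}
    )


def read_bytes(url, offset, length):
    """Fetch only the requested slice of a remote file."""
    with urllib.request.urlopen(make_range_request(url, offset, length)) as resp:
        # A server that honors the Range header replies with 206 Partial Content.
        if resp.status != 206:
            raise RuntimeError("server does not appear to support byte-range requests")
        return resp.read()
```

A server that ignores the header typically returns the whole file with status 200, which is why the sketch checks for 206 before trusting the response.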
Note
The package is open source and is available on GitHub.
Installation
Using PyPI
The package will be available on PyPI soon. For now, install it from GitHub as described below.
From GitHub (devel version)
To install the devel version from GitHub, use pip:
pip install git+https://github.com/epiviz/epivizFileParser.git
Alternatively, clone the repository and install from the local directory using pip.
Note
Depending on how Python was set up, installing packages may sometimes require sudo permission. In that case, add the --user option to install into your user directory instead:
pip install --user git+https://github.com/epiviz/epivizFileParser.git
Usage
For example, to read a BigWig file:
from epivizFileParser import BigWig
# initialize a file
bw = BigWig("tests/test.bw")
# extract header and zoom levels from the file
print(bw.header, bw.zooms)
# query the file
res, err = bw.getRange(chr="chr1", start=10000000, end=10020000)
print(res)
# summarize the data into equal-sized windows/bins
sres = bw.bin_rows(res, chr="chr1", start=10000000, end=10020000, columns=['score'], bins=10)
print(sres)
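To make the idea of summarizing a queried range into equal windows concrete, here is a minimal pure-Python sketch. It does not reproduce how `bin_rows` is implemented; the function `bin_scores` is hypothetical, and it assumes each row is a half-open `(start, end, score)` interval and that a bin is summarized by the overlap-weighted mean of the scores it covers.

```python
def bin_scores(rows, start, end, bins):
    """Summarize (row_start, row_end, score) intervals into equal-width bins.

    Hypothetical illustration only; not the package's bin_rows implementation.
    Returns a list of (bin_start, bin_end, mean_score) tuples, where mean_score
    is None for bins that no row overlaps.
    """
    rows = list(rows)  # rows are scanned once per bin
    width = (end - start) / bins
    result = []
    for i in range(bins):
        b_start = start + i * width
        b_end = b_start + width
        covered = 0.0   # number of bases in this bin that have a score
        weighted = 0.0  # sum of score * overlapping bases
        for r_start, r_end, score in rows:
            overlap = min(r_end, b_end) - max(r_start, b_start)
            if overlap > 0:
                covered += overlap
                weighted += overlap * score
        mean = weighted / covered if covered else None
        result.append((b_start, b_end, mean))
    return result


# Two adjacent intervals summarized into two bins and into a single bin.
rows = [(0, 50, 2.0), (50, 100, 4.0)]
print(bin_scores(rows, 0, 100, 2))
print(bin_scores(rows, 0, 100, 1))
```

Scanning every row for every bin keeps the sketch short; for large query results, a single pass over rows sorted by start position would be a more efficient choice.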