GenomicIO.py - Subroutines for working on I/O of large genomic files

Author:
Release:	$Id$
Date:	December 09, 2013
Tags:	Python

I tried the Biopython parser, but it was too slow for large genomic chunks.

GenomicIO.index_file(filenames, db_name)

Index one or more files.

Two new files are created: db_name.fasta and db_name.idx.
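
The on-disk layout of db_name.fasta and db_name.idx is not described here. As a rough illustration of why an index makes fragment retrieval fast, the sketch below flattens the sequences into one string and records an (offset, length) pair per sequence. The names build_offset_index and fetch are hypothetical and are not part of GenomicIO; the real module may store its index quite differently.

```python
def build_offset_index(fasta_lines):
    """Flatten FASTA records and record where each sequence starts.

    Hypothetical helper, not the GenomicIO format: returns the
    concatenated sequence data and a dict {identifier: (offset, length)}.
    """
    index = {}
    chunks = []
    name, start, pos = None, 0, 0
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Close the previous record before starting a new one.
            if name is not None:
                index[name] = (start, pos - start)
            fields = line[1:].split()
            name = fields[0] if fields else ""
            start = pos
        elif name is not None:
            chunks.append(line)
            pos += len(line)
    if name is not None:
        index[name] = (start, pos - start)
    return "".join(chunks), index


def fetch(flat, index, name, first, last):
    """Return the 0-based, half-open slice [first, last) of a sequence."""
    start, length = index[name]
    if not 0 <= first <= last <= length:
        raise ValueError("coordinates out of range for %s" % name)
    return flat[start + first:start + last]


flat, index = build_offset_index(
    [">chr1 example", "ACGT", "ACGT", ">chr2", "GGCC"])
fragment = fetch(flat, index, "chr2", 1, 3)
```

With such an index, a fragment is read by seeking directly to its offset, so the cost does not depend on how much sequence precedes it in the file.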

GenomicIO.index_exists(filename)

Check whether a file has been indexed.

GenomicIO.getSequence(db_name, sbjct_token, sbjct_strand, sbjct_from, sbjct_to, as_array=False, forward_coordinates=False)

Get a genomic fragment.

GenomicIO.splitFasta(infile, chunk_size, dir='/tmp', pattern=None)

Split a FASTA file into a set of smaller files.

If pattern is not given, random file names are chosen.
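
The splitting behaviour can be sketched roughly as follows. This is not the module's code: split_fasta_sketch is a hypothetical name, and two details are assumptions, namely that chunk_size counts sequences per output file and that pattern is a printf-style template containing one integer field (for example "chunk_%i.fasta"). Random names are produced with tempfile.mkstemp.

```python
import os
import tempfile


def split_fasta_sketch(lines, chunk_size, directory, pattern=None):
    """Write FASTA lines into files holding chunk_size sequences each.

    Assumptions (not confirmed by the GenomicIO docs): chunk_size is a
    number of sequences, and pattern takes the chunk number via '%'.
    Returns the list of file names written.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    filenames = []
    out = None
    nseq = 0
    try:
        for line in lines:
            if line.startswith(">"):
                # Start a new output file every chunk_size sequences.
                if nseq % chunk_size == 0:
                    if out is not None:
                        out.close()
                    if pattern is not None:
                        path = os.path.join(
                            directory, pattern % len(filenames))
                        out = open(path, "w")
                    else:
                        fd, path = tempfile.mkstemp(
                            suffix=".fasta", dir=directory)
                        out = os.fdopen(fd, "w")
                    filenames.append(path)
                nseq += 1
            # Lines before the first header are ignored.
            if out is not None:
                out.write(line if line.endswith("\n") else line + "\n")
    finally:
        if out is not None:
            out.close()
    return filenames
```

Passing an open file object as lines streams the input, so the whole file never has to be held in memory.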

GenomicIO.getConverter(format)

Return a converter function that converts coordinates from various schemes into 0-based, both-strand, closed-open ranges.

Converter functions take the parameters x, y, s and l: x and y are the coordinates of a sequence fragment, s is the strand (True is positive) and l is the length of the contig.
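
As an illustration of the (x, y, s, l) convention, here is a minimal sketch of two converters and a lookup function. The format names "one-forward-closed" and "zero-both-open", and the function names, are assumptions made for this example; the list of formats actually accepted is not given here. "Both strand" is read as meaning that coordinates are counted from the start of whichever strand the fragment lies on.

```python
def one_forward_closed(x, y, s, l):
    """1-based, forward-strand, closed [x, y] to the target scheme."""
    x -= 1                       # 1-based closed start to 0-based start
    if not s:
        # Re-express forward-strand coordinates on the reverse strand.
        x, y = l - y, l - x
    return x, y


def zero_both_open(x, y, s, l):
    """Input is already 0-based, both-strand, closed-open: no change."""
    return x, y


# Hypothetical format names, chosen for this sketch only.
CONVERTERS = {
    "one-forward-closed": one_forward_closed,
    "zero-both-open": zero_both_open,
}


def get_converter_sketch(fmt):
    """Look up a converter by format name."""
    try:
        return CONVERTERS[fmt]
    except KeyError:
        raise ValueError("unknown coordinate format: %s" % fmt)


convert = get_converter_sketch("one-forward-closed")
plus = convert(1, 10, True, 100)     # first ten bases, positive strand
minus = convert(1, 10, False, 100)   # same bases seen from the negative strand
```

In this sketch the first ten forward-strand bases of a 100-base contig become the last ten positions when expressed on the negative strand.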
