IOTools - tools for I/O operations

Author: Andreas Heger
Release: $Id$
Date: December 09, 2013
Tags: Python

Code

IOTools.readMap(infile, columns=(0, 1), map_functions=(<type 'str'>, <type 'str'>), both_directions=False, has_header=False)

read a map (pairs of values) from infile. returns a hash.

Use map functions to convert elements. If both_directions is set to true, both mapping directions are returned.
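As a rough illustration of the behaviour described above, the following sketch reads two tab-separated columns into a dictionary. It is not the IOTools implementation: the name read_map_sketch, the skipping of comment and blank lines, and the handling of duplicate keys (last value wins) are assumptions made for this example, and it is written for Python 3.

```python
import io


def read_map_sketch(infile, columns=(0, 1), map_functions=(str, str),
                    both_directions=False, has_header=False):
    """Hypothetical stand-in for readMap: build a dict from two columns."""
    forward, backward = {}, {}
    key_col, val_col = columns
    key_fn, val_fn = map_functions
    first = True
    for line in infile:
        if line.startswith("#") or not line.strip():
            continue  # assumption: comment and blank lines are skipped
        if has_header and first:
            first = False
            continue  # drop the header row
        first = False
        fields = line.rstrip("\n").split("\t")
        key, val = key_fn(fields[key_col]), val_fn(fields[val_col])
        forward[key] = val
        backward[val] = key
    return (forward, backward) if both_directions else forward


demo = io.StringIO("gene\tlength\ngeneA\t100\ngeneB\t250\n")
lengths = read_map_sketch(demo, map_functions=(str, int), has_header=True)
```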

IOTools.readList(infile, column=0, map_function=<type 'str'>, map_category={}, with_title=False)

read a list of values from infile.

Use map_function to convert values. Use map_category to map the values read directly. If with_title is set, the first line is assumed to be a title.

IOTools.ReadList(infile, column=0, map_function=<type 'str'>, map_category={})

read a list of values from infile.

Use map_function to convert values. Use map_category to map the values read directly.

IOTools.readMultiMap(infile, columns=(0, 1), map_functions=(<type 'str'>, <type 'str'>), both_directions=False, has_header=False, dtype=<type 'dict'>)

read a map (pairs of values) from infile. returns a hash.

Use map functions to convert elements. If both_directions is set to true, both mapping directions are returned. This function supports n:n matches, so a key can map to several values.
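To make the n:n case concrete, here is a minimal sketch that collects every value seen for a key into a list. The function name read_multi_map_sketch and the choice of a list as the container are assumptions for this example; it does not reproduce the dtype, both_directions or has_header options, and it is written for Python 3.

```python
import io
from collections import defaultdict


def read_multi_map_sketch(infile, columns=(0, 1), map_functions=(str, str)):
    """Hypothetical stand-in for readMultiMap: collect all values per key."""
    result = defaultdict(list)
    for line in infile:
        if line.startswith("#") or not line.strip():
            continue  # assumption: comment and blank lines are skipped
        fields = line.rstrip("\n").split("\t")
        key = map_functions[0](fields[columns[0]])
        val = map_functions[1](fields[columns[1]])
        result[key].append(val)
    return dict(result)


demo = io.StringIO("geneA\tGO:1\ngeneA\tGO:2\ngeneB\tGO:1\n")
annotations = read_multi_map_sketch(demo)
```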

IOTools.readTable(file, separator='\t', numeric_type=<type 'float'>, take='all', headers=True, truncate=None, cumulate_out_of_range=True)

read a table of values. There is probably a routine for this in NumPy, which I haven’t found yet.

If cumulate_out_of_range is set to true, the terminal bins will contain the cumulative values of bins out of range.

IOTools.writeTable(outfile, table, columns=None, fillvalue='')

write a table to outfile.

If table is a dictionary, output columnwise. If columns is a list, output only the listed columns, in the specified order.

IOTools.readMatrix(infile, dtype=<type 'float'>)

read a numpy matrix from infile.

return tuple of matrix, row_headers, col_headers

IOTools.writeMatrix(outfile, matrix, row_headers, col_headers, row_header='')

write a numpy matrix to outfile.

row_header gives the title of the rows

IOTools.getInvertedDictionary(dict, make_unique=False)

returns an inverted dictionary with keys and values swapped.
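The effect of make_unique is not described here. The sketch below assumes that with make_unique each value maps back to a single key, and that otherwise each value maps to a list of all keys that carried it. Both the name invert_dictionary_sketch and that interpretation are assumptions; the code is written for Python 3.

```python
def invert_dictionary_sketch(d, make_unique=False):
    """Hypothetical stand-in for getInvertedDictionary."""
    inverted = {}
    for key, value in d.items():
        if make_unique:
            inverted[value] = key  # assumption: a later key overwrites
        else:
            inverted.setdefault(value, []).append(key)
    return inverted


inv = invert_dictionary_sketch({"a": 1, "b": 1, "c": 2})
unique = invert_dictionary_sketch({"a": 1, "c": 2}, make_unique=True)
```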

IOTools.readSequence(file)

read sequence from a fasta file.

returns a tuple with description and sequence
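A minimal sketch of reading one FASTA record into a (description, sequence) tuple follows. It assumes that only the first record is read and that the leading ">" is stripped from the description; neither detail is stated above, and read_sequence_sketch is a hypothetical name. Written for Python 3.

```python
import io


def read_sequence_sketch(infile):
    """Hypothetical stand-in for readSequence: first FASTA record only."""
    description, parts = None, []
    for line in infile:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if description is not None:
                break  # assumption: stop at the second record
            description = line[1:]
        else:
            parts.append(line)
    return description, "".join(parts)


demo = io.StringIO(">seq1 test\nACGT\nACGT\n>seq2\nTTTT\n")
description, sequence = read_sequence_sketch(demo)
```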

IOTools.getLastLine(filename, nlines=1, read_size=1024)

return last line of a file.
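The read_size parameter suggests that the file is read from the end in chunks rather than scanned from the start. The sketch below shows one way to do that; get_last_lines_sketch is a hypothetical name and the chunking strategy is an assumption, not the module's code. Written for Python 3.

```python
import os
import tempfile


def get_last_lines_sketch(filename, nlines=1, read_size=1024):
    """Return the last nlines lines of a file as one string."""
    with open(filename, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        position = handle.tell()
        data = b""
        # Read backwards in chunks until enough line breaks are buffered.
        while position > 0 and data.count(b"\n") <= nlines:
            step = min(read_size, position)
            position -= step
            handle.seek(position)
            data = handle.read(step) + data
    lines = data.splitlines(keepends=True)
    return b"".join(lines[-nlines:]).decode()


with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
last = get_last_lines_sketch(tmp.name, read_size=4)
last_two = get_last_lines_sketch(tmp.name, nlines=2, read_size=4)
os.remove(tmp.name)
```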

IOTools.getNumLines(filename, ignore_comments=True)

get number of lines in filename.

IOTools.ReadMap(*args, **kwargs)

compatibility - see readMap.

IOTools.isEmpty(filename)

return True if file exists and is empty.

raises OSError if file does not exist
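The documented behaviour matches what os.path.getsize provides, since that call raises OSError for a missing file. A minimal sketch under that assumption, with is_empty_sketch as a hypothetical name (Python 3):

```python
import os
import tempfile


def is_empty_sketch(filename):
    """Hypothetical stand-in for isEmpty."""
    # os.path.getsize raises OSError (FileNotFoundError) for a missing file.
    return os.path.getsize(filename) == 0


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    pass  # create an empty file
empty = is_empty_sketch(tmp.name)
os.remove(tmp.name)
```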

class IOTools.FilePool(output_pattern=None, header=None, force=True)

manage a pool of output files

This class will keep a large number of files open. To see if you can handle this, check the limit within the shell:

ulimit -n

The number of currently open and maximum open files in the system:

cat /proc/sys/fs/file-nr

Changing these limits might not be easy for a user.
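The per-process limit reported by ulimit -n can also be read from within Python. A minimal sketch, assuming a Unix-like system where the standard resource module is available:

```python
import resource

# Soft limit: currently enforced; hard limit: ceiling for the soft limit.
soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit on open files:", soft_limit)
print("hard limit on open files:", hard_limit)
```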

This class is inefficient if the number of files is larger than maxopen and calls to write do not group keys together.

close()

close all open files.

getFilename(identifier)

get filename for an identifier.

openFile(filename, mode='w')

open file.

If file is in a new directory, create directories.

deleteFiles(min_size=0)

delete all files below a minimum size.

class IOTools.FilePoolMemory(*args, **kwargs)

Bases: IOTools.FilePool

manage a pool of output files

The data is cached in memory before writing to disk.

close()

close all open files. writes the data to disk.

deleteFiles(min_size=0)

delete all files below a minimum size.

getFilename(identifier)

get filename for an identifier.

openFile(filename, mode='w')

open file.

If file is in a new directory, create directories.

IOTools.val2str(val, format='%5.2f', na='na')

return formatted value.

If value does not fit format string, return “na”
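One way to get this behaviour is to attempt the formatting and fall back to na when it fails. The sketch below assumes exactly that; val2str_sketch is a hypothetical name and the code is written for Python 3.

```python
def val2str_sketch(val, format="%5.2f", na="na"):
    """Hypothetical stand-in for val2str."""
    try:
        return format % val
    except (TypeError, ValueError):
        return na  # value does not fit the format string


formatted = val2str_sketch(3.14159)
missing = val2str_sketch(None)
```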

IOTools.str2val(val, format='%5.2f', na='na')

guess type of value.

IOTools.prettyFloat(val, format='%5.2f')

deprecated, use val2str

IOTools.prettyPercent(numerator, denominator, format='%5.2f', na='na')

output a percent value or “na” if not defined
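A minimal sketch of this calculation, assuming that "not defined" covers a zero denominator and non-numeric input. The name pretty_percent_sketch is hypothetical; written for Python 3.

```python
def pretty_percent_sketch(numerator, denominator, format="%5.2f", na="na"):
    """Hypothetical stand-in for prettyPercent."""
    try:
        return format % (100.0 * numerator / denominator)
    except (ZeroDivisionError, TypeError, ValueError):
        return na


quarter = pretty_percent_sketch(1, 4)
undefined = pretty_percent_sketch(1, 0)
```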

IOTools.prettyString(val)

output val or na if val == None

class IOTools.nested_dict

Bases: collections.defaultdict

Auto-vivifying nested dictionaries.

For example:

nd = nested_dict()
nd["mouse"]["chr1"]["+"] = 311

iterflattened()

iterate through values with nested keys flattened into a tuple
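Auto-vivification and flattening can be sketched with a defaultdict whose default factory is the class itself. NestedDictSketch is a hypothetical re-implementation for illustration, and the shape of the yielded items (a tuple of keys paired with the leaf value) is an assumption. Written for Python 3.

```python
from collections import defaultdict


class NestedDictSketch(defaultdict):
    """Hypothetical stand-in for nested_dict."""

    def __init__(self):
        # Missing keys create a new nested dictionary on access.
        super().__init__(NestedDictSketch)

    def iterflattened(self):
        """Yield (key_tuple, value) for every leaf value."""
        for key, value in self.items():
            if isinstance(value, NestedDictSketch):
                for keys, leaf in value.iterflattened():
                    yield (key,) + keys, leaf
            else:
                yield (key,), value


nd = NestedDictSketch()
nd["mouse"]["chr1"]["+"] = 311
nd["mouse"]["chr2"]["-"] = 12
flat = list(nd.iterflattened())
```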

clear() → None. Remove all items from D.
copy() → a shallow copy of D.
default_factory

Factory for default value called by __missing__().

static fromkeys(S[, v]) → New dict with keys from S and values equal to v.

v defaults to None.

get(k[, d]) → D[k] if k in D, else d. d defaults to None.
has_key(k) → True if D has a key k, else False
items() → list of D's (key, value) pairs, as 2-tuples
iteritems() → an iterator over the (key, value) items of D
iterkeys() → an iterator over the keys of D
itervalues() → an iterator over the values of D
keys() → list of D's keys
pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised

popitem() → (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

setdefault(k[, d]) → D.get(k,d), also set D[k]=d if k not in D
update([E], **F) → None. Update D from dict/iterable E and F.

If E present and has a .keys() method, does: for k in E: D[k] = E[k]
If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]

values() → list of D's values
viewitems() → a set-like object providing a view on D's items
viewkeys() → a set-like object providing a view on D's keys
viewvalues() → an object providing a view on D's values

IOTools.flatten(l, ltypes=(<type 'list'>, <type 'tuple'>))

flatten a nested list/tuple.
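A minimal recursive sketch of flattening follows. It always returns a list, which may differ from what the module returns; flatten_sketch is a hypothetical name and the code is written for Python 3.

```python
def flatten_sketch(items, ltypes=(list, tuple)):
    """Hypothetical stand-in for flatten: returns a flat list."""
    flat = []
    for item in items:
        if isinstance(item, ltypes):
            flat.extend(flatten_sketch(item, ltypes))
        else:
            flat.append(item)
    return flat


flat_values = flatten_sketch([1, [2, (3, 4)], [[5]]])
```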

IOTools.which(program)

check if program is in path.

from post at http://stackoverflow.com/questions/377017/test-if-executable-exists-in-python
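In Python 3 the standard library offers shutil.which for the same kind of check. The sketch below uses it in place of the module's own code; which_sketch is a hypothetical wrapper, and returning None for a missing program is shutil.which behaviour, not something stated above.

```python
import shutil


def which_sketch(program):
    """Return the full path of program if it is on PATH, else None."""
    return shutil.which(program)


missing = which_sketch("no-such-program-for-this-example")
```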

IOTools.convertValue(value, list_detection=False)

convert a value to int, float or str.
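The conversion order can be sketched by trying the stricter types first. This example assumes the order int, then float, then the unchanged string, and does not model list_detection; convert_value_sketch is a hypothetical name. Written for Python 3.

```python
def convert_value_sketch(value):
    """Hypothetical stand-in for convertValue (no list detection)."""
    for converter in (int, float):
        try:
            return converter(value)
        except ValueError:
            continue  # try the next, more permissive type
    return value


as_int = convert_value_sketch("12")
as_float = convert_value_sketch("1.5")
as_str = convert_value_sketch("chr1")
```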

IOTools.iterate_tabular(infile, sep='\t')

iterate over infile skipping comments.

IOTools.openFile(filename, mode='r', create_dir=False)

open file in filename with mode mode.

If create_dir is set, the directory containing filename will be created if it does not exist.

gzip-compressed files are recognized by the suffix .gz and opened transparently.

Note that there are differences in the file-like objects returned, for example in the ability to seek.

returns a file or file-like object.
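The suffix-based dispatch described above can be sketched with the standard gzip module. open_file_sketch is a hypothetical name, and opening .gz files in text mode by default is an assumption of this Python 3 example.

```python
import gzip
import os
import shutil
import tempfile


def open_file_sketch(filename, mode="r", create_dir=False):
    """Hypothetical stand-in for openFile."""
    if create_dir:
        dirname = os.path.dirname(filename)
        if dirname:
            os.makedirs(dirname, exist_ok=True)
    if filename.endswith(".gz"):
        # Assumption: text mode unless a binary or text flag is given.
        gz_mode = mode if ("b" in mode or "t" in mode) else mode + "t"
        return gzip.open(filename, gz_mode)
    return open(filename, mode)


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "nested", "example.txt.gz")
with open_file_sketch(path, "w", create_dir=True) as out:
    out.write("hello\n")
with open_file_sketch(path) as handle:
    content = handle.read()
shutil.rmtree(workdir)
```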

IOTools.iterate(infile)

iterate over infile and return a namedtuple according to first row.
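A minimal sketch of this pattern with collections.namedtuple follows. It assumes tab-separated input, skips comment and blank lines, and requires the header fields to be valid Python identifiers; iterate_sketch and the tuple name Row are hypothetical. Written for Python 3.

```python
import io
from collections import namedtuple


def iterate_sketch(infile, sep="\t"):
    """Hypothetical stand-in for iterate: yield one namedtuple per row."""
    rows = (line.rstrip("\n").split(sep) for line in infile
            if line.strip() and not line.startswith("#"))
    header = next(rows, None)
    if header is None:
        return  # empty input: nothing to yield
    Row = namedtuple("Row", header)
    for fields in rows:
        yield Row(*fields)


demo = io.StringIO("gene\tlength\ngeneA\t100\ngeneB\t250\n")
records = list(iterate_sketch(demo))
```

Fields are left as strings here; records[0].gene and records[0].length give access by column name.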
