Mali.py - Tools for multiple alignments

Author: Andreas Heger
Release: $Id$
Date: December 09, 2013
Tags: Python

Code

class Mali.SequenceCollection

Bases: Mali.Mali

reads in a sequence collection, but permits several entries per id.

Note that this might cause problems with interleaved formats like phylip or clustal.

Several entries per id are supported by appending a numeric suffix to the identifier. The suffix is retained for the lifetime of the object, but is not written to a file.
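
The suffix idea can be sketched as follows. This is a minimal illustration, not the module's code: the helper name `uniquify_ids` and the `_N` suffix format are assumptions, since the actual suffix format is not documented here.

```python
from collections import Counter


def uniquify_ids(identifiers, separator="_"):
    """Return (internal_ids, internal2original) with duplicates made unique.

    Hypothetical helper: the first occurrence of an id is kept as-is and
    later occurrences receive a numeric suffix.
    """
    seen = Counter()
    internal_ids = []
    internal2original = {}
    for identifier in identifiers:
        count = seen[identifier]
        seen[identifier] += 1
        if count == 0:
            internal = identifier
        else:
            internal = f"{identifier}{separator}{count}"
        internal_ids.append(internal)
        internal2original[internal] = identifier
    return internal_ids, internal2original


ids, back = uniquify_ids(["seq1", "seq2", "seq1", "seq1"])
print(ids)                     # ['seq1', 'seq2', 'seq1_1', 'seq1_2']
print([back[i] for i in ids])  # ['seq1', 'seq2', 'seq1', 'seq1']
```

Keeping the map back to the original identifier is what allows the suffix to be dropped again on output.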

addEntry(s)

add an aligned string object.

readFromFile(infile, format='fasta')

read multiple alignment from file in various formats.

addAnnotation(key, annotation)

add annotation.

apply(f)

apply function f to every row in the multiple alignment.

buildColumnMap(other, join_field=None)

build map of columns in other to this.

checkLength()

check lengths of aligned strings.

Return false if they are inconsistent.

clipByAnnotation(key, chars='')

restrict alignment to positions where the annotation identified by key is in chars.

if chars is empty, nothing is clipped.

copyAnnotations(other)

copy annotations from another mali.

filter(f)

filter multiple alignment using function f.

The function f should return True for entries that should be kept and False for those that should be removed.
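
A minimal sketch of the keep/remove contract, using a plain dictionary instead of a Mali object. The name `filter_rows` is hypothetical, and passing the aligned string to f is an assumption; the text above does not say what the real method passes to f.

```python
def filter_rows(rows, f):
    """Keep only the entries for which f returns True.

    rows maps identifier -> aligned string (a stand-in for a Mali object).
    """
    return {identifier: seq for identifier, seq in rows.items() if f(seq)}


rows = {"a": "ACGT--", "b": "------", "c": "AC--GT"}

# Keep rows that contain at least one non-gap character.
kept = filter_rows(rows, lambda seq: any(ch != "-" for ch in seq))
print(sorted(kept))  # ['a', 'c']
```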

getAlphabet()

get alphabet from the multiple alignment.

Alphabet is “na” if more than 90% of characters are “actgxn”, otherwise it is “aa”.
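
The 90% rule can be sketched as below. This is an illustration only: the function name `guess_alphabet` is hypothetical, and ignoring gap characters, comparing case-insensitively and returning “aa” for empty input are assumptions not stated above.

```python
def guess_alphabet(sequences, gap_chars="-.", threshold=0.9):
    """Return "na" if more than `threshold` of the residues are in "actgxn"."""
    nucleotide_chars = set("actgxn")
    total = 0
    nucleotide = 0
    for seq in sequences:
        for ch in seq.lower():
            if ch in gap_chars:
                continue  # assumption: gaps do not count towards the total
            total += 1
            if ch in nucleotide_chars:
                nucleotide += 1
    if total and nucleotide / total > threshold:
        return "na"
    return "aa"


print(guess_alphabet(["ACGT-ACGT", "acgtnacgt"]))    # na
print(guess_alphabet(["MKVLAAGIVG", "MKV-AAGIVG"]))  # aa
```

Note that short protein sequences rich in A, C, G and T would be classified as “na” by such a rule.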

getAnnotation(key)

return annotation associated with key.

getColumns()

return mali in column orientation.
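
Column orientation amounts to transposing the rows. A minimal sketch on plain strings; `to_columns` is a hypothetical name, and the actual return type of getColumns() is not documented here.

```python
def to_columns(sequences):
    """Transpose equal-length aligned strings into a list of column strings."""
    return ["".join(column) for column in zip(*sequences)]


columns = to_columns(["ACG", "A-G"])
print(columns)  # ['AA', 'C-', 'GG']
```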

getConsensus(mark_with_gaps=False)

return consensus string.

The consensus string contains the most frequent non-gap character of each column. If mark_with_gaps is set to True, positions with any gap character are set to gaps.
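
The consensus rule can be sketched on plain strings as follows. The name `consensus` and the `gap_char` parameter are hypothetical, and emitting a gap for a column that consists only of gaps is an assumption.

```python
from collections import Counter


def consensus(sequences, mark_with_gaps=False, gap_char="-"):
    """Return the most frequent non-gap character of every column."""
    result = []
    for column in zip(*sequences):
        if mark_with_gaps and gap_char in column:
            result.append(gap_char)
            continue
        counts = Counter(ch for ch in column if ch != gap_char)
        # assumption: an all-gap column yields a gap in the consensus
        result.append(counts.most_common(1)[0][0] if counts else gap_char)
    return "".join(result)


seqs = ["ACG-", "ACT-", "A-GT"]
print(consensus(seqs))                       # ACGT
print(consensus(seqs, mark_with_gaps=True))  # A-G-
```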

getLength()

deprecated.

getResidueNumber(key, position)

return residue number in sequence key at position position.

getWidth()

deprecated.

insertColumns(position, num_gaps, keep_fixed=None, char='-')

insert gaps at position into multiple alignment.

if keep_fixed is a list of identifiers, those sequences are kept fixed; instead, gaps are added to their end.

lower()

convert all characters in mali to lowercase.

lowerCase()

set all characters to lower case.

mapColumns(columns, map_function)

apply map_function to all residues in columns.

mapIdentifiers(map_old2new=None, pattern_identifier='ID%06i')

map identifiers in multiple alignment.

if map_old2new is not given, a new map is created (map_new2old)

markCodons(mode='case')

mark codons.

markTransitions(map_id2transitions, mode='case')

mark transitions in the multiple alignment.

if mode is 'case', upper/lower case is used to mark the transitions.

Otherwise, the character given by mode is inserted.

maskColumn(column, mask_char='x')

mask a column.

maskColumns(columns, mask_char='x')

mask columns in a multiple alignment.

propagateMasks(min_chars=1, mask_char='x')

propagate masked characters to all rows of a multiple alignment within a column.

If at least min_chars characters in a mali column are masks, propagate the masks to all other rows.
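
A minimal sketch of mask propagation on plain strings. The name `propagate_masks` is hypothetical, and leaving gap characters untouched is an assumption that the description above does not confirm.

```python
def propagate_masks(sequences, min_chars=1, mask_char="x", gap_char="-"):
    """Mask a whole column once it holds at least min_chars mask characters."""
    columns = list(zip(*sequences))
    masked = {
        i
        for i, column in enumerate(columns)
        if sum(ch == mask_char for ch in column) >= min_chars
    }
    return [
        "".join(
            # assumption: gaps stay gaps even in a masked column
            mask_char if i in masked and ch != gap_char else ch
            for i, ch in enumerate(seq)
        )
        for seq in sequences
    ]


seqs = ["ACxT", "AC-T", "ACGT"]
print(propagate_masks(seqs))               # ['ACxT', 'AC-T', 'ACxT']
print(propagate_masks(seqs, min_chars=2))  # ['ACxT', 'AC-T', 'ACGT']
```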

propagateTransitions(min_chars=1)

propagate lower case in a column to all residues.

recount(reset_first=False)

recount residues in alignments.

removeEmptySequences()

remove sequences that are completely empty.

removeEndGaps()

remove end gaps.

end gaps do not include any characters and thus the alignment coordinates won’t change.

removeGaps(allowed_gaps=0, minimum_gaps=1, frame=1)

remove gappy columns.

allowed_gaps: number of gaps allowed for column to be kept
minimum_gaps: number of gaps for column to be removed

set minimum_gaps to the number of sequences to remove columns with all gaps.

If frame is > 1 (3 most likely), then a whole codon is removed as soon as there is one column to be removed.
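
The effect of frame can be sketched as below. This is a simplified illustration on plain strings: `remove_gappy_columns` is a hypothetical name, only the minimum_gaps threshold is modelled, and the interplay with allowed_gaps is left out because it is not spelled out above.

```python
def remove_gappy_columns(sequences, minimum_gaps=1, frame=1, gap_char="-"):
    """Drop every block of `frame` columns that contains a gappy column.

    A column counts as gappy if it holds at least minimum_gaps gaps.
    """
    if not sequences:
        return []
    width = len(sequences[0])
    gappy = [
        sum(seq[i] == gap_char for seq in sequences) >= minimum_gaps
        for i in range(width)
    ]
    keep = []
    for start in range(0, width, frame):
        block = range(start, min(start + frame, width))
        if not any(gappy[i] for i in block):
            keep.extend(block)
    return ["".join(seq[i] for i in keep) for seq in sequences]


seqs = ["ATGAAATAG", "ATG-AATAG"]
print(remove_gappy_columns(seqs, frame=1))  # ['ATGAATAG', 'ATGAATAG']
print(remove_gappy_columns(seqs, frame=3))  # ['ATGTAG', 'ATGTAG']
```

With frame=3 the single gap removes the whole second codon, which keeps the remaining columns in frame.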

removePattern(match_function, allowed_matches=0, minimum_matches=1, delete_frame=1, search_frame=1)

remove columns (or group of columns), that match a certain pattern.

allowed_matches: number of matches allowed so that column is still kept
minimum_matches: number of matches required for column to be removed

set minimum_matches to the number of sequences to remove only columns in which every sequence matches.

Patterns are matched in search_frame. For example, if search_frame is 3, whole codons are supplied to match_function.

delete_frame specifies the frame for deletion. If it is set to 3, a whole codon is removed as soon as one of its columns matches.

Example: remove all columns that contain at least one stop-codon:

removePattern(lambda x: x.upper() in ("TAG", "TAA", "TGA"),
              allowed_matches=0, minimum_matches=1, search_frame=3, delete_frame=3)

removeUnalignedEnds()

remove unaligned ends in the multiple alignment.

unaligned ends correspond to lower-case characters.

rename(old_name, new_name)

rename an entry.

setAnnotation(key, value)

set annotation associated with key to value.

shiftAlignment(map_id2offset)

shift alignment by offset.

shuffle(frame=1)

shuffle multiple alignment.

The frame determines the block size for shuffling. Use 3 for codons in a multiple alignment without frame-shifts.
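
Block-wise shuffling can be sketched as follows. This is an illustration under assumptions: `shuffle_columns` is a hypothetical name, and it is assumed that blocks of columns are permuted and that the same permutation is applied to every row; the description above does not state this.

```python
import random


def shuffle_columns(sequences, frame=1, rng=None):
    """Permute blocks of `frame` columns, identically in every row."""
    if not sequences:
        return []
    if rng is None:
        rng = random.Random()
    starts = list(range(0, len(sequences[0]), frame))
    rng.shuffle(starts)
    return ["".join(seq[s:s + frame] for s in starts) for seq in sequences]


seqs = ["ATGAAATAG", "ATGCCCTAG"]
shuffled = shuffle_columns(seqs, frame=3, rng=random.Random(0))

# Codons stay intact; only their order changes.
codons = [shuffled[0][i:i + 3] for i in range(0, 9, 3)]
print(sorted(codons))  # ['AAA', 'ATG', 'TAG']
```

Passing a seeded `random.Random` instance makes the shuffle reproducible.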

takeColumns(columns)

restrict alignments to certain columns.

truncate(first, last)

truncate alignment within range.

upper()

convert all characters in mali to uppercase.

upperCase()

set all characters to upper case.

writeToFile(outfile, write_ranges=True, format='plain', options=None)

write alignment to file.

If options is given, these lines are output into the multiple alignment.

Mali.convertMali2Alignlib(mali)

convert a multiple alignment of type Mali into an alignlib_lite.py_multiple alignment object.

Mali.convertAlignlib2Mali(mali, identifiers=None, seqs=None)

convert an alignlib_lite.py_multiple alignment object into a multiple alignment of type Mali.
