454 Transcript mapping pipeline


Map 454 reads onto a genome and assemble overlapping transcripts into transcript models.

The pipeline currently does not use base quality information during mapping and does not consider alternative transcripts.

Setting up

To set up the pipeline in the current directory run:

python setup.py --method=map_transcripts_454 > setup.log

Add or link the FASTA files of reads into the pipeline directory. File names should end with the suffix .fasta. The pipeline can process several files at the same time.
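
As an illustration of linking read files, the following is a minimal sketch rather than part of the pipeline itself. The helper name link_reads and the source directory passed to it are hypothetical; substitute the location of your own 454 read files.

```shell
# Hypothetical helper: link every .fasta file found in a source
# directory into the current (pipeline) directory.
link_reads() {
    src_dir=$1
    for f in "$src_dir"/*.fasta; do
        # Skip the literal pattern if the glob matched nothing.
        [ -e "$f" ] || continue
        name=$(basename "$f")
        # Do not overwrite a file or link that is already present.
        if [ ! -e "$name" ]; then
            ln -s "$f" "$name"
        fi
    done
}

# Example call (the path is a placeholder):
# link_reads /path/to/454/reads
```

Only files ending in .fasta are linked, which matches the suffix the pipeline expects.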


Link to the genome files in /net/cpp-data/backup/databases/indexed_fasta and name the links genome.fasta and genome.idx. For example:

ln -s /net/cpp-data/backup/databases/indexed_fasta/hs_ncbi36_softmasked.fasta genome.fasta
ln -s /net/cpp-data/backup/databases/indexed_fasta/hs_ncbi36_softmasked.idx genome.idx
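
A dangling link is easy to miss, so it can help to confirm that both links resolve before going further. The check below is a sketch only; the function name check_genome_links is hypothetical and not provided by the pipeline.

```shell
# Hypothetical helper: verify that genome.fasta and genome.idx exist
# in the current directory and that the links point at real files.
check_genome_links() {
    rc=0
    for f in genome.fasta genome.idx; do
        # [ -e ] follows symbolic links, so a dangling link fails here.
        if [ ! -e "$f" ]; then
            echo "missing or dangling: $f" >&2
            rc=1
        fi
    done
    return $rc
}

# Example call:
# check_genome_links && echo "genome links look fine"
```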

Build the gmap index by running gmap_setup. By default, gmap indices should be placed in /net/cpp-mirror/databases/gmap. Specify the location of the indices in the variable PARAM_GMAP_OPTIONS.


Indices on networked disks are slow to load. For better performance, copy the indices to a local disk and work with the local copy.
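
One way to work with a local index is to copy the index directory to local storage and point gmap at the copy. The sketch below makes several assumptions: the helper name stage_index, the index name hs_ncbi36 and the local directory /scratch/gmap are all hypothetical, and the PARAM_GMAP_OPTIONS value shown in the comment assumes that the variable is passed through to gmap, whose -D option names the genome directory and whose -d option names the database.

```shell
# Hypothetical helper: copy a gmap index directory from a networked
# disk to a local directory and print the local directory.
stage_index() {
    src=$1     # index directory on the networked disk
    dest=$2    # local directory that will hold the copy
    mkdir -p "$dest" &&
    cp -R "$src" "$dest/" &&
    printf '%s\n' "$dest"
}

# Example call (index name and local path are placeholders):
# stage_index /net/cpp-mirror/databases/gmap/hs_ncbi36 /scratch/gmap
#
# Assumed Makefile setting afterwards:
# PARAM_GMAP_OPTIONS=-D /scratch/gmap -d hs_ncbi36
```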


Edit the Makefile to configure the pipeline. See Parameters below.


The following parameters can be set in the Makefile:
