454 Transcript mapping pipeline

Purpose

Map 454 reads onto a genome and assemble overlapping transcripts into transcript models.

The pipeline currently does not use base quality information during mapping and does not consider alternative transcripts.

Setting up

To set up the pipeline in the current directory, run:

python setup.py --method=map_transcripts_454 > setup.log

Add or link the FASTA files of reads into the working directory. File names should end with the suffix .fasta. The pipeline can process several files at the same time. For example:

tissue1.fasta
tissue2.fasta
tissue3.fasta
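Before running the pipeline, it can help to check which read files are actually in place. The following is a minimal sketch, not part of the pipeline: the helper name find_read_files is hypothetical, and genome.fasta is left out only because this document reserves that name for the genome link. How the pipeline itself separates reads from the genome is not stated here.

```python
from pathlib import Path


def find_read_files(directory="."):
    """Return the sorted names of *.fasta files in `directory`.

    genome.fasta is skipped (an assumption of this sketch), because that
    name is used for the link to the genome rather than for reads.
    """
    return sorted(
        path.name
        for path in Path(directory).glob("*.fasta")
        if path.name != "genome.fasta"
    )


if __name__ == "__main__":
    # List the read files found in the current directory.
    for name in find_read_files():
        print(name)
```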

Link to the genome in /net/cpp-data/backup/databases/indexed_fasta and name the links genome.fasta and genome.idx. For example:

ln -s /net/cpp-data/backup/databases/indexed_fasta/hs_ncbi36_softmasked.fasta genome.fasta
ln -s /net/cpp-data/backup/databases/indexed_fasta/hs_ncbi36_softmasked.idx genome.idx
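The same two links can be created from a script, which is convenient when setting up several working directories. This is a sketch under stated assumptions: the helper name link_genome is hypothetical, and it assumes the sequence and index files share a prefix and differ only in the .fasta and .idx suffixes, as in the example above.

```python
import os


def link_genome(prefix, workdir="."):
    """Create genome.fasta and genome.idx in `workdir` as symbolic links.

    `prefix` is the path of the indexed genome without its suffix, so the
    links point at prefix + ".fasta" and prefix + ".idx".
    """
    targets = {suffix: prefix + suffix for suffix in (".fasta", ".idx")}

    # Check both targets first so that no half-finished setup is left behind.
    for target in targets.values():
        if not os.path.isfile(target):
            raise FileNotFoundError(target)

    links = []
    for suffix, target in targets.items():
        link = os.path.join(workdir, "genome" + suffix)
        # Replace an existing symbolic link; a regular file with the same
        # name is not removed, and os.symlink then raises FileExistsError.
        if os.path.islink(link):
            os.remove(link)
        os.symlink(target, link)
        links.append(link)
    return links


if __name__ == "__main__":
    link_genome(
        "/net/cpp-data/backup/databases/indexed_fasta/hs_ncbi36_softmasked"
    )
```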

Build the index for gmap by running gmap_setup. By default, gmap indices should be put in /net/cpp-mirror/databases/gmap. Provide the location of the indices using the variable PARAM_GMAP_OPTIONS.
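As an illustration of what the variable might contain, the sketch below composes a Makefile line from an index directory and a database name. It assumes that PARAM_GMAP_OPTIONS is passed to gmap unchanged and that the indices are selected with gmap's -D (index directory) and -d (database name) options. Neither assumption is confirmed by this document, and the helper name and the database name hs_ncbi36 are hypothetical.

```python
import shlex


def gmap_options_line(index_dir, db_name):
    """Return a Makefile assignment for PARAM_GMAP_OPTIONS.

    Assumption: the pipeline hands the value to gmap as-is, so the index
    location is given with -D and the database name with -d.
    """
    options = "-D {} -d {}".format(shlex.quote(index_dir), shlex.quote(db_name))
    return "PARAM_GMAP_OPTIONS=" + options


if __name__ == "__main__":
    # Print a line that could be pasted into the Makefile.
    print(gmap_options_line("/net/cpp-mirror/databases/gmap", "hs_ncbi36"))
```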

Note

Indices on networked disks are slow to load. For better performance, work with local copies of the indices.
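One way to follow this advice is to copy the index directory to a local disk once and reuse the copy on later runs. The sketch below is an assumption about how this could be done, not a feature of the pipeline: the helper name localize_index and the local path /scratch are hypothetical, and the copy is never refreshed, so a changed network index is not detected.

```python
import os
import shutil


def localize_index(network_dir, local_root):
    """Copy an index directory to local disk unless a copy already exists.

    Returns the path of the local copy, which can then be used as the
    index location instead of the networked directory.
    """
    name = os.path.basename(os.path.normpath(network_dir))
    local_dir = os.path.join(local_root, name)
    if not os.path.isdir(local_dir):
        # copytree creates local_dir and copies the whole directory tree.
        shutil.copytree(network_dir, local_dir)
    return local_dir


if __name__ == "__main__":
    print(localize_index("/net/cpp-mirror/databases/gmap", "/scratch"))
```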

Configuration

Edit the Makefile to configure the pipeline. See Parameters below.

Parameters

The following parameters can be set in the Makefile:
