Welcome to the CGAT code collection

This document brings together the various pipelines and scripts written before and during CGAT.


The documentation is under construction.

The CGAT code collection is documented here.


The CGAT code collection grew out of the Ponting group's work in comparative genomics over the last decade. CGAT has since added functionality for next-generation sequencing (NGS) analysis.

The collection has three major components, each of which is a directory in the package:

  • Scripts: a collection of useful scripts for genomics and NGS analysis.
  • Modules: a collection of modules with utility functions for genomics and NGS analysis.
  • CGAT Pipelines: a collection of pipelines for common workflows in genomics and NGS analysis.

Scripts and modules

The CGAT code collection is a set of tools and modules for genomics. Most of these scripts are written in Python. The tools are grouped by topic:

CGAT Pipelines

CGAT pipelines perform basic tasks, are fairly generic and might be of wider interest.


The collection of scripts and tools is the outcome of 10 years of work in various fields of bioinformatics. It contains the good, the bad and the ugly. Use at your own risk.

