The CGAT code collection has been written over several years in the context of comparative genomics and, more recently, next-generation sequencing analysis.
The toolkit aims to solve practical problems in the analysis of genomic data, with a focus on facilitating the interpretation of genomic data in a biological context. Furthermore, as a training institution, we aim to write code that is well structured and can serve as an introduction to advanced bioinformatic scripting for biologists.
The CGAT code collection extends and complements, but also overlaps with, various other toolkits. As all toolkits, ours included, continue to evolve, this relationship is very dynamic. For example, our workflows frequently use other toolkits, in particular bedtools and the UCSC tools, for high-performance computations. The use of common genomic file formats and a command-line interface ensures compatibility.
Below is a list of toolkits with functionality similar or complementary to the CGAT code collection, with quotes from their respective websites: