Beginner’s guide to comparative bacterial genome analysis using next-generation sequence data
1 Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Victoria 3010, Australia
2 Victorian Life Sciences Computation Initiative, University of Melbourne, Victoria, 3010, Australia
Microbial Informatics and Experimentation 2013, 3:2 doi:10.1186/2042-5783-3-2. Published: 10 April 2013
High-throughput sequencing is now fast and cheap enough to be considered part of the toolbox for investigating bacteria, and there are thousands of bacterial genome sequences available for comparison in the public domain. Bacterial genome analysis is increasingly being performed by diverse groups in research, clinical and public health labs alike, who are interested in a wide array of topics related to bacterial genetics and evolution. Examples include outbreak analysis and the study of pathogenicity and antimicrobial resistance. In this beginner’s guide, we aim to provide an entry point for individuals with a biology background who want to perform their own bioinformatics analysis of bacterial genome data, to enable them to answer their own research questions. We assume readers will be familiar with genetics and the basic nature of sequence data, but do not assume any computer programming skills. The main topics covered are assembly, ordering of contigs, annotation, genome comparison and extracting common typing information. Each section includes worked examples using publicly available E. coli data and free software tools, all of which can be run on a desktop computer.