BamBam

Download several simple-to-use tools to facilitate NGS analysis

BamBam's goal is to be useful and easy to use. The tools in this package try to do things in the simplest way possible and avoid special file formats as much as possible. Where a special format is needed, there's a program or script that generates it. Run any program or script without arguments to see command-line usage information. A brief description of some of the programs follows. You may find other tools included; their names should make their purpose fairly clear, and running them without arguments prints their usage.

Our interpretation of the MIT license: You're welcome to use the programs and/or source code for anything you want, but please be cool about helping out other researchers. Feel free to email me at jtpage68@gmail.com if you have questions about any of the tools.

Some of the advantages of BamBam

  • Uses the C++ BAMtools API to reference mapped reads and access BAM files efficiently.
  • Multi-threaded where necessary, so it scales when processing large numbers of BAM files simultaneously.
  • Parameters that can be understood without specific knowledge of the underlying algorithm.
  • To keep memory usage low with large reference genomes, the genome is processed one chunk at a time. The chunk size can be set by the user.
  • Provides a quick, simple way to do common tasks with BAM files (e.g., Counter generates read-count data for annotated genes in RNA-seq expression experiments; subBam extracts one or more user-defined slices from BAM files).
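
The chunk-at-a-time idea mentioned in the list above can be sketched in a few lines. This is only an illustration in Python, not BamBam's actual C++ code; the function name iter_chunks and the sizes used are hypothetical. It tiles a reference of a given length into windows so that only one window's worth of data needs to be held in memory at a time.

```python
from typing import Iterator, Tuple


def iter_chunks(ref_length: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) windows that tile a reference of ref_length bases."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, ref_length, chunk_size):
        # The last window is clipped so it never runs past the end of the reference.
        yield start, min(start + chunk_size, ref_length)


# Hypothetical 25-base reference processed in 10-base chunks.
for start, end in iter_chunks(25, 10):
    print(start, end)
```

A smaller chunk size lowers peak memory at the cost of more passes over the input; a larger one does the opposite.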

    Please cite our paper:

    Page, J.T., Z.S. Liechty, M.D. Huynh, and J.A. Udall. 2014. BamBam: genome sequence analysis tools for biologists. BMC Research Notes. In Press.




    Installation
    Unpack the tarball and run "make all" to build all of the tools. The scripts need no compilation and are ready to use as-is.

    If you have issues, you may need to install BAMtools and/or BioPerl separately and link against your locally compiled versions of those libraries.


    Tools

    Bam2Consensus: reports 1 or more consensus sequences for each BAM file provided
    CalcMeth: extracts and summarizes methylation values from a BAM file prepared by MethHead
    Counter: summarizes the number of reads mapped to each annotated feature, 1 column per BAM
    Gapfall: takes 2 BAM files and reports ranges where one of the BAMs has nothing and the other has something
    HapHunt: infers haplotypes from a set of BAM files based on K-means clustering
    InterSnp: takes 1 or more BAM files and writes out a ".snp" file with a summary of all polymorphic loci
    MethHead: converts a BAM file with bisulfite-treated reads in 4 possible orientations to one with only 2 possible orientations (for use with CalcMeth)
    ModRef: changes the coordinates in a BAM file as if it had been built from a different reference sequence
    Pebbles: infers a phylogenetic tree using MCMC
    PolyCat: see PolyCat