SOCS
About
SOCS is a reference-based, ungapped alignment tool designed for SOLiD data. Alignments are tolerant of a user-specified number of color-space mismatches. Optimal alignments can be found across multiple reference files, each containing multiple sequences. Modes are available for detecting short sequence-space variants (such as SNPs) and for aligning bisulfite transformed sequences. SOCS also offers a census function which can provide read coverage depth for the reference as well as short-variant and methylation census, which can be used for downstream consensus computation.
For usage instructions, see Usage.html, which is included in the package and posted in Docs and Data. Sample data are available at GEO, accession GSE13543.
OS: MacOSX/Unix/Linux
SOCS was developed at the Georgia Institute of Technology and the National Biodefense Analysis & Countermeasures Center with assistance from Life Technologies.
Publications
Ondov BD, Varadarajan A, Passalacqua KD, Bergman NH. Efficient mapping of ABI SOLiD sequence data to a reference genome for functional genomic applications. Bioinformatics 2008 24: 2776-2777.
Submitted for review: Ondov BD, Cochran C, Landers M, Meredith GD, Dudas M, Bergman NH. An alignment algorithm for bisulfite sequencing using the Applied Biosystems SOLiD System.
Contact
Nicholas Bergman, PhD, National Biodefense Analysis & Countermeasures Center
bergmann@nbacc.net
Version 2.0 is now available. Major updates include:
Several users have encountered crashes that were traced to uncalled colors, which appear as "." in the csfasta file. A read filtering script has been posted with 1.2.1, and it is recommended that this script be run on any SOLiD 3 data before running SOCS.
We are currently working on version 2 of SOCS and would like to hear how it can be improved. Let us know what kinds of features you would like to see or what kinds of bottlenecks SOCS causes in your workflow.
Previous versions contained a bug that affected best match files (.best.txt) for mappings that took more than one round. The read number column restarted at 0 for each round instead of representing the absolute order in the csfasta file.
Version 1.2 adds detection of contiguous sequence-space variants longer than 1 base (up to one less than the mismatch tolerance). Bugs in reporting the number of color-space mismatches for mapped reads have also been fixed.
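The uncalled-color problem noted above can be illustrated with a short Python sketch that drops any read whose color string contains ".". This is not the filtering script posted with 1.2.1; it assumes the usual csfasta layout (optional "#" comment lines, a ">" header line followed by one color-string line per read), and the function name `filter_uncalled` is hypothetical.

```python
# Illustrative sketch only; not the read filtering script distributed with SOCS.
from typing import Iterable, Iterator


def filter_uncalled(lines: Iterable[str]) -> Iterator[str]:
    """Yield csfasta lines, skipping reads whose color string contains '.'."""
    header = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        if line.startswith("#"):
            yield line  # comment lines are passed through unchanged
        elif line.startswith(">"):
            header = line  # hold the header until its read has been checked
        else:
            if header is not None and "." not in line:
                yield header
                yield line
            header = None


sample = [
    "# example csfasta",
    ">1_1_1_F3",
    "T0120123",
    ">1_1_2_F3",
    "T01.0123",  # contains an uncalled color and is dropped
    ">1_1_3_F3",
    "T3321000",
]
for kept in filter_uncalled(sample):
    print(kept)
```

To filter a file under the same assumptions, the function could be fed an open file handle and its output written line by line to a new csfasta file.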

