Peak alignment procedures for samples from LC-MS and GC-MS (also CE-MS, MS, FT-MS, UV, NMR, MALDI) measurements play an important role in biomarker detection and in metabolomic studies in general. Because instrument drift always introduces differences between runs, the data need accurate correction so that the same peak in every sample points to the same metabolite or component. Several packages have emerged over the past years: some commercial, some free, some simple, some complex :-).
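
To make the idea of drift correction concrete, here is a minimal sketch in Python. It assumes the drift is a constant shift of a few scans, which is a strong simplification: the packages listed on this page typically use non-linear warping instead. The function names `best_shift` and `apply_shift` and the two intensity traces are hypothetical and exist only for illustration.

```python
def best_shift(reference, sample, max_shift):
    """Return the integer lag (in scans) that best maps sample onto reference.

    The score is a plain cross-correlation: the sum of products of
    overlapping intensities. A negative lag means the sample peaks
    elute later than the reference peaks and must be moved earlier.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_shift, max_shift + 1):
        score = 0.0
        for i, ref_value in enumerate(reference):
            j = i - lag  # sample index that would land on reference index i
            if 0 <= j < len(sample):
                score += ref_value * sample[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def apply_shift(sample, lag, fill=0.0):
    """Move every intensity in sample by lag scans, padding with fill."""
    n = len(sample)
    out = [fill] * n
    for j, value in enumerate(sample):
        i = j + lag
        if 0 <= i < n:
            out[i] = value
    return out


# Hypothetical traces: the same peak, eluting two scans later in the sample.
reference = [0, 0, 0, 1, 4, 9, 4, 1, 0, 0, 0, 0]
sample = [0, 0, 0, 0, 0, 1, 4, 9, 4, 1, 0, 0]

lag = best_shift(reference, sample, max_shift=3)
aligned = apply_shift(sample, lag)
print(lag, aligned == reference)
```

A single global shift is rarely enough for real chromatograms, where the drift changes along the run; the sketch only shows why a correction step is needed before peaks from different samples can be compared.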

As the topic is rather complex, there are major pitfalls you may fall into. They include peak finding, peak integration, step/bin size selection, centroided versus profile data, adduct removal, noise calculation and normalization of the whole dataset. Another problem with large datasets is computation time. Software should support multiple processors and multiple cores (including multithreaded implementations) or even clusters. Why? Read the plea for multithreaded software [The free lunch is over]. You may check out one of the ideal implementations for using all computational resources → mzmine, which supports multi-core and multi-processor machines and even compute clusters. Another requirement is the use of open exchange formats like netCDF and mzXML. The days of proprietary data formats are numbered; such formats will definitely scare away customers who are willing to pay for new software.
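
The step/bin size pitfall can be illustrated with a small Python sketch. It assumes centroided data given as (m/z, intensity) pairs and sums intensities into fixed-width m/z bins; the function name `bin_centroids` and the three peaks are hypothetical, and real packages use more careful binning schemes than this.

```python
from collections import defaultdict


def bin_centroids(peaks, bin_width):
    """Sum intensities of (mz, intensity) centroids into fixed-width m/z bins.

    Returns a dict that maps the lower edge of each occupied bin to the
    total intensity that fell into it.
    """
    totals = defaultdict(float)
    for mz, intensity in peaks:
        index = int(mz // bin_width)  # which bin this centroid falls into
        totals[index] += intensity
    return {
        round(index * bin_width, 6): total
        for index, total in sorted(totals.items())
    }


# Hypothetical centroids: two different ions near m/z 180 and one near 181.
peaks = [(180.063, 100.0), (180.157, 50.0), (181.066, 10.0)]

coarse = bin_centroids(peaks, bin_width=1.0)   # unit-mass bins
fine = bin_centroids(peaks, bin_width=0.05)    # narrow bins
print(len(coarse), len(fine))
```

With unit-mass bins the two ions near m/z 180 are merged into one signal, while the narrow bins keep them apart. Bins that are too narrow have the opposite problem: small mass errors can split one ion across neighboring bins.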

After you have worked with several alignment programs or scripts, you may notice that some of them already perform a complete statistical evaluation using univariate (ANOVA) and multivariate statistics (PLS, LDA, ICA, PCA). This is very helpful for a quick analysis; however, a much deeper investigation with independent statistical packages may be needed for a complete biomarker identification process. The ultimate biomarker identification tool is of course a workflow or pipeline software that takes LC, GC, MS and NMR data as input and later transfers the biomarkers found to an attached automated structure elucidation process.

  • MetaboAnalyst (metaboanalyst.ca) - A complete metabolomics workflow system for NMR, LC-MS, or GC-MS data
  • mzmine and mzmine2 (http://mzmine.sourceforge.net/) - mzXML, mzData, netCDF and Xcalibur data (LC-MS, GC-MS, MS data)
  • metAlign (RIKILT-WUR Institute of Food Safety) - LC-MS and GC-MS data
  • BinBase (fiehnlab.ucdavis.edu) [PPT] or [PDF]
  • xcms and xcms2 (Scripps) - netCDF data (LC-MS, GC-MS, MS and MS2 data)
  • metaXCMS (Scripps) - untargeted metabolomics for multi-class experiments with identification step
  • XCMS Online (Scripps) - alignment for low resolution and high resolution data (LC-MS and GC-MS)
  • MarkerLynx (Waters) (LC-MS data)
  • BluFuse (BlueGnome) - for MS and NMR data [PDF]
  • SpecAlign (University of Oxford, Jason Wong) - Alignment of SELDI, MALDI, NMR, RAMAN, IR (via TXT import)
  • HiRes (Columbia University Medical Center) - for NMR data
  • msInspect (Proteomics Fred Hutchinson Cancer Center)
  • Progenesis PG600 (Nonlinear) - for MALDI and SELDI mass spectra
  • caMassClass (NCBI) - [ZIP] for SELDI protein mass spectra
  • pairseqsim - (Bioconductor - Witold Wolski) - for mass spectra [DOI]
  • Xalign - for LC-MS data [DOI] - request here
  • msalign from Matlab Bioinformatics Toolbox - for MS data (example using msalign)
  • Randolph Yasui code - [DOI] download Matlab code and WMTSA wavelet toolbox
  • RTAlign algorithm of MSFACTs (noble.org) - GC-MS and LC-MS data
  • Genedata Expressionist (genedata.com) - for LC-MS and infusion MS data.
  • MS Align (David Grant - Uconn.edu) - for high resolution mass spectral data [DOI]
  • LCMSWARP (PNNL) - for proteomics and metabolomics LC-MS data (http://ncrr.pnl.gov/software)
  • ChromAlign (Thermo) - included in Sieve and Biosieve package for LC-MS and LC-MS-MS data
  • PETAL - Peptide Element Alignment for LC-MS data (http://peiwang.fhcrc.org/research-project.html)
  • MarkerView (ABI/Sciex) - for LC-MS and MALDI data peak picking and alignment and statistics
  • MathDAMP (Keio University) - for GC-MS, LC-MS, CE-MS data with Mathematica source code [DOI]
  • NameLess - for MALDI MS and FT-MS data with JAVA source code [DOI]
  • CPM MatLab toolbox (J Listgarten) - for LC-MS, proteomics, metabolomics and time series data + source code.
  • GASP (genedrift.org) - for GC-MS alignment
  • AnalyzerPro (SpectralWorks) - for alignment of GC-MS data
  • meta-b (Vladimir Likic) - for alignment of LC-MS data with python source code (go SVN)
  • spectconnect (MIT) - for alignment of GC-MS data using AMDIS for deconvolution
  • ChenomX Profiler (Chenomx) - for binning and alignment of NMR signals (+ DB search)
  • KnowItAll Metabolomics Editions (BioRad) - with IntelliBucket bucketing and binning of NMR data (+ DB search)
  • MS-Xelerator (MSMETRIX) - Advanced Algorithms for LC/MS Data Processing (Marco Ruijken)
  • OBI-Warp (U Texas) - Ordered Bijective Interpolated Warping for LC-MS data [PDF]
  • Census (Scripps) - Alignment using time warp and RT algorithm (from MS1 data) can also be used for label free semi-quantification of peptide data.
  • Rosetta Elucidator System (Rosetta) - for LC-MS alignment
  • MultiAlign (PNNL) - using LCMSWARP algorithm for alignment of LC-MS data
  • DTW/COW (U Copenhagen) - Dynamic Time Warping and Correlation Optimized Warping for LC and GC and MS data
  • Automics - for NMR alignment and statistics
  • ChromA - Chromatogram Alignment for Chromatography-Mass Spectrometry from University of Bielefeld
  • AmsRPM - Robust Point Matching for Retention Time Alignment in LC/MS (an R package)
  • apLCMS - Adaptive processing of LC/MS metabolomics data (R package)
  • Id-Align - A postprocessor for GC/MS metabolomics data using AMDIS
  • ChromA4D - A MaltCMS framework based alignment program for comprehensive GCxGC
  • MAVEN - a multiplatform metabolomics data analyser (Princeton) [LINK] [PDF]
  • GC2MSClass - GCxGC-MS Data Classification and Alignment, more free to use tools @ cceHUB