MapReduce jobs that generate the list of peptides to score at a specified m/z. The first mapper generates all possible peptide sequences, including the modified sequences defined in the search parameters, for a given FASTA database. The reducer eliminates duplicates, records all source proteins for each peptide, and emits the peptide with its m/z as the key. The next set of reducers collects all peptides to be scored against a given m/z and stores them in the database.
Lewis et al. BMC Bioinformatics 2012 13:324 doi:10.1186/1471-2105-13-324
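The flow described in the caption can be sketched in plain Python, without a MapReduce framework. This is a minimal illustration under several assumptions that are not taken from the paper: peptides come from a simple tryptic cleavage with no missed cleavages and no modifications, the charge is fixed at 1, m/z keys are binned to integers, and an in-memory dict stands in for the database. All function names (`digest`, `map_proteins`, `reduce_peptides`, `reduce_by_mz`, `build_index`) are hypothetical.

```python
from collections import defaultdict

# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def digest(sequence):
    """Assumed stand-in for the search parameters: cleave after K or R."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mz(peptide, charge=1):
    """m/z of an unmodified peptide at the given charge."""
    mass = sum(RESIDUE_MASS[residue] for residue in peptide) + WATER
    return (mass + charge * PROTON) / charge


def map_proteins(proteins):
    """First mapper: emit (peptide, protein id) for every generated peptide."""
    for protein_id, sequence in proteins.items():
        for peptide in digest(sequence):
            yield peptide, protein_id


def reduce_peptides(pairs):
    """First reducer: drop duplicates, keep all source proteins, key by m/z."""
    sources = defaultdict(set)
    for peptide, protein_id in pairs:
        sources[peptide].add(protein_id)
    for peptide, protein_ids in sources.items():
        # Integer binning of m/z is an assumption made for this sketch.
        yield int(mz(peptide)), (peptide, sorted(protein_ids))


def reduce_by_mz(pairs):
    """Second reducer: collect every peptide to be scored at each m/z key."""
    table = defaultdict(list)
    for key, value in pairs:
        table[key].append(value)
    return dict(table)  # a dict stands in for the database


def build_index(proteins):
    return reduce_by_mz(reduce_peptides(map_proteins(proteins)))
```

Calling `build_index({"P1": "AAKGGR", "P2": "GGRAAK"})` yields one entry per distinct peptide, each listing both source proteins and grouped under its m/z key. In a real deployment the grouping done here by `defaultdict` would be performed by the framework's shuffle phase between map and reduce.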