SWAPHI is the first parallel algorithm to accelerate Smith-Waterman protein database search with affine gap penalties on Intel Xeon Phi coprocessors, using only the Xeon Phis for the computation. Searching against the UniProtKB/TrEMBL database (13,208,986,710 amino acids in 41,451,118 sequences, with the longest sequence comprising 36,805 amino acids), SWAPHI achieves a performance of up to 58.8 billion cell updates per second (GCUPS) on a single coprocessor and up to 228.4 GCUPS on four coprocessors (B1PRQ-5110P/5120D) sharing the same host. We have also developed a sister program, SWAPHI-LS, for the alignment of very long sequences on Intel Xeon Phi clusters.
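To put the GCUPS figures in perspective, here is a minimal sketch that converts them into approximate search times. It assumes the usual definition of a cell update (one cell of the query-by-database dynamic programming matrix); the function names and the query length of 1000 are hypothetical and chosen only for illustration.

```python
# Figures quoted in the paragraph above.
DB_RESIDUES = 13_208_986_710   # amino acids in UniProtKB/TrEMBL
PEAK_GCUPS_ONE = 58.8          # peak rate on a single coprocessor
PEAK_GCUPS_FOUR = 228.4        # peak rate on four coprocessors


def gcups(query_len: int, db_residues: int, seconds: float) -> float:
    """Billion cell updates per second for one query against a database."""
    return query_len * db_residues / seconds / 1e9


def search_seconds(query_len: int, db_residues: int, rate_gcups: float) -> float:
    """Time needed to fill query_len x db_residues cells at a given rate."""
    return query_len * db_residues / (rate_gcups * 1e9)


if __name__ == "__main__":
    qlen = 1000  # hypothetical query length, for illustration only
    t1 = search_seconds(qlen, DB_RESIDUES, PEAK_GCUPS_ONE)
    t4 = search_seconds(qlen, DB_RESIDUES, PEAK_GCUPS_FOUR)
    print(f"1 coprocessor : {t1:.1f} s")
    print(f"4 coprocessors: {t4:.1f} s")
    print(f"speedup       : {PEAK_GCUPS_FOUR / PEAK_GCUPS_ONE:.2f}x")
```

At the peak rates, such a query would take roughly 225 seconds on one coprocessor and roughly 58 seconds on four. Real runs may be slower, since the quoted values are upper bounds.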
NOTE: please tune the number of threads according to the configuration of your Xeon Phi device. Typically #threads = 4 * (#processors - 1), but in our evaluations SWAPHI works best at full capacity, i.e. #threads = 4 * #processors.
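The two thread-count rules in the note can be sketched as a small helper. This is an illustration only: it reads #processors as the number of cores on the device (an assumption), and the function name and the 60-core example are hypothetical.

```python
THREADS_PER_PROCESSOR = 4  # the factor 4 in the formulas above


def swaphi_threads(processors: int, reserve_one: bool = False) -> int:
    """Thread count following the note above.

    reserve_one=False: full capacity, 4 * #processors.
    reserve_one=True : the typical setting, 4 * (#processors - 1).
    """
    usable = processors - 1 if reserve_one else processors
    if usable < 1:
        raise ValueError("need at least one usable processor")
    return THREADS_PER_PROCESSOR * usable


if __name__ == "__main__":
    # Hypothetical device with 60 processors (cores).
    print(swaphi_threads(60))                    # full capacity
    print(swaphi_threads(60, reserve_one=True))  # typical setting
```

For a device with 60 processors, the full-capacity rule yields 240 threads and the typical rule yields 236.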
- latest source code (v1.0.5)
The source code and a binary version are available for download now! Note that the binary version is compiled using the Intel C++ compiler (v13.1.3 20130607) on an x86_64 computer running the Scientific Linux release 6.3 (Carbon) operating system.
- Yongchao Liu and Bertil Schmidt: "SWAPHI: Smith-Waterman protein database search on Xeon Phi coprocessors". 25th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2014), 2014, pp. 184-185. [preprint at arXiv] [poster].
Other related papers
- Yongchao Liu, Bertil Schmidt, and Douglas L. Maskell: "MSA-CUDA: multiple sequence alignment on graphics processing units with CUDA". 20th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2009), 2009, pp. 121-128
- Yongchao Liu, Douglas L. Maskell, Bertil Schmidt: "CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units". BMC Research Notes, 2009, 2:73
- Yongchao Liu, Bertil Schmidt, Douglas L. Maskell: "CUDASW++2.0: enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions". BMC Research Notes, 2010, 3:93
- Yongchao Liu, Adrianto Wirawan, Bertil Schmidt: "CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions". BMC Bioinformatics, 2013, 14:117.
- Yongchao Liu, Tuan-Tu Tran, Felix Lauenroth, Bertil Schmidt: "SWAPHI-LS: Smith-Waterman algorithm on Xeon Phi coprocessors for long DNA sequences". 2014 IEEE International Conference on Cluster Computing, 2014, pp. 257-265
- Yongchao Liu and Bertil Schmidt: "GSWABE: faster GPU-accelerated sequence alignment with optimal alignment retrieval for short DNA sequences". Concurrency and Computation: Practice and Experience, 2015, 27: 958-972
- Tuan Tu Tran, Yongchao Liu, Bertil Schmidt: "Bit-parallel approximate pattern matching: Kepler GPU versus Xeon Phi". Parallel Computing, 2016, 54: 128-138
We have integrated both the database indexing and the alignment subroutines into a single executable binary, which provides two commands: index, for database indexing, and align, for Smith-Waterman alignment against the indexed database. The parameters of the two commands are listed below.
Parameters of the index command:
- -x <int> (number of Intel Xeon Phi coprocessors [REQUIRED])
- -i <str> (input sequence file in FASTA/FASTQ format [REQUIRED])
- -o <str> (prefix of output index files, default = INPUT file name [OPTIONAL])
- -m <int> (minimum sequence length, default = 0 [OPTIONAL])
- -M <int> (maximum sequence length, default = 4294967295 [OPTIONAL])
- -n <int> (number of sequences taken from the beginning of the input, default = -1 [OPTIONAL])
Parameters of the align command:
- -q <str> (input query sequence file)
- -d <str> (prefix of database file index)
- -i <int> (parallelization model, default = 0)
    0: inter-task model with one vector lane computing one sequence pair
    1: intra-task model with all vector lanes computing one sequence pair
- -m <str> (scoring matrix name, default = blosum62)
    supported matrix names: blosum45, blosum50, blosum62 and blosum80.
- -g <int> (gap open penalty, default = 10)
- -e <int> (gap extension penalty, default = 2)
- -p <int> (sequence profile used for inter-task model, default = 0)
    0: score profile
    1: query profile
- -c <int> (maximum bytes per data fetch by a device, default = 134217728)
- -t <int> (number of threads per Xeon Phi, default = 240)
- -x <int> (number of Xeon Phis used, default = 4)
- -l <int> (load the database entirely onto the device, default = 1)
- -k <int> (number of top alignments reported, default = 10)
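To illustrate what the gap open and gap extension penalties control, here is a minimal, unvectorized reference sketch of Smith-Waterman local alignment scoring with affine gaps. It is not SWAPHI's implementation. Two assumptions are made for brevity: a gap of length k is charged gap_open + k * gap_extend, and a simple match/mismatch score stands in for a BLOSUM matrix. The function name and the match/mismatch values are hypothetical.

```python
NEG = -10**9  # stands in for minus infinity


def sw_affine_score(query: str, subject: str,
                    gap_open: int = 10, gap_extend: int = 2,
                    match: int = 5, mismatch: int = -4) -> int:
    """Optimal local alignment score with affine gap penalties (score only)."""
    first_gap = gap_open + gap_extend  # assumed cost of opening a gap of length 1
    n = len(subject)
    h_prev = [0] * (n + 1)   # H values of the previous row
    f = [NEG] * (n + 1)      # best score ending in a vertical gap, per column
    best = 0
    for i in range(1, len(query) + 1):
        h_cur = [0] * (n + 1)
        e = NEG              # best score ending in a horizontal gap in this row
        for j in range(1, n + 1):
            e = max(e - gap_extend, h_cur[j - 1] - first_gap)
            f[j] = max(f[j] - gap_extend, h_prev[j] - first_gap)
            s = match if query[i - 1] == subject[j - 1] else mismatch
            h = max(0, h_prev[j - 1] + s, e, f[j])
            h_cur[j] = h
            best = max(best, h)
        h_prev = h_cur
    return best


if __name__ == "__main__":
    # Nine matching residues with one single-residue gap in between.
    print(sw_affine_score("ACDEFGHIKL", "ACDEFHIKL"))
```

With the default penalties, the single-residue gap in this example costs 10 + 2 = 12 under the stated assumption. The inter-task model listed above computes one such sequence pair per vector lane, while the intra-task model spreads a single pair across all lanes.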
- Modify the "INTEL_DIR" macro in the "Makefile" to point to the correct Intel C++ compiler installation directory.
- Modify the "ZLIB_HEADER" macro in the "Makefile" to point to the correct ZLIB installation directory.
- Run the "make" command to compile the software.
This software works in two steps:
- Build the database indices using the command "swaphi index [options]".
- Perform the Smith-Waterman sequence alignment against the indexed database using the command "swaphi align [options]".
Typical command lines:
- command line "swaphi"
Get the command line options
- command line "swaphi index"
Get the database indexing command line options
- command line "swaphi align"
Get the command line options for alignment against the indexed database
- command line "swaphi index -x 2 -i database.fasta -o outBaseName"
Build indices for the database for two Intel Xeon Phi coprocessors. The common base name of all index files is "outBaseName".
- command line "swaphi align -q query.fasta -d outBaseName -x 2"
Perform the Smith-Waterman sequence alignment against the indexed database on two Intel Xeon Phi coprocessors.
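The two-step workflow can also be driven from a script. The sketch below builds the same command lines as the examples above and only prints them unless dry_run is turned off. The helper names are hypothetical, and the sketch assumes that the swaphi binary is on the PATH and that the same number of coprocessors is used for indexing and alignment, as in the examples.

```python
import shlex
import subprocess


def index_cmd(fasta: str, prefix: str, num_phis: int) -> list:
    """Command line for 'swaphi index' using only the options listed above."""
    return ["swaphi", "index", "-x", str(num_phis), "-i", fasta, "-o", prefix]


def align_cmd(query: str, prefix: str, num_phis: int) -> list:
    """Command line for 'swaphi align' using only the options listed above."""
    return ["swaphi", "align", "-q", query, "-d", prefix, "-x", str(num_phis)]


def run_pipeline(fasta: str, query: str, prefix: str, num_phis: int,
                 dry_run: bool = True) -> list:
    """Index the database, then align the query against it."""
    cmds = [index_cmd(fasta, prefix, num_phis),
            align_cmd(query, prefix, num_phis)]
    for cmd in cmds:
        print(shlex.join(cmd))
        if not dry_run:
            # Stop at the first failing step.
            subprocess.run(cmd, check=True)
    return cmds


if __name__ == "__main__":
    run_pipeline("database.fasta", "query.fasta", "outBaseName", num_phis=2)
```

Indexing only needs to be repeated when the database or the number of coprocessors changes, so a script like this could skip the first command when the index files already exist.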
- June 09, 2015 (v1.0.5)
- The dependence on ZLIB has been removed by default. Users can instead choose whether to support gzipped input by enabling the macro "COMPRESSED_INPUT" in the Makefile.
- There is no change in the code.
- Feb 02, 2015 (v1.0.4)
- The source code has already been released!
If you have any questions or suggestions for improvement, please feel free to contact Liu, Yongchao.