DIAMOND is a sequence aligner for protein and translated DNA searches,
designed for high-performance analysis of big sequence data. The key
features are:

- Pairwise alignment of proteins and translated DNA at 500x-20,000x
  the speed of BLAST.
- Frameshift alignments for long read analysis.
- Low resource requirements, suitable for running on standard
  desktops or laptops.
- Various output formats, including BLAST pairwise, tabular and XML,
  as well as taxonomic classification.

To run an alignment task, we assume a protein database file in FASTA
format named 'nr.faa' and a file of DNA reads to align named
'reads.fna'.

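Before starting a long run, it can help to confirm that both input
files are readable and actually contain FASTA records. The following is
a minimal sketch, not part of DIAMOND: the function name
'count_fasta_records' is hypothetical, and it only counts header lines
(lines starting with '>') without validating the sequences themselves.

```shell
# Hypothetical helper (not part of DIAMOND): count the records in a FASTA
# file by counting its header lines, which start with '>'.
count_fasta_records() {
    if [ ! -r "$1" ]; then
        echo "cannot read FASTA file: $1" >&2
        return 1
    fi
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so the exit status is deliberately ignored here.
    grep -c '^>' "$1" || true
}
```

Calling 'count_fasta_records nr.faa' and 'count_fasta_records reads.fna'
should each print a non-zero count if the files are usable as input.
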
To set up a reference database for DIAMOND, run the 'makedb' command
as follows:

    $ diamond makedb --in nr.faa -d nr

This creates a binary DIAMOND database file with the specified name
('nr.dmnd'). The alignment task can then be started with the 'blastx'
command:

    $ diamond blastx -d nr -q reads.fna -o matches.m8

The '-o' option specifies the output file, here named 'matches.m8'.
By default, the output is written in BLAST tabular format.

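Because the tabular output is plain tab-separated text, it can be
post-processed with standard tools. The sketch below assumes the usual
12-column BLAST tabular layout (qseqid, sseqid, pident, length,
mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), with
the e-value in column 11. The function name 'filter_by_evalue' and the
default threshold of 1e-5 are hypothetical choices for illustration,
not DIAMOND settings.

```shell
# Hypothetical helper: keep only hits whose e-value (column 11 of the
# 12-column BLAST tabular layout) is at or below a threshold.
#   $1 = tabular output file
#   $2 = maximum e-value (illustrative default: 1e-5)
filter_by_evalue() {
    # Adding 0 forces awk to compare the fields as numbers, not strings.
    awk -F'\t' -v max="${2:-1e-5}" '($11 + 0) <= (max + 0)' "$1"
}
```

For example, 'filter_by_evalue matches.m8 1e-10 > strong.m8' would
write only the hits with an e-value of at most 1e-10 to 'strong.m8'.
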
Publication:

Buchfink B, Xie C, Huson DH, "Fast and sensitive protein alignment using
DIAMOND", Nature Methods 12, 59-60 (2015). doi:10.1038/nmeth.3176