BLAST API
The BLAST tools API runs standalone BLAST+ executables with v5 NCBI databases for design-related queries. Three modes are available: nopssm, pssm, and patent. The nopssm and pssm modes run standard BLAST+ sequence alignments against the NR database; use pssm if you also want to generate a PSSM for your query. The patent mode runs BLAST+ against the curated NCBI patent protein sequence database (pataa), which is generated in partnership with the USPTO, and returns specific patent sequence hits along with the standard alignments.
Command Line Interface
Examples
Generate only sequence alignments for input.fasta:
lev engine submit blast input.fasta \
    --mode nopssm
Generate sequence alignments and PSSM for input.fasta:
lev engine submit blast input.fasta \
    --mode pssm
Generate sequence alignments and patent information for input.fasta:
lev engine submit blast input.fasta \
    --mode patent
Generate sequence alignments and return up to 1000 hits for input.fasta:
lev engine submit blast input.fasta \
    --mode nopssm \
    --max-target-sequences 1000
Flags
--fasta-file (str) (Required) - Input FASTA file with query sequence
--max-target-sequences (int) (Default: 500) - Sets the maximum number of sequences that can be returned for a query
--mode (str) (Required) - Sets the mode for the BLAST run.
- Options:
  - nopssm - run BLAST+ (blastp) for sequence alignments using NR database
  - pssm - run BLAST+ (psiblast) for sequence alignments and PSSM generation using NR database
  - patent - run BLAST+ (blastp) for sequence alignments and patent hits using PATAA database
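
When submissions are scripted, it can help to assemble the command from validated arguments rather than by string concatenation. The following is a minimal sketch, not part of the tool itself: `build_blast_command` and `VALID_MODES` are hypothetical names introduced here, and the FASTA file is passed positionally because that is how the examples above invoke the command.

```python
import shlex

# Modes documented for the BLAST tool.
VALID_MODES = ("nopssm", "pssm", "patent")


def build_blast_command(fasta_file, mode, max_target_sequences=None):
    """Return the argument list for a `lev engine submit blast` call.

    Hypothetical helper: it only assembles the arguments shown in the
    examples above and does not run anything.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")

    # The FASTA file is given positionally, mirroring the CLI examples.
    command = ["lev", "engine", "submit", "blast", str(fasta_file), "--mode", mode]

    # Only add the flag when the caller overrides the documented default.
    if max_target_sequences is not None:
        if max_target_sequences < 1:
            raise ValueError("max_target_sequences must be a positive integer")
        command += ["--max-target-sequences", str(max_target_sequences)]
    return command


if __name__ == "__main__":
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(build_blast_command("input.fasta", "nopssm", 1000)))
```

The returned list can be handed to `subprocess.run(command, check=True)` on a machine where the `lev` CLI is installed and authorized.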
Python Interface
Examples
from engine import EngineClient

client = EngineClient()
client.authorize()

# Generate sequence alignments only
result = client.submit_blast(
    fasta_path="input.fasta",
    mode="nopssm",
)

# Generate sequence alignments and PSSM
result = client.submit_blast(
    fasta_path="input.fasta",
    mode="pssm",
)

# Generate sequence alignments with patent information
result = client.submit_blast(
    fasta_path="input.fasta",
    mode="patent",
)

# Generate alignments with a custom number of hits
result = client.submit_blast(
    fasta_path="input.fasta",
    mode="nopssm",
    max_target_sequences=1000,
)
Parameters
fasta_path (str) (Required) - FASTA file containing query sequence of interest.
max_target_sequences (int) (Default: 500) - Sets the maximum number of sequences that can be returned for a query
mode (str) (Required) - Sets the mode for the BLAST run.
- Options:
  - nopssm - run BLAST+ (blastp) for sequence alignments using NR database
  - pssm - run BLAST+ (psiblast) for sequence alignments and PSSM generation using NR database
  - patent - run BLAST+ (blastp) for sequence alignments and patent hits using PATAA database
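
Because fasta_path must point to a well-formed FASTA file, a quick local check before submission can catch empty or malformed input early. This is a minimal sketch using only the standard library; `parse_fasta` and `check_query` are hypothetical helpers, not part of the engine package. The documentation does not state whether multi-sequence files are accepted, so the sketch only reports what it finds and does not enforce a record count.

```python
def parse_fasta(lines):
    """Return a list of (header, sequence) tuples from FASTA-formatted lines."""
    records = []
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_query(lines):
    """Parse FASTA lines and reject input with no records or empty sequences."""
    records = parse_fasta(lines)
    if not records:
        raise ValueError("no FASTA records found")
    empty = [name for name, sequence in records if not sequence]
    if empty:
        raise ValueError(f"records with no sequence: {empty}")
    return records


if __name__ == "__main__":
    # Inline sample input; with a real file, pass the open file handle instead.
    sample = [">query1 example protein", "MKTAYIAKQR", "QISFVKSHFS"]
    for name, sequence in check_query(sample):
        print(name, len(sequence))
```

With a real file, `check_query(open("input.fasta"))` performs the same check before the path is passed to the submit_blast call shown in the examples above.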
Outputs
Some outputs depend on the mode you choose to run.
| Mode | Filename | Description |
|---|---|---|
| nopssm, pssm, patent | query.out | BLASTP or PSIBLAST query alignments (Note: formatting differs between BLASTP and PSIBLAST; PSIBLAST is set to run 4 iterative rounds, and data from each round is logged to this file) |
| nopssm, pssm, patent | query.entries | Accession IDs of query hits, parsed from query.out and used to gather the full sequences and descriptions for full-query.fasta |
| nopssm, pssm, patent | full-query.fasta | FASTA file with complete sequences and descriptions from query hits |
| pssm | query.chk | PSSM checkpoint file generated by PSIBLAST |
| pssm | query.pssm | PSSM from PSIBLAST query in NCBI formatting |
| patent | query.patents | Text file listing NCBI accession codes and patent descriptions for each query hit (e.g. “ADA00576.1 Sequence 10 from patent US 7595057”) |
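
For downstream scripts, the table above can be captured as a lookup from mode to expected files, and query.patents can be split into fields. The sketch below is illustrative only: `OUTPUTS_BY_MODE` and `parse_patent_line` are hypothetical names, and the line format is inferred from the single example in the table, so lines that deviate from it keep their raw description and leave the parsed fields as `None`.

```python
import re

# Expected output files per mode, transcribed from the table above.
COMMON_OUTPUTS = ["query.out", "query.entries", "full-query.fasta"]
OUTPUTS_BY_MODE = {
    "nopssm": COMMON_OUTPUTS,
    "pssm": COMMON_OUTPUTS + ["query.chk", "query.pssm"],
    "patent": COMMON_OUTPUTS + ["query.patents"],
}

# Assumed layout: "<accession> <free-text description>".
PATENT_LINE = re.compile(r"^(?P<accession>\S+)\s+(?P<description>.+)$")
# Assumed description layout: "Sequence <n> from patent <patent id>".
PATENT_DETAIL = re.compile(
    r"Sequence (?P<seq>\d+) from patent (?P<patent>.+)$", re.IGNORECASE
)


def parse_patent_line(line):
    """Split one query.patents line into its fields, or return None if blank."""
    match = PATENT_LINE.match(line.strip())
    if match is None:
        return None
    record = {
        "accession": match["accession"],
        "description": match["description"],
        "sequence_number": None,
        "patent": None,
    }
    detail = PATENT_DETAIL.search(match["description"])
    if detail is not None:
        record["sequence_number"] = int(detail["seq"])
        record["patent"] = detail["patent"].strip()
    return record


if __name__ == "__main__":
    print(OUTPUTS_BY_MODE["patent"])
    print(parse_patent_line("ADA00576.1 Sequence 10 from patent US 7595057"))
```

Keeping the raw description alongside the parsed fields means no information is lost if a line does not follow the assumed layout.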