Bacillus megaterium Sequence Utilities

The sections of this page allow you to:

  1. Do a BLAST search against the QM B1551 or DSM319 B. megaterium genomes
  2. Search the gene annotations for specific words or phrases
  3. Get the sequence (DNA or protein) of a gene by its gene ID
  4. Extract any portion of the B. megaterium genome based on its coordinates
  5. Visualize a region of the chromosome surrounding a gene
  6. Align Bacillus genomes with QM B1551 bidirectional best hit genes

Basic Data
Molecule                   Length (bp)   GC %     GenBank Accession Number
DSM319 chromosome          5,097,447     38.13%   CP001982
QM B1551 chromosome        5,097,129     38.26%   CP001983
QM B1551 pBM100 plasmid    5,428         34.84%   CP001984
QM B1551 pBM200 plasmid    9,098         34.50%   CP001985
QM B1551 pBM300 plasmid    26,587        35.50%   CP001986
QM B1551 pBM400 plasmid    53,865        36.47%   CP001987
QM B1551 pBM500 plasmid    66,985        33.91%   CP001988
QM B1551 pBM600 plasmid    99,694        32.97%   CP001989
QM B1551 pBM700 plasmid    164,406       33.49%   CP001990

E-mail me for any questions or comments: rjohns at niu dot edu (Rick Johns)

January 7, 2011

Go to the general sequence utilities page. It has ClustalW, translation, and reverse-complement tools.

Newick tree rearrangement and drawing (from MrBayes output)




B. megaterium BLAST search

If you enter a nucleotide sequence, a blastn search will be done against either the entire genomic DNA sequence (including plasmids in QM B1551), or the DNA sequences of the individual genes.

If you enter an amino acid sequence, a blastp search will be done against the set of individual genes. However, if you select a genomic DNA database, tblastn will be performed instead: for tblastn searches, the database is automatically translated in all 6 reading frames.
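The choice of BLAST program described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the code this page runs: the function names, the "genes"/"genomic" labels, and the 90% threshold for guessing whether a query is DNA are all assumptions made for the example.

```python
def looks_like_dna(query: str) -> bool:
    """Guess whether a query is nucleotide sequence.

    Assumed heuristic: at least 90% of the letters are A, C, G, T, or N.
    """
    # Drop FASTA header lines, then keep only letters.
    lines = [ln for ln in query.splitlines() if not ln.startswith(">")]
    letters = [c for c in "".join(lines).upper() if c.isalpha()]
    if not letters:
        raise ValueError("query contains no sequence letters")
    dna_like = sum(c in "ACGTN" for c in letters)
    return dna_like / len(letters) >= 0.9


def choose_blast_program(query: str, database: str) -> str:
    """Pick a program from the query type and the database type.

    `database` is either "genes" or "genomic" (hypothetical labels).
    """
    if database not in ("genes", "genomic"):
        raise ValueError("database must be 'genes' or 'genomic'")
    if looks_like_dna(query):
        # Nucleotide query: blastn against either database.
        return "blastn"
    # Amino acid query: blastp against genes, tblastn against genomic DNA.
    return "tblastn" if database == "genomic" else "blastp"
```

For example, `choose_blast_program("MKKLLPTAAAGLLLLAAQPAMA", "genomic")` selects tblastn, while the same query against "genes" selects blastp.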

Choose the genome you wish to search:

Search genes or genomic DNA

Your query sequence:

      


Options

       Just ignore these options unless you know what you are doing. The defaults are set to the standard NCBI values, except for the simple sequence filter and the expect value.

 

Simple sequence (DUST) filter
By default, simple repetitive sequences in the subject sequence ARE searched. This is the opposite of the default behavior at the NCBI BLAST site. Check this box to use the filter.

Expect:
This box sets the E-value cutoff: hits with an E-value above this number are not reported. Smaller numbers are more stringent. The default value of 0.001 means that a random sequence of the same length as the query would be expected to produce an average of 0.001 hits at least this good in this database. The NCBI BLAST site sets the default value to 10.

Use -m8 (tabular) output format.
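If you save the tabular output, it is easy to process with a script. The sketch below assumes the usual 12 tab-separated columns of legacy NCBI BLAST -m 8 output (query, subject, % identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, bit score). The `Hit` and `parse_m8` names are made up for this example.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    pct_identity: float
    length: int
    mismatches: int
    gap_opens: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    evalue: float
    bit_score: float


def parse_m8(text: str) -> list[Hit]:
    """Parse tabular BLAST output, skipping blank and '#' comment lines."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        if len(f) != 12:
            raise ValueError(f"expected 12 columns, got {len(f)}")
        hits.append(
            Hit(f[0], f[1], float(f[2]), *map(int, f[3:10]),
                float(f[10]), float(f[11]))
        )
    return hits


# Invented rows, for illustration only (not real search results).
sample = (
    "query1\tgeneA\t98.50\t200\t3\t0\t1\t200\t1\t200\t1e-100\t380\n"
    "query1\tgeneB\t85.00\t60\t9\t0\t10\t69\t5\t64\t0.0005\t42.1\n"
)
strong = [h for h in parse_m8(sample) if h.evalue <= 1e-10]
```

Here `strong` keeps only the hits whose E-value is at or below 1e-10.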

Use gapped alignment. The older form of BLAST did an ungapped alignment, which can be invoked by unchecking this box.

Word size:
To start a matched region, the BLAST program needs an exact match of this length. The default is 11 for blastn and 3 for blastp and tblastn. For blastn, 7 is the minimum acceptable size.
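To see what the word size does for a nucleotide search, here is a toy sketch of the seeding idea: it lists every position where the query and subject share an exact word of the chosen length. It is a simplified illustration, not BLAST's actual implementation, and `find_seeds` is a hypothetical name.

```python
def find_seeds(query: str, subject: str, word_size: int = 11) -> list[tuple[int, int]]:
    """Return (query_offset, subject_offset) pairs of exact shared words.

    Offsets are 0-based. Larger word sizes give fewer seeds.
    """
    if word_size < 1:
        raise ValueError("word_size must be positive")
    # Index every word of the subject by its starting offset.
    index: dict[str, list[int]] = {}
    for s in range(len(subject) - word_size + 1):
        index.setdefault(subject[s:s + word_size], []).append(s)
    # Look up each query word in the index.
    seeds = []
    for q in range(len(query) - word_size + 1):
        for s in index.get(query[q:q + word_size], []):
            seeds.append((q, s))
    return seeds
```

With a query of `ACGTACGTAAA` and a subject of `TTTACGTACGTCC`, a word size of 7 finds two overlapping seeds, while the default of 11 finds none because the shared stretch is only 8 bases long.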




FASTA Gene Sequences

Enter the gene ID, and get the sequence (DNA or peptide), gene name, and coordinates of that gene in FASTA format.

Currently working formats:

Gene name:

As DNA: As peptide:




Extract any sequence from the B. megaterium genome

Also for the listed Bacillus (and related) species

Sequences are extracted clockwise around the circular genome. If the start coordinate is greater than the end coordinate, your sequence will cross the origin of replication. Use the reverse-complement checkbox if you want sequences running counter-clockwise.
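The wrap-around rule above can be sketched as follows. This example assumes 1-based, inclusive coordinates (the page does not state the convention, so treat that as an assumption), and the function names are made up for illustration.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_circular(genome: str, start: int, end: int,
                     revcomp: bool = False) -> str:
    """Extract start..end clockwise from a circular sequence.

    Assumes 1-based, inclusive coordinates. If start > end, the
    extracted region runs through the end of the sequence and
    continues from position 1.
    """
    n = len(genome)
    if not (1 <= start <= n and 1 <= end <= n):
        raise ValueError("coordinates must lie between 1 and the genome length")
    if start <= end:
        seq = genome[start - 1:end]
    else:
        # Wrap around: tail of the sequence plus the head up to `end`.
        seq = genome[start - 1:] + genome[:end]
    return reverse_complement(seq) if revcomp else seq
```

On a toy 10-base "genome" `AACCGGTTAC`, positions 3 to 6 give `CCGG`, positions 9 to 2 wrap around to give `ACAA`, and the same wrapped region with the reverse-complement option gives `TTGT`.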

Choose a specific molecule

Starting position:        Ending position:

Reverse-complement this sequence? Check here:




Annotation Search

Search the GenBank annotation files for a word or phrase. Retrieves all the information from all matching genes (not including the gene sequence).

You can search for locus tags, gene symbols, gene products, EC numbers, or gene types (CDS, tRNA, rRNA).
Search is case-insensitive. Only one term can be used: no AND, OR, or NOT.

Search term:

Choose your genome: QM B1551       DSM 319       WSH-002       B. subtilis 168      other Bacilli

Use a Perl regular expression: just what goes inside m/ /i.
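The difference between a plain search term and a regular expression can be sketched like this. The example uses Python's `re` module with `re.IGNORECASE` as a stand-in for Perl's m/ /i (the two syntaxes are similar but not identical). The record fields, the placeholder locus tags, and the `search_annotations` name are all invented for illustration.

```python
import re


def search_annotations(records: list[dict], term: str,
                       use_regex: bool = False) -> list[dict]:
    """Return records in which any field matches the term, ignoring case.

    Without use_regex, the term is matched literally, so characters
    such as '(' or '|' have no special meaning.
    """
    pattern = term if use_regex else re.escape(term)
    rx = re.compile(pattern, re.IGNORECASE)
    return [r for r in records if any(rx.search(str(v)) for v in r.values())]


# Made-up records with placeholder locus tags.
records = [
    {"locus_tag": "EX_0001", "type": "CDS", "product": "DNA polymerase III subunit beta"},
    {"locus_tag": "EX_0002", "type": "tRNA", "product": "tRNA-Ala"},
    {"locus_tag": "EX_0003", "type": "rRNA", "product": "16S ribosomal RNA"},
]

plain = search_annotations(records, "trna")
both = search_annotations(records, "^(t|r)RNA$", use_regex=True)
```

Here `plain` finds only the tRNA record, while the regular expression in `both` matches the tRNA and rRNA records in one search, which a single plain term cannot do.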




Visualize a Gene's Neighborhood

Produces a plot showing the genes surrounding the focus gene (or position). Genes and intergenic regions are drawn to scale, with forward-orientation genes drawn above the axis and reverse-orientation genes drawn below. Protein-coding genes are in blue, and RNA-only genes are in red. Mousing over a gene displays its ID, coordinates, gene product, and gene symbol (if it exists).
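Drawing to scale comes down to dividing base-pair offsets by the chosen scale. The sketch below shows one way the layout described above could be computed. It is not the plotting code behind this page, it ignores windows that cross the origin of a circular molecule, and the function names and "+"/"-" strand labels are assumptions.

```python
def gene_to_pixels(gene_start: int, gene_end: int, center: int,
                   span_bp: int, bp_per_pixel: float):
    """Map a gene to (x_left, x_right) in pixels, clipped to the window.

    Returns None if the gene lies entirely outside the plotted window.
    """
    window_start = center - span_bp // 2
    window_end = window_start + span_bp
    lo = max(gene_start, window_start)
    hi = min(gene_end, window_end)
    if lo > hi:
        return None
    return ((lo - window_start) / bp_per_pixel,
            (hi - window_start) / bp_per_pixel)


def gene_style(strand: str, is_rna: bool) -> tuple[str, str]:
    """Forward genes go above the axis, reverse genes below;
    RNA-only genes are red, protein-coding genes are blue."""
    side = "above" if strand == "+" else "below"
    color = "red" if is_rna else "blue"
    return side, color
```

For example, with a center of 5000, 2000 bases plotted, and 10 base pairs per pixel, a gene at 4500 to 5500 is drawn from pixel 50 to pixel 150.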

B. megaterium gene at center of plot:
      OR: choose a chromosome        and a center position:

Choose a scale to plot (base pairs per pixel):        How many bases to plot?




Alignment of Bmeg genes with other Bacillus genomes

A 300,000 bp region of each genome is aligned based on a bidirectional best hit (BBH) with the QM B1551 genes. This is a two-step process: first choose the region in each genome to display, then choose the alignment point and a set of BBH genes to color.
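For readers new to the term: two genes in different genomes are bidirectional best hits when each one is the other's best BLAST hit. The sketch below shows that pairing step only. How the best hits were actually computed for this page (program, cutoffs) is not stated here, so the input dictionaries and gene names are invented for illustration.

```python
def bidirectional_best_hits(best_a_to_b: dict[str, str],
                            best_b_to_a: dict[str, str]) -> list[tuple[str, str]]:
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit.

    best_a_to_b maps each genome A gene to its best hit in genome B;
    best_b_to_a maps each genome B gene to its best hit in genome A.
    """
    return sorted(
        (a, b) for a, b in best_a_to_b.items()
        if best_b_to_a.get(b) == a
    )


# Invented best-hit tables for two genomes.
a_to_b = {"A1": "B1", "A2": "B2", "A3": "B2"}
b_to_a = {"B1": "A1", "B2": "A3", "B3": "A2"}

pairs = bidirectional_best_hits(a_to_b, b_to_a)
```

In this toy data, A2's best hit is B2, but B2's best hit is A3, so A2 has no BBH; the pairs found are A1 with B1 and A3 with B2.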
Align

Find all BBH for any B. megaterium (QM B1551) gene:

(e.g. BMQ_0109)        Annotation Peptide Fasta DNA Fasta       







If you would like other options, please let me know. Rick Johns (rjohns at niu.edu)