New article from Meredith Doellman and the Feder lab!

Geographic and Ecological Dimensions of Host Plant-Associated Genetic Differentiation and Speciation in Rhagoletis cingulata



How to update or install your local NCBI BLAST database in a Unix shell using update_blastdb.pl

I recently updated my local BLAST database and thought I would revisit the process of installing/updating, this time using the included update_blastdb.pl script.

First, make sure the BLAST+ programs are on your PATH. Because I work on a cluster, all I do is load the module.

module load bio/blast+/2.7.1

I then delete the old database folder, make a new folder with the exact same name (this keeps your old scripts working), and move into it. If you are doing a fresh install, just create a new folder.

module load bio/blast+/2.7.1
rm -r blastdb_folder_name
mkdir blastdb_folder_name
cd blastdb_folder_name

Now use the update_blastdb.pl script (distributed with BLAST+) to download the database of your choice. The --decompress option automatically decompresses the tar.gz files. Depending on which database you choose to download and your internet speed, this can be a lengthy process.

update_blastdb.pl --decompress nt

I am also downloading the taxonomy database so I can learn more about my BLAST hits. This database has to be unpacked manually.

module load bio/blast+/2.7.1
update_blastdb.pl taxdb
tar -xzf taxdb.tar.gz
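Since taxdb.tar.gz is a gzipped tar archive, a single tar -xzf call both decompresses and unpacks it (gunzip alone would only strip the .gz and leave a .tar behind). A quick demonstration on a dummy archive (the file and folder names here are placeholders, not the real taxdb contents):

```shell
# Build a dummy .tar.gz, then unpack it in one step with tar -xzf.
mkdir -p taxdb_demo
echo "dummy taxonomy data" > taxdb_demo/file.btd
tar -czf taxdb_demo.tar.gz -C taxdb_demo file.btd
rm taxdb_demo/file.btd
tar -xzf taxdb_demo.tar.gz -C taxdb_demo   # decompress + extract together
ls taxdb_demo                              # file.btd is back
```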

That’s it! Now you should have an updated BLAST database.

Installing and querying a local NCBI nucleotide database (nt)

While the online version of the non-redundant nucleotide database (nr/nt) is useful for small-scale applications, checking an assembly for contamination is best done with a local copy of the NCBI nt database. Read along for a guide on how I installed and then queried the NCBI nt database on a Unix cluster.

First, you need to make a folder where you will store the entire database, then enter that folder.

mkdir NCBI_nt_DB
cd NCBI_nt_DB

Next you need to download the entire nt database from the NCBI website. Note that this database is almost 50 GB in size so make sure that you have sufficient space. This download may take some time depending on the speed of your connection.

wget ""

Next, each tar.gz file has to be unpacked and then deleted. Doing this manually would take some time, so I used a for loop. After the loop completes, the database is ready! No formatting is needed, as the files are already formatted.

for file in *.gz
do
    tar -zxvpf "$file"
    rm "$file"
done
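Before pointing the loop at the real 50 GB download, it can be sanity-checked on dummy archives; a quick sketch (the file names are made up, though the real volumes follow a similar nt.NN.tar.gz pattern):

```shell
# Dry run of the unpack-and-delete loop on two dummy archives.
mkdir -p loop_demo
cd loop_demo
for name in nt.00 nt.01; do
    echo "dummy volume" > "$name.dat"
    tar -czf "$name.tar.gz" "$name.dat"   # pack a throwaway file
    rm "$name.dat"
done

for file in *.gz
do
    tar -zxvpf "$file"    # extract, preserving permissions
    rm "$file"            # delete the archive once unpacked
done

ls    # the .dat files are back and the archives are gone
cd ..
```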

To query the new database, it's important to point BLAST to both the database folder and the nt index prefix, i.e. path_to_database_folder/nt. An example query looks like this:

blastn -db path_to_the_folder/nt -query fastafile
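For contamination screening, I find tabular output easiest to work with: adding -outfmt 6 to the blastn call gives one hit per line (query, subject, percent identity, alignment length, mismatches, gap opens, query/subject coordinates, e-value, bit score), which can then be filtered with awk. A sketch of the filtering step on made-up hit lines (the contig names, accessions, and thresholds are all invented for illustration; real -outfmt 6 output is tab-separated, which awk's default field splitting also handles):

```shell
# Keep hits with >= 90% identity over >= 100 bp of alignment.
cat > example_hits.tsv <<'EOF'
contig_1 AB123456.1 98.50 512 7 1 1 512 1000 1511 1e-180 650
contig_1 CD789012.1 85.00 90 12 2 1 90 200 289 1e-20 95.0
contig_2 EF345678.1 92.10 300 20 3 1 300 50 349 1e-90 320
EOF
awk '$3 >= 90 && $4 >= 100' example_hits.tsv
# prints the contig_1/AB123456.1 and contig_2/EF345678.1 rows
```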

That’s it! Everything should be working now. With an offline database it’s important to decide on a regular update schedule. I plan on updating my database every 6 months or so. This can be done with the update_blastdb.pl script distributed with the BLAST+ software, or by deleting and downloading the entire database anew.

A primer on PCR

The PCR gods are a fickle sort and it’s an art to appease them. This strange intersection between science and the occult can be at the best of times trying, but fear not for I have braved the trials of PCR and write to offer advice.

To start, I am sharing my standard PCR reaction. 10μL may seem like a tiny amount of product, but it’s enough to allow for testing on an agarose gel, amplicon cleanup for sequencing, and evaporation.

I use standard 10-μL PCR reactions, containing:

  • 5μL of PCR MasterMix (Promega, Madison, Wisconsin, USA)
  • 2.5μL of DNA-free H2O
  • 0.5μL MgCl2
  • 0.5μL forward primer
  • 0.5μL reverse primer
  • 1μL of template DNA
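When running many reactions at once, everything except the template can be combined into a single master mix. A quick sketch of scaling the recipe above, assuming 24 reactions and a 10% overage for pipetting loss (both numbers are my own illustrative choices, not part of the protocol):

```shell
# Scale the per-reaction volumes (uL) for n reactions plus 10% overage;
# the 1 uL of template DNA is added to each tube separately.
n=24
awk -v n="$n" 'BEGIN {
    printf "MasterMix   %6.1f uL\n", 5.0 * n * 1.1
    printf "H2O         %6.1f uL\n", 2.5 * n * 1.1
    printf "MgCl2       %6.1f uL\n", 0.5 * n * 1.1
    printf "Fwd primer  %6.1f uL\n", 0.5 * n * 1.1
    printf "Rev primer  %6.1f uL\n", 0.5 * n * 1.1
}'
# MasterMix comes to 132.0 uL for 24 reactions with 10% overage.
```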


I then use touchdown PCR programs optimized for each primer pair to maximize primer specificity. PCR products are then checked on agarose gels. If a reaction fails, I troubleshoot by trying each step below. Most of the time, diluting the template solves the problem.

  1. Dilute the template
  2. Decrease the specificity of the PCR program
  3. Increase the amount of MasterMix
  4. Increase the size of the reaction
  5. Re-extract template DNA
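For context, a touchdown program steps the annealing temperature down over the first cycles (favoring specific primer binding early on) and then holds a final temperature for the rest. The numbers below are purely illustrative, not my actual program; the range should be tuned to each primer pair's melting temperatures:

```shell
# Generic touchdown scheme: anneal at 65 C, drop 0.5 C per cycle for the
# first 10 cycles, then hold 60 C for the remaining 25 cycles (35 total).
awk 'BEGIN {
    t = 65.0
    for (c = 1; c <= 35; c++) {
        printf "cycle %2d: anneal at %.1f C\n", c, t
        if (c < 10) t -= 0.5     # touchdown phase
        else        t = 60.0     # plateau phase
    }
}'
```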

No-boil Chelex 100 proteinase K genomic DNA extraction

I previously posted a Chelex 100 extraction protocol that, while easy, cheap, and moderately effective, calls for a boiling step to degrade the proteinase. The boiling step also denatures the template DNA, leaving a single-stranded product that is unsuitable for genomic work and, in my experience, makes PCR more difficult.

A protocol published in Molecular Ecology Resources by Casquet et al. (2012) removes the boiling step. Their published protocol allows for high throughput at minimal cost. To their protocol I have added a grinding step, which I believe increases DNA yield. I have successfully extracted and used DNA from a single leg of a microlepidopteran species.


Modified Chelex without boiling from Casquet et al. (2012)

  1. Add 10 μL of proteinase K (20 mg/mL) to each tube.
  2. Add 150 μL of 10% Chelex 100 to each tube.
  3. Grind specimens with melted and sterilized pipet tips.
  4. Incubate for 24 hours at 55 °C, swirling occasionally.
  5. Spin down to pellet the Chelex and pull the DNA from the top.