BLAST: compare two protein sequences

9/21/2023

When I was being trained in microbiology as an undergrad, one of the first skills I acquired was the ability to quickly compare and visualize amino acid sequences using BLAST and ClustalW. Fifteen years later, those two programs have done nothing but improve, both by expanding the data contained in their databases and by simplifying the user interface. Dimitris Skliros put together a great article on the BLAST tool that explains the inner machinations and how the system works. Here, I hope to illustrate how to use BLAST in combination with ClustalW to answer some very practical questions about protein sequences that you may find yourself stumbling into as you learn to use these tools.

There are two versions of BLAST software you can use. Dimitris talked about the first, found on the NCBI website, in the aforementioned article. It contains a few more options and variables. For the sake of consistency, I will be using the BLAST tool found on the ExPASy website. Once you open the site, you can easily address the question of "where did this sequence come from?" Simply copy and paste your amino acid sequence into the window and click "Run BLAST."

Upon completion, you encounter a colored, graphical representation of the similarity between your sequence and the different proteins identified in the BLAST database. A color scale running from green to red indicates greater to lesser similarity. It also shows areas of significant difference. In the example shown in Figure 1, I ran a BLAST query on an "unknown" sequence and am showing the first two returned values as an illustration. They are both green, reflecting a high level of homology. You'll notice two different naming schemes in this figure. The first refers to the canonical isoform of RIF1, hence "RIF1_HUMAN". The second refers to "isoform 2" of this protein and uses the accession number instead of the protein name: "Q5UIP0-2". As the program considers less-similar proteins, these bars will become increasingly red and increasingly broken, indicating gaps in the sequence.

The score and E-value are re-stated here, but now you can see new information. This includes the "identities," which means "these two amino acids are identical"; the "positives," which can be read as "these two amino acids are different, but have similar chemistry"; and the gaps, which reflect any regions that are missing between the query and the subject sequences. In this case, the protein in question is human RIF1.

Is this Protein Found in a Different Species?

Knowing what we know now, this question is easy to answer. In Figure 2, the species name is the short code (up to five characters) that follows the gene name. If you see anything other than "HUMAN" in this space, you've answered your question. To find out specifically which species you're looking at, scroll down to the view provided in Figure 3. The species name is provided on the second line in the "OS=" field.

Is this Protein Found Specifically in Mice? Again, How Similar is it?

Using our RIF1 example again, take Figure 2 and scroll down further to see if "mouse" is one of our options.
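If you save your hits as FASTA and want to check species without scrolling, the two clues described above (the species code after the underscore in the entry name, and the "OS=" field) can be pulled out with a few lines of Python. This is only a minimal sketch: it assumes UniProt-style headers of the form `db|accession|ENTRY_NAME description OS=...`, the function name `parse_header` is my own, and the example header is hand-typed for illustration rather than copied from BLAST output.

```python
import re


def parse_header(header):
    """Split a UniProt-style FASTA header into its parts.

    Assumes the layout 'db|accession|ENTRY_NAME description OS=Organism ...'.
    """
    header = header.lstrip(">").strip()
    ids, _, rest = header.partition(" ")
    parts = ids.split("|")
    if len(parts) != 3:
        raise ValueError("not a UniProt-style header: " + header)
    db, accession, entry_name = parts

    # Entry names look like RIF1_HUMAN: the species code follows the underscore.
    species_code = entry_name.rpartition("_")[2]

    # The OS= value runs until the next two-letter KEY= field or the end of line.
    match = re.search(r"OS=(.+?)(?=\s[A-Z]{2}=|$)", rest)
    organism = match.group(1) if match else None

    return {
        "accession": accession,
        "entry_name": entry_name,
        "species_code": species_code,
        "organism": organism,
    }


# Hand-typed example header (illustrative only)
example = ">sp|Q5UIP0|RIF1_HUMAN Telomere-associated protein RIF1 OS=Homo sapiens OX=9606 GN=RIF1"
info = parse_header(example)
print(info["species_code"])  # HUMAN
print(info["organism"])      # Homo sapiens
```

Anything other than "HUMAN" in `species_code` answers the different-species question, and `organism` gives you the full name to confirm it, for example when looking for a mouse hit.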
