Monday, March 31, 2008

My Geek Pride is hurt: BLOSUM matrices

BLOSUM (BLOcks SUbstitution Matrix) matrices are the canonical substitution matrices used for scoring protein sequence alignments. In essence, the method counts how often each pair of amino acids shares a column in blocks of aligned sequences and turns those frequencies into a log-odds score for substituting one residue with another. The number in the name (BLOSUM80, BLOSUM62...) is a percent-identity threshold: within each block, sequences at least that similar are clustered and counted as one. Matrices with high numbers therefore reflect substitutions among more closely related sequences and are more stringent.
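For the curious, here is a minimal sketch of that frequency-to-score idea in Python. It is not the Henikoff & Henikoff program: it skips the clustering/weighting step entirely, and the block `toy_block` and the function names are made up for illustration. It only shows the usual log-odds recipe, score = 2 * log2(observed pair frequency / expected pair frequency), rounded to an integer (half-bit units).

```python
from collections import Counter
from itertools import combinations
from math import log2


def pair_counts(block):
    """Count unordered residue pairs in every column of an ungapped block."""
    counts = Counter()
    for column in zip(*block):
        for a, b in combinations(column, 2):
            counts[tuple(sorted((a, b)))] += 1
    return counts


def log_odds_scores(block):
    """Return rounded half-bit log-odds scores for the pairs seen in the block."""
    counts = pair_counts(block)
    total = sum(counts.values())
    # Observed frequency of each unordered pair.
    q = {pair: n / total for pair, n in counts.items()}

    # Background frequency of each residue, derived from the pair frequencies.
    p = Counter()
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2

    scores = {}
    for (a, b), freq in q.items():
        # Expected frequency if residues paired up at random.
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(2 * log2(freq / expected))
    return scores


# Hypothetical toy block: four aligned sequences, three columns, no gaps.
toy_block = ["AAI", "AAL", "SAL", "AAV"]
scores = log_odds_scores(toy_block)
for pair, score in sorted(scores.items()):
    print(pair, score)
```

A block this small gives scores that mean nothing biologically; the real matrices come from thousands of blocks, and pairs that never co-occur would need extra handling that this sketch leaves out.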

BLOSUM matrices were developed in 1992 by Henikoff and Henikoff and since then have been extensively used in all analyses involving protein sequences...

and then, here comes the "AAARRGHHHH!!!"

Styczynski et al. (2008) were killing time looking at the evolution of the BLOCKS database and found the unthinkable... an error in the source code of the program that calculates the BLOSUM matrices!!!! That means... the BLOSUM matrices everybody uses differ significantly from what the algorithm described by Henikoff & Henikoff should produce... merde!

Weirdest thing of all... when the error was corrected and both sets of matrices were tested in database sequence searches, it turned out that the "wrong" matrices performed much better at retrieving protein homologs than the "corrected" ones.

Fortunately, it seems that although the difference is statistically significant, it is not big. That means we haven't fucked it up so badly.

Epilogue to the blosum...

1) 16 years of extensive usage doesn't mean it is RIGHT.

2) How come no one, in 16 years, ever noticed this difference!!! THAT is what happens with dogma... when you take anything for granted

3) messing things up is not always THAT bad...

4) I didn't understand from the article whether they propose that the matrices should be corrected even if they perform worse...

5) I would expect to see a huge ocean of errata everywhere because "when using the revised blosum matrices... our results from the past ten years have completely changed!!"

Tuesday, March 04, 2008

Bound to...

Diving into my iPod, I rediscovered what I think is one of the best break-up songs ever. I'm not in that mood now, but I keep recognizing it. So this post is dedicated to all those girls who are actually brave enough to say 'I'm through'... I mean, there must be some!

"I've thought about it for a while
and I've thought about that many miles
but I think it's time that I've gone away

The feelings that you had for me
have gone away it's plain to see
and it looks to me
that you're pulling away

I'm gonna pick it up
I'm gonna pick it up today
I'm bound to pack it up
I'm bound to pack it up
and go away

I found it hard to say to you
that this is what I have to do
but there is no way that I'm gonna stay

there are so many things
you need to know
and I wanna tell you before I go
but it's hard to think of just what to say

I'm gonna pick it up
I'm gonna pick it up today
I'm bound to pack it up
I'm bound to pack it up
and go away

I'm sorry to leave you all alone
you're sitting silent by the phone
but we've always known there would come a day

The bus is warm and softly lit
and a hundred people are ridin' it
I guess I'm just another running away

I'm gonna pick it up
I'm gonna pick it up today
I'm bound to pack it up
I'm bound to pack it up
and go away

Oh yeah, yeah"

The White Stripes, De Stijl...