Creating Phylogenetic Trees (a Happy Ending)

You may remember that I posted a few weeks ago about how to create phylogenetic trees out of similar genes using SeaView and RAxML. To recap briefly, I created a multiple sequence alignment from FASTA files, removed all the gaps so that only the substitutions were left, and then ran it through RAxML to produce the trees. Unfortunately, I couldn’t get the trees to open with TreeView.

I spoke to an expert the next day and discovered that TreeView has certain dependencies on Ubuntu that were difficult to resolve, so the answer was… to use Windows. I have to admit, it seems like a pretty funny answer to the question, given how much better Ubuntu is for bioinformatics tasks than Windows. Even BioLinux, the Linux built especially for bioinformatics, was unable to open my phylogenetic tree.

Anyway, here are two different views of the same tree for you to enjoy, with highlighting and legends added in MS Paint.

Creating Phylogenetic Trees

Our assignment for the past week has been to create phylogenetic trees from multiple sequence alignments based on clusters of orthologous genes (COGs). Specifically, we had to work out why a simple BLAST search was unable to accurately place a subject gene from Cryptosporidium parvum into a COG category.

I think this assignment is an interesting exercise in ‘real’ bioinformatics: where data is messy, the programs are challenging to install and use, and in the end you’re not quite sure what you ended up with, but it’s enormous fun anyway!

Generating plots and correlation coefficients with PostgreSQL and R

Today’s major task was to generate some correlation coefficients showing that our approach to inferring data was consistent with established results. One of my colleagues generated a plot a few months (years?) ago that showed a respectable correlation of 0.86. Unfortunately, there are only 11 points on the plot, where there should be closer to 500 000. In the many presentations I’ve seen on this topic, that correlation slide is always questioned.

Fortunately, all of this data is available in our PostgreSQL database. Unfortunately, getting it out was an adventure in several languages and programs that I tend to avoid: Perl, vi, and especially R.

Avoiding spam using mouseOver

All savvy website designers come across the problem sooner or later: how to make my email address easily accessible to the world, but not to the spam bots who creep through the internet looking for unsuspecting @ signs. Googling for an answer to the problem returns widespread cynicism. Many people have invested quite a lot of their time into making email addresses ‘bot-resistant’, hoping that they won’t alienate their users while at the same time keeping spam-free.

HTML love and my new website

I have known how to write HTML code for over half of my life. How many people can say that? I learned to write it twelve years ago. I signed up for GeoCities and Tripod and built half a dozen websites. Although they’ve all been long retired, I can still remember the joy I derived from tweaking the code, a little here and a little there, to make something that the world could see.

Several years later, my first work website was also coded by hand, with amazing CSS drop-down menus and PHP include statements that made website maintenance simple, provided you could actually sort out the layers within layers. I think that website was the first time I realized that it was possible for code to be beautiful.

That website is now offline too (though available on the Wayback Machine! Aw, look how cute it is!). I’ve pondered what kind of new site to create. By pondered, I mean procrastinated. After going to the Cytoscape Retreat and seeing how woefully backward I am without my own webspace, I finally gave in and spent a day of my free time crafting a new home for my details. And fell in love all over again with HTML and associated technologies.

I could probably spend a week tweaking and modifying my website so it is perfect. However, since I have a life these days and also more work to do than I have available time, I’ll have to satisfy myself with a few tweaks here and there.

Some places that really helped in my search:

  • Open source web design: lots of great free and easily tweakable templates. And open source!
  • COLOURlovers: For those of us who aren’t art majors, finding more than two colors that go together is challenging. COLOURlovers have over a million pre-conceived palettes with fanciful and fun names like Pluck Off and Wasabi Suicide. They also provide hex and RGB codes for each color.
  • Cytoscape: Cytoscape didn’t exactly help me with the details of the website, but thanks to it, I finally have some work that is pretty enough to be made into a website banner. The networks on my homepage are from real data, with the colors mapped in Cytoscape. All I had to do was rasterise it.

And so, with great pleasure, I give you my new home on the web: http://www.staff.ncl.ac.uk/m.taschuk/

Postgresql and Karmic Koala

On the Friday before the Christmas holiday, I decided to update to Karmic Koala. I had put off upgrading for weeks (months?) because I’d heard plenty of horror stories from colleagues about how Karmic messed with all of their settings and options. Upgrading before the holidays would give me something to do, and after the holidays I could ease back into work by fixing whatever the Koala had trampled.

Fortunately, I didn’t have any of the reported problems, but I did encounter a new one today. I was previously running Postgresql 8.3 on Jaunty. During upgrade, Karmic informed me that 8.3 was no longer supported and uninstalled some of the packages, which was perfectly fine because I could use pg_upgradecluster to update to 8.4. Theoretically.

$ pg_upgradecluster -v 8.4 8.3 main
Error: specified cluster is not running

Or maybe not. Which specified cluster wasn’t running? Postgresql 8.3, of course, because Karmic had uninstalled the packages.

$ /etc/init.d/postgresql-8.3 status
8.3 main 5432 down postgres /var/lib/postgresql/8.3/main /var/log/postgresql/postgresql-8.3-main.log
8.4 main 5433 online postgres /var/lib/postgresql/8.4/main /var/log/postgresql/postgresql-8.4-main.log

In order to start Postgresql 8.3, I had to reinstall Postgresql 8.3, but pg_upgradecluster still wouldn’t let me upgrade.

$ sudo pg_upgradecluster -v 8.4 8.3 main
Error: target cluster 8.4/main already exists

Finally, I found a helpful bug entitled “Postgres 8.4 in karmic does not upgrade 8.3 cluster or include instructions on how to do so”, which luckily does offer instructions on how to do so. Not only does Postgresql 8.3 need to be reinstalled and running, but both /etc/postgresql/8.4 and /var/lib/postgresql/8.4 also need to be removed in order for pg_upgradecluster to work.
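
For anyone in the same spot, here is a rough sketch of that sequence as a dry-run script: it only prints the commands, so nothing is touched until you have read them over. The upgrade_plan function is my own invention for illustration. The package name postgresql-8.3, the 8.4 init script, and the step that stops the empty 8.4 cluster before deleting it are assumptions on my part, not something taken from the bug report.

```shell
#!/bin/sh
# Dry-run sketch: prints the recovery steps instead of executing them.
# Assumptions (not from the bug report): the package name postgresql-$old,
# the init script /etc/init.d/postgresql-$new, and stopping the new
# cluster before its directories are removed.
upgrade_plan() {
    old="$1"      # version to upgrade from, e.g. 8.3
    new="$2"      # version to upgrade to, e.g. 8.4
    cluster="$3"  # cluster name, e.g. main

    # 1. Reinstall the old server packages that the release upgrade removed.
    echo "sudo apt-get install postgresql-$old"
    # 2. Start the old cluster; pg_upgradecluster refuses to run if it is down.
    echo "sudo /etc/init.d/postgresql-$old start"
    # 3. Stop the empty new cluster before touching its files (assumed step).
    echo "sudo /etc/init.d/postgresql-$new stop"
    # 4. Remove the automatically created target so the upgrade can recreate it.
    echo "sudo rm -rf /etc/postgresql/$new /var/lib/postgresql/$new"
    # 5. Run the upgrade itself, with the same arguments as earlier in the post.
    echo "sudo pg_upgradecluster -v $new $old $cluster"
}

upgrade_plan 8.3 8.4 main
```

Step 4 throws away whatever lives in the 8.4 directories, so it is only safe when that cluster is the empty one created during the upgrade. Take a backup first if you are not sure.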

This problem can’t be too uncommon. I wonder why it never came up for the Ubuntu team when they proposed the forced upgrade to Postgresql 8.4?