my Twitter account!

I have just created a Twitter account. You can now follow me at: http://twitter.com/#!/dalloliogm

I resisted joining Twitter for a long time, but now I need it to take part in a spare-time project of mine. I recognize that Twitter can be a very useful tool for a researcher, but I am worried that it will be too intrusive and distract me too much.

Do you have any suggestions for a new Twitter user? Which software (on Ubuntu) do you use to check the feeds? Which groups would you recommend to a bioinformatician?

Posted in Uncategorized | 3 Comments

scripting Cytoscape to plot different Node Centrality measures

I have finally made it: scripting and automating Cytoscape with Python! Below you can see a figure that I generated automatically with Cytoscape, with the legend and the distribution of values merged into a single file:

Distribution of 'Centroid' values in the pathway of N-Glycosylation. Figure generated by automatically scripting Cytoscape: Click on the figure to see the whole pdf report.

Cytoscape is a program for visualizing and analyzing networks. It is widely adopted by the bioinformatics community and has many plugins for analyzing biological data. Unfortunately for me, it is written in Java, which makes it a lot more difficult to automate (at least for people who don’t program in Java, like me).

One of the protocols I wanted to automate in Cytoscape was to plot different measures for the nodes of the same network and automatically export an image of each plot, along with its legend. For example, I wanted to calculate different measures of node centrality on a network with Centiscape, then plot a figure for each measure and save it to a file.

I finally managed to automate this when I discovered the XMLRPC plugin for Cytoscape. I have learned a lesson: if you want to automate anything in Cytoscape with a programming language other than Java, use the XMLRPC plugin. There is also a Python Console plugin for Cytoscape, but I recommend using XMLRPC directly. It is better documented (I couldn’t find any documentation for the Console plugin), you can launch it from a bash terminal, and if you use ipython you get name completion. Moreover, the XMLRPC protocol is more of a standard than Cytoscape’s internal Python console, so you will also learn something useful from it.
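To give an idea of what talking to an XML-RPC service from Python looks like, here is a minimal sketch that uses only the standard library. It does not use the real API of the Cytoscape plugin: a small local server stands in for Cytoscape, and the method name `count_nodes` is hypothetical, made up for illustration. With the real plugin you would point `ServerProxy` at the address it listens on and call the methods listed in its documentation.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def start_stand_in_server():
    """Start a local XML-RPC server that plays the role of the plugin.

    ASSUMPTION: `count_nodes` is a made-up method, not part of any
    Cytoscape plugin. It counts the distinct nodes in a list of edges.
    """
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(
        lambda edges: len({node for edge in edges for node in edge}),
        "count_nodes",
    )
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def count_nodes_remotely(edges):
    """Send the edges over XML-RPC and return the server's answer."""
    server = start_stand_in_server()
    host, port = server.server_address
    try:
        # The proxy turns attribute access into remote method calls.
        client = ServerProxy(f"http://{host}:{port}")
        return client.count_nodes(edges)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(count_nodes_remotely([["A", "B"], ["B", "C"]]))
```

The client side is the part that carries over: once the proxy exists, remote methods are called like ordinary Python functions, which is why it works so well from ipython.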

So, if you want to see an example of how to automate Cytoscape with Python, or want to compare different measures of node centrality on a network, you can access a repository called ‘Cytoscape compare node centralities’ that I set up on bitbucket. The code also answers one of the most pressing problems for Cytoscape users: exporting a network view along with its legend.

Posted in Uncategorized | Leave a comment

colleague leaving academia

Massimo Sandal is the person who introduced me to Linux. People who know me in person will understand how much this means to me, since I am a total hard-core Linux geek.

About six or seven years ago, in the second year of my bachelor’s degree, I joined a group of geeky students in the Biotec faculty who had created a mailing list to discuss Linux and free software, and who proposed meeting every now and then to install Fedora Core 1 or play with it. In practice, we ended up meeting only once or twice, but that was enough to lead me to the dark side and turn me into the Linux nerd I am now.

Since then, the open source world has become very important to me. I clearly remember the day when something in my brain switched on and I realized how much open source software is available out there, and how many things I could learn by using it. The exact moment of that conversion was when I was reading the Zope book on the train back to my home town, playing with my laptop. I could not believe that so much documentation and so many modules were available for free: that was how Microsoft’s blinkers fell from my eyes. I like to say that, after that, the speed at which I improved my programming and computer skills increased at least 5- or 10-fold.

So I was surprised to read, in the Italian media, about a post that Massimo wrote on his blog about his decision to leave academia. I am not sad about him leaving research: it is a personal decision, and I respect that. However, I am sad that a person like Massimo doesn’t feel comfortable in the academic world.

What else can I say... I wish him all the best, and I hope he will find an even more geeky and nerdy job wherever he goes. Now it’s my turn to start converting new innocent souls to the dark Free Software side... I have already started :-)

Posted in Uncategorized | 2 Comments

Should a pipeline have ‘if’ conditions and loops?

On the ruffus mailing list[1] I am taking part in a discussion on whether a pipeline should contain ‘if’ conditions and loops.

I don’t like to see conditions in a pipeline. In real pipelines, the ones with pipes and water in them, there is no equivalent of ‘if’ conditions. A tube can be either open or closed, but that is defined in the structure of the pipeline. The pipeline cannot change its structure and open or modify its path depending on whether water or oil is running through it, and the water cannot choose whether or not to enter a tube.

With Makefiles, you usually avoid if and while conditions to keep the code easier to understand. Moreover, when you have to execute the same task for multiple elements with make (e.g. call the same program with different inputs), you launch a series of parallel jobs rather than writing a loop: that is similar to splitting a pipeline into different tubes.

So, for me a pipeline is just a script that you call and that executes a series of steps. If you start putting ifs and loops in it, you can no longer tell which steps run each time you launch it, and the code becomes more difficult to understand.
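As a rough illustration of what I mean, here is a small sketch in plain Python. It is neither ruffus nor make: the three steps and their names are hypothetical, invented for this example. The structure of the pipeline is declared once, every input goes through exactly the same steps, and several inputs are handled as parallel jobs instead of an explicit loop in the pipeline itself.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


# Hypothetical steps, invented for this example.
def strip_whitespace(sequence):
    return sequence.strip()


def to_upper_case(sequence):
    return sequence.upper()


def count_bases(sequence):
    return len(sequence)


# The structure of the pipeline: fixed, and readable at a glance.
STEPS = (strip_whitespace, to_upper_case, count_bases)


def run_pipeline(sequence):
    """Push one input through every step, in order, with no branching."""
    return reduce(lambda value, step: step(value), STEPS, sequence)


def run_on_all(sequences):
    """Run the same pipeline on each input as a separate parallel job."""
    with ThreadPoolExecutor() as pool:
        # map() returns the results in the same order as the inputs.
        return list(pool.map(run_pipeline, sequences))


if __name__ == "__main__":
    print(run_on_all([" acgt ", "tt\n"]))
```

Reading `STEPS` is enough to know what happens to every input, which is the property that gets lost once conditions are added.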

What do you think? Am I being too silly? Long live pipelines!! :-)

[1] ruffus is a tool to define bioinformatics pipelines with a python-like syntax, as an alternative to Makefiles

Posted in Uncategorized | 2 Comments

a discussion about node centralities with G.Scardoni

Last week we hosted a visit from G. Scardoni, the author of Centiscape, a Cytoscape plugin that calculates different measures of node centrality in a network.

I recommend reading supplementary material 1 of his paper (it’s a pdf), because it has a good description of many measures of node centrality and their possible interpretation in a biological context.

A node centrality is a parameter that, given a node’s position and interactions in a network, determines its importance. To understand it better, consider that one of the main uses of centralities in biology is to identify which genes are most important in a biological process: which are in a bottleneck position, which are required for proper function, and which are merely redundant.

The simplest measure of node centrality is the degree, which is the number of connections of a node. It seems logical to think that genes with a higher degree (a higher number of interactions) should be more important than the others, because a loss of function there will affect more interactions. However, after reading about the Centiscape plugin I realized that there are many measures of node centrality, including closeness, betweenness, stress, centroid, etc. The degree is not the best parameter for identifying genes in bottleneck positions, for which we should use betweenness or stress instead.
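To make the difference between degree and betweenness concrete, here is a small sketch in plain Python. The toy network is made up (two triangles joined through a single node X), and the betweenness function is a textbook shortest-path implementation (Brandes' algorithm) for unweighted, undirected graphs. It is not Centiscape's code, and Centiscape may scale its values differently.

```python
from collections import deque

# Made-up toy network: triangles A-B-C and D-E-F, joined only through X.
NETWORK = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "X"],
    "X": ["C", "D"],
    "D": ["X", "E", "F"], "E": ["D", "F"], "F": ["D", "E"],
}


def degree(graph):
    """Number of connections of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def betweenness(graph):
    """Shortest paths passing through each node (unnormalized)."""
    scores = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search, counting shortest paths from the source.
        order, distance = [], {source: 0}
        paths = dict.fromkeys(graph, 0)
        paths[source] = 1
        predecessors = {node: [] for node in graph}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in graph[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[node] + 1:
                    paths[neighbour] += paths[node]
                    predecessors[neighbour].append(node)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = dict.fromkeys(graph, 0.0)
        for node in reversed(order):
            for previous in predecessors[node]:
                share = paths[previous] / paths[node]
                dependency[previous] += share * (1 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    # In an undirected graph every pair of nodes was counted twice.
    return {node: score / 2 for node, score in scores.items()}


if __name__ == "__main__":
    print(degree(NETWORK))
    print(betweenness(NETWORK))
```

In this toy network X has only two connections, fewer than C or D, but every shortest path between the two triangles passes through it, so it gets the highest betweenness. That is the bottleneck situation that the degree alone misses.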

Wikipedia image showing betweenness in a network.

To round off this post, I have opened a discussion on biostar about which measures of node centrality can be applied to biological networks. Let’s see what comes out of that discussion, and whether there are other centralities I do not know yet :-)

Posted in Uncategorized | 2 Comments

second part of the slideshow on HG for bioinformatics

I gave the second part of the talk on version control and hg to my group (check the first part). Here are the slides:

I am working in collaboration with some of my colleagues to write a pipeline for calculating some tests for our projects.

The idea is to use hg to coordinate the writing of these scripts. We will keep a reference version of the scripts in a private bitbucket.org repository; everybody will then synchronize their local copy from there and upload new changes to the same place.

Some of my colleagues told me that hg is much easier than they thought. I am very happy about this, because I was worried it would be too difficult to use. I have wanted to convince my colleagues to adopt a version control tool for a long time, and it seems to have been easier than I expected.

Posted in slideshows, talks | 2 Comments

links, resources, games, tools (January 2011)

These are links that I have collected in the past two months. I am copying them in a pseudo-random order.

PhD students life / becoming a better PhD student

1000 genomes & co

Posted in Uncategorized | Leave a comment

Apps and videogames for bioinformatics/genetics geeks (January 2011)

Apps and Games

  • FreePub is a mind-mapping program for organizing scientific materials. See also this presentation
  • After the games about protein folding and multiple alignments, a new geeky bioinformatics game has been published on the Internet. Check out EteRNA!
Posted in Uncategorized | Leave a comment

For N-Glycosylation freaks

If you are an N-Glycosylation geek, check out this interview with Ajit Varki, a guru of the field:

Interview with A. Varki on the importance of studying post-translational modifications

Posted in Uncategorized | Leave a comment

short introduction to version control and hg (mercurial)

Last week I gave a short introductory talk to explain hg and the concept of version control to my colleagues.

If you are new to the concepts of version control, I recommend watching the excellent introductory videos at Software Carpentry.

Posted in talks | Leave a comment