rsync-ing remote videos to a local machine

We have crafted the following bash one-liner using rsync to fetch all the videos from the stream folder in Babelium Project (a remote-to-local backup):

rsync -r --partial --progress -t --ignore-existing --rsh "ssh  -i certificate.pem"* .

Quite a long rsync command, isn’t it? Well, there are a lot of options involved here:

-r recurse into directories (the copy direction, remote to local, comes from the argument order: source first, destination last)
--partial keep partially transferred files
--progress show progress indication during transfer
-t preserve modification times
--ignore-existing skip updating files that already exist on the destination
--rsh "ssh -i certificate.pem" use ssh as the remote shell and log in with the private key stored in certificate.pem

A handy command that you might find useful…

See you all at the IEB 2011 congress!

We have been accepted to present our Babelium Project in a short 20-minute session at the IEB 2011 Congress (Informatikari Euskaldunen Bilkura), a congress (now in its 8th edition) that showcases top-notch developments and ideas from the Basque Country’s computer science community. It will be held in Donostia on May 12th, and this year’s motto will be “Social Networks (en) / Gizarte Sareak (eu)”. We will keep you updated with new info about this event. See you all there!

Migrating from xp-dev to Google Project Hosting

We have decided to start the migration from xp-dev to Google Project Hosting (GPH). First, we downloaded the source code from Subversion and uploaded it to a Mercurial repo in GPH. Piece of cake! 😉

Then, the tough one: we had to migrate the issues from xp-dev to GPH, but xp-dev has NO API, so this weekend I rolled up my sleeves and crafted a tiny PHP script for web-scraping the bugs/issues list from xp-dev (you have to replace USERNAME, PASSWORD and BUGS_ID with your own values):

   <?php
   //CURL stuff
   //This executes the login procedure
   $ch = curl_init();
   curl_setopt($ch, CURLOPT_URL, '');
   curl_setopt($ch, CURLOPT_POSTFIELDS, 'username=USERNAME&password=PASSWORD');
   curl_setopt($ch, CURLOPT_POST, 1);
   curl_setopt($ch, CURLOPT_HEADER, 0);
   //curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
   //certificate verification is switched off here; enable it if the site's certificate validates
   curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
   //keep the session cookie so that the next request is authenticated
   curl_setopt($ch, CURLOPT_COOKIEJAR, "cookies");
   curl_setopt($ch, CURLOPT_COOKIEFILE, "cookies");
   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
   //make sure you put a popular web browser here (the signature of your own browser can be retrieved with: echo $_SERVER['HTTP_USER_AGENT'];)
   curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv: Gecko/2009070611 Firefox/3.0.12");
   curl_exec($ch); //send the login request
   //This fetches the bugs/issues list with a plain GET, reusing the session cookie
   curl_setopt($ch, CURLOPT_HTTPGET, 1);
   curl_setopt($ch, CURLOPT_URL, '');
   $subject = curl_exec($ch);
   curl_close($ch);
   //Print every matching line as title~description~
   foreach(preg_split("/(\r?\n)/", $subject) as $line){
       if (preg_match('/bugTitle">(.*)<\/h5><p>(.*)<\/p>/ism', $line, $trozos)) {
           echo html_entity_decode($trozos[1], ENT_QUOTES, 'UTF-8') . "~" . html_entity_decode($trozos[2], ENT_QUOTES, 'UTF-8') . "~";
       }
   }
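
To see what the regular expression in the scraping script captures, it can be tried in isolation. The following is only a sketch in Java (the language used later for the import step), not part of the original script: the class name BugLineMatcher and the sample HTML line are made up for illustration, and the entity decoding done by html_entity_decode() is left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BugLineMatcher {
    // Same expression as in the PHP script; the i and s flags map to
    // CASE_INSENSITIVE and DOTALL (m only affects ^ and $, which are not used).
    private static final Pattern BUG = Pattern.compile(
            "bugTitle\">(.*)</h5><p>(.*)</p>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns "title~description~" for a matching line, or "" otherwise. */
    static String extract(String line) {
        Matcher m = BUG.matcher(line);
        return m.find() ? m.group(1) + "~" + m.group(2) + "~" : "";
    }

    public static void main(String[] args) {
        // Hypothetical markup, invented for illustration only.
        String line = "<h5 class=\"bugTitle\">Video upload fails</h5><p>Steps to reproduce</p>";
        System.out.println(extract(line));
    }
}
```

Both groups are greedy, so this only behaves well when each line holds a single issue, which is what the line-by-line loop in the script assumes.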

So far, so good, but what are we supposed to do with that script? Well, you have to execute it from the command line and redirect the output to a file. For example:

$ php -q issue_scrapping.php  >  issues.txt

Now, the hardest part: you have to import those issues into GPH using the GData API for Issue Tracking. I’ve used the Java binding of the API. The only thing I had to customize in the example from the API is a little piece of Java code in the run() method:

		// Assumption: issues.txt holds title~description~ pairs, as printed by the PHP script
		Scanner file = new Scanner(new File("issues.txt")).useDelimiter("~");
		while (file.hasNext()) {
			String summary = file.next();
			String description = file.next();
			// Create an issue
			IssuesEntry issueInserted = client.insertIssue(makeNewIssue(
					summary, description));
			String issueId = client.getIssueId(issueInserted.getId());
			System.out.println("Issue #" + issueId + " created");
		}
		file.close();

(makeNewIssue() also has to be changed so that it accepts the two new String parameters, summary and description.) And here you have the result of this hand-made porting 😉
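
The file-reading half of that loop can be tested without touching the GData API. Here is a minimal sketch, assuming the file content is a flat sequence of title~description~ pairs as echoed by the PHP script; the class IssueFileReader and the sample data are hypothetical and not part of the API example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class IssueFileReader {
    /** Splits "summary~description~..." text into [summary, description] pairs. */
    static List<String[]> readIssues(String text) {
        List<String[]> issues = new ArrayList<>();
        try (Scanner scanner = new Scanner(text).useDelimiter("~")) {
            while (scanner.hasNext()) {
                String summary = scanner.next();
                // A trailing summary without a description is skipped.
                if (!scanner.hasNext()) {
                    break;
                }
                String description = scanner.next();
                issues.add(new String[] {summary, description});
            }
        }
        return issues;
    }

    public static void main(String[] args) {
        // Sample data invented for illustration.
        for (String[] issue : readIssues("Login fails~Steps here~Typo in footer~See homepage~")) {
            System.out.println(issue[0] + " -> " + issue[1]);
        }
    }
}
```

Note that a title or description containing a ~ character would shift the pairs, so it is worth eyeballing issues.txt before importing.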

Open sourcing the Babelium’s source code

After more than a year and a half, we already have a working version of Babelium’s source code. Nowadays it’s in use in (there are some rough edges that need to be fixed, though).

We have been using a Subversion VCS server, but we have also suffered the pain of branching and merging in SVN, so we decided to take the bull by the horns and start using a DVCS like Mercurial. Why Mercurial and not Git? Well, there is a lot of documentation about these two beasts, both good and bad, depending on the teller. But Google Project Hosting uses Mercurial, so there we go.

Until today, we have been using the excellent xp-dev service. It has a freemium model for project hosting. One of the services behind the paywall is precisely DVCS support. There are also some other annoyances (the lack of an API and the lack of integration with Eclipse/Mylyn being two of the biggest). But as I’ve said, it’s a great website if you want to stop kicking the tires and get the ball rolling for free! Their active support via Twitter (@xpdev) is also excellent. Kudos to them!

Now we are struggling to get the conversion from SVN to HG right. We hope to finish this task tomorrow. We’ll keep you informed!