Babelium applied for GSoC 2011

Yesterday I filled in and sent the form for Babelium Project to take part in this year’s edition of Google Summer of Code. The list of accepted mentoring organizations will be published on the GSoC 2011 site this Friday (March 18th).

You can read more about our ideas proposal here. Don’t be shy and add your own ideas and comments!

Cross your fingers! 🙂

Video Slicing module coming up soon!

During the last semester of 2010 and the beginning of 2011 we have been working on a new feature for babeliumproject: the video slicing module. With it, users will be able to search for their favourite language learning videos hosted on YouTube and, with a simple “copy-paste” action, preview the video in our video player, slice out the most interesting part and upload it to Babelium’s server.

This could not have been done without YouTube’s Chromeless Player API, the youtube-dl command to download the video temporarily, and the ffmpeg command to slice it. We also had to use YouTube’s PHP API for various tasks, one of them being to check the video’s duration, as we currently do not accept videos longer than 2 minutes.
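To give a rough idea of the download-and-slice step, here is a minimal shell sketch. It is not the module's actual code: the function name slice_commands, the file names source.mp4 and slice.mp4, and the idea of applying the 2-minute limit to the slice length are all assumptions made for illustration. It only prints the two commands, so nothing is downloaded when you try it.

```shell
# Hypothetical helper (not Babelium's code): print the commands that would
# download a video and cut a slice out of it.
# Arguments: video URL, slice start in seconds, slice length in seconds.
slice_commands() {
    local url="$1" start="$2" length="$3"
    local max_length=120   # assumption: the 2-minute limit applies to the slice

    if [ "$length" -gt "$max_length" ]; then
        echo "slice is longer than ${max_length} seconds" >&2
        return 1
    fi

    # -f mp4 asks youtube-dl for an MP4 format, -o sets the output file name
    printf "youtube-dl -f mp4 -o source.mp4 '%s'\n" "$url"
    # -ss: start offset, -t: duration, -c copy: cut without re-encoding
    printf 'ffmpeg -ss %s -i source.mp4 -t %s -c copy slice.mp4\n' "$start" "$length"
}
```

For example, slice_commands 'http://www.youtube.com/watch?v=VIDEO_ID' 30 45 prints the commands for a 45-second slice starting at second 30. Because -c copy does not re-encode, the cut lands on the nearest keyframe rather than on the exact second.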

But enough explanations for now! Let us show you how it looks right now:

Granted, it is only a draft, but we feel it is enough to get a grasp of what it could look like. We are currently at the final stage of development, trying to integrate it with our current video player, and will soon start running tests on the module. Hopefully, it will be available in the last weeks of March or the first part of April.

The idea is to make it work in the near future not only with YouTube, but also with Vimeo and Dailymotion. We hope you enjoyed this piece of news! See you soon!

rsync-ing remote videos to a local machine

We have crafted the following bash script using rsync to get all the videos from the stream folder in Babelium Project (a remote to local backup):

rsync -r --partial --progress -t --ignore-existing --rsh "ssh  -i certificate.pem"* .

Quite a long rsync command, isn’t it? Well, there are a lot of options involved here:

-r recurse into directories (the copy direction is set by the argument order: remote source first, local destination last)
--partial keep partially transferred files
--progress show progress indication during transfer
-t preserve modification times
--ignore-existing skip updating files that already exist on the destination
--rsh "ssh -i certificate.pem" use ssh as the remote shell and log in with the given key file
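If you prefer a more self-explanatory version, the same options can be written in their long form and wrapped in a small function. This is only a sketch: the function name backup_streams and the RSYNC override variable are made up for illustration, and the remote path in the usage example is a placeholder, not our real server.

```shell
# Hypothetical wrapper around the command above, using long option names.
# Arguments: remote source, local destination, ssh key file.
# Setting RSYNC=echo prints the arguments instead of starting a transfer.
backup_streams() {
    # --recursive is -r, --times is -t, --rsh is the long form of -e
    ${RSYNC:-rsync} --recursive --partial --progress --times --ignore-existing \
        --rsh "ssh -i $3" "$1" "$2"
}
```

A call would look like backup_streams 'user@example.org:/path/to/streams/' . certificate.pem, where the user, host and path are placeholders you have to replace with your own.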

A handy script that you could be interested in…

See you all at the IEB 2011 congress!

We have been accepted to present our Babelium Project in a short 20-minute session at the IEB 2011 Congress (Informatikari Euskaldunen Bilkura), a congress (now in its 8th edition) that showcases top-notch developments and ideas from the Basque Country’s computer science community. It will be held in Donostia on May 12th and this year’s motto will be “Social Networks (en) / Gizarte Sareak (eu)”. We will keep you updated with new info about this event. See you all there!

Migrating from to

We have decided to start the migration from to (Google Project Hosting, GPH). First, we downloaded the source code from Subversion and uploaded it to a Mercurial repo in GPH. Piece of cake! 😉

Then, the tough one: we have had to migrate the issues from xp-dev to GPH, but xp-dev has NO API, so this weekend I rolled up my sleeves and crafted a tiny PHP script for web-scraping the bugs/issues list from (you have to replace USERNAME, PASSWORD and BUGS_ID with your own values):

   <?php
   // cURL setup: log in first and keep the session cookie in the "cookies" file
   $ch = curl_init();
   curl_setopt($ch, CURLOPT_URL, ''); // login URL
   curl_setopt($ch, CURLOPT_POSTFIELDS, 'username=USERNAME&password=PASSWORD');
   curl_setopt($ch, CURLOPT_POST, 1);
   curl_setopt($ch, CURLOPT_HEADER, 0);
   //curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
   // Note: this disables certificate verification
   curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
   curl_setopt($ch, CURLOPT_COOKIEJAR, "cookies");
   curl_setopt($ch, CURLOPT_COOKIEFILE, "cookies");
   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
   // Make sure you put a popular web browser here (the signature of your own
   // browser can be retrieved with: echo $_SERVER['HTTP_USER_AGENT'];)
   curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv: Gecko/2009070611 Firefox/3.0.12");
   // Send the login request before asking for the issues page
   curl_exec($ch);
   // Switch back to GET and fetch the bugs/issues list
   curl_setopt($ch, CURLOPT_HTTPGET, 1);
   curl_setopt($ch, CURLOPT_URL, ''); // issues list URL (contains BUGS_ID)
   $subject = curl_exec($ch);
   curl_close($ch);
   // Print every issue as "title~description~"
   foreach (preg_split("/(\r?\n)/", $subject) as $line) {
       if (preg_match('/bugTitle">(.*)<\/h5><p>(.*)<\/p>/ism', $line, $trozos)) {
           echo html_entity_decode($trozos[1], ENT_QUOTES, 'UTF-8') . "~" . html_entity_decode($trozos[2], ENT_QUOTES, 'UTF-8') . "~";
       }
   }

So far, so good, but what are we supposed to do with that script? Well, you have to execute it from the command line and redirect the output to a file. For example:

$ php -q issue_scrapping.php  >  issues.txt
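Before importing anything, it may be worth checking that the file looks sane. The script writes every issue as title~description~, so a couple of small shell helpers can count and list the scraped issues. The function names count_issues and list_issues are invented for this sketch, and it assumes that no title or description contains a ~ character of its own.

```shell
# Hypothetical helpers for inspecting the file produced by the PHP script.

# Count the issues: each one contributes two "~" separators.
count_issues() {
    echo $(( $(tr -cd '~' < "$1" | wc -c) / 2 ))
}

# Print one issue per line as "title<TAB>description".
list_issues() {
    tr '~' '\n' < "$1" | paste - -
}
```

Running count_issues issues.txt and comparing the number with the one shown in the bug tracker is a quick way to spot a failed login or a regular expression that no longer matches the page.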

Now, the hardest part: you have to import those issues into GPH using the GData API for Issue Tracking. I’ve used the Java binding of the API. The only thing I had to customize in this example from the API is a little piece of Java code in the run() method:

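		// Assumption: tokens are split on the "~" separator written by the PHP script
		Scanner file = new Scanner(new File("issues.txt")).useDelimiter("~");
		while (file.hasNext()) {
			// Assumption: summary and description are consecutive tokens
			String summary = file.next();
			String description = file.hasNext() ? file.next() : "";
			// Create an issue
			IssuesEntry issueInserted = client.insertIssue(makeNewIssue(
					summary, description));
			String issueId = client.getIssueId(issueInserted.getId());
			System.out.println("Issue #" + issueId + " created");
		}
		file.close();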

(and in the makeNewIssue() method so that it accepts the two new String parameters, summary and description). And here you have the result of this hand-made porting 😉

Open sourcing Babelium’s source code

After more than a year and a half we already have a working version of Babelium’s source code. Nowadays it’s in use in (there are some rough edges that need to be fixed, though).

We have been using a Subversion VCS server, but we have also suffered the pain of branching and merging in SVN, so we decided to grab the bull by the horns and start using a DVCS like Mercurial. Why Mercurial and not Git? Well, there is a lot of documentation about these two beasts, both good and bad, depending on the teller. But Google Project Code Hosting is using Mercurial, so there we go.

Until today, we have been using the excellent service. It has a freemium model for project hosting. One of the services behind the paywall is precisely DVCS support. There are also some other annoyances (the lack of an API and the lack of integration with Eclipse/Mylyn being two of the biggest). But as I’ve said, it’s a great website if you want to stop kicking the tires and get the ball rolling for free! Their active support via Twitter (@xpdev) is also excellent. Kudos to them!

Now we are struggling to get the conversion from SVN to HG right. We hope to finish this task tomorrow. We’ll keep you informed!