TranslateSvg v2.0 beta testing begins

After a short delay while I sorted out a Wikimedia Labs account, I am pleased to announce that version 2.0 of the TranslateSvg extension is officially available for testing.

TranslateSvg enables the easy translation of virtually any (currently 93.1%, but increasing all the time) SVG image containing text, with the result embedded into the SVG file so that graphical updates instantly propagate to all language versions.

Available for testing are three images to give you a feel for the interface. There’s likely to be one future change – the introduction of an extra dialog box – but it’s 99% feature complete. Well, until *you* tell me what’s wrong with it 🙂

So what are you waiting for? Find ten minutes and get yourself to http://translatesvg.wmflabs.org/wiki/Main_Page.

From 87% to 100%

A week ago I posted about how TranslateSvg can handle 87% of all translatable files (in fact, the figure is probably now around 84.5% due to some methodology tweaks). I haven’t been working on that much since, but I did run an analysis of how to get from there to 100%, a move related to my original analysis of the structures I need to support. The breakdown, then, is as follows:

  • 84.5% – already supported
  • 7% – ability to look inside style tags
  • 3.5% – supporting random clutter inside text tags, not sure what this might be yet
  • 2.8% – support for existing switches with deep hierarchies
  • 2.2% – support for nested tspans

Looks like I’ll be working on some of those, then, over the next fortnight.

UPDATE (30 July): Currently:

  • 93.1% – already supported
  • 3.0% – support for existing switches with deep hierarchies
  • 2.5% – support for nested tspans
  • 0.8% – IDs used in CSS
  • 0.6% – supporting random clutter inside text tags, still not sure what this might be

UPDATE (11 August): Currently, and probably for the foreseeable future:

  • 96.0% – already supported
  • 2.3% – support for nested tspans inside tspans (these don’t actually render correctly on Wikimedia wikis anyway).
  • 0.95% – support for random clutter inside text tags (mostly textPath plus some custom namespace tags)
  • 0.75% – IDs used in CSS
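
For readers wondering what these remaining categories look like in practice, here is a rough Python sketch that flags them in a file. It is only an illustration under stated assumptions: the function name unsupported_features, the category labels and the sample SVG are all hypothetical, the CSS check is a naive regular expression, and none of this is TranslateSvg's actual analysis code.

```python
import re
import xml.etree.ElementTree as ET

SVG = "{http://www.w3.org/2000/svg}"

def unsupported_features(svg_source):
    """Return the set of tricky structures found in an SVG document."""
    root = ET.fromstring(svg_source)
    found = set()
    for text in root.iter(SVG + "text"):
        # Anything other than a tspan directly inside <text> counts as clutter.
        for child in text:
            if child.tag != SVG + "tspan":
                found.add("clutter in text")  # e.g. textPath
        # A tspan that has another tspan as a direct child is "nested".
        for tspan in text.iter(SVG + "tspan"):
            if tspan.find(SVG + "tspan") is not None:
                found.add("nested tspans")
    ids = {el.get("id") for el in root.iter() if el.get("id")}
    for style in root.iter(SVG + "style"):
        # Naive scan for "#some-id" selectors that name a real element ID.
        for match in re.findall(r"#([\w-]+)", style.text or ""):
            if match in ids:
                found.add("IDs used in CSS")
    return found

# Hypothetical sample that exhibits all three structures at once.
SAMPLE = """
<svg xmlns="http://www.w3.org/2000/svg">
  <style>#label1 { fill: red; }</style>
  <text id="label1"><tspan>South <tspan>Sudan</tspan></tspan></text>
  <text><textPath href="#p">Nile</textPath></text>
</svg>
""".strip()

print(sorted(unsupported_features(SAMPLE)))
```

Running a check like this over a large sample of files is one plausible way to arrive at percentages of the kind listed above, though the real figures came from my own analysis rather than from this sketch.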

TranslateSvg: what’s it for?

Public domain map of South Sudan

As my Google Summer of Code project progresses, I realise that I haven’t got any blog posts actually explaining what TranslateSvg is for. Thus, I thought I should at least give one example (there are many I could have picked from) to illustrate the point, so here goes.

On 9 July 2011, South Sudan declared independence. A year on, 142 Wikipedias have created some sort of entry about it, many of them during the initial buzz. I haven’t checked, but I suspect a high proportion haven’t really been edited since.

Several months before, an Italian Wikimedian created a map showing the likely borders of the new nation and its proposed state boundaries. Sometimes with the aid of an existing tool, that map was then translated into other languages, among them English, Greek, Catalan and even Macedonian. These copies were then uploaded onto Wikimedia Commons as separate files.

So far, so good. But South Sudan is a state in its infancy. It has numerous boundary disputes ongoing, and no-one really knows if the state boundaries have been drawn in the ideal places. Thus, one would expect the map to change significantly over the next decade – if it has not changed already. More often than not, these kinds of change are picked up first by editors of the larger projects, who rapidly update their own versions of the map. To do so takes, say, 20 minutes; but to replicate that same change across Catalan, Greek, Macedonian? Hours of work – and dozens of separate uploads. So, editors being volunteers and all that, they tend to only update the language(s) they care about. Unfortunately, this means that image versions can become horribly out of sync, normally to the disadvantage of the smaller wikis.

TranslateSvg changes this workflow, firstly by making it easier to translate files (thus reducing the all-too-common sight of English-language diagrams in use on non-English wikis), and secondly by embedding the new translations within the same SVG file. Thus, when boundaries change, a single update will propagate to all language versions instantly (if you’re worried about how Inkscape handles these, don’t be: you’ll simply see one set of translations on the screen at any one time, and you can even move that label around, thus nudging labels in every language at the same time).
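
To make "translations embedded within the same SVG file" a little more concrete, here is a minimal Python sketch. It assumes the translations sit inside an SVG switch element whose children carry systemLanguage attributes, with a final child that has no such attribute acting as the fallback. The sample markup and the labels_for helper are illustrative assumptions rather than TranslateSvg's actual output or code, and the language test is a simple exact match, whereas real renderers follow the fuller matching rules in the SVG specification.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical map label with embedded translations; the last <text>
# has no systemLanguage attribute and so serves as the fallback.
SVG_SOURCE = f"""
<svg xmlns="{SVG_NS}" width="200" height="100">
  <switch>
    <text x="10" y="20" systemLanguage="fr">Soudan du Sud</text>
    <text x="10" y="20" systemLanguage="it">Sudan del Sud</text>
    <text x="10" y="20">South Sudan</text>
  </switch>
</svg>
""".strip()

def labels_for(svg_source, lang):
    """Return the label text a renderer would pick for `lang`."""
    root = ET.fromstring(svg_source)
    labels = []
    for switch in root.iter(f"{{{SVG_NS}}}switch"):
        for child in switch:
            langs = child.get("systemLanguage")
            # The first child that matches, or carries no condition, wins.
            if langs is None or lang in [l.strip() for l in langs.split(",")]:
                labels.append("".join(child.itertext()).strip())
                break
    return labels

print(labels_for(SVG_SOURCE, "fr"))  # French label
print(labels_for(SVG_SOURCE, "el"))  # no Greek yet, so the fallback
```

Because all three labels share one position and one parent, moving the switch (or redrawing the boundary underneath it) changes every language version in a single edit.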

I think that’s pretty nifty, and I hope you do too 🙂

GSoC update

Over the past few days I’ve been busy upgrading the parser built into TranslateSvg, such that ~87% of all SVG files with strings in them can now be translated — up from ~75% before the upgrade.

More importantly, the parser is now of a kind that could support up to 100%, whereas the old one was effectively tied to 75%.

GSoC: Phase 3 complete

Today, I reached a turning point in my Google Summer of Code project: the tentative completion of phase 3.

This means that, as of today, my local copy of TranslateSvg can take a freshly uploaded SVG file and shepherd the translation from beginning to end.

Well, with a 75% probability it can, anyway 🙂

Still, there’s lots to do: documentation, testing, bugfixes, a wizard, a colorpicker and code review, to name just a few.

But good news nonetheless.

GSoC: Midterm assessments

Yes, it’s coming up to mid-July, or, as it’s known to Google Summer of Code students, mid-term assessment time. This is something of a misnomer for me personally – I’m only about a third of the way through my own scheduled hours on the project, but it’s nevertheless a good time to take a step back and survey the scene.

The original project plan consisted largely of five parts: three main “phases” plus an introduction and a wrapup. At this point, I’m more or less where I should be: the introduction and phases 1 and 2 are completed; phase 3 and the wrapup are not yet started.

You can see what it means to say “phases 1 and 2 completed” by taking a look at this video (you may need to turn your sound up), which follows a user (me pretending to be French) translating a file into his/her own language. A wizard or guide to make the interface, which is borrowed from the Translate extension, more intuitive to newbies is in the works, as is the addition of a “color” property to help with recolouring text after translation.

Reuse onwiki for this visitor is now as simple as [[File:Picturebook 1.svg|thumb|lang=fr|Caption.]].

TranslateSvg currently supports about 76% of all translatable SVG files; once the basic import structure is complete (i.e. sometime in the next week or so), I’ll then have time to start pushing that up towards 99%.

GSoC update

Now that my exams are finally over, I can turn my attention back towards my Google Summer of Code project, TranslateSvg.

The first step is to fix up phase 1, taking on board the feedback I have received (including via code review). In particular, Niklas’ insightful comments about the relationship between Translate and TranslateSvg have prompted me to siphon off the new code into a separate extension – albeit one dependent on the presence of Translate.

After that, it’ll be on to phases 2 and 3 – import and export of SVG files.

Berlin Hackathon

After a largely unproblematic journey yesterday, I made it to Berlin for this year’s Berlin “Hackathon”, a large (>120 people) meetup for Wikimedia techies (and hackers working on related “open data” projects). The official programme started at 5pm local time today, and will continue into this evening, through Saturday and until 5pm on Sunday evening. If the pre-conference chats are anything to go by, it should be a great few days. So many new people to meet though!

In general, the hackathon is more unconference than conference, with plenty of time devoted to chatting and coding, rather than attending set speeches and talks. There are a few slots set aside for workshops, however, on a wide range of topics from Lua to Gerrit; I hope to attend several of these “interactive” sessions over the next couple of days. I’m also going to be rounding up some guinea pigs to test phase 1 of my Google Summer of Code project (SVG translation) and fixing any bugs that emerge. After the hackathon ends on Sunday, there is also going to be a sightseeing tour – an excellent opportunity for seeing something of the city itself.

GSoC – Week 3/4

Progress was slightly slower this week, but phase 1 of the project is still well on target for a Berlin demo:

  • Get Translate to work on an already established message group, loading properties from wiki pages
  • Get Translate to work on an already established message group, saving properties back to wiki pages
  • Implement static thumbnail for Special:Translate
  • Suppress documentation for SVG images
  • Implement static thumbnail for individual translation page
  • Steal file description from file description page
  • Create message files in .i18n.php