|Subject:||Re: Planning a future L10N infrastructure (including Fedora)|
|From:||Asgeir Frimannsson (asge...@redhat.com)|
|Date:||Sep 21, 2008 6:15:57 pm|
Thanks for your comments.
----- "Dimitris Glezos" <dimi...@glezos.com> wrote:
2008/9/17 Asgeir Frimannsson <asge...@redhat.com>:
On Tuesday 16 September 2008 23:29:32 Mike McGrath wrote:
Please correct me if I'm reading this wrong but I see "transifex is great or close to it" and "here's how we're going to build our own solution anyway" ?
Yes, "Transifex is great and will continue to serve us".
If you look at the state of the art in L10N outside the typical Linux projects where PO and Gettext rule, you'll notice we are very short on areas like:
- Translation Reuse
- Terminology Management
- Translation Workflow and Project Management
- Integration with CMSs
- Richer Translation Tools
This is an effort in narrowing that gap, and I can't see that effort work by evolving an existing tool from this 'cultural background'. Yes, we can get some of the way by developing custom solutions for e.g. linking wikis to Transifex for CMS integration, or using e.g. Pootle for web-based translation. But we would still be limited to the core architecture of the intent of the original developers, which is something that would radically slow the project down.
For the record, I believe these are some fine ideas, which I would like to see added to Transifex as features (eg. through plugins). I have been discussing most of them with people around conferences for the past year. An example: Tx already downloaded all the translation files from upstream projects, so if someone requests a translation file, why not be able to pre-populate it using existing translations from all the other projects (translation reuse)?
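As a rough illustration of the translation-reuse idea above, here is a minimal sketch. It is not how Transifex works: it assumes each project's translations are already loaded into a plain dict mapping source strings to translated strings, and the names `build_memory` and `prepopulate` are hypothetical.

```python
from collections import Counter, defaultdict


def build_memory(catalogs):
    """Map each source string to a Counter of translations seen across projects.

    `catalogs` is assumed to be an iterable of dicts (source -> translation),
    one per project, for a single target language.
    """
    memory = defaultdict(Counter)
    for catalog in catalogs:
        for source, target in catalog.items():
            if target:  # skip untranslated entries
                memory[source][target] += 1
    return memory


def prepopulate(requested, memory):
    """Fill empty entries with the most common existing translation, if any.

    Entries that already have a translation are left untouched.
    """
    filled = {}
    for source, target in requested.items():
        if not target and source in memory:
            target = memory[source].most_common(1)[0][0]
        filled[source] = target
    return filled
```

A real implementation would presumably mark reused strings as fuzzy so a translator reviews them, and would need to handle context and plural forms, which this sketch ignores.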
Also, I should mention that Transifex isn't (and will never be) specific to a particular translation file format (eg. PO) or any translation repository. I'd like to support translation of both PO and XLIFF files. And also support not only VCSs, but CMSs, wiki pages and even arbitrary chunks of text. Transifex's goal is to be a platform to help you manage your translations.
For the record (since XLIFF is mentioned and since I'm part of the Oasis XLIFF
Technical Committee), I am not aiming to design anything around XLIFF in this
project, other than perhaps supporting XLIFF as an import/export format for
resources in the same way as we support PO (we do have the odd XLIFF file coming
through for translation). I don't think XLIFF (1.2) is mature enough yet as an
L10N resource format.
I know there are some big ideas in transifex. In fact, when transifex is
mentioned, often people refer to the *goal/idea* of transifex, rather than the
actual current implementation. Take plugins, for example: transifex doesn't
currently have a plugin system, nor does it have workflow, project management, or any
concept of translation resources internally. Transifex today is a simple 'file
submission system' with a growing community aiming to build it into something
more. With this in mind, 'building on top of transifex' really means redefining
what transifex really is. For example, 'file submission' should really be a
plugin, not a core feature. That means all of transifex today (excluding maybe
the login UI), should really be plugins to a core model of projects, people,
etc, that currently doesn't exist.
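To make the 'core model plus plugins' idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Project` and `Core` classes and the `file_submission` function are hypothetical and do not correspond to any existing Transifex code.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical core entity: a project and the languages it has teams for."""
    name: str
    languages: list = field(default_factory=list)


class Core:
    """Hypothetical core: knows only about projects and registered plugins."""

    def __init__(self):
        self.projects = {}
        self.plugins = {}

    def add_project(self, project):
        self.projects[project.name] = project

    def register(self, name, plugin):
        # A plugin is any callable that takes a Project plus keyword arguments.
        self.plugins[name] = plugin

    def run(self, plugin_name, project_name, **kwargs):
        plugin = self.plugins[plugin_name]
        return plugin(self.projects[project_name], **kwargs)


def file_submission(project, filename, language):
    """File submission as one plugin among others, not a core feature."""
    if language not in project.languages:
        raise ValueError(f"{project.name} has no {language} team")
    return f"queued {filename} for {project.name} [{language}]"
```

Usage would look something like `core.register("submit", file_submission)` followed by `core.run("submit", "anaconda", filename="is.po", language="is")`; workflow or project management could then be registered the same way without touching the core.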
Defining this 'model' of a repository doesn't really depend much on the
implementation, and in fact many implementations might help push this faster and
ensure a better solution (if it was on the tx roadmap in the first place). And
it's not like it is impossible for e.g. a Java-based repository to communicate
with Transifex for file submissions; isn't that exactly what the
remote-interface of TX (on the roadmap) is supposed to provide? What I'm hearing
is "Don't build something new, continue building on the python/tg/transifex
architecture", which is fully understandable. However, considering the cost of
developing this on top of tx (re-architecture, convincing all that it is the
right path to go, immaturity/stability of libraries for e.g. ajax, limited
workflow support), I honestly think it's better to have two projects that
'complement' each other. There are more than enough tasks for everyone in the
existing Tx roadmap, and the idea is bigger than what a combined development team could accomplish. Diversifying and pulling in good people from e.g. the
java-side of things might even help speed things up.
Correct me if I'm wrong though, instead of forking or adapting or working with upstream, you are talking about doing your own thing right?
We have a goal of where we want to see L10N infrastructure go, to enable us in the future to provide internal (translators paid by Red Hat) and community translators with tools to increase their productivity as well as better tools to manage the overall L10N process. If there is an 'upstream' that provides this, or a platform on which we could develop this, then yes, we would consider 'working with upstream' or (in a worst-case scenario) forking upstream.
The Translate Toolkit folks are a very friendly bunch, actively maintaining and extending the rich library, and always open to suggestions. Maybe some (if not all) of the features could be done in TT, and the rest that might not fit there, as Python libraries to maximize interoperability and community involvement.
Yes, I know TT very well, and have discussed the library with Dwayne Bailey (the
main visionary behind the project) in the past, even before tx was born. In
fact, a django-migration of Pootle (built on top of the TT) has been on the
agenda for a while, and combining forces with TT is one of the other options I
have been strongly considering for a repository (TT e.g. has a file submission
library, and there is a lot of duplication between tt and tx). Looking at the
svn activity of TT (in my rss reader), it is definitely a project with a
I also think that Transifex could serve as the "UI" for a lot of translation-specific tasks. If there's a library that does X, that would help people manage their translations or leverage Transifex's strong points of "I read a lot of repositories" and "I write to some repositories", then we could provide a web wrapper around it (eg. search for string "X" in all translation files of language "Y", or "mark <this> file as a downstream of <that> and send me an msgmerged file whenever <that> changes").
So to answer your question bluntly, YES - after 4 years involvement in industry and community L10N processes - I believe we can do better. But holding that thought, remember that this is in many ways 'middleware', and making use of e.g. the vast amount of knowledge invested in Translate Toolkit (file format conversions, build tools, QA) makes sense, and I'm not saying 'forget about all that we have invested in tools so far'.
It might be my poor English or the fact that I usually read long mails at night, but despite the lengthy descriptions I still don't have a clear picture of exactly what problem you'd like to solve, and the reasoning behind the decisions being made.
I do understand there is a 'semantic gap' here, and that we do need to provide a
better description and demonstration of why a new project is necessary. I do
believe everything is theoretically possible to build on top of python/tg and
through reuse of concepts in e.g. tx and TT, but I honestly believe if we are
going to manage and drive the development effort in this, it is more worthwhile
to expand beyond the fedora/python community, and use tools that the core
developers would be more comfortable and productive with. This is not a 'we
think you guys should develop this' request, we are taking ownership of the
project, as well as inviting anyone that is interested in the community to
participate and take ownership.
Don't get me wrong -- I think there are some good ideas. But I feel it would be too bad if you guys didn't build on top of existing tools (TT for file formats, Transifex for file operations and UI, OmegaT for translation memory) or just isolate specific solutions that don't fit into other projects in well-defined libraries (do one thing, do it right). Sure, it takes a lot more effort to work *with* other people, but it is usually worth it. :-)
This is *not* about an effort to avoid working with people. It is an effort to
get more people working on this. I know more people in the Java community who
are or might be interested in an open source solution for these problems than in
the Python/Fedora/TG community. And of course adding to this a portion of my
natural bias towards Java, and the fact that the people that would be working on
this would initially be much more productive in Java than in Python (TG2 or
With the fact that we throw this idea out to the fedora/tx community early,
please take that as a sign that we are trying to work with the community, rather
than simply developing something on our own. And I for one will continue being
involved with Tx to some degree, and help out where I can. L10N is an area with
a lot of space for improvement, and an area that has sadly been to some extent
'neglected' except for Dimitris' recent work. We still have a long way to go
before we have what I would call an L10N infrastructure that serves translators