Finalized proposal for changes to i18n in KF5

I have written up the headers and Doxygen pages for the Ki18n framework as
they should finally look, available here:

http://nedohodnik.net/misc/ki18n-kf5-01/html/index.html
http://nedohodnik.net/misc/ki18n-kf5-01/klocalizedstring.h
http://nedohodnik.net/misc/ki18n-kf5-01/kuitmarkup.h

As I have proposed earlier on k-c-d, this introduces two major changes.

The first change is that every i18n call looks up translations in strictly
one catalog, which is even determined statically in library code (i.e. at
compile time). This will stop undefined results when several catalogs in use
contain the same message. The section "Connecting Calls to Catalogs" in the
documentation above shows how to make the link.
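
As a rough illustration only, the compile-time link could work along these lines. This is a sketch under assumptions, not the actual header: lookupInCatalog and i18nSketch are made-up names, and the real lookup would go through Gettext rather than string concatenation.

```cpp
#include <cassert>
#include <string>

// In a real project this define would sit in one private header of the
// library, directly before the include of the i18n header.
#define TRANSLATION_CATALOG "foolib"

// ---- stand-in for the i18n header (assumption, not the real file) ----
#ifndef TRANSLATION_CATALOG
#error "TRANSLATION_CATALOG must be defined before including the i18n header"
#endif

// Hypothetical lookup: a real implementation would query the Gettext
// catalog; here we only record which catalog would be searched.
inline std::string lookupInCatalog(const char *catalog, const char *msgid)
{
    assert(catalog != nullptr && msgid != nullptr);
    return std::string("[") + catalog + "] " + msgid;
}

// Because this function is inline in the header, the macro is expanded while
// the including library is compiled, so the catalog is fixed at compile time.
inline std::string i18nSketch(const char *msgid)
{
    return lookupInCatalog(TRANSLATION_CATALOG, msgid);
}
```

With the define above, i18nSketch("Open File") would consult only the "foolib" catalog, whatever other catalogs the running application has loaded.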

The second change is that KUIT markup is gone -- if not quite. It is totally
gone from i18n* calls; one can always write an i18n* call and be sure that
there will be no markup processing. But I just couldn't plainly kill it (a
few people were not happy with that either), so I added separate markup-
aware xi18n* calls. Markup processing is now stricter, but allows
customization (new tags, etc). The section "Semantic Markup" describes this.

Comments

Re: Finalized proposal for changes to i18n in KF5

By Dominik Haumann at 12/27/2012 - 14:17

On Saturday, December 22, 2012 10:44:22 AM Chusslove Illich wrote:
Following e.g. QUrl, QSql, QXml, ..., shouldn't it be KUitSetup instead of
KUITSetup etc?

And the API documentation should at least state once what KUIT stands for.
K User Interface Translator? Just a guess, and probably wrong ;)

Greetings,
Dominik

Re: Finalized proposal for changes to i18n in KF5

By David Faure at 12/26/2012 - 13:10

On Saturday 22 December 2012 10:44:22 Chusslove Illich wrote:
Nice writeup, very complete!

#define TRANSLATION_CATALOG "foolib"
#include <klocalizedstring.h>

The goal is very sound (looking up messages in only one .po file, that must be
good for performance, too). The #define will kill any "enable final" compilation
support, but then again I think this is gone anyway.

Is this portable to all compilers supported by KDE?

BTW, something else I spotted:
KLocalizedString subs (QChar a, int fieldWidth = 0,
const QChar &fillChar = QLatin1Char(' ')) const;
See how one QChar is passed by value, and the other by const ref?
QChar is just a ushort, so it should be passed by value everywhere, as per
http://www.macieira.org/blog/2012/02/the-value-of-passing-by-value/
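
A minimal sketch of the consistent form, assuming a stand-in type Char16 in place of the real QChar so that it compiles without Qt. The function padded() is hypothetical and only mimics the shape of the signature quoted above; it is not KLocalizedString::subs.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Stand-in for a QChar-like type: a thin wrapper around one 16-bit code unit.
struct Char16 {
    std::uint16_t unicode;
};

// Small and trivially copyable, so copying it is at least as cheap as
// passing a reference (which is a pointer under the hood).
static_assert(sizeof(Char16) <= sizeof(void *), "Char16 should be register-sized");
static_assert(std::is_trivially_copyable<Char16>::value, "Char16 should be trivially copyable");

// Both character parameters are taken by value, for consistency.
inline std::u16string padded(Char16 a, int fieldWidth = 0, Char16 fillChar = Char16{0x20})
{
    assert(fieldWidth >= 0);
    std::u16string out;
    if (fieldWidth > 1) {
        // Right-align the character by filling the remaining width.
        out.assign(static_cast<std::size_t>(fieldWidth - 1),
                   static_cast<char16_t>(fillChar.unicode));
    }
    out.push_back(static_cast<char16_t>(a.unicode));
    return out;
}
```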

Re: Finalized proposal for changes to i18n in KF5

By Chusslove Illich at 12/27/2012 - 07:45

On IRC Albert noted that I didn't quite cover tr2i18n use for .ui files,
which also holds for .kcfg and .rc. For these I should document how to set
the catalog and markup-awareness. For .ui this can be non-invasive (optional
arguments to CMake *_ADD_UI_FILES macro) because it is strictly compiled.
But for .kcfg and .rc I think two new optional attributes to the top tag are
needed.

I expect that every group of source files that could be treated with enable-
final, like one library, will have only one translation catalog. Then this
#define-#include construct would be put in exactly one private header, which
itself would have normal include guards, so enable-final would still work if
it worked otherwise. Or am I missing something here?

Ok :)

Re: Finalized proposal for changes to i18n in KF5

By Oswald Buddenhagen at 01/04/2013 - 17:02

On Sat, Dec 22, 2012 at 10:44:22AM +0100, Chusslove Illich wrote:
of course, it would be even better if you strived for submission to
qt-project, if at all realistic (for now probably an add-on, but
definitely under cla). otherwise you'll see the same effect every other
useful lgpl'd qt framework sees sooner or later: it gets re-implemented
(if the effort is deemed acceptable by an interested party).

On Thu, Dec 27, 2012 at 12:45:37PM +0100, Chusslove Illich wrote:

in http://nedohodnik.net/misc/ki18n-kf5-01/html/prg_guide.html "escaping",
xi18n("Installed Fooapp too old, need release &lt;= 2.1.8.");
that needs to be &gt;. :D
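
to make the escaping rule concrete, here is a small sketch of which plain characters map to which entities. escapeForMarkup is a hypothetical helper written for this illustration, not something from the ki18n headers, and it assumes that only &, < and > need escaping in message text.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: turn plain text into text that is safe to embed in an
// XML-like markup string, by replacing the three special characters.
inline std::string escapeForMarkup(const std::string &text)
{
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
        case '&': out += "&amp;"; break;  // must be escaped, it starts an entity
        case '<': out += "&lt;";  break;  // less-than, would open a tag
        case '>': out += "&gt;";  break;  // greater-than
        default:  out += c;       break;
        }
    }
    return out;
}
```

so the intended "need release >= 2.1.8." comes out as "need release &gt;= 2.1.8.".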

in "Phrase Tags", <bcode>, the example makes no sense, as there are no
linebreaks in the actual string.

the "<nl>" example is unfortunate, because it uses a sentence which
should end in a question mark ... but doesn't. "delete the following
files?<nl/>" would seem better to me.

i find the 5.0-superscripts in the tables way too noisy. alternative ideas:
- use a separate column
- just default to 5.0, and thus defer the problem (possibly
indefinitely)
strictly speaking, the 5.0 is a lie anyway. ^^
in qt, we actually left in the 4.x version markers, to play in line with
the "mostly source-compatible" theme.

and as usual for native-only-in-slavic speakers, some "the"s are
missing. i was too lame to record their locations. ^^

your headers consistently have a space after function names. the kdelibs
coding style says something different ...

the "extern"s in front of the ki18n* declarations seem pointless and
untypical to me.

random ramblings:

i don't like the recommendation for extracted vs. disambiguating
comments (and closed-source authors will typically do the exact opposite
anyway). wouldn't it be sufficient for disambiguation to strongly
recommend consistent use of user interface markers, and thus allow
all comments to be extracted? the matter of flagging changes is merely
tooling-related.

one thing i noticed while looking through catalogs is that it often
would be useful to be able to declare some kind of hierarchical
comments, so that a particular comment could apply to a whole group of
strings, without needing to replicate it, or relying on the translators'
ability to see the pattern themselves (which is a pipe dream, especially
if only some strings in an existing group changed). i suspect that this
may turn out "a bit" hard to implement without hacking gettext (and the
.po format) ...

regards

Re: Finalized proposal for changes to i18n in KF5

By Chusslove Illich at 01/05/2013 - 13:38

I'm not opposed to some additional bureaucracy in order to make the
framework more accessible to potential users. But I'd have to see what it
actually means, and what could be the tradeoff.

As for another party reimplementing the framework, I don't see what factor
that is. (Hypothetically speaking, though, I don't see it happening: if
someone wants Gettext-based translation in Qt code but not through Ki18n, I
expect he will, well, use Gettext directly.)

I looked, but couldn't figure out how to use it in this context.

Could someone parse it for us and make a set-intersection with supported
compilers for KF5? :)

Great!

I'd definitely like to have the version markers visible for all elements.
For example, so that there is no uncertainty whether a marker was forgotten
or not.

I struggled with how to present it, and in particular thought that a
separate column is overkill. Maybe have the superscript yet smaller and a
bit dimmer?

What the initial version should be, I'll wait for someone else to decide.

I've given up and put a pox on them.

(I did toy with another idea though: compute the statistical average of
the's-per-word for a given class of texts, and then pepper my text
proportionally.)

I'm putting a space after the function name when the function is declared or
defined, as opposed to being called. Grudgingly, I'll get rid of them.

Actually I've no idea what these exports are/were for. They were added in
e51d7bfb with note "fix build failure for MSVC++'2005", and I didn't feel
compelled to inquire.

Yes, but tooling decisions are related to PO convention and workflow.
There'd be an awful lot of tooling to modify, and modify by adding options and
not changing the default behavior. There would also be no practical purpose
to having both types of contexts, unless there was a significant difference
between them.

And in KDE code this recommendation is actually the tradition, even if only
because it wasn't given much second thought once contexts became
available...

The nicety of not having to manually replicate comments and contexts in
hierarchical situations would have to be weighed against introducing yet more
i18n-related syntax to source files. This also means that drive-by i18n
fixers would have to pay more attention, and that code i18n checking tools
would have to be smarter. The usual story of simplicity and robustness vs.
capability and efficiency.

I don't think anything in PO files should change in this case; simply make a
proper split of information between #. and msgctxt. It is the extraction
tool, xgettext, that would need changes.

Re: Finalized proposal for changes to i18n in KF5

By Oswald Buddenhagen at 01/07/2013 - 19:01

On Sat, Jan 05, 2013 at 06:38:58PM +0100, Chusslove Illich wrote:
i'm not sure this actually works ...

Re: Finalized proposal for changes to i18n in KF5

By Chusslove Illich at 01/08/2013 - 10:05

One step back: who exactly would find KDE Frameworks licensing terms non-
workable? I can't say I care what a party for which none of the options at
http://techbase.kde.org/Policies/Licensing_Policy works will do. By "more
accessible" I meant in technical and organizational terms.

Doesn't...

I don't see that advantage. In fact, PO comments unfortunately have no
multi-line semantics so the editor must present them as is, while it can
rewrap PO contexts according to user's set width.

Right. But doing something more clever on this side introduces the same
tradeoff as for programmers, only worse due to much more diverse tooling on
translator side.

The current handling of this situation is to put the long comment/context on
one message, and a short comment/context pointing to it on other messages.
Pointing is of course ad-hoc, but one could easily adopt an intuitive
convention here.

I really insist on keeping the translation file format syntax-low and
convention-rich, i.e. highly human-readable and forgiving on tools. This
is because the problem domain looks simple, so other than the basic need for
id-value pairs, two people will propose three sets of additional features.
This will result in a mountain of syntax of not always clear semantics,
which humans will find hard to look at, and tools hard to fully conform to,
denying the intended advantages.

Re: Finalized proposal for changes to i18n in KF5

By Oswald Buddenhagen at 01/09/2013 - 08:00

On Tue, Jan 08, 2013 at 03:05:25PM +0100, Chusslove Illich wrote:

Re: Finalized proposal for changes to i18n in KF5

By Kevin Krammer at 01/05/2013 - 13:19

The opposite thing as in only having comments and not caring at all about
ambiguity and that it makes the translations of their software suck?

If so, why should we care? We offer the options of doing it better and have
recommendations based on more than a decade of large scale i18n.

Is there any reason at all other than lazy programmers to even have i18n
functions that are not i18nc variants (i.e. require a comment)?

Cheers,
Kevin

Re: Finalized proposal for changes to i18n in KF5

By Allen Winter at 12/24/2012 - 10:36

On Saturday 22 December 2012 10:44:22 AM Chusslove Illich wrote:
I think you have quite an elegant solution here.
Plus, your documentation is really nice and easy to follow.

I am hoping you will adapt the Krazy i18ncheckargs accordingly.

Great work.
-Allen

Re: Finalized proposal for changes to i18n in KF5

By Chusslove Illich at 12/24/2012 - 14:38

Thank you. Given the general motivation behind Frameworks, my main intention
was for the documentation to be reasonably self-contained: without assuming that
additional articles on Techbase must be read, that the code will be in
the official KDE repository, that CMake is used as the build system, etc.

Yes. Most necessary changes are minor, except for one. When the programmer
defines a new tag, which will be possible with the new markup, the current tag
validity check based on a fixed tag set would raise a false alarm. So, when a
file is being checked and an unknown tag is encountered, I thought of
looking upwards in the directory tree for a file named, say, kuit-custom-tags,
which would list any custom tags. Thus one can cover the complete project
with one file. If you see any problem with this scheme, please say.
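
The upward search could be sketched roughly as follows. This is only an illustration in C++17, not how the Krazy check is or will be implemented; findUpwards and loadCustomTags are hypothetical names, and the one-tag-per-line file format is an assumption, since the format is not decided above.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <optional>
#include <set>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: starting at startDir, walk towards the filesystem
// root and return the first regular file called fileName, if there is one.
inline std::optional<fs::path> findUpwards(const fs::path &startDir, const std::string &fileName)
{
    fs::path dir = fs::absolute(startDir).lexically_normal();
    while (true) {
        const fs::path candidate = dir / fileName;
        std::error_code ec;  // non-throwing overload: unreadable dirs count as "not found"
        if (fs::is_regular_file(candidate, ec)) {
            return candidate;
        }
        const fs::path parent = dir.parent_path();
        if (parent == dir) {
            return std::nullopt;  // reached the root without a match
        }
        dir = parent;
    }
}

// Assumed format: one custom tag name per line, empty lines ignored.
inline std::set<std::string> loadCustomTags(const fs::path &tagsFile)
{
    std::set<std::string> tags;
    std::ifstream in(tagsFile);
    for (std::string line; std::getline(in, line);) {
        if (!line.empty()) {
            tags.insert(line);
        }
    }
    return tags;
}
```

A checker would call findUpwards(directoryOfCheckedFile, "kuit-custom-tags") on the first unknown tag and, if a file is found, accept any tag listed in it.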

(Come to think of it, I should also mention the availability of Krazy i18n
checks in the documentation.)