Replacing glibc langpacks

I'm investigating whether it makes sense to switch to a scheme where the
glibc locale data is built from source, during package installation,
based on the langpack configuration system. This is similar to what
Debian does.

The reason is that the compressed locale source code (without the
charmaps, which are not strictly needed once we patch localedef),
together with the locales that people actually use, is smaller than a
full langpack package. For example, glibc-langpack-en on Fedora 29 is
6.7 MiB when
installed, but en_US.utf8 is 2.9 MiB, and the locale sources are
3.4 MiB, so even the common case realizes a small saving.
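
As a quick check of these numbers, the sketch below redoes the
arithmetic. The names are mine, and it assumes the alternative install
consists of the compressed locale sources plus the one compiled
en_US.utf8 locale; under that assumption the saving comes to about
0.4 MiB.

```python
# Sizes in MiB, as quoted above for Fedora 29.
LANGPACK_EN_INSTALLED = 6.7   # glibc-langpack-en, installed
EN_US_UTF8_COMPILED = 2.9     # en_US.utf8 alone
LOCALE_SOURCES = 3.4          # compressed locale sources, no charmaps


def saving_mib(langpack, compiled, sources):
    """Return how much smaller (sources + one compiled locale) is than the langpack."""
    return round(langpack - (compiled + sources), 1)


print(saving_mib(LANGPACK_EN_INSTALLED, EN_US_UTF8_COMPILED, LOCALE_SOURCES))
```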

For the installer, the savings might be much larger. If we can teach
anaconda to generate the appropriate locale only after the user has
selected the language, then we no longer need the full locale archive in
the installation image (and in RAM).



Re: Replacing glibc langpacks

By Hans de Goede at 05/27/2019 - 06:40


On 27-05-19 11:34, Florian Weimer wrote:
Interesting idea. My first thought is that doing this at installation
time feels wrong. How are you going to figure out which languages to
generate the locale data for? The language can differ per user; e.g. on
my system the system language is nl_NL, for testing purposes, but I
greatly prefer to have my apps in English, so for the hans user it is
en_US.

Even if you check the language setting for all users at install time,
it may change later at a per-user level, and new users may be added
after install time.

Thinking out loud here, if we go this route I think the data should be
under say /var/cache/locale and be generated on demand. E.g.
/var/cache/locale could be owned by a locale user/group and the binary
to generate these files could be suid or sgid locale; then glibc could
start this helper on demand if necessary. This would also remove the
need to add some support / hack to anaconda for this.
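
To make the on-demand idea a bit more concrete, here is a rough sketch
of what such a helper might do, written in Python only for brevity. The
function names and the cache layout (one directory per locale under
/var/cache/locale) are assumptions of mine, not an existing glibc
interface. It calls stock localedef with -i and -f, so it still needs
the charmaps, and it ignores locking between concurrent callers.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical cache location, taken from the suggestion above.
CACHE_DIR = Path("/var/cache/locale")

# Accept only plain names such as "en_US.UTF-8". A privileged helper
# must not pass arbitrary user input on to localedef or the filesystem.
_LOCALE_RE = re.compile(r"([a-z]{2,3}_[A-Z]{2})\.([A-Za-z0-9-]+)")


def localedef_command(locale_name, cache_dir=CACHE_DIR):
    """Build the localedef invocation that compiles locale_name into cache_dir."""
    match = _LOCALE_RE.fullmatch(locale_name)
    if match is None:
        raise ValueError(f"unsupported locale name: {locale_name!r}")
    source, charmap = match.groups()
    # An output path containing a slash makes localedef write a plain
    # directory instead of adding the locale to the locale archive.
    return ["localedef", "-i", source, "-f", charmap,
            str(Path(cache_dir) / locale_name)]


def ensure_locale(locale_name, cache_dir=CACHE_DIR, run=subprocess.run):
    """Compile locale_name on first use; return the directory holding it."""
    target = Path(cache_dir) / locale_name
    if not target.is_dir():
        run(localedef_command(locale_name, cache_dir), check=True)
    return target
```

The strict name check is there because the helper would run with
elevated privileges; locale names with modifiers (such as @euro) would
need extra handling that is left out here.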




An alternative to a suid/sgid helper would be a dbus activated service,
with an idle timeout to make it stop after it has been unused for a while.