DevHeads.net

Heads-Up: Beware of xmlCleanupParser() when your package links against libxml2

Heya!

if you have code that links against libxml2 and calls
xmlCleanupParser() please verify that your project does that at the
appropriate place (i.e. immediately before exiting, and only once).

That function might delete TLS fields that belong to other libraries
(such as PA's client libs) if called twice. To the effect that PA
might look bad. And I don't like that.

So, please: if you have a library linking against libxml2, verify that
xmlCleanupParser() is called at the appropriate places and only once.

For now at least Empathy and Abiword are calling this function where
they shouldn't, but a google code search reveals that at least
inkscape and dia do it too. I stopped looking after that,
extrapolating that this means "everyone is doing this worng". Hence
this mail.

For more information see:

<a href="https://bugzilla.redhat.com/show_bug.cgi?id=554903" title="https://bugzilla.redhat.com/show_bug.cgi?id=554903">https://bugzilla.redhat.com/show_bug.cgi?id=554903</a>
<a href="http://git.gnome.org/browse/libxml2/tree/parser.c#n14002" title="http://git.gnome.org/browse/libxml2/tree/parser.c#n14002">http://git.gnome.org/browse/libxml2/tree/parser.c#n14002</a>

I also asked that libxml2 warns directly about this misuse:

<a href="https://bugzilla.redhat.com/show_bug.cgi?id=554903" title="https://bugzilla.redhat.com/show_bug.cgi?id=554903">https://bugzilla.redhat.com/show_bug.cgi?id=554903</a>

Thanks to Michal Schmidt for tracking this down.

Lennart

Comments

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Lennart Poettering at 01/12/2010 - 21:54

The Debian folks have their entire source tree indexed and have this
search engine for symbols:

<a href="http://source.debian.net/source/search?n=25&amp;start=0&amp;q=xmlCleanupParser" title="http://source.debian.net/source/search?n=25&amp;start=0&amp;q=xmlCleanupParser">http://source.debian.net/source/search?n=25&amp;start=0&amp;q=xmlCleanupParser</a>

I figure our package set is not that different from theirs, so you
might want to have a look on the list of packages that uses that
function call. Already on the first page of that output there's
fwbuilder, a PAM module and more misusing that function.

Oh, and we definitely need a source code indexer for fedora too!

Lennart

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Tom Lane at 01/12/2010 - 20:07

Lennart Poettering < ... at 0pointer dot de> writes:

Why exactly is this a misuse, and not libxml2's bug to fix? There's
certainly nothing in their documentation suggesting that there's
such a requirement.

regards, tom lane

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Ian Kent at 01/13/2010 - 02:29

libxml2 also doesn't use pthreads Thread Specific Keys correctly either
which can lead to segv if the library is unloaded prior to the threaded
application exiting.

Ian

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Lennart Poettering at 01/12/2010 - 21:03

<a href="http://xmlsoft.org/html/libxml-parser.html#xmlCleanupParser" title="http://xmlsoft.org/html/libxml-parser.html#xmlCleanupParser">http://xmlsoft.org/html/libxml-parser.html#xmlCleanupParser</a>

Lennart

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By =?UTF-8?B?SGHDr... at 01/12/2010 - 20:19

Le 13/01/2010 02:07, Tom Lane a écrit :

Afaik it does, i agree with Lennart that a warning from libxml2 would be
welcome.

=> extracted from xmlCleanupParser() documentation
WARNING: if your application is multithreaded or has plugin support
calling this may crash the application if another thread or a plugin is
still using libxml2. It's sometimes very hard to guess if libxml2 is in
use in the application, some libraries or plugins may use it without
notice. In case of doubt abstain from calling this function or do it
just before calling exit() to avoid leak reports from valgrind

H.

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Daniel Veillard at 01/13/2010 - 10:00

there is definitely,
<a href="http://xmlsoft.org/html/libxml-parser.html#xmlCleanupParser" title="http://xmlsoft.org/html/libxml-parser.html#xmlCleanupParser">http://xmlsoft.org/html/libxml-parser.html#xmlCleanupParser</a>

The problem is that you can perfectly have application not relying on
libxml2 outside of their own code use libxml2 at different phases,
for example when parsing input, and when generating result, not using
the library in the meantime and calling xmlCleanupParser() twice (or
more) in a perfectly legal way.

And there is nothing more frustrating than a library outputting
a warning to some uncontrolled channel, because it thinks it's
appropriate to do so, but making the user mad because the programmer
might have made a mistake (or not). Latest fun example being

libnuma: Warning: /sys not mounted or no numa system. Assuming one
node: No such file or directory

for anybody running an application using libnuma on a non NUMA system.
Annoying the user is not a substitute for warning the programmer.

I have tried to clearly explain the problem in the documentation, raised
the point many time on the <a href="mailto: ... at gnome dot org"> ... at gnome dot org</a> mailing-list, but maybe
something more drastic is needed. I could turn xmlCleanupParser()
into an empty entry point and create a new function where the name would
make clear that this is not to be used without being careful, but
that decision would have to be discussed on the above list, not here.

Daniel

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Michael Cronenworth at 01/13/2010 - 16:57

Can the function call be renamed?

xmlCleanupLibrary() ?

This might lower the chance of it being implemented multiple times.

Heck, allocate 1 bit-field for a check on the function call to determine
if it has already been run or not...

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Daniel Veillard at 01/18/2010 - 08:50

yes, my current thinking is that's the best which can be done in the
current situration, and making xmlCleanupParser() an empty function. But
as said this is not where I will discuss this, Fedora accounts to a
small fraction on the overall libxml2 user base.

Daniel

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Lennart Poettering at 01/13/2010 - 15:26

Hmm, given that probably the majority of the libxml2 users are
misusing it, and misusing is an actual bug that caused program aborts
due to TLS vars being released that shouldn't i wonder if printing a
msg if this might happen is that bad. I mean if the options are "be
unstable but be quiet" or "be unstable and print a msg", then I'd vote
for the second choice...

There's something else that came to my mind: if libxml2 is loaded into
memory indirectly because some dlopen'ed module wanted it, and then
used, and then unloaded again because the module got dlcose'd again,
won't you leak TLS vars unless the xmlCleanupParser() function was
called properly before? In that case, not calling xmlCleanupParser()
is an error, right? And calling it, too, since some other
plugin/thread might still need it. Which means you are in a dilemma:
in either case you are doing it wrong.

(-z nodelete linking for libxml2 might be the solution for this prob,
but I guess the fact is simply that the existance of
xmlCleanupParser() in itself is probably not a good idea)

If you don't want to print a msg during runtime, maybe it could be an
option to use link time warnings? that way developers should see the
warning, but during runtime it will not be visible:

#if defined(__GNUC__) && defined(__ELF__)
#define WARN_REFERENCE(sym, msg) \
__asm__(".section .gnu.warning." #sym); \
__asm__(".asciz \"" msg "\""); \
__asm__(".previous")
#else
#define WARN_REFERENCE(sym, msg)
#endif

And then simply use something like this:

WARN_REFERENCE(xmlCleanupParser, "You are probably misusing xmlCleanupParser().");

Lennart

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Tom Lane at 01/13/2010 - 16:07

Lennart Poettering < ... at 0pointer dot de> writes:

Got it in one. libxml's API in this area is unusably broken, and needs
a ground-up redesign not random messages telling users that they didn't
use it right --- there *is* no right way to use it.

My experience with this comes from the Postgres project, which wasted
many moons trying to use xmlMemSetup() ... basically, you can't replace
libxml's memory management, you just have to live with whatever it
chooses to leak.

regards, tom lane

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Daniel Veillard at 01/18/2010 - 08:41

First time I see something coming from you or Postgres group
complaining about a leak or about the use of xmlMemSetup()

Daniel

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Lennart Poettering at 01/13/2010 - 16:39

Dunno. I wouldn't be so harsh. If libxml2 is linked with -z nodelete
(maybe it already is? I never checked...) this dilemma *does* go
away. And if then xmlCleanupParser() would be renamed to something
like xmlCallMeOnlyIfYouValgrindMeOtherWiseYouSuck() and then make the
symbol xmlCleanupParser() a NOP plus a gcc linker warning things are
more than fixed in my eyes.

(Oh, and the issue above would leak memory too, not just TLS, but TLS
is usually more of a scarce resource)

Oh, and let's note that other libraries have exactly the same
probs. Let's pick dbus for example which many might consider a
benchmark in many ways. There is dbus_shutdown() which does about the
same thing as xmlCleanupParser(). And libdbus isn't linked with -z
nodelete. A module that pulls it in has hence exactly the same
problems. For example, let's say there was a module by the name of
/lib64/security/pam_ck_connector.so which links against libdbus.so it
will leak memory each time it is invoked by a process that not by
itself links against libdbus, which we'll call "/bin/login" for
now. And there you have it: the same dilemma: either the module calls
dbus_shutdown and hence breaks every other dbus-using PAM module that
might be loaded, or it might not call it in which case it will leak
memory. Same dilemma.

It's a common problem. In fact, libpulse had a similar issue until i
added -z nodelete to its linker line.

Lennart

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Simo Sorce at 01/13/2010 - 16:54

On Wed, 13 Jan 2010 22:39:43 +0100

The dilemma is in broken libraries that use global variables instead of
explicitly initialized memory contexts so that you can have multiple
completely independent instances and also happen to help make them
thread safe.

Simo.

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Daniel Veillard at 01/18/2010 - 08:44

I provided a bunch of new parsing API in libxml2 precisely to avoid
the problem of global variables, but well unless I make a non-compatible
API and hence libxml3 (no target date, probably never) this problem
will stay around.

Daniel

Re: Heads-Up: Beware of xmlCleanupParser() when your package lin

By Lennart Poettering at 01/13/2010 - 17:31

If things really were that simple. Unfortunately they aren't.

Lennart