Closed
Bug 493541
Opened 16 years ago
Closed 16 years ago
jemalloc integration cause crashes when libraries or plugins dlopen with RTLD_DEEPBIND
Categories
(Core :: Memory Allocator, defect, P1)
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
status1.9.2 | --- | beta1-fixed |
status1.9.1 | --- | .3-fixed |
People
(Reporter: wolfiR, Assigned: karlt)
References
Details
(Keywords: crash, topcrash)
Attachments
(1 file, 2 obsolete files)
1.79 KB, patch | benjamin: review+ | samuel.sidler+old: approval1.9.1.3+
(This is related to bug 473428)
Apparently the jemalloc integration can confuse in-process library functions when a library does not use jemalloc itself but references malloc() and free() through the process's memory map. (Sorry, I'm not a low-level expert.)
Here is a bug report mentioning two examples:
https://bugzilla.novell.com/show_bug.cgi?id=503151
An explanation I found is:
https://bugzilla.novell.com/show_bug.cgi?id=477061#c11
which led me to file this bug report.
Reporter | ||
Updated•16 years ago
|
Severity: normal → critical
Reporter | ||
Updated•16 years ago
|
Flags: blocking1.9.1?
Reporter | ||
Comment 1•16 years ago
|
||
According to
https://bugzilla.novell.com/show_bug.cgi?id=503151#c5
this is not something that needs to be fixed on the Mozilla side.
Here's the explanation from the above comment:
(Note that NSS here is glibc's Name Service Switch, not Network Security Services.)
"
We currently do not support custom malloc() implementation in NSS due to our
patch to open NSS modules deep-bound (that is meant to protect the main process
from library namespace pollution by libraries the NSS module depends on - e.g.
Thunderbird depended on one kind of OpenLDAP library, while nss_ldap depended
on an entirely incompatible one). This causes the main process to use the
custom malloc(), but the NSS module to use the stock free().
"
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → WONTFIX
Updated•16 years ago
|
Flags: blocking1.9.1? → blocking1.9.1-
Assignee | ||
Comment 2•16 years ago
|
||
I'm reopening this, because as well as the name service switch module loading issues (which show up as bug 473428, https://bugzilla.novell.com/show_bug.cgi?id=503151 and https://bugs.gentoo.org/show_bug.cgi?id=252302), the same issue is affecting the Flash plugin (bug 469439).
Assignee: nobody → mozbugz
Blocks: 469439
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Summary: jemalloc integration can cause crashes in certain environments → jemalloc integration cause crashes when libraries or plugins dlopen with RTLD_DEEPBIND
Version: 1.9.1 Branch → Trunk
Assignee | ||
Comment 4•16 years ago
|
||
Excerpts from what Ulrich Drepper says about the RTLD_DEEPBIND flag he added:
("How To Write Shared Libraries", August 20, 2006,
http://people.redhat.com/drepper/dsohowto.pdf)
this feature should only be used if it cannot be avoided. There are several
reasons for this:
The change in the scope affects all symbols and all
the DSOs which are loaded. Some symbols might
have to be interposed by definitions in the global
scope which now will not happen.
Already loaded DSOs are not affected which could
cause unconsistent results depending on whether
the DSO is already loaded (it might be dynamically
loaded, so there is even a race condition).
...
The RTLD_DEEPBIND flag should really only be used as
a last resort. Fixing the application to not depend on the
flag's functionality is the much better solution.
The inconsistency that RTLD_DEEPBIND causes with jemalloc is that dynamic libraries opened with RTLD_DEEPBIND will use libc's malloc while libc is still using jemalloc. A libc function may return a pointer to something that should be passed to free, and the dynamic library will call libc's free, but libc used jemalloc to allocate the memory.
I raised a question on this behavior here:
http://sourceware.org/ml/libc-alpha/2009-06/msg00168.html
But it looks like we can make libc's free (and malloc, etc) use jemalloc:
http://www.gnu.org/s/libc/manual/html_node/Hooks-for-Malloc.html
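The mismatch and the proposed fix can be sketched with a toy model. Everything below is hypothetical: `toy_malloc`/`toy_free` stand in for the replacement allocator, `stock_free` stands in for the libc free() that a deep-bound library resolves to, and `stock_free_hook` is an illustrative variable, not glibc's actual hook. The real stock free() would corrupt its heap when handed a foreign pointer; the toy only reports that the pointer was not forwarded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in glibc or jemalloc. */

/* Header the hypothetical replacement allocator puts in front of each block. */
typedef union toy_header {
    size_t size;
    max_align_t align; /* keeps the user pointer suitably aligned */
} toy_header;

/* Number of blocks currently owned by the replacement allocator. */
int toy_live_blocks = 0;

void *toy_malloc(size_t size) {
    toy_header *h = malloc(sizeof *h + size);
    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    toy_live_blocks++;
    return h + 1; /* user memory starts after the header */
}

void toy_free(void *ptr) {
    if (ptr == NULL) {
        return;
    }
    assert(toy_live_blocks > 0);
    toy_live_blocks--;
    free((toy_header *)ptr - 1); /* step back to the real block start */
}

/* Hook consulted by the stand-in for the stock free(). */
void (*stock_free_hook)(void *ptr) = NULL;

/* Returns 1 if the call was forwarded to the hook, 0 if the pointer was
 * left alone (where a real stock free() would have misbehaved). */
int stock_free(void *ptr) {
    if (stock_free_hook != NULL) {
        stock_free_hook(ptr);
        return 1;
    }
    return 0;
}
```

With `stock_free_hook` left unset, a block from `toy_malloc` passed to `stock_free` is never released by its owner; once the hook points at `toy_free`, the same call reaches the allocator that created the block.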
Assignee | ||
Comment 5•16 years ago
|
||
I wonder whether we ever build against glibc and expect to run against a different glibc.
I'm hoping this will fix the bug, but I'm not able to test right now.
The jemalloc dependency in the build system is broken so
OBJ-DIR/browser/app/firefox-bin must be explicitly removed to pick up the changes.
Assignee | ||
Comment 6•16 years ago
|
||
Comment on attachment 386244 [details] [diff] [review]
hook jemalloc into glibc's malloc
This doesn't work, as glibc does not run __malloc_initialize_hook on free.
(The assumption is probably that glibc's malloc or similar would have been
called before free, but that's not happening here.)
Attachment #386244 -
Attachment is obsolete: true
Assignee | ||
Comment 7•16 years ago
|
||
We shouldn't need to use __malloc_initialize_hook because the hook functions will not call glibc malloc functions. This patch uses symbol interposing to set the 4 hooks.
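A toy model of why the timing matters, assuming (as comment 6 describes) that the initialize callback runs only from the malloc path. All names here are hypothetical, and a static initializer is used as a stand-in for a hook that already has its value at load time; this is not the patch itself.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*free_hook_fn)(void *ptr);

/* Hypothetical replacement free(); it only counts calls. */
int replacement_free_calls = 0;

void replacement_free(void *ptr) {
    assert(ptr != NULL);
    replacement_free_calls++;
}

/* Variant 1: the hook is installed by an initialize callback that the
 * toy "stock" allocator runs only when its malloc path is entered. */
free_hook_fn lazy_hook = NULL;
int lazy_init_done = 0;

void lazy_init(void) {
    lazy_hook = replacement_free;
    lazy_init_done = 1;
}

void lazy_stock_malloc_entry(void) {
    if (!lazy_init_done) {
        lazy_init();
    }
}

/* Returns 1 if forwarded to the replacement, 0 if the hook was unset. */
int lazy_stock_free(void *ptr) {
    if (lazy_hook == NULL) {
        return 0;
    }
    lazy_hook(ptr);
    return 1;
}

/* Variant 2: the hook has its value before any code runs. */
free_hook_fn static_hook = replacement_free;

int static_stock_free(void *ptr) {
    if (static_hook == NULL) {
        return 0;
    }
    static_hook(ptr);
    return 1;
}
```

In this model a free that arrives before any malloc is missed by variant 1 and forwarded by variant 2, which mirrors the reason given above for not relying on __malloc_initialize_hook.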
With this patch, the initial crash of bug 469439 is avoided, but I'm having trouble testing with my setup here. I get a different (slightly later) crash with this patch but I seem to get the same crash without jemalloc, so it may just be related to the hackish way that I've installed NVIDIA's libGL.
I'd appreciate it if someone could help by testing this patch.
You'll need to explicitly remove OBJ-DIR/browser/app/firefox-bin before the build.
Comment 8•16 years ago
|
||
I can confirm that without the patch, a build of SeaMonkey on top of 1.9.2 mozilla-central code crashes at print preview, while with only attachment 386469 [details] [diff] [review] applied in addition, print preview works fine. Nice work!
Assignee | ||
Comment 9•16 years ago
|
||
Thanks very much, Robert.
This also fixes bug 469439. (I managed to use the correct libnvidia-tls.so.1.)
Assignee | ||
Updated•16 years ago
|
Attachment #386469 -
Flags: review?(jasone)
Reporter | ||
Comment 10•16 years ago
|
||
According to the feedback in https://bugzilla.novell.com/show_bug.cgi?id=503151 your patch fixes the issues we've seen.
Comment 11•16 years ago
|
||
Sorry for my English. I was sent here from http://bugs.archlinux.org/task/15441
I am very weak at programming; in fact I don't know any programming language. I just wanted to say that I have a problem with the browser when using Macromedia/Adobe Flash.
I am ready to share any technical information that is required.
Thanks.
Comment 12•16 years ago
|
||
Given that this causes problems with flash in at least some cases (bug 469439), I think we should fix this for 1.9.2 (and 1.9.1.x as well).
blocking1.9.1: --- → ?
Flags: blocking1.9.2?
Assignee | ||
Comment 14•16 years ago
|
||
#1 Firefox 3.5.1 crash on Linux ATM
Comment 15•16 years ago
|
||
Ubuntu Bug:
https://bugs.launchpad.net/bugs/333127
Comment 16•16 years ago
|
||
From all I hear from the Novell/openSUSE side of things, the patch is used in builds they ship now and users cheer for it as the problems seem to be gone.
We really should get this into both 1.9.2 and 1.9.1 ASAP.
Comment 17•16 years ago
|
||
Comment on attachment 386469 [details] [diff] [review]
hook jemalloc into glibc's malloc (without __malloc_initialize_hook)
I don't understand the "elif !defined(malloc)" bit here... can you explain the purpose of that clause?
Comment 18•16 years ago
|
||
(In reply to comment #14)
> #1 Firefox 3.5.1 crash on Linux ATM
What is this based on? I don't think it's based on our stats because the highest crash signature has four crashes in the last week...
status1.9.1:
--- → wanted
Comment 19•16 years ago
|
||
Based on bug 469439 being marked a dupe of this. Many of the libc-2.9.so@0x2d097 crashes are crashes when fullscreening flash.
http://crash-stats.mozilla.com/report/list?product=Firefox&platform=linux&query_search=signature&query_type=exact&query=&date=&range_value=1&range_unit=weeks&do_query=1&signature=libc-2.9.so%400x2d097
Assignee | ||
Comment 20•16 years ago
|
||
Comment on attachment 386469 [details] [diff] [review]
hook jemalloc into glibc's malloc (without __malloc_initialize_hook)
(In reply to comment #17)
> I don't understand the "elif !defined(malloc) bit here... can you explain the
> purpose of that clause?
I saw this code
/* Mangle standard interfaces on Darwin and Windows CE,
in order to avoid linking problems. */
#if defined(MOZ_MEMORY_DARWIN)
#define malloc(a) moz_malloc(a)
#define valloc(a) moz_valloc(a)
#define calloc(a, b) moz_calloc(a, b)
#define realloc(a, b) moz_realloc(a, b)
#define free(a) moz_free(a)
#endif
http://hg.mozilla.org/mozilla-central/annotate/55955ee71c10/memory/jemalloc/jemalloc.c#l6126
and assumed that in some cases jemalloc does not replace the system malloc but
is used as an alternative allocator in parallel to the system malloc (used
only in cases where mixing of allocate/free implementations can be avoided).
Attachment #386469 -
Flags: review?(jasone) → review?(benjamin)
Comment 21•16 years ago
|
||
(In reply to comment #20)
> (From update of attachment 386469 [details] [diff] [review])
> (In reply to comment #17)
> > I don't understand the "elif !defined(malloc) bit here... can you explain the
> > purpose of that clause?
>
> I saw this code
>
> /* Mangle standard interfaces on Darwin and Windows CE,
> in order to avoid linking problems. */
> #if defined(MOZ_MEMORY_DARWIN)
> #define malloc(a) moz_malloc(a)
> #define valloc(a) moz_valloc(a)
> #define calloc(a, b) moz_calloc(a, b)
> #define realloc(a, b) moz_realloc(a, b)
> #define free(a) moz_free(a)
> #endif
>
> http://hg.mozilla.org/mozilla-central/annotate/55955ee71c10/memory/jemalloc/jemalloc.c#l6126
>
> and assumed that in some cases jemalloc does not replace the system malloc but
> is used as an alternative allocator in parallel to the system malloc (used
> only in cases where mixing of allocate/free implementations can be avoided).
On Mac they use this zone allocator nonsense, so malloc basically calls into zone[0] and does an allocation. free() loops through each zone asking whether it owns the allocation and then calls free on that zone. On Mac with jemalloc (which we don't actually use at the moment), we set up a zone and replace the default zone with our own, so we need to define our functions as something other than malloc, etc. We still replace the system allocations.
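The zone behaviour described here can be sketched as a toy model. This is not Apple's malloc zone API: `toy_zone`, `zone_malloc`, `zone_free` and `zone_make_default` are hypothetical names, each zone is a fixed bump-allocated arena, and "free" only reports which zone claimed the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ARENA_SIZE 256
#define TOY_ZONE_COUNT 2

typedef struct toy_zone {
    const char *name;
    size_t used; /* bytes handed out so far */
    _Alignas(max_align_t) unsigned char arena[TOY_ARENA_SIZE];
} toy_zone;

toy_zone system_zone = { .name = "system" };
toy_zone custom_zone = { .name = "custom" };

/* zones[0] is the default zone used for new allocations. */
toy_zone *zones[TOY_ZONE_COUNT] = { &system_zone, &custom_zone };

/* Bump-allocate from the default zone; blocks are never reused here. */
void *zone_malloc(size_t size) {
    toy_zone *z = zones[0];
    size_t align = _Alignof(max_align_t);
    if (size > TOY_ARENA_SIZE) {
        return NULL;
    }
    size_t rounded = (size + align - 1) / align * align;
    if (rounded > TOY_ARENA_SIZE - z->used) {
        return NULL;
    }
    void *ptr = z->arena + z->used;
    z->used += rounded;
    return ptr;
}

/* A zone owns a pointer if it falls inside that zone's arena. */
int zone_owns(const toy_zone *z, const void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t base = (uintptr_t)z->arena;
    return p >= base && p < base + TOY_ARENA_SIZE;
}

/* Ask each zone in turn; return the owner, or NULL if none claims ptr. */
toy_zone *zone_free(void *ptr) {
    for (size_t i = 0; i < TOY_ZONE_COUNT; i++) {
        if (zone_owns(zones[i], ptr)) {
            return zones[i];
        }
    }
    return NULL;
}

/* Make an already registered zone the default by swapping it into slot 0. */
void zone_make_default(toy_zone *z) {
    for (size_t i = 0; i < TOY_ZONE_COUNT; i++) {
        if (zones[i] == z) {
            zones[i] = zones[0];
            zones[0] = z;
            return;
        }
    }
    assert(!"zone is not registered");
}
```

After `zone_make_default(&custom_zone)`, new allocations come from the custom zone, while pointers handed out earlier by the system zone are still routed back to it by the ownership loop.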
Assignee | ||
Comment 22•16 years ago
|
||
Thank you, Stuart for the explanation.
The behavior of this patch is the same as attachment 386469 [details] [diff] [review].
The difference is that preprocessor conditionals are moved around a bit to make it clearer when each section is processed.
Attachment #386469 -
Attachment is obsolete: true
Attachment #390399 -
Flags: review?(benjamin)
Attachment #386469 -
Flags: review?(benjamin)
Updated•16 years ago
|
blocking1.9.1: ? → needed
Updated•16 years ago
|
Attachment #390399 -
Flags: review?(benjamin) → review+
Comment 24•16 years ago
|
||
Is the patch scheduled for 3.5.2?
Comment 25•16 years ago
|
||
(In reply to comment #24)
> Is the patch scheduled for 3.5.2?
Not currently, no. A patch has not yet baked on trunk and is, therefore, not ready to land on the 1.9.1 branch.
Assignee | ||
Comment 26•16 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED
Comment 28•16 years ago
|
||
Verified - Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2a1pre) Gecko/20090730 Minefield/3.6a1pre
Assignee | ||
Updated•16 years ago
|
Attachment #390399 -
Flags: approval1.9.1.3?
Comment 32•16 years ago
|
||
Comment on attachment 390399 [details] [diff] [review]
hook jemalloc into glibc's malloc v2.1
Approved for 1.9.1.3. a=ss
Attachment #390399 -
Flags: approval1.9.1.3? → approval1.9.1.3+
Assignee | ||
Comment 33•16 years ago
|
||
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/d919708797fa
Comment 34•16 years ago
|
||
(In reply to comment #33)
> http://hg.mozilla.org/releases/mozilla-1.9.1/rev/d919708797fa
Hi, I'm from Venezuela. I have the error described here and I see that it has been resolved, but I don't have much experience with this and don't know exactly what I should do to fix the problem on my machine. Can you help me?
Assignee | ||
Comment 35•16 years ago
|
||
A build with the fix can be downloaded from here:
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-1.9.1/
Comment 36•16 years ago
|
||
(In reply to comment #35)
> A build with the fix can be downloaded from here:
> http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-1.9.1/
Thanks, but that build is in English and I think that version has not been officially published yet. I can wait until it is published, because I can already watch videos in fullscreen by disabling hardware acceleration in the Flash Player settings.
Thank you for your help!!
Updated•15 years ago
|
blocking1.9.1: needed → ---
Comment 38•15 years ago
|
||
Is this fixed in 3.5.4?
Comment 39•15 years ago
|
||
Should have been fixed in 3.5.3 as noted by the .3-fixed entry in the status1.9.1 field.
See Also: → https://launchpad.net/bugs/333127
Updated•9 years ago
|
blocking-b2g: 2.2r? → ---
tracking-b2g:
backlog → ---