Bug 101542 - Q_EXPORT being no-op creates multiple copies of RTTI symbols w/hidden default visibility
Summary: Q_EXPORT being no-op creates multiple copies of RTTI symbols w/hidden default...
Status: RESOLVED FIXED
Alias: None
Product: kdelibs
Classification: Unmaintained
Component: general (show other bugs)
Version: 3.4
Platform: Gentoo Packages Linux
: NOR major
Target Milestone: ---
Assignee: Dirk Mueller
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-03-15 12:13 UTC by Marcus D. Hanwell
Modified: 2005-09-25 23:43 UTC (History)
8 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
kasteroids-3.4.0-keypressfix.patch (974 bytes, patch)
2005-03-15 12:15 UTC, Marcus D. Hanwell
Details
kasteroids-keypressfix.patch (679 bytes, patch)
2005-03-20 13:57 UTC, Marcus D. Hanwell
Details
Some debug output (720 bytes, patch)
2005-03-25 00:41 UTC, Maksim Orlovich
Details
evil workaround (623 bytes, patch)
2005-03-25 01:43 UTC, Maksim Orlovich
Details
Ugly hack that fixes the problem (755 bytes, patch)
2005-03-25 02:55 UTC, Thiago Macieira
Details
kde-visibility-switch.patch (665 bytes, patch)
2005-07-10 23:53 UTC, Gregorio Guidi
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Marcus D. Hanwell 2005-03-15 12:13:54 UTC
Version:           2.3 (using KDE KDE 3.4.0)
Installed from:    Gentoo Packages
Compiler:          GCC 3.4.3 gcc version 3.4.3 20050110 (Gentoo Linux 3.4.3.20050110, ssp-3.4.3.20050110-0, pie-8.7.7)
OS:                Linux

kasteroids starts up and waits for L to be pressed to lauch a ship. When L or any other key is pressed it crashes. I did some debugging and managed to track down the cause of the problem and am attaching a small patch. The patch applies to toplevel.cpp and allows the game to work without any issues on amd64. I don't think it introduces any new problems on any other architectures.
Comment 1 Marcus D. Hanwell 2005-03-15 12:15:26 UTC
Created attachment 10123 [details]
kasteroids-3.4.0-keypressfix.patch

Patch to stop segfault on amd64.
Comment 2 Thiago Macieira 2005-03-15 12:24:20 UTC
This can't be right:
-	QKeyEvent *e = dynamic_cast<QKeyEvent*>(event);
+	QKeyEvent *e = (QKeyEvent*) event;

But it doesn't seem to be checking for e != 0 either.
Comment 3 Marcus D. Hanwell 2005-03-20 13:57:51 UTC
Created attachment 10209 [details]
kasteroids-keypressfix.patch

This patch may be a little cleaner, and it works flawlessly on my x86 and amd64
machines. I am not certain why the dynamic_cast causes a segfault, but after
searching on this issue I found several places that support the use of
static_cast such as,

http://lists.trolltech.com/qt-interest/2002-10/msg00011.html

Please do let me know if this is the wrong way to fix this bug, it certainly
seems to work well and after researching the area a little it seems to be
technically sound.
Comment 4 Maksim Orlovich 2005-03-20 17:07:15 UTC
No, the patch is nonsense. If dynamic_cast doesn't work, static_cast should never work either
Comment 5 Thiago Macieira 2005-03-20 18:36:30 UTC
Your patch is correct, but the error is somewhere else. There's something seriously wrong up the chain of events.

Just to document our findings:

dynamic_cast<QKeyEvent*>(event) returns 0
typeid(event).name() returns "P6QEvent" (that is, "QEvent *")

Those two mean the same thing: the C++ RTTI thinks the object is of type QEvent.

However, Qt's QEvent documentation says all QEvent::AccelOverride and QEvent::KeyRelease events carry objects of type QKeyEvent.

Maks also found the function that is generating those AccelOverride events (KAccelEventHandler::x11Event), but it also generates QKeyEvent objects.

Conclusion: we haven't found the source of the bug yet.

Is it possible to rebuild Qt, kdecore and kasteroids with debugging symbols?
Comment 6 Marcus D. Hanwell 2005-03-24 11:33:00 UTC
Just to add some more to our findings. The source to kasteroids has not changed between 3.3.2 and 3.4.0, yet kasteroids works when built with 3.3 and fails with 3.4. They are both built on the same system, with the same qt and toolchain/libs (Just recompiled KDE 3.3.2 here to confirm that). I have also had a few people confirm they are having this issue on Gentoo.

I am in the process of rebuilding with debugging symbols, but kdelibs is currently failing to link in my debugging chroot. I will hopefully get it sorted soon to give a better trace.
Comment 7 Diego Elio Pettenò 2005-03-24 17:20:25 UTC
I'm having the same problem, too.
Just trying to get something more from it, trying to place an eventFilter on a simple QT or KDE app doesn't make it crazy and works perfectly well.

The test of typeid(event).name() is pointless, it returns "P6QEvent" as it should as that code tests the type of QEvent *event (the pointer, not the pointed variable). A more useful test is typeid(*event).name() which returns...

kasteroids: typeid(*event).name() == 9QKeyEvent

so it seems to be the right instance, rtti works, just dynamic_cast not...

This could be dued by some hackish thing done to allow AccelOverrides event to get ate by KApplication, but I honestly can think of anything which can break in such a way dynamic_cast...

No one on other arches has had this problem before?
Comment 8 Thiago Macieira 2005-03-24 20:58:55 UTC
> kasteroids: typeid(*event).name() == 9QKeyEvent

That's interesting. It means the type is correct, and only dynamic_cast is wrong.

That means the problem is in libstdc++ somewhere. It is also very likely Maks' guess is right and it is caused by libfam.

So, can someone who has the problem test if:
a) attachment #10209 [details] really solves the problem
b) recompiling without libfam makes the problem vanish

According to the Qt docs, attachment #10209 [details] is correct. I think it should be applied.
Comment 9 Diego Elio Pettenò 2005-03-24 21:29:40 UTC
What's up with libfam?

Anyway. Attachment #10209 [details] works fine for me, too. The logic is right, and it works. Also if i would do it differently (with a switch statement and a direct return process*), but only stylistics.

About libfam... i would do it but there are two main problems:
a) there's no fam useflag on kdelibs so I need to hack it and make sure that it will work with that hack
b) my development environment is my own environment, so i'll need to close some paid jobs before tinker too much with it, as i'm not sure if kdelibs without fam will work decently at all

I'll start looking at it tonight.

P.S.: in my testcase application, dynamic_cast works fine, so there's something more.
Comment 10 Maksim Orlovich 2005-03-24 21:44:12 UTC
libFAM is a C++ lib, and in some distros it is -statically- linked to libstdc++, occasionally causing two different versions of the C++ runtime to be around.  No idea whether it's relevant for gentoo at all..

But again, dcast failing is a very worrisome event.
Comment 11 Diego Elio Pettenò 2005-03-24 23:58:28 UTC
Gentoo doesn't compile libfam statically, anyway.

I tried building kdelibs without libfam (so i'll be trying dnotify from now on, as I dislike fam, anyway ;) ), and nothing changes.

And just to say, i've built everything in that system with the same compiler, i've assembled it last week.
Comment 12 Maksim Orlovich 2005-03-25 00:41:41 UTC
Created attachment 10333 [details]
Some debug output

If you have a moment, could you perhaps try this?
Also interesting is the output of:
 objdump -T `which kasteroids`|grep _ZTI9QKeyEvent

And does the crash happen if the app is started from a shell as well?
Comment 13 Diego Elio Pettenò 2005-03-25 00:49:14 UTC
Marksin you have found it!
dyn:0x2aaaabecd1c0 static:0x41dbf0
Only I don't know how to fix such a thing...

[yes i'm running kasteroids from a shell, anyway.]
Comment 14 Maksim Orlovich 2005-03-25 01:07:12 UTC
well, that's certainly related. Two follow ups:
1. What's the rough memory map like? big spread there.
2. What does the objdump return?
Comment 15 Diego Elio Pettenò 2005-03-25 01:24:33 UTC
I know quite nothing of amd64 tech specs, I've bought this machine just last week and I haven't stopped bugfixing other things yet :)

the grep on objdump returns nothing.
Comment 16 Maksim Orlovich 2005-03-25 01:43:15 UTC
Created attachment 10334 [details]
evil workaround

OK, I think we (myself and Thiago) know what the issue is, and it's quite
nasty, symbol-visibility related bug, which goes quite outside the scope of
kasteroids. Could you please try the attached ultra-evil patch to see whether
it helps things?
Comment 17 Diego Elio Pettenò 2005-03-25 02:04:10 UTC
Still crashes and objdump says nothing new.
Comment 18 Maksim Orlovich 2005-03-25 02:52:24 UTC
Sorry, the hack was broken in many ways, but the analysis was sound. Here is the actual issue:

Since 3.4 KDE is built, when the compilers support it, with -fvisibility=hidden. This hides symbols not explicitly exported. 

RTTI support relies on there being a unique copy of the special TypeInfo node symbols. The linker is responsible for merging multiple copies. Unfortunately, if any are hidden, they get their own separate address. QKeyEvent has no out-of-line virtuals, so the vtable and the typeinfo node gets exported at every use. But there is no explicit export of those, so they are private --- Q_EXPORT is empty! Hence, Qt has one copy, kasteroid another, and kdecore the third of _ZTI9QKeyEvent, making dynamic_cast fail.

Full impact: unknown, but also affects IA-32.
Potential fix: -D of Q_EXPORT to set default visibility.
Comment 19 Thiago Macieira 2005-03-25 02:55:10 UTC
Created attachment 10335 [details]
Ugly hack that fixes the problem

Here's a better hack that works around the problem. It makes Q_EXPORT be
KDE_EXPORT and includes kdelibs_export.h in all KDE files.

This is bad because:
1) it's ugly
2) it's a hack
3) if you don't use the KDE admin dir, you won't get this fix

I haven't rebuilt my whole system with it to test. I have just verified that
kasteroids works fine if I rebuild kdecore/kaccel.cpp with it. I didn't even
rebuild the whole library.
Comment 20 Dirk Mueller 2005-03-25 21:33:44 UTC
imho systems using a hacked compiler with backported visibility support should just apply a fix to qt to define Q_EXPORT to something reasonable. 

Comment 21 Maksim Orlovich 2005-03-25 21:59:26 UTC
I don't believe Q_EXPORT gets defined in Qt on UNIX even with the upcoming gcc-4.0
Comment 22 Dirk Mueller 2005-03-25 22:09:46 UTC
no, it doesn't of course. its the job of the distributor who cared enough
to patch their gcc anyway (there is nobody using gcc 4.0, as it cannot even
compile KDE at all). 

we should maybe do a patch for our qt-copy, and configure check on it. I'm afraid I missed the last Qt 3.3.x release by a few hours when I first implemented visibility support, so I didn't care to fix that issue so far. 

Comment 23 Dirk Mueller 2005-03-25 22:11:17 UTC
btw, there is nested visibility scope support in gcc 4.0, where you can "reset" visibility with a #pragma, to switch visibility scopes. 

with some tuning we could make use of that in kde (make sure that kdemacros is included after all qt and before all kde headers), but its not worth the trouble and very very fragile
Comment 24 Dirk Mueller 2005-03-25 22:16:59 UTC
thinking about it, we have to do that anyway - for other C++ based libraries
that do not support visibility. libfam, taglib, and probably others. 

so some kind of KDE_RESET_VISIBILITY_TO_DEFAULT macro wrapping around all
non-kdelibs includes. 

boy that sucks. better ideas appreciated. 



Comment 25 Dirk Mueller 2005-03-26 00:30:46 UTC
given that I'm the one who introduced visibility support, I should
probably taking care of fixing this issue. 
Comment 26 Thiago Macieira 2005-03-26 06:05:05 UTC
Regarding Qt4: yes, it does have visibility support. It has even a configure check for that, which defines a macro for export visibility. So, Qt4 supports it out of the box.

I asked tronical today if there was any chance of that being backported to Qt3. His answer was, understandably, that new features into Qt3 are hardly a priority.

If it were just Qt, we could fix this problem with my acinclude patch (attachment #10335 [details]), which redefines Q_EXPORT to something useful. However, Dirk is right: this affects everything we include in our headers and source code.

C stuff isn't much of a problem since undefined symbols can't be of type hidden. But C++ special symbols (virtual tables, typeinfo nodes, names, implicit constructors or destructors, etc.) may be a problem as is the case here. 

However, please note this affects only a very restricted set of symbols: namely, those that have their virtual tables & typeinfo structures being defined as weak symbols: polymorphic classes without any explicit virtual function. If a simple destructor were added to QKeyEvent, this problem would vanish.

If this weren't the case, we would have noticed bugs much sooner. I have personally been building KDE with hidden visibility for many months now.
Comment 27 Thiago Macieira 2005-03-29 14:07:37 UTC
I don't think this is the distributor's fault. I think it is ours.

We have broken the build even for gcc4 here, even though that's not released. It is the distributor's fault, however, that gcc4 problems have been "backported" into 3.4.

The proper fix, as I see it, is to remove the -fvisibility=hidden -fvisibility-inlines-hidden from the command-line. Every KDE header and source code should use the #pragmas to set the visibility around its own code.

Setting the visibility around other people's code is what caused this problem.
Comment 28 Maksim Orlovich 2005-04-10 22:06:11 UTC
CVS commit by orlovich: 

Workaround the duplicated TI nodes issue due to hidden visibility in app here.

In vernacular: this makes kasteroids not crash, but doesn't fix the underlying issue.
CCBUG:101542


  M +1 -1      toplevel.cpp   1.77


--- kdegames/kasteroids/toplevel.cpp  #1.76:1.77
@@ -337,5 +337,5 @@ return; // remove this and the above whe
 bool KAstTopLevel::eventFilter( QObject* /* object */, QEvent *event )
 {
-        QKeyEvent *e = dynamic_cast<QKeyEvent*>(event);
+        QKeyEvent *e = static_cast<QKeyEvent*>(event);
         if (event->type() == QEvent::AccelOverride)
         {
Comment 29 danarmak 2005-05-24 23:27:32 UTC
> The proper fix, as I see it, is to remove the -fvisibility=hidden
> -fvisibility-inlines-hidden from the command-line. Every KDE header and
> source code should use the #pragmas to set the visibility around its own
> code. 

> Setting the visibility around other people's code is what caused this
> problem. 
So are you planning to do that, now or later? We (Gentoo) are going to disable
all hidden visibility in kde for 3.4.1 - a better safe than sorry policy...
Comment 30 Unai Garro 2005-07-03 23:48:08 UTC
SVN commit 431311 by uga:

CCBUG:101542
Fixes showing the kpf applet menu. The dynamic cast of the event resulted in the same problem as in kasteroids (e=0)


 M  +1 -1      AppletItem.cpp  


--- trunk/KDE/kdenetwork/kpf/src/AppletItem.cpp #431310:431311
@@ -150,7 +150,7 @@
 
       case QEvent::MouseButtonPress:
         {
-          QMouseEvent * e = dynamic_cast<QMouseEvent *>(ev);
+          QMouseEvent * e = static_cast<QMouseEvent *>(ev);
 
           if (0 == e)
           {
Comment 31 Maksim Orlovich 2005-07-04 00:48:53 UTC
Other instances affected by this bug:

kdegraphics/kghostview/fullscreenfilter.cpp:    if ( QKeyEvent* keyevent = dynamic_cast<QKeyEvent*>( ev ) ) {
kdegraphics/kghostview/fullscreenfilter.cpp:    if ( QMouseEvent* mouseevent = dynamic_cast<QMouseEvent*>( ev ) ) {

=> Used for esc, Left-click handling in full-screen mode in KGhostView, but doesn't seem to be broken.

kdelibs/kstyles/plastik/plastik.cpp:            QMouseEvent *me = dynamic_cast<QMouseEvent*>(ev);

=> Hover effect on tabs broken. Will workaround there, I guess

kdemultimedia/juk/statuslabel.cpp:    QMouseEvent *mouseEvent = dynamic_cast<QMouseEvent *>(e);

=> Jumps to songs from clicking on statusbar broken. Worked around in the app.

kdesdk/kbabel/commonui/projectprefwidgets.cpp:        QKeyEvent *ke = dynamic_cast<QKeyEvent*>(e);

No clue on how to test this, never used kbabel.

Comment 32 Gregorio Guidi 2005-07-10 23:53:59 UTC
Created attachment 11751 [details]
kde-visibility-switch.patch

Maybe you can apply the patch above until this situation is resolved?
It adds a configure switch (--enable-gcc-visibility/--disable-gcc-visibility)
to make visibility support optional, defaulting it to off.

Or you can just apply the patch and set the default to on to keep the current
(problematic) behavior, but still making it configurable.

Thanks.
Comment 33 Thiago Macieira 2005-07-10 23:59:13 UTC
That would make sense.
Comment 34 Gregorio Guidi 2005-07-15 09:41:21 UTC
Any other opinion?

Actually, I would be glad if the patch was applied to the 3.4 branch before 3.4.2 is tagged. Do you think it's possible?
Comment 35 Dirk Mueller 2005-07-15 16:27:21 UTC
I'm currently preparing a Qt patch and will disable visibility if Qt is not
patched. That should fix all the issues and is the easiest fix too. 

Comment 36 Gregorio Guidi 2005-07-19 12:32:21 UTC
> I'm currently preparing a Qt patch and will disable visibility if Qt is not 
> patched. That should fix all the issues and is the easiest fix too.

Thinking about it, an issue with this approach is that it doesn't take other C++ libraries into account (comment #24)...

Is the Qt fix already available in SVN?
Comment 37 Dirk Mueller 2005-07-19 12:38:28 UTC
no, I'm still busy fixing the build system. 

Yes, other C++ libraries are not taken care of, but thats just requiring
1% of the source code to adjust, not 100%. 

Comment 38 Gregorio Guidi 2005-07-19 14:02:21 UTC
Sorry to annoy everyone again then ;) but I guess a fix will not be available for 3.4.2, so applying the patch could really make sense at this point...

(on a side note for those interested, change AS_HELP_STRING into AC_HELP_STRING in the patch to make it compatible with autoconf-2.53)
Comment 39 Dirk Mueller 2005-07-20 15:45:17 UTC
SVN commit 436904 by mueller:

disable -fvisibility=hidden if Qt is not patched.
CCBUG: 101542


 M  +33 -10    acinclude.m4.in  
 M  +15 -0     configure.in.bot.end  


--- trunk/KDE/kde-common/admin/acinclude.m4.in #436903:436904
@@ -1643,8 +1643,8 @@
 
 If you did install kdelibs, then the Qt version that is picked up by
 this configure is not the same version you used to compile kdelibs. 
-The Qt Plugin installed by kdelibs is *ONLY* loadable if its the 
-same Qt version, compiled with the same compiler and the same Qt
+The Qt Plugin installed by kdelibs is *ONLY* loadable if it is the 
+_same Qt version_, compiled with the _same compiler_ and the same Qt
 configuration settings.
 ])
 fi
@@ -3283,25 +3283,48 @@
   ]
 )
 
-AC_DEFUN([KDE_CHECK_AND_ADD_HIDDEN_VISIBILITY],
+AC_DEFUN([KDE_ENABLE_HIDDEN_VISIBILITY],
 [
-  if test "$GXX" = "yes"; then
+  AC_BEFORE([AC_PATH_QT_1_3], [KDE_ENABLE_HIDDEN_VISIBILITY])
+
+  if test "x$GXX" = "xyes"; then
+    kde_have_gcc_visibility=no
     KDE_CHECK_COMPILER_FLAG(fvisibility=hidden, 
     [
+      kde_have_gcc_visibility=yes
+      AC_CACHE_CHECK([if Qt is patched for -fvisibility], kde_cv_val_qt_gcc_visibility_patched,
+        [
+          AC_LANG_SAVE
+          AC_LANG_CPLUSPLUS
+
+          safe_CXXFLAGS=$CXXFLAGS
+          CXXFLAGS="$CXXFLAGS $all_includes"
+
+          AC_TRY_COMPILE(
+          [
+#include <qglobal.h>
+#if Q_EXPORT - 0 != 0
+/* if this compiles, then Q_EXPORT is undefined */
+/* if Q_EXPORT is nonempty, this will break compilation */
+#endif
+          ], [/* elvis is alive */],
+          kde_cv_val_qt_gcc_visibility_patched=no, kde_cv_val_qt_gcc_visibility_patched=yes)
+
+          CXXFLAGS=$safe_CXXFLAGS
+          AC_LANG_RESTORE
+        ]
+      )
+
+      if test x$kde_cv_val_qt_gcc_visibility_patched = "xyes"; then
         CXXFLAGS="$CXXFLAGS -fvisibility=hidden"
         KDE_CHECK_VISIBILITY_GCC_BUG
-
         HAVE_GCC_VISIBILITY=1
         AC_DEFINE_UNQUOTED(__KDE_HAVE_GCC_VISIBILITY, "$HAVE_GCC_VISIBILITY", [define to 1 if -fvisibility is supported])
+      fi
     ])
   fi
 ])
 
-AC_DEFUN([KDE_ENABLE_HIDDEN_VISIBILITY],
-[
-  AC_REQUIRE([KDE_CHECK_AND_ADD_HIDDEN_VISIBILITY])
-])
-
 AC_DEFUN([KDE_ADD_DEPENDENCIES],
 [
    [A]M_DEPENDENCIES(CC)
--- trunk/KDE/kde-common/admin/configure.in.bot.end #436903:436904
@@ -16,6 +16,21 @@
   fi
 fi
 
+if test x$GXX = "xyes" -a x$kde_have_gcc_visibility = "xyes" -a x$kde_cv_val_qt_gcc_visibility_patched = "xno"; then
+  echo ""
+  echo "Your GCC supports symbol visibility, but the patch for Qt supporting visibility"
+  echo "was not included. Therefore, GCC symbol visibility support remains disabled."
+  echo ""
+  echo "For better performance, consider including the Qt visibility supporting patch"
+  echo "located at:"
+  echo ""
+  echo "http://bugs.kde.org/show_bug.cgi?id=109386"
+  echo ""
+  echo "and recompile all of Qt and KDE. Note, this is entirely optional and"
+  echo "everything will continue to work just fine without it."
+  echo ""
+fi
+
 if test "$all_tests" = "bad"; then
   if test ! "$cache_file" = "/dev/null"; then
     echo ""
Comment 40 Dirk Mueller 2005-07-21 04:24:16 UTC
fixed for KDE 3.4.2. 
Comment 41 Diego Elio Pettenò 2005-09-25 23:01:33 UTC
Don't think this is still fixed.

With Gentoo, GCC4, QT 3.3.5, arts CRASHES. Disabling visibility (removed KDE_ENABLE_VISIBILITY from configure.in.in) makes it work fine.

The same problem seems to be present on KUbuntu.

So the visibility is still an issue.
Comment 42 Carsten Lohrke 2005-09-25 23:35:43 UTC
Diego, have a look at http://bugs.kde.org/show_bug.cgi?id=109386
Comment 43 Diego Elio Pettenò 2005-09-25 23:43:37 UTC
Thanks, moving to that then.