Bug 316066

Summary: TileSet doesn't remove cairo surfaces in destructor
Product: [Plasma] Oxygen Reporter: Ruslan Kabatsayev <b7.10110111>
Component: gtk2-engineAssignee: Hugo Pereira Da Costa <hugo.pereira.da.costa>
Status: RESOLVED FIXED    
Severity: normal CC: b7.10110111, hugo.pereira.da.costa, kokoko3k, rahimi.nv, web
Priority: NOR    
Version: Git   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: patch to test leak at tileset deletion
Refcount test
New test

Description Ruslan Kabatsayev 2013-03-03 10:37:26 UTC
This is a bug which has always been present, but only now noticed thanks to Bug 304381.
When the cache is overfilled, its adjustSize() method removes some older elements, and in case of TileSet it removes old TileSets. At the same time, as TileSet doesn't clear cairo surfaces it initialized, we get bugs similar to bug 304381.
This, of course, is not the only reason for that bug, the main one is that described in it, but still this one can bite us one day if not fixed.

Reproducible: Always

Actual Results:  
Bug 304381 wouldn't come up if this one didn't exist, so if that bug without being fixed is reproduced (in X11 resource usage symptoms), then this one is also present.
Comment 1 Ruslan Kabatsayev 2013-03-03 11:04:13 UTC
Well.  In fact, this description is wrong, because Cairo::Surface removes the surfaces itself. Still, something makes their Pixmaps not disappear on Tileset deletion.
Comment 2 Hugo Pereira Da Costa 2013-03-05 13:15:09 UTC
Seems to me like a bug in cairo then, than cairo_surface destruction does not free the underlying pixmap.

Question: does the behavior persists if in Oxygen::StyleHelper::initializeRefSurface we use the second code path only (CAIRO_FORMAT_ARGB32) to create the ref surface ?

(I have not tried to reproduce yet, but am working on it)
Comment 3 Hugo Pereira Da Costa 2013-03-05 16:25:21 UTC
Created attachment 77777 [details]
patch to test leak at tileset deletion

This patch forces all tileset cackes to a size of 1. 
It also adds a printout everytime the cache expunges some elements due to max_size being exceeded. 

With this patch applied I am not able to reproduce the leak mentioned above:
running xrestop in one terminal, then oxygen-gtk-demo in another and moving back-and-force between the first two tabs, I can see the number of allocated pixmaps remain constant at ~63.

@Ruslan: can you confirm/infirm the above observation ? 
How did you actually spot the leak mentionned above ? 

Note: pixmaps are not *always* freed when a tileset (or surface) is deleted. This is the case for instance when tileset or surface has been copied (somewhere else), in which case the same underlying pixmap is re-used by the copy (this is the _reference count in the surface equal-to operator) and freed only when then last copy is deleted. 

I plan to make further check on leaks by printing out the reference count of the cairo_surface_t objects (cairo_surface_get_reference_count), that counts the number of copies of the same surface. The pixmap is actually freed only when the cairo_surface_t that has reference_count = 1 is deleted.

Cheers,

Hugo
Comment 4 Ruslan Kabatsayev 2013-03-05 18:45:30 UTC
With this patch I can see constant increase in Total pixmap allocation every time I hover a tab, i.e. the prelight pixmaps get allocated but seem to not be unallocated.

> Note: pixmaps are not *always* freed when a tileset (or surface) is deleted. This is the case for instance when tileset or surface has been copied (somewhere else), in which case the same underlying pixmap is re-used by the copy (this is the _reference count in the surface equal-to operator) and freed only when then last copy is deleted. 

Where do we copy surfaces? AFAIR, we only use some original surface to create 'similar' ones.
Comment 5 Hugo Pereira Da Costa 2013-03-05 19:14:15 UTC
(In reply to comment #4)
> With this patch I can see constant increase in Total pixmap allocation every
> time I hover a tab, i.e. the prelight pixmaps get allocated but seem to not
> be unallocated.
mmm. 

Can't reproduce. Has to be an upstream bug then ...
cairo, or X11, or even graphics driver.
My cairo is 1.12.8
X: libx11_6-1.4.99.1-4.mga2 (not sure that's the right version string)
graphics: tested on both intel and nvidia.

> 
> > Note: pixmaps are not *always* freed when a tileset (or surface) is deleted. This is the case for instance when tileset or surface has been copied (somewhere else), in which case the same underlying pixmap is re-used by the copy (this is the _reference count in the surface equal-to operator) and freed only when then last copy is deleted. 
> 
> Where do we copy surfaces? AFAIR, we only use some original surface to
> create 'similar' ones.

We do that via the implicit copy constructor of tilesets (that calls the "=" operator of Cairo::Surface), for instance in oxygenshadowhelper.

not sure there are other places where this guy is called. 
(mmm, in fact, possibly every time we insert a tileset into a cache)
Comment 6 Ruslan Kabatsayev 2013-03-05 19:16:58 UTC
Haaa. I've found one more place where something goes wrong: TileSet::initSurface(): 
                #if CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 10, 0)                                                    |        // set metrics
                Cairo::Surface tile( cairo_surface_create_for_rectangle( source, sx, sy, sw, sh ) );                          |        _w3 = cairo_surface_get_width( surface ) - (w1 + w2);
                #else                                                                                                         |        _h3 = cairo_surface_get_height( surface ) - (h1 + h2);
                Cairo::Surface tile( cairo_surface_create_similar( source, CAIRO_CONTENT_COLOR_ALPHA, sw, sh ) );

Having your patch applied, if I change #if to #if false, I don't reproduce the pixmap increase with oxygen-gtk when moving mouse over lineedit controls on the first page. If I make it as is, then I do reproduce this.
Comment 7 Ruslan Kabatsayev 2013-03-05 19:17:30 UTC
Oops, should be this instead:
                #if false&&CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 10, 0)                                                    
                Cairo::Surface tile( cairo_surface_create_for_rectangle( source, sx, sy, sw, sh ) );                          
                #else
Comment 8 Hugo Pereira Da Costa 2013-03-05 19:21:18 UTC
... I don't have this issue. 
Maybe the solution is just to increase the version check in the incriminated #if.
Maybe the first version of cairo_surface_create_for_rectangle is buggy
(it was absent in earlier cairo versions, and is supposed to be faster than what's after the #if)

What's your cairo ?
Comment 9 Ruslan Kabatsayev 2013-03-05 19:23:49 UTC
> What's your cairo ?
1.12.3, so not that old. What's yours?
Comment 10 Hugo Pereira Da Costa 2013-03-05 19:25:00 UTC
1.12.8 (see comment 5).
Any chance you can upgrade ? 
(Well I guess I can also try to downgrade)
Comment 11 Ruslan Kabatsayev 2013-03-05 19:27:38 UTC
Will try now
Comment 12 Ruslan Kabatsayev 2013-03-05 19:45:58 UTC
Created attachment 77783 [details]
Refcount test

I've installed version 1.12.14, newer than yours, but still the same issue.
Now, I'm attaching a new test for leaks. See typical output:
______________________________
source refcount1: 1
tile refcount: 1
source refcount2: 2
source refcount3: 2

Notice refcount2 equal refcount3, unequal refcount1. This means 'tile' removal doesn't decrease refcount for some reason (tile is removed when its scope ends, right?).
Do you confirm this?
Comment 13 Ruslan Kabatsayev 2013-03-05 19:53:36 UTC
Hmm, OK. It's not because of some bug, it's because source is used as source surface for Context.
Well, then I don't understand what's up...
Comment 14 Ruslan Kabatsayev 2013-03-05 20:04:21 UTC
Funny. I've done something, and now fail to reproduce this either. Even after downgrade.
Comment 15 Hugo Pereira Da Costa 2013-03-05 20:07:07 UTC
when you upgrade/downgrade, you uninstall the old cairo version first ?
Comment 16 Ruslan Kabatsayev 2013-03-05 20:16:04 UTC
Created attachment 77784 [details]
New test

Great point, you were right. Now, here's what I get with this updated test with old cairo:
______________________________
source refcount1: 1
source refcount1.5: 1
source refcount3: 2
source refcount4: 1
______________________________
source refcount1: 1
source refcount1.5: 1
tile refcount: 1
source refcount2: 2
source refcount3: 2
source refcount4: 2
______________________________
source refcount1: 2
source refcount1.5: 2
source refcount3: 3
source refcount4: 2
______________________________
source refcount1: 2
source refcount1.5: 2
tile refcount: 1
source refcount2: 3
source refcount3: 3
source refcount4: 3

An more up to 6, sometimes increasing to 7. So, apparently, it's a bug in cairo up to at least 1.12.3, maybe something until 1.12.8.
BTW, do you confirm that refcount1==refcount4 with your cairo?
Comment 17 Hugo Pereira Da Costa 2013-03-05 20:20:52 UTC
(In reply to comment #16)
> Created attachment 77784 [details]
> New test
> 
> Great point, you were right. Now, here's what I get with this updated test
> with old cairo:
> ______________________________
> source refcount1: 1
> source refcount1.5: 1
> source refcount3: 2
> source refcount4: 1
> ______________________________
> source refcount1: 1
> source refcount1.5: 1
> tile refcount: 1
> source refcount2: 2
> source refcount3: 2
> source refcount4: 2
> ______________________________
> source refcount1: 2
> source refcount1.5: 2
> source refcount3: 3
> source refcount4: 2
> ______________________________
> source refcount1: 2
> source refcount1.5: 2
> tile refcount: 1
> source refcount2: 3
> source refcount3: 3
> source refcount4: 3
> 
> An more up to 6, sometimes increasing to 7. So, apparently, it's a bug in
> cairo up to at least 1.12.3, maybe something until 1.12.8.
> BTW, do you confirm that refcount1==refcount4 with your cairo?

Yes. So this is good. And unlike your printout with earlier version (which definitly shows a leak)

So. We just bump version in the VERSION check, to 1.12.8, to be sure.
No big deal, the old code path is as valid as the first, and likely only a tiny bit less efficient.
You do that and push and close ?
Comment 18 Ruslan Kabatsayev 2013-03-05 20:24:37 UTC
I'll reverse-bisect cairo to determine first working version, add some warnings in case it breaks again (like refcount1!=refcount4->cerr) and bump the version.
Comment 19 Hugo Pereira Da Costa 2013-03-05 20:51:32 UTC
(In reply to comment #18)
> I'll reverse-bisect cairo to determine first working version, add some
> warnings in case it breaks again (like refcount1!=refcount4->cerr) and bump
> the version.

Not sure the warning are necessary (and would clutter the code).
For one, we could likely have spotted it with valgrind (memchecker)
Also, it is unlikely to get broken again, once fixed.
What's more likely is that we uncover new issues elsewhere.
All in all, I'd rather avoid adding upstream checks in our code and rather rely on  upstream being "working" (we are not writing a cairo, or gtk debugger, after all :))

Note: version check update should go in all our "active" branches
Comment 20 Ruslan Kabatsayev 2013-03-05 20:59:14 UTC
Git commit c7e54e1e33b5634c5e5455253a2e3e364e1ffce6 by Ruslan Kabatsayev.
Committed on 05/03/2013 at 21:58.
Pushed by kabatsayev into branch '1.3'.

Bump needed cairo version for cairo_surface_create_for_rectangle()

M  +2    -1    src/oxygentileset.cpp

http://commits.kde.org/oxygen-gtk/c7e54e1e33b5634c5e5455253a2e3e364e1ffce6
Comment 21 Ruslan Kabatsayev 2013-03-05 21:02:39 UTC
I was lucky enough that I caught cairo before they released 1.12.4. Without this, we'll still have X resource leaks unreproduced. Now this one down.
Comment 22 Ruslan Kabatsayev 2013-03-05 21:12:18 UTC
Git commit d9884c293982d1c82025229d6b66388df2524b0c by Ruslan Kabatsayev.
Committed on 05/03/2013 at 21:58.
Pushed by kabatsayev into branch 'gtk3-1.1'.

Bump needed cairo version for cairo_surface_create_for_rectangle()

M  +2    -1    src/oxygentileset.cpp

http://commits.kde.org/oxygen-gtk/d9884c293982d1c82025229d6b66388df2524b0c
Comment 23 Hugo Pereira Da Costa 2013-04-08 05:48:18 UTC
*** Bug 318002 has been marked as a duplicate of this bug. ***
Comment 24 Hugo Pereira Da Costa 2013-04-08 15:02:19 UTC
*** Bug 318039 has been marked as a duplicate of this bug. ***