Bug 316449 - Large memory leak when updating entries via data sources
Summary: Large memory leak when updating entries via data sources
Status: RESOLVED FIXED
Alias: None
Product: tellico
Classification: Applications
Component: general (show other bugs)
Version: 2.3.7
Platform: Unlisted Binaries Linux
: NOR major
Target Milestone: ---
Assignee: Robby Stephenson
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-03-10 06:56 UTC by Dominik Stadler
Modified: 2013-03-12 03:58 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In: 2.3.8


Attachments
Fix large memory leak for every xslt transformation that is done, e.g. as part of fetches (3.04 KB, patch)
2013-03-11 18:08 UTC, Dominik Stadler
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Dominik Stadler 2013-03-10 06:56:11 UTC
I have a large collection and when I update many entries via data sources in one go, I see a steady increase in memory usage until available memory runs out and finally the linux-oom-killer kicks in and forcefully closes tellico.

I see aprox. 1MB of memory leak per 10 entries being updated.

Reproducible: Always

Steps to Reproduce:
1. Open a sufficiently large collection of books
2. Select many entries and choose "update entries -> from all sources"
3. Wait until memory runs out
Actual Results:  
tellico crashes/is killed by the linux oom-killer

Expected Results:  
memory should keep steady
Comment 1 Dominik Stadler 2013-03-10 23:25:24 UTC
I did some investigation and I have two suspects here: 

* xslthandler.cpp:233: The members m_docIn and m_docOut are only freed in the destructor, but overwritten with newly allocated objects, thus leaking two XML documents for every XSLT parsing.

I am currently testing a patch which get's rid of the members m_docIn and m_docOut completely and just keeps the documents locally and frees them immediately after the parse, they are not used any more anyway and thus it is generally a good idea to throw them away quickly rather than keeping them around.

* I don't know much about KDE's signals, but it looks to me like signalResultFound leaks FetchResults if the FetchDialog is not shown. 
When updating entries, the dialog is not displayed and thus the slot for "ResultFound" is not connected. However each fetcher allocates a FetchResult and passes it to the signal. I could not yet verify that it actually leaks, but valgrind reports some lost memory around that area...

    FetchResult* r = new FetchResult(...);
    emit signalResultFound(r);

I'll update here with patches if my changes work for my collections.
Comment 2 Robby Stephenson 2013-03-11 00:50:24 UTC
(In reply to comment #1)
> * xslthandler.cpp:233: The members m_docIn and m_docOut are only freed in
> the destructor, but overwritten with newly allocated objects, thus leaking
> two XML documents for every XSLT parsing.
> 
> I am currently testing a patch which get's rid of the members m_docIn and
> m_docOut completely and just keeps the documents locally and frees them
> immediately after the parse, they are not used any more anyway and thus it
> is generally a good idea to throw them away quickly rather than keeping them
> around.

Sound appropriate. I did some refactoring of that class a while back and probably didn't realize I could potentially make m_docIn and m_docOut local variables.

> * I don't know much about KDE's signals, but it looks to me like
> signalResultFound leaks FetchResults if the FetchDialog is not shown. 
> When updating entries, the dialog is not displayed and thus the slot for
> "ResultFound" is not connected. However each fetcher allocates a FetchResult
> and passes it to the signal. I could not yet verify that it actually leaks,
> but valgrind reports some lost memory around that area...
> 
>     FetchResult* r = new FetchResult(...);
>     emit signalResultFound(r);

Even though the dialog isn't shown, the updater still gets the new information from signalResultFound(). It could certainly be leaking, but it's not just because the dialog isn't used. The update job may not be deleting something that it should.

> I'll update here with patches if my changes work for my collections.
Absolutely, thanks! Just in general, the fetch jobs aren't really written with large update loops in mind, but any memory leaks should definitely be fixed. The update job loop isn't all that efficient...one of these days, I may need to learn threads, too...
Comment 3 Dominik Stadler 2013-03-11 18:08:26 UTC
Created attachment 77962 [details]
Fix large memory leak for every xslt transformation that is done, e.g. as part of fetches

I tested with the attached patch to free up the memory for the XML documents as early as possible. With this in place memory increase is a lot less, maybe even zero, fetching many entries at once now works for up to 1000 entries without much memory increase. 

I will probably take another look which items are still reported by valgrind now, but for my usage it should already suffice for now.
Comment 4 Robby Stephenson 2013-03-12 03:58:29 UTC
Git commit 301ed4cce8a4c4aafa46de647bcfc1d0d55d4306 by Robby Stephenson.
Committed on 12/03/2013 at 04:56.
Pushed by rstephenson into branch '2.3'.

Free xmlDoc memory as soon as possible in XSLTHandler

Make the xmlDoc pointers local and free them as soon as possible
instead of waiting for the XSLTHandler destructor

Patch from Dominik Stadler
FIXED-IN: 2.3.8

M  +20   -24   src/translators/xslthandler.cpp
M  +1    -3    src/translators/xslthandler.h

http://commits.kde.org/tellico/301ed4cce8a4c4aafa46de647bcfc1d0d55d4306