Version: 2.1 (using KDE 4.1.3) OS: Linux Installed from: Ubuntu Packages When using the Konsole with several tabs (in general 10 or more), the memory consumption is raising to a quite high volume (between 1 to 2 GByte). This cause the whole system to have random slow downs due to memory swapping. I already checked some other similar bug reports which are already closed and got the hint that I should switch all scrollback buffers to unlimited (and Reset the scrollback buffer in each tab), so that the the scrollback buffer is in a file instead of being hold in the RAM. This doesn't have any effect on my Konsole application. I generally have a running Konsole with uptimes around 2-4 month and I often have at least 10 (often even more) tabs open. The konsole binary in this example is running 4 days with 10 tabs. # ps -eo rss,args | grep konsole 1047032 /usr/bin/konsole Snapshot of top: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 27255 daniel 20 0 2181m 1.0g 15m R 8 50.4 34:04.47 konsole # file /usr/bin/konsole /usr/bin/konsole: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.6.8, dynamically linked (uses shared libs), stripped # ldd /usr/bin/konsole linux-gate.so.1 => (0xb7f7b000) libkdeinit4_konsole.so => /usr/lib/libkdeinit4_konsole.so (0xb7e6d000) libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7d0f000) libkpty.so.4 => /usr/lib/libkpty.so.4 (0xb7d04000) libkio.so.5 => /usr/lib/libkio.so.5 (0xb7a87000) libkdeui.so.5 => /usr/lib/libkdeui.so.5 (0xb771b000) libkdecore.so.5 => /usr/lib/libkdecore.so.5 (0xb7502000) libQtCore.so.4 => /usr/lib/libQtCore.so.4 (0xb72d4000) libQtDBus.so.4 => /usr/lib/libQtDBus.so.4 (0xb7266000) libknotifyconfig.so.4 => /usr/lib/libknotifyconfig.so.4 (0xb7254000) libX11.so.6 => /usr/lib/libX11.so.6 (0xb7165000) libXrender.so.1 => /usr/lib/libXrender.so.1 (0xb715b000) libQtGui.so.4 => /usr/lib/libQtGui.so.4 (0xb6858000) libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0xb6769000) /lib/ld-linux.so.2 (0xb7f61000) libutil.so.1 => /lib/tls/i686/cmov/libutil.so.1 (0xb6764000) libz.so.1 => /usr/lib/libz.so.1 (0xb674e000) libstreamanalyzer.so.0 => /usr/lib/libstreamanalyzer.so.0 (0xb66e0000) libsolid.so.4 => /usr/lib/libsolid.so.4 (0xb6670000) libQtNetwork.so.4 => /usr/lib/libQtNetwork.so.4 (0xb656f000) libQtXml.so.4 => /usr/lib/libQtXml.so.4 (0xb652a000) libacl.so.1 => /lib/libacl.so.1 (0xb6522000) libattr.so.1 => /lib/libattr.so.1 (0xb651d000) libQtSvg.so.4 => /usr/lib/libQtSvg.so.4 (0xb64d0000) libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb64aa000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb649b000) libSM.so.6 => /usr/lib/libSM.so.6 (0xb6491000) libICE.so.6 => /usr/lib/libICE.so.6 (0xb6479000) libXtst.so.6 => /usr/lib/libXtst.so.6 (0xb6474000) libXcursor.so.1 => /usr/lib/libXcursor.so.1 (0xb646b000) libXfixes.so.3 => /usr/lib/libXfixes.so.3 (0xb6466000) libbz2.so.1.0 => /lib/libbz2.so.1.0 (0xb6454000) libgthread-2.0.so.0 => /usr/lib/libgthread-2.0.so.0 (0xb644e000) librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb6445000) libglib-2.0.so.0 => /usr/lib/libglib-2.0.so.0 (0xb638e000) libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb6375000) libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb6370000) libphonon.so.4 => /usr/lib/libphonon.so.4 (0xb632d000) libxcb-xlib.so.0 => /usr/lib/libxcb-xlib.so.0 (0xb632a000) libxcb.so.1 => /usr/lib/libxcb.so.1 (0xb6311000) libaudio.so.2 => /usr/lib/libaudio.so.2 (0xb62f9000) libpng12.so.0 => /usr/lib/libpng12.so.0 (0xb62d2000) libXi.so.6 => /usr/lib/libXi.so.6 (0xb62c8000) libXrandr.so.2 => /usr/lib/libXrandr.so.2 (0xb62c1000) libfreetype.so.6 => /usr/lib/libfreetype.so.6 (0xb624b000) libfontconfig.so.1 => /usr/lib/libfontconfig.so.1 (0xb621e000) libXext.so.6 => /usr/lib/libXext.so.6 (0xb620f000) libstreams.so.0 => /usr/lib/libstreams.so.0 (0xb61dc000) libxml2.so.2 => /usr/lib/libxml2.so.2 (0xb60a0000) libpcre.so.3 => /lib/libpcre.so.3 (0xb6076000) libXau.so.6 => /usr/lib/libXau.so.6 (0xb6073000) libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0xb606d000) libXt.so.6 => /usr/lib/libXt.so.6 (0xb601c000) libexpat.so.1 => /usr/lib/libexpat.so.1 (0xb5ff5000)
*** This bug has been confirmed by popular vote. ***
I can confirm this bug. Even if I do "Clear Scrollback" in all my ten open sessions, Konsole RSS size is still above 160MB which is absolutely insane, because a newly started Konsole with 10 bash sessions consume just 21MB of RAM (RSS value). Probably there's a leak in a font rendering code however it's just a wild speculation. I'm running KDE 4.2.1.
This bug status should be raised to Critical. Memory leaks sometimes lead to successful exploits.
> This bug status should be raised to Critical. > Memory leaks sometimes lead to successful exploits That won't do - I'd need actual evidence of a vulerability such as a working exploit first. An application using leaking memory is enough of a nuisance in its own right - playing the security card won't get it fixed any faster.
I'd like to chime in here as well - this is a problem for me when openSUSE 11.1, and KDE 4.2.3. currently, konsole (up for less than 24 hours) is consuming 86MB, more than 4x greater than the next highest consumer (excluding firefox).
Please fix this bug. With 10 tabs open an a scrollback of 30.000, Konsole can consume up to 800 MB (RSS). This is insane. It's unbelievable that this bug has been reported 6 months ago, been vote for by quite some people, and has not even been acknowledged by the developers. 10 tabs with 30.000 lines of scrollback, each line filled with say 100 characters, should need no more than say 10*30.000*100*2*1.5 = 85MB (if you allow for 16-bit characters and 50% data structure overhead). And this is already rather generous. Currently, using a large scrollback will lead to extended swapping periods or can activate the OOM who will kill one of your processes (-> data loss).
Bug severity should be changed to "Critical". The definition of "Critical" is as follows: crashes, loss of data, severe memory leak I am not sure if this is a memory leak or just highly inefficient data structures, but as already said, once the OOM killer kicks in and kills a process, loss of data is likely. My Eclipse IDE gets killed regularky due to the high RAM usage of Konsole.
This bug is already in a confirmed state (NEW) ;) And since developers of Konsole are unwilling to fix it, probably someone has to debug/valgring this bug on its own: http://www.cprogramming.com/debugging/valgrind.html http://www.faqs.org/docs/Linux-HOWTO/Valgrind-HOWTO.html I dared not do that since running an application under valgrind slows down it probably tenfold so working with such a slow Konsole isn't what I could bare.
> Please fix this bug. With 10 tabs open an a scrollback of 30.000, > Konsole can consume up to 800 MB (RSS). This is insane The memory figure that is more useful is private writable memory. Alternatively the 'memory' figures quoted by ksysguard or gnome's system monitor are sensible. > 10 tabs with 30.000 lines of scrollback, each line filled with say 100 > characters, should need no more than say 10*30.000*100*2*1.5 = 85MB (if you > allow for 16-bit characters and 50% data structure overhead) Konsole currently uses 11 bytes per character. 2 bytes for the character, 1 for formatting/other flags and 4 for foreground color and 4 for background color. Plus there is a certain amount of per-line overhead - I'd guess about ~30 bytes. Based on 30K lines of scrollback in 10 tabs with 100 characters per line that gives 324MB of expected memory usage. If significantly more than that is being used there is probably a leak somewhere which requires investigation. Leaks aside, this could be improved a lot, obviously, but I cannot promise when - if you want it sorted soon then get coding. > Bug severity should be changed to "Critical". > The definition of "Critical" is as follows: > crashes, loss of data, severe memory leak The severity of a bug is mostly determined by the number of times it gets reported, reflecting the number of people it affects. I'd imagine most people use the default setting of 1K lines scrollback or unlimited - in which case it does not affect them. > Currently, using a large scrollback will lead to extended swapping periods > or can activate the OOM who will kill one of your processes (-> data loss). Even when the memory usage is reduced, if you have 30K lines of output that you care about then there is still the problem that Konsole doesn't keep the output backed up on disk anywhere. If you close a tab or exit the shell accidentally you will currently lose everything. It would undoubtedly be useful if the terminal kept a log of its output somewhere persistent but in the meantime if you should be logging to a file or something less fragile.
> And since developers of Konsole are unwilling to fix it, > probably someone has to debug/valgring this bug on its own: Not so much unwilling as otherwise occupied. You don't have to have a special badge and an initiation ceremony to contribute though. Anyone is welcome to submit patches at reviewboard.kde.org
I can see that at least in the source the problem is acknowledged and marked as critical: http://websvn.kde.org/trunk/KDE/kdebase/apps/konsole/src/History.cpp?revision=863326&view=markup But as it seems there are at least 2 issues to be solved here: (1) memory consumption does not decrease after clearing the history, closing the tab or changing the history type (memory leak!!!) 2. redundant way of storing the color per character; however that would be ok if there was a way to free up that memory again Is it possible that (1) is just a simple bug caused by a missing destructor, missing delete or similar? If yes, fixing that would at least be a good short-term solution. Improving (2) is obviously more complicated.
> I can see that at least in the source the problem is acknowledged > and marked as critical Those comments were written back in the KDE 3 days and I believe do not refer to a leak but just that if you set the number of scrollback lines to a very large number, open many tabs and fill all those buffers then a lot of memory will be used - see my calculations above. > Is it possible that (1) is just a simple bug caused by a > missing destructor, missing delete or similar? I need to confirm that there is actually a leak first - not just a lot of memory being used as a result of many tabs and a large scrollback buffer in each. I valgrinded a simple Konsole session with 1K scrollback opening tabs, filling the scrollback buffer and then closing the tab but did not see any relevant problems in the leak report. It would be helpful if you could try as well using: valgrind --leak-check=full --show-reachable=yes konsole --nofork
My valgrind run did not show any major leaks. But I am working on some improvements to reduce mem usage. If I come up with something useful, I'll send you the patches.
Created attachment 34358 [details] Patch for konsole 4.2.4, reduces RAM usage
Here's a patch that drastically reduces mem usage when a fixed size history is used. History lines are stored as an array of chars when possible (i.e. when color and rendition flag of all characters in the line are equal). When not (easily) possible, it falls back to the current way of storing history lines. Memory usage has been cut down from about 171MB to 55MB for a konsole process with 5 tabs, 30000 lines scrollback, 65 chars per line.
(In reply to comment #12) > > I can see that at least in the source the problem is acknowledged > > and marked as critical > > Those comments were written back in the KDE 3 days and I believe do not refer > to a leak but just that if you set the number of scrollback lines to a very > large number, open many tabs and fill all those buffers then a lot of memory > will be used - see my calculations above. > > > Is it possible that (1) is just a simple bug caused by a > > missing destructor, missing delete or similar? > > I need to confirm that there is actually a leak first - not just a lot of > memory being used as a result of many tabs and a large scrollback buffer in > each. I valgrinded a simple Konsole session with 1K scrollback opening tabs, > filling the scrollback buffer and then closing the tab but did not see any > relevant problems in the leak report. > > It would be helpful if you could try as well using: > > valgrind --leak-check=full --show-reachable=yes konsole --nofork Robert, definitely there's a memory leak, since I've already indicated that after closing all bash sessions in Konsole doesn't reduce Konsole's memory usage. As a simple experiment, open an empty Konsole, set history Konsole's history scollback value to 30 thousands lines, observe its RSS value (15M in my case). Then open 10 bash sessions, cat 50K lines file in each of them, then open a new bash session and close 10 'old' bash sessions. Something tells me that RSS value of Konsole process should be restored. In my case I see 664MB of memory STILL occupied by Konsole. That's insane.
Created attachment 34360 [details] Proof screenshot. $ ps fax --- cut --- 21181 ? Sl 0:11 /usr/bin/konsole 21496 pts/12 Ss 0:00 \_ /bin/bash 22275 pts/12 R+ 0:00 \_ ps fax --- cut --- Konsole runs only a freshly started bash session (after closing 10 others).
Created attachment 34361 [details] ms_print for Konsole valgrind --tool=massif konsole --nofork ms_print massif.out Probably someone can spot something wrong here (or probably I have to rebuild Konsole with debug symbols).
Review request for above patch created: http://reviewboard.kde.org/r/802/
Hi Michael, Thanks for the patch. Approach looks sensible. Rather than having two separate types of line an alternative would be to have a formatting array for each line which specifies the start position and index into the character array for each change of format. Something also worth noting is that QVector/QList objects have a high overhead per-instance (somewhere between 20-30 bytes IIRC) plus the memory allocator's own overhead - might be worth looking at QVarLengthArray instead. reviewboard is being funny at the moment - I'll post comments later on. Unfortunately KDE 4.3 is now in deep-freeze (small fixes only) but I should be able to get this in for 4.3.1. More urgent though is tracking down actual memory leaks - as opposed to inefficiency. It might be possible to push fixes for that into 4.3.0, I'd have to check with the release team.
I agree that the legacy format should be improved as well. If a user has only multi-color lines in the history, he'll get no improvement. The idea in the current solution is to keep the most frequent use case (single-color lines) as memory-efficient as possible. Once we completely get rid of the old-style storage format, e.g. by implementing a formatting array for multi-color lines as you suggested, the overhead of QVector/QList should be negligible as it is only per tab and not per history line. I might have a look into that. Until the source tree is open again for new features, I'll run (and maybe improve) the changed version on my own system to test it further.
It's great that you are trying to reduce Konsole memory consumption but the root cause of the problem hasn't even been identified.
> Then open 10 bash sessions, cat 50K lines file in each of them, > then open a new bash session and close 10 'old' bash sessions. > Something tells me that RSS value of Konsole process should be restored. Repeat that process 3 or 4 times and see if the memory usage stays at the same figure it was after the first 'cycle' or whether it goes up by the same amount each cycle.
OK. 1) Close all konsole applications/windows. 2) Run Konsole with the only session. Observe top values in xterm: 21668 birdie 20 0 73764 15m 11m S 0.0 0.8 0:00.33 konsole 3) cat big_file; open a new session; close `cat big_file` session; top: 21668 birdie 20 0 136m 80m 11m S 0.0 4.2 0:01.71 konsole 4) repeat 3) 21668 birdie 20 0 136m 80m 12m S 0.0 4.2 0:02.98 konsole Strange, opening and consequently closing *one* bash session, makes konsole stay at the same RSS usage level. * Let's check the case by opening by cat'ting a big file in two konsole sessions: 21668 birdie 20 0 201m 144m 12m S 0.0 7.7 0:06.81 konsole * And in three Konsole sessions: 21668 birdie 20 0 266m 209m 12m S 0.0 11.1 0:10.54 konsole * And in four Konsole sessions: 21668 birdie 20 0 331m 274m 12m S 0.0 14.5 0:15.37 konsole So, unless there's "Undo close tab" feature in Konsole, there's a major memory leak.
Created attachment 34367 [details] big_file :) Every time when I cat this file Konsole's RSS value increases by 65MB. It contains 49 165 lines. My Konsole terminal session has these parameters: COLUMNS=158 LINES=57 History scroll back size is set to 32 768 lines.
I confirm that the VmData of konsole does not decrease when closing a tab with a huge history, so there has to be a leak. And I made a small malloc/free testcase to double-check my measurement tool: indeed, when free() is called, VmData and VMRSS decrease accordingly. I don't know konsole enough to debug this, but I used gdb a bit and I do see a lot of destructors being called when closing tab: ~Screen and especially ~HistoryScrollBuffer which deletes _historyBuffer (and that took a really long time, so surely that was big). And yet VmData didn't change at all. Puzzling.
Can you try the test case also with new/delete? Is all memory freed with "delete" returned to the system? It depends on the implementation whether freed memory is returned to the OS, and when. And there might be a difference between malloc/free (might be based on mmap or brk) and new/delete. Btw, mem usage of patched version (after the last step, 4 tabs open, big_file cat in each one): 20512 mm 20 0 345m 61m 11m S 0 3.1 0:08.43 konsole
I'm in a process of upgrading to Fedora 11 (its ISO image is already distributed on mirrors). As soon as I complete it, I will try your patches. In my comment 24 "consequently" means "subsequently" :) Sorry for my English.
I've been away or I would have joined the conversation sooner. The current method stores each character as a 11-byte struct? (Comment #20 makes me think yes.) Also, I think konsole is optimizing the paint anyway, to combine chars with same format? A possibility that occurs to me is to store each line as a packed array with run count and one or more runs, where a "run" looks like: 2: total bytes in run (for easy skipping to next run) 1: attributes present 1: format/flags (optional, if specified in 'attributes present') 4: foreground color (optional, if specified in 'attributes present') 4: background color (optional, if specified in 'attributes present') 2*k: char (k >= 1, a wchar_t*, NUL omitted if possible) This would minimize the space used to that needed to record the content (no overhead from storing text and formatting separately) and still allows the characters of a run to be passed directly to the paint function. The overhead for a line with no formatting (usual case) goes from 9*k bytes to 10 bytes. Of course, one advantage to separating text and formatting is that copying text is easier.
> The current method stores each character as a 11-byte struct? Yes and has done for years (or at least, since I inherited the code) > Also, I think konsole is optimizing the paint anyway, > to combine chars with same format? Yes > A possibility that occurs to me is to store each line as a packed > array with run count and one or more runs, where a "run" looks like That would work as well - although as pointed out above there appears to be a leak somewhere which is more urgent.
#29, yes that's about what was proposed in #20 Don't forget that attributes are always present, because the characters of a line have always at least 1 bg and fg color, even if that's the same for all of them. Anyway, I have changed my original patch and implemented this. I'll post the 2nd version soon to the review board.
Michale, have you measured the performance impact of your patch? Konsole is known to be the fastest X terminal emulator application and I hope it will remain so. Thanks.
I think it's negligible. It's only slightly slower. User time to print 3x500k lines in 3 tabs (with a 30k lines history): - old: 20.74s - new: 21.62s
That's impressive (considering memory savings), thanks!
> #29, yes that's about what was proposed in #20 Was it? I understood that as separate arrays, whereas my suggestion was everything packed into one unified buffer. (For that to be most useful, however, you need to eliminate some abstraction and tie it pretty closely with the painting code.) > Don't forget that attributes are always present Of course. As a design point, I assume that the buffer always starts with a full attribute block. (While you *could* optimize that out - most lines are the same, after all - you're now looking at quite some performance overhead searching backwards to find out what are the current attributes for not enough gain.) Personally, if I were doing the work, I think I would be doing something with the 'everything is optional' attribute bitmask. But really, you need the storage to tie directly into the painting code if you're going to get maximum efficiency out of that. And there are still trade-offs to each possible approach.
#35, the two approaches share that the format information is extracted from each character and only stored once for each block of consecutive chars that have the same attributes. That reduces the amount of space required per line roughly from 11*n+x to 2*n+x. The difference is that my current solution does not use one big block but uses a QList to store a pointer to each line. This makes the implementation rather straightforward compared to a self-implemented buffer, which would be much more error-prone. What you basically need is a fixed-size FIFO to store the lines and that's easily achieved using a QList. That might not be the most efficient way of storing the history, but it certainly makes a lot of difference to what there is now. Konsole currently allocates one big block and puts everything there, but such an implementation is hard to maintain IMHO. I rather rely on existing data structures. When the size of the history changes, you have to allocate a new block and copy everything from the old to the new buffer. During this time, 2 buffers will be on the heap and thus increase the heap size, and there's no guarantee that this memory will ever be returned to the OS. There is certainly room for improvement if you want to squeeze out the last bits: - convert everything to utf-8 (might save you another 50% in the best case) - use any kind of compression for character data (e.g. RLE) - store the actual raw output which was printed to konsole originally and re-interpret it when you need to paint it... - etc.
Short version: Without actually testing it, I think the patch is good. There is maybe room for further tweaking, but for the "level of invasiveness" I would still push the current patch or something very similar for the short term. We all seem to agree on RLE'ing the formatting and making the history into some sort of line-array. Long version: > The difference is that my current solution does not use one big block but uses > a QList to store a pointer to each line. My intent was still to use some array structure (e.g. QList) to store lines, just that each line is a compressed buffer containing both format and text in one memory block. There is a slight savings in memory, also there is savings from the only-what-changed format blocks. There is a penalty to extracting text hunks e.g. for copying, and you don't save a /whole/ lot by combining buffers. The improved efficiency painting is probably nice, but you can only take advantage of that with direct access to the format buffer from the paint routine. (And ideally you would want to apply that methodology /everywhere/, including the content of the tty. Preferably in such a way that you no longer even need to build the format block for the history buffer, because you already have it built from the tty. You could certainly build all of that onto the latest patch at a later date.) > There is certainly room for improvement if you want to squeeze out the last > bits: > - convert everything to utf-8 (might save you another 50% in the best case) How is text handed off to Qt for painting, or put on the clipboard? Specifically, I wonder if you would lose anything converting to UTF-8, in which case IMO this should simply be done ;-). (But... later, probably.) > - use any kind of compression for character data (e.g. RLE) Conversely, this would be horribly inefficient when you need to actually get the text. You would need to allocate buffers for decompression, decompress, etc. Probably not worth it? > - store the actual raw output which was printed to konsole originally and > re-interpret it when you need to paint it... I actually considered that as well... but it wouldn't work very well. The main advantage would be if you could correctly re-flow text, but that breaks quickly (e.g. as soon as you throw anything tty-aware into the mix, /especiallly/ curses applications). And if you only store format-change codes, I think the ANSI format-change codes end up being just as many, or possibly /more/, bytes than simply storing the format changes. Something else I noticed; I *think* Konsole remembers the difference between 16-color codes (ANSI), 256-color codes (xterm), and 24-bit color codes (Konsole). Which means that you should be able to pack most color changes into fewer than 4 bytes; even into 1 byte for both fg/bg if they're ANSI.
> Michale, have you measured the performance impact of your patch? > Konsole is known to be the fastest X terminal emulator application > and I hope it will remain so. Fast in what way? The traditional test of a terminal is 'how long does it take to print X lines'. This has some use in terms of measuring the throughput of the back-end code but it doesn't make that much difference in day to day work. Both Konsole and gnome-terminal 'cheat' to do well in this by not drawing every line of the output but if a lot of output is received in succession then the updates are buffered and in the end only four or so re-draws of the screen are done in a second. xterm by comparison appears to be a lot slower if you do this even though it can actually render the whole window much quicker because of the simple text rendering. Other than keeping reviewers who want to benchmark Konsole happy, the one genuine reason to worry about 'printing X lines quickly' speed is not slowing down an application too much if it happens to spew large amounts of output to the terminal. For me the thing that really matters is having quick tab/screen switching and smooth scrolling even when the window is large. > User time to print 3x500k lines in 3 tabs (with a 30k lines history): > - old: 20.74s > - new: 21.62s That's fine by me given the memory reductions.
#37 In this case, I fully agree with you. One heap allocation per line would be more efficient. On the other hand, I guess at least 90% of all lines ever in a history will not contain multiple colors, but that strongly depends on the use case. If you can keep the per-line overhead for these lines equally small in the combined buffer approach, that's fine. If not, a polymorph approach (like in my first patch) using a minimal line type for single-color lines and a more complex one for formatted lines might be better. Plus, we already have all code necessary for that. How about taking the compact line class of the second patch and use it for multi-color lines, adding the minimal line class from the first patch and use it for single-color lines, and create the final patch out of that? I volunteer to do that, however, I won't invest the time in any other, maybe more elaborate solution. What does the maintainer think about that?
Yes I made a QVector-based testcase too (with the actual konsole data structures in them), and it works as expected. VmData shrinks as soon as it's deleted. To answer my own "puzzling" remark above: this is because of implicit sharing; the items in the vector are copied from somewhere else which is still using them, and that's why the memory is not going down. IMHO the discussion in this bug report is really mixing two problems. How to make the memory usage smaller is a good topic, but is unrelated to the actual bug, which is that the memory is not freed at all, something is keeping all that data somewhere...
On my machine, VmData does not shrink. I have tested several applications, and the results are the same. For example take "kate": 1. Open some big files in the editor and watch VmData grow 2. Close the editing windows but leave kate open 3. VmData will never shrink Other example: "konsole" - we all know about that already Yet another example: "dolphin" 1. Open some windows or tabs and watch VmData grow 2. Close all but one 3. VmData will never shrink Same with "okular": 1. Open a large PDF file and note VmData 2. Open a small PDF file - okular should free the mem needed for the large PDF file 3. VmData does not shrink So, the question is, why is VmData shrinking in your test app but not in all these apps. I actually doubt they are all affected by serious mem leaks.
Hi Michael, I have been testing your patch for a couple of days and it is good to go into trunk soon but I would prefer to find the root cause of the memory 'leak' first.
Michael, but what if they are all affected? Jokes aside, glibc malloc()/free() implementation has very serious problems when dealing with a lot of small memory allocations - this is why Firefox for Linux uses its own memory allocator. See this bug: https://bugzilla.mozilla.org/show_bug.cgi?id=130157 and its duplicates. Even Qt 4.5.x started to offer its own allocator - probably you have to retest all these applications after recompiling Qt with -ptmalloc configure flag.
I discovered a problem with copy&paste from the history in the current patch. I am working on resolving that. As for the mem leak: I am not sure one exists. Valgrind shows some lost blocks, but they seem all related to QT's font classes. I can't find anything in the history code or in konsole itself. The reason for VmData not shrinking (most of the time) upon mem being freed by "free" or "delete" is IMHO not caused by anything in the konsole code but rather is a normal consequence of heap fragmentation. And as I described in my last post, this can be observed in other applications, too. If you are lucky enough to free the last block allocated on the heap, the system might be able to return everything until the end of the but-last block. Otherwise, nothing can be freed as this would require heap compaction. At least this is the case with small blocks in the brk() area. Large blocks are usually allcated using mmap and could be freed independently, but for the history that's not really useful, unless you want to use mallopt() and implement some sort of customized allocation and partitioned heap.
Michael, in comment #24 I clearly showed that there's a huge whatever-you-call-memory-not-returned-after-closing-a-tab-issue, and numerous people report the same and you still say that everything's fine? It's all in the ways of God ;)
Artem, don't get me wrong, I don't say that everything is ok with the current version. Its memory usage is very high, but I doubt there's a leak. If the memory usage stays constant after a certain number of actions, that is a good sign that there's no leak. Plus, valgrind also shows no leaks. My conclusion was that it's all due to to heap fragmentation, and the link you posted gives exactly the same explanation. Even if you have multiple tabs, they all share the same heap. And if the heap gets fragmented, little memory can be returned to the OS. Meanwhile, the high memory consumption has been greatly reduced. In my tests with 6-8 tabs and 30.000 history, it now hardly ever reaches 100 MB. So just way for the patch to be included, I think you'll be satisfied then.
I have one more follow-up that substantiates the heap-fragmentation theory: I ran konsole with an alternative malloc implementation I took from http://mr.himki.net/OpenBSD_malloc_Linux.c For comparison, here's the data size of konsole with two different implementations of malloc: glibc malloc: - 1 tab (after startup): 20m - 10 tabs (30k history): 107m - 1 tab (closed all but one): 104m OpenBSD mallloc ported to Linux: - 1 tab (after startup): 20m - 10 tabs (30k history): 121m - 1 tab (closed all but one): 32m (!) If you compile and preload this library (LD_PRELOAD=/path/malloc.so konsole), almost all memory is freed once you close a tab. This malloc obviously is able to handle the allocation/de-allocation sequence of konsole much better than glibc's, at the cost of a slightly higher maximum heap size.
Michael, thank you for this experiment! I suppose this bug can potentially affect Konqueror and great many other KDE applications but as my surmise is now proved right (see comment #43) at least Konsole should have its own memory allocator. With OpenBSD malloc allocator there can be a problem - when I ran Firefox under it, Firefox crashed quite frequently. Can you confirm that Konsole is stable for you? And now I wonder if it is possible to preload OpenBSD allocator for all applications using libQt (by overriding libQt's malloc/free glibc usage).
First of all, I wouldn't consider this a bug (unless crashes happen, but that might be a bug in Firefox), not in konsole nor in the allocator. There's always a tradeoff between execution speed and memory usage. I didn't benchmark the OpenBSD allocator against the glibc one in terms of speed. The glibc allocator presumably prefers speed (cache locality, multi-threading performance, etc.) over memory usage and I wouldn't be surprised if it's faster than the OpenBSD one. But there's probably room for improvement there. Secondly, I am not sure whether applications should start developing their own allocators. Given the glibc allocator's inefficiencies in some cases (like this one), I'd rather wait until this allocator gets improved. If that does not happen, the choice of allocator probably should go into the Linux distribution or in the core KDE libs. Many KDE applications are affected and therefore the effort shouldn't be repeated in every application. It is certainly possible to put LD_PRELOAD into /etc/profile or your user profile, but then it will affect all applications. If you want the OpenBSD malloc to be used by QT applications only, I guess you'd have to use some custom script to launch applications that does some ldd magic.
Hi Michael, Perhaps we could use mmap() to do memory allocation for just the history buffer then? With your current patch there is one buffer per line for the character data and one for the formatting data. Perhaps an alternative would be to have two ring buffers, one for the character data and one for the formatting data. I believe these buffers would then be large enough that they would be allocated using mmap() rather than from the memory zone glibc uses for smaller chunks of data?
Two buffers are only allocated for multi-color lines. But you even need a total of three allocations for such lines: one for the actual HistoryLine object and another two for the two arrays (character data, format). Single color lines have only character data, so there are only two allocations in total. As we know now, this stresses the heap quite a lot. In theory, it would be possible to use mallopt to force mmap allocations also for small buffers, but without an algorithm to fit multiple lines into one buffer we would waste a whole page per line. The default size at which glibc uses mmap for allocations is 1MB. The general problem with one large buffer is, that you don't know in advance how much space you need for n lines of variable length. If the user wishes n lines of history, you would need to estimate how big the buffer must be. If it's too large, you waste space again. And if it's too small, you must reallocate+copy and I don't know about the implications on performance of such an approach. A possible solution would be to use a variable-length list of fixed size buffers that are mmap-allocated. Each buffer contains a variable number of history lines and once such a buffer fills up, a new one is allocated and added to the list. If all lines of a buffer are removed from the history, you can unmap the whole buffer.
I'm not a programmer at all, but here's my solution (probably a complete rubbish :) ). Say a user have N lines of history enabled. Most users run their console maximized and have a known window width (shell variable $COLUMNS). Also most applications/user activity doesn't fill up the history instantly so we can safely allocate/mmap memory buffer = N*$COLUMNS*OVERHEAD/10 at a time. If a user reaches N/10 lines of history we allocate yet another buffer of E size. If a user decides to close a tab, then we can easily munmap(E*N/10*allocated_buffers). Of course, this approach doesn't take into consideration a possible resize operation of a Konsole window or font size changes. But in this case we can dynamically reallocate memory and change buffer size. Another problem is that this approach doesn't take into consideration an endless history. But in this case we can allocate buffers by 10K lines at a time (or any other history lines number). Just my $.02.
I've just posted yet another version of the patch to the review board. The patch now uses a custom heap (list of mmap'ed blocks) to store the history, similar to what was proposed in comment 50. The data size now decreases when the history is cleared or tabs are closed. As a bonus, memory consumption is a little lower, as we can allocate memory more efficiently on this custom heap.
Now that 4.3 has been branched off trunk, how are the chances of the latest patch getting merged into trunk?
SVN commit 987102 by knight: Greatly reduce memory usage required by Konsole's scrollback buffer and improve releasing of scrollback memory to OS when no longer needed. * Reduce amount of memory required for representing characters in scrollback buffer by not storing formatting data for each character. Instead store UTF-16 characters and formatting ranges * Allocate memory for scrollback buffer use a custom mmap-based memory pool - this allows the memory to be released to the OS sooner when freed. There is a small performance cost when printing a large number of lines - see bug report for figures. Patch by Michael Meier REVIEW: http://reviewboard.kde.org/r/802/ BUG:176974 M +13 -0 Character.h M +283 -0 History.cpp M +145 -0 History.h M +1 -1 SessionController.cpp M +1 -1 SessionManager.cpp WebSVN link: http://websvn.kde.org/?view=rev&revision=987102
Hi Michael, Thank-you for your time on this. I will test it a little more locally and it should be possible to backport this for a KDE 4.3.x point release.
Michael and Robert, thanks for working on this bug, but I have one request. Can you please create the simplest code (say, one C++ file using Qt includes) which deliberately "leaks" memory using the same memory allocation/free as it's done in "old" Konsole from KDE 4? So, this code in its basic code just allocates say 250000 random sized objects then frees them, then we see that memory is *not* returned to the underlying OS. I think Qt developers *should* know about this problem, because I suspect there can be other places where Qt "leaks" memory the same way, and there could be other applications which suffer from this Qt/glibc bug/feature/problem.
Artem: I believe your conclusion is wrong. See what I said in comment #40: there is no such bug in Qt/glibc AFAICS, VmData does shrink when destructing the vector, as my testcase showed. The memory _is_ returned to the OS. From what I can see, the bug was in konsole.
However no one could spot a memory leak using valgrind. ;) I suppose it won't hurt if someone wrote a testcase so that we could verify my hypothesis.
David, When I run unpatched Konsole under OpenBSD_malloc_Linux.c (which is located here http://mr.himki.net/index-alloc.html ) it returns _all_ the allocated scrollback memory when I close tab(s). So, no matter how you call it this problem/bug/etc. definitely affects Qt and Qt based applications at least when Qt is running under Linux/glibc. I'm going to report these findings to Nokia/Qt software.
> I'm going to report these findings to Nokia/Qt software. To clarify, we do not know that Qt is doing anything 'wrong' and nobody has yet done enough testing with Konsole or pieces of it to have a clear picture of what is going on. Someone at Qt Software may be aware of a similar problem in another application though.
Robert, Can you, please, create a simple C++ application like the one I've outlined in comment #57? My skills in C++ programming are almost non-existent. :(
Artem: your findings point at a bug in glibc then, not at a bug in Qt :-) About a testcase: I made one but it _was_ returning all the memory even with standard glibc malloc... so it wouldn't help you much, would it?
David, as it was mentioned earlier, Ulrich Drepper and other glibc developers treat this bug as a feature, please, read comments in this file: http://sourceware.org/cgi-bin/cvsweb.cgi/libc/malloc/malloc.c?rev=1.200&content-type=text/x-cvsweb-markup&cvsroot=glibc This only means that Qt developers should use their own memory allocator under linux for all memory allocations or at least for allocations which are not multiple of a page size (usually 4K). And probably your testcase doesn't use/have the same scrollback history memory structure as konsole does.
There's one more problem that drives me mad. Over time almost all KDE 4.x.x applications consume more and more memory even when I don't actively interact with them. Most visible are: Plasma (47m at startup here, up to 90m after a week of usage) KWin (32m at startup, up to 70m after a week of usage with zero applications running, except those which are minimized to systray) etc. I never shut my PC down but almost every week or two I have to restart my KDE session to free some memory (and mind that I have 2GB of RAM). I don't have desktop effects (OpenGL) enabled. I have no idea why no one is paying attention to this problem and it seems like developers each have 8GB of RAM installed on their PCs. Of course, Konqueror is the worst of all, I've recently filed a bug 200439 report against it.
I'm sorry for flooding but I've just found a number of very good explanations why libc malloc doesn't return memory to the underlying OS more often than it does: http://forums11.itrc.hp.com/service/forums/questionanswer.do?admit=109447626+1247830987898+28353475&threadId=110107 (Subject: Subject: free() not really freeing?) and http://www.linuxquestions.org:80/questions/programming-9/how-does-malloc-and-free-functions-work-262165/ (Subject: how does malloc() and free() functions work?) and comment here: http://digg.com/software/Firefox_Memory_Leak_Problem_Explored (starting with "(This is slightly expanded from the comment I made on an older Firefox memory thread on Digg today.) The problem for many people is *NOT* a "memory leak". It is a heap memory fragmentation problem. The following statements apply to Linux systems though I suspect they may be a consideration under Windows as well. Firefox is built on top of C++ and C, those use the GNU "libc" ANSI C standard malloc() and free() to allocate memory.")
Is it possible that the old implementation used 16 bytes per character on a 64bit machine due to alignment? This would explain some gap between the calculated data and reported data.
I think the patch could now be merged into the 4.3.x branch. I've been using it without problems for several weeks now.
*** Bug 228037 has been marked as a duplicate of this bug. ***
Created attachment 73539 [details] malloc() testing program Just for posterity, the observed "memory leak" in comment 24 is not a memory leak and deserves a proper explanation. Try the attached program: g++ testmem.cpp ./a.out You will want to run something like watch -n0.3 "ps aux | grep a.out" in another konsole while it's running. For the non-developers, I'll explain what exactly happens here. The program allocates half a gig of memory in a single chunk and writes to the memory (step 1), then frees it (step 2). This memory will be immediately returned to the system. Next, it will allocate half a gig in 32k chunks and write to each chunk (step 3), then free each of the chunks (step 4). This memory is not immediately returned to the system. Steps 5 and 6 are a repeat of steps 3 and 4, but they forcefully return the used memory to the system, and so you see the memory usage drop again. So what happened, why wasn't the memory returned to the system after step 4, it must be a memory leak, right? Wrong. Run the program in valgrind and you'll find that there's no memory leak, and a cursory viewing of the source will support that observation. First of all, it's important to note that memory allocations are (very) expensive operations, and a typical program does A LOT of them, and more specifically, the typical program does a lot of small memory allocations which last for only a small fraction of the lifetime of the program. Consequently, it's a worthwhile practice to try and optimize for this very common use case of many short lived mallocs, and glibc's malloc implementation does exactly that. For memory allocations below a configurable size threshold, they are drawn from a pool of memory entries which are allocated as needed and cached for later usage, even after being free()d. This means that when we repeated the existing steps 3 and 4 in as steps 5 and 6, the 2nd round of malloc()s would be essentially free (and as you can see, there is no increase in memory usage), instead of being very expensive like the first time. Memory allocations of a size larger than the configurable threshold are not drawn from a special pool and are not cached for later use, because such larger memory allocations are rarely made in any meaningful volume within the same program execution context and have a substantial memory cost associated (if they were to be kept around like the smaller chunks). The malloc implementation in glibc is (very much by necessity) a very general malloc that performs well in a wide variety of uses both in terms of CPU time spent and memory usage. It is also a very tunable malloc implementation. Sometimes there are degenerate cases that a general implementation cannot handle well, and that's why functions like malloc_trim() exist. In this case, it might be appropriate to call malloc_trim() after closing a tab/konsole with a large (or non-unlimited) scrollback buffer, as konsole can have relatively unusual bursty patterns of memory usage, which if allocated in small chunks, will not be returned to the system for awhile.