Bug 115670 - Reload needs to be more efficient
Summary: Reload needs to be more efficient
Status: RESOLVED FIXED
Alias: None
Product: kst
Classification: Applications
Component: general (show other bugs)
Version: 1.x
Platform: unspecified Solaris
: NOR normal
Target Milestone: ---
Assignee: kst
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-11-04 12:55 UTC by Nicolas Brisset
Modified: 2008-08-29 23:02 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Nicolas Brisset 2005-11-04 12:55:01 UTC
Version:           1.2.0_devel (using KDE 3.4.0, compiled sources)
Compiler:          gcc version 3.4.3
OS:                SunOS (sun4u) release 5.8

- take a datasource that does not implement reset() and outputs some trace when entering the contructor/destructor
- create 4 plots with 1 curve in each (5 vectors total)
- hit reload: the datasource is built/destroyed 5 times
- add two vectors 
- hit reload: the datasource is built/destroyed 7 times

As it may take a significant amount of time to initialize a datasource, this behavior is very wrong and should be fixed !
Comment 1 George Staikos 2005-11-04 17:50:30 UTC
On Friday 04 November 2005 06:55, Nicolas Brisset wrote:
> - take a datasource that does not implement reset() and outputs some trace
> when entering the contructor/destructor - create 4 plots with 1 curve in
> each (5 vectors total)
> - hit reload: the datasource is built/destroyed 5 times
> - add two vectors
> - hit reload: the datasource is built/destroyed 7 times
>
> As it may take a significant amount of time to initialize a datasource,
> this behavior is very wrong and should be fixed !


  Sorry, but this begs the question: why not use reset()?  Reset was designed 
to fix just this case.
Comment 2 Nicolas Brisset 2005-11-07 10:18:31 UTC
Of course, using reset() is an option. But I still wonder why we would need to construct/destroy the datasource so many times just to reload it...
This behavior seemed wrong to me, and I just happened to have added some traces that evidenced it. As I was not sure it was a known "feature", I thought filing a bug report was appropriate so that the code gets cleaned up in case it is needed (especially as it may have a dramatic impact on kst performance for some datasources not implementing reset, including some in the official distribution like CDF, blame me). 
Feel free to close this bug if what happens is necessary for some reason, even unclear to me :-)
Comment 3 George Staikos 2005-11-07 17:34:02 UTC
On Monday 07 November 2005 04:18, Nicolas Brisset wrote:
> 10:18 ------- Of course, using reset() is an option. But I still wonder why
> we would need to construct/destroy the datasource so many times just to
> reload it... This behavior seemed wrong to me, and I just happened to have
> added some traces that evidenced it.


   It's mostly because the datasources expected this, and well, how do you 
"reset" a datasource?  If it has no reset() function, then the only choice is 
to destroy it and create a new one.

> As I was not sure it was a known 
> "feature", I thought filing a bug report was appropriate so that the code
> gets cleaned up in case it is needed (especially as it may have a dramatic
> impact on kst performance for some datasources not implementing reset,
> including some in the official distribution like CDF, blame me). Feel free
> to close this bug if what happens is necessary for some reason, even
> unclear to me :-)


  No, this mechanism will remain the fallback as long as reset is not 
implemented.
Comment 4 George Staikos 2005-11-07 17:34:52 UTC
Required behaviour.
Comment 5 Nicolas Brisset 2005-11-08 11:09:04 UTC
>   It's mostly because the datasources expected this, and well, how do you 
> "reset" a datasource?  If it has no reset() function, then the only choice is 
> to destroy it and create a new one. 
Sure, when reset is not implemented you can only destroy it and create a new one. This is *not* what I was wondering about...

The thing that surprised me was that you'd need to destroy it and create a new ones 10 times if you are using 10 vectors from that datasource, 15 times if you are using 15, etc. To me, it sounded like destroying the datasource instance and recreating one should be enough to allow access to all its vectors. 

As a matter of fact, trying to evidence my point, I noticed that 
- with cdf (which implements reset, please ignore my wrong comment above!), the reset() method gets called as many times as there are vectors
- with another datasource not implementing reset(), the constructor gets called as many times as there are vectors, after which the initial datasource instance is destroyed. If you reload more than once, only the first instance created in the previous loading gets killed, which looks to me like a potential memleak...

Once again, I understand the need for a reset() method or destroying and creating a new datasource as fallback, but I still wonder why it depends on the number of vectors provided by that source. Pardon me for insisting on this, but I definitely feel there is something strange there. Why should we call reset() more than once, when it actually resets the whole datasource instance so that one call would be enough, and why would we need x instances of a given datasource if it provides x vectors (especially when they are not destroyed later) ? 
Comment 6 George Staikos 2005-11-09 05:57:37 UTC
On Tuesday 08 November 2005 05:09, Nicolas Brisset wrote:
> >   It's mostly because the datasources expected this, and well, how do you
> > "reset" a datasource?  If it has no reset() function, then the only
> > choice is to destroy it and create a new one.
>
> Sure, when reset is not implemented you can only destroy it and create a
> new one. This is *not* what I was wondering about...
>
> The thing that surprised me was that you'd need to destroy it and create a
> new ones 10 times if you are using 10 vectors from that datasource, 15
> times if you are using 15, etc. To me, it sounded like destroying the
> datasource instance and recreating one should be enough to allow access to
> all its vectors.
>
> As a matter of fact, trying to evidence my point, I noticed that
> - with cdf (which implements reset, please ignore my wrong comment above!),
> the reset() method gets called as many times as there are vectors - with
> another datasource not implementing reset(), the constructor gets called as
> many times as there are vectors, after which the initial datasource
> instance is destroyed. If you reload more than once, only the first
> instance created in the previous loading gets killed, which looks to me
> like a potential memleak...


  I don't think a leak is possible.

> Once again, I understand the need for a reset() method or destroying and
> creating a new datasource as fallback, but I still wonder why it depends on
> the number of vectors provided by that source. Pardon me for insisting on
> this, but I definitely feel there is something strange there. Why should we
> call reset() more than once, when it actually resets the whole datasource
> instance so that one call would be enough, and why would we need x
> instances of a given datasource if it provides x vectors (especially when
> they are not destroyed later) ?


  OK calling reset that many times is clearly also slow.  I need to come up 
with a solution for how to fix this though.  It will take some time and may 
wait until 1.3.
Comment 7 George Staikos 2005-11-09 05:59:26 UTC
Possible solution: clear the data source list and then call reload() on each.  Reload will take whatever is in the list if it exists.  If not, it will call reload() and put that in the list.  If that fails, it will give up its pointer and create a new one, then put it in the list.
Comment 8 Netterfield 2006-02-15 01:00:04 UTC
George: Please elaborate on your proposed fix.
Comment 9 George Staikos 2007-03-05 01:35:18 UTC
Defer to 2.0: we can make datasources true objects and get rid of the "smart" behavior we tried to have which causes some of these problems.
Comment 10 Andrew Walker 2008-08-28 01:26:03 UTC
There does not appear to be any memory leak.

However, there are several efficiency problems with the current code:

* where reloading vectors that read from the same datasource object each cause a reset on that data source, if reset is supported. It is only necessary to call reset once per datasource object.

* if a datasource object does not support reset, then at the end of the reload each vector will now read from a different, and newly created, datasource object. It should be possible to have each vector refer to just one instance of the datasource.

Another problem is that a reload can also be performed on a per vector basis from the data manager. If multiple vectors hold the same datasource object then they will all be updated, which may not be expected by the user.

The underlying problem seems to be as to whether the reload method belongs to a vector or a datasource.
Comment 11 Andrew Walker 2008-08-29 23:02:33 UTC
Modified kst.cpp, kstrvector.*, kstrmatrix.*

The reload is now performed more intelligently when done at the application level.