Bug 134987 - Skipping frames sometimes misses the last point
Summary: Skipping frames sometimes misses the last point
Status: RESOLVED NOT A BUG
Alias: None
Product: kst
Classification: Applications
Component: general (show other bugs)
Version: 1.x
Platform: unspecified Solaris
: VHI normal
Target Milestone: ---
Assignee: kst
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-10-02 11:47 UTC by Nicolas Brisset
Modified: 2007-08-16 10:32 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Testcase for bug #134987 (5.47 KB, application/x-tgz)
2006-10-02 11:50 UTC, Nicolas Brisset
Details
Reverts revision number 596909 (6.79 KB, patch)
2006-12-07 23:25 UTC, Adam Treat
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Nicolas Brisset 2006-10-02 11:47:08 UTC
Version:           1.3.0 (using KDE 3.4.0, compiled sources)
Compiler:          gcc version 3.4.3
OS:                SunOS (sun4u) release 5.8

Kst is excellent when working with data sampled at different rates and coming from different files. There is however a small glitch that just caused some anger: when skipping points (e.g. read 1 sample per 25 frames) the last point may be omitted when the user expects it to be considered.

I'll attach a testcase soon, where you will notice the following:
- the vector read from the smaller file has 141 samples, which is correct considering the fact that the file provides 141 samples
- the vector read from the bigger file with skip activated (read 1 sample per 25 frames) has only 140 samples, while there are 3501 samples: we expected to find 141 samples as well (140*25=3500). 

This is surely a very small index issue somewhere (not in the datasourrce since skip is not implemented in the datasource), but in our case the missing point really is an issue because we want to know the final result accurately, and it is given by that last point!
Comment 1 Nicolas Brisset 2006-10-02 11:50:43 UTC
Created attachment 17984 [details]
Testcase for bug #134987
Comment 2 Nicolas Brisset 2006-10-09 12:00:02 UTC
Looking at kstrvector.cpp (which I assume is where the problem lies), I wonder about lines 473-477: why do we need to align start and end frames on skip boundaries ? I think that's what causes the problem, but I'm not sure how to fix it as I'm afraid of collateral damage !
But I noticed something else which I find equally wrong (possibly even worse !): if you select first frame = 2 in the above example for the vector with skip = 25, the first value you get is not the 2nd value in the data file but the 26th or 27th! This is really quite dangerous !
All in all, looking at this file there seem to be so many possibilities and corner cases, plus some "hidden" rounding with implicit casts to (int) that there are probably more weirdnesses...
Comment 3 Adam Treat 2006-10-09 22:03:03 UTC
This is related to this bug:

http://bugs.kde.org/show_bug.cgi?id=104765
Comment 4 Adam Treat 2006-10-09 23:24:57 UTC
I think this is a simple off by one error.  The size should be (new_nf / Skip) +1...

As for the other bug, it seems that new_f0 should not be > ReqF0...

   if (DoSkip) {
     // reallocate V if necessary
     //kstdDebug() << "new_nf = " << new_nf << " and skip = " << Skip << " so new_nf/Skip+1 = " << (new_nf / Skip + 1) << endl;
-    if (new_nf / Skip != _size) {
-      bool rc = resize(new_nf/Skip);
+    if (new_nf / Skip +1 != _size) {
+      bool rc = resize(new_nf/Skip +1);
       if (!rc) {
         // FIXME: handle failed resize
       }
Comment 5 Adam Treat 2006-10-14 00:03:15 UTC
I have written a set of test cases for rvector skip functionality.  The current implementation fails seven of these tests.  I have a patch which makes rvector pass all of the tests, but waiting for confirmation that the testcases themselves are correct.  See list for details..
Comment 6 Netterfield 2006-12-07 23:05:38 UTC
Somehow, recent changes have caused kst to crash on reading dirfiles in skip mode.  This is very very bad!  Can it be fixed real soon (even if the last point is not read?)

I don't have time to deal with it as we are flying in a few days.  Can the changes related to this bug (and associated discussion on the list) be reverted in the immediate term?
Comment 7 Adam Treat 2006-12-07 23:24:33 UTC
I've attached a patch that reverts revision number 596909 that was related to this bug.  It doesn't revert the subsequent commits though.  Can you tell me if this fixes your dirfiles in skip mode?  If so, I'll commit and we can look at this later.
Comment 8 Adam Treat 2006-12-07 23:25:52 UTC
Created attachment 18837 [details]
Reverts revision number 596909

Reverts revision number 596909, but doesn't touch subsequent commits.
Comment 9 Netterfield 2006-12-08 01:46:08 UTC
This seems to re-stablize skip reads.  I think it should be applied to head 
until such time as the modified version is fixed.

cbn

On Thursday 07 December 2006 17:24, Adam Treat wrote:
[bugs.kde.org quoted mail]
Comment 10 Netterfield 2007-03-28 21:08:48 UTC
Correct Behaviour:
(i) [current behaviour] always read on skip boundaries: ie only read frames N*S where S is the value of 'skip' and N is an integer.  (eg, if S = 2, only read even numbered frames, regardless of the starting frame given.

(ii) [current behaviour] when 'average' is on, only return points if there are a full S frames available to average. Average forward.


this bug seems to request
(iii-a) when 'average' is off, read as many frames as you can, given the frame range given.

the current behaviour is
(iii-b) [current behaviour] when 'average' is off, read the same number of frames as you would if 'average' were on.

I think (i), (ii), (iii-b), the current behaviour, is, in fact, what we want.

Thoughts?
Comment 11 Andrew Walker 2007-03-28 22:37:09 UTC
To rephrase slightly:

for skip mode we will wait until an entire frame is present before reading the first point from it. For some data sources (such as ascii) this isn't strictly necessary but maintains consistency with other options.

I agree that the current behaviour is correct - though I suspect we'll have to defend it again in the future.
Comment 12 Netterfield 2007-03-28 23:35:05 UTC
Not quite... 
In skip mode, we wait for all of S frames to read the first point in the 
group, because we might be called on to average those S frames.

On Wednesday 28 March 2007 4:37:10 pm Andrew Walker wrote:
> for skip mode we will wait until an entire frame is present before reading
> the first point from it. For some data sources (such as ascii) this isn't
> strictly necessary but maintains consistency with other options.

Comment 13 Nicolas Brisset 2007-03-29 17:50:57 UTC
I am somewhat dubious here. We rarely (or never) use average/filtering on load, and I insist that it is very confusing for a user to indicate "read 1 sample every 25 frames", have 3501 frames and only 140 samples. I think the mental image the typical user has of skipping is that it reads the first data point as indicated in the UI, then skips (skip-1) points, then reads the next sample, and so on. If there is a sample #3501 that lies just after 24 skipped points, it should be read. Anything else will cause confusion ! Of course, there may be performance issues or consistency between the behavior with/without averaging on, and all sorts of reasons. But in the end it IS confusing.
I am aware that taking into account all aspects (averaging on/off, read from start/end, skip or not, updates paused or not, performance, ...) makes this pretty tricky. I am just not sure that there isn't a way to remove this confusion while preserving a sensible behavior for the rest. That said, I haven't spent hours thinking about it... and maybe there is no better solution ?
Comment 14 Andrew Walker 2007-05-07 23:08:53 UTC
Have we reached anything approaching a consensus on this?
Comment 15 Andrew Walker 2007-08-09 20:51:54 UTC
The existing behaviour is self-consistent. It may be initially confusing for some users but I believe this to be unavoidable if we are to maintain consistency across data sources and data source options.
Comment 16 Nicolas Brisset 2007-08-16 10:32:12 UTC
Oh well, so it seems I lost the argument in the end ? It's not clear to me why we can't do what Barth suggests in iii-a) in comment #10. To me, iii-a) is clearly better than iii-b) and indeed exactly what I wanted, especially as not everybody will be using averaging (around here nobody uses it in fact: we prefer to have all data points and possibly zoom out, kst has enough performance for that).
Hoping there is still a chance, at least for 2.0...