Version: 1.3.0 (using KDE 3.4.0, compiled sources) Compiler: gcc version 3.4.3 OS: SunOS (sun4u) release 5.8 Kst is excellent when working with data sampled at different rates and coming from different files. There is however a small glitch that just caused some anger: when skipping points (e.g. read 1 sample per 25 frames) the last point may be omitted when the user expects it to be considered. I'll attach a testcase soon, where you will notice the following: - the vector read from the smaller file has 141 samples, which is correct considering the fact that the file provides 141 samples - the vector read from the bigger file with skip activated (read 1 sample per 25 frames) has only 140 samples, while there are 3501 samples: we expected to find 141 samples as well (140*25=3500). This is surely a very small index issue somewhere (not in the datasourrce since skip is not implemented in the datasource), but in our case the missing point really is an issue because we want to know the final result accurately, and it is given by that last point!
Created attachment 17984 [details] Testcase for bug #134987
Looking at kstrvector.cpp (which I assume is where the problem lies), I wonder about lines 473-477: why do we need to align start and end frames on skip boundaries ? I think that's what causes the problem, but I'm not sure how to fix it as I'm afraid of collateral damage ! But I noticed something else which I find equally wrong (possibly even worse !): if you select first frame = 2 in the above example for the vector with skip = 25, the first value you get is not the 2nd value in the data file but the 26th or 27th! This is really quite dangerous ! All in all, looking at this file there seem to be so many possibilities and corner cases, plus some "hidden" rounding with implicit casts to (int) that there are probably more weirdnesses...
This is related to this bug: http://bugs.kde.org/show_bug.cgi?id=104765
I think this is a simple off by one error. The size should be (new_nf / Skip) +1... As for the other bug, it seems that new_f0 should not be > ReqF0... if (DoSkip) { // reallocate V if necessary //kstdDebug() << "new_nf = " << new_nf << " and skip = " << Skip << " so new_nf/Skip+1 = " << (new_nf / Skip + 1) << endl; - if (new_nf / Skip != _size) { - bool rc = resize(new_nf/Skip); + if (new_nf / Skip +1 != _size) { + bool rc = resize(new_nf/Skip +1); if (!rc) { // FIXME: handle failed resize }
I have written a set of test cases for rvector skip functionality. The current implementation fails seven of these tests. I have a patch which makes rvector pass all of the tests, but waiting for confirmation that the testcases themselves are correct. See list for details..
Somehow, recent changes have caused kst to crash on reading dirfiles in skip mode. This is very very bad! Can it be fixed real soon (even if the last point is not read?) I don't have time to deal with it as we are flying in a few days. Can the changes related to this bug (and associated discussion on the list) be reverted in the immediate term?
I've attached a patch that reverts revision number 596909 that was related to this bug. It doesn't revert the subsequent commits though. Can you tell me if this fixes your dirfiles in skip mode? If so, I'll commit and we can look at this later.
Created attachment 18837 [details] Reverts revision number 596909 Reverts revision number 596909, but doesn't touch subsequent commits.
This seems to re-stablize skip reads. I think it should be applied to head until such time as the modified version is fixed. cbn On Thursday 07 December 2006 17:24, Adam Treat wrote: [bugs.kde.org quoted mail]
Correct Behaviour: (i) [current behaviour] always read on skip boundaries: ie only read frames N*S where S is the value of 'skip' and N is an integer. (eg, if S = 2, only read even numbered frames, regardless of the starting frame given. (ii) [current behaviour] when 'average' is on, only return points if there are a full S frames available to average. Average forward. this bug seems to request (iii-a) when 'average' is off, read as many frames as you can, given the frame range given. the current behaviour is (iii-b) [current behaviour] when 'average' is off, read the same number of frames as you would if 'average' were on. I think (i), (ii), (iii-b), the current behaviour, is, in fact, what we want. Thoughts?
To rephrase slightly: for skip mode we will wait until an entire frame is present before reading the first point from it. For some data sources (such as ascii) this isn't strictly necessary but maintains consistency with other options. I agree that the current behaviour is correct - though I suspect we'll have to defend it again in the future.
Not quite... In skip mode, we wait for all of S frames to read the first point in the group, because we might be called on to average those S frames. On Wednesday 28 March 2007 4:37:10 pm Andrew Walker wrote: > for skip mode we will wait until an entire frame is present before reading > the first point from it. For some data sources (such as ascii) this isn't > strictly necessary but maintains consistency with other options.
I am somewhat dubious here. We rarely (or never) use average/filtering on load, and I insist that it is very confusing for a user to indicate "read 1 sample every 25 frames", have 3501 frames and only 140 samples. I think the mental image the typical user has of skipping is that it reads the first data point as indicated in the UI, then skips (skip-1) points, then reads the next sample, and so on. If there is a sample #3501 that lies just after 24 skipped points, it should be read. Anything else will cause confusion ! Of course, there may be performance issues or consistency between the behavior with/without averaging on, and all sorts of reasons. But in the end it IS confusing. I am aware that taking into account all aspects (averaging on/off, read from start/end, skip or not, updates paused or not, performance, ...) makes this pretty tricky. I am just not sure that there isn't a way to remove this confusion while preserving a sensible behavior for the rest. That said, I haven't spent hours thinking about it... and maybe there is no better solution ?
Have we reached anything approaching a consensus on this?
The existing behaviour is self-consistent. It may be initially confusing for some users but I believe this to be unavoidable if we are to maintain consistency across data sources and data source options.
Oh well, so it seems I lost the argument in the end ? It's not clear to me why we can't do what Barth suggests in iii-a) in comment #10. To me, iii-a) is clearly better than iii-b) and indeed exactly what I wanted, especially as not everybody will be using averaging (around here nobody uses it in fact: we prefer to have all data points and possibly zoom out, kst has enough performance for that). Hoping there is still a chance, at least for 2.0...