Bug 249544 - kst mixes up columns in some ASCII files!!!
Status: RESOLVED FIXED
Alias: None
Product: kst
Classification: Applications
Component: datasources
Version: 2.0.0
Platform: Microsoft Windows
Importance: NOR major
Target Milestone: 2.0.0
Assignee: kst
Reported: 2010-08-30 22:26 UTC by Nicolas Brisset
Modified: 2010-11-12 10:42 UTC

Attachments
Data file to reproduce the bug (29.53 KB, application/zip)
2010-08-30 22:26 UTC, Nicolas Brisset

Description Nicolas Brisset 2010-08-30 22:26:31 UTC
Created attachment 51119
Data file to reproduce the bug

Version:           2.0.0
OS:                MS Windows

When reading some ASCII files, kst gets completely confused. This is very bad as it gives no error message but starts mixing columns silently!!!

Reproducible: Always

Steps to Reproduce:
I'm going to attach the required data file to reproduce the bug. Then:
- open the unzipped file with the datawizard
- configure the datasource: ignore regional settings, var names in line 0, data from line 1, custom delimiter=;
- load VAR1A against INDEX
- open the datafile in your favorite editor

Actual Results:  
Edit->View vectors, select VAR1A, scroll down to sample 234 and check against line 235 in the editor (offset of 1 due to the line with field names): the value is the same. Now look at sample #235: its value is completely wrong, as are those of all subsequent samples, which seem to come from column VAR1R, starting from the same line!

Expected Results:  
Values are those read in column VAR1A.

OS: Windows 32-Bit, not tested under Linux.
When hitting reload, only samples 1 to 233 are loaded, stopping exactly at the line that causes trouble. Very strange...
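
For an independent cross-check outside kst, a minimal Python sketch along the following lines reads one named column from a semicolon-delimited file whose first line holds the field names, matching the datasource settings above. The helper name `read_column` and the inline sample data are hypothetical: only the column names INDEX, VAR1A and VAR1R appear in this report, and the values and column order below are made up, not taken from the attachment.

```python
import csv
import io

def read_column(text, name, delimiter=";"):
    """Return the values of column `name` as floats, one per data line."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Line 0 holds the field names; data starts at line 1.
    header = [field.strip() for field in next(reader)]
    col = header.index(name)
    return [float(row[col]) for row in reader if row]

# Made-up stand-in for the attached file (NOT the real data).
SAMPLE = "INDEX;VAR1A;VAR1R\n0;1.5;10.5\n1;2.5;20.5\n2;3.5;30.5\n"

var1a = read_column(SAMPLE, "VAR1A")
print(var1a)  # [1.5, 2.5, 3.5]
```

Comparing such a list against the values shown in Edit->View vectors makes a silent column mix-up easy to spot.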
Comment 1 Nicolas Brisset 2010-08-30 22:35:05 UTC
Variant: try to load VAR1A from the file starting at sample #230, 20 samples. You get values from another vector! It looks like something is badly broken in the computation of column offsets...
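
To illustrate what a correct offset computation has to guarantee, here is a minimal Python sketch. It is not kst's actual asciisource.cpp code, which is not shown in this report. It builds an index of the byte offset at which each line starts, then reads a window of samples by jumping to the right row and splitting that line on the delimiter. The function names and the sample data are hypothetical.

```python
def build_row_index(data):
    """Return the byte offset at which each line of `data` starts."""
    offsets = [0]
    pos = data.find(b"\n")
    # A newline at the very end of the data does not start a new line.
    while pos != -1 and pos + 1 < len(data):
        offsets.append(pos + 1)
        pos = data.find(b"\n", pos + 1)
    return offsets

def read_window(data, offsets, col, start, count, header_lines=1, delimiter=b";"):
    """Read `count` samples of column `col`, starting at 0-based sample `start`."""
    values = []
    first = start + header_lines  # skip the field-name line(s)
    for row in range(first, min(first + count, len(offsets))):
        end = offsets[row + 1] if row + 1 < len(offsets) else len(data)
        fields = data[offsets[row]:end].strip().split(delimiter)
        values.append(float(fields[col].decode("ascii")))
    return values

# Made-up data (NOT the attached file): columns INDEX, VAR1A, VAR1R.
DATA = b"INDEX;VAR1A;VAR1R\n0;1.5;10.5\n1;2.5;20.5\n2;3.5;30.5\n"
OFFSETS = build_row_index(DATA)
print(read_window(DATA, OFFSETS, 1, 1, 2))  # [2.5, 3.5]
```

The point of the sketch is that a window read must return the same values as the matching slice of a full read, whatever the start sample.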
Comment 2 Peter Kümmel 2010-08-31 14:39:37 UTC
SVN commit 1170281 by kuemmel:

Don't break offset calculation

CCBUG: 249544 

 M  +3 -1      asciisource.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1170281
Comment 3 Nicolas Brisset 2010-08-31 15:12:18 UTC
Thanks! I've just tested, and it looks much better. I think we should release 2.0.1 pretty quickly with that fix and the update detection changes, as they both seem to be pretty important.

I've just noticed a small glitch when loading samples from #230: the first value is actually 5.15206, i.e. sample #231 (line 232 in the file). What's strange is that starting from 1, 2, 3 gives the right results... I'm not opening another bug for that, but it should be fixed and then we can close this one.

I also had cases where the vectors had the wrong number of samples, I suspect it was related to that. But I think we have to be careful, because bugs in the datasources where the wrong data gets plotted are pretty nasty. We should definitely introduce unit tests there (hadn't we actually already introduced some?)
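
As a rough idea of what such a unit test could check, here is a Python sketch using `unittest`. It is only an illustration of the property to test, not kst's test suite: `make_file` and `read_samples` are hypothetical stand-ins for a generated data file and the datasource under test, and all values are synthetic.

```python
import unittest

def make_file(rows):
    """Build a synthetic semicolon-delimited file with one header line."""
    lines = ["INDEX;VAR1A;VAR1R"]
    lines += ["%d;%s;%s" % (i, i + 0.25, 1000.0 + i) for i in range(rows)]
    return "\n".join(lines) + "\n"

def read_samples(text, name, start=0, count=None, header_lines=1, delimiter=";"):
    """Stand-in reader: `count` samples of column `name` from 0-based `start`."""
    lines = text.splitlines()
    col = lines[0].split(delimiter).index(name)
    data = lines[header_lines:]
    stop = len(data) if count is None else start + count
    return [float(line.split(delimiter)[col]) for line in data[start:stop]]

class WindowedReadTest(unittest.TestCase):
    def setUp(self):
        self.text = make_file(300)

    def test_window_matches_full_read(self):
        # A partial read must equal the same slice of a full read.
        full = read_samples(self.text, "VAR1A")
        for start in (0, 1, 2, 3, 229, 230, 234, 235):
            window = read_samples(self.text, "VAR1A", start, 20)
            self.assertEqual(window, full[start:start + 20])

    def test_no_column_mixup(self):
        # VAR1A values stay below 1000; VAR1R values start at 1000.
        window = read_samples(self.text, "VAR1A", 230, 20)
        self.assertEqual(window[0], 230.25)
        self.assertTrue(all(v < 1000.0 for v in window))
```

Saved as a module, this can be run with `python -m unittest`. Giving each column a distinct value range makes a read from the wrong column fail immediately.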
Comment 4 Peter Kümmel 2010-08-31 16:15:35 UTC
I don't understand:
In the editor, the counter starts at 1, and line 232 has 5.152058401 for VAR1A.
Having one header line, this value shows up as vector value 231 in Kst (counter also starts at 1). Why is this not correct? Or should the Kst vector start at 0?
Comment 5 Nicolas Brisset 2010-08-31 16:24:35 UTC
Sorry, my explanation was not clear enough. To reproduce:
- load VAR1A from the file, from sample 0 to end
- look at the values: they are correct (i.e. file line 2 in sample #1, etc., always with an offset of 1 due to the header line)
- Tools->Change data sample ranges, select VAR1A and set Start to 230 and Range to 20
- Look at the values now: I expect 23.37203562 as sample #1 (because start was set to 230 and it is what I have in line 231 of the file) and I have 5.152058401, which is for me sample #231

It is certainly easy to solve, but what I don't understand is why it works for 1, 2, 3 and not 230 (I have not tested all values in between). If we change the code to adjust by 1, maybe we'll fix one case and break another one.

Is it clearer now?
Comment 6 Peter Kümmel 2010-08-31 16:37:34 UTC
OK, I see. But I get the same error also for small values:
starting at 2 I see the value from vector index 3.

Seems somewhere we ignore that the vector starts at index 1 (at least in the GUI).
Comment 7 Peter Kümmel 2010-08-31 19:55:59 UTC
SVN commit 1170369 by kuemmel:

Start vector index at 0 in GUI

CCBUG: 249544

 M  +26 -34    vectormodel.cpp  
 M  +3 -4      vectormodel.h  
 M  +3 -0      viewvectordialog.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1170369
Comment 8 Nicolas Brisset 2010-09-01 09:33:10 UTC
OK, so now the table in the GUI has an index starting at 0. But still, the first data point in the above example (VAR1A, Start: 230, Range: 20) should be 23.37203562. It isn't yet the case. I guess you were still planning to fix it as you haven't closed this bug, but I just wanted to make sure.
I don't know whether the problem is specific to ASCII (and should be fixed in the ASCII datasource) or general (and should be fixed in kst "core")...
Comment 9 Peter Kümmel 2010-09-01 12:42:07 UTC
This is really hairy, what I see is:

Editor: Line 232 = 5.152. The editor line number 232 translates to vector index 230, because there is one header and we start counting at 0.
"View Vector": [230] = 5.152

After changing the vector range with "Tools->Change Data Sample Range" to start at 230 for 20 items the "View Vector" shows [0] = 5.152 which is correct.

[0] = 23.37203562 is only true when you start the range at 229 (in the file this value is at line 231).
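
The mapping described above can be written down as a small Python sketch. The function names are hypothetical; the arithmetic just restates the rule "one header line, editor counts from 1, vector index counts from 0".

```python
def line_to_index(editor_line, header_lines=1):
    """Map a 1-based editor line number to a 0-based vector index."""
    return editor_line - 1 - header_lines

def index_to_line(index, header_lines=1):
    """Map a 0-based vector index back to a 1-based editor line number."""
    return index + 1 + header_lines

print(line_to_index(232))  # 230, the sample holding 5.152
print(line_to_index(231))  # 229, the sample holding 23.37203562
```

So a range starting at 230 begins with the value from editor line 232, and a range starting at 229 begins with the value from line 231.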

So I think the bug is fixed already, if you agree, could you close the bug?
Comment 10 Nicolas Brisset 2010-09-01 14:10:10 UTC
OK, you're right. Sorry for the confusion.
The only question is: would a "normal user" type in "229" when he wants the data from sample #230? While C has indexes starting at 0, humans tend to think from 1.
I see two options:
1) we add a note on indexes starting at 0 in the tooltip of the Start field (as that is not too invasive) and leave the rest as is
2) we change the code so that the user-visible UI has indexes starting from 1, and only developers see indexes starting at 0.

Ideally, I think 2) would be the nicer solution for end-users, but we are pretty sure to break a couple of things and may have compatibility problems. In the end, 1) is probably the better compromise.
I'll commit something that goes in this direction, if someone does not like it or has a better idea, then just change it.
Comment 11 Peter Kümmel 2010-11-12 10:42:49 UTC
These bugs are solved with 2.0.0