Bug 128436 - invalid fields should not be loaded as 0 (zeroes)
Summary: invalid fields should not be loaded as 0 (zeroes)
Status: RESOLVED FIXED
Alias: None
Product: kst
Classification: Applications
Component: general (show other bugs)
Version: 1.x
Platform: unspecified Solaris
: NOR wishlist
Target Milestone: ---
Assignee: kst
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-06-01 12:42 UTC by Nicolas Brisset
Modified: 2006-06-19 09:36 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Small ASCII file to reproduce the problem (167 bytes, text/plain)
2006-06-01 12:43 UTC, Nicolas Brisset
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Nicolas Brisset 2006-06-01 12:42:53 UTC
Version:           1.3.0_devel (using KDE 3.4.0, compiled sources)
Compiler:          gcc version 3.4.3
OS:                SunOS (sun4u) release 5.8

Maybe this is just a documentation problem, or dependent on datasources. Feel free to close this if you feel it is not a bug...

When you try to load unexisting fields e.g. from an ASCII file, if the field can't be found instead of complaining kst will load zeroes.

To reproduce:
1) load VAR1 and VAR2 vs INDEX from the file I'll attach soon (or any other ASCII file), with the right settings (data from line 1, fields in line 0)
2) save the plot to plot.kst
3) edit the data file and change VAR1 to VAR1234 (could well happen with files generated by external programs, where some fields may or may not exist according to the configuration in use)
4) reload plot.kst: VAR1234 is loaded with zeroes whereas I would expect a warning that this field could not be loaded (like I have seen in some circumstances I can't remember)

This is strange, considering that the ASCII datasource implements isValidField(): I would expect that it is called before readField() is called. Another option would be for datasources to perform the check within readField, but that looks wrong to me somehow.
Comment 1 Nicolas Brisset 2006-06-01 12:43:38 UTC
Created attachment 16398 [details]
Small ASCII file to reproduce the problem
Comment 2 Andrew Walker 2006-06-01 20:27:33 UTC
Nicolas, the VAR1 text appears many places in the .kst file. Could you describe more fully which ones you changed to produce the problem.
Comment 3 Andrew Walker 2006-06-02 00:43:55 UTC
Sorry, ignore that last comment. I thought you were editing the kst file rather than the data file.
Comment 4 Andrew Walker 2006-06-02 01:41:45 UTC
before the vector is read from the datasource the size is set to the known length of the ascii file (in this case 6) and the values all initialised to zero. When no values are subsequently read in there is no error given and no attempt to clear the vector of the 6 zero values.
Comment 5 Nicolas Brisset 2006-06-02 09:27:35 UTC
Yes, that kind of makes sense. But it would be much better to check that the required field exists (using isValidField()) before initializing a (potentially large !) memory area and calling readField().
I don't think such cases are rare: many data files have "holes" and it could very well happen that a field which existed in the past is not in a newer data file. What surprises me the most, is that I remember seeing an error message of the kind "... file could not be loaded entirely due to missing fields" in such a case in the past. Has something been changed here, or is it just that there are other conditions to meet to get that message ?
Comment 6 George Staikos 2006-06-02 14:34:35 UTC
  Checking to see if the field is valid and not loading it if it isn't is 
quite ugly since it could break down the entire object chain.  The only 
behaviour I can think of in this case is to stop loading the Kst file 
entirely and wipe out all objects in memory.  If we load 0s (or NANs), then 
at least the user can repair and reload.  Maybe we should warn in the debug 
log when this happens.
Comment 7 Nicolas Brisset 2006-06-02 14:49:28 UTC
Sure, it would be bad to break the object chain or not load what can be loaded !
And yes, NaNs would be a good idea since they are handled properly, at least by curves (no curve will tell you that there is a problem !). I just hope that plugins, equations and the entire object chain can all handle them... otherwise 0s are fine, provided that the user gets warned (either in the debug output, or preferably in non-batch mode in a window like I remember to have seen: have I been dreaming ?)
Comment 8 Nicolas Brisset 2006-06-16 15:52:40 UTC
One user just came to me and complained about that behavior again after scratching her head for a (long !) while trying to understand what she was looking at: 0 is too common a value to raise the user's attention that something has gone wrong when loading the .kst file.

Considering George's insightful comments about not breaking the object chain and giving the user a chance to repair, I'd say we should either:
- initialize the fields with NaNs instead of 0s (my preferred, as the curves would not even show, which will surely not pass unnoticed !) unless that breaks something further down the chain if some plugins or equations "dislike" that and crash
- in the case where NaNs are not an option, check with isValidField(...) before loading the vector, and if it is not still initialize with 0 BUT WARN THE USER with a dialog like the one I was mentioning, which by the way is the one at line 544 of kstdoc.cpp (I don't exactly know when it is shown, but I believe it could be used for unexisting fields as well).
Comment 9 Nicolas Brisset 2006-06-16 16:10:50 UTC
A small additional remark: when using kst in batch mode (which is one the things it does better than any of the other tools we have used in the past), NaNs would be _much_ better !
Comment 10 Nicolas Brisset 2006-06-16 16:20:23 UTC
A vector full of NaNs apparently does not crash equations nor the low-pass filter (the other filtes being very close, I suppose it's the same for them). It sounds good :-)
And in the worst case (i.e. some plugins crashing) we could actually call the plugin only if the input vectors contain something else than NaNs, and return output scalars and vectors with NaNs without calling the plugin otherwise. That would mean forcing plugin output to NaNs when at least one input is pure-Nan, which sounds sensible to me...
Comment 11 Netterfield 2006-06-17 04:03:56 UTC
I agree that it makes sense for data sources to return NaNs if there is no valid data.
Comment 12 George Staikos 2006-06-19 09:35:07 UTC
SVN commit 552809 by staikos:

use NOPOINT instead of 0 for missing data
CCBUG: 128436


 M  +4 -8      ascii.cpp  


--- trunk/extragear/graphics/kst/src/datasources/ascii/ascii.cpp #552808:552809
@@ -497,15 +497,15 @@
   file.readBlock(_tmpBuf, bufread);
 
   if (_config->_columnType == AsciiSource::Config::Fixed) {
-    for (int i = 0; i < n; i++, s++) {
+    for (int i = 0; i < n; ++i, ++s) {
       // Read appropriate column and convert to double
       v[i] = atof(_tmpBuf + _rowIndex[i] - _rowIndex[0] + _config->_columnWidth * (col - 1));
     }
   } else if (_config->_columnType == AsciiSource::Config::Custom) {
-    for (int i = 0; i < n; i++, s++) {
+    for (int i = 0; i < n; ++i, ++s) {
       bool incol = false;
       uint i_col = 0;
-      v[i] = 0.0;
+      v[i] = KST::NOPOINT;
       for (int ch = _rowIndex[s] - bufstart; ch < bufread; ++ch) {
         if (_config->_columnDelimiter.contains(_tmpBuf[ch])) {
           incol = false;
@@ -523,8 +523,6 @@
               } else if (ch + 2 < bufread && tolower(_tmpBuf[ch]) == 'i' &&
                   tolower(_tmpBuf[ch + 1]) == 'n' && tolower(_tmpBuf[ch + 2]) == 'f') {
                 v[i] = INF;
-              } else {
-                v[i] = NAN;
               }
               break;
             }
@@ -537,7 +535,7 @@
       bool incol = false;
       uint i_col = 0;
       
-      v[i] = 0.0;
+      v[i] = KST::NOPOINT;
       for (int ch = _rowIndex[s] - bufstart; ch < bufread; ++ch) {     
         if (isspace(_tmpBuf[ch])) {
           if (_tmpBuf[ch] == '\n' || _tmpBuf[ch] == '\r') {
@@ -557,8 +555,6 @@
               } else if (ch + 2 < bufread && tolower(_tmpBuf[ch]) == 'i' &&
                   tolower(_tmpBuf[ch + 1]) == 'n' && tolower(_tmpBuf[ch + 2]) == 'f') {
                 v[i] = INF;
-              } else {
-                v[i] = NAN;
               }
               break;
             }
Comment 13 George Staikos 2006-06-19 09:36:50 UTC
SVN commit 552811 by staikos:

two patches that got a bit intertwined due to offline work.

1) Make vectors initialize to NOPOINT instead of 0
BUG: 128436
2) Fix a very tricky deadlock in objects iwth providers, especially plugins.
There is more to be done here.  The fix only applies to vectors at the moment.


 M  +2 -0      libkst/kstscalar.h  
 M  +1 -0      libkst/kststring.h  
 M  +91 -7     libkst/kstvector.cpp  
 M  +19 -10    libkst/kstvector.h  
 M  +80 -16    libkstmath/kstdataobject.cpp