Bug 215931

Summary: Make the datawizard faster
Product: [Applications] kst Reporter: Nicolas Brisset <nicolas.brisset>
Component: generalAssignee: kst
Status: RESOLVED FIXED    
Severity: wishlist CC: netterfield
Priority: NOR    
Version: 2.0.0   
Target Milestone: 2.0.0   
Platform: unspecified   
OS: Microsoft Windows   
Latest Commit: Version Fixed In:
Sentry Crash Report:

Description Nicolas Brisset 2009-11-24 08:51:05 UTC
Version:           2.0.0_devel (using unspecified)
Compiler:          mingw delivered with QtCreator 1.1.0
OS:                Windows 32-Bit

When opening the datawizard, in the first steps the data file needs to be selected and the corresponding datasource created. This is currently suboptimal and in some circumstances can lead to lots of wasted time. Just try to work with ASCII files with > 2 million points (which we routinely do, in fact that's one of the reasons to use kst as very few tools handle such data amounts sensibly), you'll know what I mean... 

This problem is due to datasources being instantiated too soon:
- when the contents of the lineedit change, including upon opening the wizard
- when the settings change ("Configure..." button)
- when selecting a file in the file requester from the first page (which is particularly bad if you have single-click settings and only hovering a file starts the process of looking for an appropriate datasource!)

To improve that situation I think it should work the following way:
1) open the datawizard: restore the last used data file and do nothing more
2) if the user clicks on the file requester to select a different file, open a standard file dialog and don't try to "validate" the data file at this stage. After all the user should know what he's doing, and if not he'll come back to here. It is hard to understand why selecting a file can be so slow, while the dialog looks like a usual file selection dialog
3) if the user clicks the "Configure" button, only create the datasource's config dialog with the settings that will be used for that file (i.e. default settings if the file is not known, otherwise the ones stored in QSettings for that file). The user should just see the settings and have an option to change them easily. Upon validating the settings, don't instantiate the datasource yet: he may wish to change them again!
4) when the user clicks "Next..." on the first page, either grab the existing datasource for that file, or recreate one if the settings have changed, or create it if it's a new file. If there is a problem, bring up an error dialog informing the user that the file can't be read and stay on that page. It is not a problem, I don't think it is required to activate "Next" only when we are sure we can read the file (though if we can without a performance penalty it sure is better, but I don't think that's possible)
5) from there on, proceed as usual

I think this change should be fairly easy to do and would be a great time saver for some file types (in particular ASCII, but possibly also others)
Comment 1 Netterfield 2009-11-24 16:27:57 UTC
The intended behavior is that the ascii data source only reads the first N lines, where N isn't too many to be slow, until the data is actually being read.  What is happening now is, as you suspect, the entire ascii file is being parsed before it can be looked at.  This is silly.

If this issue were fixed (so delays were <~100ms + window drawing times) would you still suggest changes to the UI?
Comment 2 Nicolas Brisset 2009-11-24 16:37:45 UTC
My suggestion would not change the UI at all, it is only behind-the-scenes changes. Therefore I don't understand your last comment...

From a user perspective, only the response time is relevant. So if the ASCII data source was faster in the first steps (file selection / configuration) it may be sufficient... until I use a different data source that also takes lots of times for these actions. 
I still think that the suggested behavior is fundamentally better as it solves the cause of the problem and not only the symptom for one data source. It would work with all data sources, regardless of how well they are implemented or how fast they are.
Comment 3 Nicolas Brisset 2009-11-29 22:47:01 UTC
Interestingly, with the timing debug output introduced recently in the ASCII source we see that the constructor is called 3 times in the first run and twice in subsequent runs!
It should be called once in the first run and never in subsequent runs!
Comment 4 Netterfield 2009-12-15 03:32:02 UTC
SVN commit 1062534 by netterfield:

Fix an update bug in which fixed-range plots were not being re-drawn.
Don't create a datasource when validating it.

CCBUG: 215931



 M  +9 -8      devel-docs/Kst2Specs/Bugs  
 M  +5 -0      devel-docs/Kst2Specs/Wishlist  
 M  +1 -1      src/libkst/dataprimitive.h  
 M  +13 -5     src/libkst/datasource.cpp  
 M  +1 -1      src/libkst/datasource.h  
 M  +2 -2      src/libkstapp/applicationsettings.cpp  
 M  +2 -2      src/libkstapp/changedatasampledialog.cpp  
 M  +3 -3      src/libkstapp/circleitem.cpp  
 M  +1 -1      src/libkstapp/datawizard.cpp  
 M  +3 -1      src/libkstapp/plotitem.cpp  
 M  +5 -5      src/widgets/scalarselector.cpp  
 M  +1 -1      src/widgets/scalarselector.h  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1062534
Comment 5 Netterfield 2009-12-16 05:52:58 UTC
SVN commit 1062805 by netterfield:


Only read the first part of an ascii file the first time through,
to avoid delaying datasource selection widgets.

This would fix Bug 215931, except that the qimagesource, which uses
QImage, reads the entire file to decide if it is compatible,
so there is still a (shorter) delay in reading very long
ascii files.

CCBUG: 215931



 M  +5 -2      datasources/ascii/ascii.cpp  
 M  +9 -0      datasources/qimagesource/qimagesource.cpp  
 M  +14 -6     libkst/datasource.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1062805
Comment 6 Nicolas Brisset 2009-12-16 08:33:31 UTC
Ah, that feels much better now. From the debug output, it seems that datasource constructors get called only once, and the ASCII datasource postpones parsing to when it's actually needed. 
This seems to be good enough for me, as qimagesource is apparently fairly fast. Maybe we could still change qimagesource slightly so that it checks the file type instead of parsing it to know whether it can be handled?

By the way, this whole validation stuff in the file selection dialog only makes sense to me when we show the user which datasource is going to be used in the first wizard page, as in kst 1.x. But this is no longer shown... (the label that used to be just left from the Configure.. button). That definitely had added value, and should be pretty cheap once we have queried all datasources.
Comment 7 Netterfield 2009-12-17 03:36:07 UTC
SVN commit 1063132 by netterfield:

BUG: 215931
QImages look at file name extensions before attempting to parse files.
This means that the QImage data source requires recognized extensions
to read a file, but is a lot faster if the file is not an image.

Re-instate the file type label in the data wizard.


 M  +1 -1      datasources/ascii/ascii.cpp  
 M  +21 -16    datasources/qimagesource/qimagesource.cpp  
 M  +7 -10     libkst/datasource.cpp  
 M  +2 -0      libkstapp/datawizard.cpp  
 M  +77 -116   libkstapp/datawizardpagedatasource.ui  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1063132
Comment 8 Nicolas Brisset 2009-12-17 09:24:54 UTC
??? I don't seem to see the label next to the Configure... button on the first page...
Comment 9 Peter Kümmel 2010-08-14 12:14:18 UTC
Change version from 2.0.0_devel to 2.0.0 to simplify version numbering.
Comment 10 Peter Kümmel 2010-11-12 10:41:49 UTC
These bugs are solved with 2.0.0