Bug 124942 - multi-file datawizard for datafile comparison
Summary: multi-file datawizard for datafile comparison
Status: CONFIRMED
Alias: None
Product: kst
Classification: Applications
Component: general
Version: 1.10.0
Platform: unspecified Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: kst
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-04-05 00:37 UTC by Nicolas Brisset
Modified: 2011-01-31 09:56 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In:


Attachments

Description Nicolas Brisset 2006-04-05 00:37:05 UTC
Version:           1.3.0_devel (using KDE 3.4.2, Mandrake Linux Cooker i586 - Cooker)
Compiler:          Target: i586-mandriva-linux-gnu
OS:                Linux (i686) release 2.6.12-12mdk

After being subscribed to the kst mailing-list for a few months, I have the feeling that the typical use cases of the original kst users sometimes differ significantly from ours. One of the things we do the most frequently is compare two (or more) files corresponding to different configurations to track differences. Typically, the two data files will contain the same list of fields and during a session, the user will:
1) load a bunch of variables from conf1.dat with the datawizard
2) call Tools->Change data file with the "duplicate" option to create the corresponding curves from conf2.dat
3) notice discrepancies somewhere that need to be explained by looking at more vars
4) load new vars with the datawizard. *But* that's where the workflow breaks down: the wizard only lets you load curves from the last file it was used with (and you must pay attention to which X vector is going to be reused: it should come from the right file!), so you soon find yourself toggling back and forth between files, invoking the wizard an incredible number of times... until you can't stand it and switch back to using gaiw :-(

But I have good news: the situation is not hopeless :-) I have thought about this for a while, and I think I have come up with a very user-friendly way of improving it while adding minimal complexity to the data wizard (in terms of user interaction, but hopefully coding it will not require too many changes either!). So, here are the changes I'd like to see:
a) add to the first page of the wizard a listbox with checkable items, one entry for each cached datasource instance. When a datasource is chosen in the KFileDialog it gets added to the list. The type and configure buttons would either be duplicated or apply to the selected listbox item (listbox in single selection mode)
b) when the user checks more than one datasource, it means that subsequently created curves should be created for all checked datasources
c) the list of available vars is now the intersection of checked datasources field lists (and the "position" column can be removed if more than one source is checked)
d) for choosing X vectors, the same concept can be applied: the dropdown list could contain only vectors existing in both datasources (say, a "Time" field). Note that this will require a small change from the current implementation (see below for the gory details)
e) the last page can be kept as is, the only difference being that instead of creating one curve for each Y field selected by the user, the wizard will create as many as there are checked datasources.
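Points b), c) and e) can be illustrated with a short sketch. This is not kst code: the function names `common_fields` and `plan_curves`, the field lists, and the tuple representation of a curve are all hypothetical, chosen only to show the intended behaviour (offer the intersection of the field lists, then create one curve per checked datasource for each selected Y field).

```python
from typing import Dict, List, Tuple


def common_fields(field_lists: Dict[str, List[str]], checked: List[str]) -> List[str]:
    """Return the fields present in every checked datasource.

    The order of the first checked datasource is preserved so the list
    shown to the user stays stable.
    """
    if not checked:
        return []
    first, *rest = checked
    shared = set(field_lists[first])
    for source in rest:
        shared &= set(field_lists[source])
    return [field for field in field_lists[first] if field in shared]


def plan_curves(checked: List[str], x_field: str,
                y_fields: List[str]) -> List[Tuple[str, str, str]]:
    """Plan one (datasource, X field, Y field) curve per checked datasource
    and per selected Y field, each curve using X from its own datasource."""
    return [(source, x_field, y) for y in y_fields for source in checked]


# Hypothetical field lists for the two files from the report.
field_lists = {
    "conf1.dat": ["TIME", "ALT", "SPEED", "TEMP"],
    "conf2.dat": ["TIME", "SPEED", "ALT", "PRESSURE"],
}
checked = ["conf1.dat", "conf2.dat"]

available = common_fields(field_lists, checked)
curves = plan_curves(checked, "TIME", ["ALT"])
```

With both files checked, only the fields they share are offered, and selecting a single Y field yields two curves, one per file.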

I hope that's reasonably clear? With these changes, the workflow in point 4) above would be great: just check the 2 (or more) datasources you want to work with, and step through the rest as usual: at the end, you get the curves superimposed just like you wanted :-)

Now, I think I need to elaborate a bit on point d). Interestingly, thinking it through exposes a small issue in the current implementation. It would be better to provide only one list (vectors available in the datasource(s)), and a "reuse existing" checkbox (*not* a dropdown list) to try to reuse that vector if it is already instantiated, regardless of its name. (Is that clear? The issue here is that vectors in kst can't have duplicate names, while fields in two comparable datasources will very likely bear exactly the same names.) To give an example, if you have TIME and TIME' loaded from the "TIME" field in conf1.dat and conf2.dat respectively, in the current implementation you are stuck if you want to reuse the TIME field for X: you can select only TIME or TIME', but there is no way to instruct the wizard to create the new curves as [selected Y fields] = f(TIME from the same datasource). It would be better to look among the current kst vectors for an instance of "TIME" from the given datasource. In other words, the current implementation does not scale to more than one datasource... But maybe I should open a separate report for that, as it is getting complicated?
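The "reuse existing" lookup could be sketched roughly as follows. Again this is not kst code: `VectorRegistry` and `get_or_create` are hypothetical names, and the priming scheme for duplicate names (TIME, TIME') is assumed from the example above. The point is only that the lookup key is the (datasource, field) pair, not the vector name.

```python
from typing import Dict, Set, Tuple


class VectorRegistry:
    """Hypothetical stand-in for the list of vectors currently instantiated."""

    def __init__(self) -> None:
        # Maps (datasource, field) to the unique name of the first vector
        # created from that origin.
        self._by_origin: Dict[Tuple[str, str], str] = {}
        self._names: Set[str] = set()

    def _unique_name(self, field: str) -> str:
        # Vector names must be unique, so append primes until the name is free.
        name = field
        while name in self._names:
            name += "'"
        return name

    def get_or_create(self, source: str, field: str,
                      reuse_existing: bool = True) -> str:
        """Return the name of a vector reading `field` from `source`.

        With reuse_existing set, an already instantiated vector from the
        same datasource and field is returned regardless of its name.
        """
        key = (source, field)
        if reuse_existing and key in self._by_origin:
            return self._by_origin[key]
        name = self._unique_name(field)
        self._names.add(name)
        self._by_origin.setdefault(key, name)
        return name


registry = VectorRegistry()
t1 = registry.get_or_create("conf1.dat", "TIME")
t2 = registry.get_or_create("conf2.dat", "TIME")

# Later wizard run: X for new conf2.dat curves resolves to the vector
# that was loaded from conf2.dat, whatever it ended up being called.
x_for_conf1 = registry.get_or_create("conf1.dat", "TIME")
x_for_conf2 = registry.get_or_create("conf2.dat", "TIME")
```

Keyed this way, each new curve picks up the X vector from its own datasource, which is what the name-based dropdown cannot express.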
Comment 1 Netterfield 2006-06-17 02:59:04 UTC
I'm wondering if you might end up being happier writing and using a small javascript for this.... once it is written, you would type 'addCurve("19")' to add plots of field 19.  I wrote a brief tutorial which is in graphics/doc/kst/scripttutorial which should get you on your way.

My worry, as a general one, is that we run the risk of making the UI infinitely complicated, as we add the capability for each new work case, so in the end, it isn't useful for any work case.  Some might argue that the current wizard is already heading in that direction.
Comment 2 George Staikos 2006-06-17 17:06:29 UTC
Exactly.  This is what scripting and extensions are for.
Comment 3 Nicolas Brisset 2006-06-19 15:09:41 UTC
I am not against scripting or extensions, all the more so since integrating this cleanly in the wizard is not very simple (my report is long because I tried to list all the issues, but I am not sure that is a good measure of code complexity, and the UI would not be affected tremendously).
But in any case I'm looking for a solution to the above use case (adding a new vector from a number of sources simultaneously very easily) and I definitely don't think we can ask end users to write scripts. They should be offered a point-and-click GUI for such a fairly simple and frequent task.
Another option may be not to add this to the wizard, but rather to a new tool with just the minimum amount of information (no PSDs, just datasources to read from in a checklist, field name(s), from/to values, and destination (new plot, new window, etc.)).
Comment 4 Nicolas Brisset 2006-11-03 18:06:09 UTC
I'm tempted to close this bug as a duplicate of bug #136780. What do you think?
At some point it was said that this would be a good testcase for scripting extensions, but it seems nobody got around to doing it and I believe the proposal made in #136780 is better implemented in C++ (but I may be wrong)... and covers this need completely.
Comment 5 Peter Kümmel 2010-08-14 14:12:47 UTC
Could still be open in Kst 1.
Comment 6 Netterfield 2011-01-14 22:55:06 UTC
Does the duplicate option in the change file dialog solve this problem?  If so, is it ok to close this?
Comment 7 Nicolas Brisset 2011-01-31 09:56:21 UTC
The duplicate option in the change file dialog does not completely solve this, but almost. The thing is, you'd need to call the change data file tool over and over again and select only the last vectors added, instead of having all curves added at once.

I think the idea from this report is still nice, but we can live a while without it. It's not high priority, but I would not close it. As already mentioned, we could start by implementing bug #136780, which is also nice and probably easier.
And when we have enough hands to work on kst (someday hopefully), integrating those ideas into the wizard could be a nice addition. It does not make it so much more complicated, and makes the feature easier to use and more visible to new users.