Bug 123699 - fails to create INDEX from .fits file binary extension
Summary: fails to create INDEX from .fits file binary extension
Status: RESOLVED WORKSFORME
Alias: None
Product: kst
Classification: Applications
Component: general
Version: 1.10.0
Platform: unspecified Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kst
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-03-16 02:19 UTC by Brendan Crill
Modified: 2023-01-13 05:13 UTC

See Also:
Latest Commit:
Version Fixed In:


Description Brendan Crill 2006-03-16 02:19:46 UTC
Version:           1.3.0_devel (using KDE 3.4.2 Level "b" , SUSE 10.0)
Compiler:          Target: i586-suse-linux
OS:                Linux (i686) release 2.6.13-15.8-default

How to reproduce:
kst -y 1 kst/tests/healpix/healpix_example_sm.fits

I don't get a plot, and the following error is reported in the debug dialog:
Failed to create vector 'INDEX' from file '/home/bpc/graphics/kst/tests/healpix/healpix_example_sm.fits'.


I do not get this error when I read a FITS file with an ASCII extension.
Comment 1 Ted Kisner 2006-03-16 02:31:56 UTC
healpix fits files are a "special" type of FITS file from kst's perspective.  
The healpix datasource claims ownership of the file instead of the LFIIO 
datasource (which normally reads table extensions).

To plot the healpix data as a map, one uses the "-z" command line option.

The command below suggests that you wish to plot the raw pixel data on an x-y 
plot (i.e. pixel value vs. pixel index), correct?  This is not currently 
supported by the healpix datasource.  The datasource currently returns either 
a cartesian projection of a map as a matrix, OR one of the polarization 
pseudo vector coordinates.  Vector fields won't be supported until the new 
rendering system is in place.

Do we want the healpix datasource to be able to return raw pixel values?

-Ted

On Wednesday 15 March 2006 17:19, Brendan Crill wrote:
| How to reproduce:
| kst -y 1 kst/tests/healpix/healpix_example_sm.fits
|
| I don't get a plot, and the following error is reported in the debug
| dialog: Failed to create vector 'INDEX' from file
| '/home/bpc/graphics/kst/tests/healpix/healpix_example_sm.fits'.
Comment 2 Brendan Crill 2006-03-16 02:43:50 UTC
OK, I get it, it's not a problem with fits binary extensions, it's a problem with kst automatically recognizing the file as a fits healpix map. I first encountered this bug when I was trying to plot the signal from one map vs. the signal from another in an x-y plot (which would be very useful...)

Comment 3 Netterfield 2006-03-16 02:55:36 UTC
I think all data sources that return vectors should return an INDEX column.
Comment 4 Ted Kisner 2006-03-16 02:57:24 UTC
On Wednesday 15 March 2006 17:43, Brendan Crill wrote:
| OK, I get it, it's not a problem with fits binary extensions, it's a
| problem with kst automatically recognizing the file as a fits healpix map.
| I first encountered this bug when I was trying to plot the signal from one
| map vs. the signal from another in an x-y plot (which would be very
| useful...)

This is easy to implement.  The next question is should the datasource simply 
return the data in the table, or should it always return full-sphere vectors?

For example, let's say that you are comparing pixel data from 2 files.  One 
file is a "full-sphere" FITS file without a pixel number column, and the 
other is a "cut-sphere" FITS file with pixel number and signal columns.

hmmm, need to think about this some more.

-Ted
Comment 5 Ted Kisner 2006-03-16 04:38:57 UTC
On Wednesday 15 March 2006 17:55, netterfield@astro.utoronto.ca wrote:
| I think all data sources that return vectors should return an
| INDEX column.

Ok, I think what I will do is the following:

1.  Add INDEX and an entry for each map to the vector fieldlist.

2.  When readField is called for a map, do the simple thing and just return 
the table columns.  This means that for cut-sphere files, the user will 
need to plot the data versus the pixel column.

For example, with full-sphere file:

kst -y 1 -y 2 fulldata.fits

and with cut-sphere:

kst -x 1 -y 2 -y 3 cutdata.fits 

This means that it will NOT be possible to mix cut and full files on the 
command line (you have to go to the datamanager).

Brendan (and others)- is this an acceptable solution?

-Ted
Comment 6 Nicolas Brisset 2006-03-16 10:56:57 UTC
> I think all data sources that return vectors should return an INDEX column. 
That's not so easy: think about files where there are vectors with 10 samples and others with 100 (just a simple case !). What should INDEX return ?
Comment 7 Ted Kisner 2006-03-16 17:10:32 UTC
On Thursday 16 March 2006 01:56, Nicolas Brisset wrote:
| That's not so easy: think about files where there are vectors with 10
| samples and others with 100 (just a simple case !). What should INDEX
| return ?

I thought that the INDEX field just returned however many samples are 
requested, regardless of the length of the actual vectors in the file.  So in 
the above example, if I read *1000* samples from INDEX, it would just return 
1000 index values (even though that is longer than any of the vectors in the 
file).
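
A minimal sketch of the behaviour described above, assuming a free function rather than any real kst datasource method. The name readIndex and its parameters are hypothetical; the point is only that the returned values depend on the requested range and never on the length of a real field in the file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of kst: build INDEX values purely from the
// requested range, without consulting the length of any field in the file.
std::vector<double> readIndex(int startFrame, int sampleCount) {
    assert(startFrame >= 0 && sampleCount >= 0);
    std::vector<double> index;
    index.reserve(static_cast<std::size_t>(sampleCount));
    for (int i = 0; i < sampleCount; ++i) {
        index.push_back(static_cast<double>(startFrame + i));
    }
    return index;
}
```

With this sketch, asking for 1000 samples yields 1000 index values even if the longest vector in the file holds only 100.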

Is this the desired behaviour of datasources?

-Ted
Comment 8 Netterfield 2006-06-16 23:22:41 UTC
The problem is significant: A fits file has no concept of frames.  But kst wants to know how many frames a data source has so it knows what it can ask for.  It must be that now the data source generates NF from the size of the first data object in the file, so it would be expected that it would also generate INDEX from the first data object in the file.  Anything else, and we have bigger fish to fry (like over or under reading fields).

I suggest that for the short term, we generate INDEX and NF from the first data object in the file.

For the long term, I think we need to somehow break a fits file into sub-data sources each with equal sized fields as Piolib apparently does.
Comment 9 Nicolas Brisset 2006-06-19 15:32:28 UTC
Pasting the mailing list discussion here because it is very important. First, Ted's answer:

On Friday 16 June 2006 14:22, netterfield@astro.utoronto.ca wrote:
| The problem is significant: A fits file has no concept of frames.  But 
| kst wants to know how many frames a data source has so it knows what 
| it can ask for.  It must be that now the data source generates NF from 
| the size of the first data object in the file, so it would be expected 
| that it would also generate INDEX from the first data object in the 
| file.  Anything else, and we have bigger fish to fry (like over or under reading fields).

Isn't this the *whole point* of specifying the desired field in the call to KstDataSource::frameCount(field)?  In general, fields within a datasource are not required to have the same frame count.  If a specific datasource wants to require that (e.g. dirfile), then fine, but it is not a general requirement.

| I suggest that for the short term, we generate INDEX and NF from the 
| first data object in the file.

Why?  Why can't the datasource return as many index samples as requested?  The datasource should query each "real" field to determine its length.  If we cannot support this scenario, then (in my opinion) our datasource model is fundamentally broken.  We should never use the length of the INDEX field to determine the number of samples/frames.  

The index field is just a dummy placeholder when we want to plot data vectors versus sample number.  At some level, the INDEX functionality should be built into the basecurve so we don't have to deal with it over and over again in the datasources.

| For the long term, I think we need to somehow break a fits file into 
| sub-data sources each with equal sized fields as Piolib apparently does.

This is not the method I'm using in libfitstools and the fitsgeneral datasource.  I think it is better to have the fitsgeneral datasource present a fieldlist/matrixlist of all data in a given FITS file.

-Ted
Comment 10 Nicolas Brisset 2006-06-19 15:33:34 UTC
Barth's opinion:

The reason to have a single length per field is to handle asynchronously arriving real time data in count from end mode (i.e., the #1 requirement for kst), to make sure that the X vectors and the Y vectors stay synced.  Because of this, in fact, currently, fields within a data source are required to have the same frame count.  NF is a property of a datasource.

If we can simultaneously solve syncing asynchronous fields and handling variable length fields, then that is great.  But the former has to take priority over the latter.

The fundamental problem is this:
Imagine that we are in read-from-end mode (reading 20 frames), and go to read from a potentially asynchronous data source (e.g., a dirfile), and find that field X has 1001 frames, and field Y has 1005 frames.  Kst could conclude that:
  -X is just behind, and so read 20 frames ending with 1001.
  -There are different length frames, and so read 20 frames from X ending at 1001, and 20 frames from Y, ending with 1005.
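
The two interpretations can be sketched as two range calculations. This is an illustration only: ReadRange, syncedRange and perFieldRange are hypothetical names, frame numbers are taken as zero-based, and "ending with N" is read as "the last frame read is frame N-1".

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical type: the first frame to read and how many frames to read.
struct ReadRange {
    int start;
    int count;
};

// Interpretation 1: one field is merely behind, so every field is read up to
// the shortest frame count and X and Y stay aligned.
ReadRange syncedRange(int framesX, int framesY, int wanted) {
    assert(framesX >= 0 && framesY >= 0 && wanted >= 0);
    const int end = std::min(framesX, framesY);
    const int count = std::min(wanted, end);
    return {end - count, count};
}

// Interpretation 2: the fields really have different lengths, so each field
// is read up to its own end.
ReadRange perFieldRange(int frames, int wanted) {
    assert(frames >= 0 && wanted >= 0);
    const int count = std::min(wanted, frames);
    return {frames - count, count};
}
```

For X = 1001 and Y = 1005 with 20 frames requested, the synchronized policy reads frames 981 to 1000 from both fields, while the per-field policy reads 981 to 1000 from X and 985 to 1004 from Y.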

The existence of this discussion indicates that there will be times when you want one, and other times when you want the other.

Perhaps the best solution is to let the data source decide - it can know which is appropriate.

So: 
-kst learns to read NF on a per field basis, not a per data source basis.

-Data sources that need to are responsible for making sure that NF is the same for all fields.  Data sources that don't need to don't (e.g. dirfiles can have a bool in the format file which tells getdata what to do here).  We keep datasource->update() in order for synchronized data sources to determine their one and only NF.  Non-synchronized data sources don't have to do anything here.  

-INDEX always returns what you ask for, without wondering if the data exists for any other fields.


Seem sensible?

The other issue we have to address is marshaled reads, like for NAD or for 
ascii.  This doesn't solve that, but may be related.
Comment 11 Nicolas Brisset 2006-06-19 15:36:30 UTC
Ted:

On Friday 16 June 2006 21:52, Barth Netterfield wrote:
| The reason to have a single length per field is to handle asynchronously 
| arriving real time data in count from end mode (ie, the #1 requirement 
| for kst), to make sure that the X vectors and the Y vectors stay synced.
| Because of this, in fact, currently, fields within a data source are 
| required to have the same frame count.  NF is a property of a datasource.

Yes, we've beaten on this issue a ton for dirfiles, but it seems to me that synchronicity is a datasource-dependent feature which doesn't make sense for some datasources. 

| Perhaps the best solution is to let the data source decide - it can 
| know which is appropriate.

I thought this was why we had the "field" argument to the frameCount function?  
The datasource can choose whether to use it or ignore it.

| -kst learns to read NF on a per field basis, not a per data source basis.

I am obviously confused- doesn't it already do this???  As far as I can tell from grep'ing through the sources, every call to KstDataSource::frameCount includes the field parameter- and thus the datasource can take advantage of it.  

What am I missing?  Obviously the dirfile source can ignore the field parameter, but other datasources can use it already.

| -Data sources that need to are responsible to make sure that NF is the 
| same for all fields.  Data sources that don't need to don't (eg: 
| dirfiles can have a bool in the format file which tells getdata what 
| to do here)  We keep datasource->update() in order for synchronized 
| data sources to determine their one and only NF.  non-synchronized data 
| sources don't have to do anything here.

Exactly- the dirfile source is synchronous, but other sources don't have to be.  This is how the code works *right now*.  

| -INDEX always returns what you ask for, without wondering if the data 
| exists for any other fields.

Yep.

| Seem sensible?

I think so :-)  Also, it *would* be possible for a datasource to have mechanisms for synchronous updates with fields of varying length- but such things are left up to the datasource (as they should be).

-Ted
Comment 12 Nicolas Brisset 2006-06-19 15:38:25 UTC
And George:

On Saturday 17 June 2006 00:52, Barth Netterfield wrote:
> The reason to have a single length per field is to handle asynchronously 
> arriving real time data in count from end mode (ie, the #1 requirement 
> for kst), to make sure that the X vectors and the Y vectors stay synced.
> Because of this, in fact, currently, fields within a data source are 
> required to have the same frame count.  NF is a property of a datasource.

  Actually NF is now computed directly from frameCount(field), not 
frameCount().    It's entirely up to the datasource to determine if it 
returns different values for each field, or a synchronized value.  Basically what you wrote is almost what we already do, with the exception that none of the sources actually behaves like the fields are desynchronized.
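
A rough sketch of that choice, using a simplified stand-in interface rather than the real KstDataSource class. The class names, the std::string field argument, and the rule that a synchronized source reports its shortest field are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Simplified stand-in for a data source; NOT the real KstDataSource class.
class SketchSource {
public:
    explicit SketchSource(std::map<std::string, int> lengths)
        : lengths_(std::move(lengths)) {
        for (const auto& entry : lengths_) {
            assert(entry.second >= 0);  // frame counts cannot be negative
        }
    }
    virtual ~SketchSource() = default;
    virtual int frameCount(const std::string& field) const = 0;

protected:
    std::map<std::string, int> lengths_;  // field name -> frames available
};

// Uses the field argument: each field reports its own length.
class PerFieldSource : public SketchSource {
public:
    using SketchSource::SketchSource;
    int frameCount(const std::string& field) const override {
        const auto it = lengths_.find(field);
        return it == lengths_.end() ? 0 : it->second;
    }
};

// Ignores the field argument: every field reports one shared length
// (assumed here to be the shortest field, so that no read overruns).
class SyncedSource : public SketchSource {
public:
    using SketchSource::SketchSource;
    int frameCount(const std::string& /*field*/) const override {
        if (lengths_.empty()) return 0;
        int shortest = lengths_.begin()->second;
        for (const auto& entry : lengths_) {
            shortest = std::min(shortest, entry.second);
        }
        return shortest;
    }
};
```

The caller is identical in both cases: it always passes the field name, and the source decides whether that name matters.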
Comment 13 Nicolas Brisset 2006-06-19 15:40:11 UTC
Hum, maybe we should rename this bug as the discussion on synchronous datasources is interesting but we need to be able to find it later, and the link with the current bug summary is not obvious...
Comment 14 Nicolas Brisset 2006-06-19 16:19:39 UTC
Well, I'm also getting confused. When I wrote the cdf and netcdf datasources, I implemented per-field NF, and that's still the way it works. These datasources do not provide INDEX (in the field list) for the reason that I did not see how I could provide the number of frames for that virtual field, and how it would be used. They still return values in readField(...) for INDEX, though, but I'm not sure whether/how this is used.

I'd like to have a clear picture of what I should do to these datasources:
- add "INDEX" to the field list so that the user can pick it as X vector ? But then kst should not call frameCount("INDEX") because I can't return any sensible value in the general case, it should determine the right range by calling frameCount(field) for each field to plot and create index vectors as required (and not more than one for each range !!!)
- do nothing ? But then you need to have useful X vectors in your data file because otherwise you have nothing like an index to plot against (apart from a manual static vector, but that can't be handled from the datawizard and isn't nice)

The better option in my view would be to do as I suggested on the mailing list a while ago: implement a virtual buddyVar() method which by default returns "INDEX" and can be overridden by asynchronous datasources that can handle index/time vectors as appropriate.
Actually, the main difficulty here is that there could be one INDEX per Y vector, and we don't want such a potentially large list. Maybe we need to make a special case for it so that it is not stored, but recomputed as needed ? Clearly not a simple problem...
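
To make the buddyVar() idea concrete, here is a minimal sketch. Only the method name and its "INDEX" default come from the proposal; the std::string signature, the class names, and the "time_" naming scheme are assumptions for illustration and do not exist in kst.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for a data source base class; not kst code.
class SketchDataSource {
public:
    virtual ~SketchDataSource() = default;

    // Default: every field is plotted against the generic sample index.
    virtual std::string buddyVar(const std::string& field) const {
        assert(!field.empty());
        return "INDEX";
    }
};

// Hypothetical asynchronous source where each field has its own time vector.
class TimeStampedSource : public SketchDataSource {
public:
    std::string buddyVar(const std::string& field) const override {
        assert(!field.empty());
        return "time_" + field;  // assumed naming scheme, illustration only
    }
};
```

The caller would ask the source for the X vector that goes with a given Y field instead of assuming INDEX, which leaves the per-field index problem to the sources that actually have it.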
Comment 15 Netterfield 2006-06-19 17:57:17 UTC
So... I was also out of date on a few issues.  After all of this - The only remaining question is what to define for frameCount(INDEX) for data sources with fields of different numbers of frames.  And the problem shows up when reading to the end of the fields.  Hmmm...

One solution is:
another bug has requested the ability to make vectors from [scalar] to [scalar] with [scalar] points.  We could add to VCurves the ability to, if there is no INDEX field, create one of these vectors from the stats of the field being plotted.... I think this fixes all cases.
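
A small sketch of what such a generated vector could look like, assuming evenly spaced values. The function names generatedVector and fallbackIndex are hypothetical, and the only field statistic used is its sample count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: evenly spaced values from first to last, inclusive.
std::vector<double> generatedVector(double first, double last, int points) {
    assert(points >= 1);
    std::vector<double> values(static_cast<std::size_t>(points));
    if (points == 1) {
        values[0] = first;
        return values;
    }
    const double step = (last - first) / (points - 1);
    for (int i = 0; i < points; ++i) {
        values[static_cast<std::size_t>(i)] = first + step * i;
    }
    return values;
}

// Fallback X vector for a field with no INDEX: 0, 1, ..., sampleCount - 1.
std::vector<double> fallbackIndex(int sampleCount) {
    if (sampleCount <= 0) {
        return {};
    }
    return generatedVector(0.0, static_cast<double>(sampleCount - 1), sampleCount);
}
```

Because the vector is derived from the field being plotted, it is always exactly as long as that field, which sidesteps the question of what frameCount(INDEX) should return.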

Comment 16 Peter Kümmel 2010-08-14 14:12:09 UTC
Could still be open in Kst 1.
Comment 17 Andrew Crouthamel 2018-11-05 03:11:36 UTC
Dear Bug Submitter,

This bug has been stagnant for a long time. Could you help us out and re-test if the bug is valid in the latest version? I am setting the status to NEEDSINFO pending your response, please change the Status back to REPORTED when you respond.

Thank you for helping us make KDE software even better for everyone!
Comment 18 Andrew Crouthamel 2018-11-17 04:54:47 UTC
Dear Bug Submitter,

This is a reminder that this bug has been stagnant for a long time. Could you help us out and re-test if the bug is valid in the latest version? This bug will be moved back to REPORTED Status for manual review later, which may take a while. If you are able to, please lend us a hand.

Thank you for helping us make KDE software even better for everyone!
Comment 19 Justin Zobel 2022-12-14 03:09:04 UTC
Thank you for reporting this issue in KDE software. As it has been a while since this issue was reported, can we please ask you to see if you can reproduce the issue with a recent software version?

If you can reproduce the issue, please change the status to "REPORTED" when replying. Thank you!
Comment 20 Bug Janitor Service 2022-12-29 05:22:41 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 21 Bug Janitor Service 2023-01-13 05:13:34 UTC
This bug has been in NEEDSINFO status with no change for at least
30 days. The bug is now closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

Thank you for helping us make KDE software even better for everyone!