Version: 1.2.0_svn_480257 (using KDE KDE 3.4.2) OS: Linux Currently a dirfiles 'length' (number of frames) is defined by the first (valid) field in the format file. Traditionally (on BLAST at least) this field is called FORMAT and (most) other fields are in lowercase. The fact that the number of frames is defined by the first field is important so that data writing programs can write the first field last so that the race condition of not having some of the data written yet bug doesn't come to light. However, getdata sorts the raw field list (with qsort), so the first field in the format file isn't neccesarily the first field in the list. This can cause kst to either report the wrong number of frames available (if the number of samples per frame and/or size of the type differ) or the race condition mentioned above.
Created attachment 13486 [details] Patch to fix the above A patch by Don Wiebe to fix the above issue. I won't commit it until people have a look at to to make sure this is kosher. Note, the patch will apply, even though there have been some offsets introduced since the code we generated the fix from was made. The patch is relative to the datasources/dirfile/ directory.
(For "FORMAT" read "FASTSAMP" in Matt's report.)
I should point out that the supplied patch completely deprecates the third argument to GetNFrames (in_field) and will always look at the first valid raw field in the format file, whatever it is passed for that parameter. This is the behaviour Barth and I settled on when we hammered out getting defile (the BLAST dirfile writer) and kst to behave well together, but I don't know if this convention is universal. AFAICT, kst never passes a non-NULL value for in_field when calling GetNFrames, so it shouldn't care.
> 01:33 ------- I should point out that the supplied patch completely > deprecates the third argument to GetNFrames (in_field) and will always look > at the first valid raw field in the format file, whatever it is passed for > that parameter. In my dirfile library here: http://cmb.phys.cwru.edu/kisner/code/dirfile/ I always return the number of available samples for the desired field. This code is based on Barth's original getData, with lots of memory leak plugging and full "putData" functionality implemented by me. Frankly I think it is silly to assume that all fields will have the same number of samples on disk. > This is the behaviour Barth and I settled on when we hammered out getting > defile (the BLAST dirfile writer) and kst to behave well together, I still don't understand why you guys didn't just use my putData stuff for writing dirfiles- it does all of the reverse mapping from linterp/lincom/bit fields to the underlying RAW field. But whatever- I know from personal experience that sometimes everyone needs to reinvent a wheel or 3 ;-) > but I > don't know if this convention is universal. AFAICT, kst never passes a > non-NULL value for in_field when calling GetNFrames, so it shouldn't care. I certainly have used kst to plot dirfile directories where the fields in the format file all have different sizes. Is this situation formally unsupported? In my opinion, that removes much of the usefullness of the dirfile format. -Ted
On Tuesday 15 November 2005 20:05, Ted Kisner wrote: > > 01:33 ------- I should point out that the supplied patch completely > > deprecates the third argument to GetNFrames (in_field) and will always > > look at the first valid raw field in the format file, whatever it is > > passed for that parameter. > > In my dirfile library here: > > http://cmb.phys.cwru.edu/kisner/code/dirfile/ > > I always return the number of available samples for the desired field. > This code is based on Barth's original getData, with lots of memory leak > plugging and full "putData" functionality implemented by me. Frankly I > think it is silly to assume that all fields will have the same number of > samples on disk. > > > This is the behaviour Barth and I settled on when we hammered out getting > > defile (the BLAST dirfile writer) and kst to behave well together, > > I still don't understand why you guys didn't just use my putData stuff for > writing dirfiles- it does all of the reverse mapping from > linterp/lincom/bit fields to the underlying RAW field. But whatever- I > know from personal experience that sometimes everyone needs to reinvent a > wheel or 3 ;-) If you have some time to hook this into a datasource we could definitely give it a try too. The current one is just Kst legacy... it was there and just got poked around as needed.
On Tuesday 15 November 2005 17:17, George Staikos wrote: >
On Tue, Nov 15, 2005 at 05:05:11PM -0800, Ted Kisner wrote: > I still don't understand why you guys didn't just use my putData stuff for > writing dirfiles- it does all of the reverse mapping from linterp/lincom/bit > fields to the underlying RAW field. But whatever- I know from personal > experience that sometimes everyone needs to reinvent a wheel or 3 ;-) BLAST's dirfile writer (what became 'defile') was written when dirfiles were invented, so it's pretty old. > > but I > > don't know if this convention is universal. AFAICT, kst never passes a > > non-NULL value for in_field when calling GetNFrames, so it shouldn't care. > > I certainly have used kst to plot dirfile directories where the fields in the > format file all have different sizes. Is this situation formally > unsupported? In my opinion, that removes much of the usefullness of the > dirfile format. AFAIK (someone please correct me if I'm wrong), kst assumes that the dirfile has a uniform number of frames, and if kst tries to read more data than a raw field has it enters a race (and consumes all your cpu).
> I still don't understand why you guys didn't just use my putData stuff for > writing dirfiles- it does all of the reverse mapping from linterp/lincom/bit > fields to the underlying RAW field. But whatever- I know from personal > experience that sometimes everyone needs to reinvent a wheel or 3 ;-) I had no desire to reinvent anything when I wrote defile. We hacked the dirfile writer that Barth wrote when he first wrote getdata out of the BLAST flight code because we didn't want it there anymore and repackaged it as a groundstation app. Had I known putData existed at the time, I might have looked into it, but I didn't. That aside, speaking as a getdata user, if these bugs we're encountering are already fixed, it would be really nice if they could be incorporated into the getdata trunk so we don't have to waste time re-fixing them.
I know they're not all your fault, but these sprintf() are dangerous. They should use snprintf instead. There is a security impact in the sense that we hit files on remote sites with KIO so there is a possibility for someone to cause an overflow. I would approve the patch with those fixed.
On Tuesday 15 November 2005 17:57, D.V.Wiebe wrote: > I had no desire to reinvent anything when I wrote defile.
SVN commit 480677 by truch: Apply the patch to fix the 'getdata borked' bug for now. BUG: 116461 M +19 -23 getdata.c M +1 -0 getdata_struct.h --- trunk/extragear/graphics/kst/kst/datasources/dirfile/getdata.c #480676:480677 @@ -287,13 +287,15 @@ /* */ /***************************************************************************/ struct FormatType *GetFormat(const char *filedir, int *error_code) { - int i_format; + int i_format, i; + struct stat statbuf; FILE *fp; char format_file[MAX_FILENAME_LENGTH+6]; char instring[MAX_LINE_LENGTH]; struct FormatType *F; char in_cols[MAX_IN_COLS][MAX_LINE_LENGTH]; int n_cols; + char raw_data_filename[MAX_FILENAME_LENGTH+FIELD_LENGTH+2]; /** first check to see if we have already read it **/ for (i_format=0; i_format<Formats.n; i_format++) { @@ -312,7 +314,7 @@ F = Formats.F+Formats.n-1; /***** open the format file (if there is one) ******/ - sprintf(format_file,"%s/format", filedir); + snprintf(format_file, MAX_FILENAME_LENGTH+6, "%s/format", filedir); fp = fopen(format_file, "r"); if (fp == NULL) { *error_code = GD_E_OPEN_FORMAT; @@ -406,6 +408,15 @@ } /** Now sort the lists */ if (F->n_raw > 1) { + for (i=0; i<F->n_raw; i++) { + snprintf(raw_data_filename, MAX_FILENAME_LENGTH+FIELD_LENGTH+2, + "%s/%s", filedir, F->rawEntries[i].field); + if (stat(raw_data_filename, &statbuf) >=0) { + F->first_field = F->rawEntries[i]; + break; + } + } + qsort(F->rawEntries, F->n_raw, sizeof(struct RawEntryType), RawCmp); } @@ -891,7 +902,8 @@ /** open the file (and cache the fp) if it hasn't been opened yet. */ if (R->fp <0) { - sprintf(datafilename, "%s/%s", F->FileDirName, field_code); + snprintf(datafilename, MAX_FILENAME_LENGTH+FIELD_LENGTH + 1, + "%s/%s", F->FileDirName, field_code); R->fp = open(datafilename, O_RDONLY); if (R->fp<0) { *n_read = 0; @@ -1682,9 +1694,6 @@ char raw_data_filename[MAX_FILENAME_LENGTH+FIELD_LENGTH+2]; struct stat statbuf; int nf; - char field_buf[80]; - const char *field; - int i; *error_code = GD_E_OK; @@ -1706,28 +1715,15 @@ return(0); } - if (in_field == NULL) { - /* check for first valid raw field */ - for (i=0; i<F->n_raw; i++) { - sprintf(raw_data_filename,"%s/%s", filename, F->rawEntries[i].field); - if (stat(raw_data_filename, &statbuf) >=0) { - strncpy(field_buf, F->rawEntries[i].field, 79); - i = F->n_raw; - } - } - field = field_buf; - } else { - field = in_field; - } - - sprintf(raw_data_filename,"%s/%s", filename, field); + /* load the first valid raw field */ + snprintf(raw_data_filename, MAX_FILENAME_LENGTH+FIELD_LENGTH+2, + "%s/%s", filename, F->first_field.field); if (stat(raw_data_filename, &statbuf) < 0) { return(0); } nf = statbuf.st_size/ - (F->rawEntries[0].size*F->rawEntries[0].samples_per_frame); - + (F->first_field.size*F->first_field.samples_per_frame); nf += F->frame_offset; nf -= 2; --- trunk/extragear/graphics/kst/kst/datasources/dirfile/getdata_struct.h #480676:480677 @@ -69,6 +69,7 @@ struct FormatType { char FileDirName[MAX_FILENAME_LENGTH]; int frame_offset; + struct RawEntryType first_field; struct RawEntryType *rawEntries; int n_raw; struct LincomEntryType *lincomEntries;