Bug 389738 - .esq files can't be run again by scheduler if prior captures exist
Status: RESOLVED FIXED
Alias: None
Product: kstars
Classification: Applications
Component: general
Version: 2.9.2
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Jasem Mutlaq
Keywords: investigated, triaged
Reported: 2018-02-01 07:30 UTC by schwim
Modified: 2018-09-19 14:38 UTC

Attachments
Example .esq file (1.09 KB, application/xml)
2018-02-01 07:30 UTC, schwim

Description schwim 2018-02-01 07:30:52 UTC
Created attachment 110276
Example .esq file

Confirmed in latest pull as of 31 Jan 2018. I think this goes back to prior releases.

If an .esq file is run more than once by the scheduler, the first job succeeds but all subsequent jobs quit, stating they are already complete. The scheduler log window shows the job as "in progress", then immediately shows "Complete".

The debug logs show:

[2018-02-01T00:19:57.275 MST DEBG ][   org.kde.kstars.ekos.capture] - Preparing capture job "Light_L_60_secs_ISO8601" for execution.
[2018-02-01T00:19:57.275 MST DEBG ][   org.kde.kstars.ekos.capture] - Job "Light_L_60_secs_ISO8601" already complete.
[2018-02-01T00:19:57.275 MST DEBG ][   org.kde.kstars.ekos.capture] - All capture jobs complete.

If the original output FITS files are deleted, the job can be run again.
Comment 1 Jasem Mutlaq 2018-02-01 10:09:00 UTC
Git commit 6b91b0c249948e62741c4630d0d4fc45a6ab60a4 by Jasem Mutlaq.
Committed on 01/02/2018 at 10:07.
Pushed by mutlaqja into branch 'master'.

Keep track of identical scheduler jobs with identical sequences to avoid marking all identical jobs as complete.

M  +15   -0    kstars/ekos/scheduler/scheduler.cpp
M  +2    -0    kstars/ekos/scheduler/scheduler.h

https://commits.kde.org/kstars/6b91b0c249948e62741c4630d0d4fc45a6ab60a4
Comment 2 Jasem Mutlaq 2018-02-01 10:50:11 UTC
Actually, that solution is not complete and can have issues. I have another proposal. Would it be OK if we make the path like this:

TargetName/Job_#/Light/OIII/..etc

That is, after the target name, each job gets a subdirectory containing the job number. This way the path is always unique. What do you think?
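
A minimal sketch of the proposed layout, written in Python for brevity rather than taken from the KStars C++ code. The function name `capture_dir`, its parameters, the `/data` root and the `M31` target are all hypothetical; the point is only that a per-job subdirectory makes otherwise identical sequences resolve to distinct paths.

```python
from pathlib import PurePosixPath


def capture_dir(root, target, job_number, frame_type, filter_name):
    """Build root/TargetName/Job_#/FrameType/Filter (hypothetical layout)."""
    return PurePosixPath(root) / target / f"Job_{job_number}" / frame_type / filter_name


# Two scheduler jobs that use the same target and the same sequence.
first = capture_dir("/data", "M31", 1, "Light", "OIII")
second = capture_dir("/data", "M31", 2, "Light", "OIII")

print(first)
print(second)
```

The file names inside each directory would still be identical under this scheme; only the directories differ.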
Comment 3 schwim 2018-02-01 16:54:36 UTC
(In reply to Jasem Mutlaq from comment #2)
> Actually, that solution is not complete and can have issues. I have another
> proposal. Would it be OK if we make the path like this:
> 
> TargetName/Job_#/Light/OIII/..etc
> 
> That is, after the target name, each job gets a subdirectory containing the job
> number. This way the path is always unique. What do you think?

That would work, however it would induce a bit of work when collecting all the data for processing (need to descend into multiple directory trees to get all the files pulled together).

Presuming it's a matter of having a unique filename for each image, I have a few alternatives:

 1) Require that TS be on for file naming and rely on that -or-
 2) Append a part of the job name to the file -or-
 3) Use a system-wide counter for all image saves 

I've seen 1&3 in use elsewhere and they work well. In the case of #3, the user can maybe even set the numbering if they wish. Example - append 4 digits at the end of each image, start at 1234 and increment, never repeat unless reset.

Probably good to have some corner case collision avoidance as well. In the rare instance a collision does occur, iterate to the next for the filename. #1 and #3 should both accommodate this.
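
Option 3 with the collision avoidance described above could look roughly like the following. This is a sketch under assumptions, not Ekos code: the helper `next_capture_path`, the `.fits` suffix and the four-digit padding are illustrative choices, and the caller is assumed to persist the returned counter between sessions.

```python
import tempfile
from pathlib import Path


def next_capture_path(directory, prefix, counter, digits=4):
    """Return (path, next_counter) for the first unused counter value >= counter."""
    directory = Path(directory)
    while True:
        candidate = directory / f"{prefix}_{counter:0{digits}d}.fits"
        counter += 1
        if not candidate.exists():
            # No collision: use this name and hand back the advanced counter.
            return candidate, counter


with tempfile.TemporaryDirectory() as tmp:
    # Simulate a file left over from an earlier run that used the same number.
    (Path(tmp) / "Light_L_60_secs_1234.fits").touch()
    path, counter = next_capture_path(tmp, "Light_L_60_secs", 1234)
    chosen_name = path.name
    print(chosen_name, counter)
```

Because the counter only moves forward, a name is never reused unless the user resets the starting value.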
Comment 4 schwim 2018-02-01 16:56:20 UTC
(In reply to Jasem Mutlaq from comment #2)
> Actually, that solution is not complete and can have issues. I have another
> proposal. Would it be OK if we make the path like this:
> 
> TargetName/Job_#/Light/OIII/..etc
> 
> That is, after the target name, each job gets a subdirectory containing the job
> number. This way the path is always unique. What do you think?

Actually, thinking more on this, it may not work that well. Consider processing the data in PixInsight. Presuming the file names are the same but in different directories, you'd end up with problems when the tools save them out. Best to have each file have a unique name.
Comment 5 Jasem Mutlaq 2018-02-01 20:11:57 UTC
OK, I need to think more about this now. The solution as applied already solves the issue at the scheduler level, but as soon as the job goes to capture, it checks the directory and finds that all the required images are already captured. It does not know the number of other jobs with the exact same path. So I'll see if there is a graceful way to resolve this.
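
The behavior described here can be reproduced with a small model. This is a simplified illustration in Python, not the actual capture module: the function `is_sequence_complete` and the file naming pattern are assumptions, chosen only to show why a second job with the same path looks finished before it starts.

```python
import tempfile
from pathlib import Path


def is_sequence_complete(directory, prefix, required_count):
    """Treat a sequence as done when enough matching files already exist."""
    existing = len(list(Path(directory).glob(f"{prefix}_*.fits")))
    return existing >= required_count


with tempfile.TemporaryDirectory() as tmp:
    before_first_run = is_sequence_complete(tmp, "Light_L_60_secs", 3)

    # The first scheduler job captures its three frames.
    for i in range(3):
        (Path(tmp) / f"Light_L_60_secs_{i:03d}.fits").touch()

    # A second, identical job points at the same directory and prefix,
    # so the check counts the first job's files as its own.
    second_job_skipped = is_sequence_complete(tmp, "Light_L_60_secs", 3)
    print(before_first_run, second_job_skipped)
```

The check has no notion of which job produced a file, which matches the observation that deleting the earlier output lets the job run again.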
Comment 6 schwim 2018-02-01 20:54:03 UTC
(In reply to Jasem Mutlaq from comment #5)
> OK, I need to think more about this now. The solution as applied already solves
> the issue at the scheduler level, but as soon as the job goes to capture, it
> checks the directory and finds that all the required images are already
> captured. It does not know the number of other jobs with the exact same path.
> So I'll see if there is a graceful way to resolve this.

Yes, this would make multi-night imaging with the same .esq files a breeze. .esq re-use would be very handy to have indeed - basically they become capture profiles that are orchestrated by the scheduler.
Comment 7 Jasem Mutlaq 2018-02-08 06:29:11 UTC
Can you check with latest GIT if this issue is resolved?
Comment 8 schwim 2018-02-10 18:49:50 UTC
(In reply to Jasem Mutlaq from comment #7)
> Can you check with latest GIT if this issue is resolved?

Built and testing in between imaging. Standby...
Comment 9 schwim 2018-02-10 23:29:08 UTC
(In reply to schwim from comment #8)
> (In reply to Jasem Mutlaq from comment #7)
> > Can you check with latest GIT if this issue is resolved?
> 
> Built and testing in between imaging. Standby...

Here's what I'm seeing for two different job completion conditions:

 - Sequence Completion: Can re-run the job over and over - GOOD

 - Multiple jobs with a mix of the two conditions: seems to work fine - GOOD

 - Repeat for _N_ runs: will repeat N times as specified. An attempt to re-run this job (by restarting it) fails with a "No valid jobs found, aborting..." message. If you edit the job and change N (e.g. to N+1), it will run that new number of times. If you try to run it again, the same failure occurs. If you run the job several times to Sequence Completion as above, then switch to Repeat for ___ runs, it will run.

 - Running one "N runs" job, deleting it, then re-adding an identical job gives an "observation job is already complete" message.

One thing I noticed is that if you have multiple jobs with "Repeat for __ runs", they all have to repeat the same number of times. You can't have one job repeat 2 times and another repeat 3 times. I also noticed you can't edit the priority of a job once it is in the list. These are probably separate items to be tracked. I'll open bugs on those two independently if you prefer.
Comment 10 schwim 2018-02-14 03:17:13 UTC
More testing. I decided a video of what's going on is best. In short, a saved .esq is loaded. The save location for each image is an empty directory. However, this job had been run previously, and all of its files had since been moved or deleted. The sequence job will start and the 1st job will run, but all others will move to completed. Resetting the status on them at that point allows them all to run.

https://www.dropbox.com/s/gm909b7c8smg8gf/seq_oddity.mov?dl=0

...and a logfile:

https://www.dropbox.com/s/r9tuxzwm77ofw8w/seq_oddity.txt?dl=0
Comment 11 Jasem Mutlaq 2018-02-14 11:30:56 UTC
I think the only way for now to resolve the problem in the video is by setting "Remember Job Progress" to false in the Ekos settings.
Comment 12 TallFurryMan 2018-09-03 10:53:29 UTC
Please review this report with 2.9.8.
In this version, Scheduler will honor "Remember Job Progress" by counting captures already present in the storage, and executing what is needed and only that. If "Remember Job Progress" is disabled, sequences will run unconditionally.
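
As a rough model of the rule just described (not the Scheduler's actual implementation; the function name and parameters are hypothetical), the number of frames left to capture could be computed like this:

```python
def frames_to_capture(required, already_on_disk, remember_job_progress):
    """Return how many frames a sequence still needs to take.

    With "Remember Job Progress" enabled, captures already in storage count
    toward the total; with it disabled, the sequence runs in full.
    """
    if not remember_job_progress:
        return required
    # Never go negative if more files exist than the sequence asks for.
    return max(required - already_on_disk, 0)


print(frames_to_capture(10, 4, True))
print(frames_to_capture(10, 4, False))
```

Under this model, re-running an .esq against a directory that already holds a full set of captures does nothing with the option enabled and repeats the whole sequence with it disabled.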
Comment 13 Andrew Crouthamel 2018-09-19 14:38:00 UTC
This bug has had its resolution changed, but accidentally has been left in NEEDSINFO status. I am thus closing this bug and setting the status as RESOLVED to reflect the resolution change.