Version 5 (modified 12 years ago)
Table of Contents
- Finish creation of tests for all important configurations
- Make it easier to insert additional step/remove a step
- Auto generate the list of parameters that you need
- Better error reporting on input
- Monitoring of progress
- Monitoring of success and resource usage
- Make it transparent/unimportant for the user which backend is actually used
- Transparent server to stage data
- Restart from specific step
- Store pilot job id in the task in the database
- Like to have a 'heartbeat' for jobs
- Add putFile -force to manual
- Enable easy merging of workflows, merging of parameters
- Get rid of parameters.csv and instead create worksheet
- Cleanup backend specific protocols
- Visualization framework for analyses runs
- Re-architect the components
- Make submit.sh use a 'lock' file so the jobs can only end when all is …
- Separate protocols from code
- Users for runs, priorities?
- Publish!
- Approach
Compute Roadmap
This page describes plans to make Compute even better.
NB: the features below still have to be put on a release schedule!
Finish creation of tests for all important configurations
- local, pbs, grid
- impute, align
Make it easier to insert additional step/remove a step
- requires separation between 'per protocol input/output parameters' and 'workflow links'
- should not be extra work; could do automatic mapping?
- should be solved in the workflow.csv (instead of control flow, do data flow, i.e. create list of output-input edges)
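To illustrate the data-flow idea, here is a minimal sketch (not the Compute implementation). It assumes the workflow is given as a list of (producer step, consumer step) edges, which is a hypothetical representation; the step name "qc" is made up for the example. The execution order is derived from the edges, so inserting a step only means changing edges.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def step_order(edges):
    """Return an execution order for steps, given (producer, consumer) edges."""
    graph = {}  # maps each step to the set of steps it depends on
    for producer, consumer in edges:
        graph.setdefault(producer, set())
        graph.setdefault(consumer, set()).add(producer)
    return list(TopologicalSorter(graph).static_order())


# Original workflow: the output of 'align' feeds 'impute'.
before = [("align", "impute")]

# Inserting a hypothetical 'qc' step: replace one edge by two.
after = [("align", "qc"), ("qc", "impute")]

print(step_order(before))
print(step_order(after))
```

Removing a step is the reverse operation: drop its two edges and link its producer directly to its consumer.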
Auto generate the list of parameters that you need
- could automatically be filled from templates or, if we have it, the data flow?
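A rough sketch of how the parameter list could be collected from a template. It assumes placeholders are written as `${name}`; that syntax, the function name and the example command are assumptions for illustration only.

```python
import re

# Assumed placeholder syntax: ${parameterName}
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def required_parameters(template_text):
    """Return the sorted, de-duplicated parameter names used in a template."""
    return sorted(set(PLACEHOLDER.findall(template_text)))


# Hypothetical protocol template.
template = "align --ref ${reference} --in ${fastq} --out ${bam}\ncp ${bam} ${resultDir}\n"
print(required_parameters(template))
```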
Better error reporting on input
- doesn't list which variable is missing
- templates are missing
- syntax checking of CSV files
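The two checks above could look roughly like the sketch below: one reports exactly which parameters are missing, the other does a basic syntax check on a CSV file (every row has as many fields as the header). Function names and messages are hypothetical.

```python
import csv
import io


def missing_parameters(required, provided):
    """Return the names that are required but not provided, sorted."""
    return sorted(set(required) - set(provided))


def check_csv(text):
    """Return a list of readable problems found in CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    header = rows[0]
    problems = []
    # Row 1 is the header, so data rows are counted from 2.
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: expected {len(header)} fields, found {len(row)}"
            )
    return problems


print(missing_parameters(["reference", "fastq"], ["fastq"]))
print(check_csv("name,value\nreference,/data/ref\nfastq\n"))
```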
Monitoring of progress
- how far is the analysis
- successfully or wrongly, during running (could be done with #end macro)
- add a job at the end that creates report of results (for commandline)
- Also do this for the database version, to have this info in case of database problems.
Monitoring of success and resource usage
- have harmonized method to report 'success' or 'error', incl message + runtime
- include also stuff like max, min etc.
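One possible shape for a harmonized report, as a sketch only: each step reports status, message and runtime in the same record, and a summary adds min/max. The field names and the record layout are assumptions, not an existing Compute format.

```python
from dataclasses import dataclass


@dataclass
class StepReport:
    step: str
    status: str       # "success" or "error"
    message: str
    runtime_s: float  # wall-clock runtime in seconds


def summarize(reports):
    """Aggregate per-step reports into one summary dictionary."""
    runtimes = [r.runtime_s for r in reports]
    return {
        "steps": len(reports),
        "errors": [r.step for r in reports if r.status == "error"],
        "min_runtime_s": min(runtimes, default=None),
        "max_runtime_s": max(runtimes, default=None),
        "total_runtime_s": sum(runtimes),
    }


reports = [
    StepReport("align", "success", "", 12.0),
    StepReport("impute", "error", "input file not found", 3.0),
]
print(summarize(reports))
```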
Make it transparent/unimportant for the user which backend is actually used
- system decides where it (can) run: cluster, pbs, grid, local
- needs flexible file manager that can 'stage' data for any backend
- like to restart on other backend
Transparent server to stage data
- to easily move pipeline to other storage.
Restart from specific step
- remove files so far
- reduce problem by using *tmp file and only 'mv' if step successful
- however, this doesn't work if we want to restart
- could use folders per step, so you could delete the folders from step onwards
- can be solved by good practice
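A small sketch of both ideas above: write to a tmp file and only move it into place on success, and keep one folder per step so everything from a given step onwards can be deleted before a restart. In the generated scripts this would be shell ('mv', 'rm -r'); the Python below only shows the logic, and the function names and folder layout are assumptions.

```python
import os
import shutil
from pathlib import Path


def write_atomically(path, data):
    """Write to '<path>.tmp' first; move into place only when writing succeeded."""
    tmp = Path(str(path) + ".tmp")
    tmp.write_text(data)
    os.replace(tmp, path)  # same effect as 'mv tmp path'


def clear_from_step(run_dir, steps, restart_step):
    """Delete the folders of restart_step and all later steps; return what was removed."""
    removed = []
    for step in steps[steps.index(restart_step):]:
        folder = Path(run_dir) / step
        if folder.exists():
            shutil.rmtree(folder)
            removed.append(step)
    return removed
```

Usage would be something like `clear_from_step("run01", ["align", "impute"], "impute")`, after which the workflow is resubmitted from 'impute'.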
Store pilot job id in the task in the database
- I need to know which pilot jobs have died, and which tasks were associated with it
- Then I can re-release tasks so they can be done by another pilot
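As a sketch of what storing the pilot job id makes possible: once each task records its pilot, the tasks of dead pilots can be found and released again. The task fields ("name", "status", "pilot_id") and the status values are hypothetical; a real version would be a database update rather than a list of dictionaries.

```python
def release_orphaned_tasks(tasks, dead_pilots):
    """Reset running tasks whose pilot has died; return the names of released tasks."""
    released = []
    for task in tasks:
        if task["status"] == "running" and task["pilot_id"] in dead_pilots:
            task["status"] = "ready"  # can be picked up by another pilot
            task["pilot_id"] = None
            released.append(task["name"])
    return released


tasks = [
    {"name": "align_s1", "status": "running", "pilot_id": "p1"},
    {"name": "align_s2", "status": "running", "pilot_id": "p2"},
    {"name": "align_s3", "status": "done", "pilot_id": "p1"},
]
print(release_orphaned_tasks(tasks, dead_pilots={"p1"}))
```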
Like to have a 'heartbeat' for jobs
- so I can be sure a (pilot) job is still alive
- could use a 'background' process that pings back to database
- could also be used for pbs jobs
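A minimal sketch of the heartbeat idea, using a background thread instead of a separate process. The `ping` callback stands in for "ping back to database" and is an assumption; so are the function names and the timeout rule.

```python
import threading
import time


def start_heartbeat(ping, interval_s):
    """Call ping(timestamp) every interval_s seconds until the returned event is set."""
    stop = threading.Event()

    def beat():
        # Event.wait returns False on timeout, True once stop is set.
        while not stop.wait(interval_s):
            ping(time.time())

    thread = threading.Thread(target=beat, daemon=True)
    thread.start()
    return stop, thread


def is_alive(last_ping, now, timeout_s):
    """A job counts as alive if its last ping is at most timeout_s old."""
    return last_ping is not None and now - last_ping <= timeout_s
```

The job would call `start_heartbeat` when it starts and set the returned event when it finishes; the server side only needs the last ping time per job to decide with `is_alive` whether a pilot (or pbs job) has died.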
Add putFile -force to manual
Enable easy merging of workflows, merging of parameters
- Easily combine protocols from multiple workflows
- want fewer parameter files
- meanwhile allow multiple worksheets
Get rid of parameters.csv and instead create worksheet
- so parameter names on first row
- hasOne using naming scheme A_B, means B has one A
- conclusion: use multiple headers.
- allow -parameters and -tparameter
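A sketch of reading such a worksheet, assuming it is a plain CSV file with the parameter names on the first row and one row per run of the workflow. The column names below are made up.

```python
import csv
import io


def read_worksheet(text):
    """Read a worksheet: first row holds parameter names, each later row one set of values."""
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical worksheet with two rows.
worksheet = "sample,lane\nS1,1\nS2,2\n"
for row in read_worksheet(worksheet):
    print(row["sample"], row["lane"])
```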
Cleanup backend specific protocols
- e.g. 'touch' commands
Visualization framework for analyses runs
Re-architect the components
- one interface, multiple implementations
- unit tests
- Can we do something with unit tests per protocol?
Make submit.sh use a 'lock' file so the jobs can only end when all is submitted
- Problem is that the first jobs fail quickly, many dependent jobs are not yet submitted, and get orphaned
- so dependent jobs can be submitted and never have issue (alex feature)
- add this in the #end macro so jobs can never finish until the complete workflow is submitted
- After all jobs are submitted lock file will be removed
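The waiting part of this could work roughly as in the sketch below: at the end of a job, wait until the lock file created by submit.sh has been removed. In practice this would be a few lines of shell in the #end macro; the Python only shows the logic, and the function name, polling interval and timeout are assumptions.

```python
import os
import time


def wait_for_lock_release(lock_path, poll_s=1.0, timeout_s=None):
    """Block until lock_path no longer exists. Return False if timeout_s passes first."""
    deadline = None if timeout_s is None else time.monotonic() + timeout_s
    while os.path.exists(lock_path):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
    return True
```

A timeout is worth considering so that jobs do not hang forever if submit.sh itself dies before removing the lock file.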
Separate protocols from code
- yes, separate github repo
- should enable combining multiple protocol folders, multiple parameter files
- should indicate at which compute version it works
Users for runs, priorities?
- needed if we want 'priority queue' for pilot jobs
Publish!
Approach
- Clean start = yes
- separate development 5 from using 4 (bug fixes) = yes
- database, commandline or both = both
- release schedule -> roadmap
- backwards compatibility = no