Table of Contents
- Finish creation of tests for all important configurations
- Make it easier to insert additional step/remove a step
- Auto generate the list of parameters that you need
- Better error reporting on input
- Monitoring of progress
- Monitoring of success and resource usage
- Make it transparent/unimportant for the user which backend is actually used
- Transparent server to stage data
- Restart from specific step
- Store pilot job id in the task in the database
- Like to have a 'heartbeat' for jobs
- Add putFile -force to manual
- Enable easy merging of workflows, merging of parameters
- Get rid of parameters.csv and instead create worksheet
- Cleanup backend specific protocols
- Visualization framework for analyses runs
- Rearchitecture the components
- Stable JPA
- Make submit.sh use a 'lock' file so jobs can only end when all is submitted
- Separate protocols from code
- Users for runs, priorities?
- Publish!
- Approach
Compute Roadmap
This page describes plans to make Compute even better.
NB: the features below still have to be put on a release schedule!
Finish creation of tests for all important configurations
- local, pbs, grid
- impute, align
Make it easier to insert additional step/remove a step
- requires separation between 'per-protocol input/output parameters' and 'workflow links'
- should not be extra work; could do automatic mapping?
- should be solved in the workflow.csv (instead of control flow, do data flow, i.e. create a list of output-input edges; a sketch follows below)
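A purely illustrative sketch of what such an output-input edge list in workflow.csv could look like; the column names, step names and parameter names below are made up, not an existing format:
{{{
fromStep,outputParameter,toStep,inputParameter
align,alignedBam,sort,inputBam
sort,sortedBam,index,inputBam
}}}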
Auto generate the list of parameters that you need
- could automatically be filled from the templates or, if we have it, the data flow? (a sketch follows below)
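A minimal bash sketch of harvesting the parameter list from the templates themselves, assuming (purely for illustration) that the protocols live in protocols/*.sh and use ${name}-style placeholders:
{{{
#!/bin/bash
# List every ${...} placeholder used in the protocol templates as a draft
# parameter list; the path and placeholder syntax are assumptions.
grep -ohE '\$\{[A-Za-z_][A-Za-z0-9_]*\}' protocols/*.sh | tr -d '${}' | sort -u
}}}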
Better error reporting on input
- currently doesn't list which variable is missing
- or which templates are missing
- add syntax checking of the CSV files
Monitoring of progress
- how far along is the analysis
- which steps finished successfully or failed, while the run is still going (could be done with the #end macro)
- add a job at the end that creates a report of results (for the command line; a sketch follows after this list)
- Also do this for the database version, so this info is available in case of database problems.
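A minimal sketch of such an end-of-run report job, assuming (purely for illustration) that the generated step scripts are the *.sh files in the run directory and that each step leaves a <step>.finished or <step>.failed marker; neither convention is an existing Compute feature:
{{{
#!/bin/bash
# Summarize run progress from hypothetical marker files.
total=$(ls *.sh 2>/dev/null | wc -l)
finished=$(ls *.finished 2>/dev/null | wc -l)
failed=$(ls *.failed 2>/dev/null | wc -l)
echo "progress: $finished/$total steps finished, $failed failed"
}}}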
Monitoring of success and resource usage
- have a harmonized method to report 'success' or 'error', incl. message + runtime
- also include stuff like max, min, etc. (a sketch follows below)
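A sketch of what such a harmonized status record could look like when wrapping a single step; step1.sh, run.status and the line format are illustrative only:
{{{
#!/bin/bash
# Wrap one step and append a uniform success/error record including runtime.
start=$(date +%s)
if bash step1.sh 2> step1.err; then
  status=success; message=ok
else
  status=error; message=$(tail -n 1 step1.err)
fi
runtime=$(( $(date +%s) - start ))
echo "step=step1 status=$status runtime=${runtime}s message=$message" >> run.status
}}}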
Make it transparent/unimportant for the user which backend is actually used
- system decides where it (can) run: cluster, pbs, grid, local
- needs flexible file manager that can 'stage' data for any backend
- would like to be able to restart on another backend
Transparent server to stage data
- to easily move a pipeline to other storage.
Restart from specific step
- Remove files for the step from which to restart as well as for all steps that depend on that one.
- Reduce the problem by writing to a *.tmp file and only 'mv'-ing it to its final name if the step is successful (as automatically determined by Compute; see the sketch after this list)
- However, this doesn't work if we want to restart from a step that was successful as automatically determined by compute, but has failed as manually determined by a user.
- Could use folders per step, so you could more easily delete all output from a specific step onwards
- Can be solved by good practice -> Compute should assist users as much as possible to standardize good practice and make it as easy as possible.
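A minimal illustration of the *.tmp idea inside a generated step script; the analysis command and file names are placeholders:
{{{
#!/bin/bash
set -e                                          # abort on the first error
# 'myAnalysis' and the file names are placeholders
myAnalysis --in input.vcf --out result.txt.tmp  # write to a temporary name first
mv result.txt.tmp result.txt                    # only reached when the step succeeded
}}}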
Store pilot job id in the task in the database
- I need to know which pilot jobs have died, and which tasks were associated with them
- Then I can re-release those tasks so they can be done by another pilot
- Alternative: jobs send heartbeats to the server every 5-10 minutes while the actual analysis is running in the background
Like to have a 'heartbeat' for jobs
- so I can be sure a (pilot) job is still alive
- could use a 'background' process that pings back to the database (see the sketch below)
- could also be used for pbs jobs
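A rough sketch of such a heartbeat, assuming the server exposes some endpoint to report to; the URL, JOB_ID and step1.sh below are placeholders:
{{{
#!/bin/bash
# Start a background loop that reports "still alive" every 5 minutes while the
# real analysis runs in the foreground.
(
  while true; do
    curl -s -X POST "http://compute.example.org/api/heartbeat?job=$JOB_ID" > /dev/null
    sleep 300
  done
) &
heartbeat_pid=$!

bash step1.sh            # the actual analysis (placeholder)
kill "$heartbeat_pid"    # stop the heartbeat when the step is done
}}}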
Add putFile -force to manual
Enable easy merging of workflows, merging of parameters
- Easily combine protocols from multiple workflows
- want fewer parameter files
- meanwhile allow multiple worksheets
Get rid of parameters.csv and instead create worksheet
- so parameter names on first row
- hasOne using the naming scheme A_B, meaning B has one A
- conclusion: use multiple headers (a sketch follows below)
- allow -parameters and -tparameter
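One possible reading of such a worksheet, purely as illustration: parameter names on the first row, and a column project_sample using the A_B scheme to say that each sample has one project (all column names and values below are made up):
{{{
sample,project_sample,inputFile
s1,projectX,/data/s1.fq
s2,projectX,/data/s2.fq
s3,projectY,/data/s3.fq
}}}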
Cleanup backend specific protocols
- e.g. 'touch' commands
Visualization framework for analyses runs
Rearchitecture the components
- one interface, multiple implementations
- unit tests
- Can we do something with unit tests per protocol?
Stable JPA
Make submit.sh use a 'lock' file so jobs can only end when all is submitted
- Problem: the first job fails quickly while many dependent jobs are not yet submitted, and those get orphaned
- so dependent jobs can still be submitted and never have this issue (alex feature)
- add this in the #end macro so jobs can never finish until the complete workflow is submitted
- After all jobs are submitted, the lock file will be removed (see the sketch below)
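A rough sketch of the lock idea with illustrative file names; the first part would go in submit.sh, the second part in the #end macro that closes every generated job:
{{{
#!/bin/bash
# --- in submit.sh ---
touch workflow.lock             # create the lock before submitting anything
# ... submit all jobs here (qsub, sbatch, ...) ...
rm workflow.lock                # remove the lock once the whole workflow is submitted

# --- added to every generated job via the #end macro ---
while [ -f workflow.lock ]; do
  sleep 10                      # do not finish until the complete workflow is submitted
done
}}}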
Separate protocols from code
- yes, a separate github repo
- should enable combining multiple protocol folders and multiple parameter files
- should indicate with which Compute version it works
Users for runs, priorities?
- needed if we want 'priority queue' for pilot jobs
Publish!
Approach
- Clean start = yes
- Separate development of v5 from v4
- Bug fixes for v4 = yes
- Database, commandline or both = both
- Release schedule -> roadmap
- Backwards compatibility = no