[[TOC()]]

= Compute Roadmap =

This page describes plans to make Compute even better. NB: the features below still have to be put on a release schedule!

== Finish creation of tests for all important configurations ==
 * local, pbs, grid
 * impute, align

== Make it easier to insert an additional step or remove a step ==
 * requires separation between 'per protocol input/output parameters' and 'workflow links'
 * should not require extra work; could we do automatic mapping?
 * should be solved in workflow.csv (instead of control flow, describe data flow, i.e. create a list of output-input edges)

== Auto-generate the list of parameters that you need ==
 * could automatically be filled from the templates or, if we have it, from the data flow?

== Better error reporting on input ==
 * currently doesn't list which variable is missing
 * doesn't report missing templates
 * add syntax checking of the CSV files

== Monitoring of progress ==
 * how far along is the analysis?
 * did it finish successfully or wrongly, while still running (could be done with the #end macro)
 * add a job at the end that creates a report of the results (for the command line)
 * also do this for the database version, so this info is available in case of database problems

== Monitoring of success and resource usage ==
 * have a harmonized method to report 'success' or 'error', incl. message + runtime
 * also include stuff like max, min, etc.

== Make it transparent/unimportant for the user which backend is actually used ==
 * the system decides where it (can) run: cluster, pbs, grid, local
 * needs a flexible file manager that can 'stage' data for any backend
 * we would like to be able to restart on another backend

== Transparent server to stage data ==
 * to easily move a pipeline to other storage

== Restart from a specific step ==
 * Remove the files of the step from which to restart, as well as of all steps that depend on it.
 * Reduce the problem by using a *.tmp file and only 'mv' it if the step is successful (as automatically determined by Compute); a minimal sketch of this is included further down this page.
 * However, this doesn't work if we want to restart from a step that was successful as automatically determined by Compute, but has failed as manually determined by a user.
 * Could use a folder per step, so you can more easily delete all output from a specific step onwards.
 * Can be solved by good practice -> Compute should assist users as much as possible to standardize good practice and make it as easy as possible.

== Store the pilot job id in the task in the database ==
 * I need to know which pilot jobs have died, and which tasks were associated with them
 * Then I can re-release those tasks so they can be done by another pilot
 * Alternative: jobs send heartbeats to the server every 5-10 minutes while the actual analysis runs in the background

== Like to have a 'heartbeat' for jobs ==
 * so I can be sure a (pilot) job is still alive
 * could use a 'background' process that pings back to the database
 * could also be used for pbs jobs

== Add putFile -force to the manual ==

== Enable easy merging of workflows, merging of parameters ==
 * easily combine protocols from multiple workflows
 * we want fewer parameter files
 * meanwhile allow multiple worksheets

== Get rid of parameters.csv and instead create a worksheet ==
 * so parameter names go on the first row
 * hasOne via the naming scheme A_B, meaning B has one A
 * conclusion: use multiple headers
 * allow -parameters and -tparameter

== Clean up backend-specific protocols ==
 * e.g. 'touch' commands

== Visualization framework for analysis runs ==

== Re-architect the components ==
 * one interface, multiple implementations
 * unit tests
 * can we do something with unit tests per protocol?
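The following is a minimal sketch of the *.tmp-then-mv idea from 'Restart from a specific step' above. The file name and the 'echo' stand-in for the analysis command are illustrative placeholders, not actual Compute conventions:

{{{
#!sh
# Sketch of the *.tmp-then-mv convention (illustrative names only)
set -e                              # abort on any error, so 'mv' is never reached for a failed step

output="step3_impute.out"           # the output file declared for this step

# write all results to a temporary file first; 'echo' stands in for the real analysis command
echo "analysis results" > "${output}.tmp"

# rename only when the step finished without error; the presence of ${output}
# then tells Compute (or a restart script) that this step completed and can be
# skipped when restarting from a later step
mv "${output}.tmp" "${output}"
}}}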
== Make submit.sh use a 'lock' file so jobs can only end when everything is submitted ==
 * The problem is that a first job fails quickly while many dependent jobs are not yet submitted, and those get orphaned
 * With the lock, dependent jobs can be submitted and never have this issue (alex feature)
 * Add this to the #end macro so jobs can never finish until the complete workflow is submitted
 * After all jobs are submitted, the lock file is removed; see the sketch at the bottom of this page

== Separate protocols from code ==
 * yes, a separate github repo
 * should make it possible to combine multiple protocol folders and multiple parameter files
 * should indicate with which Compute version it works

== Users for runs, priorities? ==
 * needed if we want a 'priority queue' for pilot jobs

== Publish! ==

== Approach ==
 * Clean start = yes
 * Separate development of v5 from v4
 * Bug fixes for v4 = yes
 * Database, command line, or both = both
 * Release schedule -> roadmap
 * Backwards compatibility = no
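A minimal sketch of the lock-file idea from 'Make submit.sh use a lock file' above, assuming each generated job script ends (e.g. via the #end macro) with a wait loop on the lock file. All names are illustrative; this is not the actual submit.sh or #end macro implementation:

{{{
#!sh
# submit.sh sketch with a lock file (illustrative only)
set -e
LOCK="$PWD/workflow.submitting.lock"

touch "$LOCK"                 # create the lock before the first job is submitted

# ... submit all jobs here (qsub / sbatch / ...); each generated job script
# would end with something like:
#
#   while [ -f workflow.submitting.lock ]; do sleep 60; done
#
# so no job can report 'finished' (and trigger dependency handling) before
# the complete workflow has been submitted

rm -f "$LOCK"                 # only now may already-running jobs finish
}}}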