Implements the Nextflow process execution logic.
Modifiers | Name | Description |
---|---|---|
class | TaskProcessor.BaseProcessInterceptor | |
static class | TaskProcessor.ForwardClosure | Implements the closure which *combines* all the iteration |
static class | TaskProcessor.InvokeTaskAdapter | Adapter closure to call the invokeTask(java.lang.Object) method |
static enum | TaskProcessor.RunType | |
class | TaskProcessor.TaskProcessorInterceptor | Intercept dataflow process events |
Modifiers | Name | Description |
---|---|---|
static java.lang.String | TASK_CONTEXT_PROPERTY_NAME | |
protected boolean | allScalarValues | |
protected boolean | blocking | Whether the process execution is required to be blocking in order to handle shared objects in a thread-safe manner |
protected boolean | completed | Flag set to true when the processor termination has been invoked |
protected ProcessConfig | config | The corresponding task configuration properties; it holds the input/output definitions as well as other execution meta-declarations |
protected java.lang.ThreadLocal<TaskRun> | currentTask | Keeps track of the task instance executed by the current thread |
protected Executor | executor | The underlying executor which will run the task |
protected ch.grengine.Grengine | grengine | Groovy engine used to evaluate dynamic code |
protected groovyx.gpars.group.PGroup | group | Gpars thread pool |
protected boolean | hasEachParams | |
protected java.lang.Object | indexCount | Unique task index number (run) |
protected java.lang.String | name | The processor descriptive name |
protected java.util.concurrent.atomic.AtomicIntegerArray | openPorts | Track the status of input ports. |
protected groovyx.gpars.dataflow.operator.DataflowProcessor | operator | The corresponding DataflowProcessor which will receive and manage accordingly the task inputs |
protected BaseScript | ownerScript | The script object which defines this task |
protected Session | session | The current workflow execution session |
protected boolean | singleton | Whether the process is executed only once |
protected groovyx.gpars.agent.Agent<StateObj> | state | The state is maintained by using an agent |
protected TaskBody | taskBody | The piece of code to be executed, provided by the user |
Constructor and description |
---|
protected TaskProcessor() |
TaskProcessor(java.lang.String name, Executor executor, Session session, BaseScript script, ProcessConfig config, TaskBody taskBody) Create and initialize the processor object |
Type Params | Return Type | Name and description |
---|---|---|
| static java.lang.String | bashEnvironmentScript(java.util.Map<java.lang.String, java.lang.String> environment, boolean escape = false) Given a map holding variable key-value pairs, create a script fragment exporting the required environment variables |
| protected void | bindOutParam(OutParam param, java.util.List values) |
| protected void | bindOutputs(TaskRun task) Bind the expected output files to the corresponding output channels |
| protected boolean | checkCachedOrLaunchTask(TaskRun task, com.google.common.hash.HashCode hash, boolean shouldTryCache) Check whether a previously executed process result exists in the cache folder. |
| boolean | checkCachedOutput(TaskRun task, java.nio.file.Path folder, com.google.common.hash.HashCode hash) Check whether the outputs for the specified task already exist |
| protected ErrorStrategy | checkErrorStrategy(TaskRun task, ProcessException error, int taskErrCount, int procErrCount) |
| boolean | checkStoredOutput(TaskRun task) Check whether a *storeDir* exists for the specified task. |
| protected boolean | checkWhenGuard(TaskRun task) |
| protected void | collectOutFiles(TaskRun task, FileOutParam param, java.nio.file.Path workDir, java.util.Map context) |
| protected void | collectOutValues(TaskRun task, ValueOutParam param, java.util.Map ctx) |
| protected void | collectOutputs(TaskRun task) |
| protected void | collectOutputs(TaskRun task, java.nio.file.Path workDir, java.lang.Object stdout, java.util.Map context) Once the task has completed, this method is invoked to collect all the task results |
| protected void | collectStdOut(TaskRun task, StdOutParam param, java.lang.Object stdout) Collects the process 'std output' |
| protected void | createOperator() Verify if this process runs only one time |
| protected com.google.common.hash.HashCode | createTaskHashKey(TaskRun task) |
| protected TaskRun | createTaskRun() Create a new TaskRun instance, initializing the following properties: TaskRun#id, TaskRun#status, TaskRun#index, TaskRun#name, TaskRun#process |
| java.lang.String | dumpTerminationStatus() Dump the current process status, listing all input *port* statuses for debugging purposes |
| protected java.util.List<nextflow.file.FileHolder> | expandWildcards(java.lang.String name, java.util.List<nextflow.file.FileHolder> files) An input file name may contain wildcard characters, which have to be handled coherently given the number of files specified. |
| protected java.lang.String | expandWildcards0(java.lang.String path, java.lang.String stageName, int index, int size) |
| static java.lang.String | fetchInterpreter(java.lang.String script) Given the task script, extract the top *she-bang* interpreter declaration, removing the #! characters |
| protected java.util.List<java.lang.String> | formatGuardError(java.util.List<java.lang.String> message, FailedGuardException error, TaskRun task) |
| protected java.util.List<java.lang.String> | formatTaskError(java.util.List<java.lang.String> message, java.lang.Throwable error, TaskRun task) |
| ProcessConfig | getConfig() Returns the TaskConfig object holding the task configuration properties |
| Executor | getExecutor() Returns the Executor associated with this processor |
| int | getId() Returns the processor unique id |
| java.lang.String | getName() Returns the processor name |
| groovyx.gpars.dataflow.operator.DataflowProcessor | getOperator() Returns the DataflowOperator underlying this process |
| BaseScript | getOwnerScript() Returns the BaseScript object which represents the pipeline script |
| java.util.Map<java.lang.String, java.lang.String> | getProcessEnvironment() Returns the map holding the shell environment variables for the task to be executed |
| protected java.lang.String | getRndTip() Display a random tip at the bottom of the error report |
| protected ScriptType | getScriptType() Define the type of script held by the #code property |
| Session | getSession() Returns the current Session instance |
| protected java.util.List<java.nio.file.Path> | getTaskBinEntries(java.lang.String script) This method scans the task command string looking for invocations of scripts defined in the project bin folder. |
| TaskBody | getTaskBody() Returns the user-provided script block |
| protected java.util.Map<java.lang.String, java.lang.Object> | getTaskGlobalVars(TaskRun task) |
| protected java.util.Map<java.lang.String, java.lang.Object> | getTaskGlobalVars(java.util.Set<java.lang.String> variableNames, groovy.lang.Binding binding, java.util.Map context) @param variableNames The collection of variables referenced in the task script |
| protected boolean | handleException(java.lang.Throwable error, TaskRun task = null) Handles an error raised during the processor execution |
| protected void | invokeTask(java.lang.Object args) The processor execution body |
| boolean | isCacheable() Whether the process can be cached |
| protected int | makeTaskContextStage1(TaskRun task, java.util.Map secondPass, java.util.List values) |
| protected void | makeTaskContextStage2(TaskRun task, java.util.Map secondPass, int count) |
| protected void | makeTaskContextStage3(TaskRun task, com.google.common.hash.HashCode hash, java.nio.file.Path folder) |
| protected nextflow.file.FileHolder | normalizeInputToFile(java.lang.Object input, java.lang.String altName) An input file parameter can be provided with any value other than a file. |
| protected java.util.List<nextflow.file.FileHolder> | normalizeInputToFiles(java.lang.Object obj, int count) |
| static java.lang.String | normalizeScript(java.lang.String script, java.lang.Object shell) Remove extra leading and trailing whitespace and newline characters; if the script does not start with a shebang line, add the default one using the current #shell attribute |
| protected void | publishOutputs(TaskRun task) Publish output files to a specified target folder |
| protected java.lang.String | replaceQuestionMarkWildcards(java.lang.String name, int index) |
| protected java.lang.String | replaceStarWildcards(java.lang.String name, int index, boolean strip = false) |
| java.lang.Object | run() Launch the 'script' defined by the code closure as a local bash script |
| protected void | sendPoisonPill() Send a poison pill over all the output channels |
| static java.lang.Object | shebangLine(java.lang.Object shell) |
| protected java.lang.Object | singleItemOrList(java.util.List<nextflow.file.FileHolder> items) |
| protected void | submitTask(TaskRun task, com.google.common.hash.HashCode hash, java.nio.file.Path folder) Execute the specified task shell script |
| protected void | terminateProcess() |
| protected void | validateInputSets(java.util.List values) |
Methods inherited from class | Name |
---|---|
class java.lang.Object | java.lang.Object#wait(long, int), java.lang.Object#wait(long), java.lang.Object#wait(), java.lang.Object#equals(java.lang.Object), java.lang.Object#toString(), java.lang.Object#hashCode(), java.lang.Object#getClass(), java.lang.Object#notify(), java.lang.Object#notifyAll() |
blocking - Whether the process execution is required to be blocking in order to handle shared objects in a thread-safe manner
completed - Flag set to true when the processor termination has been invoked. See #checkProcessTermination
config - The corresponding task configuration properties; it holds the input/output definitions as well as other execution meta-declarations
currentTask - Keeps track of the task instance executed by the current thread
executor - The underlying executor which will run the task
grengine - Groovy engine used to evaluate dynamic code
group - Gpars thread pool
indexCount - Unique task index number (run)
name - The processor descriptive name
openPorts - Track the status of input ports. When 1 the port is open (waiting for data), when 0 the port is closed (i.e. it received the STOP signal)
operator - The corresponding DataflowProcessor which will receive and manage accordingly the task inputs. Note: it must be declared volatile (issue #41)
ownerScript - The script object which defines this task
session - The current workflow execution session
singleton - Whether the process is executed only once
state - The state is maintained by using an agent
taskBody - The piece of code to be executed, provided by the user
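The `openPorts` convention described above (1 for an open port, 0 for a closed one) can be illustrated with a small stand-alone sketch. The `PortTracker` class and its methods are hypothetical names introduced only for this example; they are not part of `TaskProcessor`, and the real bookkeeping may differ.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

/** Hypothetical illustration of port tracking; not the Nextflow implementation. */
final class PortTracker {
    static final int OPEN = 1;    // port is still waiting for data
    static final int CLOSED = 0;  // port has received the STOP signal

    private final AtomicIntegerArray ports;

    PortTracker(int size) {
        ports = new AtomicIntegerArray(size);
        for (int i = 0; i < size; i++) {
            ports.set(i, OPEN);   // every port starts open
        }
    }

    /** Marks a port as closed; each element update is atomic. */
    void close(int index) {
        ports.set(index, CLOSED);
    }

    boolean isOpen(int index) {
        return ports.get(index) == OPEN;
    }

    /** True once every port has been closed. */
    boolean allClosed() {
        for (int i = 0; i < ports.length(); i++) {
            if (ports.get(i) == OPEN) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        PortTracker tracker = new PortTracker(2);
        tracker.close(0);
        System.out.println("port 0 open: " + tracker.isOpen(0));
        System.out.println("all closed: " + tracker.allClosed());
    }
}
```

An `AtomicIntegerArray` lets several threads update individual port flags without an external lock, which is presumably why the field uses that type.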
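TaskProcessor(java.lang.String name, Executor executor, Session session, BaseScript script, ProcessConfig config, TaskBody taskBody)
Create and initialize the processor object

bashEnvironmentScript(java.util.Map<java.lang.String, java.lang.String> environment, boolean escape = false)
Given a map holding variable key-value pairs, create a script fragment exporting the required environment variables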
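As a rough illustration of what such a fragment might look like, the sketch below turns a map into `export NAME="value"` lines. The class name `EnvScriptSketch`, the method name `exportLines`, the output format and the escaping rule are all assumptions made for this example; the actual `bashEnvironmentScript` output is not specified on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; the real bashEnvironmentScript may format and escape differently. */
final class EnvScriptSketch {

    /** Builds one export line per map entry, in the map's iteration order. */
    static String exportLines(Map<String, String> environment, boolean escape) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : environment.entrySet()) {
            String value = entry.getValue();
            if (escape) {
                // Assumed rule: protect characters that are special inside double quotes.
                // Backslashes go first so the ones added afterwards are not doubled.
                value = value.replace("\\", "\\\\")
                             .replace("\"", "\\\"")
                             .replace("$", "\\$")
                             .replace("`", "\\`");
            }
            out.append("export ").append(entry.getKey())
               .append("=\"").append(value).append("\"\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("FOO", "bar");
        env.put("GREETING", "hello world");
        System.out.print(exportLines(env, false));
    }
}
```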
bindOutputs(TaskRun task)
Bind the expected output files to the corresponding output channels

checkCachedOrLaunchTask(TaskRun task, com.google.common.hash.HashCode hash, boolean shouldTryCache)
Check whether a previously executed process result exists in the cache folder. If it exists, that result is used and the process execution is skipped; otherwise the task is submitted for execution.
Parameters:
task - The TaskRun instance to be executed
hash - The unique HashCode for the given task inputs
script - The script to be run (only when it's a merge task)
Returns: false when a cached result has been found and the execution has been skipped, or true if the task has been submitted for execution

checkCachedOutput(TaskRun task, java.nio.file.Path folder, com.google.common.hash.HashCode hash)
Check whether the outputs for the specified task already exist
Parameters:
task - The task instance
folder - The folder where the outputs are stored (eventually)
Returns: true when all outputs are available, false otherwise

checkStoredOutput(TaskRun task)
Check whether a *storeDir* exists for the specified task. When it exists and contains the expected result files, the process execution is skipped.
Parameters:
task - The task for which to check the stored output
Returns: true when the folder exists and it contains the expected outputs, false otherwise

collectOutputs(TaskRun task, java.nio.file.Path workDir, java.lang.Object stdout, java.util.Map context)
Once the task has completed, this method is invoked to collect all the task results
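The cache-or-launch decision and its return convention (false when a cached result is reused, true when the task is submitted) can be sketched as follows. This is a simplified illustration under stated assumptions: the `CacheOrLaunchSketch` class, the `outputsAvailable` helper, the list of expected file names and the `Runnable` standing in for task submission are all hypothetical, and the real method works on `TaskRun` and `HashCode` objects.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the cache-or-launch decision; not the Nextflow implementation. */
final class CacheOrLaunchSketch {

    /** True when the folder exists and contains every expected output file. */
    static boolean outputsAvailable(Path folder, List<String> expectedNames) {
        if (!Files.isDirectory(folder)) return false;
        for (String name : expectedNames) {
            if (!Files.exists(folder.resolve(name))) return false;
        }
        return true;
    }

    /**
     * Returns false when a cached result is reused and execution is skipped,
     * true when the task has been submitted for execution.
     */
    static boolean checkCachedOrLaunch(Path folder, List<String> expectedNames,
                                       boolean shouldTryCache, Runnable submit) {
        if (shouldTryCache && outputsAvailable(folder, expectedNames)) {
            return false;   // cached result found, nothing to run
        }
        submit.run();       // no usable cache entry, run the task
        return true;
    }

    public static void main(String[] args) {
        boolean submitted = checkCachedOrLaunch(
                Paths.get("work", "no-such-task-dir"),
                Arrays.asList("result.txt"),
                true,
                () -> System.out.println("submitting task"));
        System.out.println("submitted: " + submitted);
    }
}
```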
collectStdOut(TaskRun task, StdOutParam param, java.lang.Object stdout)
Collects the process 'std output'
Parameters:
task - The executed process instance
param - The declared StdOutParam object
stdout - The object holding the standard output produced by the task

createOperator()
Verify if this process runs only one time

createTaskRun()
Create a new TaskRun instance, initializing the following properties: TaskRun#id, TaskRun#status, TaskRun#index, TaskRun#name, TaskRun#process
Returns: TaskRun

dumpTerminationStatus()
Dump the current process status, listing all input *port* statuses for debugging purposes

expandWildcards(java.lang.String name, java.util.List<nextflow.file.FileHolder> files)
An input file name may contain wildcard characters, which have to be handled coherently given the number of files specified.
Parameters:
name - A file name which may contain a wildcard character, star * or question mark ?. Only one occurrence can be specified for star or question mark wildcards.
value - Any value that has to be managed as an input file. Values other than Path are converted to a string value, using the #toString method, and saved in the local file-system. Values of type Collection are expanded to multiple values accordingly.

fetchInterpreter(java.lang.String script)
Given the task script, extract the top *she-bang* interpreter declaration, removing the #! characters.
Parameters:
script - The script to be executed
Returns: The interpreter declaration, e.g. /usr/bin/env perl
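The interpreter extraction described for `fetchInterpreter` can be sketched in a few lines. The `InterpreterSketch` class is a hypothetical name, and returning `null` when no shebang is present is an assumption of this example; the page does not say what the real method returns in that case.

```java
/** Hypothetical sketch of shebang parsing; not the Nextflow implementation. */
final class InterpreterSketch {

    /** Returns the first line without its leading "#!", or null when there is no shebang (assumed). */
    static String fetchInterpreter(String script) {
        if (script == null || !script.startsWith("#!")) {
            return null;
        }
        int endOfLine = script.indexOf('\n');
        String firstLine = endOfLine == -1 ? script : script.substring(0, endOfLine);
        // Drop the two "#!" characters and any surrounding whitespace
        return firstLine.substring(2).trim();
    }

    public static void main(String[] args) {
        String script = "#!/usr/bin/env perl\nprint \"hello\\n\";\n";
        System.out.println(fetchInterpreter(script)); // prints /usr/bin/env perl
    }
}
```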
getConfig()
Returns: The TaskConfig object holding the task configuration properties

getOperator()
Returns: The DataflowOperator underlying this process

getOwnerScript()
Returns: The BaseScript object which represents the pipeline script

getRndTip()
Display a random tip at the bottom of the error report

getScriptType()
Define the type of script held by the #code property

getSession()
Returns: The current Session instance

getTaskBinEntries(java.lang.String script)
This method scans the task command string looking for invocations of scripts defined in the project bin folder.
Parameters:
script - The task command string

getTaskGlobalVars(java.util.Set<java.lang.String> variableNames, groovy.lang.Binding binding, java.util.Map context)
Parameters:
variableNames - The collection of variables referenced in the task script
binding - The script global binding
context - The task variable context

handleException(java.lang.Throwable error, TaskRun task = null)
Handles an error raised during the processor execution
Parameters:
error - The exception raised during the task execution
task - The TaskDef instance which raised the exception
Returns: true to terminate the processor execution, false to ignore the error and continue to process other pending tasks

invokeTask(java.lang.Object args)
The processor execution body

isCacheable()
Whether the process can be cached

normalizeInputToFile(java.lang.Object input, java.lang.String altName)
An input file parameter can be provided with any value other than a file. This function normalizes a generic value to a Path, creating a temporary file for it.
Parameters:
input - The input value
altName - The name to be used when a temporary file is created.
Returns: The Path that will be staged in the task working folder

normalizeScript(java.lang.String script, java.lang.Object shell)
Remove extra leading and trailing whitespace and newline characters; if the script does not start with a shebang line, add the default one using the current #shell attribute
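A minimal sketch of the normalization described for `normalizeScript` is shown below. It assumes the shell is passed as a plain string such as `/bin/bash` (the real parameter is typed `java.lang.Object`) and that the default shebang is simply `#!` followed by that string; `NormalizeScriptSketch` and `normalize` are hypothetical names used only here.

```java
/** Hypothetical sketch of script normalization; not the Nextflow implementation. */
final class NormalizeScriptSketch {

    /** Trims the script and prepends a shebang line built from the shell when none is present. */
    static String normalize(String script, String shell) {
        String body = script.trim();          // drop leading/trailing whitespace and newlines
        if (!body.startsWith("#!")) {
            body = "#!" + shell + "\n" + body; // assumed default shebang format
        }
        return body;
    }

    public static void main(String[] args) {
        System.out.println(normalize("\n   echo hello   \n\n", "/bin/bash"));
    }
}
```

A script that already starts with its own shebang line is returned trimmed but otherwise unchanged, so a user-chosen interpreter takes precedence over the default.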
publishOutputs(TaskRun task)
Publish output files to a specified target folder
Parameters:
task - The task whose outputs need to be published
overwrite - When true any existing file will be overwritten, otherwise the publishing is ignored

run()
Launch the 'script' defined by the code closure as a local bash script
Parameters:
code - A Closure returning a bash script, e.g. { """ #!/bin/bash do this \${x} do that \${y} : """ }
Returns: this instance

sendPoisonPill()
Send a poison pill over all the output channels

shebangLine(java.lang.Object shell)
Parameters:
shell

submitTask(TaskRun task, com.google.common.hash.HashCode hash, java.nio.file.Path folder)
Execute the specified task shell script
Parameters:
script - The script string to be executed, e.g. a BASH script
Returns: TaskDef