public interface BioModule
Config file to include a module. The Pipeline class executes BioModules in the order provided in the Config file.
BioModule Directory Structure
Directory | Description |
---|---|
output | Contains all module output files |
temp | Holds intermediate files generated by the module that are not passed on to the next module. This directory is deleted after pipeline execution if "pipeline.deleteTempFiles" = "Y" |
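
The directory layout above can be sketched in plain Java. This is a minimal illustration, not BioLockJ code: the class `ModuleDirsSketch`, its method names, and the use of `java.util.Properties` to hold the configuration are assumptions. Only the sub-directory names and the "pipeline.deleteTempFiles" property come from the table.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Properties;
import java.util.stream.Stream;

// Hypothetical helper; not part of the BioLockJ API.
class ModuleDirsSketch {

    // Create the "output" and "temp" sub-directories under a module directory.
    static void createLayout(Path moduleDir) throws IOException {
        Files.createDirectories(moduleDir.resolve("output"));
        Files.createDirectories(moduleDir.resolve("temp"));
    }

    // Delete the temp directory only when "pipeline.deleteTempFiles" is "Y".
    // Returns true if the directory was deleted.
    static boolean cleanTemp(Path moduleDir, Properties config) throws IOException {
        Path temp = moduleDir.resolve("temp");
        boolean deleteRequested = "Y".equals(config.getProperty("pipeline.deleteTempFiles"));
        if (!deleteRequested || !Files.exists(temp)) {
            return false;
        }
        try (Stream<Path> paths = Files.walk(temp)) {
            // Reverse order so that children are deleted before their parents.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return true;
    }
}
```

In this sketch the output directory is never touched by the cleanup step, which mirrors the table: only temp is subject to deletion.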
Modifier and Type | Field and Description |
---|---|
static String | LOG_DIR: Name of the log sub-directory: "log" |
static String | MAIN_SCRIPT_PREFIX: Prefix added to the start of a file name to mark the main script in the script directory. Non-AWS pipelines run worker scripts by executing the main shell script, which is named with the prefix "MAIN_" |
static String | NO_VERSION |
static String | OUTPUT_DIR: Name of the output sub-directory: "output" |
static String | RES_DIR: Name of the resource sub-directory: "resources" |
static String | TEMP_DIR: Name of the temporary sub-directory: "temp" |
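
As a rough illustration of how these constants might be used, the sketch below mirrors the documented values in a stand-alone class. The class `ModuleConstantsSketch`, both helper methods, and the script names in the usage note are hypothetical. NO_VERSION is omitted because its value is not given here.

```java
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for the BioModule constants; values copied from the field table.
class ModuleConstantsSketch {
    static final String LOG_DIR = "log";
    static final String MAIN_SCRIPT_PREFIX = "MAIN_";
    static final String OUTPUT_DIR = "output";
    static final String RES_DIR = "resources";
    static final String TEMP_DIR = "temp";

    // Return the first script name that carries the main-script prefix, if any.
    static Optional<String> findMainScript(List<String> scriptNames) {
        return scriptNames.stream()
                .filter(name -> name.startsWith(MAIN_SCRIPT_PREFIX))
                .findFirst();
    }

    // Build the path of a named sub-directory under a module directory.
    static String subDir(String moduleDir, String subDirName) {
        return Paths.get(moduleDir, subDirName).toString();
    }
}
```

For example, `findMainScript` applied to the hypothetical names `worker_0.sh` and `MAIN_demo.sh` selects `MAIN_demo.sh`, and `subDir("proj", TEMP_DIR)` yields a path ending in `temp`.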
Modifier and Type | Method and Description |
---|---|
void | checkDependencies(): During pipeline initialization, all configured BioModules run this method to validate dependencies. |
void | cleanUp(): Runs after the module executes, to update any Config properties modified by the module or to perform other cleanup operations. |
void | executeTask(): The main method, called when it is time for the BioModule to complete its task. |
String | getAlias(): Some BioModules may be added to a pipeline multiple times. The user may provide an alias for a module in the run order, which allows the user to direct properties to an individual instance of a module. |
String | getDockerImageName(): Get the Docker image to use for this module. |
String | getDockerImageOwner(): Get the name of the Docker Hub user that owns the image to use for this module. |
String | getDockerImageTag(): Get the version / tag to use for the Docker image. |
Integer | getID(): Some BioModules may be added to a pipeline multiple times, so each must be identified by an ID. This is the same value as the module directory prefix when run. The 1st module ID is 0 (or 00 if there are more than 10 modules). |
List<File> | getInputFiles(): Each BioModule takes the previous BioModule output as input: BioModule[ n ].getInputFiles() = BioModule[ n - 1 ].getOutputDir().listFiles(). Special cases: (1) the 1st BioModule returns all files in "input.dirPaths"; (2) if the previous BioModule, BioModule[ n - 1 ], is a MetadataModule, its input and output files are both forwarded: BioModule[ n ].getInputFiles() = BioModule[ n - 1 ].getInputFiles() + BioModule[ n - 1 ].getOutputFiles() |
File | getLogDir(): Retains records of the process of running the module. The files are intended to be small and stored long term with successful pipelines. |
File | getMetadata(): Updated/new metadata files are saved to the module output directory (if created by the module). |
File | getModuleDir(): Each BioModule generates a sub-directory under $DOCKER_PROJ |
File | getOutputDir(): Output files destined as input for the next BioModule are created in this directory. |
List<String> | getPostRequisiteModules(): Pipeline calls this method when building the list of pipeline BioModules to execute. |
List<String> | getPreRequisiteModules(): Pipeline calls this method when building the list of pipeline BioModules to execute. |
String | getPropDefault(String prop): If a property is null based on the config files (including all defaults and standard.properties) but a module is passed to the Config class as the context for getting that property, the Config class can query the module for a value for the property. |
File | getResourceDir(): The resource sub-directory contains files that the module may depend on. Unlike temp or log, these files are not created by the module. |
String | getSummary(): Gets the BioModule execution summary, which is sent as part of the notification email, if configured. The summary should not include data content, to avoid unintentional publication of confidential information. However, metadata such as the number/size of files can be helpful during debugging. |
File | getTempDir(): Contains intermediate files generated by the module but not used by the next BioModule. The files may contain supplementary information or data that may be helpful during debugging or recovery. If "pipeline.deleteTempFiles" = "Y", successful pipelines delete this directory. |
void | init(): Initialize a new module to generate a unique ID and module directory. |
boolean | isValidInputModule(BioModule previousModule): The BioModule getInputFiles() method typically, but not always, returns the previousModule output files. |
void | setAlias(String alias) |
default String | version(): Changes to a module class should be accompanied by an increment in version. |
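
The input rule given for getInputFiles() can be sketched with a simplified model. This is an illustration under stated assumptions, not the BioLockJ implementation: the class `InputFilesSketch`, its nested `Module` type, and the use of plain file-name strings instead of `File` objects are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the getInputFiles() rule; not the BioLockJ implementation.
class InputFilesSketch {

    // Simplified stand-in for a BioModule; files are represented by their names.
    static class Module {
        final boolean isMetadataModule;
        final List<String> outputFiles;

        Module(boolean isMetadataModule, List<String> outputFiles) {
            this.isMetadataModule = isMetadataModule;
            this.outputFiles = outputFiles;
        }
    }

    // Resolve the input files of the module at position n in the pipeline.
    static List<String> inputFiles(List<Module> pipeline, int n, List<String> inputDirFiles) {
        if (n == 0) {
            // Special case 1: the 1st module reads the files under "input.dirPaths".
            return new ArrayList<>(inputDirFiles);
        }
        Module previous = pipeline.get(n - 1);
        List<String> files = new ArrayList<>();
        if (previous.isMetadataModule) {
            // Special case 2: forward the metadata module's own input as well.
            files.addAll(inputFiles(pipeline, n - 1, inputDirFiles));
        }
        files.addAll(previous.outputFiles);
        return files;
    }
}
```

With a three-module pipeline whose second module is a metadata module, the third module in this sketch receives the first module's output followed by the metadata module's output.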
static final String MAIN_SCRIPT_PREFIX
static final String OUTPUT_DIR
static final String TEMP_DIR
static final String LOG_DIR
static final String RES_DIR
static final String NO_VERSION
void checkDependencies() throws Exception
Exception
- thrown if missing or invalid dependencies are found
void cleanUp() throws Exception
Exception
- thrown if any runtime error occurs
void executeTask() throws Exception
Exception
- thrown if the module is unable to complete its task
Integer getID()
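
The summary for getID() notes that the ID doubles as the module directory prefix and is written as 0, or as 00 when the pipeline has more than 10 modules. One way to produce such a prefix is sketched below. The class `ModuleIdSketch`, the method `dirPrefix`, and the general zero-padding rule (pad to the width of the largest ID) are assumptions that extend the single documented example.

```java
// Hypothetical helper; the padding rule is inferred from the "0 or 00" example only.
class ModuleIdSketch {

    // Zero-pad the ID to the number of digits needed for the largest ID (moduleCount - 1).
    static String dirPrefix(int id, int moduleCount) {
        int largestId = Math.max(moduleCount - 1, 0);
        int width = String.valueOf(largestId).length();
        return String.format("%0" + width + "d", id);
    }
}
```

Under this rule a 5-module pipeline starts at prefix `0`, while an 11-module pipeline starts at `00` and ends at `10`.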
String getAlias()
void setAlias(String alias) throws PipelineFormationException
PipelineFormationException
List<File> getInputFiles()
File getMetadata()
File getModuleDir()
File getOutputDir()
List<String> getPostRequisiteModules() throws Exception
Pipeline calls this method when building the list of pipeline BioModules to execute. Any BioModules returned by this method will be added to the pipeline after the current BioModule. If multiple post-requisites are found, the modules will be added in the order listed.
Exception
- if invalid Class names are returned as post-requisites
List<String> getPreRequisiteModules() throws Exception
Pipeline calls this method when building the list of pipeline BioModules to execute. Any BioModules returned by this method will be added to the pipeline before the current BioModule. If multiple prerequisites are returned, the modules will be added in the order listed.
Exception
- if invalid Class names are returned as prerequisites
String getSummary() throws Exception
Exception
- if any error occurs
File getTempDir()
File getLogDir()
File getResourceDir()
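
The ordering rule stated for getPreRequisiteModules() and getPostRequisiteModules() can be sketched as a single pass over the configured module list. This is a simplified illustration, not the Pipeline implementation: the class `RequisiteOrderSketch`, the method `expand`, and the map-based lookup are hypothetical. The sketch does not resolve requisites of requisites, remove duplicates, or validate class names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of requisite ordering; not the Pipeline implementation.
class RequisiteOrderSketch {

    // Prerequisites go before a module and post-requisites after it,
    // each group in the order listed.
    static List<String> expand(List<String> configured,
                               Map<String, List<String>> preRequisites,
                               Map<String, List<String>> postRequisites) {
        List<String> result = new ArrayList<>();
        for (String module : configured) {
            result.addAll(preRequisites.getOrDefault(module, Collections.emptyList()));
            result.add(module);
            result.addAll(postRequisites.getOrDefault(module, Collections.emptyList()));
        }
        return result;
    }
}
```

For configured modules A and B, where A lists post-requisite Q and B lists prerequisites P1 and P2, the expanded order in this sketch is A, Q, P1, P2, B.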
void init() throws Exception
Exception
- if errors occur
String getPropDefault(String prop)
property
-
boolean isValidInputModule(BioModule previousModule)
The getInputFiles() method typically, but not always, returns the previousModule output files. This method checks the output directory of the previous module for input deemed acceptable by the current module. The conditions coded in this method are checked against each previous module in the pipeline until acceptable input is found. If no previous module produced acceptable input, the files under Config."input.dirPaths" are returned.
previousModule
- BioModule that ran before the current BioModule
String getDockerImageOwner()
String getDockerImageName()
String getDockerImageTag()
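
The three Docker getters correspond to the parts of a conventional Docker Hub image reference, `owner/name:tag`. The sketch below combines them in that form. Whether BioLockJ assembles the reference exactly this way is an assumption, and the class `DockerImageSketch` and the argument values in the usage note are hypothetical.

```java
// Hypothetical helper; combines the three getter values in the usual owner/name:tag form.
class DockerImageSketch {

    static String imageReference(String owner, String name, String tag) {
        return owner + "/" + name + ":" + tag;
    }
}
```

For example, `imageReference("example_owner", "example_module", "v1")` returns `example_owner/example_module:v1`.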
default String version()
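
The backward search described for isValidInputModule(BioModule previousModule) can be sketched as a loop over earlier module outputs. This is a simplified model, not the BioLockJ implementation: the class `ValidInputSketch`, the method `resolveInput`, the predicate parameter, and the use of file-name strings are hypothetical.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of the input search; not the BioLockJ implementation.
class ValidInputSketch {

    // previousOutputs holds the output files of each earlier module, in pipeline order.
    // The search starts at the most recent module and walks backwards.
    static List<String> resolveInput(List<List<String>> previousOutputs,
                                     Predicate<List<String>> isValidInput,
                                     List<String> inputDirFiles) {
        for (int i = previousOutputs.size() - 1; i >= 0; i--) {
            List<String> output = previousOutputs.get(i);
            if (isValidInput.test(output)) {
                return output;
            }
        }
        // No earlier module produced acceptable input: use the "input.dirPaths" files.
        return inputDirFiles;
    }
}
```

A module that only accepts sequence files could pass a predicate that checks file extensions, so that a report-only module between it and the sequence producer is skipped.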