public class Demultiplexer extends JavaModuleImpl implements SeqModule, ApiModule
Modifier and Type | Field and Description |
---|---|
protected static int | NUM_LINES_TEMP_FILE Module splits multiplexed file into smaller files with this number of lines: 2000000 |
BLJ_OPTIONS
GZIP_EXT, LOG_EXT, PDF_EXT, RETURN, SH_EXT, TAB_DELIM, TSV_EXT, TXT_EXT
LOG_DIR, MAIN_SCRIPT_PREFIX, NO_VERSION, OUTPUT_DIR, RES_DIR, TEMP_DIR
Constructor and Description |
---|
Demultiplexer() |
Modifier and Type | Method and Description |
---|---|
protected void | breakUpFiles() Some multiplexed files can be very large. |
void | checkDependencies() Validate module dependencies: if Config."demultiplexer.strategy" indicates use of barcodes to demultiplex, validate that the metadata column named in Config."metadata.barcodeColumn" exists; call DemuxUtil.setMultiplexedConfig() to set multiplexed Config if needed; if Config."demultiplexer.barcodeCutoff" is defined, validate that it is between 0.0 and 1.0. |
void | cleanUp() Update SeqUtil to indicate data has been demultiplexed. |
protected void | demultiplex(Map<String,Set<String>> validHeaders) Demultiplex the file into separate small temp files, with 2000000 lines each for processing. |
String | getCitationString() At a minimum, this should return the name and/or url for the wrapped tool. |
String | getDescription() Briefly describe what this module does. |
String | getDetails() An extension of getDescription. |
List<File> | getInputFiles() BioModule BioModuleImpl.getInputFiles() is called to initialize upon first call and cached. |
List<File> | getSeqFiles(Collection<File> files) Return only sequence files for sample IDs found in the metadata file. If Config."metadata.required" = "Y", an error is thrown to list the files that cannot be matched to a metadata row. |
String | getSummary() Produces initial count and demultiplexed output count summaries for forward/reverse reads. |
protected Map<String,Set<String>> | getValidFwHeaders() Get valid forward read headers that belong to reads with a valid barcode or sample identifier. |
protected Map<String,Set<String>> | getValidHeaders() This method obtains all valid headers for the forward reads, and returns only headers that also have a matching reverse read. |
void | runModule() Module execution summary: execute breakUpFiles() to split the multiplexed file into smaller files for processing; execute getValidHeaders() to obtain the list of valid headers matched to a sample ID in the metadata file, which also verifies matching forward and reverse read headers if demultiplexing paired reads. |
buildScript, executeTask, getDockerImageName, getDockerImageOwner, getDockerImageTag, getWorkerScriptFunctions, isValidInputModule, markStatus, moduleComplete, moduleFailed, runBioLockJ_CMD
buildScriptForPairedReads, getJobParams, getMainScript, getRuntimeParams, getScriptDir, getScriptErrors, getTimeout, hasScripts, isValidProp
addGeneralProperty, addGeneralProperty, addGeneralProperty, addNewProperty, addNewProperty, cacheInputFiles, compareTo, equals, findModuleInputFiles, getAlias, getDescription, getFileCache, getID, getLogDir, getMenuPlacement, getMetadata, getModuleDir, getOutputDir, getPostRequisiteModules, getPreRequisiteModules, getPropDefault, getPropDescMap, getPropType, getPropTypeMap, getResourceDir, getTempDir, getTitle, hashCode, init, listProps, setAlias, toString
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
buildScript, buildScriptForPairedReads, getJobParams, getMainScript, getScriptDir, getScriptErrors, getTimeout, getWorkerScriptFunctions
executeTask, getAlias, getDockerImageName, getDockerImageOwner, getDockerImageTag, getID, getLogDir, getMetadata, getModuleDir, getOutputDir, getPostRequisiteModules, getPreRequisiteModules, getPropDefault, getResourceDir, getTempDir, init, isValidInputModule, setAlias, version
getDescription, getMenuPlacement, getPropType, getTitle, isValidProp, listProps
protected static final int NUM_LINES_TEMP_FILE
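The role of a line-count constant such as NUM_LINES_TEMP_FILE can be illustrated with a minimal sketch of fixed-size file splitting. This is not the BioLockJ implementation: the class name LineChunkSplitter, the method split, and the ".partN" file naming are hypothetical and chosen only for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and file layout are assumptions, not BioLockJ code.
final class LineChunkSplitter {

    // Streams `input` line by line and writes numbered part files under `outDir`,
    // each holding at most `linesPerFile` lines. Returns the part files in order.
    static List<Path> split(Path input, Path outDir, int linesPerFile) throws IOException {
        if (linesPerFile <= 0) {
            throw new IllegalArgumentException("linesPerFile must be positive");
        }
        Files.createDirectories(outDir);
        List<Path> parts = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            BufferedWriter writer = null;
            try {
                int linesInPart = 0;
                String line;
                while ((line = reader.readLine()) != null) {
                    // Start a new part file for the first line, or when the current part is full.
                    if (writer == null || linesInPart == linesPerFile) {
                        if (writer != null) {
                            writer.close();
                        }
                        String name = input.getFileName().toString() + ".part" + parts.size();
                        Path part = outDir.resolve(name);
                        parts.add(part);
                        writer = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                        linesInPart = 0;
                    }
                    writer.write(line);
                    writer.newLine();
                    linesInPart++;
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return parts;
    }
}
```

With linesPerFile set to 2000000, each part holds a whole number of records for any format whose record length in lines divides 2000000, such as 4-line FASTQ records, so under that assumption no record is split across two parts.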
public Demultiplexer() throws API_Exception
API_Exception
public void checkDependencies() throws Exception
Validate module dependencies:
If Config."demultiplexer.strategy" indicates use of barcodes to demultiplex, validate that the metadata column named in Config."metadata.barcodeColumn" exists
Call DemuxUtil.setMultiplexedConfig() to set multiplexed Config if needed
If Config."demultiplexer.barcodeCutoff" is defined, validate that it is between 0.0 and 1.0
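The range check on "demultiplexer.barcodeCutoff" can be sketched as follows. This is a hypothetical helper, not the BioLockJ Config API: the class BarcodeCutoffCheck and method parseCutoff are invented for illustration, and treating both 0.0 and 1.0 as valid (inclusive bounds) is an assumption, since the documentation only says "between".

```java
// Illustrative sketch only; not the BioLockJ Config API.
final class BarcodeCutoffCheck {

    // Parses an optional cutoff value. Returns null when the property is not
    // defined (null or blank); otherwise returns the value if it lies in
    // [0.0, 1.0] (inclusive bounds are an assumption) and throws if it does not.
    static Double parseCutoff(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return null; // property not defined, nothing to validate
        }
        final double value;
        try {
            value = Double.parseDouble(raw.trim());
        } catch (NumberFormatException ex) {
            throw new IllegalArgumentException("Cutoff is not a number: " + raw, ex);
        }
        if (Double.isNaN(value) || value < 0.0 || value > 1.0) {
            throw new IllegalArgumentException("Cutoff must be between 0.0 and 1.0: " + raw);
        }
        return value;
    }
}
```

Performing this kind of check during dependency validation means an out-of-range value is reported before any sequence data is processed.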
checkDependencies in interface BioModule
checkDependencies in class ScriptModuleImpl
Exception - thrown if missing or invalid dependencies are found
public void cleanUp() throws Exception
cleanUp in interface BioModule
cleanUp in class BioModuleImpl
Exception - if unable to modify property
public List<File> getInputFiles()
BioModuleImpl
BioModuleImpl.getInputFiles() is called to initialize upon first call and cached.
getInputFiles in interface BioModule
getInputFiles in class BioModuleImpl
public List<File> getSeqFiles(Collection<File> files) throws SequnceFormatException
SeqModule
Return only sequence files for sample IDs found in the metadata file. If Config."metadata.required" = "Y", an error is thrown to list the files that cannot be matched to a metadata row.
getSeqFiles in interface SeqModule
files - Module input files
SequnceFormatException - If Config."metadata.required" = "Y" but sequence files are found that do not have a corresponding record in the metadata file, or if invalid metadata prevents parsing SEQ files.
public String getSummary() throws Exception
getSummary in interface BioModule
getSummary in class ScriptModuleImpl
Exception - if any error occurs
public void runModule() throws Exception
Module execution summary:
Execute breakUpFiles() to split the multiplexed file into smaller files for processing
Execute getValidHeaders() to obtain the list of valid headers matched to a sample ID in the metadata file; this also verifies matching forward and reverse read headers if demultiplexing paired reads
Execute demultiplex(Map) to demultiplex the data into a separate file (or pair of files) for each sample
If paired reads are combined in a single file, the read direction must be identified in the sequence header using the key strings " 1:N:" and " 2:N:"
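Detecting read direction from the key strings can be sketched as below. This is a hypothetical helper, not BioLockJ code: the class ReadDirection and its method of are invented for illustration, and mapping " 1:N:" to forward reads and " 2:N:" to reverse reads is an assumption based on the common Illumina header convention.

```java
// Illustrative sketch only; not the BioLockJ implementation.
final class ReadDirection {

    enum Direction { FORWARD, REVERSE, UNKNOWN }

    // Key strings taken from the module documentation; which one marks the
    // forward read is an assumption (Illumina-style read number 1 or 2).
    private static final String FORWARD_KEY = " 1:N:";
    private static final String REVERSE_KEY = " 2:N:";

    // Classifies a sequence header by which key string it contains.
    // Headers with neither key, or with both, are reported as UNKNOWN.
    static Direction of(String header) {
        if (header == null) {
            return Direction.UNKNOWN;
        }
        boolean forward = header.contains(FORWARD_KEY);
        boolean reverse = header.contains(REVERSE_KEY);
        if (forward && !reverse) {
            return Direction.FORWARD;
        }
        if (reverse && !forward) {
            return Direction.REVERSE;
        }
        return Direction.UNKNOWN;
    }
}
```

The leading space in each key matters: it keeps the match anchored to the comment portion of the header rather than to colon-separated fields in the read identifier.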
runModule in interface JavaModule
runModule in class JavaModuleImpl
Exception - thrown if any runtime error occurs
protected void breakUpFiles() throws Exception
Exception - if unexpected errors occur at runtime
protected void demultiplex(Map<String,Set<String>> validHeaders) throws Exception
validHeaders - Set of valid headers
Exception - if an error occurs reading the multiplexed file
protected Map<String,Set<String>> getValidFwHeaders() throws Exception
Exception - if an error occurs
protected Map<String,Set<String>> getValidHeaders() throws Exception
Exception - if unable to obtain headers
public String getDescription()
ApiModule
getDetails.
getDescription in interface ApiModule
public String getDetails() throws API_Exception
ApiModule
An extension of getDescription. Beyond the brief description, give details such as the interaction between properties.
getDetails in interface ApiModule
getDetails in class BioModuleImpl
API_Exception
public String getCitationString()
ApiModule
getCitationString in interface ApiModule