public class Demultiplexer extends JavaModuleImpl implements SeqModule, ApiModule
| Modifier and Type | Field and Description |
|---|---|
| protected static int | NUM_LINES_TEMP_FILE - Module splits the multiplexed file into smaller files with this number of lines: 2000000 |

Inherited fields, grouped by declaring type:
BLJ_OPTIONS
GZIP_EXT, LOG_EXT, PDF_EXT, RETURN, SH_EXT, TAB_DELIM, TSV_EXT, TXT_EXT
LOG_DIR, MAIN_SCRIPT_PREFIX, NO_VERSION, OUTPUT_DIR, RES_DIR, TEMP_DIR

| Constructor and Description |
|---|
| Demultiplexer() |

| Modifier and Type | Method and Description |
|---|---|
| protected void | breakUpFiles() - Some multiplexed files can be very large. |
| void | checkDependencies() - Validate module dependencies: if Config."demultiplexer.strategy" indicates use of barcodes to demultiplex, validate that the metadata column named Config."metadata.barcodeColumn" exists; call DemuxUtil.setMultiplexedConfig() to set multiplexed Config if needed; if Config."demultiplexer.barcodeCutoff" is defined, validate that it is between 0.0 and 1.0. |
| void | cleanUp() - Update SeqUtil to indicate data has been demultiplexed. |
| protected void | demultiplex(Map<String,Set<String>> validHeaders) - Demultiplex the file into separate small temp files, with 2000000 lines each, for processing. |
| String | getCitationString() - At a minimum, this should return the name and/or url for the wrapped tool. |
| String | getDescription() - Briefly describe what this module does. |
| String | getDetails() - An extension of getDescription. |
| List<File> | getInputFiles() - The BioModule input files are initialized on the first call to BioModuleImpl.getInputFiles() and then cached. |
| List<File> | getSeqFiles(Collection<File> files) - Return only sequence files for sample IDs found in the metadata file. If Config."metadata.required" = "Y", an error is thrown that lists the files that cannot be matched to a metadata row. |
| String | getSummary() - Produces initial count and demultiplexed output count summaries for forward/reverse reads. |
| protected Map<String,Set<String>> | getValidFwHeaders() - Get valid forward read headers that belong to reads with a valid barcode or sample identifier. |
| protected Map<String,Set<String>> | getValidHeaders() - Obtain all valid headers for the forward reads and return only the headers that also have a matching reverse read. |
| void | runModule() - Module execution summary: execute breakUpFiles() to split the multiplexed file into smaller files for processing; execute getValidHeaders() to obtain the list of valid headers matched to a sample ID in the metadata file, which also verifies matching forward and reverse read headers if demultiplexing paired reads. |
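
The splitting step summarized for breakUpFiles() and NUM_LINES_TEMP_FILE can be pictured with a minimal sketch. The class LineChunker, its split method, and the chunk_N.txt file naming are hypothetical and are not part of BioLockJ; only the idea of writing a fixed number of lines per temp file comes from the summary above.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not BioLockJ's implementation.
final class LineChunker {
    private LineChunker() {}

    /**
     * Copy the input file into numbered chunk files of at most linesPerChunk
     * lines each and return the chunk paths in the order they were written.
     */
    static List<Path> split(Path input, Path outDir, int linesPerChunk) throws IOException {
        if (linesPerChunk < 1) {
            throw new IllegalArgumentException("linesPerChunk must be >= 1");
        }
        Files.createDirectories(outDir);
        List<Path> chunks = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            BufferedWriter writer = null;
            try {
                int linesInChunk = 0;
                String line;
                while ((line = reader.readLine()) != null) {
                    // Start a new chunk for the first line and whenever the current one is full.
                    if (writer == null || linesInChunk == linesPerChunk) {
                        if (writer != null) {
                            writer.close();
                        }
                        Path chunk = outDir.resolve("chunk_" + chunks.size() + ".txt");
                        chunks.add(chunk);
                        writer = Files.newBufferedWriter(chunk, StandardCharsets.UTF_8);
                        linesInChunk = 0;
                    }
                    writer.write(line);
                    writer.newLine();
                    linesInChunk++;
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return chunks;
    }
}
```

Calling LineChunker.split(input, tempDir, 2000000) would mirror the documented chunk size. Assuming the common 4-line FASTQ record layout, a chunk size that is a multiple of 4 keeps each record within a single chunk.
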

Inherited methods, grouped by declaring type:
buildScript, executeTask, getDockerImageName, getDockerImageOwner, getDockerImageTag, getWorkerScriptFunctions, isValidInputModule, markStatus, moduleComplete, moduleFailed, runBioLockJ_CMD
buildScriptForPairedReads, getJobParams, getMainScript, getRuntimeParams, getScriptDir, getScriptErrors, getTimeout, hasScripts, isValidProp
addGeneralProperty, addGeneralProperty, addGeneralProperty, addNewProperty, addNewProperty, cacheInputFiles, compareTo, equals, findModuleInputFiles, getAlias, getDescription, getFileCache, getID, getLogDir, getMenuPlacement, getMetadata, getModuleDir, getOutputDir, getPostRequisiteModules, getPreRequisiteModules, getPropDefault, getPropDescMap, getPropType, getPropTypeMap, getResourceDir, getTempDir, getTitle, hashCode, init, listProps, setAlias, toString
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
buildScript, buildScriptForPairedReads, getJobParams, getMainScript, getScriptDir, getScriptErrors, getTimeout, getWorkerScriptFunctions
executeTask, getAlias, getDockerImageName, getDockerImageOwner, getDockerImageTag, getID, getLogDir, getMetadata, getModuleDir, getOutputDir, getPostRequisiteModules, getPreRequisiteModules, getPropDefault, getResourceDir, getTempDir, init, isValidInputModule, setAlias, version
getDescription, getMenuPlacement, getPropType, getTitle, isValidProp, listProps

protected static final int NUM_LINES_TEMP_FILE

public Demultiplexer()
throws API_Exception
Throws: API_Exception

public void checkDependencies()
throws Exception
Validate module dependencies:
- If Config."demultiplexer.strategy" indicates use of barcodes to demultiplex, validate that the metadata column named Config."metadata.barcodeColumn" exists
- Call DemuxUtil.setMultiplexedConfig() to set multiplexed Config if needed
- If Config."demultiplexer.barcodeCutoff" is defined, validate that it is between 0.0 and 1.0
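
The range check in the last item can be sketched as follows. This is a minimal illustration, not BioLockJ's code: the class BarcodeCutoffCheck and its parseCutoff method are hypothetical, the raw property value is passed in as a plain string rather than read through Config, and treating both 0.0 and 1.0 as valid is an assumption.

```java
// Hypothetical helper for illustration only; not part of the BioLockJ API.
final class BarcodeCutoffCheck {
    private BarcodeCutoffCheck() {}

    /**
     * Return the parsed cutoff, or null when the property is undefined
     * (null or blank), in which case there is nothing to validate.
     * Assumption: the valid range is inclusive on both ends.
     */
    static Double parseCutoff(String rawValue) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            return null;
        }
        double cutoff;
        try {
            cutoff = Double.parseDouble(rawValue.trim());
        } catch (NumberFormatException ex) {
            throw new IllegalArgumentException(
                "demultiplexer.barcodeCutoff is not a number: " + rawValue, ex);
        }
        // Double.parseDouble accepts "NaN", so reject it explicitly.
        if (Double.isNaN(cutoff) || cutoff < 0.0 || cutoff > 1.0) {
            throw new IllegalArgumentException(
                "demultiplexer.barcodeCutoff must be between 0.0 and 1.0: " + rawValue);
        }
        return cutoff;
    }
}
```
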

Specified by: checkDependencies in interface BioModule
Overrides: checkDependencies in class ScriptModuleImpl
Throws: Exception - thrown if missing or invalid dependencies are found

public void cleanUp()
throws Exception
Specified by: cleanUp in interface BioModule
Overrides: cleanUp in class BioModuleImpl
Throws: Exception - if unable to modify property

public List<File> getInputFiles()
Description copied from class: BioModuleImpl
The input files are initialized on the first call to BioModuleImpl.getInputFiles() and then cached.
Specified by: getInputFiles in interface BioModule
Overrides: getInputFiles in class BioModuleImpl

public List<File> getSeqFiles(Collection<File> files) throws SequnceFormatException
Description copied from interface: SeqModule
Return only sequence files for sample IDs found in the metadata file. If Config."metadata.required" = "Y", an error is thrown that lists the files that cannot be matched to a metadata row.
Specified by: getSeqFiles in interface SeqModule
Parameters: files - Module input files
Throws: SequnceFormatException - if Config."metadata.required" = "Y" but sequence files are found that do not have a corresponding record in the metadata file, or if invalid metadata prevents parsing SEQ files

public String getSummary() throws Exception
Specified by: getSummary in interface BioModule
Overrides: getSummary in class ScriptModuleImpl
Throws: Exception - if any error occurs

public void runModule()
throws Exception
Module execution summary:
- Execute breakUpFiles() to split the multiplexed file into smaller files for processing
- Execute getValidHeaders() to obtain the list of valid headers matched to a sample ID in the metadata file; this also verifies matching forward and reverse read headers if demultiplexing paired reads
- Execute demultiplex(Map) to demultiplex the data into a separate file (or pair of files) for each sample

If paired reads are combined in a single file, the read direction must be identified in the sequence header using the key strings " 1:N:" and " 2:N:".
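
A minimal sketch of how the two key strings could be used to classify a header and to compare the forward and reverse headers of one pair. The class ReadDirection, its methods, and the sample headers in the usage note are hypothetical and only illustrate the rule stated above; they are not taken from BioLockJ.

```java
// Hypothetical helper for illustration only; not part of the BioLockJ API.
final class ReadDirection {
    enum Direction { FORWARD, REVERSE, UNKNOWN }

    // Key strings quoted in the runModule() description.
    static final String FW_KEY = " 1:N:";
    static final String RV_KEY = " 2:N:";

    private ReadDirection() {}

    /** Classify a sequence header by the key string it contains. */
    static Direction of(String header) {
        if (header == null) {
            return Direction.UNKNOWN;
        }
        if (header.contains(FW_KEY)) {
            return Direction.FORWARD;
        }
        if (header.contains(RV_KEY)) {
            return Direction.REVERSE;
        }
        return Direction.UNKNOWN;
    }

    /**
     * Return the part of the header before the key string, so that the
     * forward and reverse headers of the same pair compare as equal.
     * Headers without a key string are returned unchanged.
     */
    static String pairId(String header) {
        int fw = header.indexOf(FW_KEY);
        int rv = header.indexOf(RV_KEY);
        int cut = fw >= 0 ? fw : rv;
        return cut >= 0 ? header.substring(0, cut) : header;
    }
}
```

For example, the headers "@M01:7:000-ABCDE:1:1101:1000:2000 1:N:0:ATCACG" and "@M01:7:000-ABCDE:1:1101:1000:2000 2:N:0:ATCACG" classify as FORWARD and REVERSE respectively and share the same pairId.
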

Specified by: runModule in interface JavaModule
Overrides: runModule in class JavaModuleImpl
Throws: Exception - thrown if any runtime error occurs

protected void breakUpFiles()
throws Exception
Throws: Exception - if unexpected errors occur at runtime

protected void demultiplex(Map<String,Set<String>> validHeaders) throws Exception
Parameters: validHeaders - Set of valid headers
Throws: Exception - if an error occurs reading the multiplexed file

protected Map<String,Set<String>> getValidFwHeaders() throws Exception
Throws: Exception - if an error occurs

protected Map<String,Set<String>> getValidHeaders() throws Exception
Throws: Exception - if unable to obtain headers

public String getDescription()
Description copied from interface: ApiModule
Briefly describe what this module does. See getDetails.
Specified by: getDescription in interface ApiModule

public String getDetails() throws API_Exception
Description copied from interface: ApiModule
An extension of getDescription. Beyond the brief description, give details such as the interaction between properties.
Specified by: getDetails in interface ApiModule
Overrides: getDetails in class BioModuleImpl
Throws: API_Exception

public String getCitationString()
Description copied from interface: ApiModule
At a minimum, this should return the name and/or url for the wrapped tool.
Specified by: getCitationString in interface ApiModule