public class ImportMetadata extends BioModuleImpl implements ApiModule
GZIP_EXT, LOG_EXT, PDF_EXT, RETURN, SH_EXT, TAB_DELIM, TSV_EXT, TXT_EXT
LOG_DIR, MAIN_SCRIPT_PREFIX, NO_VERSION, OUTPUT_DIR, RES_DIR, TEMP_DIR
Constructor and Description |
---|
ImportMetadata() |
Modifier and Type | Method and Description |
---|---|
protected void |
buildNewMetadataFile()
Create a simple metadata file in the module output directory, with only the 1st column populated with Sample IDs.
|
void |
checkDependencies()
If restarting or running a direct pipeline, execute the cleanup for completed modules.
|
void |
cleanUp()
Verify the metadata fields configured for R reports.
|
void |
executeTask()
If Config."metadata.filePath" is undefined, build a new metadata file
with only 1 column of sample IDs. |
protected String |
formatMetaId(String sampleIdColumnName)
Format the metadata ID to remove problematic invisible characters (particularly converted Excel files).
|
String |
getCitationString()
At a minimum, this should return the name and/or url for the wrapped tool.
|
String |
getDescription()
Briefly describe what this module does.
|
String |
getDetails()
An extension of getDescription. |
String |
getDockerImageName()
Get the docker image to use for this module.
|
String |
getDockerImageOwner()
Get the name of the docker hub user that owns the image to use for this module.
|
protected String |
getQuotedValue(String val)
The member variable quotedText caches the input held within a quoted block.
|
protected TreeSet<String> |
getSampleIds()
Extract the sample IDs from the file names with
SeqUtil.getSampleIdFromString(String) |
String |
getSummary()
The metadata file can be updated several times during pipeline execution.
|
protected boolean |
inQuotes(String val)
Method called each time a line from metadata contains the
Config."metadata.columnDelim". |
protected String |
parseRow(String line,
boolean isHeader)
Method called to parse a row from the metadata file, where
Config."metadata.columnDelim" separates columns. |
protected void |
verifyAllRowsMapToSeqFile(List<File> files)
Verify every row (every Sample ID) maps to a sequence file
|
protected static void |
verifyHeader(String cell,
List<String> colNames,
int colNum)
Verify column headers are not null and unique
|
addGeneralProperty, addGeneralProperty, addGeneralProperty, addNewProperty, addNewProperty, cacheInputFiles, compareTo, equals, findModuleInputFiles, getAlias, getDescription, getDockerImageTag, getFileCache, getID, getInputFiles, getLogDir, getMenuPlacement, getMetadata, getModuleDir, getOutputDir, getPostRequisiteModules, getPreRequisiteModules, getPropDefault, getPropDescMap, getPropType, getPropTypeMap, getResourceDir, getTempDir, getTitle, hashCode, init, isValidInputModule, isValidProp, listProps, setAlias, toString
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
getDescription, getMenuPlacement, getPropType, getTitle, isValidProp, listProps
getAlias, getDockerImageTag, getID, getInputFiles, getLogDir, getMetadata, getModuleDir, getOutputDir, getPostRequisiteModules, getPreRequisiteModules, getPropDefault, getResourceDir, getTempDir, init, isValidInputModule, setAlias, version
public void checkDependencies() throws Exception
BioModuleImpl
checkDependencies
in interface BioModule
checkDependencies
in class BioModuleImpl
Exception
- thrown if missing or invalid dependencies are found
public void cleanUp() throws Exception
cleanUp
in interface BioModule
cleanUp
in class BioModuleImpl
Exception
- thrown if any runtime error occurs
public void executeTask() throws Exception
If Config."metadata.filePath" is undefined, build a new metadata file
with only 1 column of sample IDs. Otherwise, import the "metadata.filePath" file and call
MetaUtil.refreshCache() to validate, format, and cache the metadata as a tab-delimited text file.
executeTask
in interface BioModule
executeTask
in class BioModuleImpl
Exception
- thrown if the module is unable to complete its task
public String getSummary() throws Exception
MetaUtil
).
getSummary
in interface BioModule
getSummary
in class BioModuleImpl
Exception
- if any error occurs
protected void buildNewMetadataFile() throws BioLockJException
BioLockJException
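As a rough illustration of the single-column file that buildNewMetadataFile() is described as producing, the sketch below writes a header line followed by one sample ID per row. It is not the module's code: the class name, the `SAMPLE_ID` header, and the `metadata.tsv` file name are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class MinimalMetadataSketch {
    // Hypothetical header for the ID column; the real module may use a different name.
    static final String ID_COLUMN = "SAMPLE_ID";

    // Build the file content: one header line, then one sample ID per line.
    static List<String> buildLines(TreeSet<String> sampleIds) {
        List<String> lines = new ArrayList<>();
        lines.add(ID_COLUMN);
        lines.addAll(sampleIds); // a TreeSet iterates in sorted (lexicographic) order
        return lines;
    }

    // Write the lines to outputDir/fileName, creating the directory if needed.
    static Path write(Path outputDir, String fileName, TreeSet<String> sampleIds) throws IOException {
        Files.createDirectories(outputDir);
        return Files.write(outputDir.resolve(fileName), buildLines(sampleIds), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        TreeSet<String> ids = new TreeSet<>(Arrays.asList("S2", "S10", "S1"));
        Path out = write(Files.createTempDirectory("meta"), "metadata.tsv", ids);
        Files.readAllLines(out).forEach(System.out::println);
    }
}
```

Because the IDs are held in a TreeSet<String>, as in the getSampleIds() return type, rows come out in lexicographic order, so `S10` sorts before `S2`.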
protected String formatMetaId(String sampleIdColumnName)
sampleIdColumnName
- Current name of metadata Sample ID column
protected String getQuotedValue(String val)
The member variable quotedText caches the input held within a quoted block. The block may contain the
Config."metadata.columnDelim", which will be read as a character. If
val closes an open quoted block, the entire quoted block is returned (ending with
Config."metadata.columnDelim" as all cells do) and the quotedText
cache is cleared.
val
- Parameter to evaluate
protected TreeSet<String> getSampleIds() throws Exception
Extract the sample IDs from the file names with
SeqUtil.getSampleIdFromString(String)
Exception
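The quoted-block rule described for getQuotedValue, inQuotes, and parseRow (a column delimiter inside quotes is an ordinary character) can be illustrated with a small standalone splitter. This is only a sketch: it scans character by character instead of using the quotedText cache the module is documented to use, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedDelimSketch {
    // Split a row on delim, treating any delim inside double quotes as an ordinary character.
    static List<String> splitRow(String line, char delim) {
        List<String> cells = new ArrayList<>();
        StringBuilder cell = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // each quote opens or closes a block; quotes stay in the cell
                cell.append(c);
            } else if (c == delim && !inQuotes) {
                cells.add(cell.toString()); // delimiter outside quotes ends the current cell
                cell.setLength(0);
            } else {
                cell.append(c); // includes delimiters that fall inside a quoted block
            }
        }
        cells.add(cell.toString()); // last cell has no trailing delimiter
        return cells;
    }

    public static void main(String[] args) {
        // The tab between "left" and "right" is inside quotes, so the row has 3 cells.
        System.out.println(splitRow("S1\t\"left\tright\"\tcontrol", '\t').size());
    }
}
```

The sketch keeps the quote characters in the returned cell and does not handle escaped quotes; whether the module strips or escapes them is not stated in this page.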
protected boolean inQuotes(String val)
Method called each time a line from metadata contains the
Config."metadata.columnDelim". If the
Config."metadata.columnDelim" is encountered within a quoted block,
it should be interpreted as a character (not interpreted as a column delimiter).
val
- Parameter to evaluate
protected String parseRow(String line, boolean isHeader) throws Exception
Method called to parse a row from the metadata file, where
Config."metadata.columnDelim" separates columns. The quotedText
member variable serves as a cache to build cell values contained in quotes, which may include the
Config."metadata.columnDelim" as a standard character. Each row
increments the rowNum member variable. When the header row is processed, colNames caches the field names.
line
- read from metadata file
isHeader
- is true for only the first row
Exception
- if required Config values are missing or invalid
protected void verifyAllRowsMapToSeqFile(List<File> files) throws Exception
files
- List of sequence files
ConfigViolationException
- if unmapped Sample IDs are found
Exception
- if other errors occur
protected static void verifyHeader(String cell, List<String> colNames, int colNum) throws Exception
cell
- value of the header column name
colNames
- a list of column names read so far
colNum
- included for reference in error message if needed
Exception
- if a column header is null or not unique
public String getDockerImageOwner()
BioModule
getDockerImageOwner
in interface BioModule
getDockerImageOwner
in class BioModuleImpl
public String getDockerImageName()
BioModule
getDockerImageName
in interface BioModule
public String getDescription()
ApiModule
getDetails.
getDescription
in interface ApiModule
public String getCitationString()
ApiModule
getCitationString
in interface ApiModule
public String getDetails()
ApiModule
An extension of getDescription. Beyond the brief description, give details such as
the interaction between properties.
getDetails
in interface ApiModule
getDetails
in class BioModuleImpl