RdpHierParser
Add to module run order:
#BioModule biolockj.module.implicit.parser.r16s.RdpHierParser
Description
Create taxa tables from the _hierarchicalCount.tsv files output by RDP.
Properties
Properties are the name=value pairs in the configuration file.
RdpHierParser properties:
Property | Description |
---|---|
rdp.hierCounts | boolean Set this property to "Y" to use this module, instead of the default RdpParser, as the follow-up to the RdpClassifier module. default: null |
rdp.minThresholdScore | numeric RdpClassifier uses this property to ignore OTU assignments below this threshold score (0-100). default: 80 |
General properties applicable to this module:
Property | Description |
---|---|
cluster.batchCommand | string Terminal command used to submit jobs on the cluster default: null |
cluster.jobHeader | string Header written at top of worker scripts default: null |
cluster.modules | list List of cluster modules to load at start of worker scripts default: null |
cluster.prologue | string To run at the start of every script after loading cluster modules (if any) default: null |
cluster.statusCommand | string Terminal command used to check the status of jobs on the cluster default: null |
docker.saveContainerOnExit | boolean If Y, docker run command will NOT include the --rm flag default: null |
docker.verifyImage | boolean In check dependencies, run a test to verify the docker image. default: null |
script.defaultHeader | string Store default script header for MAIN script and locally run WORKER scripts. default: #!/bin/bash |
script.numThreads | integer Used to reserve cluster resources and passed to any external application call that accepts a numThreads parameter. default: 8 |
script.numWorkers | integer Set number of samples to process per script (if parallel processing) default: 1 |
script.permissions | string Used as chmod permission parameter (ex: 774) default: 770 |
script.timeout | integer Sets the number of minutes before worker scripts time out. default: null |
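
As a rough illustration of how these general properties might appear in a configuration file, the fragment below overrides a few of the script settings from the table above. The values are placeholders chosen for the example, not recommendations, and treating plain `#` lines as comments is an assumption about the config format.

```properties
# Illustrative values only; adjust to the resources actually available.
script.numThreads=4
script.numWorkers=1
# Minutes before a worker script times out (the default is null).
script.timeout=30
# Same as the documented default, shown to make the setting explicit.
script.permissions=770
```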
Details
version: 1.0.0
This module requires that rdp.hierCounts=Y for the RdpClassifier module to make the required output type. As long as rdp.hierCounts is set, this module will automatically be added to the module run order by the RdpClassifier module.
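
A minimal sketch of that setup, using only the module and property names given on this page, might look like the following. Treating plain `#` lines as comments is an assumption about the config format, and a real pipeline would also need input and other settings that are not shown here.

```properties
# RdpClassifier is listed explicitly; with rdp.hierCounts set, RdpHierParser
# is expected to be added to the run order automatically.
#BioModule biolockj.module.classifier.r16s.RdpClassifier

rdp.hierCounts=Y
# Same as the documented default, shown to make the cutoff explicit.
rdp.minThresholdScore=80
```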
If this module is in the module run order, it adds biolockj.module.classifier.r16s.RdpClassifier as a pre-requisite module.
To use this module without the RDP module, include ModuleOutput[RdpClassifier] in the list of input types: pipeline.inputTypes=ModuleOutput[RdpClassifier]
When using input from a directory, this module takes exactly one input directory.
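
For the standalone case, a configuration could be sketched roughly as below. The module line and `pipeline.inputTypes` value come from this page; the `input.dirPaths` property name and the directory path are assumptions added for illustration and should be checked against the general BioLockJ configuration documentation.

```properties
# Run the parser directly on existing RDP hierarchical count files.
#BioModule biolockj.module.implicit.parser.r16s.RdpHierParser

pipeline.inputTypes=ModuleOutput[RdpClassifier]
# Assumed property name and hypothetical path: exactly one directory
# containing the _hierarchicalCount.tsv files.
input.dirPaths=/path/to/rdp_hier_counts
```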
This module is an alternative to the default parser, RdpParser. The two parsers produce nearly identical output. The RdpParser module parses the output for each sequence and determines counts for each taxonomic unit. It fills in missing levels so that all sequences are counted at all taxonomic levels; this means that unclassified reads are reported as an OTU with "unclassified" in the name. By contrast, the RdpHierParser module relies on RDP to determine these totals. With RdpParser, the confidence threshold is applied by the parser; with RdpHierParser, the confidence threshold is applied by RDP during classification.
Adds modules
pre-requisite modules
biolockj.module.classifier.r16s.RdpClassifier
post-requisite modules
none found
Docker
If running in docker, this module will run in a docker container from this image:
biolockjdevteam/biolockj_controller:v1.4.2
This can be modified using the following properties:
RdpHierParser.imageOwner
RdpHierParser.imageName
RdpHierParser.imageTag
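
As a sketch, the default image named above would correspond to the following settings, assuming the three properties map onto the owner, name, and tag parts of the image string in the obvious way. Changing any of the values would point the module at a different image.

```properties
# Equivalent to biolockjdevteam/biolockj_controller:v1.4.2 (assumed mapping).
RdpHierParser.imageOwner=biolockjdevteam
RdpHierParser.imageName=biolockj_controller
RdpHierParser.imageTag=v1.4.2
```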
Citation
Module created by Ivory Blakley.