RdpHierParser#

Add to module run order:
#BioModule biolockj.module.implicit.parser.r16s.RdpHierParser

Description#

Create taxa tables from the _hierarchicalCount.tsv files output by RDP.

Properties#

Properties are the name=value pairs in the configuration file.

RdpHierParser properties:#

Property Description
rdp.hierCounts boolean
Set this property to "Y" to use this module, instead of the default RdpParser, as the follow-up to the RdpClassifier module.
default: null
rdp.minThresholdScore numeric
RdpClassifier will use this property and ignore OTU assignments below this threshold score (0-100)
default: 80
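
As a minimal sketch, the two module-specific properties could be set in the pipeline configuration file as follows. The values are illustrative: Y enables the hierarchical count output, and 80 restates the documented default.

```properties
# Have RdpClassifier produce the _hierarchicalCount.tsv output used by RdpHierParser
rdp.hierCounts=Y
# Ignore OTU assignments scoring below 80 (the documented default)
rdp.minThresholdScore=80
```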

General properties applicable to this module:#

Property Description
cluster.batchCommand string
Terminal command used to submit jobs on the cluster
default: null
cluster.jobHeader string
Header written at top of worker scripts
default: null
cluster.modules list
List of cluster modules to load at start of worker scripts
default: null
cluster.prologue string
To run at the start of every script after loading cluster modules (if any)
default: null
cluster.statusCommand string
Terminal command used to check the status of jobs on the cluster
default: null
docker.saveContainerOnExit boolean
If Y, docker run command will NOT include the --rm flag
default: null
docker.verifyImage boolean
In check dependencies, run a test to verify the docker image.
default: null
script.defaultHeader string
Store default script header for MAIN script and locally run WORKER scripts.
default: #!/bin/bash
script.numThreads integer
Used to reserve cluster resources and passed to any external application call that accepts a numThreads parameter.
default: 8
script.numWorkers integer
Set number of samples to process per script (if parallel processing)
default: 1
script.permissions string
Used as chmod permission parameter (ex: 774)
default: 770
script.timeout integer
Sets the number of minutes before worker scripts time out.
default: null
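
For illustration, a configuration might override a few of these general properties as sketched below. The values are hypothetical and chosen only to show the name=value syntax; they are not recommendations.

```properties
# Hypothetical values, shown only to illustrate the syntax
script.numThreads=4
script.numWorkers=1
script.permissions=774
# Minutes before a worker script times out
script.timeout=60
```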

Details#

version: 1.0.0
This module requires rdp.hierCounts=Y so that the RdpClassifier module produces the required output type. As long as rdp.hierCounts is set to Y, the RdpClassifier module automatically adds this module to the module run order.
If this module is in the module run order, it adds biolockj.module.classifier.r16s.RdpClassifier as a pre-requisite module.
To use this module without the RDP module, include ModuleOutput[RdpClassifier] in the list of input types:
pipeline.inputTypes=ModuleOutput[RdpClassifier]
When using input from a directory, this module takes exactly one input directory.
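
A configuration that runs this module on existing RDP output, without the RdpClassifier module in the pipeline, might look like the sketch below. The input.dirPaths property name and the directory path are assumptions made for illustration; substitute whatever your configuration uses to point at the single input directory holding the _hierarchicalCount.tsv files.

```properties
#BioModule biolockj.module.implicit.parser.r16s.RdpHierParser

# Treat the input directory as output of the RdpClassifier module
pipeline.inputTypes=ModuleOutput[RdpClassifier]
# Assumed property name and hypothetical path: exactly one input directory
input.dirPaths=/path/to/rdp_output
```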

This module is an alternative to the default parser, RdpParser. The two parsers produce nearly identical output. The RdpParser module parses the output for each sequence and determines counts for each taxonomic unit. It fills in missing levels so that all sequences are counted at all taxonomic levels; this means that unclassified reads are reported as an OTU with "unclassified" in the name. By contrast, the RdpHierParser module relies on RDP to determine these totals. With RdpParser, the confidence threshold is applied by the parser; with RdpHierParser, the confidence threshold is applied by RDP during classification.

Adds modules#

pre-requisite modules
biolockj.module.classifier.r16s.RdpClassifier
post-requisite modules
none found

Docker#

If running in docker, this module will run in a docker container from this image:

biolockjdevteam/biolockj_controller:v1.4.2

This can be modified using the following properties:
RdpHierParser.imageOwner
RdpHierParser.imageName
RdpHierParser.imageTag
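
For example, the three properties could be set together as in the sketch below. The values simply restate the default image named above, on the assumption that owner, name, and tag map onto the owner/name:tag form of a Docker image reference; change any of them to point at a different image.

```properties
# Assumed mapping: imageOwner/imageName:imageTag
RdpHierParser.imageOwner=biolockjdevteam
RdpHierParser.imageName=biolockj_controller
RdpHierParser.imageTag=v1.4.2
```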

Citation#

Module created by Ivory Blakley.