NormalizeTaxaTables#
Add to module run order:
#BioModule biolockj.module.report.taxa.NormalizeTaxaTables
Description#
Normalize taxa tables for sequencing depth.
Properties#
Properties are the name=value
pairs in the configuration file.
NormalizeTaxaTables properties:#
none
General properties applicable to this module:#
Property | Type | Description | Default |
---|---|---|---|
cluster.batchCommand | string | Terminal command used to submit jobs on the cluster | null |
cluster.jobHeader | string | Header written at top of worker scripts | null |
cluster.modules | list | List of cluster modules to load at start of worker scripts | null |
cluster.prologue | string | To run at the start of every script after loading cluster modules (if any) | null |
cluster.statusCommand | string | Terminal command used to check the status of jobs on the cluster | null |
docker.saveContainerOnExit | boolean | If Y, docker run command will NOT include the --rm flag | null |
docker.verifyImage | boolean | In check dependencies, run a test to verify the docker image. | null |
report.logBase | string | Options: 10, e, null. If e, use natural log (base e); if 10, use log base 10; if not set, counts will not be converted to a log scale. | 10 |
script.defaultHeader | string | Store default script header for MAIN script and locally run WORKER scripts. | #!/bin/bash |
script.numThreads | integer | Used to reserve cluster resources and passed to any external application call that accepts a numThreads parameter. | 8 |
script.numWorkers | integer | Set number of samples to process per script (if parallel processing) | 1 |
script.permissions | string | Used as chmod permission parameter (ex: 774) | 770 |
script.timeout | integer | Sets the number of minutes before worker scripts time out. | null |
Details#
version: 1.0.0
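Normalize taxa tables based on the formula:

$$ counts_{normalized} = \frac{RC}{n} \times \frac{\sum (x)}{N} + 1 $$

Where:
- $RC$ = raw count; the cell value before normalizing
- $n$ = number of sequences in the sample (total within a sample)
- $\sum (x)$ = total number of counts in the table (total across samples)
- $N$ = total number of samples

Typically the data is put on a $\log_{10}$ scale, so the full formula is:

$$ counts_{final} = \log_{10} \left( \frac{RC}{n} \times \frac{\sum (x)}{N} + 1 \right) $$

The $counts_{final}$ values will be in the output directory of the LogTransformTaxaTables
module. The $counts_{normalized}$ values will be in the output of the NormalizeTaxaTables
module.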
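
As a rough illustration of the normalization step, the sketch below scales each raw count by its sample's total, multiplies by the average sample depth (table total divided by number of samples), and adds 1. It is not the module's actual Java implementation: the function name `normalize_taxa_table`, the nested-dictionary table layout, and the example counts are all assumptions made for this example, and the sample depth is assumed to equal the sum of that sample's counts in the table.

```python
from typing import Dict

# Hypothetical in-memory layout: sample ID -> {taxon name -> raw count}
Table = Dict[str, Dict[str, float]]


def normalize_taxa_table(raw: Table) -> Table:
    """Return (raw count / sample total) * (table total / sample count) + 1."""
    # Total count within each sample (assumed to be the row sum of the table)
    sample_totals = {sample: sum(counts.values()) for sample, counts in raw.items()}
    # Total count across the whole table
    table_total = sum(sample_totals.values())
    # Number of samples in the table
    num_samples = len(raw)
    # Average sequencing depth across samples
    average_depth = table_total / num_samples

    normalized: Table = {}
    for sample, counts in raw.items():
        sample_total = sample_totals[sample]
        if sample_total == 0:
            raise ValueError(f"Sample {sample} has no counts and cannot be normalized")
        normalized[sample] = {
            taxon: raw_count / sample_total * average_depth + 1
            for taxon, raw_count in counts.items()
        }
    return normalized


# Made-up counts for two samples with different sequencing depths (40 vs 160)
raw_counts = {
    "sampleA": {"Bacteroides": 30, "Prevotella": 10},
    "sampleB": {"Bacteroides": 20, "Prevotella": 140},
}
normalized_counts = normalize_taxa_table(raw_counts)
print(normalized_counts["sampleA"])  # {'Bacteroides': 76.0, 'Prevotella': 26.0}
print(normalized_counts["sampleB"])  # {'Bacteroides': 13.5, 'Prevotella': 88.5}
```

In this example the average depth is 100, so both samples are rescaled to a common depth of 100 before the pseudocount of 1 is added.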
For further explanation regarding the normalization scheme, please read The ISME Journal 2013 paper by Dr. Anthony Fodor: "Stochastic changes over time and not founder effects drive cage effects in microbial community assembly in a mouse model"
If report.logBase is not null, the LogTransformTaxaTables module
is added as a post-requisite module.
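
The effect of the three documented report.logBase options (10, e, null) on a single normalized value can be sketched as follows. This is only an illustration of the documented behavior, not the LogTransformTaxaTables code: the function name `log_transform` is hypothetical, and representing an unset property as Python `None` is an assumption.

```python
import math
from typing import Optional


def log_transform(value: float, log_base: Optional[str]) -> float:
    """Apply the log scale selected by a report.logBase-style setting."""
    if log_base is None:
        # Property not set: counts are not converted to a log scale
        return value
    if log_base == "10":
        return math.log10(value)
    if log_base == "e":
        return math.log(value)  # natural log
    raise ValueError(f"Unsupported log base: {log_base!r} (expected '10', 'e', or None)")


print(log_transform(100.0, "10"))  # 2.0
print(log_transform(1.0, "e"))     # 0.0
print(log_transform(76.0, None))   # 76.0
```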
Adds modules#
pre-requisite modules:
- pipeline-dependent

post-requisite modules:
- biolockj.module.report.taxa.LogTransformTaxaTables
Docker#
If running in docker, this module will run in a docker container from this image:
biolockjdevteam/biolockj_controller:v1.4.2
This can be modified using the following properties:
NormalizeTaxaTables.imageOwner
NormalizeTaxaTables.imageName
NormalizeTaxaTables.imageTag
Citation#
"Stochastic changes over time and not founder effects drive cage effects in microbial community assembly in a mouse model"
Module developed by Mike Sioda.