General Properties

biolockj

Property	Description
biolockj.version	string Property giving the biolockj version that was used to generate the config file. default: null

cluster

Property	Description
cluster.batchCommand	string Terminal command used to submit jobs on the cluster default: null
cluster.host	string The remote cluster host URL (used for ssh, scp, rsync, etc) default: null
cluster.jobHeader	string Header written at top of worker scripts default: null
cluster.modules	list List of cluster modules to load at start of worker scripts default: null
cluster.prologue	string To run at the start of every script after loading cluster modules (if any) default: null
cluster.returnsBatchIds	boolean Options: Y/N. Does cluster.batchCommand return a job ID? If Y and a job is submitted but no ID is returned, that is treated as a failure and the pipeline stops. default: Y
cluster.statusCommand	string Terminal command used to check the status of jobs on the cluster default: null

demultiplexer

Property	Description
demultiplexer.barcodeCutoff	numeric Options: (0.0 - 1.0); if defined, the pipeline will fail if the fraction of reads with a barcode is less than this cutoff. default: 0.05
demultiplexer.barcodeRevComp	boolean Options: Y/N. Use the reverse complement of metadata.barcodeColumn if demultiplexer.strategy = barcode_in_header or barcode_in_seq. default: null
demultiplexer.strategy	string Options: barcode_in_header, barcode_in_seq, id_in_header, do_not_demux. If using barcodes, they must be provided in the metadata file within the column defined by metadata.barcodeColumn. default: null
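
The two barcode-related settings above can be pictured with a short Python sketch. It only illustrates the described behavior and is not BioLockJ's own code: the function names `reverse_complement` and `passes_barcode_cutoff` are hypothetical, and treating the cutoff as a fraction between 0 and 1 is an assumption based on the stated option range.

```python
# Illustrative sketch only; these helpers are hypothetical, not part of BioLockJ.

# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(barcode):
    """Return the reverse complement of a DNA barcode (cf. demultiplexer.barcodeRevComp)."""
    return barcode.translate(_COMPLEMENT)[::-1]


def passes_barcode_cutoff(reads_with_barcode, total_reads, cutoff=0.05):
    """Return False when the share of barcoded reads falls below the cutoff.

    Assumes demultiplexer.barcodeCutoff is a fraction in the range 0.0 - 1.0.
    """
    if total_reads == 0:
        return False
    return reads_with_barcode / total_reads >= cutoff
```

With the default cutoff, a run in which 4 of 100 reads carry a barcode would be rejected under this reading, while 10 of 100 would pass.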

docker

Property	Description
docker.imageName	string The name of a docker image to override whatever a module says to use. Only use the module-specific-override form of this property. default: null
docker.imageOwner	string Name of the Docker Hub user that owns the docker containers. Only use the module-specific-override form of this property. default: null
docker.imageTag	string Image tag, a specific version of Docker images. Only use the module-specific-override form of this property. default: null
docker.mountSock	boolean Should /var/run/docker.sock be mounted for modules? default: N
docker.saveContainerOnExit	boolean If Y, docker run command will NOT include the --rm flag default: null
docker.verifyImage	boolean In check dependencies, run a test to verify the docker image. default: null

exe

Property	Description
exe.Rscript	executable Path for the "Rscript" executable; if not supplied, any script that needs the Rscript command will assume it is on the PATH. default: null
exe.awk	executable Path for the "awk" executable; if not supplied, any script that needs the awk command will assume it is on the PATH. default: null
exe.docker	executable Path for the "docker" executable; if not supplied, any script that needs the docker command will assume it is on the PATH. default: null
exe.gzip	executable Path for the "gzip" executable; if not supplied, any script that needs the gzip command will assume it is on the PATH. default: null
exe.java	executable Path for the "java" executable; if not supplied, any script that needs the java command will assume it is on the PATH. default: null
exe.python	executable Path for the "python" executable; if not supplied, any script that needs the python command will assume it is on the PATH. default: null

humann2

Property	Description
humann2.disableGeneFamilies	boolean disable HUMAnN2 Gene Family report default: null
humann2.disablePathAbundance	boolean disable HUMAnN2 Pathway Abundance report default: null
humann2.disablePathCoverage	boolean disable HUMAnN2 Pathway Coverage report default: null

input

Property	Description
input.allowDuplicateNames	boolean Should files with the same name be permitted in inputs? File names are used to link data to metadata, and duplicated names create ambiguity. However, in some pipelines, duplicates are appropriate. default: N
input.dirPaths	list of file paths List of one or more directories containing the pipeline input data. default: null
input.ignoreFiles	list file names to ignore if found in input directories default: null
input.requireCompletePairs	boolean Require all sequence input files have matching paired reads default: Y
input.suffixFw	regex file suffix used to identify forward reads in input.dirPaths default: _R1
input.suffixRv	regex file suffix used to identify reverse reads in input.dirPaths default: _R2
input.trimPrefix	string Prefix to trim from sequence file names or headers to obtain Sample ID; this string can appear anywhere in the filename and all text before it will be removed. default: null
input.trimSuffix	string Suffix to trim from sequence file names or headers to obtain Sample ID; this string can appear anywhere in the filename and all text after it will be removed. default: null
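
As a rough illustration of how input.trimPrefix and input.trimSuffix narrow a file name down to a Sample ID, here is a minimal Python sketch. The function `sample_id` is hypothetical, not BioLockJ's implementation, and it assumes that the first occurrence of each string is the one that counts and that the matched string itself is removed along with the surrounding text.

```python
# Illustrative sketch only; sample_id is a hypothetical helper, not BioLockJ code.

def sample_id(file_name, trim_prefix=None, trim_suffix=None):
    """Derive a Sample ID from a file name.

    Everything up to and including trim_prefix is dropped, then everything
    from trim_suffix onward is dropped. Assumption: the first occurrence of
    each string is used.
    """
    name = file_name
    if trim_prefix and trim_prefix in name:
        name = name.split(trim_prefix, 1)[1]
    if trim_suffix and trim_suffix in name:
        name = name.split(trim_suffix, 1)[0]
    return name


# The prefix may sit anywhere in the name; all text before it is removed.
print(sample_id("run7_S01_R1.fastq.gz", trim_prefix="run7_", trim_suffix="_R1"))
```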

metadata

Property	Description
metadata.barcodeColumn	string metadata column with identifying barcodes default: BarcodeSequence
metadata.columnDelim	string defines how metadata columns are separated; Typically files are tab or comma separated. default: \t
metadata.commentChar	string metadata file comment indicator; Empty string is a valid option indicating no comments in metadata file. default: null
metadata.fileNameColumn	list name of the metadata column(s) with input file names default: null
metadata.filePath	file path If absolute file path, use file as metadata. If directory path, must find exactly 1 file within, to use as metadata. default: null
metadata.nullValue	string metadata cells with this value will be treated as empty default: NA
metadata.required	boolean If Y, require metadata row for each sample with sequence data in input dirs; If N, samples without metadata are ignored. default: N
metadata.useEveryRow	boolean If Y, require a sequence file for every SampleID (every row) in metadata file; If N, metadata can include extraneous SampleIDs. default: null
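
The difference between metadata.required and metadata.useEveryRow is easiest to see side by side: one checks samples against the metadata, the other checks the metadata against the samples. The Python sketch below is an interpretation of the descriptions above, with a hypothetical function name and message wording; it is not taken from BioLockJ.

```python
# Illustrative sketch only; check_metadata is a hypothetical helper.

def check_metadata(sequence_ids, metadata_ids, required=False, use_every_row=False):
    """Return a list of problems implied by the two metadata flags.

    required:      every sample with sequence data needs a metadata row.
    use_every_row: every metadata row needs a sequence file.
    """
    seq, meta = set(sequence_ids), set(metadata_ids)
    problems = []
    if required:
        problems += [f"no metadata row for sample {s}" for s in sorted(seq - meta)]
    if use_every_row:
        problems += [f"no sequence file for metadata row {s}" for s in sorted(meta - seq)]
    return problems
```

With both flags off, mismatches in either direction are tolerated: samples without metadata are ignored and extra metadata rows are allowed.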

pipeline

Property	Description
pipeline.copyInput	boolean copy input files into pipeline root directory default: null
pipeline.defaultDemultiplexer	string Java class name for default module used to demultiplex data default: biolockj.module.implicit.Demultiplexer
pipeline.defaultFastaConverter	string Java class name for default module used to convert files into fasta format default: biolockj.module.seq.AwkFastaConverter
pipeline.defaultProps	list of file paths file path of default property file(s); Nested default properties are supported (so the default property file can also have a default, and so on). default: null
pipeline.defaultSeqMerger	string Java class name for default module used to combine paired read files default: biolockj.module.seq.PearMergeReads
pipeline.defaultStatsModule	string Java class name for default module used to generate p-values and other stats default: biolockj.module.report.r.R_CalculateStats
pipeline.deleteTempFiles	boolean delete files in temp directories default: null
pipeline.detachJavaModules	boolean If true, Java modules do not run with the main BioLockJ Java application. Instead, they run on compute nodes in the CLUSTER or AWS environments. default: Y
pipeline.disableAddImplicitModules	boolean If set to true, implicit modules will not be added to the pipeline. default: null
pipeline.disableAddPreReqModules	boolean If set to true, prerequisite modules will not be added to the pipeline. default: null
pipeline.downloadDir	file path local directory used as the destination in the download command default: $HOME/projects/downloads
pipeline.env	string Environment in which a pipeline is run. Options: cluster, aws, local default: local
pipeline.envVars	list list of variables that should be passed into the runtime environment for all modules. default: BLJ
pipeline.inputTypes	list List of file types. This manually overrides the recommended auto-detection. default: null
pipeline.limitDebugClasses	list limit classes that log debug statements default: null
pipeline.logLevel	string Options: DEBUG, INFO, WARN, ERROR default: INFO
pipeline.permissions	string Set chmod -R command security bits on pipeline root directory (Ex. 770) default: 770
pipeline.setSeed	integer set the seed for a random process. Must be a positive integer. default: null
pipeline.useEnvVars	boolean When evaluating variables in the ${VAR} format, should environment variables be used? Regardless, priority is given to variable values defined in the config file. default: Y
pipeline.userProfile	file path Bash profile - may be ~/.bash_profile or ~/.bashrc or others default: null
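
The lookup order described for pipeline.useEnvVars (config file first, environment second) can be sketched in Python as follows. The function `resolve` and its parameters are hypothetical, and leaving an unresolved `${VAR}` untouched is an assumption; the description above does not say what BioLockJ does in that case.

```python
# Illustrative sketch only; resolve is a hypothetical helper, not BioLockJ code.
import re

# Matches references written in the ${VAR} format.
_VAR = re.compile(r"\$\{(\w+)\}")


def resolve(value, config_vars, env, use_env_vars=True):
    """Expand ${VAR} references, preferring values defined in the config file."""

    def lookup(match):
        name = match.group(1)
        if name in config_vars:           # config file values always win
            return config_vars[name]
        if use_env_vars and name in env:  # environment is only a fallback
            return env[name]
        return match.group(0)             # assumption: leave unknown names as-is

    return _VAR.sub(lookup, value)
```

In real use the `env` argument would typically be `os.environ`; a plain dictionary is used here so the behavior is easy to check.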

qiime

Property	Description
qiime.alphaMetrics	list alpha diversity metrics to calculate through qiime; For complete list of skbio.diversity.alpha options, see http://scikit-bio.org/docs/latest/generated/skbio.diversity.alpha.html default: shannon
qiime.plotAlphaMetrics	boolean default: Y

r

Property	Description
r.colorBase	string base color used for labels & headings in the PDF report; Must be a valid color in R. default: black
r.colorFile	file path path to a tab-delimited file giving the color to use for each value of each metadata field plotted. default: null
r.colorHighlight	string color used to highlight significant OTUs in plots default: red
r.colorPalette	string palette argument passed to get_palette {ggpubr} to select colors for some output visualizations default: null
r.colorPoint	string default color of scatterplot and strip-chart plot points default: black
r.debug	boolean Options: Y/N. If Y, will generate R Script log files default: Y
r.excludeFields	list Fields from the metadata that will be excluded from any auto-determined typing, or plotting; R reports must contain at least one valid nominal or numeric metadata field. default: null
r.nominalFields	list Override default property type by explicitly listing it as nominal. default: null
r.numericFields	list Override default property type by explicitly listing it as numeric. default: null
r.pch	integer Sets R plot pch parameter for PDF report default: 21
r.pvalCutoff	numeric p-value cutoff used to assign label r.colorHighlight default: 0.05
r.rareOtuThreshold	numeric If >=1, R will filter OTUs found in fewer than this many samples. If <1, R will interpret the value as a percentage and discard OTUs not found in at least that percentage of samples. default: 1
r.reportFields	list Metadata fields to include in reports; Fields listed here must exist in the metadata file. R reports must contain at least one valid field. default: null
r.saveRData	boolean If Y, all R script generating BioModules will save R Session data to the module output directory to a file using the extension ".RData" default: null
r.timeout	integer defines the number of minutes before R script fails due to timeout. If set to 0, an estimate is used. default: 0
r.useUniqueColors	boolean force to use a unique color for every value in every field plotted; only recommended for low numbers of metadata columns/values. default: null
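
Because r.rareOtuThreshold switches meaning at 1, a small worked example may help. The Python sketch below mirrors the description above (a count of samples when the value is 1 or more, a share of samples when it is below 1); the function `keep_otu` is hypothetical and is not the R code that BioLockJ runs.

```python
# Illustrative sketch only; keep_otu is a hypothetical helper written in Python,
# not the R code used by BioLockJ.

def keep_otu(samples_with_otu, total_samples, threshold=1):
    """Decide whether an OTU survives the rare-OTU filter.

    threshold >= 1: the OTU must appear in at least `threshold` samples.
    threshold <  1: the OTU must appear in at least that share of all samples.
    """
    if threshold >= 1:
        required = threshold
    else:
        required = threshold * total_samples
    return samples_with_otu >= required
```

For example, with 8 samples a threshold of 0.25 requires an OTU to appear in at least 2 samples, whereas a threshold of 3 requires at least 3 samples regardless of how many samples there are.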

r_PlotMds

Property	Description
r_PlotMds.reportFields	list Metadata column names indicating fields to include in the MDS report; Fields listed here must exist in the metadata file. default: null

report

Property	Description
report.logBase	string Options: 10,e,null. If e, use natural log (base e); if 10, use log base 10; if not set, counts will not be converted to a log scale. default: 10
report.minCount	integer minimum table count allowed; if a count less than this value is found, it is set to 0. default: 2
report.numHits	boolean Options: Y/N. If Y, add Num_Hits to metadata default: Y
report.numReads	boolean Options: Y/N. If Y, add Num_Reads to metadata default: Y
report.scarceCountCutoff	numeric Minimum percentage of samples that must contain a count value for it to be kept. default: 0.25
report.scarceSampleCutoff	numeric Minimum percentage of data columns that must be non-zero to keep the sample. default: 0.25
report.taxonomyLevels	list Options: domain,phylum,class,order,family,genus,species. Generate reports for listed taxonomy levels default: phylum,class,order,family,genus
report.unclassifiedTaxa	boolean report unclassified taxa default: Y
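
To make report.minCount and report.scarceCountCutoff concrete, the Python sketch below applies both to a tiny count table. It is an interpretation of the descriptions above rather than BioLockJ's code: the function names and the table layout (taxon name mapped to per-sample counts) are hypothetical, and "contain a count value" is assumed to mean a count greater than zero.

```python
# Illustrative sketch only; both helpers and the table layout are hypothetical.

def apply_min_count(counts, min_count=2):
    """Set any count below min_count to 0 (cf. report.minCount)."""
    return [c if c >= min_count else 0 for c in counts]


def drop_scarce_taxa(table, cutoff=0.25):
    """Keep taxa with a non-zero count in at least `cutoff` of the samples.

    `table` maps a taxon name to its list of per-sample counts.
    """
    kept = {}
    for taxon, counts in table.items():
        if not counts:
            continue  # nothing to evaluate for an empty row
        fraction = sum(1 for c in counts if c > 0) / len(counts)
        if fraction >= cutoff:
            kept[taxon] = counts
    return kept
```

Under these assumptions, a taxon seen in 1 of 5 samples (20%) is dropped at the default cutoff of 0.25, while one seen in 2 of 4 samples (50%) is kept.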

script

Property	Description
script.defaultHeader	string Store default script header for MAIN script and locally run WORKER scripts. default: #!/bin/bash
script.fileRefreshDelay	integer delay this many seconds after scripts complete to allow the file system to reflect changes from a worker node/container/virtual machine. default: 1
script.numThreads	integer Used to reserve cluster resources and passed to any external application call that accepts a numThreads parameter. default: 8
script.numWorkers	integer Set number of samples to process per script (if parallel processing) default: 1
script.permissions	string Used as chmod permission parameter (ex: 774) default: 770
script.timeout	integer Sets the number of minutes before worker scripts time out. default: null

validation

Property	Description
validation.compareOn	list Which columns in the expectation file should be used for the comparison. Options: name, size, md5. Default: use all columns in the expectation file. default: null
validation.disableValidation	boolean Turn off validation. No validation file output is produced. Options: Y/N. default: N default: null
validation.expectationFile	file path file path that gives the expected values for file metrics (probably generated by a previous run of the same pipeline) default: null
validation.reportOn	list Which attributes of the file should be included in the validation report file. Options: name, size, md5 default: null
validation.sizeWithinPercent	numeric What percentage difference is permitted between an output file and its expectation. Options: any positive number default: null
validation.stopPipeline	boolean If enabled, the validation utility will stop the pipeline if any module fails validation. Options: Y/N default: N
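
One plausible reading of validation.sizeWithinPercent is a relative size check against the expected value. The Python sketch below shows that reading; the function `size_within_percent` is hypothetical, and measuring the difference relative to the expected size (rather than the actual size) is an assumption not confirmed by the description above.

```python
# Illustrative sketch only; size_within_percent is a hypothetical helper.

def size_within_percent(actual_size, expected_size, percent):
    """Return True when actual_size is within `percent` of expected_size.

    Assumption: the difference is measured relative to the expected size.
    The comparison is cross-multiplied to avoid dividing by zero.
    """
    return abs(actual_size - expected_size) * 100 <= percent * expected_size
```

Under this reading, a 105-byte file passes against a 100-byte expectation at 5 percent, and a 106-byte file does not.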

aws

Property	Description
aws.copyDbToS3	boolean If true, save all input files to S3 default: null
aws.copyPipelineToS3	boolean If enabled, save pipeline to S3 default: null
aws.copyReportsToS3	boolean If enabled, save reports to S3 default: null
aws.ec2AcquisitionStrategy	string The AWS acquisition strategy (SPOT or DEMAND) sets the service SLA for procuring new EC2 instances default: null
aws.ec2InstanceID	string ID of an existing ec2 instance to use as the head node default: null
aws.ec2InstanceType	string AWS instance type determines initial resource class (t2.micro is common) default: null
aws.ec2SpotPer	__ default: null
aws.ec2TerminateHead	boolean default: null
aws.profile	file path default: null
aws.purgeEfsInputs	boolean If enabled, delete all EFS dirs (except pipelines) default: null
aws.purgeEfsOutput	boolean If enabled, delete all EFS/pipelines default: null
aws.ram	string AWS memory set in Nextflow main.nf default: null
aws.region	string default: null
aws.s3	string AWS S3 pipeline output directory used by Nextflow main.nf default: null
aws.s3TransferTimeout	integer Set the max number of minutes to allow for S3 transfers to complete. default: null
aws.saveCloud	boolean default: null
aws.stack	string An existing aws cloud stack ID default: null
aws.walltime	__ default: null