This document outlines best practices for writing the main workflow file (main.nf) of a Nextflow pipeline. This file is the entry point that users run, and it orchestrates all modules and subworkflows.
File Structure and Organization
1. Standard Structure
Organize the workflow file in clear sections:
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
HEADER AND DOCUMENTATION
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMPORT LOCAL MODULES/SUBWORKFLOWS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMPORT NF-CORE MODULES/SUBWORKFLOWS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMPORT FUNCTIONS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RUN MAIN WORKFLOW
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
workflow WORKFLOW_NAME {
// Workflow implementation
}
2. Section Separators
Use clear visual separators for major sections:
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SECTION NAME
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
This makes the file easy to navigate and understand.
Header and Documentation
1. File Header
Include a descriptive header comment:
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
nf-core/pipeline Nextflow workflow file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Main entry point for the pipeline
Run with: nextflow run nf-core/pipeline --input samplesheet.csv
----------------------------------------------------------------------------------------
*/
2. Inline Documentation
Document complex logic and decisions:
// Check for tools that require Docker and are not available in conda/mamba
if ((workflow.profile.contains('conda') || workflow.profile.contains('mamba')) && !params.skip_riboorf) {
// RibORF 2.0 requires custom Docker image, not available in conda/mamba
error "RibORF 2.0 is not available in conda/mamba!"
}
Module and Subworkflow Imports
1. Import Organization
Group imports by source:
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMPORT LOCAL MODULES/SUBWORKFLOWS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
include { VALIDATE_RIBOSEQ_QC } from '../../modules/local/validate_riboseq_qc/main'
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMPORT NF-CORE MODULES/SUBWORKFLOWS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
include { MULTIQC } from '../../modules/nf-core/multiqc/main'
include { FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS } from '../../subworkflows/nf-core/fastq_qc_trim_filter_setstrandedness/main'
2. Module Aliasing
Use aliases to distinguish multiple uses of the same module:
// Alias records the sample type each module is applied to
include { RIBOTISH_QUALITY as RIBOTISH_QUALITY_TISEQ } from '../../modules/nf-core/ribotish/quality'
include { RIBOCODE_QUALITY as RIBOCODE_QUALITY_RIBOSEQ } from '../../modules/nf-core/ribocode/quality'
// Same module used for individual vs. all samples
include { RIBOORF_PREDICT as RIBOORF_PREDICT_INDIVIDUAL } from '../../modules/nf-core/riboorf/predict'
include { RIBOORF_PREDICT as RIBOORF_PREDICT_ALL } from '../../modules/nf-core/riboorf/predict'
3. Import Best Practices
- Group imports by source (local vs nf-core)
- Use descriptive aliases
- Order imports logically (subworkflows before modules, or alphabetically)
- Document complex import patterns
Function Imports
1. Utility Functions
Import utility functions from plugins and subworkflows:
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMPORT FUNCTIONS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
// From the nf-schema plugin
include { paramsSummaryMap } from 'plugin/nf-schema'
include { samplesheetToList } from 'plugin/nf-schema'
// From nf-core and pipeline-specific utility subworkflows
include { paramsSummaryMultiqc } from '../../subworkflows/nf-core/utils_nfcore_pipeline'
include { softwareVersionsToYAML } from '../../subworkflows/nf-core/utils_nfcore_pipeline'
include { validateInputSamplesheet } from '../../subworkflows/local/utils_nfcore_riboseq_pipeline'
2. Function Usage
Use imported functions throughout the workflow:
// Validate samplesheet
ch_samplesheet = channel.fromPath(params.input)
.map { file -> validateInputSamplesheet(file) }
// Collect versions
softwareVersionsToYAML(ch_versions)
.collectFile(storeDir: "${params.outdir}/pipeline_info", name: 'versions.yml', sort: true, newLine: true)
Workflow Definition
1. Workflow Structure
Define the main workflow with clear sections:
workflow WORKFLOW_NAME {
take:
// Input channel definitions with comments
main:
// Main workflow logic
publish:
// Output publishing (Nextflow 25.10+)
emit:
// Output channels for workflow chaining
}
2. Input Channel Definitions (take:)
Document each input channel:
workflow RIBOSEQ {
take:
ch_samplesheet // channel: path(sample_sheet.csv)
ch_contrasts_file // channel: path(contrasts.csv)
ch_versions // channel: [ path(versions.yml) ]
ch_fasta // channel: path(genome.fasta)
ch_gtf // channel: path(genome.gtf)
ch_transcript_fasta // channel: path(transcript.fasta)
ch_star_index // channel: path(star/index/)
ch_salmon_index // channel: path(salmon/index/)
ch_bowtie_index // channel: tuple val(meta), path(bowtie/index/) - BOWTIE_ALIGN expects tuple
Best Practices:
- Comment each channel with its type and purpose
- Group related channels together
- Use descriptive channel names
- Document tuple structures when needed
Validation and Preprocessing
1. Profile Compatibility Checks
Validate profile compatibility for tools that require specific environments:
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
VALIDATE PROFILE COMPATIBILITY
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
// Check for tools that require Docker and are not available in conda/mamba
if ((workflow.profile.contains('conda') || workflow.profile.contains('mamba')) && !params.skip_riboorf) {
def separator = "=".multiply(80)
log.error(separator)
log.error("ERROR: RibORF 2.0 is not available in conda/mamba!")
log.error(separator)
log.error("")
log.error("The RibORF 2.0 tool requires a custom Docker image and cannot")
log.error("be used with the conda/mamba profile. RibORF 2.0 is not available in")
log.error("bioconda or biocontainers.")
log.error("")
log.error("Please use one of the following options:")
log.error(" 1. Use -profile docker, -profile singularity, or -profile podman")
log.error(" 2. Use --skip_riboorf to skip RibORF 2.0 analysis")
log.error("")
log.error(separator)
exit 1
}
2. Parameter Validation
Validate required parameters and combinations:
// Validate required inputs
if (!params.input) {
error "Input samplesheet must be provided via --input"
}
// Validate mutually exclusive parameters
if (params.genome && (params.fasta || params.gtf)) {
error "Cannot specify both --genome and --fasta/--gtf. Use one or the other."
}
// Validate conditional requirements: without --genome, a FASTA and an annotation are both needed
if (!params.genome && !(params.fasta && (params.gtf || params.gff))) {
error "Must specify either --genome, or --fasta together with --gtf or --gff"
}
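The checks above stop at the first failure. A minimal sketch of an alternative that gathers every problem and reports them together is given here. The helper name referenceParamErrors is hypothetical, and the last rule assumes the intent is "--fasta plus an annotation file when --genome is absent":

```nextflow
// Hypothetical helper (not part of the pipeline): gathers every problem
// with the input parameters instead of stopping at the first one.
def referenceParamErrors(Map p) {
    def errors = []
    if (!p.input) {
        errors << 'Input samplesheet must be provided via --input'
    }
    if (p.genome && (p.fasta || p.gtf)) {
        errors << 'Cannot specify both --genome and --fasta/--gtf'
    }
    // Assumed intent: without --genome, a FASTA plus an annotation is required
    if (!p.genome && !(p.fasta && (p.gtf || p.gff))) {
        errors << 'Without --genome, provide --fasta together with --gtf or --gff'
    }
    return errors
}

workflow {
    def errors = referenceParamErrors(params)
    if (errors) {
        // Report all problems in a single message, one per line
        error("Parameter validation failed:\n- " + errors.join('\n- '))
    }
}
```

Keeping the rules in a plain function also makes them easy to exercise with a hand-written map of parameters, independently of a pipeline run.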
3. Input Preprocessing
Preprocess input channels before use:
// Validate and parse samplesheet
ch_samplesheet = channel.fromPath(params.input)
.map { file -> validateInputSamplesheet(file) }
// Filter samples by type
ch_riboseq_samples = ch_samplesheet
.filter { meta -> meta.type == 'riboseq' }
ch_tiseq_samples = ch_samplesheet
.filter { meta -> meta.type == 'tiseq' }
Workflow Logic Organization
1. Logical Grouping
Organize workflow steps into logical groups with comments:
main:
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
INITIALIZE CHANNELS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
ch_multiqc_files = channel.empty()
ch_versions = channel.empty()
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PREPROCESSING
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS(ch_fastq, ch_fasta, ch_gtf, ...)
ch_versions = ch_versions.mix(FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.versions)
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ALIGNMENT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
FASTQ_ALIGN_STAR(FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.reads, ...)
ch_versions = ch_versions.mix(FASTQ_ALIGN_STAR.out.versions)
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ANALYSIS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
if (!params.skip_ribocode) {
RIBOCODE_QUALITY_RIBOSEQ(ch_bams_for_analysis, ...)
ch_versions = ch_versions.mix(RIBOCODE_QUALITY_RIBOSEQ.out.versions)
}
2. Sequential Dependencies
Order steps to respect dependencies:
// Step 1: Preprocessing (must complete first)
FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS(...)
// Step 2: Alignment (depends on preprocessing)
FASTQ_ALIGN_STAR(FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.reads, ...)
// Step 3: Analysis (depends on alignment)
RIBOCODE_QUALITY_RIBOSEQ(FASTQ_ALIGN_STAR.out.bam, ...)
Channel Management
1. Channel Initialization
Initialize channels for collecting outputs:
// Initialize empty channels for collection
ch_multiqc_files = channel.empty()
ch_versions = channel.empty()
ch_qc_files = channel.empty()
2. Channel Mixing
Mix channels to collect outputs from multiple processes:
// Collect versions from all processes
ch_versions = ch_versions.mix(FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.versions)
ch_versions = ch_versions.mix(FASTQ_ALIGN_STAR.out.versions)
ch_versions = ch_versions.mix(RIBOCODE_QUALITY_RIBOSEQ.out.versions)
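The mix operator also accepts several channels in one call, which can shorten long runs of reassignments. The sketch below uses stand-in channels with hypothetical file names in place of real MODULE.out.versions outputs:

```nextflow
workflow {
    // Stand-ins for MODULE.out.versions channels (hypothetical file names)
    ch_qc_versions    = channel.of('fastqc_versions.yml')
    ch_align_versions = channel.of('star_versions.yml')
    ch_skipped        = channel.empty()   // e.g. a step disabled by a --skip_* flag

    // mix accepts several channels at once; empty channels contribute nothing
    ch_versions = channel.empty().mix(ch_qc_versions, ch_align_versions, ch_skipped)

    ch_versions.view { v -> "version file: ${v}" }
}
```

The order in which mixed items arrive is not guaranteed, which is one reason the collated versions file is written with sort: true.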
3. Channel Filtering
Filter channels based on conditions:
// Filter out null values
ch_versions = ch_versions.filter { version -> version != null }
// Filter by sample type
ch_riboseq_bams = ch_bams.filter { meta, bam -> meta.type == 'riboseq' }
4. Channel Transformation
Transform channels for downstream use:
// Extract specific outputs
ch_genome_bam = FASTQ_ALIGN_STAR.out.bam
ch_genome_bam_index = FASTQ_ALIGN_STAR.out.bai
ch_transcriptome_bam = FASTQ_ALIGN_STAR.out.orig_bam_transcript
// Combine channels
ch_all_bams = ch_genome_bam.mix(ch_transcriptome_bam)
Conditional Execution
1. Parameter-Based Conditionals
Use parameters to control execution:
// Conditional module execution
if (!params.skip_ribocode) {
RIBOCODE_QUALITY_RIBOSEQ(ch_bams_for_analysis, ...)
RIBOCODE_DETECT_ORFS_INDIVIDUAL(...)
}
if (!params.skip_riboorf) {
RIBOORF_PREDICT_INDIVIDUAL(...)
RIBOORF_PREDICT_ALL(...)
}
if (!params.skip_multiqc) {
MULTIQC(ch_multiqc_files.collect(), ...)
}
2. Sample Type Conditionals
Execute different workflows based on sample type:
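// Process TI-seq samples differently.
// No emptiness check is needed: count() returns a channel, not a number, so it cannot
// drive an if statement, and a process launches no tasks when its input channel is empty.
ch_tiseq_samples = ch_samplesheet.filter { meta -> meta.type == 'tiseq' }
RIBOTISH_QUALITY_TISEQ(ch_tiseq_samples, ...)
RIBOTISH_PREDICT_INDIVIDUAL(...)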
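Channel operators such as count() return another channel rather than a plain number, so their result cannot be tested directly in an if condition. A minimal sketch of the dataflow-friendly alternative is given here; DESCRIBE_SAMPLE is a hypothetical stand-in for a TI-seq specific module, and the sample IDs are made up:

```nextflow
// Hypothetical stand-in for a TI-seq specific module
process DESCRIBE_SAMPLE {
    input:
    val meta

    output:
    val description, emit: summary

    exec:
    description = "processed ${meta.id} (${meta.type})"
}

workflow {
    ch_samplesheet = channel.of(
        [id: 'sample1', type: 'riboseq'],
        [id: 'sample2', type: 'riboseq']
    )
    // No TI-seq rows here, so this channel emits nothing
    ch_tiseq_samples = ch_samplesheet.filter { meta -> meta.type == 'tiseq' }

    // No tasks are launched for an empty input channel
    DESCRIBE_SAMPLE(ch_tiseq_samples)
    DESCRIBE_SAMPLE.out.summary.view()

    // If a message is wanted, react inside the dataflow rather than in an if statement
    ch_tiseq_samples.count().view { n -> "TI-seq samples found: ${n}" }
}
```

With this shape the TI-seq branch needs no guard at all: it simply does nothing when the samplesheet contains no TI-seq rows.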
3. Conditional Channel Assignment
Handle conditional outputs in the publish block. The ternary avoids referencing the .out of a process that was never called:
publish:
// Only publish if module was executed
ribocode_quality = params.skip_ribocode ? channel.empty() : RIBOCODE_QUALITY_RIBOSEQ.out.distribution
ribocode_orfs = params.skip_ribocode ? channel.empty() : RIBOCODE_DETECT_ORFS_INDIVIDUAL.out.predictions
Version Collection
1. Version Channel Initialization
Initialize a channel for collecting versions:
ch_versions = channel.empty()
2. Collecting Versions
Mix version outputs from all processes:
// Collect versions from each module/subworkflow
ch_versions = ch_versions.mix(FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.versions)
ch_versions = ch_versions.mix(FASTQ_ALIGN_STAR.out.versions)
ch_versions = ch_versions.mix(RIBOCODE_QUALITY_RIBOSEQ.out.versions)
// Filter out null values
ch_versions = ch_versions.filter { version -> version != null }
3. Version Collation
Collate versions into a single YAML file:
// Convert to YAML and collect into single file
softwareVersionsToYAML(ch_versions)
.collectFile(
storeDir: "${params.outdir}/pipeline_info",
name: 'versions.yml',
sort: true,
newLine: true
)
.set { ch_collated_versions }
4. Version Output
Include versions in workflow outputs:
publish:
versions = ch_collated_versions
emit:
versions = ch_versions
Output Definitions
1. Output Channel Organization
Organize outputs logically:
publish:
// Quality control outputs
fastqc_html = FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.html
fastqc_zip = FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.zip
// Alignment outputs
genome_bam = ch_genome_bam
genome_bam_index = ch_genome_bam_index
transcriptome_bam = ch_transcriptome_bam
// Analysis outputs
ribocode_quality = params.skip_ribocode ? channel.empty() : RIBOCODE_QUALITY_RIBOSEQ.out.distribution
ribocode_orfs = params.skip_ribocode ? channel.empty() : RIBOCODE_DETECT_ORFS_INDIVIDUAL.out.predictions
// Reports
multiqc_report = ch_multiqc_report
versions = ch_collated_versions
2. Emit Channels
Define channels for workflow chaining:
emit:
multiqc_report = ch_multiqc_report
versions = ch_versions
genome_bam = ch_genome_bam
Comments and Documentation
1. Section Comments
Use clear section separators:
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SECTION NAME
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
2. Inline Comments
Document complex logic:
// Filter out null values before collating versions
ch_versions = ch_versions.filter { version -> version != null }
// Combine all BAM files for downstream analysis
ch_all_bams = ch_genome_bam.mix(ch_transcriptome_bam)
3. Channel Type Comments
Document channel types and structures:
take:
ch_samplesheet // channel: path(sample_sheet.csv)
ch_bowtie_index // channel: tuple val(meta), path(bowtie/index/) - BOWTIE_ALIGN expects tuple
Common Patterns
1. Empty Channel Pattern
Initialize and conditionally populate channels:
// Initialize empty
ch_multiqc_files = channel.empty()
// Conditionally add to channel
if (!params.skip_fastqc) {
ch_multiqc_files = ch_multiqc_files.mix(FASTQC.out.zip)
}
2. Conditional Module Execution
Execute modules based on parameters:
if (!params.skip_module) {
MODULE_NAME(input_channel, ...)
ch_versions = ch_versions.mix(MODULE_NAME.out.versions)
}
3. Sample Type Filtering
Filter and process samples by type:
ch_riboseq_samples = ch_samplesheet.filter { meta -> meta.type == 'riboseq' }
ch_tiseq_samples = ch_samplesheet.filter { meta -> meta.type == 'tiseq' }
ch_rnaseq_samples = ch_samplesheet.filter { meta -> meta.type == 'rnaseq' }
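When one channel is split into several types, the branch operator can do the routing in a single pass instead of one filter per type. A minimal sketch, using a stand-in samplesheet channel with hypothetical sample IDs:

```nextflow
workflow {
    // Stand-in for the parsed samplesheet (hypothetical sample IDs)
    ch_samplesheet = channel.of(
        [id: 'sample1', type: 'riboseq'],
        [id: 'sample2', type: 'tiseq'],
        [id: 'sample3', type: 'rnaseq']
    )

    // Each item goes to the first branch whose condition is true
    ch_by_type = ch_samplesheet.branch { meta ->
        riboseq: meta.type == 'riboseq'
        tiseq: meta.type == 'tiseq'
        rnaseq: meta.type == 'rnaseq'
    }

    ch_by_type.riboseq.view { meta -> "riboseq: ${meta.id}" }
    ch_by_type.tiseq.view { meta -> "tiseq: ${meta.id}" }
    ch_by_type.rnaseq.view { meta -> "rnaseq: ${meta.id}" }
}
```

Items that match none of the conditions are dropped, so a catch-all branch may be worth adding if unexpected types should be reported.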
4. Version Collection Pattern
Standard pattern for collecting versions:
// Initialize
ch_versions = channel.empty()
// Collect from each module
ch_versions = ch_versions.mix(MODULE1.out.versions)
ch_versions = ch_versions.mix(MODULE2.out.versions)
// Filter and collate
ch_versions = ch_versions.filter { version -> version != null }
softwareVersionsToYAML(ch_versions)
.collectFile(storeDir: "${params.outdir}/pipeline_info", name: 'versions.yml', sort: true, newLine: true)
.set { ch_collated_versions }
5. Conditional Output Publishing
Handle conditional outputs:
publish:
// Use ternary operator for conditional outputs
module_output = params.skip_module ? channel.empty() : MODULE.out.results
Complete Example
See the Workflow Output Publishing section for a complete example of a workflow with all components.
Workflow Output Publishing (Nextflow 25.10+)
1. Overview
Starting with Nextflow 25.10, the recommended approach for publishing workflow outputs is to use workflow-level output definitions instead of configuring publishDir for each module in modules.config. This new method provides:
- Centralized output management: All output publishing logic in one place
- Better clarity: Clear separation between workflow logic and output organization
- More flexibility: Easier to reorganize outputs without modifying module configs
- Improved maintainability: Changes to output structure don’t require module config updates
2. Workflow Publish Block
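Define outputs in the workflow using a publish: block. The Nextflow documentation describes publish: as a section of the entry workflow (the unnamed workflow { } block); if your Nextflow version rejects it in a named workflow, expose the channels through emit: and assign them under publish: in the entry workflow instead: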
workflow WORKFLOW_NAME {
take:
// Input channels
main:
// Workflow logic
publish:
// Assign channels to output labels
output_label = process_output_channel
another_output = another_channel
emit:
// Output channels for workflow chaining
}
Example:
workflow RIBOSEQ {
take:
ch_samplesheet
ch_fasta
ch_gtf
main:
// ... workflow logic ...
FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS(ch_fastq, ...)
FASTQ_ALIGN_STAR(FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.reads, ...)
RIBOCODE_QUALITY_RIBOSEQ(ch_bams_for_analysis, ...)
publish:
// Quality control outputs
fastqc_html = FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.html
fastqc_zip = FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.zip
// Alignment outputs
genome_bam = FASTQ_ALIGN_STAR.out.bam
genome_bam_index = FASTQ_ALIGN_STAR.out.bai
transcriptome_bam = FASTQ_ALIGN_STAR.out.orig_bam_transcript
// Analysis outputs
ribocode_quality = RIBOCODE_QUALITY_RIBOSEQ.out.distribution
ribocode_orfs = RIBOCODE_DETECT_ORFS_INDIVIDUAL.out.predictions
// Reports
multiqc_report = ch_multiqc_report
versions = ch_collated_versions
emit:
multiqc_report = ch_multiqc_report
versions = ch_versions
}
3. Output Block Definition
Define the publishing structure in a separate output { } block at the top level of the script (outside the workflow):
output {
output_label {
path 'relative/output/path'
mode 'copy' // or 'link', 'move', 'symlink'
pattern '*.ext' // optional: file pattern
saveAs { filename -> "custom_name.txt" } // optional: custom naming
}
}
Example:
output {
fastqc_html {
path 'fastqc'
mode 'copy'
}
fastqc_zip {
path 'fastqc'
mode 'copy'
}
genome_bam {
path 'star/align'
mode 'copy'
pattern '*.bam'
}
genome_bam_index {
path 'star/align'
mode 'copy'
pattern '*.bai'
}
transcriptome_bam {
path 'star/align/transcriptome'
mode 'copy'
pattern '*.bam'
}
ribocode_quality {
path 'ribocode/quality'
mode 'copy'
}
ribocode_orfs {
path 'ribocode/orfs'
mode 'copy'
}
multiqc_report {
path 'multiqc'
mode 'copy'
}
versions {
path 'pipeline_info'
mode 'copy'
}
}
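To show how the pieces fit together, here is a minimal end-to-end sketch. It assumes Nextflow 25.10 or later; MAKE_REPORT and DEMO are hypothetical names, and the named workflow exposes its results through emit: while the entry workflow assigns them under publish:

```nextflow
// Hypothetical process standing in for a real module
process MAKE_REPORT {
    input:
    val sample_id

    output:
    path "${sample_id}.txt", emit: report

    script:
    """
    echo "report for ${sample_id}" > ${sample_id}.txt
    """
}

// Named workflow: exposes results through emit:
workflow DEMO {
    take:
    ch_ids

    main:
    MAKE_REPORT(ch_ids)

    emit:
    reports = MAKE_REPORT.out.report
}

// Entry workflow: assigns emitted channels to output labels
workflow {
    main:
    DEMO(channel.of('sample1', 'sample2'))

    publish:
    reports = DEMO.out.reports
}

// Each label assigned under publish: needs a matching entry here
output {
    reports {
        path 'reports'
        mode 'copy'
    }
}
```

Run with nextflow run main.nf -output-dir results, this sketch is expected to place the two text files under results/reports.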
4. Output Directory Configuration
Set the top-level output directory using:
Command-line option:
nextflow run main.nf -output-dir 'results'
Configuration file:
// nextflow.config
outputDir = 'results'
Default: If not specified, outputs are published to results/ in the launch directory.
5. Output Modes
Available output modes:
- 'copy': Copy files to the output directory
- 'link': Create hard links
- 'move': Move files to the output directory
- 'symlink': Create symbolic links (the default listed for workflow.output.mode in the Nextflow configuration reference)
- 'rellink': Create relative symbolic links
output {
large_files {
path 'data'
mode 'symlink' // Use symlinks for large files to save space
}
small_files {
path 'reports'
mode 'copy' // Copy small files for portability
}
}
6. Conditional Publishing
Handle conditional outputs:
workflow RIBOSEQ {
main:
// ... workflow logic ...
if (!params.skip_ribocode) {
RIBOCODE_QUALITY_RIBOSEQ(...)
}
publish:
// Only publish if module was executed
ribocode_quality = params.skip_ribocode ? channel.empty() : RIBOCODE_QUALITY_RIBOSEQ.out.distribution
ribocode_orfs = params.skip_ribocode ? channel.empty() : RIBOCODE_DETECT_ORFS_INDIVIDUAL.out.predictions
}
7. Metadata Preservation
Preserve metadata when publishing:
publish:
// Channels that already emit [meta, bam] tuples can be assigned directly
genome_bam = ch_genome_bam
// A map is only needed to reshape the tuple; this one leaves it unchanged
transcriptome_bam = ch_transcriptome_bam.map { meta, bam -> [meta, bam] }
8. File Pattern Filtering
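Filter specific file types with pattern. Note that pattern is documented as a publishDir option and the Nextflow 25.10 output block reference does not list it, so confirm that your version accepts it or select the files in the workflow before assigning the channel: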
output {
bam_files {
path 'alignment'
mode 'copy'
pattern '*.bam' // Only publish .bam files
}
index_files {
path 'alignment'
mode 'copy'
pattern '*.{bai,csi}' // Publish both .bai and .csi index files
}
}
9. Custom File Naming
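Use saveAs for custom file naming. Note that saveAs is a publishDir option whose closure receives a plain file name with no metadata attached, so treat this snippet as pseudocode and check the output block reference for your Nextflow version: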
output {
results {
path 'analysis'
mode 'copy'
saveAs { filename ->
// Extract sample ID from metadata if available
def meta = filename.metadata
meta ? "${meta.id}_results.txt" : filename.name
}
}
}
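The Nextflow documentation on workflow outputs describes a closure form of the path directive that receives each published channel value, which is one way to build per-sample locations from metadata. A minimal sketch, assuming Nextflow 25.10 or later; MAKE_RESULT and the map keys id and results are hypothetical:

```nextflow
// Hypothetical process standing in for a real analysis module
process MAKE_RESULT {
    input:
    val meta

    output:
    tuple val(meta), path('results.txt'), emit: result

    script:
    """
    echo "results for ${meta.id}" > results.txt
    """
}

workflow {
    main:
    MAKE_RESULT(channel.of([id: 'sample1'], [id: 'sample2']))
    // Reshape to a map so the output block can read the sample ID by name
    ch_results = MAKE_RESULT.out.result.map { meta, txt -> [id: meta.id, results: txt] }

    publish:
    results = ch_results
}

output {
    results {
        // Publish each file under analysis/<sample id>/
        path { sample -> "analysis/${sample.id}/" }
        mode 'copy'
    }
}
```

Because the sample ID travels with the channel value, no file-name parsing is needed to decide where each file goes.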
10. Complete Example
Workflow with publish block:
workflow RIBOSEQ {
take:
ch_samplesheet
ch_fasta
ch_gtf
main:
// Initialize channels
ch_multiqc_files = channel.empty()
ch_versions = channel.empty()
// Preprocessing
FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS(ch_fastq, ch_fasta, ch_gtf, ...)
ch_versions = ch_versions.mix(FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.versions)
// Alignment
FASTQ_ALIGN_STAR(FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.reads, ...)
ch_genome_bam = FASTQ_ALIGN_STAR.out.bam
ch_genome_bam_index = FASTQ_ALIGN_STAR.out.bai
ch_transcriptome_bam = FASTQ_ALIGN_STAR.out.orig_bam_transcript
ch_versions = ch_versions.mix(FASTQ_ALIGN_STAR.out.versions)
// Analysis
if (!params.skip_ribocode) {
RIBOCODE_QUALITY_RIBOSEQ(ch_bams_for_analysis, ...)
RIBOCODE_DETECT_ORFS_INDIVIDUAL(...)
ch_versions = ch_versions.mix(RIBOCODE_QUALITY_RIBOSEQ.out.versions)
}
// MultiQC
if (!params.skip_multiqc) {
MULTIQC(ch_multiqc_files.collect(), ...)
ch_multiqc_report = MULTIQC.out.report.toList()
} else {
ch_multiqc_report = channel.empty()
}
// Collate versions
ch_versions = ch_versions.filter { version -> version != null }
softwareVersionsToYAML(ch_versions)
.collectFile(storeDir: "${params.outdir}/pipeline_info", name: 'versions.yml', sort: true, newLine: true)
.set { ch_collated_versions }
publish:
// Quality control
fastqc_html = FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.html
fastqc_zip = FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS.out.zip
// Alignment outputs
genome_bam = ch_genome_bam
genome_bam_index = ch_genome_bam_index
transcriptome_bam = ch_transcriptome_bam
// Analysis outputs (conditional)
ribocode_quality = params.skip_ribocode ? channel.empty() : RIBOCODE_QUALITY_RIBOSEQ.out.distribution
ribocode_orfs = params.skip_ribocode ? channel.empty() : RIBOCODE_DETECT_ORFS_INDIVIDUAL.out.predictions
// Reports
multiqc_report = ch_multiqc_report
versions = ch_collated_versions
emit:
multiqc_report = ch_multiqc_report
versions = ch_versions
}
Output block (defined in the pipeline script alongside the entry workflow):
output {
fastqc_html {
path 'fastqc'
mode 'copy'
}
fastqc_zip {
path 'fastqc'
mode 'copy'
}
genome_bam {
path 'star/align'
mode 'copy'
pattern '*.bam'
}
genome_bam_index {
path 'star/align'
mode 'copy'
pattern '*.{bai,csi}'
}
transcriptome_bam {
path 'star/align/transcriptome'
mode 'copy'
pattern '*.bam'
}
ribocode_quality {
path 'ribocode/quality'
mode 'copy'
}
ribocode_orfs {
path 'ribocode/orfs'
mode 'copy'
}
multiqc_report {
path 'multiqc'
mode 'copy'
}
versions {
path 'pipeline_info'
mode 'copy'
}
}
11. Migration from publishDir
Old approach (modules.config):
// conf/modules.config
process {
withName: 'FASTQC' {
publishDir = [
path: { "${params.outdir}/fastqc" },
mode: 'copy',
pattern: '*.html'
]
}
}
New approach (workflow main.nf):
// workflows/riboseq/main.nf
workflow RIBOSEQ {
publish:
fastqc_html = FASTQC.out.html
}
// Output block
output {
fastqc_html {
path 'fastqc'
mode 'copy'
pattern '*.html'
}
}
Benefits:
- No need to configure publishDir in modules.config for each module
- Centralized output management in the workflow
- Easier to see all outputs at a glance
- More flexible output organization
12. Best Practices
- Group related outputs:
publish:
    // Group all QC outputs together
    fastqc_html = FASTQC.out.html
    fastqc_zip = FASTQC.out.zip
    trimgalore_html = TRIMGALORE.out.html
- Use descriptive output labels:
publish:
    genome_bam = FASTQ_ALIGN_STAR.out.bam // Clear, descriptive name
- Handle conditional outputs:
publish:
    ribocode_outputs = params.skip_ribocode ? channel.empty() : RIBOCODE.out.results
- Organize output paths logically:
output {
    genome_bam {
        path 'star/align' // Organized by tool/step
    }
    transcriptome_bam {
        path 'star/align/transcriptome' // Subdirectory for related files
    }
}
- Use appropriate modes:
output {
    large_bam_files {
        path 'alignment'
        mode 'symlink' // Save space for large files
    }
    small_reports {
        path 'reports'
        mode 'copy' // Copy for portability
    }
}
- Document output structure:
publish:
    // Quality control outputs
    fastqc_html = FASTQC.out.html
    fastqc_zip = FASTQC.out.zip
    // Alignment outputs
    genome_bam = FASTQ_ALIGN_STAR.out.bam
References
- Nextflow Workflow Outputs Documentation
- Nextflow Workflow Documentation
- Nextflow DSL2 Documentation
- nf-core Pipeline Guidelines
- SUBWORKFLOW_BEST_PRACTICES.md - Guide for building subworkflows
- MODULE_MAIN_NF_BEST_PRACTICES.md - Guide for module main.nf files
- Current pipeline: workflows/riboseq/main.nf