
Workflow overview

This is a best-practice workflow for <detailed description>. It is built with Snakemake and consists of the following steps:

1. Download genome reference from NCBI
2. Validate downloaded genome (python script)
3. Simulate short read sequencing data on the fly (dwgsim)
4. Check quality of input read data (FastQC)
5. Collect statistics from tool output (MultiQC)
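The genome validation step is only named above, not shown. As a rough illustration of what such a check could look like, here is a minimal sketch that verifies a FASTA file is well formed. The function name `validate_fasta`, the accepted character set, and the specific checks are assumptions made for this example; they are not taken from the workflow's own script.

```python
import io

# Assumption: accept upper- or lower-case IUPAC nucleotide codes only.
VALID_BASES = set("ACGTUNRYKMSWBDHV")


def validate_fasta(handle):
    """Return the number of records in a FASTA stream; raise ValueError if malformed.

    Hypothetical helper for illustration, not the workflow's actual script.
    """
    n_records = 0
    seq_len = 0  # bases seen so far for the current record
    for line_no, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        if line.startswith(">"):
            if n_records and seq_len == 0:
                raise ValueError(f"line {line_no}: previous record has no sequence")
            n_records += 1
            seq_len = 0
        else:
            if n_records == 0:
                raise ValueError(f"line {line_no}: sequence before first header")
            bad = set(line.upper()) - VALID_BASES
            if bad:
                raise ValueError(f"line {line_no}: unexpected characters {sorted(bad)}")
            seq_len += len(line)
    if n_records == 0:
        raise ValueError("no FASTA records found")
    if seq_len == 0:
        raise ValueError("last record has no sequence")
    return n_records


example = io.StringIO(">chrI\nACGTNNACGT\n>chrII\nGGCCTTAA\n")
print(validate_fasta(example))  # prints 2
```

A real genome download would typically be gzip-compressed, so the handle would come from `gzip.open(path, "rt")` rather than an in-memory string.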

Running the workflow

Input data

This template workflow does not ship with real input data. Instead, it simulates sequencing reads in *.fastq.gz format. The simulated files are created from a mandatory sample sheet whose path is set in the config.yaml file (default: .test/samples.tsv). The sample sheet has the following layout:

sample	condition	replicate	read1	read2
sample1	wild_type	1	sample1.bwa.read1.fastq.gz	sample1.bwa.read2.fastq.gz
sample2	wild_type	2	sample2.bwa.read1.fastq.gz	sample2.bwa.read2.fastq.gz
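To make the layout concrete, the sketch below reads a tab-separated sample sheet with the five columns shown above, using only the Python standard library. The function name `read_samplesheet` and the uniqueness check on sample names are assumptions for illustration; the workflow itself may load the sheet differently.

```python
import csv
import io

# Column names as they appear in the sample sheet header.
REQUIRED_COLUMNS = ["sample", "condition", "replicate", "read1", "read2"]


def read_samplesheet(handle):
    """Parse a tab-separated sample sheet into a list of dicts, one per sample.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"sample sheet is missing columns: {missing}")
    rows = list(reader)
    names = [row["sample"] for row in rows]
    if len(names) != len(set(names)):
        # Assumption: each sample name should appear only once.
        raise ValueError("sample names must be unique")
    for row in rows:
        row["replicate"] = int(row["replicate"])
    return rows


# The two example rows from the table above, as an in-memory file.
sheet = io.StringIO(
    "sample\tcondition\treplicate\tread1\tread2\n"
    "sample1\twild_type\t1\tsample1.bwa.read1.fastq.gz\tsample1.bwa.read2.fastq.gz\n"
    "sample2\twild_type\t2\tsample2.bwa.read1.fastq.gz\tsample2.bwa.read2.fastq.gz\n"
)
samples = read_samplesheet(sheet)
print([s["sample"] for s in samples])  # prints ['sample1', 'sample2']
```

For a file on disk, pass `open(path, newline="")` instead of the `io.StringIO` object.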

Parameters

This table lists all parameters that can be used to run the workflow.

section	parameter	type	details	default
samplesheet	path	str	path to samplesheet, mandatory	"config/samples.tsv"
get_genome	ncbi_ftp	str	link to a genome on NCBI's FTP server	link to S. cerevisiae genome
simulate_reads	read_length	num	length of target reads in bp	100
simulate_reads	read_number	num	number of total reads to be simulated	10000
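The parameters fall into three sections (samplesheet, get_genome, simulate_reads), which suggests a nested configuration. The sketch below mirrors that nesting as a Python dictionary and runs a basic sanity check. The exact nesting in config.yaml, the helper `check_config`, and the rule that `num` values must be positive integers are assumptions for this example. The genome link is left as a placeholder because the table does not give the URL.

```python
# Defaults copied from the parameter table; the nesting is an assumption.
DEFAULT_CONFIG = {
    "samplesheet": {"path": "config/samples.tsv"},
    # Placeholder only: the table does not state the actual URL.
    "get_genome": {"ncbi_ftp": "<link to S. cerevisiae genome>"},
    "simulate_reads": {"read_length": 100, "read_number": 10000},
}

# Assumption: "str" maps to str and "num" to a positive integer.
EXPECTED_TYPES = {
    ("samplesheet", "path"): str,
    ("get_genome", "ncbi_ftp"): str,
    ("simulate_reads", "read_length"): int,
    ("simulate_reads", "read_number"): int,
}


def check_config(config):
    """Return a list of problems found; an empty list means the config passes.

    Hypothetical helper for illustration only.
    """
    problems = []
    for (section, key), expected in EXPECTED_TYPES.items():
        value = config.get(section, {}).get(key)
        if value is None:
            problems.append(f"{section}.{key} is missing")
        elif isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{section}.{key} should be of type {expected.__name__}")
        elif expected is int and value <= 0:
            problems.append(f"{section}.{key} should be positive")
    return problems


print(check_config(DEFAULT_CONFIG))  # prints []
```

Returning a list of problems rather than raising on the first one lets a user fix several configuration mistakes in a single pass.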
