scbirlab/nf-baccumulation is a Nextflow pipeline that processes files exported from TargetLynx for intracellular accumulation experiments.
For each experiment in the sample_sheet, the pipeline processes the corresponding exported TargetLynx files.
You need to have Nextflow and either Anaconda, Singularity, or Docker installed on your system.
If you're at the Crick, or your shared cluster already has them installed, try:

```bash
module load Nextflow Singularity
```

Otherwise, if it's your first time using Nextflow on your system and you have Conda installed, you can install it using conda:

```bash
conda install -c bioconda nextflow
```

You may need to set the `NXF_HOME` environment variable. For example:
```bash
mkdir -p ~/.nextflow
export NXF_HOME=~/.nextflow
```

To make this a permanent change, you can do something like the following:

```bash
mkdir -p ~/.nextflow
echo "export NXF_HOME=~/.nextflow" >> ~/.bash_profile
source ~/.bash_profile
```

Make a sample sheet (see below) and, optionally, a `nextflow.config` file in the directory where you want the pipeline to run. Then run Nextflow.
```bash
nextflow run scbirlab/nf-baccumulation -latest
```

If you want to run a particular tagged version of the pipeline, such as `v0.0.1`, you can do so using

```bash
nextflow run scbirlab/nf-baccumulation -r v0.0.1
```

For help, use `nextflow run scbirlab/nf-baccumulation --help`.
The first time you run the pipeline, the software dependencies
in environment.yml will be installed. This may take several minutes.
The following parameters are required:
- `sample_sheet`: path to a CSV with information about the samples and FASTQ files to be processed
- `fastq_dir`: path to where FASTQ files are stored
- `control_label`: the bin ID (from sample sheet) of background controls
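For example, the three required parameters could be set together in a `nextflow.config` file. This is a sketch: the paths and the `control_label` value `"DMSO"` are hypothetical placeholders, so replace them with your own paths and a bin ID from your sample sheet.

```
params {
    sample_sheet  = "/path/to/sample-sheet.csv"
    fastq_dir     = "/path/to/fastq"
    control_label = "DMSO"  // hypothetical bin ID; use one from your sample sheet
}
```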
The following parameters have default values which can be overridden if necessary.
- `inputs = "inputs"`: The folder containing your inputs.
- `outputs = "outputs"`: The folder containing the pipeline outputs.
The parameters can be provided either in the `nextflow.config` file or on the `nextflow run` command line.
Here is an example of the nextflow.config file:
```
params {
    sample_sheet = "/path/to/sample-sheet.csv"
    inputs = "/path/to/inputs"
}
```

Alternatively, you can provide the parameters on the command line:
```bash
nextflow run scbirlab/nf-baccumulation \
    --sample_sheet /path/to/sample-sheet.csv \
    --inputs /path/to/inputs
```

The sample sheet is a CSV file providing information about which FASTQ files belong to which sample.
The file must have a header with the column names below (in any order), and one line per sample to be processed. You can have additional columns with extra information if you like.
- `experiment_id`: Unique name of a peak-calling experiment. Peaks will be called across all samples with the same experiment ID.
- `lcms_filename`: Filename pattern of exported TargetLynx files.
- `compound_info`: Filename of compound library information.
Additional columns can be included. These will be carried through to the output tables, so they can be used for downstream analysis.
Here is an example of a couple of lines from a sample sheet:
| experiment_id | lcms_filename | compound_info |
|---|---|---|
| expt01 | TargetLynx/LCMS_*_Data.txt | compounds.xlsx |
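As a sketch, the example row above could be written out as a CSV from the shell. The filenames are illustrative, taken directly from the table above.

```shell
# Write a minimal sample sheet matching the example above
cat > sample-sheet.csv << 'EOF'
experiment_id,lcms_filename,compound_info
expt01,TargetLynx/LCMS_*_Data.txt,compounds.xlsx
EOF

# Show the header row to confirm the required columns are present
head -n 1 sample-sheet.csv
```

You would then point the pipeline at this file with `--sample_sheet sample-sheet.csv`.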
You can find some examples in the `test` directory of this repository.
Outputs are saved in the directory specified by `--outputs` (`outputs` by default).
They are organised into these directories:
Found a problem? Add it to the issue tracker.
Here are the help pages of the software used by this pipeline.