vvv2_display

Description

Tools to create:

a .png image file showing all variants (called by the vardict-java variant caller) along a genome/assembly (provided by the user), with their proportions on the y-axis and CDS descriptions (obtained from the vadr annotator). The coverage depth distribution can be displayed at the top of the figure (if the -o cov_depth_f option is provided).
a .tsv file describing all details of significant variants (according to the proportion threshold chosen by the user, default: 7%)
[optional] a .vcf file describing all significant variants (according to the proportion threshold)

Python/R scripts and Galaxy wrapper to use them.

It uses the results of:

vadr >= 1.4.1 for annotation (of reference/assembly, tested with vadr 1.6.4 too)
vardict-java 1.8.3 for variant calling (of BAM alignment using reference/assembly and reads)

Programs

vvv2_display.py: main script running each step of the analysis. This script can be run independently, once the vvv2 conda environment is installed and activated. Type ./vvv2_display.py then press Enter to get help on how to use it.
PYTHON_SCRIPTS/convert_tbl2json.py: Convert vadr annotation output .tbl file to json
PYTHON_SCRIPTS/convert_vcffile_to_readablefile.py: Convert vardict-java variant calling vcf file to human readable txt file
PYTHON_SCRIPTS/correct_multicontig_vardict_vcf.py: Correct contig positions in the vardict-java variant calling vcf file when the assembly provided is composed of more than one contig.

R_SCRIPTS/visualize_snp_v4.R: Create a .png file showing, in the same figure:
- coverage depth distribution along the genome/assembly (if the -o cov_depth_f option is provided)
- variant proportions along the genome/assembly and CDS positions.

Installation

Use conda environment:

conda create -n vvv2_display -y
conda activate vvv2_display
mamba/conda install -c bioconda -c conda-forge vvv2_display

Prefer mamba when installing into a completely new conda environment (it is faster). Do not mix mamba and conda.

Description:

vvv2_display.py -h

Typical usage:

vvv2_display.py -p res_vadr_pass.tsv -f res_vadr_fail.tsv -s res_vadr_seqstat.txt -n res_vardict_all.vcf -r res_vvv2_display.png -u res_vvv2_display_snp_summary.tsv -o cov_depth_f.txt -y -w 10 -x res_vvv2_display_snp_summary.vcf

where:

res_vadr_pass.tsv is the 'pass' file of vadr annotation program run on the genome/assembly (input)
res_vadr_fail.tsv is the 'fail' file of vadr annotation program (input)
res_vadr_seqstat.txt is the 'seqstat' file of vadr annotation program (input)
res_vardict_all.vcf is the result of vardict-java variant caller (input)
res_vvv2_display.png is the name of the png figure (will be created) (main output)
res_vvv2_display_snp_summary.tsv is the name of the tsv summary file (always created; this option lets you choose its name) (main output)
cov_depth_f.txt is the coverage depth by position, provided by samtools depth run on the bam alignment file (optional input)
-y displays coverage depth on a linear scale (default: log10 scale) (optional input)
-w 10 sets the variant significance threshold at 10% (default 7%): the figure displays all variants, the tsv summary keeps only significant ones (proportion higher than this threshold) (optional input)
res_vvv2_display_snp_summary.vcf is the summary of significant variants in vcf format (optional output)
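
As a rough illustration of the threshold described above, the following sketch keeps only the rows whose freq value is strictly higher than a threshold. The function name keep_significant is hypothetical, the three rows are copied from the tsv summary shown in the Output example section, and the strict comparison is an assumption based on the wording "higher than this threshold"; this is not the code used by vvv2_display.py.

```python
import csv
import io


def keep_significant(rows, threshold=0.07):
    """Keep variants whose proportion is strictly above the threshold.

    The threshold is a fraction here (0.07), whereas -w takes a percentage (7).
    """
    return [row for row in rows if float(row["freq"]) > threshold]


# Three rows borrowed from the tsv summary example (only two columns kept).
sample = "position\tfreq\n6388\t0.1429\n6622\t0.0833\n19937\t0.0714\n"
rows = list(csv.DictReader(io.StringIO(sample), delimiter="\t"))

print([r["position"] for r in keep_significant(rows)])        # default 7%
print([r["position"] for r in keep_significant(rows, 0.10)])  # as with -w 10
```

With the default threshold all three rows are kept; with 0.10 only position 6388 remains.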

All other options exist for Galaxy wrapper compatibility (they are intermediate temporary files that must appear as parameters for the Galaxy wrapper but are not used in a usual command line call).

Minimal usage:

vvv2_display.py -p res_vadr_pass.tsv -f res_vadr_fail.tsv -s res_vadr_seqstat.txt -n res_vardict_all.vcf -r res_vvv2_display.png [-o cov_depth_f.txt]
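
The optional coverage depth file is the output of samtools depth, which writes one tab-separated line per position: reference (contig) name, 1-based position, depth. A minimal sketch for inspecting such a file before plotting is given below; the function name mean_depth_per_contig and the contig names are hypothetical, and this is not part of vvv2_display.

```python
import io
from collections import defaultdict


def mean_depth_per_contig(handle):
    """Return {contig: mean depth} from 'contig<TAB>position<TAB>depth' lines."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for line in handle:
        if not line.strip():
            continue  # skip empty lines
        contig, _position, depth = line.rstrip("\n").split("\t")[:3]
        totals[contig] += int(depth)
        counts[contig] += 1
    return {contig: totals[contig] / counts[contig] for contig in totals}


# Hypothetical content; with a real file use: open("cov_depth_f.txt")
example = io.StringIO("contig_1\t1\t10\ncontig_1\t2\t30\ncontig_2\t1\t5\n")
print(mean_depth_per_contig(example))
```

Note that the mean is taken over the positions present in the file only; by default samtools depth omits positions with zero depth unless it is run with -a.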

Output example

The example was obtained from Turkey Coronavirus sequencing data, using the first draft assembly as reference.

png file:

Dotted vertical lines are contig boundaries.
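
For a multi-contig assembly, one way to read the position and position_ori columns of the tsv summary below is that position_ori is the coordinate within the contig and position the coordinate along the concatenated contigs. The sketch below works under that assumption. The function names and the contig lengths are hypothetical (the lengths were chosen so that the results match rows 8 and 9 of the example; the real contig lengths are not given here), and this is not the code of the correction script.

```python
from itertools import accumulate


def contig_offsets(lengths):
    """Offset to add to a within-contig position, for each contig in order."""
    return [0] + list(accumulate(lengths))[:-1]


def to_assembly_position(contig_index, position_in_contig, lengths):
    """Convert a within-contig position to a position along the assembly."""
    return contig_offsets(lengths)[contig_index] + position_in_contig


lengths = [13356, 541, 14000]  # hypothetical contig lengths
print(contig_offsets(lengths))                 # [0, 13356, 13897]
print(to_assembly_position(1, 48, lengths))    # 13404
print(to_assembly_position(2, 1358, lengths))  # 15255
```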

tsv summary file:

indice	position	position_ori	ref	alt	freq	gene	prot	lseq	rseq	isHomo*
1	6388	6388	A	G	0.1429	1a	ORF1a,ORF1ab polyprotein [exception ribosomal slippage],NSP3  putative papain-like protease	GTATGGTCATCAAAATACAT	GTATTGTAGAAATTGTGATG	no
2	6622	6622	A	G	0.0833	1a	ORF1a,ORF1ab polyprotein [exception ribosomal slippage],NSP3  putative papain-like protease	GGAAGCATTGAAATGTGAAC	GAAGAAAGCTGTTTTTCTTA	no
3	6838	6838	A	G	0.1429	1a	ORF1a,ORF1ab polyprotein [exception ribosomal slippage],NSP3  putative papain-like protease	TATAATTTCTGTAGATACTG	AGTTTGTGACATTTTGTCTA	no
4	7014	7014	R	A	0.8824	1a	ORF1a,ORF1ab polyprotein [exception ribosomal slippage],NSP3  putative papain-like protease	CTGATAAATTAACACCTCGT	TACCGTCATATGGTATAGAC	no
5	7833	7833	G	A	0.0909	1a	ORF1a,ORF1ab polyprotein [exception ribosomal slippage],NSP4	ATGCACCTGGAGCTTTACCA	ATTGTTTTAATGGTGATAAT	no
6	8110	8110	T	A	0.0833	1a	ORF1a,ORF1ab polyprotein [exception ribosomal slippage],NSP4	TAGTACATTCTTTACTGGTG	AGAACTTATGTTTAATATGG	no
7	9328	9328	A	G	0.1034	1a	ORF1a,ORF1ab polyprotein [exception ribosomal slippage],NSP5  putative 3C-like proteinase	CCTACATGGTGAGTTCTATG	TGCATTACACACTGGAACGG	no
8	13404	48	A	C	0.1429	intergene	intergene	TTTAGTTGATCTTAGAACGT	GTTAGTGGGAACATCCAATA	no
9	15255	1358	A	T	0.0882	1ab	similar to ORF1ab polyprotein,similar to NSP13:GBSEP:putative helicase	GTTGTCAATACCGTTAGTAT	CTGTGGTAATCATAAACCAA	no
10	15319	1422	C	T	0.0769	1ab	similar to ORF1ab polyprotein,similar to NSP13:GBSEP:putative helicase	AGCGAAAATGTTGATGATTT	TACAGGGCTAATTGTGCTGG	no
11	15326	1429	A	G	0.08	1ab	similar to ORF1ab polyprotein,similar to NSP13:GBSEP:putative helicase	ATGTTGATGATTTTAATCAA	CTAATTGTGCTGGCAGCGAA	no
12	19937	6040	G	A	0.0714	1ab	similar to ORF1ab polyprotein,similar to NSP16:GBSEP:putative 2-O-ribose methyltransferase	AAAATTTATATGACATTGCA	TAACAGAGACAAGTTGGCAC	no
13	21092	7195	T	C	0.0811	S	similar to spike protein	GTTTCTTATGATTATCAGTG	TTACGTGGTGATAACACTGG	no
14	25794	11897	TT	AA	0.0838	5b	5b protein	CTTAACAAAGCAGGACAAGC	AGGATTAGATTGTGTTTACT	no

*NB: isHomo is set to 'yes' (homopolymer region) if there is a run of at least 3 identical nucleotides.
     This may look restrictive, but Ion Torrent and Nanopore sequencing perform poorly in such regions, so make sure you verify these variants.
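
The notion of "a run of at least 3 identical nucleotides" can be sketched with a regular expression, as below. The function name has_homopolymer and the test sequences are hypothetical. Which window around the variant vvv2_display actually inspects is not stated here, so this only illustrates the run detection itself, not the tool's isHomo column.

```python
import re


def has_homopolymer(sequence, min_run=3):
    """True if the sequence contains a run of at least min_run identical letters."""
    # ([A-Z]) captures one letter; \1{n,} requires n or more repeats of it.
    pattern = r"([A-Z])\1{%d,}" % (min_run - 1)
    return re.search(pattern, sequence.upper()) is not None


print(has_homopolymer("ACGTTTAC"))  # True: contains TTT
print(has_homopolymer("ACGTTAC"))   # False: longest run is TT
```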

Test set

Input data files to test the program are provided in the test-data directory when you clone the repository of vvv2_display program.

Then you can run one of the following commands, depending on the graphical output you expect.

if you don't want the coverage depth displayed in the picture, or do not have coverage depth information for your sample:

vvv2_display.py -p test-data/res2_vadr_pass.tbl -f test-data/res2_vadr_fail.tbl -s test-data/res2_vadr.seqstat -n test-data/res2_vardict.vcf -r test-data/res2_vvv2.png -u test-data/res2_vvv2.tsv

if you want the coverage depth displayed in the picture (log scale):

vvv2_display.py -p test-data/res2_vadr_pass.tbl -f test-data/res2_vadr_fail.tbl -s test-data/res2_vadr.seqstat -n test-data/res2_vardict.vcf -o test-data/res2_covdepth.txt -r test-data/res2_vvv2.png -u test-data/res2_vvv2.tsv

if you want the coverage depth displayed in the picture (linear scale):

vvv2_display.py -p test-data/res2_vadr_pass.tbl -f test-data/res2_vadr_fail.tbl -s test-data/res2_vadr.seqstat -n test-data/res2_vardict.vcf -o test-data/res2_covdepth.txt -r test-data/res2_vvv2.png -u test-data/res2_vvv2.tsv -y
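
The three commands above differ only by the -o and -y options. The sketch below builds them from two switches; the helper build_command is hypothetical and uses only the options and test-data file names listed above.

```python
def build_command(prefix="test-data/res2", with_depth=False, linear=False):
    """Return the vvv2_display.py argument list for the test data set."""
    cmd = [
        "vvv2_display.py",
        "-p", f"{prefix}_vadr_pass.tbl",
        "-f", f"{prefix}_vadr_fail.tbl",
        "-s", f"{prefix}_vadr.seqstat",
        "-n", f"{prefix}_vardict.vcf",
        "-r", f"{prefix}_vvv2.png",
        "-u", f"{prefix}_vvv2.tsv",
    ]
    if with_depth:
        cmd += ["-o", f"{prefix}_covdepth.txt"]  # coverage depth panel
        if linear:
            cmd.append("-y")  # linear scale instead of the default log10
    return cmd


print(" ".join(build_command()))
print(" ".join(build_command(with_depth=True)))
print(" ".join(build_command(with_depth=True, linear=True)))
```

Once the conda environment is activated, such a list could be passed to subprocess.run(cmd, check=True) to launch the run.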

Citation

Please, if you use vvv2_display and publish results, cite:

The article: Flageul, Alexandre, Edouard Hirchaud, Céline Courtillon, Flora Carnet, Paul Brown, Béatrice Grasland, and Fabrice Touzain. "vvv2_align_SE, vvv2_align_PE / vvv2_display: Galaxy-Based Workflows and Tool Designed to Perform, Summarize and Visualize Variant Calling and Annotation in Viral Genome Assemblies". Viruses. 2025;17:1385. https://doi.org/10.3390/v17101385.

And for vardict-java and vadr, respectively:

Lai, Zhongwu, Aleksandra Markovets, Miika Ahdesmaki, Brad Chapman, Oliver Hofmann, Robert McEwen, Justin Johnson, Brian Dougherty, J. Carl Barrett, and Jonathan R. Dry. “VarDict: A Novel and Versatile Variant Caller for next-Generation Sequencing in Cancer Research.” Nucleic Acids Research 44, no. 11 (June 20, 2016): e108–e108. https://doi.org/10.1093/nar/gkw227.
Schäffer, Alejandro A., Eneida L. Hatcher, Linda Yankie, Lara Shonkwiler, J. Rodney Brister, Ilene Karsch-Mizrachi, and Eric P. Nawrocki. “VADR: Validation and Annotation of Virus Sequence Submissions to GenBank.” BMC Bioinformatics 21, no. 1 (December 2020): 211. https://doi.org/10.1186/s12859-020-3537-3.

Galaxy wrapper

vvv2_display.xml: Allows Galaxy integration of vvv2_display.py. vvv2_display can be used in Galaxy pipelines.

It can be found in the Galaxy toolshed at https://toolshed.g2.bx.psu.edu/repository

Related Galaxy workflows on workflowhub

with bwa-mem2 alignment of Illumina paired-end sequencing data (Mi-seq, Nextseq, Novaseq, Hiseq, Iseq): https://workflowhub.eu/workflows/1738
with bwa-mem2 alignment of Illumina or Proton single-end sequencing data: https://workflowhub.eu/workflows/1739
with bwa-mem2 alignment of Nanopore sequencing data (MinION, PromethION, GridION): https://workflowhub.eu/workflows/1740
with minimap2 alignment of Pacbio sequencing data (high quality long reads): https://workflowhub.eu/workflows/1741

Additional information / data for upstream programs

The poster of the program, accepted at the JOBIM 2025 conference in Bordeaux (France, July 2025), can be found here: doi: 10.5281/zenodo.16918391 (A0 pdf, 2.7 MB).
Additional vadr database for specific viruses:
- Porcine Circovirus: doi: 10.5281/zenodo.15065124

Funding

EMERGEN/EMERGEN2 ANR project involving:
- Agence Nationale de Sécurité Sanitaire de l'Alimentation, de l'Environnement et du Travail
- Santé Publique France
Conseil régional de Bretagne
