Viola 1.0.0 Release Notes
New Features
Class conversion from Vcf to Bedpe
We implemented Vcf to Bedpe converter Vcf.as_bedpe as a method of Vcf class.
>>> vcf = viola.read_vcf('https://raw.githubusercontent.com/dermasugita/ViolaDocs/main/docs/html/_static/tutorial.manta.vcf', patient_name='patient1')
>>> bedpe = vcf.as_bedpe()
>>> bedpe
INFO=imprecise,svtype,svlen,end,cipos,ciend,cigar,mateid,event,homlen,homseq,svinslen,svinsseq,left_svinsseq,right_svinsseq,contig,bnd_depth,mate_bnd_depth,somatic,somaticscore,junction_somaticscore,inv3,inv5
Documentation of Bedpe object ==> https://dermasugita.github.io/ViolaDocs/docs/html/reference/bedpe.html
id be1 be2 strand qual svtype
0 test1 chr1:82550461 chr1:82554226 +- None DEL
1 test2 chr1:22814217 chr1:92581132 -- None INV
2 test3 chr1:60567906 chr1:60675941 +- None DEL
3 test4 chr1:69583190 chr1:69590948 +- None DEL
4 test5 chr11:104534877 chr11:104536574 +- None DEL
5 test6_1 chr11:111134697 chr17:26470495 +- None BND
6 test6_2 chr17:26470495 chr11:111134697 -+ None BND
Note that this process is lossy because only the “positions” table and INFO tables are inherited by Bedpe class.
The similar method MultiVcf.as_bedpe_multi has been also added.
>>> multi_vcf = viola.read_vcf_multi('/path/to/the/multivcf_dir/', variant_caller='manta')
>>> multi_bedpe = multi_vcf.as_bedpe_multi()
Concatenation of MultiBedpe objects
Combining data from multiple research projects is effective option for bioinformatics analysis, especially when the size of the data at hand is small. Consider the case where you want to mix VCF data from your own research with BEDPE files from PCAWG project.
File tree for this section.
.
├── pcawg/
├── your_project/
└── this_script.py
First, read the pcawg-BEDPE and your_project-VCF using read_bedpe_multi and read_vcf_multi respectively.
>>> multi_vcf_your_project = viola.read_vcf_multi('./your_project/', exclude_empty_cases=True)
>>> multi_bedpe_pcawg = viola.read_bedpe_multi('./pcawg/', svtype_col_name='svclass', exclude_empty_cases=True)
In this update, we’ve implemented viola.concat function that concatenate more than one MultiBedpe or Bedpe object into a single MultiBedpe object.
Since concatenating Vcf-relating objects is not available at this time due to technical reason, here we are going to convert MultiVcf object into MultiBedpe object and then concatenate it with pcawg MultiBedpe object.
>>> multi_bedpe_your_project = multi_vcf_your_project.as_bedpe_multi()
>>> concat_bedpe = viola.concat([multi_bedpe_your_project, multi_bedpe_pcawg])
Now you’ve got conacate_bedpe as the MultiBedpe instance containing all the patients in your project and pcawg!
A “patient_name” attribute of Bedpe/Vcf classes
Bedpe/Vcf classes now retain “patient_name” as an attribute. You can pass the patient name to the read_bedpe and read_vcf functions.
>>> vcf = read_vcf('/path/to/the/vcf', variant_caller='gridss', patient_name='patient1')
>>> bedpe = read_bedpe('/path/to/the/bedpe', patient_name='case1')
>>> bedpe.patient_name
'patient1'
Warning
Passing NoneType to the patient_name argument is deprecated though the default value of this argument is None.
Python 3.9.0 support
Python 3.9.0 is now officially supported!
Fixed Features
Fixed feature in viola.Vcf.breadend2breakpoint not calculating SVLEN (GH72).
Fixed feature in viola.MultiVcf.view and viola.MultiVcf.view showing wrong documentation url.
Deprecations
Passing NoneType to the ‘patient_name’ argument of viola.read_bedpe or viola.read_vcf is deprecated.