viola.Vcf.merge
- Vcf.merge(ls_vcf: list, ls_caller_names: list, threshold: float, linkage='complete', str_missing=True, integration=True)
Return a merged or integrated vcf object built from the vcf objects of multiple callers in ls_vcf
- Parameters
ls_vcf (list) – A list of vcf objects to be merged, in the same order as ls_caller_names
ls_caller_names (list) – A list of names of the vcf objects to be merged, which should have self’s name as the first element
mode ({'distance', 'confidence_intervals'}, default 'distance') –
The mode of the merging strategy.
'distance': Merge SV records by representative SV positions, that is, the coordinates of the POS field or of END in the INFO field. If multiple SV positions are within the distance specified in threshold of each other, they will be merged.
'confidence_intervals': Merge SV records according to the confidence intervals reported by SV callers. If the confidence intervals of multiple SV records overlap by at least 1 bp, they will be merged.
threshold (float, default 100) – Two SVs whose difference in position is under this threshold are considered to be identical. This argument is used only when mode='distance'.
linkage ({‘complete’, ‘average’, ‘single’}, default ‘complete’) – The linkage method of hierarchical clustering. To keep the mutual distance of all SVs in each cluster below the threshold, “complete” is recommended. This argument is used only when mode='distance'.
str_missing (bool, default True) – If True, all missing strands are considered to be the same as the others.
integration (bool, default True) – If True, vcf objects in ls_vcf will be merged and integrated, with priority following the order of ls_caller_names.
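
The two merging criteria described above can be illustrated with a small, library-independent sketch. It is not viola's implementation: the function names `cluster_positions` and `intervals_overlap` are hypothetical, positions are treated as plain integers on one chromosome, the comparison with the threshold is assumed to be strict ("under this threshold"), and confidence intervals are assumed to be closed (start, end) pairs. The point is only to show why 'complete' linkage keeps every pair in a cluster under the threshold while 'single' linkage can chain distant records together.

```python
from itertools import combinations


def cluster_positions(positions, threshold, linkage="complete"):
    """Agglomerative clustering of 1-D positions (illustrative sketch only).

    'complete' uses the largest pairwise distance between two clusters,
    'single' uses the smallest. Clusters are merged while that distance
    is strictly below the threshold (an assumption of this sketch).
    """
    reducer = {"complete": max, "single": min}[linkage]
    clusters = [[p] for p in sorted(positions)]
    while True:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = reducer(abs(a - b) for a in clusters[i] for b in clusters[j])
            if d < threshold and (best is None or d < best[0]):
                best = (d, i, j)
        if best is None:
            return clusters
        _, i, j = best  # i < j, so deleting j leaves index i valid
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]


def intervals_overlap(ci_a, ci_b):
    """True if two closed intervals (start, end) share at least 1 bp."""
    return max(ci_a[0], ci_b[0]) <= min(ci_a[1], ci_b[1])


positions = [1000, 1060, 1120, 5000]

# Complete linkage: 1000 and 1120 are 120 bp apart, so they stay separate.
print(cluster_positions(positions, 100, "complete"))
# [[1000, 1060], [1120], [5000]]

# Single linkage: 1060 bridges 1000 and 1120 into one cluster.
print(cluster_positions(positions, 100, "single"))
# [[1000, 1060, 1120], [5000]]

# Confidence intervals that share exactly one base still count as overlapping.
print(intervals_overlap((990, 1010), (1010, 1030)))  # True
print(intervals_overlap((990, 1009), (1010, 1030)))  # False
```

With 'single' linkage the first and last records in the merged cluster differ by 120 bp even though the threshold is 100, which is the situation the recommendation for “complete” avoids.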
- Returns
A merged vcf object or an integrated vcf object
- Return type