viola.merge

merge(ls_inputs: list, ls_caller_names: list, threshold=100, integration=False)

Merge Vcf or Bedpe objects and return a merged Vcf or Bedpe.

Parameters
  • ls_inputs (list) – A list of Vcf or Bedpe objects to be merged

  • ls_caller_names (list) – A list of caller names (str). Only needed for Bedpe objects.

  • mode ({'distance', 'confidence_intervals'}, default 'distance') –

    The mode of the merging strategy.

    • 'distance': Merge SV records by their representative SV positions, that is, the coordinate of the POS field or that of END in the INFO field. If multiple SV positions lie within the distance specified by threshold of each other, they will be merged.

    • 'confidence_intervals': Merge SV records according to the confidence intervals reported by the SV callers. If the confidence intervals of multiple SV records overlap by at least 1 bp, they will be merged.

  • threshold (int, default 100) – Two SVs whose mutual distance is under this threshold are considered identical.

  • integration (bool, default False) – If True, only the record from one caller is retained for each group of SV events predicted to be identical, chosen according to the caller priority. Otherwise, all SV records in the input VCF files are retained and the “mergedid” INFO field is added. The caller priority follows the order of callers in ls_inputs. For now, integration works only for Vcf. A detailed explanation is given in the User Guide, “VCF Merging”.
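The two matching criteria described under mode can be sketched in plain Python. This is an illustration only, not viola's implementation: the Breakpoints class and both match functions are hypothetical names, the strict "<" comparison is inferred from "under this threshold", confidence intervals are assumed to be offsets around each position (as in the VCF CIPOS and CIEND fields), and checks such as SV type or strand are left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Breakpoints:
    """Hypothetical minimal SV record: two breakends plus CI offsets."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int
    cipos: Tuple[int, int] = (0, 0)  # offsets around pos1 (assumed, like CIPOS)
    ciend: Tuple[int, int] = (0, 0)  # offsets around pos2 (assumed, like CIEND)


def same_chroms(a: Breakpoints, b: Breakpoints) -> bool:
    return a.chrom1 == b.chrom1 and a.chrom2 == b.chrom2


def match_by_distance(a: Breakpoints, b: Breakpoints, threshold: int = 100) -> bool:
    """'distance' criterion: both breakends are under `threshold` bp apart."""
    return (same_chroms(a, b)
            and abs(a.pos1 - b.pos1) < threshold
            and abs(a.pos2 - b.pos2) < threshold)


def _overlap(pos_a: int, ci_a: Tuple[int, int],
             pos_b: int, ci_b: Tuple[int, int]) -> bool:
    # Closed intervals [pos + lo, pos + hi] share at least one base.
    return pos_a + ci_a[0] <= pos_b + ci_b[1] and pos_b + ci_b[0] <= pos_a + ci_a[1]


def match_by_confidence_intervals(a: Breakpoints, b: Breakpoints) -> bool:
    """'confidence_intervals' criterion: the CIs of both breakends overlap."""
    return (same_chroms(a, b)
            and _overlap(a.pos1, a.cipos, b.pos1, b.cipos)
            and _overlap(a.pos2, a.ciend, b.pos2, b.ciend))


sv_a = Breakpoints("chr1", 1000, "chr1", 5000, cipos=(-10, 10), ciend=(-10, 10))
sv_b = Breakpoints("chr1", 1060, "chr1", 5040, cipos=(-20, 20), ciend=(-50, 50))

print(match_by_distance(sv_a, sv_b, threshold=100))  # True: 60 bp and 40 bp apart
print(match_by_confidence_intervals(sv_a, sv_b))     # False: [990, 1010] vs [1040, 1080]
```

The same pair of records is merged under one criterion and kept apart under the other, which is why the choice of mode matters when callers report narrow confidence intervals.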

Return type

Vcf or Bedpe

See also

VCF Merging
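
The effect of the integration parameter can be illustrated with a small, self-contained sketch. It is not viola's code: the integrate function, the record dictionaries, and the caller names are hypothetical, and it assumes that records judged identical already share a "mergedid" value and that priority follows the order of the caller list, as stated for ls_inputs.

```python
def integrate(records, caller_priority, integration=True):
    """Keep one record per mergedid, preferring earlier callers in caller_priority.

    records: list of dicts with the (hypothetical) keys 'id', 'caller', 'mergedid'.
    With integration=False every record is kept, still carrying its mergedid.
    """
    if not integration:
        return list(records)

    # Lower rank means higher priority.
    rank = {name: i for i, name in enumerate(caller_priority)}
    best = {}
    for rec in records:
        current = best.get(rec["mergedid"])
        if current is None or rank[rec["caller"]] < rank[current["caller"]]:
            best[rec["mergedid"]] = rec
    return list(best.values())


records = [
    {"id": "manta_1", "caller": "manta", "mergedid": 0},
    {"id": "delly_7", "caller": "delly", "mergedid": 0},
    {"id": "delly_9", "caller": "delly", "mergedid": 1},
]

kept = integrate(records, ["manta", "delly"])
print([rec["id"] for rec in kept])  # ['manta_1', 'delly_9']
```

Reversing the priority list to ["delly", "manta"] would keep delly_7 instead of manta_1 for the first event, so the order in which inputs are passed decides which caller's coordinates survive.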