Different arithmetic operations can be performed on Hi-C contact matrices:

  • normalize a contact matrix using iterative correction;

  • detrend a contact matrix, i.e. remove the distance-dependent contact trend;

  • autocorrelate a contact matrix: this is is typically done to highlight large-scale compartments;

  • divide one contact matrix by another;

  • merge multiple contact matrices;

  • despeckle (i.e. smooth out) a contact matrix out;

  • aggregate (average) a contact matrices over a set of genomic loci of interest;

  • boost Hi-C signal by enhancing long-range interactions while preserving short- range interactions (this is adapted from Boost-HiC);

  • subsample interactions using a proportion or a fixed number of final interactions.

  • coarsen a contact matrix to a larger (coarser) resolution

# S4 method for HiCExperiment
aggregate(
  x,
  targets,
  flankingBins = 51,
  maxDistance = NULL,
  BPPARAM = BiocParallel::bpparam()
)

detrend(x, use.scores = "balanced")

autocorrelate(x, use.scores = "balanced", detrend = TRUE, ignore_ndiags = 3)

divide(x, by, use.scores = "balanced", pseudocount = 0)

# S4 method for HiCExperiment,HiCExperiment
merge(x, y, ..., use.scores = "balanced", FUN = mean)

despeckle(x, use.scores = "balanced", focal.size = 1)

boost(x, use.scores = "balanced", alpha = 1, full.replace = FALSE)

coarsen(x, bin.size)

# S4 method for HiCExperiment
normalize(
  object,
  use.scores = "count",
  niters = 200,
  min.nnz = 10,
  mad.max = 3
)

subsample(x, prop)

Arguments

x, y, object

a HiCExperiment object

targets

Set of chromosome coordinates for which interaction counts are extracted from the Hi-C contact file, provided as a GRanges object (for diagnoal-centered loci) or as a GInteractions object (for off-diagonal coordinates).

flankingBins

Number of bins on each flank of the bins containing input targets.

maxDistance

Maximum distance to use when compiling distance decay

BPPARAM

BiocParallel parameters

use.scores

Which scores to use to perform operations

detrend

Detrend matrix before performing autocorrelation

ignore_ndiags

ignore N diagonals when calculating correlations

by

a HiCExperiment object

pseudocount

Add a pseudocount when dividing matrices (Default: 0)

...

HiCExperiment objects. For aggregate, targets (a set of GRanges or GInteractions).

FUN

merging function

focal.size

Size of the smoothing rectangle

alpha

Power law scaling factor. As indicated in Boost-HiC documentation, the alpha parameter influences the weighting of contacts: if alpha < 1 long-range interactions are prioritized; if alpha >> 1 short-range interactions have more weight when computing the distance matrix.

full.replace

Whether to replace the entire set of contacts, rather than only filling the missing interactions (count=0) (Default: FALSE)

bin.size

Bin size to coarsen a HiCExperiment at

niters

Number of iterations for ICE matrix balancing

min.nnz

Filter bins with less than min.nnz non-zero elements when performing ICE matrix balancing

mad.max

Filter out bins whose log coverage is less than mad.max median absolute deviations below the median bin log coverage.

prop

Float between 0 and 1, or integer corresponding to the # of

Value

a HiCExperiment object with extra scores

Examples

#### -----
#### Normalize a contact matrix
#### -----

library(HiContacts)
contacts_yeast <- contacts_yeast()
#> see ?HiContactsData and browseVignettes('HiContactsData') for documentation
#> loading from cache
normalize(contacts_yeast)
#> 
#> `HiCExperiment` object with 8,757,906 contacts over 763 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/8ee50b5962e_7752" 
#> focus: "whole genome" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 16000 
#> interactions: 267709 
#> scores(3): count balanced ICE 
#> topologicalFeatures: compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(0):

#### -----
#### Detrending a contact matrix
#### -----

detrend(contacts_yeast)
#> `HiCExperiment` object with 8,757,906 contacts over 763 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/8ee50b5962e_7752" 
#> focus: "whole genome" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 16000 
#> interactions: 267709 
#> scores(4): count balanced expected detrended 
#> topologicalFeatures: compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(0):

#### -----
#### Auto-correlate a contact matrix
#### -----

autocorrelate(contacts_yeast)
#> 
#> `HiCExperiment` object with 8,757,906 contacts over 763 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/8ee50b5962e_7752" 
#> focus: "whole genome" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 16000 
#> interactions: 267709 
#> scores(5): count balanced expected detrended autocorrelated 
#> topologicalFeatures: compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(0):

#### -----
#### Divide 2 contact matrices
#### -----

contacts_yeast <- refocus(contacts_yeast, 'II')
contacts_yeast_eco1 <- contacts_yeast_eco1() |> refocus('II')
#> see ?HiContactsData and browseVignettes('HiContactsData') for documentation
#> loading from cache
divide(contacts_yeast_eco1, by = contacts_yeast)
#> `HiCExperiment` object with 996,154 contacts over 51 regions 
#> -------
#> fileName: N/A 
#> focus: "II" 
#> resolutions(1): 16000
#> active resolution: 16000 
#> interactions: 1326 
#> scores(6): count.x balanced.x count.by balanced.by balanced.fc balanced.l2fc 
#> topologicalFeatures: () 
#> pairsFile: N/A 
#> metadata(2): hce_list operation

#### -----
#### Merge 2 contact matrices
#### -----

merge(contacts_yeast_eco1, contacts_yeast)
#> `HiCExperiment` object with 733,836.5 contacts over 51 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/8ee79d66ed6_7754" 
#> focus: "II" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 16000 
#> interactions: 1326 
#> scores(2): count balanced 
#> topologicalFeatures: () 
#> pairsFile: N/A 
#> metadata(2): hce_list operation

#### -----
#### Despeckle (smoothen) a contact map
#### -----

despeckle(contacts_yeast)
#> `HiCExperiment` object with 471,364 contacts over 51 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/8ee50b5962e_7752" 
#> focus: "II" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 16000 
#> interactions: 1326 
#> scores(3): count balanced balanced.despeckled 
#> topologicalFeatures: compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(0):

#### -----
#### Aggregate a contact matrix over centromeres, at different scales
#### -----

contacts <- contacts_yeast() |> zoom(resolution = 1000)
#> see ?HiContactsData and browseVignettes('HiContactsData') for documentation
#> loading from cache
centros <- topologicalFeatures(contacts, 'centromeres')
aggregate(contacts, targets = centros, flankingBins = 51)
#> Going through preflight checklist...
#> Parsing the entire contact matrice as a sparse matrix...
#> Modeling distance decay...
#> Filtering for contacts within provided targets...
#> `AggrHiCExperiment` object over 16 targets 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/8ee50b5962e_7752" 
#> focus: 16 targets 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 1000 
#> interactions: 10609 
#> scores(4): count balanced expected detrended 
#> slices(4): count balanced expected detrended 
#> topologicalFeatures: targets(16) compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(0):

#### -----
#### Enhance long-range interaction signal
#### -----

contacts <- contacts_yeast() |> zoom(resolution = 1000) |> refocus('II')
#> see ?HiContactsData and browseVignettes('HiContactsData') for documentation
#> loading from cache
boost(contacts, alpha = 1)
#> `HiCExperiment` object with 469,569 contacts over 814 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/8ee50b5962e_7752" 
#> focus: "II" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 1000 
#> interactions: 330763 
#> scores(3): count balanced boosted 
#> topologicalFeatures: compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(1): alpha

#### -----
#### Subsample & "coarsen" contact matrix 
#### -----

subcontacts <- subsample(contacts, prop = 100000)
coarsened_subcontacts <- coarsen(subcontacts, bin.size = 4000)