R/annoLinker.R
annoLinker.RdFast annotation of genomic peaks using DNA interaction data by building interaction networks with igraph. Peaks overlapping any node in a connected subgraph are annotated with all genes in that subgraph.
annoLinker(
peaks,
annoData,
interactions,
bindingType = c("startSite", "body", "endSite"),
bindingRegion = c(-5000, 5000),
cluster_method = c("components", "louvain", "walktrap", "infomap"),
maxgap = 0,
interactionDistanceRange = c(10000, 1e+07),
addEvidence = FALSE,
parallel = FALSE,
verbose = FALSE,
...
)GRanges object containing peak regions
annoGR or GRanges object with gene annotations
GInteractions, or Pairs object with interaction data (e.g., Hi-C, ChIA-PET)
Character, one of "startSite", "body", or "endSite"
Numeric vector of length 2 defining promoter window (e.g., c(-5000, 5000))
Character, clustering method: "components" (connected components), "louvain", "walktrap", or "infomap"
Integer, bp to extend interaction anchors for overlap detection (default: 0)
Numeric vector of length 2 defining the minimal and maximal distance of interactions. This is used to make sure the annotations are not supper far away.
Logical, add evidence to the metadata or not.
Logical, use future_lapply to do parallel computing or not.
Logical, print the message or not
Parameters for cluster. see cluster_louvain, cluster_walktrap, and cluster_infomap.
An annoLinkerResult object or NULL if no annotations found
## read the peaks and interactions
library(rtracklayer)
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> Loading required package: generics
#>
#> Attaching package: 'generics'
#> The following objects are masked from 'package:base':
#>
#> as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
#> setequal, union
#>
#> Attaching package: 'BiocGenerics'
#> The following object is masked from 'package:annoLinker':
#>
#> as.data.frame
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
#> as.data.frame, basename, cbind, colnames, dirname, do.call,
#> duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
#> mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
#> rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
#> unsplit, which.max, which.min
#> Loading required package: S4Vectors
#>
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:utils':
#>
#> findMatches
#> The following objects are masked from 'package:base':
#>
#> I, expand.grid, unname
#> Loading required package: IRanges
#> Loading required package: Seqinfo
extPath <- system.file('extdata', package='annoLinker')
peaks <- rtracklayer::import(file.path(extPath, 'peaks.bed'))
interactions <- rtracklayer::import(file.path(extPath, 'interaction.bedpe'))
library(TxDb.Drerio.UCSC.danRer10.refGene)
#> Loading required package: GenomicFeatures
#> Loading required package: AnnotationDbi
#> Loading required package: Biobase
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
annoData <- genes(TxDb.Drerio.UCSC.danRer10.refGene)
#> 32 genes were dropped because they have exons located on both strands of the
#> same reference sequence or on more than one reference sequence, so cannot be
#> represented by a single genomic range.
#> Use 'single.strand.genes.only=FALSE' to get all the genes in a GRangesList
#> object, or use suppressMessages() to suppress this message.
anno <- annoLinker(peaks, annoData, interactions, verbose=TRUE)
#> Step 1/4: Building interaction network graph...
#> Graph has 3753 nodes and 2500 edges in 1264 clusters, on average 2.969146 nodes in each cluster
#> Step 2/4: Finding interaction anchors overlapping genes and peaks...
#> Step 3/4: Finding genes and peaks in the same cluster...
#> Step 4/4: Annotating peaks with gene clusters...
#> Annotated 4336 peaks with gene clusters