The HMMtBroadPeak package is designed to call very broad peaks for data such as lamina-associated domains (LADs), nucleolus-associated domains (NADs), or other topologically associating domains.
The method follows the description of Christ et al. [1]. Reads are counted in bins. Only bins whose pooled read count across all samples reaches a given threshold (set by the background parameter) are normalized. Counts in these bins are first converted to CPM (counts per million reads) and then expressed as the log2 ratio of treatment over control, with a pseudocount added. Peaks are defined by running a hidden Markov model over the normalized values (using the R package HMMt).
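The filtering and normalization described above can be sketched in base R. This is a minimal illustration on made-up counts, not the package's internal code: the variable names, the cutoff of 10, the pseudocount of 1, and the choice to compute CPM from the full library before filtering are all assumptions.

```r
# Toy per-bin read counts (hypothetical numbers, one value per genomic bin).
treatment_counts <- c(0, 12, 30, 45, 3, 60)
control_counts   <- c(1, 20, 25, 15, 2, 20)

background  <- 10  # assumed minimum pooled read count per bin
pseudocount <- 1   # assumed pseudocount added before taking the ratio

# Keep only bins whose pooled count (all samples) reaches the cutoff.
keep <- (treatment_counts + control_counts) >= background

# Counts per million, scaled by each sample's total read count.
cpm <- function(x) x / sum(x) * 1e6
treatment_cpm <- cpm(treatment_counts)[keep]
control_cpm   <- cpm(control_counts)[keep]

# log2 ratio of treatment over control, stabilised by the pseudocount.
log2_ratio <- log2((treatment_cpm + pseudocount) / (control_cpm + pseudocount))
```

Positive values mark bins enriched in the treatment, negative values bins depleted relative to the control. A vector like `log2_ratio` is the kind of input the hidden Markov model segments.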
There are three steps for calling peaks:
The BAM files should be clean: reads should have passed quality control and, where applicable, be properly paired. Each BAM index file must be stored in the same folder as its BAM file and share its prefix.
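A quick way to confirm the index requirement before calling peaks is to test for the index files directly. The helper below is a hypothetical sketch (`has_bam_index` is not part of HMMtBroadPeak); it assumes the index follows one of the two common naming schemes, `sample.bam.bai` or `sample.bai`.

```r
# Return TRUE for each BAM file that has an index next to it.
# Both common naming schemes are accepted: "x.bam.bai" and "x.bai".
has_bam_index <- function(bam) {
  file.exists(paste0(bam, ".bai")) |
    file.exists(sub("\\.bam$", ".bai", bam))
}

# Demonstration with empty placeholder files in a temporary directory.
tmp <- tempfile("bamcheck")
dir.create(tmp)
indexed   <- file.path(tmp, "indexed.bam")
unindexed <- file.path(tmp, "unindexed.bam")
invisible(file.create(indexed, paste0(indexed, ".bai"), unindexed))

index_ok <- has_bam_index(c(indexed, unindexed))
index_ok
```

If an index is missing for a real, coordinate-sorted BAM file, `Rsamtools::indexBam()` can create one.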
treatment <- system.file("extdata", "LB1.KD.chr1_1_5000000.bam",
                         package = "HMMtBroadPeak",
                         mustWork = TRUE)
control <- system.file("extdata", "LB1.WT.chr1_1_5000000.bam",
                       package = "HMMtBroadPeak",
                       mustWork = TRUE)
## For local files, use instead:
# treatment <- "path/to/treatment/bam/files"
# control <- "path/to/control/bam/files"
The read counts for treatment and control are pooled within each group. In other words, replicates are not considered separately when peaks are called.
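Pooling amounts to summing the per-bin counts of all samples in a group. The sketch below uses hypothetical count matrices (bins in rows, replicates in columns) to show the idea; it is an illustration, not the package's own implementation.

```r
# Toy bin-by-replicate count matrices (hypothetical numbers).
treatment_reps <- cbind(rep1 = c(5, 20, 40), rep2 = c(7, 18, 44))
control_reps   <- cbind(rep1 = c(10, 15, 12), rep2 = c(9, 17, 10))

# Pooling sums the replicates of a group bin by bin, leaving one count
# vector per group; between-replicate variation is discarded.
treatment_pooled <- rowSums(treatment_reps)
control_pooled   <- rowSums(control_reps)
```

Because only the pooled vectors are carried forward, a bin that is high in one replicate and low in another looks the same as a bin that is moderate in both.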
library(HMMtBroadPeak)
called <- HMMtBroadPeak(treatment, control)
## iteration: 1
## iteration: 2
## ...
## iteration: 53
called$peaks
## GRanges object with 3 ranges and 0 metadata columns:
##       seqnames          ranges strand
##          <Rle>       <IRanges>  <Rle>
##   [1]     chr1  774227-1698303      *
##   [2]     chr1 1713289-2657344      *
##   [3]     chr1 2777225-5000001      *
##   -------
##   seqinfo: 1 sequence from an unspecified genome
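The called peaks can be summarised or written to disk for a genome browser. The sketch below re-enters the three ranges printed above by hand, so that it runs without Bioconductor installed, then computes peak widths and writes a three-column BED file. The data frame layout and the temporary output path are assumptions for illustration.

```r
# The three peaks printed above, entered by hand as a data frame.
peaks <- data.frame(
  seqnames = "chr1",
  start    = c(774227L, 1713289L, 2777225L),
  end      = c(1698303L, 2657344L, 5000001L),
  stringsAsFactors = FALSE
)

# Peak width in base pairs; GRanges coordinates are 1-based and inclusive.
peaks$width <- peaks$end - peaks$start + 1L

# BED uses 0-based, half-open intervals, so only the start shifts by one.
bed_lines <- sprintf("%s\t%d\t%d",
                     peaks$seqnames, peaks$start - 1L, peaks$end)
bed_file <- tempfile(fileext = ".bed")
writeLines(bed_lines, bed_file)
```

With the real `GRanges` object, `rtracklayer::export(called$peaks, "peaks.bed")` should perform the same coordinate conversion.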
## R Under development (unstable) (2021-03-18 r80099)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
##
## Matrix products: default
## BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 parallel stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] ggplot2_3.3.3 rtracklayer_1.51.5
## [3] HMMtBroadPeak_0.0.4 GenomicAlignments_1.27.2
## [5] Rsamtools_2.7.1 Biostrings_2.59.2
## [7] XVector_0.31.1 SummarizedExperiment_1.21.2
## [9] Biobase_2.51.0 MatrixGenerics_1.3.1
## [11] matrixStats_0.58.0 GenomicRanges_1.43.4
## [13] GenomeInfoDb_1.27.8 IRanges_2.25.6
## [15] S4Vectors_0.29.12 BiocGenerics_0.37.1
##
## loaded via a namespace (and not attached):
## [1] lattice_0.20-41 rprojroot_2.0.2 digest_0.6.27
## [4] utf8_1.2.1 R6_2.5.0 evaluate_0.14
## [7] highr_0.8 pillar_1.5.1 zlibbioc_1.37.0
## [10] rlang_0.4.10 Matrix_1.3-2 rmarkdown_2.7
## [13] pkgdown_1.6.1 labeling_0.4.2 textshaping_0.3.3
## [16] desc_1.3.0 BiocParallel_1.25.5 stringr_1.4.0
## [19] RCurl_1.98-1.3 munsell_0.5.0 DelayedArray_0.17.10
## [22] compiler_4.1.0 xfun_0.22 pkgconfig_2.0.3
## [25] systemfonts_1.0.1 htmltools_0.5.1.1 tibble_3.1.0
## [28] GenomeInfoDbData_1.2.4 XML_3.99-0.6 fansi_0.4.2
## [31] withr_2.4.1 crayon_1.4.1 bitops_1.0-6
## [34] grid_4.1.0 gtable_0.3.0 lifecycle_1.0.0
## [37] magrittr_2.0.1 scales_1.1.1 HMMt_0.1
## [40] stringi_1.5.3 debugme_1.1.0 cachem_1.0.4
## [43] farver_2.1.0 fs_1.5.0 ellipsis_0.3.1
## [46] ragg_1.1.2 vctrs_0.3.7 rjson_0.2.20
## [49] restfulr_0.0.13 tools_4.1.0 glue_1.4.2
## [52] fastmap_1.1.0 yaml_2.2.1 colorspace_2.0-0
## [55] memoise_2.0.0 knitr_1.31 BiocIO_1.1.2