Calculates the frequency of read duplication. Reads are treated as duplicates when they share the same alignment status, as determined by rname, strand, pos, cigar, mrnm, mpos and isize.

readsDupFreq(bamFile, index = bamFile)

Arguments

bamFile

A character vector of length 1L containing the name of a BAM file. Only a BAM file that still contains duplicate reads is meaningful for estimating library complexity, for example a raw BAM file output by an aligner, or a BAM file from which only mitochondrial reads have been removed.

index

A character vector of length 1L containing the name of a BAM index file.

Value

A two-column matrix of integers. The 1st column is the duplication frequency j = 1, 2, 3, .... The 2nd column is the number of genomic regions with that frequency (j) of duplication. The frequency column is in ascending order.
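
The shape of the returned matrix can be illustrated with a small base R sketch. This is not the package's implementation: the `reads` data frame, its values, and the names `key`, `perKey`, `counts` and `freqToy` are made up for illustration, and the sketch only assumes that reads sharing all seven fields count as duplicates of one another.

```r
# Toy alignment records (hypothetical values, not read from a BAM file).
reads <- data.frame(
  rname  = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),
  strand = c("+", "+", "+", "-", "+", "+", "-"),
  pos    = c(100L, 100L, 100L, 250L, 500L, 500L, 900L),
  cigar  = rep("50M", 7),
  mrnm   = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),
  mpos   = c(300L, 300L, 300L, 90L, 700L, 700L, 800L),
  isize  = c(250L, 250L, 250L, -210L, 250L, 250L, -150L),
  stringsAsFactors = FALSE
)

# One key per read: reads with identical keys are treated as duplicates.
key <- do.call(paste, c(reads, sep = "|"))

# Number of reads observed for each distinct key (the duplication level j).
perKey <- as.vector(table(key))

# Number of distinct keys observed at each duplication level.
counts <- table(perKey)

# Two-column integer matrix: level j, then the count of keys at that level.
freqToy <- cbind(as.integer(names(counts)), as.integer(counts))
freqToy
```

In this toy input, two alignments occur once, one occurs twice and one occurs three times, so the rows of `freqToy` are (1, 2), (2, 1) and (3, 1), with the first column in ascending order.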

Author

Haibo Liu

Examples

library(ATACseqQC)
bamFile <- system.file("extdata", "GL1.bam", package = "ATACseqQC")
freq <- readsDupFreq(bamFile)
#> Warning: There is not much information for estimating library complexity.
#> Are you sure that you used a BAM file without remove duplication reads?
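
A matrix of this form can be summarized with plain arithmetic. The sketch below uses a hand-written stand-in matrix, `freqExample`, rather than real output, and assumes only the layout described under Value (column 1 is the frequency j, column 2 is the count at that frequency). The summary names are hypothetical and are not part of the package.

```r
# Hypothetical stand-in for a readsDupFreq() result:
# column 1 = duplication frequency j, column 2 = count at that frequency.
freqExample <- cbind(c(1L, 2L, 3L), c(2L, 1L, 1L))

# Total reads: each region with frequency j contributes j reads.
totalReads <- sum(freqExample[, 1] * freqExample[, 2])

# Distinct reads: one per region, regardless of its frequency.
distinctReads <- sum(freqExample[, 2])

# Fraction of reads that are redundant copies of another read.
dupRate <- 1 - distinctReads / totalReads

c(total = totalReads, distinct = distinctReads, dupRate = dupRate)
```

If most regions sit at j = 1, the duplication rate is close to zero, which is consistent with the warning above that the file carries little information for estimating library complexity.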