A DNA copy number profile can be viewed as a succession of segments
representing regions in the genome that share the same DNA copy number.
Multiple-change-point detection methods constitute a natural framework for
analysing such profiles and locating their change-points.
Assessing the quality of a segmentation, and in particular the confidence we
have in a given change-point, is a difficult problem. In a Bayesian
context, I will present exact and explicit formulas for the posterior
distribution of variables such as the number of change-points or their
positions. I will also show that several Bayesian model selection criteria
(BIC, ICL, IC) can be computed exactly. These results are based on an
efficient strategy to explore the whole segmentation space.
Due to the increasing size of DNA copy number profiles (n > 10^6), the
computational burden is now one of the foremost issues when
analysing them. The fastest exact algorithm runs in O(n^2) time, which is
prohibitive for large signals. I will present a fast algorithm, in
O(n log(n)), to recover the optimal segmentation (w.r.t. maximum likelihood)
into 1 to K segments for models that have one parameter per segment.
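
For orientation, the quadratic exact approach referred to above can be sketched as a classical dynamic programme. This is only a minimal illustration of that baseline, not the fast algorithm of the talk. It assumes a Gaussian model with one mean per segment and a common variance, so that maximum likelihood reduces to least squares. The function name segment_dp and its return format are hypothetical, chosen for this sketch.

```python
import numpy as np

def segment_dp(y, K):
    """Baseline O(K n^2) dynamic programme for least-squares segmentation.

    Returns {k: (cost, change_points)} for k = 1..K, where change_points
    lists the start index of every segment after the first.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Prefix sums give the cost of any segment y[i:j] in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y * y)))

    def cost(i, j):
        # Residual sum of squares of y[i:j] around its own mean (i < j).
        return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / (j - i)

    # best[k, j]: minimal cost of splitting y[:j] into exactly k segments.
    best = np.full((K + 1, n + 1), np.inf)
    # last[k, j]: start index of the k-th segment in that optimal split.
    last = np.zeros((K + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, K + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1, i] + cost(i, j)
                if c < best[k, j]:
                    best[k, j] = c
                    last[k, j] = i

    def change_points(k):
        # Walk back through the stored segment starts, from the last segment.
        cps, j = [], n
        for kk in range(k, 0, -1):
            j = int(last[kk, j])
            cps.append(j)
        return sorted(cps[:-1])  # drop the start of the first segment (0)

    return {k: (float(best[k, n]), change_points(k))
            for k in range(1, min(K, n) + 1)}
```

A single call returns the best segmentation for every number of segments from 1 to K, since the table for k segments is built from the table for k - 1. The triple loop is what makes the cost quadratic in n, which is the bottleneck for profiles with millions of points.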