This talk will start with a broad introduction to phylogenetics, and
in particular to distance-based methods. Several popular methods for
phylogenetic inference (and also for hierarchical clustering) take as
input a matrix of estimated distances between taxa (or between any
kind of objects). The goal is to construct a tree with edge lengths
such that the distances between the leaves in that tree are as close
as possible to the estimated distances.
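To make the fitting goal concrete, here is a minimal sketch in Python. It assumes a hypothetical four-taxon tree ((A,B),(C,D)) with made-up edge lengths and a made-up matrix of estimated distances, and it measures closeness by a sum of squared differences. That measure is only one possible choice, used here for illustration; the names `tree_distances` and `squared_error` are likewise illustrative and do not come from [1].

```python
from itertools import combinations

# Hypothetical tree ((A,B),(C,D)) with internal nodes u and v.
# Edge lengths are illustrative values, not data from the talk.
edges = {
    ("A", "u"): 1.0,
    ("B", "u"): 2.0,
    ("u", "v"): 1.5,
    ("C", "v"): 1.0,
    ("D", "v"): 3.0,
}


def build_adjacency(edges):
    """Turn the undirected edge list into a nested adjacency dict."""
    adj = {}
    for (x, y), length in edges.items():
        adj.setdefault(x, {})[y] = length
        adj.setdefault(y, {})[x] = length
    return adj


def path_length(adj, src, dst):
    """Length of the unique path between two nodes of a tree."""
    stack = [(src, None, 0.0)]
    while stack:
        node, parent, dist = stack.pop()
        if node == dst:
            return dist
        for nxt, length in adj[node].items():
            if nxt != parent:  # never walk back along the edge we came from
                stack.append((nxt, node, dist + length))
    raise ValueError(f"no path from {src} to {dst}")


def tree_distances(edges, leaves):
    """Leaf-to-leaf path lengths induced by the tree and its edge lengths."""
    adj = build_adjacency(edges)
    return {(a, b): path_length(adj, a, b) for a, b in combinations(leaves, 2)}


def squared_error(tree_dist, est_dist):
    """Sum of squared differences (an assumed, illustrative fit criterion)."""
    return sum((tree_dist[pair] - est_dist[pair]) ** 2 for pair in est_dist)


leaves = ["A", "B", "C", "D"]

# Hypothetical estimated distances, e.g. as obtained from sequence data.
estimated = {
    ("A", "B"): 3.1, ("A", "C"): 3.4, ("A", "D"): 5.6,
    ("B", "C"): 4.5, ("B", "D"): 6.4, ("C", "D"): 4.0,
}

fitted = tree_distances(edges, leaves)
error = squared_error(fitted, estimated)
print(fitted)
print(error)
```

A distance-based method would search over tree topologies and edge lengths for the tree that gives the best value of its chosen criterion; the sketch only evaluates a single fixed candidate tree.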
The second part of the talk will introduce a novel framework unifying
many of the approaches to distance-based inference. In a recent
publication [1], we showed that all the methods that fit into this
general framework have highly desirable statistical properties
(consistency of the tree estimates) and algorithmic properties
(efficiency of hill-climbing heuristics).
Finally, I will speak about my work on the robustness of
distance-based methods: how much can the estimated distances deviate
from their 'true' values without compromising the estimation of the
'true' tree topology? It turns out that only one linear optimization
principle (among all possible linear principles) guarantees optimal
robustness, a very strong result.
[1] Pardi F, Gascuel O. Combinatorics of distance-based tree
inference. Proc Natl Acad Sci USA 109: 16443-16448 (2012).