Probability and Statistics Seminar

Monday, April 3, 2017 at 13:45 - UM - Building 09 - Conference room (1st floor)

Vivian Viallon
Regression modeling on stratified data with the lasso

We consider the estimation of regression models on strata defined by a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires choosing a reference stratum. We show that the performance of a penalized version of this approach depends on this arbitrary choice. We propose a refined approach that bypasses this arbitrary choice at almost no additional computational cost. Regarding model selection consistency, our proposal mimics the strategy based on an optimal, covariate-specific choice of the reference stratum. Results from an empirical study confirm that our proposal generally outperforms the basic approach in identifying and describing the interactions. An illustration on gene expression data is provided. Joint work with Edouard Ollier.

