Title : Statistical analysis of random graph models. Application to social sciences and history.
- Random graphs
- Mixture models
- Model selection
- MCMC, EM algorithm, variational approximations
- Dirichlet processes. Chinese restaurant process, Indian buffet process
- Hidden Markov models
PhD : A 3-year PhD position is available at the SAMM laboratory of the Sorbonne University (http://www.univ-paris1.fr), in Paris. We are looking for a dynamic candidate with a master's degree in statistics who is capable of implementing algorithms. Good skills in R or Matlab are therefore mandatory. The candidate does not need to speak French, but very good skills in English are required. The student will be supervised by Charles Bouveyron (Professor, Paris Descartes) and Pierre Latouche (Assistant Professor, Paris 1).
Salary (after payment of health insurance, taxes, and pension) : from 1450€ to 1700€, depending on whether the candidate decides to give lab sessions for students
Description : Because networks are simple data structures yet capable of representing complex systems, they are used in numerous scientific fields, from computer science to social sciences. For instance, in biology, metabolic networks represent pathways of biochemical reactions, while the regulation of genes through transcriptional factors is described using regulatory networks. Recently, there has been a growing interest in studying historical networks. For example, Jernite et al. (2013) considered a graph describing the relational ties between ecclesiastics and notable people in the kingdoms that made up Merovingian Gaul. They proposed a statistical model along with an inference procedure in order to provide historians with insights into the relationships between these actors. In particular, a score was derived that made it possible to compare kingdoms in terms of organisation and topology. This research field is referred to as historical sciences.
Since the very first work of Moreno in 1934, many methods have been proposed to extract knowledge from networks. Most of them look for clusters of vertices with homogeneous connection profiles. The stochastic block model (Nowicki et al. 2001) is a widely used random graph model which assumes that vertices are spread into latent clusters that have to be inferred. The inference of the number of blocks and the clustering of vertices require approximation techniques (Latouche et al. 2012), since some conditional distributions are not tractable and therefore standard inference approaches such as the EM algorithm cannot be used in practice. Extensions have also been proposed in order to deal with continuous and discrete edges (Mariadassou et al. 2010). However, very few approaches can deal with networks whose vertices and/or edges evolve through time. This is challenging in terms of modelling: will a hidden Markov chain on the clustering structure be sufficient to uncover interesting features in historical networks? Do we need to rely on more complicated dynamic processes?
Existing random graph models will have to be adapted to model the connections. We are also interested in deriving new approaches for graph comparison.
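To make the stochastic block model described above more concrete, here is a minimal simulation sketch in Python. It is an illustration only, not the inference procedure of the cited papers: the function name `simulate_sbm`, the parameter names `alpha` (cluster proportions) and `pi` (between-cluster connection probabilities), and the choice of an undirected binary graph without self-loops are all assumptions made for this example.

```python
import numpy as np


def simulate_sbm(n, alpha, pi, seed=0):
    """Simulate an undirected binary stochastic block model (illustrative sketch).

    n     : number of vertices
    alpha : cluster proportions, length K, summing to 1 (assumed name)
    pi    : K x K symmetric matrix of connection probabilities (assumed name)
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    pi = np.asarray(pi, dtype=float)

    # Latent cluster label of each vertex, drawn independently from alpha.
    z = rng.choice(len(alpha), size=n, p=alpha)

    # Edge probability for each pair (i, j) is pi[z[i], z[j]].
    probs = pi[z][:, z]

    # Draw each pair once (strict upper triangle), then symmetrise.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(int)
    return z, adj


# Two communities with dense within-cluster and sparse between-cluster ties.
z, adj = simulate_sbm(n=50, alpha=[0.5, 0.5], pi=[[0.8, 0.05], [0.05, 0.8]])
```

In an actual analysis only `adj` would be observed; the labels `z` and the number of clusters are the quantities to be inferred.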
K. Nowicki, T.A.B. Snijders, Estimation and prediction for stochastic block structures, Journal of the American Statistical Association 96 (2001) 1077-1087
M. Mariadassou, S. Robin, C. Vacher, Uncovering latent structure in valued graphs: a variational approach, Annals of Applied Statistics 4 (2) (2010) 715-742
P. Latouche, E. Birmelé, C. Ambroise, Variational Bayesian inference and complexity control for stochastic block models, Statistical Modelling 12 (1) (2012) 93-115
Y. Jernite, P. Latouche, C. Bouveyron, P. Rivera, L. Jegou, S. Lamassé, The random subgraph model for the analysis of an ecclesiastical network in Merovingian Gaul, arXiv (2012)