Scuola Superiore Sant'Anna

Full Professors

Francesca Chiaromonte


I am a statistician developing methods for the analysis of large, high-dimensional and complex data, and applying such methods in several scientific fields – including contemporary “Omics” sciences, Meteorology and Economics.

I received a Laurea (cum laude) in Statistical and Economic Sciences from the University of Rome La Sapienza, where I worked with Giovanni Dosi on a thesis titled Processes of Microeconomic Innovation and Macroeconomic Dynamics, and a Ph.D. in Statistics from the University of Minnesota (Minneapolis, MN), where I worked with R. Dennis Cook on a thesis titled A Reduction Paradigm for Multivariate Laws.

At the Sant’Anna School of Advanced Studies I am a faculty member in the Institute of Economics, and I contribute to a novel PhD in Data Science, established as a consortium with the Scuola Normale Superiore, the University of Pisa, the CNR and the IMT of Lucca. At Penn State (University Park PA, USA) I work in the Department of Statistics, have a courtesy affiliation with the Department of Public Health Sciences, and am active in the Institute for Genome Sciences (one of the Huck Institutes of the Life Sciences), the Center for Computational Biology and Bioinformatics and the Center for Medical Genomics.

Other academic institutions where I maintain collaborations and have spent time over the years include the MOX laboratory of the Politecnico di Milano (Milan, Italy), the Istituto di Analisi dei Sistemi e Informatica of the CNR (Rome, Italy), the Institute for Pure and Applied Mathematics of UCLA (Los Angeles CA, USA), the Courant Institute of Mathematical Sciences and the Department of Biology of NYU (New York, NY, USA), the International Institute for Applied Systems Analysis (Laxenburg, Austria), and the Santa Fe Institute (Santa Fe NM, USA).


Below are my main peer-reviewed publications sorted by area [LAST UPDATED JAN 2017]:

Methodology in Statistics and Bioinformatics

  1. Liu Y., Chiaromonte F. and Li B. (2016) Structured Ordinary Least Squares: a sufficient dimension reduction approach for regressions with partitioned predictors and heterogeneous units. Biometrics. doi:10.1111/biom.12579. R-package in CRAN.
  2. Bartolucci F., Chiaromonte F., Kuruppumullage Don P. and Lindsay B.G. (2016) Composite likelihood inference in a discrete latent variable model for two-way “clustering-by-segmentation” problems. Journal of Computational and Graphical Statistics. doi:10.1080/10618600.2016.1172018.
  3. Liu Y., Chiaromonte F., Ross H., Malhotra R., Elleder D. and Poss M. (2015) Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data. BMC Bioinformatics. 16(202). DOI: 10.1186/s12859-015-0607-z.
  4. Goldstein J., Haran M., Simeonov I., Fricks J. and Chiaromonte F. (2015). An attraction-repulsion point process model for respiratory syncytial virus infections. Biometrics. 71(2), 376–385 (student paper competition winner, Graybill/ENVR 2014 conference).
  5. Chiaromonte F. and Makova K.D. (2014). Using statistics to shed light on the dynamics of the human genome: a review. Advances in Complex Data Modeling and Computational Methods to Statistics, Contributions in Statistics. A. Paganoni and P. Secchi (eds), Springer Intl Publishing, SW. 69-85.
  6. Kuruppumullage Don P., Lindsay B. and Chiaromonte F. (2014). Model-based block clustering with EM algorithm (reviewed; 2014 student paper award finalist, ASA Nonparametric Statistics Section).
  7. Lee K.Y., Li B. and Chiaromonte F. (2013) A general theory of nonlinear sufficient dimension reduction: formulation and estimation. Annals of Statistics. 41(1), 221-249. doi:10.1214/12-AOS1071
  8. Chiaromonte F. and Taylor J. (2010) Information Based Agglomerative Segmentation in Metric Spaces. Journal of the Indian Society of Agricultural Statistics, 64(1), 33-44.
  9. Cook R.D., Li B. and Chiaromonte F. (2010) Envelope models for parsimonious and efficient multivariate linear regression. Discussion paper. Statistica Sinica, 20(3), 927-1010 (including comments and rejoinder).
  10. Kosakovsky Pond S., Wadhawan S., Chiaromonte F., Ananda G., Chung W.Y., Taylor J., Nekrutenko A. and The Galaxy Team. (2009) Windshield splatter analysis with the Galaxy metagenomic pipeline. Genome Research, 19, 2144-2153.
  11. Tyekucheva S. and Chiaromonte F. (2008) Augmenting the bootstrap to analyze high dimensional genomic data. Invited discussion article. Test, 17, 1-18 (article), 47-55 (rejoinder).
  12. Cook R.D., Li B. and Chiaromonte F. (2007) Dimension reduction in regression without matrix inversion. Biometrika, 94, 569-584.
  13. Taylor J., Tyekucheva S., King D.C., Hardison R., Miller W. and Chiaromonte F. (2006) ESPERR: Learning strong and weak signals in genomic sequence alignments to identify functional elements. Genome Research, 16, 1596-1604.
  14. Li B., Zha H. and Chiaromonte F. (2005) Contour regression: a general approach to dimension reduction. Annals of Statistics, 33(4), 1580-1616.    
  15. Kolbe D., Taylor J., Elnitski L., Eswara P., Li J., Miller W., Hardison R.C. and Chiaromonte F. (2004) Regulatory potential scores from genome-wide 3-way alignments of human, mouse and rat. Genome Research, 14, 700-707.
  16. Li B., Cook R.D. and Chiaromonte F. (2004) Dimension reduction for the conditional mean in regressions with categorical predictors. Annals of Statistics, 30, 1636-1668.
  17. Li B., Zha H. and Chiaromonte F. (2004) Linear contour learning: a method for supervised dimension reduction. Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence. ACM International Conference Proceeding Series. 349-356.
  18. Chiaromonte F., Bing Yap V. and Miller W. (2002) Scoring pairwise genomic sequence alignments. Proceedings of the Pacific Symposium on Biocomputing 2002.
  19. Chiaromonte F. and Martinelli J.A. (2002) Dimension reduction strategies for analyzing global gene expression data with a response. Mathematical Biosciences, 176(1), 123-144.
  20. Chiaromonte F., Cook R.D. and Li B. (2002) Sufficient dimension reduction in regressions with categorical predictors. Annals of Statistics, 30(2), 475-497.
  21. Chiaromonte F. and Cook R.D. (2002) Sufficient dimension-reduction and graphics in regression. Annals of the Institute of Statistical Mathematics, 54(4), 768-795.
  22. Chiaromonte F. (2001). Graphics and sufficient dimension reduction with continuous and categorical predictors. Modelli Complessi e Metodi Computazionali Intensivi per la Stima e la Previsione, C. Provasi (ed), Cleup, Padova ITALY. 39-44.
  23. Chiaromonte F. (1998). On multivariate structures and exhaustive reductions. Computing Science and Statistics, 30, S. Weisberg (ed), Interface Foundation of North America, Fairfax Station VA, 204-213.
  24. Chiaromonte F. (1997). A reduction paradigm for multivariate laws. L1 Statistical Procedures and Related Topics, Y. Dodge (ed), Institute of Mathematical Statistics Monograph Series, Hayward CA, 229-240.

Applications in “Omics” and Biomedical Sciences

  1. Campos-Sanchez R., Cremona M., Pini A., Chiaromonte F. and Makova K.D. (2016) Integration and fixation preferences of human and mouse endogenous retroviruses uncovered with Functional Data Analysis. PLoS Computational Biology, 12(6): e1004956. doi: 10.1371/journal.pcbi.1004956
  2. Rebolledo-Jaramillo B., Shu-Wei M., Stoler N., McElhoe J.A., Dickins B., Blankenberg D., Korneliussen T.S., Chiaromonte F., Nielsen R., Holland M.M., Paul I., Nekrutenko A. and Makova K.D. (2014). Maternal age effect and severe germ-line bottleneck in the inheritance of human mitochondrial DNA. Proceedings of the National Academy of Sciences USA, 111(43), 15474–15479. doi: 10.1073/pnas.1409328111
  3. Campos-Sanchez R., Kapusta A., Feschotte C., Chiaromonte F. and Makova K.D. (2014). Genomic landscape of human, bat and ex vivo DNA transposon integration. Molecular Biology and Evolution, 31(7), 1816–1832 doi:10.1093/molbev/msu138
  4. Kuruppumullage Don P., Ananda G., Chiaromonte F. and Makova K.D. (2013) Segmenting the human genome based on states of neutral genetic divergence. Proceedings of the National Academy of Sciences USA, 110(36), 14699–14704. doi:10.1073/pnas.1221792110
  5. Ananda G., Walsh E., Jacob K.D., Krasilnikova M., Eckert K.A., Chiaromonte F., Makova K.D. (2012) Distinct mutational behaviors distinguish simple tandem repeats from microsatellites in the human genome. Genome Biology and Evolution, 5(3), 606–620. doi: 10.1093/gbe/evs116
  6. Wagstaff B.J., Hedges D.J., Derbes R.S., Campos Sanchez R., Chiaromonte F., Makova K.D. and Roy-Engel A.M. (2012) Rescuing Alu: recovery of new inserts shows LINE-1 preserves Alu activity through A-tail expansion. PLoS Genetics, 8(8) e1002842.
  7. Fungtammasan A., Walsh E., Chiaromonte F., Eckert K.A., Makova K.D. (2012) A Genome-Wide Analysis of Common Fragile Sites: What Features determine chromosomal instability in the human genome? Genome Research, 22, 993-1005.
  8. Kelkar Y.D., Eckert K.A., Chiaromonte F. and Makova K.D. (2011) A matter of life and death: how microsatellites emerge in and vanish from the human genome. Genome Research, 21(12), 2038-2048. PMID:21994250
  9. Wu W., Cheng Y., Keller C.A., Kumar S.A., Ernst J., Mishra T., Morrissey C., Dorman C.M., Chen K.B., Drautz D., Giardine B., Shibata Y., Song L., Crawford G.E., Furey T.S., Kellis M., Miller W., Taylor J., Schuster S.C., Zhang Y., Chiaromonte F., Blobel G.A., Weiss M.J. and Hardison R.C. (2011) Dynamics of the Epigenetic Landscape During Erythroid Differentiation after GATA1 Restoration. Genome Research, 21(10), 1659-1671. PMID:21795386
  10. Ananda G., Chiaromonte F. and Makova K.D. (2011) A genome-wide view of mutation rate co-variation using multivariate analyses. Genome Biology, 12(3):R27. doi:10.1186/gb-2011-12-3-r27. PMID:21426544
  11. Simeonov I., Gong X., Kim O., Poss M., Chiaromonte F. and Fricks J. (2010) Exploratory spatial analysis of in vitro Respiratory Syncytial Virus co-infections. Viruses, 2(12), 2782-2802; doi:10.3390/v2122782
  12. Kelkar Y.D., Strubczewski N., Hile S.E., Chiaromonte F., Eckert K.A. and Makova K.D. (2010) What Is a Microsatellite: A Computational and Experimental Definition Based upon Repeat Mutational Behavior at A/T and GT/AC Repeats. Genome Biology and Evolution, 2, 620-635. doi: 10.1093/gbe/evq046
  13. Schuster S., Miller W. et al. (2010) Complete Khoisan and Bantu genomes from southern Africa. Nature, 463, 943-947.
  14. Cheng Y., Wu W., Kumar S.A., Yu D., Deng W., Tripic T., King D.C., Chen K.B., Zhang Y., Drautz D., Giardine B., Schuster S.C., Miller W., Chiaromonte F., Zhang Yu, Blobel G.A., Weiss M.J. and Hardison R.C. (2009) Erythroid GATA1 function revealed by genome-wide analysis of transcription factor occupancy, histone modifications and mRNA expression. Genome Research, 19, 2172-2184.
  15. Roy S., Lavine J., Chiaromonte F., Terwee J., VandeWoude S., Bjornstad O. and Poss M. (2009) Multivariate statistical analyses demonstrate unique host immune responses to single and dual lentiviral infection. PLoS ONE, 4(10) e7359. doi:10.1371/journal.pone.0007359
  16. Zhang Y., Wu W., Cheng Y., King D.C., Harris R.S., Taylor J., Chiaromonte F. and Hardison R.C. (2009) Primary sequence and epigenetic determinants of in vivo occupancy of genomic DNA by GATA1. Nucleic Acids Research. doi: 10.1093/nar/gkp747
  17. Kvikstad E.M., Chiaromonte F. and Makova K.D. (2009) Ride the wavelet: a multi-scale analysis of genomic context flanking small insertions and deletions. Genome Research, 19, 1153-1164.
  18. Cheng Y., King D.C., Dore L.C., Zhang X., Zhou Y., Zhang Y., Dorman C., Abebe D., Kumar S., Chiaromonte F., Miller W., Green R.D., Weiss M.J. and Hardison R.C. (2008) Transcriptional enhancement by GATA1-occupied DNA segments is strongly associated with evolutionary constraint on the binding site motif. Genome Research, 18, 1896-1905.
  19. Kelkar Y., Tyekucheva S., Chiaromonte F. and Makova K. (2008) The genome-wide determinants of microsatellite evolution. Genome Research, 18, 30-38.
  20. Tyekucheva S., Makova K., Karro J., Hardison R.C., Miller W. and Chiaromonte F. (2008) Human-macaque comparisons illuminate variation in neutral substitution rates. Genome Biology, 9(4): R76. Highly accessed article.
  21. Gutiérrez R.A., Lejay L., Chiaromonte F., Shasha D.E. and Coruzzi G.M. (2007) Qualitative network models and genome-wide expression data define carbon/nitrogen-responsive molecular machines in Arabidopsis. Genome Biology, 8(1): R7. 
  22. King D.C., Taylor J., Zhang Y., Cheng Y., Lawson H.A., Martin J., ENCODE groups for Transcriptional Regulation and Multispecies Alignment, Chiaromonte F., Miller W. and Hardison R.C. (2007) Finding cis-regulatory modules using comparative genomics: some lessons from ENCODE data. Genome Research, 17, 775-786.
  23. Kvikstad E.M., Tyekucheva S., Chiaromonte F. and Makova K.D. (2007) A macaque’s-eye view of human insertions and deletions: differences in mechanisms. PLoS Computational Biology, 3(9) e176, 1772-1782.
  24. Wang H., Zhang Y., Petrykowska H., Cheng Y., Zhou Y., King D., Kasturi J., Taylor J., Chiaromonte F., Miller W., Welch J., Weiss M. and Hardison R. (2006) Experimental validation of predicted mammalian erythroid cis-regulatory modules. Genome Research, 16, 1480-1492.
  25. Carrel L., Park C., Tyekucheva S., Dunn J., Chiaromonte F. and Makova K.D. (2006) Genomic environment predicts expression patterns on the human inactive X chromosome. PLoS Genetics, 2(9) e151, 1477-1486.
  26. Taylor J., Tyekucheva S., Zody M., Chiaromonte F. and Makova K. (2006) Strong and weak male mutation bias at different sites in the primate genomes: insights from the human-chimpanzee comparison. Molecular Biology and Evolution, 23(3), 565-573.
  27. King D.C., Taylor J., Elnitski L., Chiaromonte F., Miller W. and Hardison R.C. (2005) Evaluation of regulatory potential and conservation scores for detecting cis-regulatory modules in aligned mammalian genome sequences. Genome Research, 15, 1051-1060.
  28. Gibbs R. et al., Rat Genome Sequencing Project Consortium. (2004) Genome sequence of the brown Norway rat yields insights into mammalian evolution. Nature, 428, 493-521.
  29. Hillier L. et al. International Chicken Genome Sequencing Consortium (2004). Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature, 432, 695–716.
  30. Makova K.D., Yang S. and Chiaromonte F. (2004) Insertions and deletions are male-biased too: a whole-genome analysis in rodents. Genome Research, 14, 567-573.
  31. Yang S., Smit A.F., Schwartz S., Chiaromonte F., Roskin K. M., Haussler D., Miller W. and Hardison R.C. (2004) Patterns of insertions and their covariation with substitutions in the rat, mouse and human genomes. Genome Research, 14, 517-527.
  32. Hardison R.C., Chiaromonte F., Kolbe D., Wang H., Petrykowska H., Elnitski L., Yang S., Giardine B., Zhang Y., Riemer C., Schwartz S., Haussler D., Roskin K., Weber R., Diekhans M., Kent W.J., Weiss M.J., Welch J. and Miller W. (2003) Global prediction and tests for erythroid regulatory regions. Cold Spring Harbor Symposia in Quantitative Biology: The Genome of Homo Sapiens, 68, 335-345.
  33. Chiaromonte F., Weber R. J., Roskin K.M., Diekhans M., Kent W.J. and Haussler D. (2003) The share of human genomic DNA under selection estimated from human-mouse genomic alignments. Cold Spring Harbor Symposia in Quantitative Biology: The Genome of Homo Sapiens, 68, 245-255.
  34. Elnitski L., Hardison R., Li J., Yang S., Kolbe D., Eswara P., O’Connor M., Schwartz S., Miller W. and Chiaromonte F. (2003) Distinguishing regulatory DNA from neutral sites. Genome Research, 13, 64-72.
  35. Hardison R., Roskin K.M., Yang S., Diekhans M., Kent J.W., Weber R., Elnitski L., Li J., O’Connor M., Kolbe D., Schwartz S., Furey T.S., Whelan S., Goldman N., Smit A., Miller W., Chiaromonte F. and Haussler D. (2003) Co-variation in frequencies of substitution, deletion, transposition and recombination during eutherian evolution. Genome Research, 13, 13-26.
  36. Chiaromonte F., Miller W. and Bouhassira E. (2003) Gene length and proximity to neighbors affect genome-wide expression levels. Genome Research, 13, 2602-2608.
  37. Waterston R. et al., International Mouse Genome Sequencing Consortium (2002) Initial sequencing and comparative analysis of the mouse genome. Nature, 420, 520-562.
  38. Chiaromonte F., Yang S., Elnitski L., Bing Yap V., Miller W. and Hardison R. (2001). Association between divergence and interspersed repeats in mammalian noncoding genomic DNA. Proceedings of the National Academy of Sciences USA, 98(25), 14503-14508.

Economics


  1. Chiaromonte F. and Dosi G. (1993). Heterogeneity, competition, and macro-economic dynamics. Structural Change and Economic Dynamics, 4(1), 39-63.
  2. Chiaromonte F. and Dosi G. (1993). The microfoundations of competitiveness and their macro-economic implications. Technology and the Wealth of Nations; The Dynamics of Constructed Advantage, C. Freeman, D. Foray (eds). Pinter, London UK, New York NY, 107-134.
  3. Chiaromonte F., Dosi G. and Orsenigo L. (1993). Innovative learning and institutions in the process of development: on the microfoundations of growth regimes. Learning and Technological Change, R. Thompson (ed). MacMillan, London UK, 117-149.

Other scientific fields

  1. Kuruppumullage Don P., Evans J.L., Chiaromonte F. and Kowaleski A.M. (2016) Mixture-Based Path Clustering for Synthesis of ECMWF Ensemble Forecasts of Tropical Cyclone Evolution. To appear, Monthly Weather Review. doi: 10.1175/MWR-D-15-0214.1
  2. Veren D., Evans J.L., Jones S. and Chiaromonte F. (2009) Novel Metrics for Evaluation of Ensemble Forecasts of Tropical Cyclone Structure. Monthly Weather Review, 137(9), 2830–2850.
  3. Cesari P., Chiaromonte F. and Newell K.M. (2007) Support Vector Machines Categorize the Scaling of Human Grip Configurations. Behavior Research Methods, 39(4), 1001-1007.
  4. Evans J.L., Arnott J. and Chiaromonte F. (2006) Evaluation of operational model cyclone structure forecasts during extratropical transition. Monthly Weather Review, 134, 3054-3072.
  5. Arnott J., Evans J.L. and Chiaromonte F. (2004) Characterization of extratropical transition using cluster analysis. Monthly Weather Review, 132(12), 2916–2937.


My interests as a statistician include methods to analyze high-dimensional, complex and potentially under-sampled regression and classification problems (in particular dimension reduction and feature selection methods); computational techniques for the empirical assessment of significance (e.g., re-sampling, perturbation and permutation schemes); latent structure and Markov modeling approaches; and functional data analysis methods.

Most of my applied research occurs at the interface between Statistics and contemporary “Omics” sciences. This work comprises interdisciplinary collaborations with biologists and computer scientists in which large genomic, epigenomic, transcriptomic, metagenomic (microbiomes) and metabolomic data sets are analyzed to investigate various aspects of genome dynamics, evolution and function – and to characterize human diseases.

In other interdisciplinary collaborations, I work on Meteorology applications where clustering and re-sampling techniques are used to improve forecasts and delineate the structure and lifecycle of tropical storms, and on statistical methods for inference and emulation of agent-based models in Economics.

Over the years, my research has been supported by several awards from U.S. funding agencies such as the NIH (National Institutes of Health) and the NSF (National Science Foundation). I am also a Fellow of the American Statistical Association “for outstanding collaborative work in high throughput biology, contributions to methodology in statistics and bioinformatics, commitment to interdisciplinary research, and leadership in developing training programs at the interface of statistics, computation and the life sciences.”


Below is a selection of representative invited lectures I gave at international conferences and universities (out of ~70) [LAST UPDATED JAN 2017]:

  1. IWSM 2016 Conference (plenary speaker). Rennes, FRANCE. 07/2016. Functional Data Analysis at the boundary of “Omics”.
  2. 3rd ISNPS Conference. Avignon, FRANCE. 06/2016. Structured Sufficient Dimension Reduction and its applications.
  3. ISNPS Meeting 2015. Graz, AUSTRIA. 07/2015. Exploiting structure to reduce and integrate high-dimensional, under-sampled “Omics” data.
  4. 7th ERCIM Conference. Pisa, ITALY. 12/2014. Exploiting structure to reduce and integrate high dimensional, under sampled “Omics” data.
  5. SCO 2013 Conference, Politecnico di Milano. Milan, ITALY. 09/2013. Common fragile sites, microsatellites and genome dynamics: old and new statistics for human genomic data.
  6. Yale University, Department of Biostatistics. New Haven CT, USA. 11/2013. Segmenting the human genome based on states of neutral genetic divergence.
  7. 1st ISNPS Conference. Chalkidiki, GREECE. 06/2012. Statistical characterizations of genome dynamics.
  8. University of Minnesota, 40th Anniversary Reunion of the School of Statistics. Minneapolis MN, USA. 05/2011. A Statistician’s travels in Omics-land.
  9. IPAM Program on Mathematical and Computational Approaches in High-Throughput Genomics, UCLA. Los Angeles, CA. 10/2011. Genome-wide statistical analyses of mutagenic processes and their interactions.
  10. ITA 2009 Conference, UCSD. San Diego, CA. 02/2009. The words to predict it: finding patterns in high-dimensional comparative genomics spaces.


Institute of Economics