Research contract holder
Interdisciplinary Research Centre Health Science - Social Sciences
Francesco Tosoni
Bio
Francesco Tosoni has been a research contractor at the School since 1 September 2025, affiliated with the interdisciplinary centre Health Science and the department of excellence L'EMbeDS.
He earned a PhD in Computer Science from the University of Pisa, under the supervision of Professors P. Ferragina and G. Manzini. His doctoral dissertation, "Computation-friendly Compression of Matrices and Tries", focused on efficient data compression techniques. Since 2019, he has been a member of the Acube Laboratory (A³, Advanced Algorithms and Applications), directed by Professor P. Ferragina.
His research interests include lossless data compression, string indexing and stringology, and big data analytics.
He obtained a BSc in Computer and Electronic Engineering from the University of Perugia. He then continued his studies at the University of Pisa, earning an MSc in Computer Science and Networking in 2020, as part of a joint programme with the Sant’Anna School of Advanced Studies. His MSc thesis, "Algorithms and Data Structures for Efficient Ride-Sharing Platforms", was awarded the Con.Scienze 2020 Best Thesis Award.
In 2020, he was awarded a scholarship and research grant on "Algorithms and Data Structures for Urban Mobility Platforms" at the University of Pisa. That same year, he obtained the qualification to register as a chartered engineer (Section A, Information Engineering).
From 8 September to 20 December 2022, he was a visiting researcher in Gonzalo Navarro's laboratory at the University of Chile in Santiago. In July 2025, he was a visiting researcher with the Software Heritage team at Inria Paris; Software Heritage was co-founded by Roberto Di Cosmo.
Research
As an algorithmist, he primarily specialised in lossless data compression. Since July 2024, he has been working on optimising the compression and efficient indexing of large code archives in collaboration with the Software Heritage team, including Roberto Di Cosmo, David Douard, Martin Kirchgessner and Stefano Zacchiroli. In his doctoral thesis, Tosoni investigated compressed formats for matrices and trie structures. Subsequently, he explored various sparse matrix formats that support sparse matrix-vector multiplication (SpMV) in the compressed domain, with a focus on energy efficiency.
For those familiar with the IPA, the pronunciation of his name is [fraŋ'ʧesko to'zoːni].
News
16 Feb 2026 — My article "When source-code archival is recognised as Digital Public Good: Insights from Software Heritage's 10-year journey at UNESCO" was published on Diff, the official Wikimedia Foundation blog. The piece reflects on the UNESCO symposium celebrating Software Heritage's 10th anniversary, discussing source code archival as a global Digital Public Good and its role in cultural preservation and ethical AI.

📷 Camille Françoise, CC BY-SA 4.0
3 Feb 2026 — I published "Lightening the robotic scraping: Insights for a 'green' cache from the Software Heritage archive" on Diff (Wikimedia Foundation). The article explores strategies to reduce the environmental impact of massive robotic scraping for AI training using a compressed cache approach inspired by Software Heritage.

📷 Betty Wills (Atsme), CC BY-SA 4.0