Academic research themes | PhD Program in Computer Science and Systems Engineering

Computer Science and Systems Engineering

Integration of multimodal data sources for 3D Genome reconstruction
Tutors: Irene Farabella, Integrative Nuclear Architecture Research Line, IIT, and Luca Calatroni, UniGe.

Background description: Understanding the genome as a complex system requires approaches that integrate quantitative modeling, data-driven inference, and physical principles. Advances in high-resolution chromatin imaging, such as FISHomics (Flores, Farabella, & Nir, Curr. Opin. Cell Biol., 2023), now produce large-scale 3D point-cloud data of individual chromosomes. Given the scale of these imaging datasets, there is a pressing need for standardized quantitative analyses and for methods that integrate spatial genomics with other omics information to uncover the relationship between genome structural plasticity and function. Turning such complex datasets into mechanistic insight requires integrative models that combine multimodal information, both spatial and probabilistic, while preserving the inherent complexity and variability of 3D genome organization.

Project description: This PhD project will develop integrative modelling strategies for 3D genome architecture using advanced computational techniques, with particular emphasis on sparse-data modelling and data-driven priors for spatial genomics. The student will design computational frameworks that combine restraint-based polymer models with AI-driven potentials (e.g., deep learning architectures) and optimization protocols to integrate diverse spatial and/or sequence genomics data (Farabella et al., Nat Struct Mol Biol, 2021; Nir, Farabella, et al., PLoS Genet, 2018). Incorporating deep learning or other AI-driven potentials will allow structural patterns and probabilistic priors to be extracted directly from large-scale datasets, thereby improving the accuracy, scalability, and interpretability of reconstructed genome structures.
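For orientation only, the idea of a restraint-based polymer model can be illustrated with a toy example. The sketch below is not the method of the cited papers or of the project itself. It assumes a bead-spring chain, harmonic distance restraints, and plain gradient descent, and every name in it (`harmonic_energy_and_grad`, `reconstruct_chain`, the restraint tuples) is hypothetical. A learned, AI-driven potential would enter as an additional energy term with its own gradient, which is left out here.

```python
import numpy as np


def harmonic_energy_and_grad(coords, pairs, targets, k):
    """Energy E = sum k * (|x_i - x_j| - d_ij)^2 and its gradient w.r.t. coords."""
    i, j = pairs[:, 0], pairs[:, 1]
    diff = coords[i] - coords[j]                  # (M, 3) separation vectors
    dist = np.linalg.norm(diff, axis=1) + 1e-12   # guard against division by zero
    dev = dist - targets                          # deviation from target distance
    energy = float(np.sum(k * dev ** 2))
    g = (2.0 * k * dev / dist)[:, None] * diff    # dE/dx_i for each pair
    grad = np.zeros_like(coords)
    np.add.at(grad, i, g)                         # accumulate over repeated indices
    np.add.at(grad, j, -g)
    return energy, grad


def reconstruct_chain(n_beads, restraints, bond_length=1.0, k_bond=1.0,
                      k_restraint=1.0, steps=2000, lr=0.01, seed=0):
    """Toy reconstruction: random start, then gradient descent on the total energy.

    restraints: list of (i, j, target_distance) tuples, e.g. sparse imaging-derived
    distances between loci (hypothetical input format).
    """
    rng = np.random.default_rng(seed)
    coords = rng.normal(size=(n_beads, 3))

    # Chain connectivity: consecutive beads are held near bond_length.
    bonds = np.column_stack([np.arange(n_beads - 1), np.arange(1, n_beads)])
    bond_targets = np.full(n_beads - 1, bond_length)

    r_pairs = np.array([(i, j) for i, j, _ in restraints], dtype=int)
    r_targets = np.array([d for _, _, d in restraints], dtype=float)

    history = []
    for _ in range(steps):
        e_bond, g_bond = harmonic_energy_and_grad(coords, bonds, bond_targets, k_bond)
        e_res, g_res = harmonic_energy_and_grad(coords, r_pairs, r_targets, k_restraint)
        history.append(e_bond + e_res)
        coords = coords - lr * (g_bond + g_res)
    return coords, history


if __name__ == "__main__":
    # Two sparse long-range restraints on a 10-bead chain (made-up values).
    coords, history = reconstruct_chain(10, [(0, 9, 3.0), (2, 7, 2.0)])
    print(f"energy: {history[0]:.3f} -> {history[-1]:.3f}")
```

Gradient descent is used here only because it keeps the example short; a realistic protocol would more likely rely on Monte Carlo or molecular-dynamics sampling and would add excluded-volume terms.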

Essential Requisites:

Good proficiency in Python
Background in physics, mathematics, bioinformatics, or a closely related field
Familiarity with machine learning/deep learning, statistics, and optimization
Familiarity with 3D biomolecular structures

Optional Requisites:

Familiarity with genomic data
Familiarity with imaging technologies, processing, and analysis
Familiarity with PyTorch or similar frameworks

Contacts: irene.farabella@iit.it