Selected Publications
Key papers from the Bao Lab — for the full list, see Google Scholar

# co-corresponding author    * equal contribution

2026
Whole-body molecular and cellular mapping of the laboratory mouse
Spatial Atlas Multi-modal Omics
Understanding how organs interact requires knowing the molecular identity of every cell in context — not just in isolated tissues. We generated a whole-body single-cell and spatial atlas of the adult laboratory mouse, integrating transcriptomics, chromatin accessibility, and protein data across all major organs. The resource provides a reference map for studying systemic disease, inter-organ crosstalk, and the cellular basis of physiology at unprecedented resolution.
Clevenger MH, Cipurko D, Patil A, Li B, Takahama M, Mei L, Plaster M, Kawamoto G, Bao F#, Chevrier N#
2026 Cell DOI: 10.1016/j.cell.2026.03.006
Paper →
2025
Transitive prediction of small molecule function through alignment of high-content screening resources
Drug Discovery High-Content Screening Machine Learning
Predicting what a new compound does in cells is central to drug discovery, but direct experimental testing of every compound–target pair is prohibitively expensive. We developed a transitive alignment strategy that infers a compound's biological function by linking it through shared cellular phenotypes across multiple large-scale imaging datasets — even when no direct comparison exists. Applied to millions of compound–cell image pairs, the method accurately recovers mechanism-of-action groupings and identifies functionally related compounds that classical approaches miss.
Bao F, Li L, Hammerlindl H, Shen SQ, Hammerlindl S, Altschuler SJ, Wu LF
2025 Nature Biotechnology
Paper →
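The transitive idea can be sketched with a toy example (hypothetical compound names and made-up phenotype profiles, not the paper's data or algorithm): a query compound profiled only in screen B is linked to annotated compounds in screen A by hopping through an "anchor" compound profiled in both screens.

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between two phenotype profiles.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Screen A: profiles with known mechanism-of-action (MoA) annotations.
screen_a = {
    "tubulin_ref": ((1.0, 0.1, 0.0), "tubulin"),
    "kinase_ref":  ((0.0, 0.2, 1.0), "kinase"),
}

# Screen B: a different assay (different feature space). Two anchor
# compounds were profiled in both screens and link the datasets.
screen_b = {
    "anchor_1":       (0.9, 0.0),
    "anchor_2":       (0.0, 1.0),
    "novel_compound": (0.95, 0.05),  # unannotated query
}
anchors = {  # anchor name -> its profile in screen A's feature space
    "anchor_1": (0.95, 0.1, 0.05),
    "anchor_2": (0.05, 0.1, 0.9),
}

def predict_moa(query):
    # Step 1: nearest anchor within screen B (shared assay space).
    qb = screen_b[query]
    best_anchor = max(anchors, key=lambda a: cosine(qb, screen_b[a]))
    # Step 2: hop through that anchor into screen A's space and find
    # the most similar annotated reference compound there.
    qa = anchors[best_anchor]
    best_ref = max(screen_a, key=lambda r: cosine(qa, screen_a[r][0]))
    return screen_a[best_ref][1]

print(predict_moa("novel_compound"))  # -> tubulin
```

No direct comparison between `novel_compound` and the annotated references ever happens; the prediction is carried entirely through the shared anchors, which is the "transitive" step the paper scales to millions of compound–cell image pairs.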
2024
Tissue characterization at an enhanced resolution across spatial omics platforms with deep generative model
Spatial Omics Deep Generative Model
Spatial omics technologies reveal where genes are active within tissue, but most platforms trade molecular resolution for spatial coverage, leaving fine cellular boundaries blurred. We built a deep generative model that learns tissue structure from low-resolution spatial data and reconstructs it at a much finer scale — without requiring additional experiments or higher-resolution instruments. The approach works across platforms including Visium, Slide-seq, and MERFISH, and substantially improves the accuracy of downstream cell-type mapping and domain boundary detection.
Li B, Bao F*, Hou Y, Li F, Li H, Deng Y, Dai Q
2024 Nature Communications 15(1):6541
Paper →
2022
Integrative spatial analysis of cell morphologies and transcriptional states with MUSE
Multimodal Analysis Spatial Omics Imaging
A cell's function is encoded not only in its genes but also in its physical form — yet most analyses treat morphology and transcriptomics as entirely separate modalities. MUSE is a self-supervised framework that jointly embeds microscopy images of cell shape and single-cell gene expression into a shared latent space, learning a unified representation of cellular identity. Applied to spatial transcriptomics and high-content imaging datasets, MUSE uncovers tissue organization and cell subtypes that neither modality alone can resolve, providing a general strategy for integrating heterogeneous biological data.
Bao F, Deng Y, Wan S, Wang B, Dai Q, Altschuler SJ, Wu LF
2022 Nature Biotechnology
Paper →
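The "both modalities must agree" intuition behind joint embedding can be shown with a toy sketch (made-up cells and features, and a simple similarity product rather than MUSE's actual self-supervised architecture): combining morphology and expression separates cells that either modality alone conflates.

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Toy cells, each with a morphology vector and an expression vector.
cells = {
    "A1": {"morph": (1.0, 0.1), "expr": (1.0, 0.0)},
    "A2": {"morph": (1.0, 0.0), "expr": (0.0, 1.0)},  # same shape as A1, different program
    "B1": {"morph": (0.0, 1.0), "expr": (0.0, 1.0)},
}

def joint_sim(a, b):
    # Demand agreement in BOTH modalities by multiplying per-modality
    # similarities -- a crude stand-in for a learned shared latent space.
    return (cosine(cells[a]["morph"], cells[b]["morph"])
            * cosine(cells[a]["expr"], cells[b]["expr"]))

# Morphology alone cannot tell A1 from A2...
print(cosine(cells["A1"]["morph"], cells["A2"]["morph"]))  # ≈ 0.995
# ...but the joint view separates them.
print(joint_sim("A1", "A2"))  # 0.0
```

A1 and A2 look nearly identical under the microscope but run different transcriptional programs; only the joint view resolves them, which is the kind of subtype MUSE is designed to uncover.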
2021
Giotto: a toolbox for integrative analysis and visualization of spatial expression data
Spatial Transcriptomics Computational Tool
Spatial transcriptomics generates rich data that maps gene expression to physical locations in tissue, but extracting biological insight requires coordinating many distinct analysis steps. Giotto is a comprehensive open-source toolbox that unifies pre-processing, spatially variable gene detection, cell-type deconvolution, spatial domain identification, and interactive visualization in a single framework. It has become one of the most widely adopted platforms in the field, with active use across thousands of datasets and support for all major spatial omics technologies.
Dries R, Zhu Q, Dong R, Eng CHL, Li H, Liu K, Fu Y, Bao F, et al.
2021 Genome Biology 22(1):1–31
Paper →
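One building block behind spatially variable gene detection is spatial autocorrelation. As a conceptual illustration only (Giotto itself has its own API, not shown here), Moran's I on a toy row of spots flags genes whose expression varies coherently in space:

```python
def morans_i(values, neighbors):
    """Moran's I spatial autocorrelation for one gene across spots.
    neighbors: dict spot -> list of adjacent spots (symmetric, weight 1)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(len(nbrs) for nbrs in neighbors.values())  # total edge weight
    num = sum(dev[i] * dev[j] for i, nbrs in neighbors.items() for j in nbrs)
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

# Four spots in a row; adjacent spots are neighbors.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
structured  = [1.0, 1.0, 5.0, 5.0]  # expression varies smoothly in space
alternating = [1.0, 5.0, 1.0, 5.0]  # no spatial coherence

print(morans_i(structured, adj))    # ≈ 0.33: spatially variable candidate
print(morans_i(alternating, adj))   # ≈ -1.0: anti-correlated, noise-like
```

Positive values mean neighboring spots share expression levels; genes scoring high against a permutation null are candidates for defining spatial domains, one of the analysis steps Giotto coordinates.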
Unsupervised content-preserving transformation for optical microscopy
Computational Microscopy Self-supervised Learning
Different microscopy modalities reveal complementary aspects of a biological sample, but translating between them experimentally is costly and often impossible. We developed an unsupervised deep learning framework that learns to convert images across microscopy modalities — for example, from brightfield to fluorescence — while strictly preserving the underlying biological content. The method requires no paired training images and generalizes across microscope types, enabling new imaging capabilities from existing hardware and reducing the burden of multi-modal imaging experiments.
Li X, Zhang G, Qiao H, Bao F, Deng Y, Wu J, He Y, et al.
2021 Light: Science & Applications 10(1):1–11
Paper →
2020
Explaining the Genetic Causality for Complex Phenotype via Deep Association Kernel Learning
Genomics / GWAS Interpretable ML
Genome-wide association studies identify statistical links between genetic variants and disease, but understanding which variants are truly causal — and why — remains a major challenge. We introduced a deep association kernel learning framework that jointly models non-linear interactions among variants and learns to assign causal importance scores, producing both accurate predictions and human-interpretable explanations. Applied to lung cancer and other complex traits, the method recovered known causal loci and revealed previously overlooked variant–gene pathways, demonstrating that deep learning can be made mechanistically informative in population genetics.
Bao F, Deng Y, Du M, Ren Z, Wan S, Liang KY, Liu S, Wang B, Xin J, Chen F, Christiani DC, Wang M, Dai Q
2020 Patterns (Cover Article)
Paper →
2019
Scalable analysis of cell type composition from single-cell transcriptomics using deep recurrent learning
Single-cell Genomics Deep Learning
Single-cell RNA sequencing now profiles millions of cells per study, but dropout noise and sheer data volume overwhelm conventional analysis pipelines. We developed scScope, a deep recurrent learning framework that iteratively imputes dropout (missing) measurements while learning a low-dimensional representation of each cell, enabling accurate identification of cell-type composition at massive scale. The method scales to datasets of millions of cells and recovers rare subpopulations that standard workflows can miss.
Deng Y*, Bao F*, Dai Q, Wu LF, Altschuler SJ
2019 Nature Methods 16:311–314
Paper →
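As a loose illustration of iterative refinement on noisy single-cell data (a linear toy, not the paper's deep recurrent architecture): alternately impute dropout zeros from cluster centroids and re-cluster the imputed cells, so imputation and representation improve each other.

```python
# Toy cells x genes matrix; 0.0 marks dropout (missing) measurements.
data = [
    [5.0, 0.0, 1.0],   # cell type A: gene 0 high
    [4.0, 1.0, 0.0],   # cell type A
    [0.0, 1.0, 6.0],   # cell type B: gene 2 high
    [1.0, 0.0, 5.0],   # cell type B
]
dropout = [[v == 0.0 for v in row] for row in data]
imputed = [row[:] for row in data]
assign = [0, 0, 1, 1]  # crude initial grouping (e.g., by strongest gene)

def centroid(rows):
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(rows[0]))]

def nearest(row, cents):
    d2 = lambda u, v: sum((a - b) ** 2 for a, b in zip(u, v))
    return min(range(len(cents)), key=lambda k: d2(row, cents[k]))

for _ in range(5):
    cents = [centroid([imputed[i] for i in range(len(imputed)) if assign[i] == k])
             for k in (0, 1)]
    for i, row in enumerate(imputed):       # re-impute ONLY the dropout entries
        for j, is_drop in enumerate(dropout[i]):
            if is_drop:
                row[j] = cents[assign[i]][j]
    assign = [nearest(row, cents) for row in imputed]  # refine the clustering

print(assign)  # -> [0, 0, 1, 1]: cells settle into the two types
```

Each pass pulls the imputed entries toward their cluster's profile (here, the dropout in cell 0 converges toward its cluster-mate's value of 1.0); the published method replaces the centroid step with a recurrent neural network, which is what lets it scale and stay accurate on real data.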
For the complete list of publications, visit Google Scholar →