Single-cell ATAC-Seq in human pancreatic islets and deep learning upscaling of rare cells reveals cell-specific type 2 diabetes regulatory signatures

Vivek Rai, Daniel X. Quang, Michael R. Erdos, Darren A. Cusanovich, Riza M. Daza, Narisu Narisu, Luli S. Zou, John P. Didion, Yuanfang Guan, Jay Shendure, Stephen C.J. Parker, Francis S. Collins

Genetic predisposition is one of the factors that can lead to type 2 diabetes (T2D). ATAC-seq is a high-throughput epigenomic profiling method that measures chromatin accessibility; applied to bulk tissue, it yields an aggregate, tissue-wide profile. Rai, Quang, et al. use single-cell combinatorial indexing ATAC-seq (sci-ATAC-seq) to deconvolve islet cell populations and identify cell-type-specific regulatory signatures underlying T2D. They find that T2D single-nucleotide polymorphisms are significantly enriched in beta cell-specific open chromatin and in islet open chromatin shared across cell types, but not in alpha or delta cell-specific open chromatin. They also develop a novel deep learning-based strategy to improve signal recovery and feature reconstruction for low-abundance cell populations and apply it successfully to the delta cells (<5% of the total islet population) identified in the study.

Objective: Type 2 diabetes (T2D) is a complex disease characterized by pancreatic islet dysfunction, insulin resistance, and disruption of blood glucose levels. Genome-wide association studies (GWAS) have identified >400 independent signals that encode genetic predisposition. More than 90% of the associated single-nucleotide polymorphisms (SNPs) localize to non-coding regions and are enriched in chromatin-defined islet enhancer elements, indicating a strong transcriptional regulatory component to disease susceptibility. Pancreatic islets are a mixture of cell types that express distinct hormonal programs, so each cell type may contribute differently to the regulatory processes that modulate T2D-associated transcriptional circuits. Existing chromatin profiling methods such as ATAC-seq and DNase-seq, when applied to islets in bulk, produce aggregate profiles that mask important cellular and regulatory heterogeneity.

Methods: We present genome-wide single-cell chromatin accessibility profiles of >1,600 cells derived from a human pancreatic islet sample, generated using single-cell combinatorial indexing ATAC-seq (sci-ATAC-seq). We also developed a deep learning model based on the U-Net architecture to accurately predict open chromatin peak calls in rare cell populations.
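To illustrate why rare populations are harder to profile, the following minimal sketch aggregates a binary cell-by-peak accessibility matrix into per-cluster pseudo-bulk profiles. The toy matrix, the cell-type labels, and the function name `pseudobulk` are hypothetical and are not data or code from the study; the sketch does not implement the U-Net model itself.

```python
from collections import defaultdict


def pseudobulk(matrix, labels):
    """Sum binary accessibility per peak within each cluster.

    matrix: one row per cell, each row a list of 0/1 values (one per peak).
    labels: hypothetical cell-type label for each cell.
    Returns {label: (number_of_cells, [per-peak accessible-cell counts])}.
    """
    if len(matrix) != len(labels):
        raise ValueError("one label per cell is required")
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(matrix, labels):
        if label not in sums:
            sums[label] = [0] * len(row)
        # Add this cell's accessibility calls to its cluster total.
        sums[label] = [a + b for a, b in zip(sums[label], row)]
        counts[label] += 1
    return {label: (counts[label], sums[label]) for label in sums}


# Toy input (assumed): four cells, three peaks, one rare "delta" cell.
cells = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]
cell_types = ["beta", "beta", "beta", "delta"]
profiles = pseudobulk(cells, cell_types)
```

In this toy input the rare cluster contributes a single cell, so its aggregate profile rests on far fewer observations than the abundant cluster's, which is the kind of sparsity that a signal-recovery model is meant to address.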

Results: We show that sci-ATAC-seq profiles allow us to deconvolve alpha, beta, and delta cell populations and identify cell-type-specific regulatory signatures underlying T2D. In particular, T2D GWAS SNPs are significantly enriched in beta cell-specific open chromatin and in islet open chromatin shared across cell types, but not in alpha or delta cell-specific open chromatin. We also demonstrate, using the less abundant delta cells, that deep learning models can improve signal recovery and feature reconstruction for rarer cell populations. Finally, we use co-accessibility measures to nominate the cell-specific target genes at 104 non-coding T2D GWAS signals.
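The idea of testing whether GWAS SNPs are enriched in a set of open chromatin peaks can be sketched with a simple overlap test. The enrichment method actually used is not described in this abstract, so the hypergeometric test below is a stand-in, and the function name and all counts are hypothetical. A real analysis would typically also account for linkage disequilibrium and use matched background SNPs.

```python
from math import comb


def hypergeom_enrichment(n_total, n_in_peaks, n_risk, n_risk_in_peaks):
    """One-sided hypergeometric p-value for SNP-in-peak enrichment.

    n_total:         all SNPs considered (background plus risk SNPs)
    n_in_peaks:      SNPs among n_total that fall in the peak set
    n_risk:          disease-associated SNPs among n_total
    n_risk_in_peaks: disease-associated SNPs that fall in the peak set
    Returns P(X >= n_risk_in_peaks) if risk SNPs were a random draw.
    """
    upper = min(n_risk, n_in_peaks)
    if not 0 <= n_risk_in_peaks <= upper:
        raise ValueError("observed overlap is outside the possible range")
    denom = comb(n_total, n_risk)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(n_in_peaks, k) * comb(n_total - n_in_peaks, n_risk - k)
        for k in range(n_risk_in_peaks, upper + 1)
    )
    return tail / denom


# Toy counts (assumed): 10 SNPs, 4 in peaks, 3 risk SNPs, all 3 in peaks.
p_value = hypergeom_enrichment(10, 4, 3, 3)
```

A small p-value for one peak set (for example, beta cell-specific peaks) together with a large one for another would correspond to the kind of cell-type-specific pattern reported above.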

Conclusions: Collectively, we identify the islet cell type of action across genetic signals of T2D predisposition and provide higher-resolution mechanistic insights into genetically encoded risk pathways.