Use cases

A major goal of EUGENe is to streamline end-to-end deep learning (DL) solutions in regulatory genomics. We want to make common, previously published tasks accessible to a broad user base and, in doing so, hope to make it easy for users to adapt these solutions to their own data. The table below lists several common regulatory genomics DL tasks that can be carried out end to end with EUGENe:

| Task | Examples | Potential insights gained | ETL | Training and evaluation | End-to-end currently available? | Interpretation analyses currently available | Example in EUGENe use cases |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Single task regression from a tabular file | DeepBind, ResidualBind | Identification and quantification of motif importance on continuous or binary events (e.g. RBP binding) | Yes | Yes | Yes | Filter interpretation, attribution analysis, evolution, GIA | DeepBind |
| Single track classification of peak regions from a single bed file | DeepBind | Identification and quantification of motif importance on binary events (e.g. TF binding) | Yes | Yes | Yes | Filter interpretation, attribution analysis, evolution, GIA | Kopp21 |
| Multitask track classification (ChIP, ATAC, DNase, etc.) of peak regions from multiple bed files | DeepSEA, DanQ, Basset, Sei, Satori | Identification and quantification of motif importance on biochemical activity (e.g. TF binding, transcription, DNA accessibility, etc.). Variant effects on biochemical activity | Yes | Yes | Yes | Filter interpretation, attribution analysis, evolution, GIA | Basset |
| Multitask track regression (ChIP, ATAC, DNase, etc.) at binned or base pair resolution | Basenji, Enformer, BPNet | Identification and quantification of motif importance on biochemical activity (e.g. transcription, DNA accessibility, etc.). Variant effects on biochemical activity. CRE syntax rules | Yes | Yes | Yes | Filter interpretation, GIA | BPNet |
| Single task and multitask CRE activity prediction (regression and classification, both multiclass and multilabel) | DeepSTARR, MPRA-DragoNN | Identification and quantification of motif importance on CRE activity. Variant effects on CRE activity. CRE syntax rules | Yes | Yes | Yes | Filter interpretation, attribution analysis, evolution, GIA | DeepSTARR |
| Single cell ATAC-seq topic classification (multiclass classification) | DeepMEL, DeepMEL2, DeepFlyBrain | Identification and quantification of cell type specific motif importance. Cell type specific variant effect prediction. Cell type specific CRE syntax | Requires preprocessing with pycisTopic | Yes | Yes, with preprocessing performed by pycisTopic | Filter interpretation, attribution analysis, evolution, GIA | DeepMEL |
| Single cell ATAC-seq cell accessibility prediction* | scBasset | Single cell analysis (denoising, imputation, clustering, etc.). Identification and quantification of cell type specific motif importance | Requires preprocessing with ScanPy | Yes | Yes, with preprocessing performed by ScanPy | Filter interpretation, attribution analysis, evolution, GIA | scBasset |

The final column links to an implemented example of each task in EUGENe’s accompanying “use cases” GitHub repository; these examples are described below. Many of them are works in progress, and we welcome contributions from the community to help us expand this list. We envision that this list will grow as the field of regulatory genomics continues to develop and new DL solutions are published.
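The task types listed in the table above share the same kind of input (one-hot encoded sequences) and differ mainly in the shape and type of the training target. The following is a minimal sketch of that difference using only NumPy. It does not use or represent EUGENe's API: the `one_hot` helper, the `(4, length)` layout, and all sequences and target values are assumptions invented for illustration.

```python
import numpy as np

BASES = "ACGT"  # assumed channel order for this sketch


def one_hot(seq: str) -> np.ndarray:
    """Encode a DNA string as a (4, length) array; non-ACGT bases (e.g. N) stay all-zero."""
    encoded = np.zeros((len(BASES), len(seq)), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        idx = BASES.find(base)
        if idx >= 0:
            encoded[idx, pos] = 1.0
    return encoded


# Three toy sequences of equal length (hypothetical data).
seqs = ["ACGTAC", "TTGNCA", "GGGCCC"]
X = np.stack([one_hot(s) for s in seqs])  # shape: (n_seqs, 4, length)

# Single task regression: one continuous value per sequence.
y_regression = np.array([0.7, -1.2, 3.1], dtype=np.float32)

# Multitask (multilabel) classification: one binary label per sequence per track.
y_multilabel = np.array([[1, 0, 1],
                         [0, 0, 1],
                         [1, 1, 0]], dtype=np.float32)

# Multiclass classification (e.g. topics): exactly one class index per sequence.
y_multiclass = np.array([2, 0, 1])

print(X.shape, y_regression.shape, y_multilabel.shape, y_multiclass.shape)
```

In this framing, moving between rows of the table mostly means changing the target array and the matching loss (for example, a squared-error loss for regression versus a per-track binary cross-entropy for multilabel classification), while the sequence encoding stays the same.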