Use cases
A major goal of EUGENe is to streamline end-to-end deep learning (DL) solutions in regulatory genomics. We want to make common, published tasks in the field accessible to a broad user base, and in doing so make it easy for users to adapt these solutions to their own data. The table below lists several common DL tasks in regulatory genomics that can be carried out end to end with EUGENe:
Task | Examples | Potential insights gained | ETL | Training and evaluation | End-to-end currently available? | Interpretation analyses currently available | Example in EUGENe use cases |
---|---|---|---|---|---|---|---|
Single task regression from a tabular file | DeepBind, ResidualBind | Identification and quantification of motif importance on continuous or binary events (e.g. RBP binding) | Yes | Yes | Yes | Filter interpretation, attribution analysis, evolution, GIA | DeepBind |
Single track classification of peak regions from a single bed file | DeepBind | Identification and quantification of motif importance on binary events (e.g. TF binding) | Yes | Yes | Yes | Filter interpretation, attribution analysis, evolution, GIA | Kopp21 |
Multitask track classification (ChIP, ATAC, DNase, etc.) of peak regions from multiple bed files | DeepSEA, DanQ, Basset, Sei, Satori | Identification and quantification of motif importance on biochemical activity (e.g. TF binding, transcription, DNA accessibility, etc.). Variant effects on biochemical activity | Yes | Yes | Yes | Filter interpretation, attribution analysis, evolution, GIA | Basset |
Multitask track regression (ChIP, ATAC, DNase, etc.) at binned or base pair resolution | Basenji, Enformer, BPNet | Identification and quantification of motif importance on biochemical activity (e.g. transcription, DNA accessibility, etc.). Variant effects on biochemical activity. CRE syntax rules | Yes | Yes | Yes | Filter interpretation, GIA | BPNet |
Single task and multitask CRE activity prediction (regression and classification, including multiclass and multilabel) | DeepSTARR, MPRA-DragoNN | Identification and quantification of motif importance on CRE activity. Variant effects on CRE activity. CRE syntax rules | Yes | Yes | Yes | Filter interpretation, attribution analysis, evolution, GIA | DeepSTARR |
Single cell ATAC-seq topic classification (multiclass classification) | DeepMEL, DeepMEL2, DeepFlyBrain | Identification and quantification of cell type specific motif importance. Cell type specific variant effect prediction. Cell type specific CRE syntax | Requires preprocessing with pycisTopic | Yes | Yes, with preprocessing performed by pycisTopic | Filter interpretation, attribution analysis, evolution, GIA | DeepMEL |
Single cell ATAC-seq cell accessibility prediction* | scBasset | Single cell analysis (denoising, imputation, clustering, etc.). Identification and quantification of cell type specific motif importance | Requires preprocessing with ScanPy | Yes | Yes, with preprocessing performed by ScanPy | Filter interpretation, attribution analysis, evolution, GIA | scBasset |
The final column provides a link to an implemented example of each task in EUGENe’s accompanying “use cases” GitHub repository; these examples are described below. Many of them are works in progress, and we welcome contributions from the community to help us expand this list. We envision that the list will grow as the field of regulatory genomics continues to develop and new DL solutions are published.
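To make the "multitask track classification of peak regions from multiple bed files" row more concrete, the following is a minimal pure-Python sketch of how peak sets from several tracks could be turned into a multilabel target matrix. It does not use EUGENe's API; the names `overlaps`, `multilabel_targets`, the track names, and the toy coordinates are all hypothetical and chosen only for illustration, and the intervals are assumed to follow the BED convention of half-open coordinates.

```python
from typing import Dict, List, Tuple

# (chrom, start, end), assumed to be BED-style half-open coordinates
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def multilabel_targets(
    regions: List[Interval], tracks: Dict[str, List[Interval]]
) -> List[List[int]]:
    """Build a regions x tracks binary matrix.

    Entry [i][j] is 1 if region i overlaps any peak of track j, else 0.
    Tracks are ordered alphabetically by name so the column order is stable.
    """
    names = sorted(tracks)
    return [
        [int(any(overlaps(region, peak) for peak in tracks[name])) for name in names]
        for region in regions
    ]


# Toy data standing in for the contents of a few bed files (hypothetical values)
regions = [("chr1", 100, 300), ("chr1", 500, 700), ("chr2", 0, 200)]
tracks = {
    "ATAC": [("chr1", 250, 400)],
    "CTCF_ChIP": [("chr1", 600, 650), ("chr2", 150, 400)],
}

targets = multilabel_targets(regions, tracks)
print(targets)
```

Each row of `targets` is the label vector for one region, with one column per track; a multitask classifier would be trained to predict all columns at once from the region's sequence. The quadratic overlap scan is kept for readability; a real pipeline would more likely rely on an interval-aware tool.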