TwinCell Blueprint: Foundation for AI-Assisted Cell Reprogramming
We propose to couple our automated in vitro cellular reprogramming platform with a new compositional, multimodal AI/LLM modeling and control approach to demonstrate direct reprogramming of human fibroblasts to cell type X at 10x the efficiency of reprogramming via iPSCs.
PAPERS
Ronquist S, Patterson G, Muir LA, Lindsly S, Chen H, Brown M, Wicha MS, Bloch A, Brockett R, Rajapakse I. "Algorithm for cellular reprogramming." Proceedings of the National Academy of Sciences. 2017 Nov 7;114(45):11832-7. Data-guided Control (DGC) Supporting Information
Chen H, Chen J, Muir LA, Ronquist S, Meixner W, Ljungman M, Ried T, Smale S, Rajapakse I. "Functional Organization of the Human 4D Nucleome." Proceedings of the National Academy of Sciences 112.26 (2015): 8002-8007. Supporting Information
Böttcher, Lucas, Nino Antulov-Fantulin, and Thomas Asikis. "AI Pontryagin or how artificial neural networks learn to control dynamical systems." Nature Communications 13.1 (2022): 333.
Yang, Fan, et al. "scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data." Nature Machine Intelligence 4.10 (2022): 852-866.
Geshkovski, Borjan, et al. "The emergence of clusters in self-attention dynamics." arXiv preprint arXiv:2305.05465 (2023).
Chen, Ricky TQ, et al. "Neural ordinary differential equations." Advances in Neural Information Processing Systems 31 (2018).
Feng, Jiarui, et al. "PathFinder: a novel graph transformer model to infer multi-cell intra- and inter-cellular signaling pathways and communications." bioRxiv (2024): 2024-01.
Consens, Micaela E., et al. "To Transformers and Beyond: Large Language Models for the Genome." arXiv preprint arXiv:2311.07621 (2023).
Khan, Sumeer Ahmad, et al. "Reusability report: Learning the transcriptional grammar in single-cell RNA-sequencing data using transformers." Nature Machine Intelligence 5.12 (2023): 1437-1446.
Geshkovski, Borjan, et al. "A mathematical perspective on Transformers." arXiv preprint arXiv:2312.10794 (2023).
Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems 30 (2017).
Schubert, Ingmar, et al. "A Generalist Dynamics Model for Control." arXiv preprint arXiv:2305.10912 (2023).
Dotson, Gabrielle A., Can Chen, Stephen Lindsly, Anthony Cicalo, Sam Dilworth, Charles Ryan, Sivakumar Jeyarajan et al. "Deciphering Multi-way Interactions in the Human Genome." Nature Communications 13, no. 1 (2022): 5498.
He, Zhen, et al. "Mosaic integration and knowledge transfer of single-cell multimodal data with MIDAS." Nature Biotechnology (2024): 1-12.
Li, Q., Hu, Z., Wang, Y., Li, L., Fan, Y., King, I., ... & Li, Y. (2024). "Progress and Opportunities of Foundation Models in Bioinformatics." arXiv preprint arXiv:2402.04286.
K. Kawaharazuka, T. Matsushima, A. Gambardella, J. Guo, C. Paxton, and A. Zeng, “Real-World Robot Applications of Foundation Models: A Review.” arXiv, Feb. 08, 2024
Gemini Team et al., “Gemini: A Family of Highly Capable Multimodal Models.” arXiv, Dec. 18, 2023
C. V. Theodoris et al., “Transfer learning enables predictions in network biology,” Nature, vol. 618, no. 7965, pp. 616–624, Jun. 2023
W. Hou and Z. Ji, “Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis,” Nat Methods, pp. 1–4, Mar. 2024
H. Cui et al., “scGPT: toward building a foundation model for single-cell multi-omics using generative AI,” Nat Methods, pp. 1–11, Feb. 2024
V. Sharma and V. Raman, “A reliable knowledge processing framework for combustion science using foundation models,” Energy and AI, vol. 16, p. 100365, May 2024
X. Qiu et al., “Mapping transcriptomic vector fields of single cells,” Cell, vol. 185, no. 4, pp. 690-711.e45, Feb. 2022
Schiebinger, Geoffrey, et al. "Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in reprogramming." Cell 176.4 (2019): 928-943.
A. Gayoso et al., “Deep generative modeling of transcriptional dynamics for RNA velocity analysis in single cells,” Nat Methods, vol. 21, no. 1, pp. 50–59, Jan. 2024
Luck, Katja, et al. "A reference map of the human binary protein interactome." Nature (2020): 1-7.
Szklarczyk, Damian, et al. "STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets." Nucleic Acids Research 47.D1 (2019): D607-D613.
R. Eguchi, M. Hamano, M. Iwata, T. Nakamura, S. Oki, and Y. Yamanishi, “TRANSDIRE: data-driven direct reprogramming by a pioneer factor-guided trans-omics approach,” Bioinformatics, vol. 38, no. 10, pp. 2839–2846, May 2022
F. M. Mione et al., “A workflow management system for reproducible and interoperable high-throughput self-driving experiments,” Computers & Chemical Engineering, vol. 187, p. 108720, Aug. 2024
S. D. Rihm et al., “The digital lab manager: Automating research support,” SLAS Technology, vol. 29, no. 3, p. 100135, Jun. 2024: SUPPLEMENTARY INFORMATION
M. Hao et al., “Large Scale Foundation Model on Single-cell Transcriptomics.” bioRxiv, p. 2023.05.29.542705, Jun. 02, 2023
Chiliński, Mateusz, and Dariusz Plewczynski. "HiCDiffusion - diffusion-enhanced, transformer-based prediction of chromatin interactions from DNA sequences." bioRxiv (2024): 2024-02.
C. Fang et al., “Cell-Graph Compass: Modeling Single Cells with Graph Structure Foundation Model.” bioRxiv, p. 2024.06.04.597354, Jun. 06, 2024
Boiko, Daniil A., et al. "Autonomous chemical research with large language models." Nature 624.7992 (2023): 570-578.
Zhao, Hongyu, et al. "Evaluating the Utilities of Large Language Models in Single-cell Data Analysis." (2023).
Paaß, Gerhard, and Sven Giesselbach. Foundation models for natural language processing: Pre-trained language models integrating media. Springer Nature, 2023.
Abramson, Josh, et al. "Accurate structure prediction of biomolecular interactions with AlphaFold 3." Nature (2024): 1-3.
Kanda, Genki N., et al. "Robotic search for optimal cell culture in regenerative medicine." eLife 11 (2022): e77007.
Bai, Jiaru, et al. "A dynamic knowledge graph approach to distributed self-driving laboratories." Nature Communications 15.1 (2024): 462.
Rihm, Simon D., et al. "Transforming research laboratories with connected digital twins." Nexus 1.1 (2024). (Technical Report)
Criscitiello, Christopher, et al. "Synchronization on circles and spheres with nonlinear interactions." arXiv preprint arXiv:2405.18273 (2024).
Silver, David, et al. "Mastering the game of Go without human knowledge." Nature 550.7676 (2017): 354-359.
Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489.
Trapnell, Cole. "Defining cell types and states with single-cell genomics." Genome Research 25.10 (2015): 1491-1498.
Qiu, Xiaojie, et al. "Reversed graph embedding resolves complex single-cell trajectories." Nature Methods 14.10 (2017): 979-982.
Rafelski, Susanne M., and Julie A. Theriot. "Establishing a conceptual framework for holistic cell states and state transitions." Cell 187.11 (2024): 2633-2651.
Wang, Junlin, et al. "Mixture-of-Agents Enhances Large Language Model Capabilities." arXiv preprint arXiv:2406.04692 (2024).
Geneva, Nicholas, and Nicholas Zabaras. "Transformers for modeling physical systems." Neural Networks 146 (2022): 272-289.
Liu, Jiajia, et al. "Large language models in bioinformatics: applications and perspectives." ArXiv (2024).
Davis, Jared Quincy, et al. "Networks of Networks: Complexity Class Principles Applied to Compound AI Systems Design." arXiv preprint arXiv:2407.16831 (2024).
Szałata, Artur, et al. "Transformers in single-cell omics: a review and new perspectives." Nature Methods 21.8 (2024): 1430-1443.
Trapnell, Cole, et al. "The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells." Nature Biotechnology 32.4 (2014): 381-386.
Lotfollahi, Mohammad. "Toward learning a foundational representation of cells and genes." Nature Methods 21.8 (2024): 1416-1417.
Tan, Jimin, et al. "Cell-type-specific prediction of 3D chromatin organization enables high-throughput in silico genetic screening." Nature Biotechnology 41.8 (2023): 1140-1150.
Sexton, Tom, and Giacomo Cavalli. "The role of chromosome domains in shaping the functional genome." Cell 160.6 (2015): 1049-1059.
Simon, Elana, Kyle Swanson, and James Zou. "Language models for biological research: a primer." Nature Methods 21.8 (2024): 1422-1429.
Rood, Jennifer E., Anna Hupalowska, and Aviv Regev. "Toward a foundation model of causal cell and tissue biology with a Perturbation Cell and Tissue Atlas." Cell 187.17 (2024): 4520-4545.
Roohani, Yusuf, et al. "BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments." arXiv preprint arXiv:2405.17631 (2024).
Ivanov, Igor. "BioLP-bench: Measuring understanding of AI models of biological lab protocols." bioRxiv (2024): 2024-08.
H. Chen et al., “Quantized multi-task learning for context-specific representations of gene network dynamics,” Aug. 19, 2024
Silver, David, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot et al. "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play." Science 362, no. 6419 (2018): 1140-1144.
Schrittwieser, Julian, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez et al. "Mastering Atari, Go, chess and shogi by planning with a learned model." Nature 588, no. 7839 (2020): 604-609.
Maheshwari, Paridhi, et al. "TimeGraphs: Graph-based Temporal Reasoning." arXiv preprint arXiv:2401.03134 (2024).
Zhang, Ke, et al. "Prediction of gene co-expression from chromatin contacts with graph attention network." Bioinformatics 38.19 (2022): 4457-4465.
J. M. Replogle et al., “Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq,” Cell, vol. 185, no. 14, pp. 2559-2575.e28, Jul. 2022
Y. Roohani, K. Huang, and J. Leskovec, “Predicting transcriptional outcomes of novel multigene perturbations with GEARS,” Nat Biotechnol, vol. 42, no. 6, pp. 927–935, Jun. 2024
D. Molho et al., “Deep Learning in Single-cell Analysis,” ACM Trans. Intell. Syst. Technol., vol. 15, no. 3, p. 40:1-40:62, Mar. 2024
S. Gao et al., “Empowering biomedical discovery with AI agents,” Cell, vol. 187, no. 22, pp. 6125–6151, Oct. 2024
V. Zambaldi et al., “De novo design of high-affinity protein binders with AlphaProteo,” Sep. 12, 2024, arXiv: arXiv:2409.08022
A. Kabir et al., “DNA breathing integration with deep learning foundational model advances genome-wide binding prediction of human transcription factors,” Nucleic Acids Research, vol. 52, no. 19, p. e91, Oct. 2024
de Lima Camillo, Lucas Paulo, Raghav Sehgal, Jenel Armstrong, Albert Tzongyang Higgins-Chen, Steve Horvath, and Bo Wang. "CpGPT: a Foundation Model for DNA Methylation." bioRxiv (2024): 2024-10.
Schaefer, Moritz, Peter Peneder, Daniel Malzl, Mihaela Peycheva, Jake Burton, Anna Hakobyan, Varun Sharma et al. "Multimodal learning of transcriptomes and text enables interactive single-cell RNA-seq data exploration with natural-language chats." bioRxiv (2024): 2024-10.
Doctor, Yesh, Milan Sanghvi, and Prashant Mali. "A Manual for Genome and Transcriptome Engineering." IEEE Reviews in Biomedical Engineering (2024).
A. Dosovitskiy et al., “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale,” Jun. 03, 2021, arXiv: arXiv:2010.11929
A. Singh et al., “FLAVA: A Foundational Language And Vision Alignment Model,” Mar. 29, 2022, arXiv: arXiv:2112.04482
OTHER RESOURCES
Graph Attention Networks: https://petar-v.com/GAT/
NVIDIA and other related sites.