Referenced Papers (6)

Unraveling the functional dark matter through global metagenomics

Georgios A Pavlopoulos, Fotis A Baltoumas, Sirui Liu, Oguz Selvitopi, Antonio Pedro Camargo, Stephen Nayfach

Nature, 2023

"This paper is cited to support the claim that current biological knowledge has only scratched the surface of sequence diversity, especially in metagenomes, where the number of new protein sequence clusters discovered grows exponentially as more samples are added."

Referenced at: 01:57

Exploring the conformational landscape of cryo-EM using energy-aware pathfinding algorithm

Teng-Yu Lin, Szu-Chi Chung

bioRxiv, 2023

"This citation is used to illustrate that while protein structure prediction has made great strides (for example, with ESMFold and AlphaFold2), knowing the structure alone does not directly reveal a protein's function."

Referenced at: 03:40

Convergent evolution of enzyme active sites is not a rare phenomenon

Pier Federico Gherardini, Mark N Wass, Manuela Helmer-Citterich, Michael J E Sternberg

J. Mol. Biol., 2007

"This paper is cited to demonstrate that protein function is not determined by structure alone: similar structures can have different functions, and different structures can have identical functions, the latter indicating convergent evolution of active sites."

Referenced at: 04:02

Genomic language model predicts protein co-regulation and function

Yunha Hwang, Andre L Cornman, Elizabeth H Kellogg, Sergey Ovchinnikov, Peter R Girguis

Nat. Commun., 2024

"This is the speaker's own recently published work, introducing the concept of a genomic language model (gLM) that predicts protein co-regulation and function by learning patterns in genomes."

Referenced at: 05:56

Protein language models are biased by unequal sequence sampling across the tree of life

Frances Ding, Jacob Steinhardt

bioRxiv, 2024

"This paper is cited to highlight the problem of bias in protein sequence datasets, where sampling is unequal across the tree of life, which can impact the generalizability of foundation models."

Referenced at: 23:01

PLMSearch: Protein language model powers accurate and fast sequence search for remote homology

Wei Liu, Ziye Wang, R You, Chenghan Xie, Hong Wei, Yi Xiong

Nat. Commun., 2024

"This paper introduces PLMSearch, a method that uses protein language models for accurate and fast sequence search for remote homology, providing inspiration for the speaker's gLMSearch concept."

Referenced at: 24:59