Data-driven protein engineering: learning the sequence-function mapping from experimental data
ChEMS Seminar
Dr. Philip Romero
Postdoctoral Fellow
University of California, San Francisco
Proteins are amazingly diverse molecules that are capable of performing a wide variety of chemical and biological tasks. Such versatility presents tremendous opportunities for solving challenging human problems that range from medicine and agriculture to environmental protection and industrial chemistry. Despite this great potential, our ability to design proteins with tailor-made functions has been impeded by our limited understanding of these complex molecules.
Rational protein engineering relies on accurate models that relate a protein's sequence to its function. However, many molecular properties are extremely difficult to model because they may be poorly understood or involve subtle, possibly dynamic, structural changes. In this talk, I will present an alternative modeling approach in which statistical models are used to learn the relationship between protein sequence and function from experimental data. These data-driven methods can implicitly capture the numerous and possibly unknown factors that shape the sequence-function mapping. Using these models, I will describe an adaptive protein design algorithm that can efficiently identify optimized protein sequences. I will finish by describing my current work in high-throughput experimentation and how new technologies are being used to generate protein sequence-function data sets of unprecedented scale.
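To make the idea of learning a sequence-function mapping concrete, here is a minimal sketch in Python. It is not the speaker's method: the one-hot encoding, the ridge-regularized linear model fit by gradient descent, the greedy "pick the best predicted untested sequence" rule, and the toy measurements are all assumptions chosen for illustration. The names `one_hot`, `fit_ridge`, `predict`, and `propose_next` are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def one_hot(seq, alphabet=AMINO_ACIDS):
    """Encode a sequence as a flat 0/1 vector, one block per position."""
    return [1.0 if aa == letter else 0.0 for aa in seq for letter in alphabet]


def fit_ridge(X, y, l2=1e-3, lr=0.1, epochs=2000):
    """Fit a linear model with a bias term by plain gradient descent.

    Assumption: a linear (additive) model is only a stand-in for whatever
    statistical model one would actually use.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Residuals of the current model on the training data.
        errs = [sum(wi * xi for wi, xi in zip(w, x)) + b - yi
                for x, yi in zip(X, y)]
        grad_w = [sum(e * x[j] for e, x in zip(errs, X)) / n + l2 * w[j]
                  for j in range(d)]
        grad_b = sum(errs) / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(model, seq, alphabet=AMINO_ACIDS):
    """Predicted function value for one sequence."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, one_hot(seq, alphabet))) + b


def propose_next(model, candidates, measured, alphabet=AMINO_ACIDS):
    """Greedy design step: the untested candidate with the highest prediction."""
    untested = [s for s in candidates if s not in measured]
    return max(untested, key=lambda s: predict(model, s, alphabet))


# Toy, made-up measurements over a two-letter alphabet (not real data):
# each leucine (L) adds one unit of "function".
alphabet = "AL"
measured = {"AAA": 0.0, "LAA": 1.0, "ALA": 1.0, "AAL": 1.0}

X = [one_hot(s, alphabet) for s in measured]
model = fit_ridge(X, list(measured.values()))

candidates = ["".join(p) for p in product(alphabet, repeat=3)]
best = propose_next(model, candidates, measured, alphabet)
print(best)
```

On this toy data the model learns that L is beneficial at every position, so the proposed next sequence is LLL, which was never measured. In a real adaptive loop the proposed sequence would be tested experimentally, added to the data, and the model refit; a practical design rule would likely also account for prediction uncertainty rather than choosing greedily.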
Bio:
Phil Romero is currently a postdoctoral fellow in Adam Abate's lab at UCSF, where he is developing microfluidic technologies for protein engineering. He obtained his B.S.E. and M.S. degrees from Tulane University in Biomedical Engineering and Molecular Biology, respectively. As a graduate student at Caltech, he worked in Frances Arnold's laboratory, where he engineered proteins for a variety of applications, including medical imaging, cancer therapeutics, and biofuel production. His thesis research focused on developing new statistical methods that learn the relationship between protein sequence and function from experimental data.