Francesca-Zhoufan Li

AI for Science & Engineering, currently focusing on machine learning for proteins

California Institute of Technology

Arnold Lab

Yue Group

Amazon AI4Science Fellow

NSF Graduate Research Fellowship

Biotech Leadership Training Program

About

With a broad interest in applying AI to science and engineering problems, I am currently focusing on the development and evaluation of machine learning-assisted protein engineering tools as a Bioengineering Ph.D. student at Caltech, co-advised by Frances Arnold and Yisong Yue. My main project is developing zero-shot predictors of non-native enzyme activity, building on my recent work, Evaluation of Machine Learning-Assisted Directed Evolution Across Diverse Combinatorial Landscapes. During my summer internship at Microsoft Research New England, I worked with Kevin K. Yang, Alex X. Lu, and Ava P. Amini on Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models, which was presented at ICML 2024.

I am passionate about making computational tools usable and accessible to wet-lab scientists and engineers. As part of a collaboration, I contributed to the data analysis and interactive visualization software for Long-read every variant Sequencing (LevSeq), a newly developed nanopore-based technology for rapid protein sequencing.

During and shortly after my time at the University of California, Berkeley, where I earned my B.S. in Bioengineering and my B.S. in Chemical Biology, I gained experience in various research areas: developing RNA-seq software tools at Zymergen, discovering genetic circuit components with Richard Murray, contributing to cancer immunotherapy and SARS-CoV-2 antibody therapeutics development with Shohei Koide, optimizing cell-free platforms at Tierra Biosciences, and undertaking metabolic engineering and synthetic biology tool-building projects at the Dueber Lab.

Outside of research, I enjoy being active outdoors, experiencing diverse cultures, solving fun puzzles, and doing minimalist iPhoneography. Given my personal background and journey, I am committed to promoting equitable opportunities and individualized education in STEM through mentoring, teaching, and outreach volunteering.

Download my resumé.

Interests
  • Machine learning
  • Protein engineering
  • Biomolecular tool development
  • Bioengineering
  • Interactive visualization
Education
  • Ph.D. in Bioengineering, 2025

    California Institute of Technology

  • B.S. in Bioengineering, 2019

    University of California, Berkeley

  • B.S. in Chemical Biology, 2019

    University of California, Berkeley

Experience Highlights

BioML Research Intern
Microsoft Research New England
Jun 2022 – Sep 2022 Cambridge, MA
Transfer learning for pretrained protein language models

Machine Learning for Proteins
Arnold Lab & Yue Group, Caltech
Jan 2021 – Present Pasadena, CA
  • Developing zero-shot predictors for non-native enzyme activities
  • Systematically analyzed multiple machine learning-assisted directed evolution strategies, including active learning and focused training using six distinct zero-shot predictors, across 16 diverse protein fitness landscapes
  • Facilitated the rapid generation of sequence-function data and tool development for constructing protein mutant libraries

Recent & Upcoming Talks

Generative and Experimental Perspectives for Biomolecular Design
Co-organized the Generative and Experimental Perspectives for Biomolecular Design (GEM) workshop at ICLR 2024 to foster collaboration between machine learning experts and experimental scientists.