A survey of how to use protein language models for protein design: Part 1
My first attempt at writing a very long-form (but still accessible and practical) introduction to protein language models
Hello fellow datanistas!
Have you ever wondered how generative AI is reshaping protein engineering? If so, I've just penned the first installment of a three-part series that dives deep into the world of Protein Language Models (PLMs) and their practical applications in protein design.
In this blog post, I unravel the complexities of PLMs, drawing parallels with natural language models like GPT-4, but with a twist: instead of generating text, these models are trained to produce sequences of amino acids. From patent-busting to deep mutational scans and beyond, the potential uses of PLMs in protein engineering are vast and varied.
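To make the GPT analogy concrete, here is a minimal sketch of what "generating amino acid sequences" can look like in practice. It uses ProtGPT2, one openly available GPT-2-style protein language model on Hugging Face, via the transformers text-generation pipeline; the model choice and sampling settings below are illustrative assumptions on my part, not the specific tools covered in the full article.

```python
from transformers import pipeline

# Illustrative only: ProtGPT2 is one publicly available autoregressive
# protein language model; the article surveys PLMs more broadly.
generator = pipeline("text-generation", model="nferruz/ProtGPT2")

# ProtGPT2 uses "<|endoftext|>" as its sequence delimiter, so prompting
# with it asks the model for de novo sequences. Sampling parameters here
# follow the model card's suggested defaults.
candidates = generator(
    "<|endoftext|>",
    max_length=100,
    do_sample=True,
    top_k=950,
    repetition_penalty=1.2,
    num_return_sequences=3,
    eos_token_id=0,
)

for candidate in candidates:
    # Each "sentence" the model writes is a string of amino acid letters,
    # wrapped across lines the way its training data was formatted.
    print(candidate["generated_text"].replace("\n", ""))
```

The same pattern (a language-model head sampling tokens one at a time) underlies the design workflows discussed in the article; only the vocabulary changes from words to residues.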
Whether you're a seasoned data scientist, a curious academic, or somewhere in between, this post aims to shed light on a topic that is as intriguing as it is vital to advancements in the life sciences. I invite you to read through, reflect on the examples provided, and consider the implications of such technologies in your own work.
You can find the full article here. I sincerely hope it provides you with new insights and inspiration.
If you find the content enlightening, please feel free to share it with colleagues or friends who might also appreciate the discussion—it's a great way to support our community's growth and understanding.
Thank you for your continued support and curiosity. Stay tuned for the next parts of this series, where we'll dive even deeper into the training and optimization of these fascinating models!
Cheers,
Eric