The release of AlphaFold3 is the most talked-about event in the machine learning and life sciences communities. The new diffusion model, which converts the predicted contact matrix into atomic coordinates, appears to be the major technological breakthrough in version 3 and the source of its impressive boost in accuracy.
However ingenious the architecture of AlphaFold3 is, the quality of its predictions cannot surpass the quality of the training data. With a limited number of experimentally resolved protein-ligand complexes, it is naive to expect the model to be general enough to predict the binding of any compound to any known protein. The flagship model of Receptor.AI, ArtiDock, tackles this problem through multimodal data augmentation (https://lnkd.in/drVwhERi). As far as we can tell from the publication, AlphaFold3 doesn't use any data augmentation, so its validity across the broader chemical space of ligands still needs to be tested in real-world scenarios. It's telling that the AlphaFold3 web server does not allow you to dock an arbitrary small molecule to a protein; you can only choose a compound from a predefined list. We have also found that AlphaFold3 fails to produce correct transient protein-protein interfaces in cases where no similar structures are available in the PDB.
Another issue is the possible misuse of the PoseBusters benchmark. ArtiDock and AlphaFold3 now show comparable or better PoseBusters structure validity scores than classical docking, so the concerns initially highlighted by this dataset are no longer valid. Meanwhile, the ~94% score demonstrated by AlphaFold3 is so high that it raises concerns about potential model overfitting. It seems that we've already exhausted the usefulness of PoseBusters and need to stop emphasizing inflated metrics that lack practical utility. We urgently need a much larger and more diverse benchmarking dataset to assess AI docking and co-folding techniques.
Finally, AlphaFold3 is a heavyweight and slow model. If you're seeking an AI docking model for high-throughput screening, wait a bit until ArtiDock appears on Nvidia BioNeMo, or contact us to try it out in your drug discovery project right now. It is around 600 times faster than AlphaFold3 and comparable in real-world protein-ligand docking accuracy. In conclusion, AlphaFold3 is a fascinating model, but our excitement should be tempered by thorough evaluation and experience from practical application.
Leash Bio
Biotechnology Research
Salt Lake City, UT 1,267 followers
Unleashing machine learning to solve medicinal chemistry.
About us
Every single machine learning (ML)-driven breakthrough, from language translation to Go championships, was immediately preceded not by an algorithmic innovation but by a massive, well-controlled, purpose-built dataset. An ML-driven breakthrough in medicinal chemistry is going to require a similar dataset, but such high-quality, well-structured experimental data doesn’t exist yet. We're building that dataset by making physical measurements of lots of targets (with automated in-house protein production) screened against lots of compounds (with DNA-encoded chemical libraries). We're also building the ML models that will eventually be generalizable across any protein, and novel medicines using insights from our models.
- Website
- http://leash.bio
- Industry
- Biotechnology Research
- Company size
- 2-10 employees
- Headquarters
- Salt Lake City, UT
- Type
- Privately Held
- Founded
- 2021
Locations
- Primary: Salt Lake City, UT 84101, US
Updates
-
Super excited to be releasing AlphaFold 3 today, developed by Isomorphic Labs and Google DeepMind: our next generation AI model for predicting the biomolecular structures and interactions of proteins, DNA, RNA, small molecules, and more: https://lnkd.in/gY7deAqk
The AlphaFold 3 paper is published in Nature today, bringing together more training data from PDB with new neural net architectures and a diffusion module that generates the 3D coordinates of each atom. https://lnkd.in/gSZ5k2Dj
When looking at the accuracy of this model, for interactions between proteins and other molecule types we see at least a 50% improvement compared to existing methods, and for some important categories we have doubled the prediction accuracy.
We’ve been using these bleeding-edge models day-to-day at @IsomorphicLabs for drug design on our internal and partnership projects. There’s so much scope for advancing rational structure-based drug design! https://lnkd.in/g7df5p4Q
And @GoogleDeepMind have also developed AF Server, which makes a lot of these capabilities accessible for free for non-commercial research: https://lnkd.in/gJGC-xK3
Very proud of the teams across Isomorphic Labs and Google DeepMind for all the amazing research and work that has gone into this.
Read more: AF3 blog: bit.ly/44yfaCw AF3 for Drug Design: bit.ly/4a5o3EM Nature paper: bit.ly/3yaLLSL AF Server: bit.ly/3JWY1Zy
-
Leash Bio reposted this
It's been two weeks since we launched our Kaggle competition, and I want to single out Top Harvest Capital, Adam Ghobarah and Choongsoon Bae for their generous cosponsorship of the contest. That crew champions open science, and we're grateful to have them on the team. Thanks, guys!
-
Leash Bio reposted this
📣 Competition Launch Alert! Predict New Medicines with BELKA hosted by Leash Bio. 🎯 Predict the binding affinity of small molecules to specific protein targets. 💰 $50,000 Prize Pool ⏰ Entry Deadline: July 1, 2024 In this competition, you’ll develop machine learning (ML) models to predict the binding affinity of small molecules to specific protein targets. Learn more at https://lnkd.in/e6TAn88F
Leash Bio - Predict New Medicines with BELKA | Kaggle
kaggle.com
-
Leash Bio reposted this
Leash Biosciences, a biotech innovator, has secured $9.3 million in seed funding to advance AI in medicinal chemistry, aiming to transform drug discovery. With contributions from SpringTide Ventures, Metaplanet, and others, Leash plans to refine its ML model for predicting drug interactions, leveraging a vast database of over 17 billion protein-chemical interactions. Furthermore, the launch of the BELKA competition on Kaggle highlights Leash's dedication to open science, inviting global collaboration to push the boundaries of drug discovery. https://lnkd.in/euEqj8_h #Biotech #AI #MachineLearning #DrugDiscovery #LeashBiosciences #Innovation #SeedFunding #BELKACompetition
Leash Biosciences Secures $9.3M in Oversubscribed Seed Funding to Transform Medicinal Chemistry with AI
https://theaiinsider.tech
-
We launched our Kaggle competition last week, and today I want to single out Nathan Wilkinson for his tireless, diligent, creative work in putting it together. Nate built huge amounts of engineering infrastructure to make it happen and developed methods to prevent ML models from cheating with the utmost care, all while threading a tiny needle of tradeoff decisions to do so. Thank you, Nate!
Leash Bio - Predict New Medicines with BELKA | Kaggle
kaggle.com
-
We'll be at #AACR. Reach out to Becca Levin, PhD if you want to chat about our #Kaggle competition, our approach to scaling data collection for ML in Med Chem, or #ML in drug discovery generally.
-
Thoughts from our CEO, Ian Quigley, PhD, on the launch of our Kaggle competition, BELKA: https://lnkd.in/g47UEsVx #machinelearning #ai #ml #techbio #kaggle
Introducing BELKA, the biggest public molecule-protein interaction dataset on earth, generated by us at Leash Biosciences in the basement of my house.
Introducing BELKA
leashbio.substack.com
-
But wait, there's more! We have also launched a Kaggle competition, Predict New Medicines with BELKA (Big Encoded Library For Chemical Assessment). BELKA is a dataset of protein<>small molecule interactions of massive scale: 133M empirically derived interactions. The contest is to use machine learning to "look" at chemical structures and predict whether they will bind to one of three protein targets. We invite everyone in the AI/ML community to compete! Read more from Ian Quigley, PhD on BELKA here (https://lnkd.in/gifNp7Z9) and take part in the competition here (https://lnkd.in/g47UEsVx). Big thanks to the whole Leash team for getting the competition up! Brayden Halverson, Nathan Wilkinson, Andrew Blevins, Ben Miller
-
Honored to have our financing, our Kaggle competition, and, most importantly, team dog Belka covered by Andrew Dunn in Endpoints News this week! #machinelearning #techbio #AI #ML
A self-described group of "weirdos in this mountain state" are launching their startup with a challenge to the AI-bio world. Leash Bio has officially launched, recently closing a $7.9 million seed round, moving out of its longtime basement laboratory to a proper office space, and betting on the need for more data to fuel the next AI advances. That includes, most notably, an open contest that went live Thursday. AI groups are invited to test their own models predicting how molecules bind to proteins. Leash is releasing its own massive database on binding interactions — entrants can use it to train their models, or compete without Leash's data. “There’s a world where this is a huge mistake,” CEO Ian Quigley, PhD told me. “I don’t think that’s true, but it’s very possible. I could be really humiliated in a month, but no one can tell. And I think it’s bullshit that no one can tell. If everyone talks about how great they are, I look forward to seeing other datasets if we get a bunch of critics.” My latest for Endpoints News:
Exclusive: Ex-Recursion team launches Leash Bio, challenging AI field with its bet on needing more data
https://endpts.com