The article (and what I can access of the paper it is based on) doesn't really give any details as to what this class is, how it works etc. All the interesting parts about this aren't mentioned.
It sounds like they trained a classification model on 39,000 molecules with known activity against MRSA. The molecules are represented as vectorized text encodings of their structures. Once trained, the model can be run on arbitrary molecules to see which ones are predicted to have antibiotic properties, or at least activity against MRSA.
They likely fed in molecules from families of structures that seem likely to contain an antibiotic but are too numerous to test manually. They get a prediction of which ones are likely to have the properties they want, and then start the slow process of creating and testing the molecules in the lab.
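To make that guess concrete, here is a minimal sketch of "vectorize a text representation, then train a classifier". Everything in it is my assumption, not the paper's method: I'm assuming the text form is a SMILES string, I'm using character bigram counts as the vectorization, and the four training molecules and their labels are invented purely for illustration.

```python
import math
from collections import Counter

def featurize(smiles, n=2):
    """Count overlapping character n-grams in a SMILES string (assumed encoding)."""
    return Counter(smiles[i:i + n] for i in range(len(smiles) - n + 1))

def predict(weights, bias, feats):
    """Logistic score in (0, 1); higher means 'predicted active'."""
    z = bias + sum(weights.get(k, 0.0) * v for k, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=200, lr=0.1):
    """Plain stochastic gradient descent on log-loss."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for smiles, label in examples:
            feats = featurize(smiles)
            err = predict(weights, bias, feats) - label
            bias -= lr * err
            for k, v in feats.items():
                weights[k] = weights.get(k, 0.0) - lr * err * v
    return weights, bias

# Invented toy data: (SMILES, 1 = active, 0 = inactive). Labels are NOT real results.
TOY_DATA = [
    ("CC(=O)Nc1ccccc1", 1),
    ("CC(=O)Nc1ccc(O)cc1", 1),
    ("CCO", 0),
    ("CCCCCC", 0),
]

weights, bias = train(TOY_DATA)

# Score molecules the model has not seen; these scores are what you would
# use to decide which candidates are worth making and testing in the lab.
for smi in ["CC(=O)Nc1ccc(C)cc1", "CCCCO"]:
    print(smi, round(predict(weights, bias, featurize(smi)), 3))
```

A real model would be trained on tens of thousands of measured compounds and would almost certainly use a richer representation than bigram counts, but the shape of the workflow (encode, train on known labels, score unknowns) is the same.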
I get what they did (it's been something a lot of groups have been wanting to do for years), but I am curious what molecule specifically they found that worked especially well. That is, what does this thing look like? What is the new antibiotic's mechanism of action? None of those latter details are discussed. It's something we can only guess at.
My dad was allergic to practically every antibiotic. He only developed the allergy in his senior years. It was a big problem for him. Even if the antibiotic seemed to be working okay, he had to take a lot of Benadryl just in case and keep an EpiPen around.
Does this AI use the same process for piecing together things as LLMs do for art and writing? Is this a drug we have known about but not yet applied as an antibiotic or a whole new compound?
It doesn't sound like it, but the article doesn't have enough detail to say.
It sounds like they are using a classification model that takes a vectorized text representation of molecules and classifies or scores them by their expected properties or activity. They took 39,000 molecules with known activity against MRSA to train the model, I assume to classify the structures. Once trained, they can feed arbitrary molecules into the model and see which ones are predicted to have antibiotic properties, which they can then verify with bench work.
They likely fed in molecules from classes of likely candidate structures, and the model helped focus and direct the wet work.
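The "focus and direct the wet work" step is basically a ranking problem: score every candidate, then send only as many as the lab can handle. A rough sketch, where `shortlist`, `toy_score`, the five-molecule library, and the budget are all hypothetical names and numbers of mine; `toy_score` is a meaningless placeholder standing in for whatever the trained model would output.

```python
import heapq

def shortlist(candidates, score, budget):
    """Return the `budget` highest-scoring candidates, best first."""
    return heapq.nlargest(budget, candidates, key=score)

def toy_score(smiles):
    """Placeholder scorer (fraction of nitrogen characters); NOT a real activity model."""
    return smiles.upper().count("N") / len(smiles)

# Hypothetical candidate library as SMILES strings.
library = ["CCO", "CCN", "NCCN", "CCCCCC", "c1ccncc1"]

# Suppose the lab can only synthesize and test two compounds this round.
picks = shortlist(library, toy_score, budget=2)
print(picks)
```

With a real model you would swap `toy_score` for the model's predicted probability and run it over the whole screening library instead of five strings.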
I'm not up on the latest, but years ago I helped on a similar project that used FPGAs running statistical models to direct lab work.
LLMs progressed beyond cut and paste well over a year ago. They have shown understanding of what items are and how they behave and interact. I know it's popular here to call them parrots or whatever, but most people don't have access to the high-level stuff, and most seem afraid, snobby, or to be parroting things themselves.
Virtual screening libraries are usually some form of expanded chemical space, meaning they contain both real and previously unknown compounds. The article says the 12 million compounds screened virtually were commercially available, but I couldn't see enough of the Nature paper to verify. It could be that the virtual screening set was acquired from a private company, but that doesn't necessarily mean all the compounds are known.