Image: Bernd Schroder
Retrosynthesis software is intended to revolutionize the efficiency of producing organic substances. One product in particular is currently attracting media attention, but not all chemists are fully convinced.
Why chemical reactions proceed, and what may result under which conditions in each case, is knowledge that to this day rests largely on empirically gained experience. Designing chemical syntheses is a very demanding task that requires years of experience from the chemists involved, as well as time and leisure. Today, cheminformatic solutions are being sought for this.
For the first time, chemists have now tested in practice whether a computer program is able to plan a complete chemical synthesis in all necessary steps, without any human supervision, or so it is said. That is not quite correct, however: in the end, the program's wisdom derives from 250 years of organic chemistry history, which was condensed into a set of rules for it.
The preparation of a chemical synthesis breaks down into three subproblems: synthesis planning itself, which seeks a suitable strategy for the desired target molecule; reaction planning, which determines the appropriate reaction conditions; and reaction prediction, which details the expected course of the reaction. An important concept: the synthesis is planned backwards, starting from the target molecule, that is, retrosynthetically.
With the arrival of new computer technology in the laboratories, interest grew in the idea of using computers to ease the planning of syntheses. These efforts have been known since the 1970s as CAOS (Computer Assisted Organic Synthesis).
One of the pioneers is Elias James Corey of Harvard University, later a Nobel laureate in chemistry, who took the first significant steps in the field with the synthesis simulator OCSS (Organic Chemical Simulation of Synthesis), which was followed by the well-known LHASA. Shortly thereafter, SECS (Simulation and Evaluation of Chemical Synthesis) and SYNCHEM appeared, the latter focusing on search problems typical of artificial intelligence (AI) approaches: in contrast to LHASA and SECS, SYNCHEM was designed to reach the goal largely on its own, independent of suggestions from a chemist.
To date, a variety of other programs have been published: CAMEO, EROS, WODCA or SYLVIA, for example. The big breakthrough, however, has failed to materialize: despite decades of research, there had so far been no reports of computers generating complete synthetic routes that were then successfully carried out in the laboratory. In most cases the knowledge base of chemical reactions was too limited, or the programs were not designed to navigate the huge terrain of synthetic possibilities in an intelligent manner. Because the number of possibilities for each retrosynthetic step is around 100, N steps yield 100^N possibilities. The challenge of finding a promising synthesis route from this starting position is obvious. What is needed here are intelligent algorithms that abandon unpromising paths on their own and focus the search on routes that are as efficient as possible.
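The scale of the problem, with roughly 100 options per retrosynthetic step over N steps, can be illustrated with a few lines of Python. This is only a sketch: the function names are hypothetical, and the pruning scheme shown (keeping a fixed number of best partial routes per step, often called beam search) is a generic stand-in for "intelligent algorithms", not the method of any program named in this article.

```python
def route_count(branching: int, steps: int) -> int:
    """Number of distinct routes if every step offers `branching` options."""
    return branching ** steps


def beam_candidates(branching: int, steps: int, beam_width: int) -> int:
    """Candidates examined if only the `beam_width` best partial routes
    are kept after each step (assumed pruning scheme, for illustration)."""
    frontier = 1   # partial routes currently kept
    examined = 0   # total candidate steps looked at
    for _ in range(steps):
        candidates = frontier * branching
        examined += candidates
        frontier = min(candidates, beam_width)
    return examined


if __name__ == "__main__":
    # Exhaustive enumeration versus a pruned search for a five-step synthesis.
    print(route_count(100, 5))
    print(beam_candidates(100, 5, beam_width=10))
```

Exhaustive enumeration grows exponentially with the number of steps, whereas the pruned variate grows only linearly; the price is that a pruned search can discard a route that would have turned out well later.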
Chematica: "Software that started thinking like a chemist"
Even today, programs are attempting to automate synthesis planning. One example is Chematica, which combines chemical expertise with powerful computers, network search and AI algorithms. Bartosz Grzybowski of the Ulsan National Institute of Science and Technology in South Korea and the Polish Academy of Sciences had worked on it for 15 years together with colleagues before he sold Grzybowski Scientific Inventions (GSI) to Merck Millipore in 2017.
The starting point was Grzybowski's realization that linking all known chemical compounds with the chemical reactions that connect them would open up an entirely novel knowledge platform: by linking every reaction ever carried out and every substance ever produced, a collective "chemical brain" could be created, which can then be searched with algorithms like those used at Google or in telecommunications networks. How strongly non-reproducible synthesis procedures affect the overall result remains unclear, however, because published recipes that do not work in reality cannot necessarily be recognized as such.
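The idea of a network that links compounds through reactions can be sketched as a small data structure. The class and method names below are hypothetical and say nothing about how Chematica stores its data; the two entries are well-known textbook reactions used purely as toy data.

```python
from collections import defaultdict


class ReactionNetwork:
    """Toy network: compounds are nodes, reactions link reactants to a product."""

    def __init__(self):
        self.reactions = []                # list of (reactants, product)
        self.made_by = defaultdict(list)   # compound -> reactions producing it
        self.used_in = defaultdict(list)   # compound -> reactions consuming it

    def add_reaction(self, reactants, product):
        index = len(self.reactions)
        self.reactions.append((tuple(reactants), product))
        self.made_by[product].append(index)
        for reactant in reactants:
            self.used_in[reactant].append(index)
        return index

    def precursors(self, compound):
        """Reactant sets known to yield `compound` (one step backwards)."""
        return [self.reactions[i][0] for i in self.made_by.get(compound, [])]

    def products_from(self, compound):
        """Products reachable from `compound` in one step forwards."""
        return [self.reactions[i][1] for i in self.used_in.get(compound, [])]


network = ReactionNetwork()
network.add_reaction(["phenol", "carbon dioxide"], "salicylic acid")
network.add_reaction(["salicylic acid", "acetic anhydride"], "acetylsalicylic acid")
```

Indexing each reaction by both its product and its reactants lets the same network be walked backwards from a target or forwards from a starting material.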
A general difficulty: storing the sheer amount of data that has accumulated over time. The number of published compounds in the CAS Registry database shows a rapid increase since the turn of the millennium. In 2015, the hundred-millionth compound was registered. In 2014 alone, more compounds were added than in the years from 1965 to 1990 combined. The volume of reaction data has also increased sharply in the recent past. These data are recorded in reaction databases such as ChemInform RX (CIRX), which holds more than 1.8 million reactions. Image: www.CAS.org
The algorithms are programmed to scan, within fractions of a second, billions of possible chemical reactions that could lead to a desired molecule. The program is mainly meant to support the search capacity of human chemists, who could simply be overwhelmed by the sheer number of possibilities. The program has reached its goal when it arrives at the required starting materials at the level of common chemicals: either commercially available ones (the Sigma-Aldrich catalog currently lists more than 200,000 of them) or, for synthetic chemists, the roughly 7 million molecules known from patents and the chemical literature.
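The stopping criterion described here, working backwards until every required starting material is a common chemical, can be sketched as a small recursive search. Everything in this example is an assumption made for illustration: the function name, the depth limit, the two-entry rule table and the short "catalog" of purchasable compounds. A real system searches vastly larger rule sets and ranks alternative routes rather than taking the first one found.

```python
def plan_route(target, rules, purchasable, max_depth=5):
    """Return the reaction steps leading to `target`, in forward order,
    or None if no route ends in purchasable starting materials."""
    if target in purchasable:
        return []                      # nothing to make, it can be bought
    if max_depth == 0:
        return None                    # give up on overly long routes
    for reactants in rules.get(target, []):
        steps = []
        for reactant in reactants:
            sub_route = plan_route(reactant, rules, purchasable, max_depth - 1)
            if sub_route is None:
                break                  # this disconnection is a dead end
            steps.extend(sub_route)
        else:
            return steps + [(target, reactants)]
    return None


# Toy data: product -> possible reactant sets (textbook reactions).
RULES = {
    "acetylsalicylic acid": [("salicylic acid", "acetic anhydride")],
    "salicylic acid": [("phenol", "carbon dioxide")],
}
CATALOG = {"phenol", "carbon dioxide", "acetic anhydride"}

route = plan_route("acetylsalicylic acid", RULES, CATALOG)
```

With this toy data the search returns two steps: first salicylic acid from phenol and carbon dioxide, then the target from salicylic acid and acetic anhydride. Adding salicylic acid to the catalog shortens the route to a single step, which shows how the size of the catalog determines where the search may stop.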
The "green chemistry" trend is also meant to be taken into account, by means of predefinable constraints with which, for example, reactions in environmentally harmful solvents can be avoided, or syntheses can be carried out using only water-soluble components.