Thursday, December 22, 2016

Artificial intelligence to generate new cancer drugs on demand

Summary:
  • Clinical trial failure rates for small molecules in oncology exceed 94% for molecules previously tested in animals and the costs to bring a new drug to market exceed $2.5 billion
  • There are around 2,000 drugs approved for therapeutic use by the regulators with very few providing complete cures
  • Advances in deep learning have demonstrated superhuman accuracy in many areas and are expected to transform industries where large amounts of training data are available
  • Generative Adversarial Networks (GANs), a technology introduced in 2014, represent the "cutting edge" in artificial intelligence, where new images, videos and voice can be produced by deep neural networks on demand
  • Here for the first time we demonstrate the application of Generative Adversarial Autoencoders (AAEs), a new type of GAN, for generation of molecular fingerprints of molecules that kill cancer cells at specific concentrations
  • This work is a proof of concept that opens the door to a cornucopia of meaningful molecular leads created according to given criteria
  • The study was published in Oncotarget and the open-access manuscript is available in the Advance Open Publications section
  • Authors speculate that in 2017 the conservative pharmaceutical industry will experience a transformation similar to the automotive industry with deep learned drug discovery pipelines integrated into the many business processes
  • The extension of this work will be presented at the "4th Annual R&D Data Intelligence Leaders Forum" in Basel, Switzerland, Jan 24-26th, 2017
Thursday, 22nd of December, Baltimore, MD - Scientists at the Pharmaceutical Artificial Intelligence (pharma.AI) group of Insilico Medicine, Inc., today announced the publication of a seminal paper demonstrating the application of generative adversarial autoencoders (AAEs) to generating new molecular fingerprints on demand. The study was published in Oncotarget on 22nd of December, 2016. The study represents the proof of concept for applying Generative Adversarial Networks (GANs) to drug discovery. The authors have significantly extended this model to generate new leads according to multiple requested characteristics and plan to launch a comprehensive GAN-based drug discovery engine producing promising therapeutic treatments to significantly accelerate pharmaceutical R&D and improve the success rates in clinical trials.
Since 2010 deep learning systems have demonstrated unprecedented results in image, voice and text recognition, in many cases surpassing human accuracy and enabling autonomous driving, automated creation of art and even composition of pleasant music.
GAN is a fresh direction in deep learning invented by Ian Goodfellow in 2014. In recent years GANs produced extraordinary results in generating meaningful images according to the desired descriptions. Similar principles can be applied to drug discovery and biomarker development. This paper represents a proof of concept of an artificially-intelligent drug discovery engine, where AAEs are used to generate new molecular fingerprints with the desired molecular properties. 
"At Insilico Medicine we want to be the supplier of meaningful, high-value drug leads in many disease areas with high probability of passing the Phase I/II clinical trials. While this publication is a proof of concept and only generates the molecular fingerprints with the very basic molecular properties, internally we can now generate entire molecular structures according to a large number of parameters. These structures can be fed into our multi-modal drug discovery pipeline, which predicts therapeutic class, efficacy, side effects and many other parameters. Imagine an intelligent system, which one can instruct to produce a set of molecules with specified properties that kill certain cancer cells at a specified dose in a specific subset of the patient population, then predict the age-adjusted and specific biomarker-adjusted efficacy, predict the adverse effects and evaluate the probability of passing the human clinical trials. This is our big vision", said Alex Zhavoronkov, PhD, CEO of Insilico Medicine, Inc. 
Previously, Insilico Medicine demonstrated the predictive power of its discovery systems in the nutraceutical industry. In 2017 Life Extension will launch a range of natural products developed using Insilico Medicine's discovery pipelines. Earlier this year the pharmaceutical artificial intelligence division of Insilico Medicine published several seminal proof of concept papers demonstrating the applications of deep learning to drug discovery, biomarker development and aging research. Recently the authors published a tool in Nature Communications, which is used for dimensionality reduction in transcriptomic data for training deep neural networks (DNNs). The paper published in Molecular Pharmaceutics demonstrating the applications of deep neural networks for predicting the therapeutic class of the molecule using the transcriptional response data received the American Chemical Society Editors' Choice Award. Another paper demonstrating the ability to predict the chronological age of the patient using a simple blood test, published in Aging, became the second most popular paper in the journal's history. 
"I am very happy to work alongside the Pharma.AI scientists at Insilico Medicine on getting the GANs to generate meaningful leads in cancer and, most importantly, age-related diseases and aging itself. This is humanity's most pressing cause and everyone in machine learning and data science should be contributing. The pipelines these guys are developing will play a transformative role in the pharmaceutical industry and in extending human longevity and we will continue our collaboration and invite other scientists to follow this path", said Artur Kadurin, the head of the segmentation group at Mail.Ru, one of the largest IT companies in Eastern Europe and the first author on the paper.
"Generative AAE is a radically new way to discover drugs according to the required parameters. At Pharma.AI we have a comprehensive drug discovery pipeline with reasonably accurate predictors of efficacy and adverse effects that work on the structural data and transcriptional response data and utilize the advanced signaling pathway activation analysis and deep learning. We use this pipeline to uncover the prospective uses of molecules, where these types of data are available. But the generative models allow us to generate completely new molecular structures that can be run through our pipelines and then tested in vitro and in vivo. And while it is too early to make ostentatious claims before our predictions are validated in vivo, it is clear that generative adversarial networks coupled with the more traditional deep learning tools and biomarkers are likely to transform the way drugs are discovered", said Alex Aliper, president, European R&D at the Pharma.AI group of Insilico Medicine. 
Recent advances in deep learning and specifically in generative adversarial networks have demonstrated surprising results in generating new images and videos upon request, even when using natural language as input. In this study the group developed a 7-layer AAE architecture with the latent middle layer serving as a discriminator. The AAE takes as input, and produces as output, a binary fingerprint vector together with the concentration of the molecule. In the latent layer the group introduced a neuron responsible for the tumor growth inhibition index, which, when negative, indicates a reduction in the number of tumor cells after treatment. To train the AAE, the authors used the NCI-60 cell line assay data for 6252 compounds profiled on the MCF-7 cell line. The output of the AAE was used to screen 72 million compounds in PubChem and select candidate molecules with potential anti-cancer properties.
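The screening step described above can be illustrated with a short Python sketch. The release does not say how the AAE output was matched against PubChem, so this sketch assumes one common choice: threshold the decoder output into a binary fingerprint and rank library compounds by Tanimoto similarity. The function names, the 8-bit fingerprints, the 0.5 threshold and the compound names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def binarize(probabilities: List[float], threshold: float = 0.5) -> List[int]:
    """Turn per-bit decoder outputs into a binary fingerprint (assumed threshold)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def tanimoto(a: List[int], b: List[int]) -> float:
    """Tanimoto similarity: shared 'on' bits divided by bits 'on' in either vector."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def screen(generated: List[int],
           library: Dict[str, List[int]],
           top_k: int = 2) -> List[Tuple[str, float]]:
    """Rank library compounds by similarity to the generated fingerprint."""
    scored = [(name, tanimoto(generated, fp)) for name, fp in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy decoder output standing in for a generated fingerprint (not real data).
decoder_output = [0.9, 0.1, 0.8, 0.7, 0.2, 0.6, 0.05, 0.3]
generated_fp = binarize(decoder_output)

# Toy stand-in for a compound library such as PubChem (hypothetical entries).
library = {
    "compound_A": [1, 0, 1, 1, 0, 1, 0, 0],
    "compound_B": [1, 0, 1, 0, 0, 1, 0, 1],
    "compound_C": [0, 1, 0, 0, 1, 0, 1, 0],
}

hits = screen(generated_fp, library)
print(hits)
```

A real screen over tens of millions of compounds would need packed bit vectors and an indexed search rather than a Python loop; the sketch only shows the ranking idea.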

###
About Insilico Medicine, Inc

Insilico Medicine, Inc. is a bioinformatics company located at the Emerging Technology Centers at the Johns Hopkins University Eastern campus in Baltimore with Research and Development ("R&D") resources in Belgium, UK and Russia hiring talent through hackathons and competitions. The company utilizes advances in genomics, big data analysis, and deep learning for in silico drug discovery and drug repurposing for aging and age-related diseases. The company pursues internal drug discovery programs in cancer, Parkinson's Disease, Alzheimer's Disease, sarcopenia, and geroprotector discovery. Through its Pharma.AI division, the company provides advanced machine learning services to biotechnology, pharmaceutical, and skin care companies. Brief company video: https://www.youtube.com/watch?v=l62jlwgL3v8

Tuesday, December 6, 2016

GeroScope -- a computer method to beat aging

Scientists give robots the 'menial task' of searching for the key to eternal life

Russian scientists from MIPT, in collaboration with Insilico Medicine Inc., were commissioned by the Center for Biogerontology and Regenerative Medicine to develop the GeroScope algorithm to identify geroprotectors - substances that extend healthy life. Hundreds of compounds were screened for geroprotective activity using computer simulations, and laboratory experiments were conducted on the ten substances that were identified using this algorithm. A research paper detailing the results of the study has been published in one of the top peer-reviewed journals in aging research, Aging.
Decades of hard work by highly-competent research teams and millions of dollars are spent on the process of developing new drugs. And the screening and development process of geroprotectors, interventions intended to combat aging, a complex multifactorial biological process affecting every cell in the human body, is even more tedious. Computer modeling techniques may significantly reduce the time and cost of development.
"The aging of the population is a global problem. Developing effective approaches for creating geroprotectors and validating them for use in the human body is one of the most important challenges for biomedicine. We have proposed a possible approach that brings us one step closer to solving this problem," said Alexey Moskalev, a corresponding member of the RAS and head of the Laboratory of Genetics of Aging and Longevity.
For several years the group studied cancer-related processes and relied on Oncofinder, an algorithm that analyzes the activation of molecular pathways by comparing gene expression in cancerous and normal healthy cells, and by comparing tissue samples from different patients. The researchers applied a similar approach to develop GeroScope, which compares changes in the cells of young and old patients and searches for drugs with minimal side effects that compensate for these changes.
To do this, the scientists analyzed transcriptomic data (information which is read from DNA and transcribed into RNA) in "young" (donors aged between 15 and 30 years) and "old" (donors over the age of 60) samples from many human tissue types. This data was used for advanced computer modeling to identify and re-construct the molecular pathways associated with aging. Molecular pathways are a sequence of reactions that lead to changes in a cell. The most common molecular pathways are involved in metabolism and signal transduction. GeroScope modeled molecular pathways and analyzed cell reactions to various substances. Having chosen 70 compounds from the database of geroprotective drugs, previously published by the research group in a paper titled "Geroprotectors.org: a new, structured and curated database of current therapeutic interventions in aging and age-related disease," the scientists used the new algorithm to identify 10 substances that could have geroprotector properties in accordance with the model.
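The idea of comparing pathway activity between "young" and "old" samples can be sketched in Python. This is a deliberately simplified illustration, not the GeroScope or Oncofinder formula: it assumes a pathway score equal to the sum, over the pathway's genes, of the gene's role (+1 for an activator, -1 for a repressor) times the log2 ratio of old to young expression. The gene names, expression values and roles below are hypothetical toy data.

```python
import math
from typing import Dict


def pathway_score(reference: Dict[str, float],
                  sample: Dict[str, float],
                  roles: Dict[str, int]) -> float:
    """Sum of role-weighted log2 expression ratios (sample vs. reference).

    A score of 0 means the pathway looks like the reference; a positive
    score means it is more active, a negative score less active.
    """
    score = 0.0
    for gene, role in roles.items():
        ratio = sample[gene] / reference[gene]
        score += role * math.log2(ratio)
    return score


# Hypothetical expression levels for one pathway (arbitrary units).
young = {"GENE_A": 10.0, "GENE_B": 8.0, "GENE_C": 4.0}
old = {"GENE_A": 20.0, "GENE_B": 4.0, "GENE_C": 4.0}
# Hypothetical "old" cells after treatment with a candidate compound.
old_treated = {"GENE_A": 10.0, "GENE_B": 4.0, "GENE_C": 4.0}

# +1: the gene activates the pathway; -1: the gene represses it.
roles = {"GENE_A": +1, "GENE_B": -1, "GENE_C": +1}

age_shift = pathway_score(young, old, roles)
residual_shift = pathway_score(young, old_treated, roles)
print(age_shift, residual_shift)
```

Under this toy model, a compound is a candidate when the residual shift after treatment is closer to zero than the age-related shift, meaning it moves the pathway back toward the "young" state.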
The GeroScope model was used to analyze the tissues of young and old patients, as well as cell lines. In order to experimentally verify the algorithm, the scientists took stem cell lines of human fibroblasts (connective tissue cells). Two effects were studied: cell "rejuvenation" and survival.
The experiments started with the measurement of many parameters of viable cells: the size, shape, complexity of the internal structure of the cell, etc. The cells were then mixed with a test substance and a growth medium and held in this state for 6, 12, and 18 days. The scientists then measured the same parameters as at the start of the experiment, as well as the level of senescence-associated β-galactosidase, which is considered one of the markers of aging.
The 10 test substances chosen by the computer model demonstrated different results in human cell assays. For example, NDGA has no effect on rejuvenation but decreases short- and long-term survival; Myricetin has a mild rejuvenating effect; and EGCG has a strong rejuvenating effect. NAC has a very mild rejuvenating effect but dramatically increases short- and long-term survival, while PD-98059 has a very strong rejuvenating effect and increases both short- and long-term survival.
The predictions made by the computer model were confirmed in cell cultures of human fibroblasts for several substances: PD-98059, NAC, Myricetin and EGCG. Some of these drugs are already actively sold individually as dietary supplements. Further analysis of the pathway-level effects of many of these compounds provided insights into possible combinations that maximize cumulative effects and minimize possible adverse effects.
"For computer modeling this is a very good result. In the pharmaceutical industry, 92% of drugs that are tested on animals fail during clinical trials in humans. The ability to simulate biological effects with such a high level of accuracy in silico is a real breakthrough. PD-98059 and NAC proved to be the strongest geroprotectors. We hope that some of these drugs will soon be tested on people using biologically-relevant biomarkers of aging," said Alex Zhavoronkov Ph.D., head of the Laboratory of Regenerative Medicine at the D. Rogachev Federal Research and Clinical Center for Pediatric Hematology, Oncology, and Immunology, an adjunct professor at MIPT, and head of Insilico Medicine Inc. (Emerging Technology Centers located at the Johns Hopkins University at Eastern Campus).
Earlier this year Alexey Moskalev and Alex Zhavoronkov collaborated on applying deep learning techniques to develop cost-effective biomarkers of aging from one of the most abundant data types, simple blood tests (Putin E, et al., "Deep biomarkers of human aging: Application of deep neural networks to biomarker development"). These and other biomarkers developed using deep learning techniques, commonly referred to as artificial intelligence, will be applied to validating the effects of geroprotectors in humans.
The GeroScope algorithm developed for geroprotector screening has thus been successfully validated using a series of experiments on human cells. A high correlation was demonstrated between the predictions made by the algorithm and experimental data. GeroScope will later be used to search for unknown substances with geroprotective effects as well as for compounds that may be used to treat a variety of age-related conditions.