Welcome to My Week in AI! Each week this blog will have the following parts:
- What I have done this week in AI
- An overview of an exciting and emerging piece of AI research
Progress Update
Applying models across domains
This week I have been working on applying seq2seq models to time series forecasting. These models are typically used in NLP applications, but because both tasks involve mapping an input sequence to an output sequence, they can also be used for time series forecasting. They are especially useful for multi-step forecasting, where the model predicts a sequence of future values rather than only the next one.
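To show how a time series can be framed as a sequence-to-sequence problem, here is a minimal sketch of the data preparation step only. The function name `make_windows`, the window lengths, and the toy series are assumptions for illustration and are not taken from my actual project; the model itself is not shown.

```python
def make_windows(series, input_len, horizon):
    """Slice a series into (encoder input, decoder target) pairs.

    Each pair holds `input_len` past values and the `horizon` values
    that immediately follow them, which is the multi-step target.
    """
    pairs = []
    for start in range(len(series) - input_len - horizon + 1):
        split = start + input_len
        past = series[start:split]
        future = series[split:split + horizon]
        pairs.append((past, future))
    return pairs


# Toy series (made-up values, purely illustrative).
series = [10, 12, 13, 15, 18, 21, 25]
pairs = make_windows(series, input_len=3, horizon=2)

for past, future in pairs:
    print(past, "->", future)
```

Each pair plays the same role as a source sentence and its translation in NLP: the encoder reads the past window and the decoder emits the future window one step at a time.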
Leveraging new resources
I have also spent time perusing DeepMind’s educational resources. They recently launched a learning resources page for people with all levels of experience in AI, from beginners to students to researchers. The highlighted resources include their podcast and YouTube channel featuring interviews and talks by AI scientists and engineers, a range of college-level lecture series and courses, and resources from global education initiatives that broaden access to AI research. In addition, they provide access to fascinating blog posts and research papers. I highly recommend browsing through the site!

Emerging Research
Identifying a chemical molecule’s target proteins with deep learning
As I mentioned last week, the research I am sharing in this post involves AI in drug discovery. In ‘IVS2vec: A tool of Inverse Virtual Screening based on word2vec and deep learning techniques,’ Zhang et al. present a framework for applying the Inverse Virtual Screening (IVS) technique to chemical molecules using deep learning¹. Research has found that, on average, each drug can bind to six target proteins, and in the early stages of drug development it is very useful to know what these target proteins might be. IVS is a method of identifying them.
The method presented by the authors, called IVS2vec, combines Mol2vec (based on Word2vec) with a dense neural network. Mol2vec proceeds as follows: a chemical compound is first translated into a SMILES string, so that the molecule can be treated much like a sentence; it is then split into substructures, or ‘words’; finally, Word2vec is applied to encode the molecule as a 300-dimensional vector.
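As a rough illustration of these encoding steps, here is a minimal Python sketch. It is not the authors' implementation and it makes two loud simplifications: the 'words' are individual SMILES tokens produced by a hypothetical regex tokenizer, rather than the chemical substructures Mol2vec uses, and the token vectors are deterministic pseudo-random placeholders standing in for a trained Word2vec model. The molecule vector is formed by summing its token vectors, which is one simple way of combining word vectors.

```python
import random
import re

EMBEDDING_DIM = 300  # vector size reported in the post

# Hypothetical, simplified SMILES tokenizer: bracket atoms, two-letter
# halogens, single-letter atoms, ring digits, and bond/branch symbols.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[A-Za-z]|\d|[=#()+\-/\\@.%]")


def tokenize(smiles):
    """Split a SMILES string into a list of 'words'."""
    return SMILES_TOKEN.findall(smiles)


def token_vector(token):
    """Placeholder for a trained Word2vec lookup.

    Seeding with the token makes the vector deterministic per token,
    but the values carry no chemical meaning.
    """
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]


def molecule_vector(smiles):
    """Encode a molecule as the sum of its token vectors."""
    vector = [0.0] * EMBEDDING_DIM
    for token in tokenize(smiles):
        for i, value in enumerate(token_vector(token)):
            vector[i] += value
    return vector


aspirin = "CC(=O)Oc1ccccc1C(=O)O"  # SMILES string for aspirin
words = tokenize(aspirin)
embedding = molecule_vector(aspirin)
print(len(words), "tokens ->", len(embedding), "dimensions")
```

In a real pipeline the tokenizer would be replaced by substructure extraction and the placeholder lookup by vectors learned from a large corpus of molecules.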
The researchers built a dataset using Mol2vec by extracting information from the PDBbind database, encoding the molecules as described, and then matching the encoded molecules with their corresponding target proteins. This dataset was then used to train a classifier: a densely connected neural network based on DenseNet². DenseNet was developed to address a problem in very deep networks, where some layers end up contributing little, by letting each layer pass its extracted features directly to all subsequent layers. As a result, each layer receives an aggregation of all the information extracted before it.
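The dense connectivity idea can be sketched in a few lines of plain Python. This is a toy illustration under my own assumptions, not the architecture from the paper: the layers are tiny fully connected layers with random placeholder weights, and the names `dense_layer`, `dense_block`, and `growth_rate` are chosen for illustration. The point is only to show that each layer's input is the concatenation of the original input and every earlier layer's output.

```python
import random


def dense_layer(inputs, out_features, rng):
    """One fully connected layer with ReLU and random placeholder weights."""
    outputs = []
    for _ in range(out_features):
        weights = [rng.uniform(-1.0, 1.0) for _ in inputs]
        activation = sum(w * x for w, x in zip(weights, inputs))
        outputs.append(max(0.0, activation))  # ReLU
    return outputs


def dense_block(x, num_layers, growth_rate, seed=0):
    """Run `num_layers` layers, each seeing all earlier features."""
    rng = random.Random(seed)
    features = list(x)
    input_sizes = []
    for _ in range(num_layers):
        input_sizes.append(len(features))
        new_features = dense_layer(features, growth_rate, rng)
        # Concatenate rather than replace, so nothing extracted earlier is lost.
        features = features + new_features
    return features, input_sizes


features, input_sizes = dense_block(
    [0.5, -1.0, 2.0, 0.1], num_layers=3, growth_rate=2
)
print("input size seen by each layer:", input_sizes)
print("final feature count:", len(features))
```

The input size grows by `growth_rate` at every layer, which reflects each layer having access to everything extracted before it.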

This framework worked very well in performing IVS on a holdout set from the PDBbind database, achieving a classification accuracy of over 91%. IVS2vec has the potential to speed up clinical trials and help researchers understand the effects and side effects of new drugs more easily. It also has applications in repurposing existing drugs for new uses, which is a much faster and less expensive process than de novo drug development.
Next week I will be presenting more of my work in AI and showcasing some cutting-edge AI research. Thanks for reading and I appreciate any comments/feedback/questions.
Finally, I have a personal update to share. It is with sadness that, after this week, I am leaving Blueprint Power, a company that helped me to grow, introduced me to many fantastic colleagues, and taught me much about engineering and software development. At the same time, I am very excited to be starting a new role as an AI Scientist at Emergent Dynamics, where I will be applying AI to help build an intelligent drug discovery platform with the aim of increasing the rate and effectiveness of drug development.
References
[1] Zhang, H., Liao, L., Cai, Y., Hu, Y., & Wang, H. (2019). IVS2vec: A tool of Inverse Virtual Screening based on word2vec and deep learning techniques. Methods (San Diego, Calif.), 166, 57–65. https://doi.org/10.1016/j.ymeth.2019.03.012
[2] Huang, G., Liu, Z., van der Maaten, L., & Weinberger, K. Q. (2017). Densely Connected Convolutional Networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr.2017.243
My Week in AI: Part 9 was originally published in Becoming Human: Artificial Intelligence Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.
Via https://becominghuman.ai/my-week-in-ai-part-9-8f7150de2752?source=rss—-5e5bef33608a—4
