Background: Drug development is costly and time-consuming, and its outcome is uncertain. Researchers therefore urgently need approaches that reduce these costs. Computational identification of drug-target interactions (DTIs) has become a critical step in the early stages of drug discovery. Such computational methods aim to narrow the search space for novel DTIs and to elucidate the functional background of drugs. Most methods developed so far use binary classification to predict whether a drug and a target interact. Predicting the strength of the binding between a drug and its target is more informative, but also more challenging. If the binding is not strong enough, the DTI may not be useful. Hence, developing methods to predict drug-target affinity (DTA) is of significant importance.
Method: We extended the GraphDTA model from a dual-channel model to a triple-channel model. We treated the target protein sequences as time series and extracted their features with an LSTM network. For the drug, we considered both the molecular structure and the local chemical context: we retained the four network variants used in GraphDTA to extract the topological features of the drug, and we used a BiGRU to capture the local chemical context of the atoms in the drug. This yields one latent feature vector for the target and two for the drug. The three feature vectors are concatenated and fed into a two-layer fully connected (FC) network, which outputs the predicted binding affinity value.
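The three-channel design described in the Method can be illustrated with a minimal PyTorch sketch. It is not the authors' implementation: the class names (`DenseGCNLayer`, `TripleChannelDTA`), all layer sizes, the vocabulary sizes, the use of SMILES tokens as the BiGRU input, the dense-adjacency graph convolution (standing in for one of the GraphDTA graph network variants), and the max pooling over atoms are assumptions made only for illustration.

```python
import torch
import torch.nn as nn


class DenseGCNLayer(nn.Module):
    """Graph convolution on a dense adjacency matrix: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (batch, atoms, in_dim); adj: (batch, atoms, atoms), float 0/1 entries
        a_hat = adj + torch.eye(adj.size(-1), device=adj.device)  # add self-loops
        d_inv_sqrt = a_hat.sum(dim=-1).pow(-0.5)                  # degree >= 1, so this is finite
        a_norm = d_inv_sqrt.unsqueeze(-1) * a_hat * d_inv_sqrt.unsqueeze(-2)
        return torch.relu(self.linear(torch.bmm(a_norm, x)))


class TripleChannelDTA(nn.Module):
    """Hypothetical three-channel regressor: drug graph + drug sequence + protein sequence."""

    def __init__(self, atom_dim=78, smiles_vocab=64, protein_vocab=26,
                 embed_dim=128, hidden_dim=128, out_dim=128):
        super().__init__()
        # Channel 1: topological features of the drug graph
        self.gcn1 = DenseGCNLayer(atom_dim, hidden_dim)
        self.gcn2 = DenseGCNLayer(hidden_dim, out_dim)
        # Channel 2: local chemical context of the drug via a BiGRU (assumed SMILES tokens)
        self.smiles_embed = nn.Embedding(smiles_vocab, embed_dim, padding_idx=0)
        self.bigru = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.smiles_proj = nn.Linear(2 * hidden_dim, out_dim)
        # Channel 3: protein sequence treated as a time series via an LSTM
        self.protein_embed = nn.Embedding(protein_vocab, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.protein_proj = nn.Linear(hidden_dim, out_dim)
        # Two-layer fully connected head that regresses the affinity value
        self.head = nn.Sequential(
            nn.Linear(3 * out_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )

    def forward(self, atom_feats, adj, smiles_tokens, protein_tokens):
        g = self.gcn2(self.gcn1(atom_feats, adj), adj)
        graph_vec = g.max(dim=1).values                       # pool over atoms
        _, h_gru = self.bigru(self.smiles_embed(smiles_tokens))
        # h_gru: (2, batch, hidden) = final forward and backward states
        smiles_vec = self.smiles_proj(torch.cat([h_gru[0], h_gru[1]], dim=-1))
        _, (h_lstm, _) = self.lstm(self.protein_embed(protein_tokens))
        protein_vec = self.protein_proj(h_lstm[-1])           # final hidden state
        fused = torch.cat([graph_vec, smiles_vec, protein_vec], dim=-1)
        return self.head(fused).squeeze(-1)                   # (batch,)


if __name__ == "__main__":
    # Random toy inputs, only to show the expected tensor shapes.
    model = TripleChannelDTA()
    atom_feats = torch.randn(4, 30, 78)
    adj = (torch.rand(4, 30, 30) > 0.8).float()
    adj = ((adj + adj.transpose(1, 2)) > 0).float()           # make the graph undirected
    smiles_tokens = torch.randint(1, 64, (4, 100))
    protein_tokens = torch.randint(1, 26, (4, 200))
    print(model(atom_feats, adj, smiles_tokens, protein_tokens).shape)  # torch.Size([4])
```

Concatenating the three vectors before the FC head keeps each channel independent, so a different graph network variant could be swapped into the first channel without touching the other two.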
Result: We evaluated the model on the Davis and KIBA datasets, using 80% of the data for training and 20% for validation. Compared with the experimental results of GraphDTA, our model showed better performance.
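The 80/20 partition can be sketched in a few lines of standard-library Python. The text does not state how the partition was drawn, so the random shuffle over drug-target pairs, the fixed seed, the function name `split_train_validation`, and the placeholder samples are all assumptions for illustration rather than the protocol actually used.

```python
import random


def split_train_validation(samples, train_fraction=0.8, seed=0):
    """Shuffle samples reproducibly and split them into (training, validation) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    shuffled = list(samples)                 # copy so the caller's list is not modified
    random.Random(seed).shuffle(shuffled)    # local generator: global random state untouched
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Placeholder (drug, target, affinity) triples; not real Davis or KIBA entries.
    toy = [(f"drug_{i}", f"target_{i}", float(i)) for i in range(100)]
    train, validation = split_train_validation(toy)
    print(len(train), len(validation))  # 80 20
```

Fixing the seed makes the split repeatable, which matters when results are compared against a baseline trained on the same partition.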
Conclusion: In this paper, we modified the GraphDTA model to predict drug-target affinity. The model represents the drug as a graph and extracts its two-dimensional structural information with a graph convolutional neural network. In parallel, the drug and the protein target are each represented as sequences of word vectors, and recurrent networks (a BiGRU for the drug and an LSTM for the target) extract their time-series information. We demonstrate that the improved method outperforms the original GraphDTA on the Davis and KIBA benchmark datasets.
Keywords: Drug repurposing, drug-target affinity, SMILES, BiGRU, LSTM, proteins.