Faculty of Informatics and Statistics, Department of Information and Knowledge Engineering (DIKE)

Date and time: February 15, 2018 (16:00 – 17:30).

Room: 473 NB


MolecRank: A Network Analysis Algorithm for Ranking Therapeutic Molecules in Biological Networks


  • Ahmed Abdeen Hamed, Merck & Co., Inc., Boston, MA, USA

Pharmaceutical scientists often search databases of therapeutic molecules (i.e., drugs) to answer drug discovery queries. In this paper we present a novel algorithm called MolecRank, a mechanism for ranking molecules. The algorithm traverses a network of biological features (e.g., therapeutic molecules, genes, proteins, chemical compounds, cell types, RNA, and disease names) extracted from publications (i.e., article abstracts). Starting with the PubMed web portal, we searched for two specific keywords to ensure the abstracts were relevant: "Merck", and "MK" for the MK Number that encodes a Merck molecule. The PubMed query returned 792 publicly available MEDLINE abstracts. From the extracted biological features we constructed a network: its nodes are the molecules and features such as genes, diseases, and chemical compounds, while its links are co-occurrences of a molecule and a feature within the same abstract. That is, if two nodes (e.g., a molecule and a disease) are mentioned in the same abstract, they are linked. The network is stored in a graph database (a triplestore) to make it accessible for querying. When a query is issued against the triplestore, the algorithm performs a post-processing ranking step whose purpose is to return the most relevant and most specific molecules first. For any given query, the results are interesting and produce unique rankings of the molecules in the network. Such a ranking cannot be achieved using any single network centrality measure (e.g., degree, eccentricity, closeness, betweenness, or PageRank), even though these measures are regarded as the gold standard in such cases.
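
The network construction described in the abstract can be illustrated with a small sketch. It is only a minimal example under stated assumptions, not the MolecRank algorithm itself: the input records, the node names (MK-A, gene:G1, and so on), the function names, and the "cooccursWith" predicate are all hypothetical. It links each molecule to every feature mentioned in the same abstract, lists the links as triples of the kind a triplestore might hold, and computes plain degree, one of the single centrality measures the abstract says is not sufficient on its own.

```python
from collections import defaultdict
from itertools import product

# Hypothetical, hand-made input: each abstract is reduced to the molecules
# and the other biological features that an extraction step found in it.
abstracts = [
    {"molecules": {"MK-A"}, "features": {"gene:G1", "disease:D1"}},
    {"molecules": {"MK-A", "MK-B"}, "features": {"disease:D1"}},
    {"molecules": {"MK-B"}, "features": {"gene:G2", "compound:C1"}},
]


def build_cooccurrence_network(records):
    """Return an adjacency map: node -> set of nodes it co-occurs with."""
    adjacency = defaultdict(set)
    for record in records:
        # Link every molecule to every feature mentioned in the same abstract.
        for molecule, feature in product(record["molecules"], record["features"]):
            adjacency[molecule].add(feature)
            adjacency[feature].add(molecule)
    return adjacency


def to_triples(records):
    """List the links as (subject, predicate, object) triples.

    The predicate name "cooccursWith" is an assumption for illustration.
    """
    triples = set()
    for record in records:
        for molecule, feature in product(record["molecules"], record["features"]):
            triples.add((molecule, "cooccursWith", feature))
    return sorted(triples)


def degree(adjacency):
    """Degree of each node: the number of distinct neighbours."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


network = build_cooccurrence_network(abstracts)
molecule_degrees = {
    node: d for node, d in degree(network).items() if node.startswith("MK-")
}
# Baseline ranking by degree only; MolecRank's own ranking step is not shown here.
ranking = sorted(molecule_degrees, key=molecule_degrees.get, reverse=True)
print(ranking)
```

A ranking by degree alone treats every link equally, which is one way to see why a query-specific post-processing step may be needed on top of a single centrality measure.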

Dr. Ahmed Abdeen Hamed is a biomedical informatics research scientist at Merck. Previously he was an interdisciplinary computer scientist in the University of Vermont's Social-Ecological Gaming and Simulation Lab, where he built tools to help doctors, researchers, and pharmaceutical companies. One of the lab's projects involved mining data on Twitter and other social media sites to find previously unknown dangerous drug interactions before they were reported in medical research libraries. Hamed and his research team developed a computer program that searches millions of tweets for names of drugs and builds a map of how they are connected. The program looked for possible connections between things like colon cancer and marijuana, as well as alcohol and oxidative damage (thought to play a role in Alzheimer’s). Dr. Hamed also developed an online database that allowed researchers to look for linkages across social media and the National Library of Medicine’s PubMed archive of studies for potential drug side effects that patients share, an initiative he hopes will one day function as an early warning system.
Dr. Hamed is currently in Prague on a short rotation to collaborate with pharmaceutical scientists and IT engineers on the MolecRank project.
