Drug Repurposing Explorer
Abhik Seal

Introduction
This project uses side effect data from SIDER, which covers around 996 drugs and about 4500 side effects. To prepare the dataset, we mapped the 996 SIDER drugs to DrugBank IDs (DBIDs) using the PubChem Identifier Exchange Service. Mapping by PubChem CID correctly matched 545 compounds. We then mapped the compounds by their generic name and visually inspected the drugs, giving a dataset of 746 compounds. From these 746 drugs we removed those with a molecular weight above 1000 Da, along with some metals such as lithium and magnesium. The final dataset contains 727 drugs.

We collected the side effect information for the 727 drugs from SIDER, and the target information and ATC codes from DrugBank. ATC codes for some drugs were added manually from the KEGG database. Disease information for the 727 drugs was obtained from the Comparative Toxicogenomics Database (CTD) [8]. CTD is a resource of chemical-gene, chemical-disease, gene-disease and chemical-ontology interactions, curated by hand by the CTD biocurators. We extracted only the chemical-disease associations that relate to the therapeutic area, which helps to identify the diseases a drug may treat and whether new diseases could be identified. A user searching for a drug in Drug Repurposing Explorer should already know the drug and what it is used for.
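The molecular weight and metal filter described above could be sketched as follows. This is only an illustration: the record layout, the placeholder drug names and the weight of the large compound are hypothetical, and the metal list contains only the two examples named in the text.

```python
from typing import List, Tuple

# Assumption: only the two metals named in the text are listed here.
EXCLUDED_METALS = {"lithium", "magnesium"}
MAX_MOLECULAR_WEIGHT_DA = 1000.0


def keep_drug(name: str, molecular_weight: float) -> bool:
    """Keep a drug unless it is heavier than 1000 Da or is an excluded metal."""
    if molecular_weight > MAX_MOLECULAR_WEIGHT_DA:
        return False
    return name.lower() not in EXCLUDED_METALS


def filter_drugs(records: List[Tuple[str, float]]) -> List[str]:
    """Return the names of the drugs that pass the filter, in input order."""
    return [name for name, weight in records if keep_drug(name, weight)]


# Hypothetical records: (generic name, molecular weight in Da).
records = [
    ("drug_small", 350.0),
    ("drug_large", 1450.0),
    ("Lithium", 6.94),
]
kept = filter_drugs(records)
```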
Terms in the website:
1) ECFP6: ECFP6 fingerprints calculated using the academic version of Accelrys Pipeline Pilot. This fingerprint type is used in the tool because it gave a significant correlation with protein target similarity, which indicates that searches by ECFP6 and by Proteins will return some of the same top similar compounds. Tanimoto similarity is calculated for the compounds using ECFP6.
2) Pubchem: the 881-bit PubChem fingerprints calculated using rcdk. Tanimoto similarity is calculated for the compounds using the PubChem fingerprints.
3) Proteins: similarity between compounds that share protein targets. The compound target information is obtained from DrugBank and converted to a matrix. A fast Dice coefficient algorithm is applied to calculate the similarity given in equation 1. The Dice and Tanimoto results were found to agree: the two coefficients are monotonically related, so they rank compounds in the same order.
$$ D(X, Y) = \frac{2\,|X \cap Y|}{|X| + |Y|} \qquad (1) $$
where X and Y are the sets of features (protein targets or side effects) of the two compounds being compared.
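A minimal sketch of equation 1, assuming each compound is represented by the set of its features (protein targets here). The drug and target names are hypothetical placeholders, and the tool's own "fast" implementation is not described in this manual, so this shows only the plain definition. It also checks the known identity T = D / (2 - D) between the Tanimoto and Dice coefficients, which is why both give the same ranking.

```python
from typing import Dict, List, Set, Tuple


def dice(a: Set[str], b: Set[str]) -> float:
    """Dice coefficient 2|A n B| / (|A| + |B|); 0.0 when both sets are empty."""
    total = len(a) + len(b)
    if total == 0:
        return 0.0
    return 2.0 * len(a & b) / total


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Tanimoto coefficient |A n B| / |A u B|; 0.0 when both sets are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_dice(query: str, profiles: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Rank every other compound by Dice similarity to the query, best first."""
    q = profiles[query]
    scores = [(name, dice(q, feats)) for name, feats in profiles.items() if name != query]
    # Sort by descending similarity, breaking ties by name for a stable order.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical target sets, for illustration only.
targets = {
    "drug_a": {"P1", "P2", "P3"},
    "drug_b": {"P2", "P3"},
    "drug_c": {"P3", "P4", "P5", "P6"},
    "drug_d": {"P7"},
}
ranking = rank_by_dice("drug_a", targets)

d = dice(targets["drug_a"], targets["drug_b"])
t = tanimoto(targets["drug_a"], targets["drug_b"])
```

The same functions apply unchanged to side effect sets, since the side effect similarity uses the same equation.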
4) ROCS: comparison of compounds based on 3D shape and color, with a maximum of 400 conformers for each molecule, used to generate a 727 x 727 matrix.
5) Side Effect: compound similarity via side effects is calculated in the same way as protein similarity, using the algorithm given in equation 1. The 727 drugs are mapped to their side effects, giving a 727 x 4500 matrix from which the similarity is calculated.

Using Drug Repurposing Explorer:
a) Search via any one Option
1) A user can search by a single option, for example by side effect, to list compounds with similar side effects. A search cannot be made for a drug that is not in the database. The search box automatically suggests drug names as the user types. Diagram 1 shows how to search. In addition, to see the diseases associated with the top similar compounds, the user can select the Disease checkbox; the search then also returns each drug name with its associated diseases.
Diagram 1: Side effect similarity search, and side effect and Disease search.
Results
When searching by one option, for example side effects only, the results page returns a table of the top similar drugs for that option. Diagram 2 shows the results page for a search by side effect with the Disease checkbox selected. The results table lists the top similar drugs by side effect, with their similarity calculated using equation 1, their ATC codes, and the Wikipedia link for each drug. The disease table lists the top similar drugs along with their disease information.
Note: A drug that is ranked among the top results for an option may be missing from the disease table, because the current version has disease information mapped for only 680 drugs. Later versions will include more drugs and disease information.
Diagram 2: Results page, with the Wikipedia link indicated.
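The note about drugs missing from the disease table can be illustrated with a small sketch. It assumes the disease information is held as a mapping from drug name to a list of diseases; the function name, the drug names and the disease names are all hypothetical, and this is not necessarily how the website builds its tables.

```python
from typing import Dict, List, Tuple


def disease_table(
    ranked_drugs: List[str],
    drug_to_diseases: Dict[str, List[str]],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Build (drug, disease) rows in rank order and collect drugs with no disease data."""
    rows: List[Tuple[str, str]] = []
    missing: List[str] = []
    for drug in ranked_drugs:
        diseases = drug_to_diseases.get(drug)
        if diseases:
            rows.extend((drug, disease) for disease in diseases)
        else:
            # The drug is ranked by similarity but has no mapped disease information.
            missing.append(drug)
    return rows, missing


# Hypothetical search result and disease mapping.
ranked = ["drug_b", "drug_d", "drug_c"]
mapping = {
    "drug_b": ["disease_x", "disease_y"],
    "drug_c": ["disease_z"],
}
rows, missing = disease_table(ranked, mapping)
```

Here drug_d appears in the similarity ranking but not in the disease rows, which is the situation the note describes.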
b) Search via any two or more Options
Besides selecting one option, a user can select two or more options for a search, for example ECFP6 or Pubchem together with Side Effect. Each selected option has a dropdown box where the user can set a weight, giving a weighted combo search. This type of search is very useful because, in drug discovery and repurposing, structurally varied compounds can have similar side effects, and a protein target can be hit by structurally varied drugs. Selecting the Disease checkbox also returns the disease information. Example queries:
a) Search for compounds that have low structural similarity but high side effect similarity (example weights: Side Effect = 0.9, ECFP6 = 0.1).
b) Search for compounds that have high protein sharing similarity and low 3D shape similarity (ROCS).
c) Search for compounds with equal side effect and structural similarity weights, such as (0.5, 0.5) or (0.3, 0.3). Any value can be used when searching with equal weights.
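One way to picture a weighted combo search is as a weighted sum of the per-option similarities to the query drug. The manual does not state how the website combines the scores, so the weighted sum is an assumption, and the function name, drug names and similarity values are hypothetical.

```python
from typing import Dict, List, Tuple


def combo_search(
    similarities: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Rank drugs by the weighted sum of their similarities to the query drug.

    similarities maps an option name to {drug: similarity to the query}.
    A drug missing from an option is treated as having similarity 0.0 there.
    """
    unknown = set(weights) - set(similarities)
    if unknown:
        raise ValueError(f"no similarity data for options: {sorted(unknown)}")
    drugs = set()
    for option in weights:
        drugs.update(similarities[option])
    scored = []
    for drug in drugs:
        score = sum(w * similarities[opt].get(drug, 0.0) for opt, w in weights.items())
        scored.append((drug, score))
    # Best score first; ties are broken by name so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical similarities of three drugs to a query drug.
similarities = {
    "side_effect": {"drug_b": 0.9, "drug_c": 0.2, "drug_d": 0.5},
    "ecfp6": {"drug_b": 0.1, "drug_c": 0.8, "drug_d": 0.5},
}
side_effect_heavy = combo_search(similarities, {"side_effect": 0.9, "ecfp6": 0.1})
structure_heavy = combo_search(similarities, {"side_effect": 0.1, "ecfp6": 0.9})
equal_half = combo_search(similarities, {"side_effect": 0.5, "ecfp6": 0.5})
equal_third = combo_search(similarities, {"side_effect": 0.3, "ecfp6": 0.3})
```

Under this assumption, equal weights of (0.5, 0.5) and (0.3, 0.3) scale every score by the same factor, so they produce the same ranking, which is consistent with query c) allowing any value for equal weights.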
Case study 1:
Searching for Similar group of compounds.
In this case study I want all the statins to be top ranked using a combo search. The statins are fluvastatin, rosuvastatin, cerivastatin, pravastatin, lovastatin and simvastatin. No single search option ranks all the statins at the top, because they have varied structures, share different proteins, and some of them have different side effects. But when a combo search is made with ECFP fingerprints, Proteins and Side Effect with weights 0.2, 0.3 and 0.5 respectively, the statins are ranked at the top of the search. This type of combo search is very powerful in drug discovery research and could be useful for drug repurposing.
The main criterion for a combo search is the choice of weights; a user should select the weights carefully before performing the search. Table 1 compares the different searches. The numbers in brackets represent the rank.
Proteins             SideEffect          ECFP + Proteins + SideEffect (0.2, 0.3, 0.5)
cerivastatin (114)   Rosuvastatin (22)   sitagliptin
Lovastatin (365)     Simvastatin (11)    Fluvastatin (226)
Note: If you find something interesting while using this tool, please mail your results to ... This will help to build a good manual for users searching the database.
If you are using Drug Repurposing Explorer for your research and get good results, don't forget to cite the website. The article is in preparation.
A comment link is provided for your valuable comments.