Overview
| What's ADMETlab

The ADMETlab platform provides a user-friendly, freely available web interface for systematic ADMET evaluation of chemical compounds, based on a comprehensive database of 288,967 entries. It contains four main modules: 'Druglikeness analysis', 'ADMET prediction', 'Systematic evaluation' and 'Similarity searching'. These modules are described in detail below.


| Main features
  • Comparatively large datasets for most properties
  • Better and more robust SAR/QSAR models
  • Systematic analysis and comparison
  • Constructive suggestions for molecular optimization
  • Batch computation
  • User-friendly interface
| Main functionalities
ADMETlab can help researchers and PK specialists with the following tasks:

Druglikeness analysis.

ADME/T evaluation.

Similarity searching based on ADME/T database.

Systematic ADME/T assessment.

Druglikeness analysis
| Druglikeness rules

Druglikeness rules are expert criteria used in drug design to judge how "druglike" a substance is with respect to factors like bioavailability. Here, we selected 5 commonly used rules and provide them to users:

  • Lipinski's rules: MW <= 500; logP <= 5; Hacc <= 10; Hdon <= 5
  • Ghose's rules: -0.4 < MclogP < 5.6 (mean: 2.52); 160 < MW < 480 (mean: 357);
    40 < MR < 130 (mean: 97); 20 < natoms < 70 (mean: 48)
  • Oprea's rules: nrings >= 3; nrigidbond >= 18; nRotbond >= 6
  • Veber's rules: nRotbond <= 10; tPSA <= 140 or Hacc + Hdon <= 12
  • Varma's rules: MW <= 500; tPSA <= 125; -2 < logD < 5; Hacc + Hdon <= 9; nRotbond <= 12
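
As a rough illustration of how such rule sets can be applied, the sketch below checks Lipinski's and Veber's rules against precomputed descriptor values. The `Descriptors` container and the function names are hypothetical (they are not part of ADMETlab), the descriptor values are assumed to come from a separate descriptor calculator, and Veber's rule is read here as "nRotbond <= 10 and (tPSA <= 140 or Hacc + Hdon <= 12)".

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptor values for one molecule (hypothetical container)."""
    mw: float        # molecular weight
    logp: float      # octanol/water partition coefficient
    hacc: int        # number of hydrogen-bond acceptors
    hdon: int        # number of hydrogen-bond donors
    n_rotbond: int   # number of rotatable bonds
    tpsa: float      # topological polar surface area


def lipinski_violations(d: Descriptors) -> list[str]:
    """Return the Lipinski criteria that the molecule violates."""
    checks = {
        "MW <= 500": d.mw <= 500,
        "logP <= 5": d.logp <= 5,
        "Hacc <= 10": d.hacc <= 10,
        "Hdon <= 5": d.hdon <= 5,
    }
    return [name for name, ok in checks.items() if not ok]


def passes_veber(d: Descriptors) -> bool:
    """Veber, as read above: nRotbond <= 10 and (tPSA <= 140 or Hacc + Hdon <= 12)."""
    return d.n_rotbond <= 10 and (d.tpsa <= 140 or d.hacc + d.hdon <= 12)


# Illustrative values only (roughly aspirin-like, not taken from ADMETlab).
example = Descriptors(mw=180.16, logp=1.2, hacc=4, hdon=1, n_rotbond=3, tpsa=63.6)
print(lipinski_violations(example))  # []
print(passes_veber(example))         # True
```

Returning the list of violated criteria, rather than a single pass/fail flag, makes it easier to see which property would need to change during optimization.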

| Druglikeness model

We collected 6731 drugs from the DrugBank database as positive samples of druglikeness. Then 6769 molecules were picked as negative samples from molecules in the ChEMBL database with IC50 or Ki values below 10000 nM, using the self-organizing map (SOM) method. The SOM method ensures that the negative samples are picked from clusters similar to those of the positive samples. This tends to give the model better discriminatory power and a proper applicability domain. Finally, we obtained a classification model with an accuracy of 0.801 on the training set by 5-fold cross validation. This model can not only identify active compounds among chemical entities but also distinguish potential drug candidates from active compounds.

ADMET prediction
| Data summary

For all the ADMET-related properties, we collected the corresponding data mainly from two sources: the published literature and the DrugBank database (http://www.drugbank.ca). After several pretreatment steps, we obtained 30 datasets in total. An overview of these ADMET datasets is given in Table S1 below.

Table S1. The number of end-points of each property

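Category | Property | Total | Positive | Negative | Train | Test
Basic physicochemical property | LogS | 5220 | - | - | 4116 | 1104
Basic physicochemical property | LogD7.4 | 1031 | - | - | 773 | 258
Basic physicochemical property | LogP |  |  |  |  | 
Absorption | Caco-2 | 1182 | - | - | 886 | 296
Absorption | Pgp-Inhibitor | 2297 | 1372 | 925 | 1723 | 574
Absorption | Pgp-Substrate | 1252 | 643 | 609 | 939 | 313
Absorption | HIA | 970 | 818 | 152 | 728 | 242
Absorption | F (20%) | 1013 | 759 | 254 | 760 | 253
Absorption | F (30%) | 1013 | 672 | 341 | 760 | 253
Distribution | PPB | 1822 | - | - | 1368 | 454
Distribution | VD | 544 | - | - | 408 | 136
Distribution | BBB | 2237 | 540 | 1697 | 1678 | 559
Metabolism | CYP1A2-Inhibitor | 12145 | 5713 | 6432 | 9145 | 3000
Metabolism | CYP1A2-Substrate | 396 | 198 | 198 | 297 | 99
Metabolism | CYP3A4-Inhibitor | 11893 | 5047 | 6846 | 8893 | 3000
Metabolism | CYP3A4-Substrate | 1020 | 510 | 510 | 765 | 255
Metabolism | CYP2C19-Inhibitor | 12272 | 5670 | 6602 | 9272 | 3000
Metabolism | CYP2C19-Substrate | 312 | 156 | 156 | 234 | 78
Metabolism | CYP2C9-Inhibitor | 11720 | 3960 | 7760 | 8720 | 3000
Metabolism | CYP2D6-Inhibitor | 12726 | 2342 | 10384 | 9726 | 3000
Metabolism | CYP2C9-Substrate | 784 | 278 | 506 | 626 | 156
Metabolism | CYP2D6-Substrate | 816 | 352 | 464 | 611 | 205
Excretion | Clearance | 544 | - | - | 408 | 136
Excretion | T1/2 | 544 | - | - | 408 | 136
Toxicity | hERG | 655 | 451 | 204 | 392 | 263
Toxicity | H-HT | 2171 | 1435 | 736 | 1628 | 543
Toxicity | Ames | 7619 | 4252 | 3367 | 5714 | 1905
Toxicity | SkinSen | 404 | 274 | 130 | 323 | 81
Toxicity | LD50 of acute toxicity | 7397 | - | - | 5917 | 1480
Toxicity | DILI | 475 | 236 | 239 | 380 | 95
Toxicity | FDAMDD | 803 | 442 | 361 | 643 | 160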

| Data sources

The models are available to the public at http://github.com/ifyoungnet/ADMETlab. For other datasets or resources, please email oriental-cds@163.com or biomed@csu.edu.cn to request a download link.

| Model summary

To obtain robust and reliable QSAR models for ADMET property prediction, we constructed a series of models for each property and selected the best one. Six methods (RF, SVM, RP, PLS, NB, DT) and seven types of descriptors (2D, Estate, MACCS, ECFP2, ECFP4, ECFP6, FP2) were applied in the modeling process. The best model for each property and its performance are listed in the tables below (Table S2, Table S3).


| Model Results

Table S2. The best regression models for the corresponding ADME/T-related properties

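Property | Method | Features | mtry | R2 | Q2 | R2T | RMSEF | RMSECV | RMSET
LogS | RF | 2D | 10 | 0.995 | 0.967 | 0.957 | 0.138 | 0.369 | 0.436
LogD7.4 | RF | 2D | 14 | 0.983 | 0.877 | 0.874 | 0.228 | 0.614 | 0.605
LogP |  |  |  |  |  |  |  |  | 
Caco-2 | RF | 2D | 14 | 0.973 | 0.845 | 0.824 | 0.121 | 0.289 | 0.290
PPB | RF | 2D | 8 | 0.954 | 0.691 | 0.682 | 7.124 | 18.443 | 18.044

Property | Method | Features | mtry | 2-fold rate (CV/Test) | 3-fold rate (CV/Test)
VD | RF | 2D | 10 | 0.819/0.801 | 0.912/0.904
CL | RF | 2D | 10 | 0.760/0.816 | 0.877/0.897
T1/2 | RF | 2D | 12 | 0.762/0.699 | 0.897/0.824
LD50 of acute toxicity | RF | 2D | 5 | 0.986/0.987 | 0.998/0.997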
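
The page does not define the regression metrics in Table S2. The sketch below assumes R2 and RMSE carry their usual meaning, and that an n-fold rate is the fraction of predictions lying within a factor of n of the experimental value. Both the definitions and the function names are assumptions made for illustration, not ADMETlab's code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between experimental and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def fold_rate(y_true, y_pred, fold):
    """Fraction of predictions within `fold`-fold of the experimental value.

    Assumed definition; all values must be positive (e.g. VD in L/kg).
    """
    hits = sum(1 for t, p in zip(y_true, y_pred) if max(t / p, p / t) <= fold)
    return hits / len(y_true)


# Made-up experimental and predicted values, for illustration only.
experimental = [1.0, 2.0, 4.0, 8.0]
predicted = [1.5, 1.0, 4.0, 20.0]
print(fold_rate(experimental, predicted, 2))  # 0.75
print(fold_rate(experimental, predicted, 3))  # 1.0
```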

Table S3. The best classification models for the corresponding ADME/T-related properties

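CV: five-fold cross validation; Ext: external validation dataset.

Property | Method | Features | CV Sensitivity | CV Specificity | CV Accuracy | CV AUC | Ext Sensitivity | Ext Specificity | Ext Accuracy | Ext AUC
HIA | RF | MACCS | 0.820 | 0.743 | 0.782 | 0.846 | 0.801 | 0.743 | 0.773 | 0.831
F (20%) | RF | MACCS | 0.731 | 0.647 | 0.689 | 0.759 | 0.680 | 0.663 | 0.671 | 0.746
F (30%) | RF | ECFP6 | 0.743 | 0.605 | 0.669 | 0.715 | 0.751 | 0.601 | 0.667 | 0.718
BBB | SVM | ECFP2 | 0.962 | 0.813 | 0.926 | 0.948 | 0.993 | 0.854 | 0.962 | 0.975
Pgp-inhibitor | SVM | ECFP4 | 0.887 | 0.789 | 0.848 | 0.908 | 0.863 | 0.802 | 0.838 | 0.913
Pgp-substrate | SVM | ECFP4 | 0.839 | 0.807 | 0.824 | 0.899 | 0.826 | 0.854 | 0.840 | 0.905
CYP1A2-Inhibitor | SVM | ECFP4 | 0.833 | 0.864 | 0.849 | 0.928 | 0.853 | 0.880 | 0.867 | 0.939
CYP1A2-Substrate | RF | ECFP4 | 0.768 | 0.636 | 0.702 | 0.801 | 0.768 | 0.637 | 0.702 | 0.802
CYP3A4-Inhibitor | SVM | ECFP4 | 0.759 | 0.858 | 0.817 | 0.901 | 0.788 | 0.860 | 0.829 | 0.909
CYP3A4-Substrate | RF | ECFP4 | 0.798 | 0.716 | 0.757 | 0.835 | 0.819 | 0.679 | 0.749 | 0.835
CYP2C19-Inhibitor | SVM | ECFP2 | 0.826 | 0.819 | 0.822 | 0.893 | 0.812 | 0.825 | 0.819 | 0.899
CYP2C19-Substrate | RF | ECFP4 | 0.735 | 0.744 | 0.740 | 0.816 | 0.871 | 0.667 | 0.769 | 0.853
CYP2C9-Inhibitor | SVM | ECFP4 | 0.719 | 0.898 | 0.837 | 0.900 | 0.730 | 0.882 | 0.830 | 0.894
CYP2C9-Substrate | RF | ECFP4 | 0.746 | 0.709 | 0.728 | 0.819 | 0.746 | 0.709 | 0.734 | 0.824
CYP2D6-Inhibitor | RF | ECFP4 | 0.770 | 0.811 | 0.793 | 0.868 | 0.771 | 0.812 | 0.795 | 0.882
CYP2D6-Substrate | RF | ECFP4 | 0.765 | 0.730 | 0.748 | 0.823 | 0.792 | 0.730 | 0.760 | 0.833
hERG | RF | 2D | 0.908 | 0.700 | 0.844 | 0.879 | 0.888 | 0.762 | 0.848 | 0.873
H-HT | RF | 2D | 0.780 | 0.520 | 0.689 | 0.710 | 0.785 | 0.487 | 0.681 | 0.683
Ames | RF | MACCS | 0.800 | 0.841 | 0.820 | 0.890 | 0.848 | 0.816 | 0.834 | 0.897
SkinSen | RF | MACCS | 0.685 | 0.727 | 0.706 | 0.760 | 0.715 | 0.727 | 0.731 | 0.774
DILI | RF | MACCS | 0.866 | 0.813 | 0.840 | 0.904 | 0.830 | 0.857 | 0.843 | 0.910
FDAMDD | RF | ECFP4 | 0.848 | 0.812 | 0.832 | 0.904 | 0.853 | 0.782 | 0.821 | 0.892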

Cohen's kappa coefficient can be used as a performance metric for models built on unbalanced datasets. Here we calculated the coefficient for the 7 unbalanced models: CYP2C9-Substrate: 0.397; CYP2D6-Inhibitor: 0.506; CYP2D6-Substrate: 0.411; F-20: 0.309; F-30: 0.254; HIA: 0.429; SkinSen: 0.413. The coefficient is usually interpreted as follows: <=0: poor; 0.01-0.20: slight; 0.21-0.40: fair; 0.41-0.60: moderate; 0.61-0.80: substantial; 0.81-1: almost perfect. After processing with our strategies, the agreement for these models falls in the fair to moderate range.
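
To make the relation between these metrics concrete, here is a minimal sketch that derives sensitivity, specificity, accuracy and Cohen's kappa from the four cells of a binary confusion matrix, and maps kappa onto the interpretation scale quoted above. The function names and the example counts are illustrative assumptions, not ADMETlab code or data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute basic metrics and Cohen's kappa from confusion-matrix counts."""
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginal class frequencies.
    p_pos = ((tp + fn) / n) * ((tp + fp) / n)
    p_neg = ((tn + fp) / n) * ((tn + fn) / n)
    p_chance = p_pos + p_neg
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }


def kappa_label(kappa):
    """Map a kappa value onto the interpretation scale quoted in the text."""
    if kappa <= 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Made-up unbalanced example: 100 positives, 20 negatives.
m = classification_metrics(tp=90, fn=10, tn=10, fp=10)
print(round(m["accuracy"], 3), round(m["kappa"], 3))  # 0.833 0.4
print(kappa_label(0.506))  # moderate
```

In this made-up case the accuracy looks high (0.833) while kappa is only 0.4, which is why kappa is the more informative figure when one class dominates.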

Search & Database
| Database contents

The database integrates all the ADMET entries from the ChEMBL, DrugBank and EPA databases and related records from the literature, along with all the data described above. We manually checked the correctness of the values and dropped redundant information, which resulted in 288,967 entries. Each entry includes basic molecular properties (e.g., common name, SMILES, ALogP, PSA) and ADMET activities.


| Similarity search

QSAR models and similarity search are both useful strategies for predicting ADMET properties. Compared with QSAR models, similarity search in a database is fast and can easily be extended to include new information. Here, we provide 5 kinds of fingerprints to represent molecular information and 2 kinds of similarity metrics. Users can input molecules and estimate their properties by comparison with similar compounds.
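
The page does not name the two similarity metrics. As a sketch of how fingerprint-based similarity search works in general, the example below uses the Tanimoto and Dice coefficients, two common choices, on fingerprints represented as sets of "on" bit positions. The metric choice, the toy fingerprints and the function names are all assumptions for illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a, b):
    """Dice coefficient: twice the shared bits divided by the total bits set."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def most_similar(query, database, metric=tanimoto, top_n=3):
    """Rank database entries by similarity to the query fingerprint."""
    scored = [(metric(query, fp), name) for name, fp in database.items()]
    return sorted(scored, reverse=True)[:top_n]


# Toy fingerprints as sets of on-bit indices (not real molecules).
query = {1, 2, 3, 4}
database = {"A": {1, 2, 3, 4}, "B": {1, 2, 5}, "C": {7, 8}}
print(most_similar(query, database, top_n=2))  # [(1.0, 'A'), (0.4, 'B')]
```

The known ADMET values of the top-ranked neighbours can then serve as a rough estimate for the query molecule.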

Systematic evaluation
| Summary

The behavior of a drug in the body is not determined by a single property. Usually we look for molecules that perform relatively well at every stage of ADME/T. This module allows users to evaluate most aspects of the ADME/T process for one molecule. The results give users a full picture and lead to constructive suggestions for molecular optimization.

How to explain

Table S4. Explanation of the corresponding ADME/T-related properties

Property Type Units Suggestions   Meaning & Preference Reference
LogS (Solubility) Numeric log mol/L > 10 μg/ml
  • Optimal: higher than -4 log mol/L
  • <10 μg/mL: Low solubility; 10–60 μg/mL: Moderate solubility; >60 μg/mL: High solubility
  • Book: ISBN: 9787562832287. pp. 14.
  • J PHARMACOL TOX MET. 2000, 44 (1), 235−249.
LogD7.4 (Distribution Coefficient D) Numeric   1~5
  • < 1: Solubility high; Permeability low by passive transcellular diffusion; Permeability possible via paracellular if MW < 200; Metabolism low.
  • 1 to 3: Solubility moderate; Permeability moderate; Metabolism low.
  • 3 to 5: Solubility low; Permeability high; Metabolism moderate to high.
  • > 5: Solubility low; Permeability high; Metabolism high.
  • Methods and principles in medicinal chemistry 18 (pp. 21–45). Weinheim: Wiley-VCH.
LogP (Distribution Coefficient P) Numeric   0~3
  • Optimal: 0< LogP <3
  • LogP <0: poor lipid bilayer permeability.
  • LogP >3: poor aqueous solubility.
  • Book: ISBN: 3-906390-22-5. pp. 127–182.
Papp (Caco-2 Permeability) Numeric cm/s > -5.15
  • Optimal: higher than -5.15 Log unit or -4.70 or -4.80
  • J CHEM INF MODEL. 2016, 56 (4), pp 763–773.
Pgp-inhibitor Categorical  
  • Category 0: Non-inhibitor; Category 1: Inhibitor;
  • For the Pgp-inhibitor & non-inhibitor classification criteria, see the references.
  • J CHEM INF MODEL. 2010. 50(6): p. 1034-1041.
  • J MED CHEM. 2011. 54(6): p. 1740-1751.
Pgp-substrate Categorical  
  • Category 0: Non-substrate; Category 1: Substrate;
  • More likely to be a Pgp substrate: N+O ≥ 8; MW > 400; Acid with pKa > 4
  • More likely to be a Pgp non-substrate: N+O ≤ 4; MW < 400; Base with pKa < 8
  • J DRUG TARGET. 11, 391–406.
HIA (Human Intestinal Absorption) Categorical  
  • Category 0: HIA-; Category 1: HIA+;
  • ≥30%: HIA+; <30%: HIA-
  • RSC ADV. 2017, 7, 19007-19018
F (20% Bioavailability) Categorical  
  • Category 0: F20-; Category 1: F20+;
  • ≥20%: F20+; <20%: F20-
  • MOL PHARMACEUT, 2011. 8(3): p. 841-851
  • J PHARMACEUT BIOMED, 2008. 47(4): p. 677-682.
F (30% Bioavailability) Categorical  
  • Category 0: F30-; Category 1: F30+;
  • ≥30%: F30+; <30%: F30-
  • MOL PHARMACEUT, 2011. 8(3): p. 841-851
  • J PHARMACEUT BIOMED, 2008. 47(4): p. 677-682.
PPB (Plasma Protein Binding) Numeric % 90
  • Significant with drugs that are highly protein-bound and have a low therapeutic index.
  • ISBN: 978-0-1236-9520-8. pp. 194
VD (Volume of Distribution) Numeric L/kg 0.04~20
  • Optimal: 0.04-20 L/kg;
  • Range:
    <0.07 L/kg: Confined to blood, bound to plasma protein or highly hydrophilic; 0.07-0.7 L/kg: Evenly distributed; >0.7 L/kg: Bound to tissue components (e.g., protein, lipid), highly lipophilic.
  • Book: ISBN: 9787562832287. pp. 174
  • Book: ISBN: 978-0-1236-9520-8. pp. 229
BBB (Blood–Brain Barrier) Categorical  
  • Category 0: BBB-; Category 1: BBB+;
  • BB ratio >=0.1: BBB+; BB ratio <0.1: BBB-
  • These features tend to improve BBB permeation:
    H-bonds (total) < 8–10; MW < 400–500; No acids.
  • J NEUROCHEM. 70, 1781–1792
CYP450 1A2 inhibitor Categorical  
  • Category 0: Non-inhibitor; Category 1: Inhibitor;
  • Molecules labeled as inhibitors in PubChem BioAssay were regarded as inhibitors.
  • NAT BIOTECHNOL. 2009, 27(11): 1050-1055.
  • BIOINFORMATICS. 2013, 29(16): 2051-2052.
CYP450 1A2 substrate Categorical  
  • Category 0: Non-substrate; Category 1: Substrate;
  • Molecules labeled as substrates in PubChem BioAssay were regarded as substrates.
  • Characteristics of CYP1A2 substrate: 0.08< LogP <3.61; Planar amines and amides
  • NAT BIOTECHNOL. 2009, 27(11): 1050-1055.
  • BIOINFORMATICS. 2013, 29(16): 2051-2052.
  • ISBN: 978-0-1236-9520-8. pp. 162
CYP450 3A4 inhibitor Categorical  
  • Category 0: Non-inhibitor; Category 1: Inhibitor;
  • Molecules labeled as inhibitors in PubChem BioAssay were regarded as inhibitors.
  • Strategies to Reduce CYP3A4 Inhibition: Decrease the lipophilicity (LogD 7.4); Add steric hindrance to the heterocycle para to the nitrogen; Add an electronic substitution (e.g., halogen) that reduces the pKa of the nitrogen.
  • NAT BIOTECHNOL. 2009, 27(11): 1050-1055.
  • BIOINFORMATICS. 2013, 29(16): 2051-2052.
CYP450 3A4 substrate Categorical  
  • Category 0: Non-substrate; Category 1: Substrate;
  • Molecules labeled as substrates in PubChem BioAssay were regarded as substrates.
  • Characteristics of CYP3A4 substrate: 0.97< LogP <7.54; Large molecules
  • NAT BIOTECHNOL. 2009, 27(11): 1050-1055.
  • BIOINFORMATICS. 2013, 29(16): 2051-2052.
  • ISBN: 978-0-1236-9520-8. pp. 162
CYP450 2C9 inhibitor Categorical  
  • Category 0: Non-inhibitor; Category 1: Inhibitor;
  • Molecules labeled as inhibitors in PubChem BioAssay were regarded as inhibitors.
  • NAT BIOTECHNOL. 2009, 27(11): 1050-1055.
  • BIOINFORMATICS. 2013, 29(16): 2051-2052.
CYP450 2C9 substrate Categorical  
  • Category 0: Non-substrate; Category 1: Substrate;
  • Molecules labeled as substrates in PubChem BioAssay were regarded as substrates.
  • Characteristics of CYP2C9 substrate: 0.89< LogP <5.18; Acidic (Nonionized)
  • MOL INFORM. 2011. 30(10): p. 885-895.
  • J CHEM INF MODEL. 2013. 53(12): p. 3373-3383.
  • ISBN: 978-0-1236-9520-8. pp. 162
CYP450 2C19 inhibitor Categorical  
  • Category 0: Non-inhibitor; Category 1: Inhibitor;
  • Molecules labeled as inhibitors in PubChem BioAssay were regarded as inhibitors.
  • NAT BIOTECHNOL. 2009, 27(11): 1050-1055.
  • BIOINFORMATICS. 2013, 29(16): 2051-2052.
CYP450 2C19 substrate Categorical  
  • Category 0: Non-substrate; Category 1: Substrate;
  • Molecules labeled as substrates in PubChem BioAssay were regarded as substrates.
  • NAT BIOTECHNOL. 2009, 27(11): 1050-1055.
  • BIOINFORMATICS. 2013, 29(16): 2051-2052.
CYP450 2D6 inhibitor Categorical  
  • Category 0: Non-inhibitor; Category 1: Inhibitor;
  • Molecules labeled as inhibitors in PubChem BioAssay were regarded as inhibitors.
  • MOL INFORM. 2011. 30(10): p. 885-895.
  • J CHEM INF MODEL. 2013. 53(12): p. 3373-3383.
CYP450 2D6 substrate Categorical  
  • Category 0: Non-substrate; Category 1: Substrate;
  • Molecules labeled as substrates in PubChem BioAssay were regarded as substrates.
  • Characteristics of CYP2D6 substrate: 0.75< LogP <5.04; Basic (Ionized)
  • MOL INFORM. 2011. 30(10): p. 885-895.
  • J CHEM INF MODEL. 2013. 53(12): p. 3373-3383.
  • ISBN: 978-0-1236-9520-8. pp. 162
T 1/2 (Half Life) Numeric h > 0.5
  • Range: >8h: high; 3h < T1/2 < 8h: moderate; <3h: low
  • ISBN: 978-0-1236-9520-8. pp. 236
CL (Clearance) Numeric mL/min/kg
  • Range: >15 mL/min/kg: high; 5 mL/min/kg < CL < 15 mL/min/kg: moderate; <5 mL/min/kg: low
  • ISBN: 978-0-1236-9520-8. pp. 236
hERG (hERG Blockers) Categorical  
  • Category 0: Non-blockers; Category 1: Blockers;
  • Molecules with IC50 < 40 μM were regarded as blockers.
  • Features that may lead to hERG blockade: a basic amine (positively ionizable, pKa >7.3); hydrophobic/lipophilic substructure(s) (ClogP >3.7); absence of negatively ionizable groups or oxygen H-bond acceptors.
  • TRENDS PHARMACOL SCI. 2005, 26(3): 119-124
  • ISBN: 978-0-1236-9520-8. pp. 213
  • MOL PHARM. 2016, 13(8):2855–2866
H-HT (Human Hepatotoxicity) Categorical  
  • Category 0: H-HT negative(-); Category 1: H-HT positive(+);
  • For the H-HT positive(+) & negative(-) classification criteria, see the reference.
  • CHEM RES TOXICOL, 2016, 29(5): 757-767.
AMES (Ames Mutagenicity) Categorical  
  • Category 0: Ames negative(-); Category 1: Ames positive(+);
  • Ames positive(+): significantly induces revertant colony growth in at least one of the (usually five) strains; otherwise, Ames negative(-).
  • J CHEM INF MODEL. 2012, 52(11): 2840-2847.
SkinSen (Skin sensitization) Categorical  
  • Category 0: Non-sensitizer; Category 1: Sensitizer;
  • Sensitizer & Non-sensitizer: The (r)LLNA experimental value. (r)LLNA: (Reduced) local lymph node assay.
  • TOXICOL APPL PHARM, 2015 , 284 (2) :262-272
LD50 (LD50 of acute toxicity) Numeric -log mol/kg > 500 mg/kg
  • Median lethal dose (LD50) usually represents the acute toxicity of chemicals. It is the dose of a tested molecule that kills 50% of the treated animals within a given period.
  • High-toxicity: 1~50 mg/kg; Toxicity: 51~500 mg/kg; low-toxicity: 501~5000 mg/kg.
  • CHEM RES TOXICOL, 2009, 22 (12), pp 1913–1921
  • J CHEMINFORMATICS, 2016 , 8 (1) :6
DILI (Drug Induced Liver Injury) Categorical  
  • Category 0: DILI negative(-); Category 1: DILI positive(+);
  • For the DILI positive(+) & negative(-) classification criteria, see the reference.
  • J CHEM INF MODEL, 2015, 55(10) :2085-2093
FDAMDD (Maximum Recommended Daily Dose) Categorical  
  • Category 0: FDAMDD negative(-); Category 1: FDAMDD positive(+);
  • For the FDAMDD positive(+) & negative(-) classification criteria, see the reference.
  • CHEMOMETR INTELL LAB, 2015, 146:494-502
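
Several entries in Table S4 report predictions on a logarithmic molar scale but give their thresholds in mass units (LogS in log mol/L against μg/mL, LD50 in -log mol/kg against mg/kg). The sketch below shows one way to convert and classify such values. It assumes the molecular weight is known, and the function names and boundary handling are illustrative assumptions rather than ADMETlab's own implementation.

```python
def logs_to_ug_per_ml(log_s, mol_weight):
    """Convert LogS (log mol/L) to solubility in ug/mL, given MW in g/mol."""
    return 10 ** log_s * mol_weight * 1000  # mol/L -> g/L -> mg/L (= ug/mL)


def solubility_class(ug_per_ml):
    """Classify solubility using the <10 / 10-60 / >60 ug/mL ranges from Table S4."""
    if ug_per_ml < 10:
        return "low"
    if ug_per_ml <= 60:
        return "moderate"
    return "high"


def ld50_to_mg_per_kg(neg_log_ld50, mol_weight):
    """Convert LD50 in -log mol/kg to mg/kg, given MW in g/mol."""
    return 10 ** (-neg_log_ld50) * mol_weight * 1000  # mol/kg -> g/kg -> mg/kg


def toxicity_class(mg_per_kg):
    """Classify acute toxicity using the 1-50 / 51-500 / 501-5000 mg/kg ranges."""
    if mg_per_kg < 1 or mg_per_kg > 5000:
        return "outside the listed ranges"
    if mg_per_kg <= 50:
        return "high toxicity"
    if mg_per_kg <= 500:
        return "toxicity"
    return "low toxicity"


# Hypothetical molecule with MW = 300 g/mol, predicted LogS = -4 and LD50 = 2.5.
print(solubility_class(logs_to_ug_per_ml(-4, 300.0)))  # moderate
print(toxicity_class(ld50_to_mg_per_kg(2.5, 300.0)))   # low toxicity
```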