Interpretation - ADMETlab: ADMET Prediction | ADMET Predictor | QSAR | ADMET Database

Overview

Druglikeness analysis

ADME/T evaluation

Similarity searching based on ADME/T database

Systematic ADME/T assessment

How to explain

Detailed modeling process

Overview

| What's ADMETlab

The ADMETlab platform provides a user-friendly, freely available web interface for systematic ADMET evaluation of chemical compounds, based on a comprehensive database of 288,967 entries. It contains four main modules: 'Druglikeness analysis', 'ADMET prediction', 'Systematic evaluation' and 'Similarity searching'. These modules are described in detail below.

| Main features

Comparatively large datasets for most properties.
Better and more robust SAR/QSAR models.
Systematic analysis and comparison.
Constructive suggestions for molecular optimization.
Batch computation.
User-friendly interface.

| Main functionalities

ADMETlab can help researchers and PK specialists with the following tasks:

Druglikeness analysis.

ADME/T evaluation.

Similarity searching based on ADME/T database.

Systematic ADME/T assessment.

Druglikeness analysis

| Druglikeness rules

Druglikeness rules are expert criteria used in drug design to judge how "druglike" a substance is with respect to factors such as bioavailability. Here, we selected 5 commonly used rules and provide them to users:

Lipinski's rules: MW<=500; logP<=5; Hacc<=10; Hdon<=5
Ghose's rules: -0.4 < MclogP < 5.6 (mean: 2.52); 160 < MW < 480 (mean: 357); 40 < MR < 130 (mean: 97); 20 < natoms < 70 (mean: 48)
Oprea's rules: nrings>=3; nrigidbond>=18; nRotbond>=6
Veber's rules: nRotbond<=10; tPSA<=140 or Hacc+Hdon<=12
Varma's rules: MW<=500; tPSA<=125; -5 < logD < -2; Hacc+Hdon<=9; nRotbond<=12
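
As a rough illustration of how such rules can be applied, the sketch below checks the Lipinski and Veber criteria against descriptors that are assumed to be already computed. The `Descriptors` container, the function names and the example values are hypothetical and not part of ADMETlab; the Veber check treats the limit of 12 as applying to the sum of acceptors and donors.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one molecule (hypothetical container)."""
    mw: float        # molecular weight
    logp: float      # octanol/water partition coefficient
    hacc: int        # hydrogen-bond acceptors
    hdon: int        # hydrogen-bond donors
    nrotbond: int    # rotatable bonds
    tpsa: float      # topological polar surface area


def lipinski_violations(d):
    """Return the names of the Lipinski criteria that the molecule fails."""
    checks = {
        "MW<=500": d.mw <= 500,
        "logP<=5": d.logp <= 5,
        "Hacc<=10": d.hacc <= 10,
        "Hdon<=5": d.hdon <= 5,
    }
    return [name for name, passed in checks.items() if not passed]


def passes_veber(d):
    """Veber: nRotbond<=10 and (tPSA<=140 or Hacc+Hdon<=12)."""
    return d.nrotbond <= 10 and (d.tpsa <= 140 or d.hacc + d.hdon <= 12)


# Illustrative descriptor values only; not taken from the ADMETlab database.
small = Descriptors(mw=180.2, logp=1.2, hacc=4, hdon=1, nrotbond=3, tpsa=63.6)
large = Descriptors(mw=720.0, logp=6.1, hacc=12, hdon=6, nrotbond=14, tpsa=190.0)

print(lipinski_violations(small))
print(lipinski_violations(large))
print(passes_veber(small), passes_veber(large))
```

Returning the list of failed criteria, rather than a single pass/fail flag, makes it easier to see which property to optimize.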

| Druglikeness model

We collected 6731 drugs from the DrugBank database as positive (druglike) samples. Then 6769 molecules with IC50 or Ki below 10,000 nM were picked from the ChEMBL database as negative samples using the self-organizing map (SOM) method. The SOM method ensures that the negative samples come from clusters similar to those of the positive samples, which tends to give the model better discriminatory power and a proper applicability domain. The resulting classification model reaches an accuracy of 0.801 on the training set under 5-fold cross-validation. This model can not only pick out active compounds from chemical entities but also distinguish potential drug candidates from active compounds.

ADMET prediction

| Data summary

For all the ADMET-related properties, we collected the corresponding data mainly from two sources: the published literature and the DrugBank database (http://www.drugbank.ca). After several preprocessing steps, we obtained 30 datasets in total. An overview of these ADMET datasets is given in Table S1.

Table S1. Number of data points for each property

Category	Property	Total	Positive	Negative	Train	Test
Basic physicochemical property	LogS	5220	-	-	4116	1104
	LogD7.4	1031	-	-	773	258
	LogP
Absorption	Caco-2	1182	-	-	886	296
	Pgp-Inhibitor	2297	1372	925	1723	574
	Pgp-Substrate	1252	643	609	939	313
	HIA	970	818	152	728	242
	F (20%)	1013	759	254	760	253
	F (30%)	1013	672	341	760	253
Distribution	PPB	1822	-	-	1368	454
	VD	544	-	-	408	136
	BBB	2237	540	1697	1678	559
Metabolism	CYP1A2-Inhibitor	12145	5713	6432	9145	3000
	CYP1A2-Substrate	396	198	198	297	99
	CYP3A4-Inhibitor	11893	5047	6846	8893	3000
	CYP3A4-Substrate	1020	510	510	765	255
	CYP2C19-Inhibitor	12272	5670	6602	9272	3000
	CYP2C19-Substrate	312	156	156	234	78
	CYP2C9-Inhibitor	11720	3960	7760	8720	3000
	CYP2D6-Inhibitor	12726	2342	10384	9726	3000
	CYP2C9-Substrate	784	278	506	626	156
	CYP2D6-Substrate	816	352	464	611	205
Excretion	Clearance	544	-	-	408	136
	T1/2	544	-	-	408	136
Toxicity	hERG	655	451	204	392	263
	H-HT	2171	1435	736	1628	543
	Ames	7619	4252	3367	5714	1905
	SkinSen	404	274	130	323	81
	LD50 of acute toxicity	7397	-	-	5917	1480
	DILI	475	236	239	380	95
	FDAMDD	803	442	361	643	160

| Data sources

The models are available to the public at http://github.com/ifyoungnet/ADMETlab. For other datasets or resources, please email oriental-cds@163.com or biomed@csu.edu.cn to get a download link.

| Model summary

To obtain robust and reliable QSAR models for predicting ADMET properties, we built a series of models for each property and selected the best one. Six methods (RF, SVM, RP, PLS, NB, DT) and seven types of descriptors (2D, Estate, MACCS, ECFP2, ECFP4, ECFP6, FP2) were applied in the modeling process. The best model for each property and its performance are listed in the tables below (Table S2, Table S3).
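
The selection over methods and descriptor types can be pictured as a search over all 6 x 7 = 42 combinations. The sketch below is only an assumption about how such a search might be organized, not the authors' pipeline: `select_best` and `toy_score` are hypothetical names, and the scoring function stands in for a real cross-validated metric.

```python
from itertools import product

METHODS = ["RF", "SVM", "RP", "PLS", "NB", "DT"]
DESCRIPTORS = ["2D", "Estate", "MACCS", "ECFP2", "ECFP4", "ECFP6", "FP2"]


def select_best(score_fn):
    """Score every (method, descriptor) pair and return the best pair and its score.

    score_fn(method, descriptor) must return a number where higher is better,
    for example a cross-validated accuracy or Q2.
    """
    scored = {pair: score_fn(*pair) for pair in product(METHODS, DESCRIPTORS)}
    best_pair = max(scored, key=scored.get)
    return best_pair, scored[best_pair]


def toy_score(method, descriptor):
    """Placeholder score; a real run would train and cross-validate a model here."""
    return 0.9 if (method, descriptor) == ("RF", "2D") else 0.5


n_combinations = len(list(product(METHODS, DESCRIPTORS)))
best_pair, best_score = select_best(toy_score)
print(n_combinations, best_pair, best_score)
```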

| Model Results

Table S2. The best regression models for the corresponding ADME/T-related properties

Property	Method	Features	mtry	R²	Q²	R²T	RMSE_F	RMSE_CV	RMSE_T
LogS	RF	2D	10	0.995	0.967	0.957	0.138	0.369	0.436
LogD7.4	RF	2D	14	0.983	0.877	0.874	0.228	0.614	0.605
LogP
Caco-2	RF	2D	14	0.973	0.845	0.824	0.121	0.289	0.290
PPB	RF	2D	8	0.954	0.691	0.682	7.124	18.443	18.044

Property	Method	Features	mtry	2-fold rate (CV/Test)	3-fold rate (CV/Test)
VD	RF	2D	10	0.819/0.801	0.912/0.904
CL	RF	2D	10	0.760/0.816	0.877/0.897
T1/2	RF	2D	12	0.762/0.699	0.897/0.824
LD50 of acute toxicity	RF	2D	5	0.986/0.987	0.998/0.997
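
The page does not define the "2-fold rate" and "3-fold rate" columns. A common reading, assumed here, is the fraction of compounds whose predicted value lies within a factor of 2 (or 3) of the experimental value. The function below is a minimal sketch under that assumption; `fold_rate` is a hypothetical name and the numbers are made up for illustration.

```python
def fold_rate(observed, predicted, fold):
    """Fraction of predictions within `fold`-fold of the observed value.

    Assumes strictly positive values on the original (non-log) scale.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equally long")
    hits = 0
    for obs, pred in zip(observed, predicted):
        if obs <= 0 or pred <= 0:
            raise ValueError("values must be strictly positive")
        # Symmetric fold error: 2.0 means off by a factor of two in either direction.
        fold_error = max(obs / pred, pred / obs)
        if fold_error <= fold:
            hits += 1
    return hits / len(observed)


# Made-up example values; fold errors are 1.5, 2.5, 1.0 and about 3.33.
observed = [1.0, 2.0, 4.0, 10.0]
predicted = [1.5, 5.0, 4.0, 3.0]

print(fold_rate(observed, predicted, 2))
print(fold_rate(observed, predicted, 3))
```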

Table S3. The best classification models for the corresponding ADME/T-related properties

Property	Method	Features	Sensitivity (5-fold CV)	Specificity (5-fold CV)	Accuracy (5-fold CV)	AUC (5-fold CV)	Sensitivity (external)	Specificity (external)	Accuracy (external)	AUC (external)
HIA	RF	MACCS	0.820	0.743	0.782	0.846	0.801	0.743	0.773	0.831
F (20%)	RF	MACCS	0.731	0.647	0.689	0.759	0.680	0.663	0.671	0.746
F (30%)	RF	ECFP6	0.743	0.605	0.669	0.715	0.751	0.601	0.667	0.718
BBB	SVM	ECFP2	0.962	0.813	0.926	0.948	0.993	0.854	0.962	0.975
Pgp-inhibitor	SVM	ECFP4	0.887	0.789	0.848	0.908	0.863	0.802	0.838	0.913
Pgp-substrate	SVM	ECFP4	0.839	0.807	0.824	0.899	0.826	0.854	0.840	0.905
CYP1A2-Inhibitor	SVM	ECFP4	0.833	0.864	0.849	0.928	0.853	0.880	0.867	0.939
CYP1A2-Substrate	RF	ECFP4	0.768	0.636	0.702	0.801	0.768	0.637	0.702	0.802
CYP3A4-Inhibitor	SVM	ECFP4	0.759	0.858	0.817	0.901	0.788	0.860	0.829	0.909
CYP3A4-Substrate	RF	ECFP4	0.798	0.716	0.757	0.835	0.819	0.679	0.749	0.835
CYP2C19-Inhibitor	SVM	ECFP2	0.826	0.819	0.822	0.893	0.812	0.825	0.819	0.899
CYP2C19-Substrate	RF	ECFP4	0.735	0.744	0.740	0.816	0.871	0.667	0.769	0.853
CYP2C9-Inhibitor	SVM	ECFP4	0.719	0.898	0.837	0.900	0.730	0.882	0.830	0.894
CYP2C9-Substrate	RF	ECFP4	0.746	0.709	0.728	0.819	0.746	0.709	0.734	0.824
CYP2D6-Inhibitor	RF	ECFP4	0.770	0.811	0.793	0.868	0.771	0.812	0.795	0.882
CYP2D6-Substrate	RF	ECFP4	0.765	0.730	0.748	0.823	0.792	0.730	0.760	0.833
hERG	RF	2D	0.908	0.700	0.844	0.879	0.888	0.762	0.848	0.873
H-HT	RF	2D	0.780	0.520	0.689	0.710	0.785	0.487	0.681	0.683
Ames	RF	MACCS	0.800	0.841	0.820	0.890	0.848	0.816	0.834	0.897
SkinSen	RF	MACCS	0.685	0.727	0.706	0.760	0.715	0.727	0.731	0.774
DILI	RF	MACCS	0.866	0.813	0.840	0.904	0.830	0.857	0.843	0.910
FDAMDD	RF	ECFP4	0.848	0.812	0.832	0.904	0.853	0.782	0.821	0.892

Cohen's kappa coefficient can be used as a performance metric for models built on unbalanced datasets. Here we calculated the coefficient for the 7 unbalanced models: {CYP2C9-Substrate: 0.397, CYP2D6-Inhibitor: 0.506, CYP2D6-Substrate: 0.411, F-20: 0.309, F-30: 0.254, HIA: 0.429, SkinSen: 0.413}. The coefficient is usually interpreted as follows: <=0: poor; 0.01-0.20: slight; 0.21-0.40: fair; 0.41-0.60: moderate; 0.61-0.80: substantial; 0.81-1: almost perfect. By this scale, after processing with our strategies, the agreement of these models is fair to moderate, which is quite acceptable.
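
For readers who want to reproduce this kind of figure, the sketch below computes Cohen's kappa from a binary confusion matrix using the standard definition kappa = (p_o - p_e) / (1 - p_e), and maps the result onto the interpretation bands listed above. The function names and the confusion-matrix counts are illustrative assumptions, not ADMETlab outputs.

```python
def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa for a binary confusion matrix.

    p_o is the observed agreement (accuracy); p_e is the agreement expected
    by chance given the marginal totals of actual and predicted classes.
    """
    n = tp + fn + fp + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    p_o = (tp + tn) / n
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


def interpret_kappa(kappa):
    """Map a kappa value onto the bands quoted in the text."""
    if kappa <= 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Made-up counts: p_o = 0.7, p_e = 0.5, so kappa is about 0.4.
kappa = cohen_kappa(tp=20, fn=5, fp=10, tn=15)
print(round(kappa, 3), interpret_kappa(kappa))
```

Unlike plain accuracy, kappa subtracts the agreement that a classifier would reach by chance on a skewed class distribution, which is why it suits the unbalanced datasets.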

Search & Database

| Database contents

The database integrates all the ADMET entries from the ChEMBL, DrugBank and EPA databases and related records from the literature, along with all the data described above. We manually checked the correctness of the values and removed redundant information, which resulted in 288,967 entries. Each entry includes basic molecular properties (e.g., common name, SMILES, ALogP, PSA) and ADMET activities.

| Similarity search

QSAR models and similarity search are both useful strategies for predicting ADMET properties. Compared with QSAR models, similarity search in a database is fast and can easily be extended to include new information. Here, we provide 5 kinds of fingerprints to represent molecular information and 2 kinds of similarity metrics. Users can input molecules and estimate their properties by comparison with similar compounds.
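
The page does not name the fingerprints or the two similarity metrics. As an illustration only, the sketch below uses the Tanimoto coefficient, a widely used fingerprint similarity, on fingerprints represented as sets of "on" bit positions. The function names, compound names and bit sets are made up for this example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def most_similar(query_fp, database, top_k=3):
    """Return the top_k (name, similarity) pairs, most similar first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Made-up fingerprints; a real search would use bits computed from structures.
database = {
    "compound_A": {1, 2, 3, 8},
    "compound_B": {2, 3, 4},
    "compound_C": {10, 11},
}
query = {1, 2, 3}

for name, similarity in most_similar(query, database, top_k=2):
    print(name, round(similarity, 2))
```

The measured properties of the top-ranked neighbors can then serve as a rough estimate for the query molecule.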

Systematic evaluation

| Summary

The behavior of a drug in the body is affected by more than one property. Usually we look for molecules that perform relatively well at every stage of ADME/T. This module therefore allows users to evaluate most aspects of the ADME/T process for a single molecule. The results give users a full picture and lead to constructive suggestions for molecular optimization.

How to explain

Table S4. Explanation of the corresponding ADME/T-related properties

Property	Type	Units	Suggestions	Meaning & Preference	Reference
LogS (Solubility)	Numeric	log mol/L	> 10 μg/mL	Optimal: higher than -4 log mol/L; <10 μg/mL: Low solubility; 10–60 μg/mL: Moderate solubility; >60 μg/mL: High solubility	Book: ISBN: 9787562832287. pp. 14. J PHARMACOL TOX MET. 2000, 44 (1), 235−249.
LogD7.4 (Distribution Coefficient D)	Numeric		1~5	< 1: Solubility high; Permeability low by passive transcellular diffusion; Permeability possible via paracellular if MW < 200; Metabolism low. 1 to 3: Solubility moderate; Permeability moderate; Metabolism low. 3 to 5: Solubility low; Permeability high; Metabolism moderate to high. > 5: Solubility low; Permeability high; Metabolism high.	Methods and principles in medicinal chemistry 18 (pp. 21–45). Weinheim: Wiley-VCH.
LogP (Distribution Coefficient P)	Numeric		0~3	Optimal: 0 < LogP < 3; LogP < 0: poor lipid bilayer permeability; LogP > 3: poor aqueous solubility.	Book: ISBN: 3-906390-22-5. pp. 127–182.
Papp (Caco-2 Permeability)	Numeric	cm/s	> -5.15	Optimal: higher than -5.15 Log unit or -4.70 or -4.80	J CHEM INF MODEL. 2016, 56 (4), pp 763–773.
Pgp-inhibitor	Categorical			Category 0: Non-inhibitor; Category 1: Inhibitor; For the Pgp-inhibitor/non-inhibitor classification criteria, see the cited references.	J CHEM INF MODEL. 2010. 50(6): p. 1034-1041. J MED CHEM. 2011. 54(6): p. 1740-1751.
Pgp-substrate	Categorical			Category 0: Non-substrate; Category 1: Substrate; More likely to be a Pgp substrate: N+O ≥ 8; MW > 400; Acid with pKa > 4. More likely to be a Pgp non-substrate: N+O ≤ 4; MW < 400; Acid with pKa < 8.	J DRUG TARGET. 11, 391–406.
HIA (Human Intestinal Absorption)	Categorical			Category 0: HIA-; Category 1: HIA+; ≥30%: HIA+; <30%: HIA-	RSC ADV. 2017, 7, 19007-19018
F (20% Bioavailability)	Categorical			Category 0: F20-; Category 1: F20+; ≥20%: F20+; <20%: F20-	MOL PHARMACEUT, 2011. 8(3): p. 841-851 J PHARMACEUT BIOMED, 2008. 47(4): p. 677-682.
F (30% Bioavailability)	Categorical			Category 0: F30-; Category 1: F30+; ≥30%: F30+; <30%: F30-	MOL PHARMACEUT, 2011. 8(3): p. 841-851 J PHARMACEUT BIOMED, 2008. 47(4): p. 677-682.
PPB (Plasma Protein Binding)	Numeric	%	90	Significant with drugs that are highly protein-bound and have a low therapeutic index.	ISBN: 978-0-1236-9520-8. pp. 194
VD (Volume Distribution)	Numeric	L/kg	0.04~20	Optimal: 0.04-20L/kg; Range: <0.07L/kg: Confined to blood, Bound to plasma protein or highly hydrophilic; 0.07-0.7L/kg: Evenly distributed; >0.7L/kg: Bound to tissue components (e.g., protein, lipid),highly lipophilic.	Book: ISBN: 9787562832287. pp. 174 Book: ISBN: 978-0-1236-9520-8. pp. 229
BBB (Blood–Brain Barrier)	Categorical			Category 0: BBB-; Category 1: BBB+; BB ratio >=0.1: BBB+; BB ratio <0.1: BBB-. These features tend to improve BBB permeation: H-bonds (total) < 8–10; MW < 400–500; No acids.	J NEUROCHEM. 70, 1781–1792
CYP450 1A2 inhibitor	Categorical			Category 0: Non-inhibitor; Category 1: Inhibitor; Molecules labeled as inhibitors in PubChem BioAssay were regarded as inhibitors.	NAT BIOTECHNOL. 2009, 27(11): 1050-1055. BIOINFORMATICS. 2013, 29(16): 2051-2052.
CYP450 1A2 substrate	Categorical			Category 0: Non-substrate; Category 1: Substrate; Molecules labeled as substrates in PubChem BioAssay were regarded as substrates. Characteristics of CYP1A2 substrates: 0.08 < LogP < 3.61; Planar amines and amides	NAT BIOTECHNOL. 2009, 27(11): 1050-1055. BIOINFORMATICS. 2013, 29(16): 2051-2052. ISBN: 978-0-1236-9520-8. pp. 162
CYP450 3A4 inhibitor	Categorical			Category 0: Non-inhibitor; Category 1: Inhibitor; Molecules labeled as inhibitors in PubChem BioAssay were regarded as inhibitors. Strategies to reduce CYP3A4 inhibition: Decrease the lipophilicity (LogD 7.4); Add steric hindrance to the heterocycle para to the nitrogen; Add an electronic substitution (e.g., halogen) that reduces the pKa of the nitrogen.	NAT BIOTECHNOL. 2009, 27(11): 1050-1055. BIOINFORMATICS. 2013, 29(16): 2051-2052.
CYP450 3A4 substrate	Categorical			Category 0: Non-substrate; Category 1: Substrate; Molecules labeled as substrates in PubChem BioAssay were regarded as substrates. Characteristics of CYP3A4 substrates: 0.97 < LogP < 7.54; Large molecules	NAT BIOTECHNOL. 2009, 27(11): 1050-1055. BIOINFORMATICS. 2013, 29(16): 2051-2052. ISBN: 978-0-1236-9520-8. pp. 162
CYP450 2C9 inhibitor	Categorical			Category 0: Non-inhibitor; Category 1: Inhibitor; Molecules labeled as inhibitors in PubChem BioAssay were regarded as inhibitors.	NAT BIOTECHNOL. 2009, 27(11): 1050-1055. BIOINFORMATICS. 2013, 29(16): 2051-2052.
CYP450 2C9 substrate	Categorical			Category 0: Non-substrate; Category 1: Substrate; Molecules labeled as substrates in PubChem BioAssay were regarded as substrates. Characteristics of CYP2C9 substrates: 0.89 < LogP < 5.18; Acidic (Nonionized)	MOL INFORM. 2011. 30(10): p. 885-895. J CHEM INF MODEL. 2013. 53(12): p. 3373-3383. ISBN: 978-0-1236-9520-8. pp. 162
CYP450 2C19 inhibitor	Categorical			Category 0: Non-inhibitor; Category 1: Inhibitor; Molecules labeled as inhibitors in PubChem BioAssay were regarded as inhibitors.	NAT BIOTECHNOL. 2009, 27(11): 1050-1055. BIOINFORMATICS. 2013, 29(16): 2051-2052.
CYP450 2C19 substrate	Categorical			Category 0: Non-substrate; Category 1: Substrate; Molecules labeled as substrates in PubChem BioAssay were regarded as substrates.	NAT BIOTECHNOL. 2009, 27(11): 1050-1055. BIOINFORMATICS. 2013, 29(16): 2051-2052.
CYP450 2D6 inhibitor	Categorical			Category 0: Non-inhibitor; Category 1: Inhibitor; Molecules labeled as inhibitors in PubChem BioAssay were regarded as inhibitors.	MOL INFORM. 2011. 30(10): p. 885-895. J CHEM INF MODEL. 2013. 53(12): p. 3373-3383.
CYP450 2D6 substrate	Categorical			Category 0: Non-substrate; Category 1: Substrate; Molecules labeled as substrates in PubChem BioAssay were regarded as substrates. Characteristics of CYP2D6 substrates: 0.75 < LogP < 5.04; Basic (Ionized)	MOL INFORM. 2011. 30(10): p. 885-895. J CHEM INF MODEL. 2013. 53(12): p. 3373-3383. ISBN: 978-0-1236-9520-8. pp. 162
T1/2 (Half Life)	Numeric	h	> 0.5	Range: >8 h: high; 3 h < T1/2 < 8 h: moderate; <3 h: low	ISBN: 978-0-1236-9520-8. pp. 236
CL (Clearance)	Numeric	mL/min/kg		Range: >15 mL/min/kg: high; 5mL/min/kg< Cl < 15mL/min/kg: moderate; <5 mL/min/kg: low	ISBN: 978-0-1236-9520-8. pp. 236
hERG (hERG Blockers)	Categorical			Category 0: Non-blockers; Category 1: Blockers; Molecules with IC50 < 40 μM were regarded as blockers. Features that may lead to hERG blockade: A basic amine (positively ionizable, pKa >7.3). Hydrophobic/lipophilic substructure(s) (ClogP >3.7). Absence of negatively ionizable groups or oxygen H-bond acceptors.	TRENDS PHARMACOL SCI. 2005, 26(3): 119-124 ISBN: 978-0-1236-9520-8. pp. 213 MOL PHARM. 2016, 13(8):2855–2866
H-HT (Human Hepatotoxicity)	Categorical			Category 0: H-HT negative(-); Category 1: H-HT positive(+); For the H-HT positive(+)/negative(-) classification criteria, see the cited reference.	CHEM RES TOXICOL, 2016, 29(5): 757-767.
AMES (Ames Mutagenicity)	Categorical			Category 0: Ames negative(-); Category 1: Ames positive(+); Ames positive(+): significantly induces revertant colony growth in at least one of (usually) five strains; otherwise, Ames negative(-).	J CHEM INF MODEL. 2012, 52(11): 2840-2847.
SkinSen (Skin sensitization)	Categorical			Category 0: Non-sensitizer; Category 1: Sensitizer; Sensitizer & Non-sensitizer: The (r)LLNA experimental value. (r)LLNA: (Reduced) local lymph node assay.	TOXICOL APPL PHARM, 2015 , 284 (2) :262-272
LD50 (LD50 of acute toxicity)	Numeric	-log mol/kg	> 500 mg/kg	Median lethal dose (LD50) usually represents the acute toxicity of chemicals. It is the dose of a tested molecule that kills 50% of the treated animals within a given period. High-toxicity: 1~50 mg/kg; Toxicity: 51~500 mg/kg; low-toxicity: 501~5000 mg/kg.	CHEM RES TOXICOL, 2009, 22 (12), pp 1913–1921 J CHEMINFORMATICS, 2016 , 8 (1) :6
DILI (Drug Induced Liver Injury)	Categorical			Category 0: DILI negative(-); Category 1: DILI positive(+); For the DILI positive(+)/negative(-) classification criteria, see the cited reference.	J CHEM INF MODEL, 2015, 55(10) :2085-2093
FDAMDD (Maximum Recommended Daily Dose)	Categorical			Category 0: FDAMDD negative(-); Category 1: FDAMDD positive(+); For the FDAMDD positive(+)/negative(-) classification criteria, see the cited reference.	CHEMOMETR INTELL LAB, 2015, 146:494-502
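
To show how the ranges in Table S4 can be applied to predicted values, the sketch below converts a LogS prediction (log mol/L) to μg/mL and assigns the solubility, clearance and half-life bands from the table. All function names are hypothetical. The unit conversion needs the molecular weight, which the caller must supply, and the handling of exact boundary values (for example exactly 5 or 15 mL/min/kg) is an assumption, since the table leaves it open.

```python
def logs_to_ug_per_ml(log_s, mol_weight):
    """Convert LogS (log10 of mol/L) to ug/mL given molecular weight in g/mol.

    mol/L * g/mol = g/L, and 1 g/L equals 1000 ug/mL.
    """
    return (10.0 ** log_s) * mol_weight * 1000.0


def solubility_class(ug_per_ml):
    """Bands from Table S4: <10 low; 10-60 moderate; >60 high."""
    if ug_per_ml < 10:
        return "low"
    if ug_per_ml <= 60:
        return "moderate"
    return "high"


def clearance_class(cl_ml_min_kg):
    """Bands from Table S4: >15 high; 5-15 moderate; <5 low (bounds assumed inclusive)."""
    if cl_ml_min_kg > 15:
        return "high"
    if cl_ml_min_kg >= 5:
        return "moderate"
    return "low"


def half_life_class(t_half_h):
    """Bands from Table S4: >8 h high; 3-8 h moderate; <3 h low (bounds assumed inclusive)."""
    if t_half_h > 8:
        return "high"
    if t_half_h >= 3:
        return "moderate"
    return "low"


# Illustrative inputs: LogS = -4 for a 500 g/mol compound is about 50 ug/mL.
solubility = logs_to_ug_per_ml(-4.0, 500.0)
print(round(solubility, 1), solubility_class(solubility))
print(clearance_class(20.0), half_life_class(2.0))
```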


Recommended browsers: Safari, Firefox, Chrome, IE (version > 8).
ADMETlab is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. E-mail: jiedong@csu.edu.cn