Prediction of plasma protein binding rate based on machine learning

FU Mingyu, ZHU Yiyang, WU Chunyong, HOU Fengzhen, GUAN Yuan

Citation: FU Mingyu, ZHU Yiyang, WU Chunyong, HOU Fengzhen, GUAN Yuan. Prediction of plasma protein binding rate based on machine learning[J]. Journal of China Pharmaceutical University, 2021, 52(6): 699-706. DOI: 10.11665/j.issn.1000-5048.20210607

Funds: This study was supported by the National Natural Science Foundation of China (No. 82074128) and the Program for Innovation Team of the "Double First-Class" Disciplines (No. CPU2018GY30).
  • Abstract: Predicting the plasma protein binding rate of drugs helps characterize their pharmacokinetics and provides a valuable reference for early-stage drug discovery. In this study, plasma protein binding rate data for 2,452 clinical drugs were collected. Two software packages, Molecular Operating Environment (MOE) and Mordred, were used to calculate molecular descriptors, which served as the input features of the models. The extreme gradient boosting (XGBoost) and random forest (RF) algorithms were then used to build machine learning models. Models built on Mordred descriptors outperformed those built on MOE descriptors. The XGBoost and RF models performed similarly, and the best model from each algorithm achieved an R² of 0.715. The results also indicate that the plasma protein binding rate is closely related to several physicochemical properties of the drug molecule, such as water solubility, the octanol/water partition coefficient, and conjugated double bonds. Predicting the plasma protein binding rate from these parameters is convenient and fast, and can provide a reference for related pharmacokinetic studies.
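
    The abstract reports model quality as R² (the coefficient of determination). As a reminder of what that figure measures, here is a minimal Python sketch of the standard definition, 1 - SS_res / SS_tot. The function name `r_squared` and the sample values are invented for illustration, and the abstract does not say which implementation the authors used.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical bound fractions (0 to 1), invented for illustration only;
# they are not data from the study.
observed = [0.10, 0.35, 0.60, 0.85, 0.99]
predicted = [0.20, 0.30, 0.55, 0.80, 0.90]
score = r_squared(observed, predicted)
print(f"R^2 = {score:.3f}")
```

    A value of 1 means the predictions match the observations exactly, and 0 means the model does no better than always predicting the observed mean. Read this way, the reported 0.715 says the best models account for roughly 71.5% of the variance in the measured binding rates on the data used for evaluation.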
Publication history
  • Received:  2021-03-03
  • Revised:  2021-11-08
  • Published:  2021-12-24
