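Advances in the application of machine learning in the identification and authentication of synthetic cannabinoids

XU Qing, LYU Min, DENG Hongxiao, HU Chi, XIANG Ping, CHEN Hang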

Citation: XU Qing, LYU Min, DENG Hongxiao, et al. Advances in the application of machine learning in the identification and authentication of synthetic cannabinoids[J]. J China Pharm Univ, 2024, 55(3): 316-325. DOI: 10.11665/j.issn.1000-5048.2023113003

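Details
    About the author:

    CHEN Hang is an associate chief forensic physician and master's supervisor at the Academy of Forensic Science, a member of The International Association of Forensic Toxicologists (TIAFT), and was selected for the Shanghai Young Scientific and Technological Talents Program (上海市青年科技英才计划). Chen's work focuses on forensic toxicology research and on application-driven forensic public legal services. Chen has led or participated in several national and provincial/ministerial research projects, including National Key R&D special projects under the 12th and 13th Five-Year Plans; served as academic secretary for the 13th Five-Year national planned higher-education textbook Forensic Toxicology (《法医毒物学》) and its supporting materials; contributed to the monographs Handbook of Forensic Toxicology (《法医毒物学手册》), Theory and Practice of Forensic Toxicological Identification (《法医毒物鉴定理论与实践》), Analysis and Application of Substances of Abuse (《滥用物质分析与应用》), Fundamentals and Applications of Hair Analysis (《毛发分析基础及应用》), and Analysis and Application of New Psychoactive Substances (《新精神活性物质分析与应用》); and developed and registered digital software including the Forensic Material Management Information System (FSMS V1.0), the Forensic Toxicology Compound Knowledge Base System V2.0, and the Forensic Toxicology Digital Platform V2.0.

    Corresponding author:

    CHEN Hang: Tel: 021-52352955; E-mail: chenh@ssfjd.cn

  • CLC number: TP181; R917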

Funds: This study was supported by the National Key Research and Development Program of China (No. 2022YFC3300903), the Social Welfare Research Projects of Centralized Research Institutes (No. GY2022D-1), and the Project of Shanghai Key Laboratory of Forensic Medicine (No. 21DZ2270800).
  • Abstract:

    Synthetic cannabinoids (SCs) are synthetic psychoactive substances that pose a public health risk. Their structures are highly variable and easily modified, and the rapid emergence of SCs with unknown structures poses new challenges for their identification. In recent years, machine learning has advanced considerably and has been widely applied in other fields; it also offers new strategies for identifying structurally unknown SCs and inferring their possible sources. This paper describes the principles of commonly used machine learning methods and reviews the application of machine learning to mass spectrometry, Raman spectroscopy, metabolomics, and quantitative structure-activity relationship (QSAR) studies of synthetic cannabinoids, with the aim of providing new ideas for the identification of unknown synthetic cannabinoids.

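  • Figure 1   Structures of synthetic cannabinoids

    Figure 2   Comparison and scope of application of the algorithms commonly used when machine learning is combined with four other techniques to identify synthetic cannabinoids

    PCA: principal component analysis; MLR: multiple linear regression; ANN: artificial neural network; SVM: support vector machine; RF: random forest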

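    Table 1   Advantages and disadvantages of commonly used machine learning algorithms

    | Algorithm | Advantages | Disadvantages |
    | --- | --- | --- |
    | Principal component analysis | Reduces data dimensionality and removes noise; eases visualization and further processing; improves computational efficiency | Sensitive to outliers; limited by sample size and number of variables |
    | K-means clustering | Simple algorithm, easy to implement | Demanding on data type (suited to numerical data); K must be set in advance |
    | Hierarchical clustering | Highly interpretable; the number of clusters need not be set in advance | High computational complexity; sensitive to noise and outliers |
    | K-nearest neighbors | Mature theory; can be used for nonlinear classification | Computationally heavy and memory-intensive; unsuited to imbalanced samples |
    | Logistic regression | Simple to implement; low computational cost and fast at classification time | Prone to underfitting; in its basic form handles only binary classification |
    | Support vector machine | Strong generalization; can handle high-dimensional problems | With large sample sizes, computational complexity rises and training time increases sharply |
    | Decision tree | Easy to understand and interpret; supports visual analysis; fairly well suited to samples with missing attributes | Difficulty handling missing data; prone to overfitting |
    | Random forest | Handles fairly high-dimensional data without dimensionality reduction; can rank feature importance; resists overfitting; can balance errors on imbalanced datasets | Overfits on classification problems with heavy noise |
    | Neural networks | Strong nonlinear fitting ability; can map complex nonlinear relationships; high robustness and self-learning ability | Prediction accuracy drops when data are scarce; cannot readily explain the model's reasoning process |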
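
    The K-nearest neighbors row of Table 1 can be illustrated with a minimal sketch in plain Python. The function name `knn_predict`, the choice of Euclidean distance, `k=3`, and the toy two-feature data are all assumptions made for illustration; none of them come from the article or from any real spectral dataset.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Each prediction scans and sorts the whole stored training set, which is
    # the computation and memory cost that Table 1 lists as a drawback.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, balanced toy data: two well-separated classes "A" and "B".
train_X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 1.1), (0.9, 1.0), (1.2, 0.8)]
train_y = ["A", "A", "A", "B", "B", "B"]

near_a = knn_predict(train_X, train_y, (0.1, 0.1))
near_b = knn_predict(train_X, train_y, (1.0, 1.0))

# Hypothetical imbalanced data: four "A" samples and a single "B" sample.
imb_X = [(0.0, 0.0), (0.5, 0.5), (0.4, 0.6), (0.6, 0.4), (1.0, 1.0)]
imb_y = ["A", "A", "A", "A", "B"]

# The query's single closest point is the "B" sample, yet the 3-neighbor
# vote is won by the majority class.
outvoted = knn_predict(imb_X, imb_y, (0.9, 0.9))

print(near_a, near_b, outvoted)
```

    In the imbalanced case the query sits right next to the only "B" sample but is still assigned to "A", which is one way to read the table's remark that the method is unsuited to imbalanced samples.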
Publication history
  • Received:  2023-11-29
  • Published online:  2024-06-24
  • Issue date:  2024-06-24
