
T5MHCII: deep learning-based model for MHC-II peptide binding affinity prediction

  • Abstract: Existing models for predicting the binding affinity between antigenic peptides and specific MHC class II molecules perform too poorly to meet clinical needs. To address this, we propose T5MHCII, a deep learning-based model for predicting the binding affinity between MHC class II molecules and peptides. Using transfer learning, the model draws on the knowledge already learned by the protein language model ProtT5 to extract features from amino acid sequences and generate high-quality representations, which are then combined with the learning capacity of deep learning to yield a model with good predictive performance. In five-fold cross-validation, the model achieved an area under the receiver operating characteristic curve (AUC) of 0.893±0.003 and a Pearson correlation coefficient (PCC) of 0.780±0.006, outperforming NetMHCIIpan-3.2, PUFFIN, DeepMHCII, and RPEMH. Leave-one-molecule-out validation further showed that the model has better generalization ability. This study offers a new deep learning approach for predicting peptide-MHCII binding affinity more accurately when data are limited.
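The abstract reports AUC and PCC as a mean ± spread over five folds. As a rough illustration of how such figures are typically computed, here is a minimal sketch using only the Python standard library. It is not the authors' evaluation code. The function names are hypothetical, the binarization threshold of 0.426 (roughly an IC50 of 500 nM on the common 1 - log(IC50)/log(50000) scale) is an assumption that the abstract does not state, and the use of the sample standard deviation for the ± term is likewise assumed.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is quadratic in the number of
    samples, which is fine for a sketch but slow for large benchmarks.
    """
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate_fold(y_true, y_pred, threshold=0.426):
    """Return (AUC, PCC) for one fold.

    `threshold` is an assumed binder/non-binder cutoff on the transformed
    affinity scale; it is not taken from the abstract.
    """
    labels = [y >= threshold for y in y_true]
    return auc(labels, y_pred), pearson(y_true, y_pred)


def summarize(values):
    """Mean and sample standard deviation across folds (assumed ± convention)."""
    return mean(values), stdev(values)
```

Calling `evaluate_fold` once per fold and passing the five AUC values (and, separately, the five PCC values) to `summarize` yields numbers in the same "mean ± spread" form as those quoted above.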

     
