Improving Compound–Protein Interaction Prediction by Self-Training with Augmenting Negative Samples

Koyama, Takuto; Matsumoto, Shigeyuki; Iwata, Hiroaki; Kojima, Ryosuke; Okuno, Yasushi

http://hdl.handle.net/2433/285076

Files in this item:

File	Description	Size	Format
acs.jcim.3c00269.pdf		2.72 MB	Adobe PDF

Full metadata record

DC Field	Value	Language
dc.contributor.author	Koyama, Takuto	en
dc.contributor.author	Matsumoto, Shigeyuki	en
dc.contributor.author	Iwata, Hiroaki	en
dc.contributor.author	Kojima, Ryosuke	en
dc.contributor.author	Okuno, Yasushi	en
dc.contributor.alternative	小山, 拓豊	ja
dc.contributor.alternative	松本, 篤幸	ja
dc.contributor.alternative	岩田, 浩明	ja
dc.contributor.alternative	小島, 諒介	ja
dc.contributor.alternative	奥野, 恭史	ja
dc.date.accessioned	2023-09-13T05:11:08Z	-
dc.date.available	2023-09-13T05:11:08Z	-
dc.date.issued	2023-08-14	-
dc.identifier.uri	http://hdl.handle.net/2433/285076	-
dc.description.abstract	Identifying compound-protein interactions (CPIs) is crucial for drug discovery. Since experimentally validating CPIs is often time-consuming and costly, computational approaches are expected to facilitate the process. The rapid growth of available CPI databases has accelerated the development of many machine-learning methods for CPI predictions. However, their performance, particularly their generalizability against external data, often suffers from a data imbalance attributed to the lack of experimentally validated inactive (negative) samples. In this study, we developed a self-training method for augmenting both credible and informative negative samples to improve the performance of models impaired by data imbalances. The constructed model demonstrated higher performance than those constructed with other conventional methods for solving data imbalances, and the improvement was prominent for external datasets. Moreover, examination of the prediction score thresholds for pseudo-labeling during self-training revealed that augmenting the samples with ambiguous prediction scores is beneficial for constructing a model with high generalizability. The present study provides guidelines for improving CPI predictions on real-world data, thus facilitating drug discovery.	en
dc.language.iso	eng	-
dc.publisher	American Chemical Society (ACS)	en
dc.rights	© 2022 The Authors. Published by American Chemical Society	en
dc.rights	This publication is licensed under CC-BY-NC-ND 4.0.	en
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/	-
dc.subject	Algorithms	en
dc.subject	Drug discovery	en
dc.subject	Peptides and proteins	en
dc.subject	Receptors	en
dc.subject	Students	en
dc.title	Improving Compound–Protein Interaction Prediction by Self-Training with Augmenting Negative Samples	en
dc.type	journal article	-
dc.type.niitype	Journal Article	-
dc.identifier.jtitle	Journal of Chemical Information and Modeling	en
dc.identifier.volume	63	-
dc.identifier.issue	15	-
dc.identifier.spage	4552	-
dc.identifier.epage	4559	-
dc.relation.doi	10.1021/acs.jcim.3c00269	-
dc.textversion	publisher	-
dc.identifier.pmid	37460105	-
dcterms.accessRights	open access	-
datacite.awardNumber	20K12063	-
datacite.awardNumber.uri	https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-20K12063/	-
dc.identifier.pissn	1549-9596	-
dc.identifier.eissn	1549-960X	-
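jpcoar.funderName	日本学術振興会 (Japan Society for the Promotion of Science)	ja
jpcoar.awardTitle	苦味受容体におけるAI・シミュレーション・進化解析の融合解析フレームワークの構築 (Construction of an integrated analysis framework combining AI, simulation, and evolutionary analysis for bitter taste receptors)	ja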
Appears in collections:	Journal Articles (学術雑誌掲載論文等)

This item is licensed under a Creative Commons License.