Files in This Item:
File	Description	Size	Format
djohk00838.pdf	Dissertation (full text)	12.87 MB	Adobe PDF	View/Open
yjohk00838.pdf	Abstract	191.76 kB	Adobe PDF	View/Open
Title: Offline Reinforcement Learning from Imperfect Human Guidance
Other Titles: 不完全な人間の誘導からのオフライン強化学習
Authors: Zhang, Guoxi
Author's alias: 张, 国熙
Keywords: Offline Reinforcement Learning
Preference-based Reinforcement Learning
Human-in-the-loop Reinforcement Learning
Issue Date: 24-Jul-2023
Publisher: Kyoto University
Conferring University: Kyoto University
Degree Level: Doctoral degree (new system, course-based doctorate)
Degree Discipline: Doctor of Informatics
Degree Report no.: 甲第24856号
Degree no.: 情博第838号
Conferral date: 2023-07-24
Degree Call no.: 新制||情||140(附属図書館)
Degree Affiliation: Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University
Examination Committee members: (Chief examiner) Professor Kashima, Hisashi; Professor Kawahara, Tatsuya; Professor Morimoto, Jun
Provisions of the Ruling of Degree: Falls under Article 4, Paragraph 1 of the Degree Regulations
Rights: Chapter 3 is based on [1] and [2]; Chapter 4 is based on [3]; Chapter 5 is based on [4] and [5].
1. G. Zhang and H. Kashima. Batch reinforcement learning from crowds. In Machine Learning and Knowledge Discovery in Databases, pages 38–51. Springer Cham, 2023.
2. G. Zhang, J. Li, and H. Kashima. Improving pairwise rank aggregation via querying for rank difference. In Proceedings of the Ninth IEEE International Conference on Data Science and Advanced Analytics. IEEE, 2022.
3. G. Zhang and H. Kashima. Learning state importance for preference-based reinforcement learning. Machine Learning, 2023.
4. G. Zhang and H. Kashima. Behavior estimation from multi-source data for offline reinforcement learning. In Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence. AAAI Press, 2023.
5. G. Zhang, X. Yao, and X. Xiao. On modeling long-term user engagement from stochastic feedback. In Companion Proceedings of the ACM Web Conference 2023. Association for Computing Machinery, 2023.
DOI: 10.14989/doctor.k24856
Appears in Collections:140 Doctoral Dissertation (Informatics)