Downloads: 168

Files in this item:
File | Description | Size | Format
fnbot.2017.00001.pdf | | 15.82 MB | Adobe PDF
Title: Adaptive Baseline Enhances EM-Based Policy Search: Validation in a View-Based Positioning Task of a Smartphone Balancer
Authors: Wang, Jiexin
Uchibe, Eiji
Doya, Kenji
Keywords: smartphone robot
reinforcement learning
EM-based policy search
non-linear motor control
vision-based control
Issue date: 23-Jan-2017
Publisher: Frontiers Media SA
Journal: Frontiers in Neurorobotics
Volume: 11
Article number: 1
Abstract: EM-based policy search methods estimate a lower bound of the expected return from the histories of episodes and iteratively update the policy parameters by maximizing that lower bound, which makes gradient calculation and learning rate tuning unnecessary. Previous algorithms such as Policy learning by Weighting Exploration with the Returns, Fitness Expectation Maximization, and EM-based Policy Hyperparameter Exploration implemented mechanisms to discard useless low-return episodes either implicitly or using a fixed baseline determined by the experimenter. In this paper, we propose an adaptive baseline method to discard worse samples from the reward history and examine different baselines, including the mean and multiples of SDs from the mean. The simulation results of benchmark tasks of pendulum swing up and cart-pole balancing, and of standing up and balancing of a two-wheeled smartphone robot, showed improved performance. We further implemented the adaptive baseline with the mean in our two-wheeled smartphone robot hardware to test its performance in the standing up and balancing task and in a view-based approaching task. Our results showed that, with the adaptive baseline, the method outperformed the previous algorithms and achieved faster and more precise behaviors at a higher success rate.
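The baseline idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `adaptive_baseline_update`, the Gaussian search distribution over policy parameters, and the choice to weight each surviving sample by its excess return over the baseline are all assumptions made for illustration. Only the baseline itself (the mean of the returns, optionally shifted by a multiple `k` of their SD) follows the abstract.

```python
import numpy as np

def adaptive_baseline_update(thetas, returns, mu, sigma, k=0.0):
    """One reward-weighted update of a Gaussian search distribution.

    thetas  : (n_samples, n_params) sampled policy parameters
    returns : (n_samples,) episode return obtained with each sample
    mu, sigma : (n_params,) current mean and SD of the search distribution
    k       : baseline offset in SDs of the returns (k=0 is the mean)
    """
    thetas = np.asarray(thetas, dtype=float)
    returns = np.asarray(returns, dtype=float)

    # Adaptive baseline: recomputed from the current batch of returns.
    baseline = returns.mean() + k * returns.std()

    # Discard samples at or below the baseline (weight 0). Weighting the
    # rest by their excess return is an assumption of this sketch.
    weights = np.where(returns > baseline, returns - baseline, 0.0)
    total = weights.sum()
    if total <= 0.0:
        # No sample beats the baseline: leave the distribution unchanged.
        return mu, sigma

    weights = weights / total
    new_mu = weights @ thetas
    # Spread is measured around the old mean (also an assumption here).
    new_sigma = np.sqrt(weights @ (thetas - mu) ** 2)
    return new_mu, new_sigma
```

In use, one would repeatedly sample `thetas` from a normal distribution with the current `mu` and `sigma`, run one episode per sample to obtain `returns`, and call the function to get the next `mu` and `sigma`. No gradient or learning rate appears anywhere in the update, which is the property the abstract highlights.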
Rights: © 2017 Wang, Uchibe and Doya. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
URI: http://hdl.handle.net/2433/218530
DOI (publisher version): 10.3389/fnbot.2017.00001
PubMed ID: 28167910
Appears in collections: Journal Articles, etc.

All items held in this repository are protected by copyright.