Optimizing Reinforcement Learning with Limited HRI Demonstrations: A Task-Oriented Weight Update Method with Analysis of Multi-Head and Layer Feature Combinations


Abstract

To address the challenge of training reinforcement learning (RL) networks with limited data in Human-Robot Interaction (HRI), we introduce a novel task-oriented update method that combines meta-inverse reinforcement learning (Meta-IRL) with a transformer encoder architecture. Our approach uses Meta-IRL to emulate expert trajectories, improving training efficiency by minimizing ineffective interactions. Through a systematic exploration of transformer encoder configurations, varying the number of attention heads and encoder layers, we optimize feature extraction for sparse data representations. Experimental validation shows performance improvements in RL networks trained with limited HRI expert demonstration data from Applied Behavior Analysis (ABA) scenarios.
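The abstract's exploration of head/layer combinations can be illustrated with a minimal sketch of a multi-head self-attention feature extractor whose head and layer counts are configurable. This is not the paper's implementation: the weights are random, there is no training, and all function names and dimensions below are hypothetical, chosen only to show how `num_heads` and `num_layers` parameterize the encoder.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, num_heads, rng):
    """One multi-head self-attention block (random weights, illustration only)."""
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0, "d_model must divide evenly across heads"
    d_head = d_model // num_heads
    out = np.zeros_like(x)
    for h in range(num_heads):
        # Hypothetical per-head query/key/value projections, drawn at random.
        Wq = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
        Wk = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
        Wv = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        attn = softmax(q @ k.T / np.sqrt(d_head))   # scaled dot-product attention
        out[:, h * d_head:(h + 1) * d_head] = attn @ v
    return out

def encoder_features(x, num_heads, num_layers, seed=0):
    """Stack num_layers attention blocks with residual connections."""
    rng = np.random.default_rng(seed)
    for _ in range(num_layers):
        x = x + multi_head_self_attention(x, num_heads, rng)  # residual add
    return x

# Example: an 8-step trajectory embedded in 16 dimensions,
# extracted with a 4-head, 2-layer configuration.
traj = np.random.default_rng(1).standard_normal((8, 16))
feats = encoder_features(traj, num_heads=4, num_layers=2)
```

Sweeping `num_heads` and `num_layers` over a grid, as the abstract describes, then amounts to re-instantiating this extractor per configuration and comparing downstream RL performance.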
