arxiv:2306.07307

Online Prototype Alignment for Few-shot Policy Transfer

Published on Jun 12, 2023

Abstract

Domain adaptation in reinforcement learning (RL) mainly deals with changes in observations when a policy is transferred to a new environment. Many traditional approaches to domain adaptation in RL learn a mapping function between the source and target domains, either explicitly or implicitly. However, they typically require access to abundant data from the target domain. In addition, they often rely on visual cues to learn the mapping function and may fail when the source domain looks quite different from the target domain. To address these problems, we propose a novel framework, Online Prototype Alignment (OPA), which learns the mapping function based on the functional similarity of elements and achieves few-shot policy transfer within only several episodes. The key insight of OPA is to introduce an exploration mechanism that interacts with the unseen elements of the target domain in an efficient and purposeful manner, and then connects them with the seen elements in the source domain according to their functionalities (instead of visual cues). Experimental results show that when the target domain looks visually different from the source domain, OPA achieves better transfer performance than prior methods, even with far fewer samples from the target domain.
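The idea of connecting unseen target elements to seen source elements by function rather than appearance can be illustrated with a toy sketch. This is not the paper's algorithm: the abstract does not specify how functionality is represented or matched, so the "effect signature" (mean reward and termination rate after touching an element), the nearest-neighbor matching, and all element names below are assumptions made purely for illustration.

```python
import math

def effect_signature(interactions):
    """Summarize one element by what happens when the agent touches it.

    `interactions` is a list of (reward, done) pairs gathered during a few
    exploratory episodes. The two-number signature is a hypothetical stand-in
    for a learned functional representation.
    """
    n = len(interactions)
    mean_reward = sum(r for r, _ in interactions) / n
    done_rate = sum(1.0 for _, done in interactions if done) / n
    return (mean_reward, done_rate)

def align_to_prototypes(target_signatures, source_prototypes):
    """Map each unseen target element to the source element whose
    signature is closest in Euclidean distance."""
    mapping = {}
    for name, sig in target_signatures.items():
        mapping[name] = min(
            source_prototypes,
            key=lambda src: math.dist(sig, source_prototypes[src]),
        )
    return mapping

# Hypothetical source-domain prototypes: (mean reward, termination rate).
source_prototypes = {
    "coin": (1.0, 0.0),
    "lava": (-1.0, 1.0),
    "wall": (0.0, 0.0),
}

# Hypothetical target-domain elements that look different but behave similarly.
target_interactions = {
    "blue_gem": [(1.0, False), (0.9, False)],
    "purple_spike": [(-1.0, True), (-0.8, True)],
    "grey_block": [(0.0, False), (0.1, False)],
}

target_signatures = {
    name: effect_signature(obs) for name, obs in target_interactions.items()
}
mapping = align_to_prototypes(target_signatures, source_prototypes)
print(mapping)
```

In this toy setting, only two interactions per element are needed to recover the mapping, which mirrors the few-shot framing of the abstract. A policy trained in the source domain could then, in principle, be reused by relabeling target elements according to `mapping`.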
