Dynamical-VAE-based Hindsight to Learn the Causal Dynamics of Factored-POMDPs - Université de Lille
Preprint, Working Paper. Year: 2024

Abstract

Learning representations of underlying environmental dynamics from partial observations is a critical challenge in machine learning. In the context of Partially Observable Markov Decision Processes (POMDPs), state representations are often inferred from the history of past observations and actions. We demonstrate that incorporating future information is essential to accurately capture causal dynamics and enhance state representations. To address this, we introduce a Dynamical Variational Auto-Encoder (DVAE) designed to learn causal Markovian dynamics from offline trajectories in a POMDP. Our method employs an extended hindsight framework that integrates past, current, and multi-step future information within a factored-POMDP setting. Empirical results reveal that this approach uncovers the causal graph governing hidden state transitions more effectively than history-based and typical hindsight-based models.

Dates and versions

hal-04785076, version 1 (15-11-2024)

Cite

Chao Han, Debabrota Basu, Michael Mangan, Eleni Vasilaki, Aditya Gilra. Dynamical-VAE-based Hindsight to Learn the Causal Dynamics of Factored-POMDPs. 2024. ⟨hal-04785076⟩