now publishers - Bayesian Multi-Temporal-Difference Learning

APSIPA Transactions on Signal and Information Processing > Vol 11 > Issue 1

Bayesian Multi-Temporal-Difference Learning

Jen-Tzung Chien, Institute of Electrical and Computer Engineering, National Yang Ming Chiao Tung University, Taiwan, jtchien@nycu.edu.tw

Yi-Chung Chiu, Institute of Electrical and Computer Engineering, National Yang Ming Chiao Tung University, Taiwan

Suggested Citation

Jen-Tzung Chien and Yi-Chung Chiu (2022), "Bayesian Multi-Temporal-Difference Learning", APSIPA Transactions on Signal and Information Processing: Vol. 11: No. 1, e34. http://dx.doi.org/10.1561/116.00000037

Publication Date: 24 Nov 2022

Keywords

Bayesian learning, variational autoencoder, sequential learning, temporal-difference learning, state machine

Open Access

This is published under the terms of CC BY-NC.

Abstract

This paper presents a new sequential learning method based on a planning strategy in which future samples are predicted by reflecting past experiences. Such a strategy is appealing for implementing an intelligent machine that foresees multiple time steps ahead instead of predicting step by step. In particular, a flexible sequential learning scheme is developed to predict future states directly, without visiting all intermediate states. A Bayesian approach to a multi-temporal-difference neural network is accordingly proposed to calculate the stochastic belief state of an abstract state machine, so as to capture large-span context and make high-level predictions. Importantly, the sequence data are represented by multiple jumpy states with varying temporal differences. The Bayesian state machine is trained by maximizing the variational lower bound of the log likelihood of the sequence data. A generalized sequence model with a varying number of Markov states is derived, with the previous temporal-difference variational autoencoder as a simplified realization. The predictive states are learned to roll forward with jumps. Experiments show that this approach can be effectively trained to predict jumpy states in various types of sequence data.

DOI:10.1561/116.00000037

Introduction
Bayesian Sequential Learning
Bayesian Temporal-Difference Learning
Bayesian Multi-Temporal-Difference Learning
Experiments
Conclusions
References