Conservative Q Learning TD error not converging
Hi, I am using the discrete Conservative Q-Learning (CQL) implementation in the d3rlpy library (https://github.com/takuseno/d3rlpy) to train a policy offline that optimizes mechanical ventilation treatment, using the MIMIC-III dataset (https://physionet.org/content/mimiciii-demo/1.4/).
The state space for my problem is a set of 38 measurements taken from the MIMIC-III dataset, such as heart rate and blood pressure.
The action space is a combination of 3 settings (positive end-expiratory pressure, fraction of inspired oxygen, and adjusted tidal volume) on the …
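For readers unfamiliar with combined discrete action spaces, here is a minimal sketch of one common way to map three discretized settings onto the single integer action that a discrete Q-learning algorithm expects. The bin counts in `N_BINS` and the helper names `encode_action` and `decode_action` are hypothetical, chosen only for illustration; the post does not state how each setting is binned.

```python
import numpy as np

# Hypothetical number of bins for (PEEP, FiO2, tidal volume).
# These values are assumptions for illustration, not taken from the post.
N_BINS = (7, 7, 7)


def encode_action(peep_bin, fio2_bin, vt_bin, n_bins=N_BINS):
    """Map three per-setting bin indices to one flat discrete action index."""
    return int(np.ravel_multi_index((peep_bin, fio2_bin, vt_bin), n_bins))


def decode_action(action, n_bins=N_BINS):
    """Recover the three per-setting bin indices from a flat action index."""
    return tuple(int(i) for i in np.unravel_index(action, n_bins))


if __name__ == "__main__":
    a = encode_action(2, 3, 4)
    print(a, decode_action(a))
```

With this encoding the size of the discrete action space is the product of the bin counts, which is worth keeping in mind because many of those combinations may be rare or absent in an offline dataset.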