Overview
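MORL-Baselines implements several multi-objective reinforcement learning (MORL) algorithms. The table below lists the algorithms currently available, whether each learns a single policy or a set of policies, which optimality criterion it targets (ESR: expected scalarized return; SER: scalarized expected return), and which observation and action spaces it supports.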
| Name | Single/Multi-policy | ESR/SER | Observation space | Action space | Paper |
|---|---|---|---|---|---|
| | Multi | SER | Continuous | Discrete / Continuous | |
| | Multi | / | / | / | |
| | Multi | SER | Continuous | Discrete | |
| | Multi | SER | Continuous | Continuous | |
| | Multi | SER | Continuous | Continuous | |
| | Multi | SER/ESR <sup>2</sup> | Continuous | Discrete / Continuous | |
| | Multi | SER | Discrete | Discrete | |
| | Single | SER | Discrete | Discrete | |
| MPMOQLearning (outer loop MOQL) | Multi | SER | Discrete | Discrete | |
| | Multi | SER | / | / | Section 3.3 of the thesis |
| | Single | ESR | Discrete | Discrete | |
:warning: Some of the algorithms have limited features.
1: Currently, PGMORL is limited to environments with 2 objectives.
2: PCN assumes environments with deterministic transitions.
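
The ESR/SER column refers to two optimality criteria that differ only in the order of averaging and scalarization: SER applies the utility function to the expected vector return, while ESR takes the expectation of the utility of each episode's return. The sketch below illustrates the difference with plain Python. The episode returns and both utility functions are hypothetical values chosen for illustration; none of the names come from the MORL-Baselines API.

```python
from statistics import mean

# Hypothetical vector returns (objective 1, objective 2) of one policy over
# two equally likely episodes; the numbers are illustrative only.
episode_returns = [(4.0, 0.0), (0.0, 4.0)]


def product_utility(vec):
    """Hypothetical nonlinear utility: the product of both objectives."""
    return vec[0] * vec[1]


def linear_utility(vec, weights=(0.5, 0.5)):
    """Linear utility: weighted sum of the objectives."""
    return sum(w * v for w, v in zip(weights, vec))


def ser(utility, returns):
    """Scalarized expected return: average the vectors, then apply the utility."""
    expected = tuple(mean(component) for component in zip(*returns))
    return utility(expected)


def esr(utility, returns):
    """Expected scalarized return: apply the utility per episode, then average."""
    return mean(utility(r) for r in returns)


print(ser(product_utility, episode_returns))  # 4.0
print(esr(product_utility, episode_returns))  # 0.0
print(ser(linear_utility, episode_returns))   # 2.0
print(esr(linear_utility, episode_returns))   # 2.0
```

With the nonlinear utility the two criteria disagree (4.0 versus 0.0), whereas with a linear utility they coincide, which is why the distinction matters mainly when the user's utility is nonlinear.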