Linear Support
- class morl_baselines.multi_policy.linear_support.linear_support.LinearSupport(num_objectives: int, epsilon: float = 0.0, verbose: bool = True)
Linear Support for computing corner weights when using linear utility functions.
Implements both:
Optimistic Linear Support (OLS), described in Section 3.3 of http://roijers.info/pub/thesis.pdf
Generalized Policy Improvement Linear Support (GPI-LS), described in https://arxiv.org/abs/2301.07784
Initialize Linear Support.
- Parameters:
num_objectives (int) – Number of objectives
epsilon (float, optional) – Minimum improvement per iteration. Defaults to 0.0.
verbose (bool) – Whether to print verbose output. Defaults to True, per the class signature.
- add_solution(value: ndarray, w: ndarray) -> List[int]
Adds a new value vector that is optimal for weight w.
- Parameters:
value (np.ndarray) – New value vector
w (np.ndarray) – Weight vector
- Returns:
List of indices of the value vectors removed from the convex coverage set (CCS) because they are dominated.
- compute_corner_weights() -> List[ndarray]
Returns the corner weights for the current set of values.
See Definition 19 of http://roijers.info/pub/thesis.pdf. Note: the definition of the corner weights in the thesis contains a typo; the >= sign should be <=.
- Returns:
List of corner weights.
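For intuition, corner weights in the two-objective case can be found by hand. Each value vector defines a line over w0 in [0, 1], and the corner weights are the breakpoints of the upper envelope of those lines, plus the two extreme weights. The following is a minimal sketch under that assumption. The function name `corner_weights_2d` is hypothetical, and the sketch is not the library's implementation, which handles any number of objectives.

```python
import numpy as np

def corner_weights_2d(values, tol=1e-9):
    """Corner weights of max_v (w . v) for two objectives (illustrative sketch)."""
    values = [np.asarray(v, dtype=float) for v in values]
    # The extreme weights (0, 1) and (1, 0) are always candidates.
    candidates = {0.0, 1.0}
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            a, b = values[i], values[j]
            # Solve w0*a0 + (1 - w0)*a1 = w0*b0 + (1 - w0)*b1 for w0.
            denom = (a[0] - a[1]) - (b[0] - b[1])
            if abs(denom) < tol:
                continue  # parallel lines never cross
            w0 = (b[1] - a[1]) / denom
            if not (0.0 < w0 < 1.0):
                continue  # crossing lies outside the weight simplex
            w = np.array([w0, 1.0 - w0])
            best = max(float(w @ v) for v in values)
            # Keep the crossing only if it lies on the upper envelope.
            if abs(float(w @ a) - best) < tol:
                candidates.add(w0)
    return [np.array([w0, 1.0 - w0]) for w0 in sorted(candidates)]

# Two value vectors cross at the balanced weight (0.5, 0.5).
for w in corner_weights_2d([[1.0, 0.0], [0.0, 1.0]]):
    print(w)
```

Adding a third vector such as [0.6, 0.6] would replace the single interior corner with two new ones, because the new vector is better than both others around the balanced weight.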
- ended() -> bool
Returns True if there are no more corner weights to test.
Warning: This method must be called AFTER calling next_weight(). Example:
    w = ols.next_weight()
    if ols.ended():
        print("OLS ended.")
- get_corner_weights(top_k: int | None = None) -> List[ndarray]
Returns the corner weights of the current CCS.
- Parameters:
top_k – If not None, returns only the top_k corner weights; otherwise returns all of them.
- Returns:
List[np.ndarray] – List of corner weights.
- get_weight_support() -> List[ndarray]
Returns the weight support of the CCS.
- Returns:
List[np.ndarray] – List of weight vectors of the CCS
- gpi_ls_priority(w: ndarray, gpi_expanded_set: List[ndarray]) -> float
Get the priority of a weight vector for GPI-LS.
- Parameters:
w – Weight vector
gpi_expanded_set – List of value vectors in the GPI-expanded set
- Returns:
Priority of the weight vector.
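One plausible reading of this priority, following the GPI-LS paper's idea of ranking weights by estimated improvement, is the gap between the best scalarized value in the GPI-expanded set and the best scalarized value already in the CCS. The sketch below assumes that reading. The function name `gpi_ls_priority_sketch` and the explicit `ccs` argument are hypothetical, and the library's actual computation may differ.

```python
import numpy as np

def gpi_ls_priority_sketch(w, ccs, gpi_expanded_set):
    """Estimated improvement at weight w (illustrative, not the library code)."""
    # Best scalarized value the GPI-expanded set is estimated to reach at w.
    best_gpi = max(float(np.dot(v, w)) for v in gpi_expanded_set)
    # Best scalarized value already stored in the CCS at w.
    best_ccs = max(float(np.dot(v, w)) for v in ccs)
    return best_gpi - best_ccs

w = np.array([0.5, 0.5])
ccs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
gpi_set = [np.array([0.8, 0.8])]
priority = gpi_ls_priority_sketch(w, ccs, gpi_set)
print(priority)
```

A larger gap suggests that training on w is more likely to add a useful value vector to the CCS.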
- is_dominated(value: ndarray) -> bool
Checks if the value is dominated by any of the values in the CCS.
- Parameters:
value – Value vector
- Returns:
True if the value is dominated by any of the values in the CCS, False otherwise.
- max_scalarized_value(w: ndarray) -> float | None
Returns the maximum scalarized value for weight vector w.
- Parameters:
w – Weight vector
- Returns:
Maximum scalarized value for weight vector w. The return annotation also allows None, presumably when the CCS is empty.
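Under a linear utility function, the scalarized value of a value vector v for weight w is the dot product of w and v, so the maximum is taken over the stored value vectors. A minimal sketch, assuming the CCS is a plain list of arrays and that an empty CCS yields None (an assumption based on the `float | None` annotation). The function name `max_scalarized_value_sketch` is hypothetical.

```python
import numpy as np

def max_scalarized_value_sketch(ccs, w):
    """Largest w . v over the stored value vectors, or None if there are none."""
    if len(ccs) == 0:
        return None  # assumption: mirrors the `float | None` return annotation
    return float(max(np.dot(v, w) for v in ccs))

ccs = [np.array([1.0, 0.0]), np.array([0.2, 0.9])]
best = max_scalarized_value_sketch(ccs, np.array([0.5, 0.5]))
print(best)
```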
- max_value_lp(w_new: ndarray) -> float
Returns an upper-bound for the maximum value of the scalarized objective.
- Parameters:
w_new – New weight vector
- Returns:
Upper-bound for the maximum value of the scalarized objective.
- next_weight(algo: str = 'ols', gpi_agent: MOPolicy | None = None, env: Env | None = None, rep_eval: int = 1) -> ndarray
Returns the next weight vector, i.e. the one with the highest priority.
- Parameters:
algo (str) – Algorithm to use. Either 'ols' or 'gpi-ls'.
gpi_agent (Optional[MOPolicy]) – Agent to use for GPI-LS.
env (Optional[Env]) – Environment to use for GPI-LS.
rep_eval (int) – Number of times to evaluate the agent in GPI-LS.
- Returns:
np.ndarray – Next weight vector
- ols_priority(w: ndarray) -> float
Get the priority of a weight vector for OLS.
- Parameters:
w – Weight vector
- Returns:
Priority of the weight vector.
- remove_obsolete_values(value: ndarray) -> List[int]
Removes the value vectors that are no longer optimal for any weight vector after the new value vector is added.
- Parameters:
value (np.ndarray) – New value vector
- Returns:
The indices of the removed values.
- remove_obsolete_weights(new_value: ndarray) -> List[ndarray]
Removes from the queue the weight vectors for which the new value vector achieves a higher scalarized value than all previously stored value vectors.
- Parameters:
new_value – New value vector
- Returns:
List of weight vectors removed from the queue.
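The pruning rule described above can be sketched as a filter over the queued weights. This sketch assumes the queue is a plain list of weight vectors and the CCS a non-empty list of value vectors. The function name `remove_obsolete_weights_sketch` and its arguments are hypothetical, and the library's queue may store additional data such as priorities.

```python
import numpy as np

def remove_obsolete_weights_sketch(queue, ccs, new_value):
    """Split queued weights into (kept, removed) given a new value vector."""
    kept, removed = [], []
    for w in queue:
        # Best scalarized value among the previously stored vectors at w.
        current_best = max(float(np.dot(v, w)) for v in ccs)
        if float(np.dot(new_value, w)) > current_best:
            removed.append(w)  # the new vector is better here, so w is obsolete
        else:
            kept.append(w)
    return kept, removed

ccs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
queue = [np.array([0.5, 0.5]), np.array([1.0, 0.0])]
kept, removed = remove_obsolete_weights_sketch(queue, ccs, np.array([0.6, 0.6]))
print(len(kept), len(removed))
```

A weight where the new vector wins is no longer a corner of the updated envelope, which is why it can be dropped from the queue.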