Linear Support¶

class morl_baselines.multi_policy.linear_support.linear_support.LinearSupport(num_objectives: int, epsilon: float = 0.0, verbose: bool = True)¶

Linear Support for computing corner weights when using linear utility functions.

Implements both:

Optimistic Linear Support (OLS). Paper: Section 3.3 of http://roijers.info/pub/thesis.pdf

Generalized Policy Improvement Linear Support (GPI-LS). Paper: https://arxiv.org/abs/2301.07784

Initialize Linear Support.

Parameters:

num_objectives (int) – Number of objectives
epsilon (float, optional) – Minimum improvement per iteration. Defaults to 0.0.
verbose (bool) – Verbosity flag. Defaults to True.

add_solution(value: ndarray, w: ndarray) → List[int]¶

Adds a new value vector that is optimal for weight w.

Parameters:

value (np.ndarray) – New value vector
w (np.ndarray) – Weight vector

Returns:

List of indices of the value vectors removed from the CCS (convex coverage set) for being dominated.

compute_corner_weights() → List[ndarray]¶

Returns the corner weights for the current set of values.

See Definition 19 in http://roijers.info/pub/thesis.pdf. Note: the definition of the corner weights in the thesis contains a typo; the >= sign should be <=.

Returns: List of corner weights.
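
For intuition, with two objectives the corner weights are the points where the upper envelope of the scalarized values changes slope. The following is a minimal pure-Python sketch for the two-objective case only. `corner_weights_2d` and `scalarize` are hypothetical helpers, not part of the library, and the sketch returns only interior corners (it leaves out the extrema (1, 0) and (0, 1)).

```python
from itertools import combinations


def scalarize(value, w):
    """Dot product of a value vector and a weight vector."""
    return sum(v_i * w_i for v_i, w_i in zip(value, w))


def corner_weights_2d(values, tol=1e-9):
    """Interior corner weights for two objectives (illustrative helper)."""
    corners = set()
    for a, b in combinations(values, 2):
        # Solve a . w == b . w for w = (x, 1 - x).
        denom = (a[0] - a[1]) - (b[0] - b[1])
        if abs(denom) < tol:
            continue  # the two lines are parallel and never cross
        x = (b[1] - a[1]) / denom
        if not tol < x < 1.0 - tol:
            continue  # crossing lies outside the open interval (0, 1)
        w = (x, 1.0 - x)
        # Keep the crossing only if it lies on the upper envelope.
        best = max(scalarize(v, w) for v in values)
        if abs(scalarize(a, w) - best) < tol:
            corners.add(w)
    return sorted(corners)


values = [(4.0, 0.0), (0.0, 4.0), (3.0, 3.0)]
print(corner_weights_2d(values))  # [(0.25, 0.75), (0.75, 0.25)]
```

The crossing of (4, 0) and (0, 4) at w = (0.5, 0.5) is discarded because (3, 3) scores higher there.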

ended() → bool¶

Returns True if there are no more corner weights to test.

Warning: This method must be called AFTER calling next_weight(). Example:

    w = ols.next_weight()
    if ols.ended():
        print("OLS ended.")

get_corner_weights(top_k: int | None = None) → List[ndarray]¶

Returns the corner weights of the current CCS.

Parameters: top_k – If not None, returns the top_k corner weights.
Returns: List[np.ndarray] – List of corner weights.

get_weight_support() → List[ndarray]¶

Returns the weight support of the CCS.

Returns: List[np.ndarray] – List of weight vectors of the CCS.

gpi_ls_priority(w: ndarray, gpi_expanded_set: List[ndarray]) → float¶

Get the priority of a weight vector for GPI-LS.

Parameters:

w – Weight vector
gpi_expanded_set – List of value vectors to compare against the CCS (see the GPI-LS paper)

Returns: Priority of the weight vector.
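
As a rough illustration, the GPI-LS priority of a weight can be read as how much the best vector in `gpi_expanded_set` improves on the best vector in the CCS at that weight. The sketch below assumes this reading of the GPI-LS paper. `gpi_ls_priority_sketch` is a hypothetical standalone function, not the library method, and it uses plain tuples instead of NumPy arrays.

```python
def scalarize(value, w):
    """Dot product of a value vector and a weight vector."""
    return sum(v_i * w_i for v_i, w_i in zip(value, w))


def gpi_ls_priority_sketch(w, ccs, gpi_expanded_set):
    """Improvement of the GPI set over the CCS at weight w (assumed form)."""
    best_gpi = max(scalarize(v, w) for v in gpi_expanded_set)
    best_ccs = max(scalarize(v, w) for v in ccs)
    return best_gpi - best_ccs


ccs = [(4.0, 0.0), (0.0, 4.0)]
gpi_expanded_set = [(3.0, 3.0), (1.0, 1.0)]
print(gpi_ls_priority_sketch((0.5, 0.5), ccs, gpi_expanded_set))  # 1.0
```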

is_dominated(value: ndarray) → bool¶

Checks if the value is dominated by any of the values in the CCS.

Parameters: value – Value vector
Returns: True if the value is dominated by any of the values in the CCS, False otherwise.

max_scalarized_value(w: ndarray) → float | None¶

Returns the maximum scalarized value for weight vector w.

Parameters: w – Weight vector
Returns: Maximum scalarized value for weight vector w, or None when the CCS is empty.
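
The scalarized value of a value vector v under a linear utility is the dot product v · w, so the maximum is taken over all vectors currently in the CCS. A minimal sketch, assuming the CCS is a plain list of tuples and that an empty CCS yields None (consistent with the `float | None` return type). `max_scalarized_value_sketch` is a hypothetical name.

```python
def max_scalarized_value_sketch(ccs, w):
    """Largest dot product v . w over the value vectors in ccs."""
    if not ccs:
        return None  # assumed behaviour for an empty CCS
    return max(sum(v_i * w_i for v_i, w_i in zip(v, w)) for v in ccs)


ccs = [(4.0, 0.0), (0.0, 4.0), (3.0, 3.0)]
print(max_scalarized_value_sketch(ccs, (0.5, 0.5)))  # 3.0
print(max_scalarized_value_sketch([], (0.5, 0.5)))   # None
```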

max_value_lp(w_new: ndarray) → float¶

Returns an upper-bound for the maximum value of the scalarized objective.

Parameters: w_new – New weight vector
Returns: Upper bound for the maximum value of the scalarized objective.
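
In OLS, this optimistic bound comes from a linear program: maximize w_new · v over hypothetical value vectors v, subject to w_i · v <= V_i for every weight w_i already evaluated, where V_i is the best scalarized value found at w_i. The sketch below solves that program for two objectives only, by enumerating constraint intersections. It is not the library's solver. `upper_bound_2d` is a hypothetical helper, and it assumes the extrema (1, 0) and (0, 1) are among the evaluated weights so that the program is bounded.

```python
from itertools import combinations


def upper_bound_2d(w_new, weights, best_values, tol=1e-9):
    """Maximize w_new . v subject to w_i . v <= best_values[i] (2 objectives)."""
    constraints = list(zip(weights, best_values))
    best = None
    for (wa, ba), (wb, bb) in combinations(constraints, 2):
        det = wa[0] * wb[1] - wa[1] * wb[0]
        if abs(det) < tol:
            continue  # parallel constraints have no unique intersection
        # Cramer's rule for the point where both constraints are tight.
        v0 = (ba * wb[1] - wa[1] * bb) / det
        v1 = (wa[0] * bb - ba * wb[0]) / det
        feasible = all(w[0] * v0 + w[1] * v1 <= b + tol for w, b in constraints)
        if feasible:
            value = w_new[0] * v0 + w_new[1] * v1
            if best is None or value > best:
                best = value
    return best


weights = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
best_values = [4.0, 4.0, 3.0]
print(upper_bound_2d((0.75, 0.25), weights, best_values))  # 3.5
```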

next_weight(algo: str = 'ols', gpi_agent: MOPolicy | None = None, env: Env | None = None, rep_eval: int = 1) → ndarray¶

Returns the weight vector with the highest priority.

Parameters:

algo (str) – Algorithm to use. Either 'ols' or 'gpi-ls'.
gpi_agent (Optional[MOPolicy]) – Agent to use for GPI-LS.
env (Optional[Env]) – Environment to use for GPI-LS.
rep_eval (int) – Number of times to evaluate the agent in GPI-LS.

Returns:

np.ndarray – Next weight vector

ols_priority(w: ndarray) → float¶

Get the priority of a weight vector for OLS.

Parameters: w – Weight vector
Returns: Priority of the weight vector.
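
In OLS as described in the cited thesis, the priority of a corner weight is its maximum possible improvement: an optimistic upper bound on the scalarized value at w minus the best scalarized value currently in the CCS. The sketch below assumes this definition and takes the optimistic bound as a given number. `ols_priority_sketch` is a hypothetical standalone function, not the library method.

```python
def scalarize(value, w):
    """Dot product of a value vector and a weight vector."""
    return sum(v_i * w_i for v_i, w_i in zip(value, w))


def ols_priority_sketch(w, ccs, optimistic_bound):
    """Maximum possible improvement at w (assumed OLS priority)."""
    current_best = max(scalarize(v, w) for v in ccs)
    return optimistic_bound - current_best


ccs = [(4.0, 0.0), (0.0, 4.0)]
# With only the extrema evaluated, the optimistic bound at (0.5, 0.5) is 4.0.
print(ols_priority_sketch((0.5, 0.5), ccs, optimistic_bound=4.0))  # 2.0
```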

remove_obsolete_values(value: ndarray) → List[int]¶

Removes the value vectors that are no longer optimal for any weight vector after adding the new value vector.

Parameters: value (np.ndarray) – New value vector
Returns: The indices of the removed values.

remove_obsolete_weights(new_value: ndarray) → List[ndarray]¶

Remove from the queue the weight vectors for which the new value vector is better than previous values.

Parameters: new_value – New value vector
Returns: List of weight vectors removed from the queue.
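
The rule described above can be sketched as a filter over the queue: a queued weight becomes obsolete when the new value vector scores strictly higher at that weight than every vector already in the CCS. This is an illustrative sketch only. The queue layout (a list of `(priority, weight)` pairs), the strict comparison, and the name `split_obsolete_weights` are assumptions, not the library's internals.

```python
def scalarize(value, w):
    """Dot product of a value vector and a weight vector."""
    return sum(v_i * w_i for v_i, w_i in zip(value, w))


def split_obsolete_weights(queue, ccs, new_value):
    """Return (kept queue entries, removed weights) for a new value vector."""
    kept, removed = [], []
    for priority, w in queue:
        current_best = max(scalarize(v, w) for v in ccs)
        if scalarize(new_value, w) > current_best:
            removed.append(w)  # the new vector wins here, so w is obsolete
        else:
            kept.append((priority, w))
    return kept, removed


queue = [(2.0, (0.5, 0.5)), (0.0, (1.0, 0.0))]
ccs = [(4.0, 0.0), (0.0, 4.0)]
kept, removed = split_obsolete_weights(queue, ccs, new_value=(3.0, 3.0))
print(removed)  # [(0.5, 0.5)]
```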