Linear Support

class morl_baselines.multi_policy.linear_support.linear_support.LinearSupport(num_objectives: int, epsilon: float = 0.0, verbose: bool = True)

Linear Support for computing corner weights when using linear utility functions.

Implements both of the following algorithms:

  • Optimistic Linear Support (OLS): see Section 3.3 of http://roijers.info/pub/thesis.pdf.

  • Generalized Policy Improvement Linear Support (GPI-LS): see https://arxiv.org/abs/2301.07784.

Initialize Linear Support.

Parameters:
  • num_objectives (int) – Number of objectives

  • epsilon (float, optional) – Minimum improvement per iteration. Defaults to 0.0.

  • verbose (bool, optional) – Whether to print verbose output. Defaults to True.

add_solution(value: ndarray, w: ndarray) List[int]

Add a new value vector that is optimal for weight w.

Parameters:
  • value (np.ndarray) – New value vector

  • w (np.ndarray) – Weight vector

Returns:

List of indices of value vectors removed from the CCS for being dominated.

compute_corner_weights() List[ndarray]

Returns the corner weights for the current set of values.

See Definition 19 of http://roijers.info/pub/thesis.pdf. Note: the definition of the corner weights in the thesis contains a typo; the >= sign should be <=.

Returns:

List of corner weights.
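
For intuition, the two-objective case can be worked out by hand. With w = (w0, 1 - w0), each value vector's scalarized value is a line in w0, and the interior corner weights are the points where the upper envelope of those lines changes from one value vector to another. The sketch below is only an illustration under that assumption: `corner_weights_2d` and `scalarize` are hypothetical helpers, plain tuples stand in for np.ndarray, and the general method the library uses for any number of objectives is not shown.

```python
from itertools import combinations


def scalarize(value, w):
    """Linear utility: dot product of a value vector and a weight vector."""
    return sum(v_i * w_i for v_i, w_i in zip(value, w))


def corner_weights_2d(values, tol=1e-9):
    """Interior corner weights of the upper envelope for two objectives."""
    corners = []
    for a, b in combinations(values, 2):
        # Solve a.w == b.w for w = (w0, 1 - w0):
        #   w0 * ((a0 - a1) - (b0 - b1)) = b1 - a1
        denom = (a[0] - a[1]) - (b[0] - b[1])
        if abs(denom) < tol:
            continue  # parallel lines never cross
        w0 = (b[1] - a[1]) / denom
        if not (tol < w0 < 1.0 - tol):
            continue  # crossing lies outside the weight simplex
        w = (w0, 1.0 - w0)
        # Keep the crossing only if it lies on the upper envelope.
        best = max(scalarize(v, w) for v in values)
        if abs(scalarize(a, w) - best) <= tol:
            corners.append(w)
    return sorted(corners)
```

With the value vectors (1, 0), (0, 1) and (0.8, 0.8), the crossing of the first two at w0 = 0.5 lies below the third vector's line, so only the crossings near w0 = 0.2 and w0 = 0.8 are reported.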

ended() bool

Returns True if there are no more corner weights to test.

Warning: This method must be called AFTER calling next_weight(). Example:

    w = ols.next_weight()
    if ols.ended():
        print("OLS ended.")
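
The call order described in the warning extends naturally to a full outer loop. The sketch below is one possible way to drive such a loop, not the library's own driver: `run_linear_support`, the `solve` callback and the `max_iterations` cap are hypothetical, and `FakeSupport` is a stand-in that only mimics the three method names documented on this page so that the example runs without the library installed.

```python
def run_linear_support(ls, solve, max_iterations=100):
    """Query weights, solve for each, and feed the results back.

    `ls` is any object exposing next_weight(), ended() and add_solution();
    `solve` maps a weight vector to a value vector (hypothetical callback).
    """
    values = []
    for _ in range(max_iterations):
        w = ls.next_weight()  # must be called before ended()
        if ls.ended():
            break
        value = solve(w)
        ls.add_solution(value, w)
        values.append(value)
    return values


class FakeSupport:
    """Hypothetical stand-in that hands out a fixed list of weights."""

    def __init__(self, weights):
        self._pending = list(weights)
        self._done = False
        self.solutions = []

    def next_weight(self):
        if not self._pending:
            self._done = True  # nothing left to test
            return None
        return self._pending.pop(0)

    def ended(self):
        return self._done

    def add_solution(self, value, w):
        self.solutions.append((value, w))
        return []
```

In real use, `ls` would be a LinearSupport instance and `solve` would train or evaluate a policy for the given weight vector.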

get_corner_weights(top_k: int | None = None) List[ndarray]

Returns the corner weights of the current CCS.

Parameters:

top_k – If not None, returns the top_k corner weights.

Returns:

List[np.ndarray] – List of corner weights.

get_weight_support() List[ndarray]

Returns the weight support of the CCS.

Returns:

List[np.ndarray] – List of weight vectors of the CCS

gpi_ls_priority(w: ndarray, gpi_expanded_set: List[ndarray]) float

Get the priority of a weight vector for GPI-LS.

Parameters:
  • w (np.ndarray) – Weight vector

  • gpi_expanded_set (List[np.ndarray]) – Value vectors of the GPI-expanded set

Returns:

Priority of the weight vector.
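
One plausible reading of this priority, in line with the GPI-LS idea of favoring weights where GPI promises the largest gain, is the gap between the best scalarized value in the GPI-expanded set and the best scalarized value already in the CCS. The sketch below assumes that reading; `gpi_ls_priority_sketch` is a hypothetical free function that takes the CCS explicitly, and tuples stand in for np.ndarray.

```python
def scalarize(value, w):
    """Linear utility: dot product of a value vector and a weight vector."""
    return sum(v_i * w_i for v_i, w_i in zip(value, w))


def gpi_ls_priority_sketch(w, ccs, gpi_expanded_set):
    """Assumed priority: improvement GPI offers over the current CCS at w."""
    best_gpi = max(scalarize(v, w) for v in gpi_expanded_set)
    best_ccs = max(scalarize(v, w) for v in ccs)
    return best_gpi - best_ccs
```

Under this assumption, a weight where the GPI-expanded set does no better than the CCS gets a priority of zero or less, so it would be tested last.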

is_dominated(value: ndarray) bool

Checks if the value is dominated by any of the values in the CCS.

Parameters:

value – Value vector

Returns:

True if the value is dominated by any of the values in the CCS, False otherwise.

max_scalarized_value(w: ndarray) float | None

Returns the maximum scalarized value for weight vector w.

Parameters:

w – Weight vector

Returns:

Maximum scalarized value for weight vector w, or None if the CCS is empty.
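
With a linear utility, the scalarized value of a value vector v under w is the dot product v · w, so the quantity described here is the maximum of that product over the stored value vectors. A minimal sketch, assuming the None in the return type corresponds to an empty CCS; `max_scalarized_value_sketch` is a hypothetical free function and tuples stand in for np.ndarray.

```python
def max_scalarized_value_sketch(ccs, w):
    """Largest dot product between w and any value vector in ccs."""
    if not ccs:
        return None  # assumption: nothing stored yet, so no value to report
    return max(sum(v_i * w_i for v_i, w_i in zip(v, w)) for v in ccs)
```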

max_value_lp(w_new: ndarray) float

Returns an upper bound on the maximum value of the scalarized objective.

Parameters:

w_new – New weight vector

Returns:

Upper bound on the maximum value of the scalarized objective.

next_weight(algo: str = 'ols', gpi_agent: MOPolicy | None = None, env: Env | None = None, rep_eval: int = 1) ndarray

Returns the next weight vector to test, i.e. the one with the highest priority.

Parameters:
  • algo (str) – Algorithm to use. Either 'ols' or 'gpi-ls'.

  • gpi_agent (Optional[MOPolicy]) – Agent to use for GPI-LS.

  • env (Optional[Env]) – Environment to use for GPI-LS.

  • rep_eval (int) – Number of times to evaluate the agent in GPI-LS.

Returns:

np.ndarray – Next weight vector

ols_priority(w: ndarray) float

Get the priority of a weight vector for OLS.

Parameters:

w – Weight vector

Returns:

Priority of the weight vector.

remove_obsolete_values(value: ndarray) List[int]

Removes the value vectors that are no longer optimal for any weight vector after adding the new value vector.

Parameters:

value (np.ndarray) – New value vector

Returns:

The indices of the removed values.
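
The pruning rule can be illustrated on a finite set of probe weights: a stored value vector survives only if, at some probe weight, it is the best stored vector and still beats the new one. This is a sketch under that assumption, not the library's implementation; `remove_obsolete_values_sketch` and `probe_weights` are hypothetical names, the function returns a new list instead of mutating state, and tuples stand in for np.ndarray.

```python
def scalarize(value, w):
    """Linear utility: dot product of a value vector and a weight vector."""
    return sum(v_i * w_i for v_i, w_i in zip(value, w))


def remove_obsolete_values_sketch(ccs, new_value, probe_weights):
    """Return (kept_values, removed_indices) after new_value arrives."""
    removed = []
    for i, v in enumerate(ccs):
        wins_somewhere = any(
            # v is the best stored vector at w ...
            scalarize(v, w) >= max(scalarize(u, w) for u in ccs)
            # ... and the new vector does not match or beat it there.
            and scalarize(v, w) > scalarize(new_value, w)
            for w in probe_weights
        )
        if not wins_somewhere:
            removed.append(i)
    kept = [v for i, v in enumerate(ccs) if i not in removed]
    return kept, removed
```

For example, with stored vectors (1, 0), (0, 1) and (0.4, 0.4), adding (0.8, 0.8) and probing at (1, 0), (0, 1) and (0.5, 0.5) removes only index 2, because (0.4, 0.4) is not the best vector at any probe weight.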

remove_obsolete_weights(new_value: ndarray) List[ndarray]

Removes from the queue the weight vectors for which the new value vector achieves a higher scalarized value than all previous value vectors.

Parameters:

new_value – New value vector

Returns:

List of weight vectors removed from the queue.
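
The filtering rule described above can be sketched as a single pass over the queue. The representation of the queue as a list of (priority, weight) pairs is an assumption made for illustration, as are the name `remove_obsolete_weights_sketch` and the choice to return the remaining queue alongside the removed weights; tuples stand in for np.ndarray.

```python
def scalarize(value, w):
    """Linear utility: dot product of a value vector and a weight vector."""
    return sum(v_i * w_i for v_i, w_i in zip(value, w))


def remove_obsolete_weights_sketch(queue, ccs, new_value):
    """Return (remaining_queue, removed_weights) after new_value arrives."""
    if not ccs:
        return list(queue), []  # nothing to compare against yet
    remaining, removed = [], []
    for priority, w in queue:
        best_previous = max(scalarize(v, w) for v in ccs)
        if scalarize(new_value, w) > best_previous:
            removed.append(w)  # new_value already improves on this weight
        else:
            remaining.append((priority, w))
    return remaining, removed
```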