Neural Networks helpers

Utilities for Neural Networks.

class morl_baselines.common.networks.NatureCNN(observation_shape: ndarray, features_dim: int = 512)

CNN from DQN nature paper: Mnih, Volodymyr, et al. “Human-level control through deep reinforcement learning.” Nature 518.7540 (2015): 529-533.

Parameters:
  • observation_shape – Shape of the observation.

  • features_dim – Number of features extracted. This corresponds to the number of units in the last layer.

forward(observations: Tensor) Tensor

Predicts the features from the observations.

Parameters:

observations – The current observations.
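
A minimal usage sketch, assuming a channel-first Atari-style observation shape of (4, 84, 84) and a batch of 8 stacked-frame observations (both illustrative; the signature above annotates observation_shape as an ndarray, a plain shape tuple is used here for readability):

    import torch
    from morl_baselines.common.networks import NatureCNN

    # Illustrative channel-first observation shape (frames, height, width).
    cnn = NatureCNN(observation_shape=(4, 84, 84), features_dim=512)

    obs = torch.rand(8, 4, 84, 84)  # hypothetical batch of 8 observations
    features = cnn(obs)             # forward() extracts the features
    print(features.shape)           # expected: torch.Size([8, 512])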

morl_baselines.common.networks.get_grad_norm(params: Iterable[Parameter]) Tensor

Computes the gradient norm of the given parameters, in the same way as it is computed inside torch.nn.utils.clip_grad_norm_().

Parameters:

params – The parameters to compute the grad norm for.

Returns:

The grad norm.
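
A minimal sketch of logging the gradient norm after a backward pass (the model and loss below are hypothetical and only serve to produce some gradients):

    import torch
    import torch.nn as nn
    from morl_baselines.common.networks import get_grad_norm

    # Hypothetical model and loss, used only to populate .grad on the parameters.
    model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
    loss = model(torch.rand(32, 4)).pow(2).mean()
    loss.backward()

    grad_norm = get_grad_norm(model.parameters())
    print(float(grad_norm))  # total norm over all parameter gradients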

morl_baselines.common.networks.huber(x, min_priority=0.01)

Huber loss function.

Parameters:
  • x – The input tensor.

  • min_priority – Threshold controlling where the loss switches from quadratic to linear growth.

Returns:

The Huber loss.
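
For intuition, an illustrative re-implementation of this kind of loss (quadratic below min_priority, linear above); this is a sketch only, and the library's exact reduction and sign handling may differ:

    import torch

    def huber_sketch(x: torch.Tensor, min_priority: float = 0.01) -> torch.Tensor:
        # Quadratic for small errors, linear (slope min_priority) for large ones.
        abs_x = x.abs()
        return torch.where(abs_x < min_priority, 0.5 * x.pow(2), min_priority * abs_x).mean()

    td_errors = torch.tensor([0.005, 0.02, -0.5])
    print(huber_sketch(td_errors))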

morl_baselines.common.networks.layer_init(layer, method='orthogonal', weight_gain: float = 1, bias_const: float = 0) None

Initialize a layer with the given method.

Parameters:
  • layer – The layer to initialize.

  • method – The initialization method to use.

  • weight_gain – The gain for the weights.

  • bias_const – The constant for the bias.
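
A minimal usage sketch; the layer is hypothetical, 'orthogonal' is the documented default method, and a gain of sqrt(2) is a common (but here merely illustrative) choice for layers followed by ReLU:

    import torch.nn as nn
    from morl_baselines.common.networks import layer_init

    # Hypothetical layer used only for illustration.
    layer = nn.Linear(64, 64)
    layer_init(layer, method="orthogonal", weight_gain=2 ** 0.5, bias_const=0.0)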

morl_baselines.common.networks.mlp(input_dim: int, output_dim: int, net_arch: List[int], activation_fn: Type[nn.Module] = nn.ReLU, drop_rate: float = 0.0, layer_norm: bool = False) Sequential

Create a multi-layer perceptron (MLP): a stack of fully-connected layers, each followed by an activation function.

Parameters:
  • input_dim – Dimension of the input vector

  • output_dim – Dimension of the output vector

  • net_arch – Architecture of the neural net. It represents the number of units per layer. The length of this list is the number of layers.

  • activation_fn – The activation function to use after each layer.

  • drop_rate – Dropout rate

  • layer_norm – Whether to use layer normalization
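
A minimal usage sketch (the dimensions, architecture, and activation are illustrative; whether the output layer is also followed by the activation is determined by the implementation):

    import torch
    import torch.nn as nn
    from morl_baselines.common.networks import mlp

    # 10-dimensional input, two hidden layers of 256 units, 3 outputs.
    net = mlp(input_dim=10, output_dim=3, net_arch=[256, 256], activation_fn=nn.Tanh)

    out = net(torch.rand(32, 10))
    print(out.shape)  # expected: torch.Size([32, 3])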

morl_baselines.common.networks.polyak_update(params: Iterable[Parameter], target_params: Iterable[Parameter], tau: float) None

Polyak averaging for target network parameters.

Parameters:
  • params – The parameters to update.

  • target_params – The target parameters.

  • tau – The Polyak averaging coefficient (usually small).
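
A minimal usage sketch; the networks are hypothetical, and it assumes the conventional soft update target ← (1 − tau) · target + tau · online:

    import copy
    import torch.nn as nn
    from morl_baselines.common.networks import polyak_update

    # Hypothetical online network and its target copy.
    q_net = nn.Linear(8, 4)
    target_q_net = copy.deepcopy(q_net)

    # After each gradient step, move the target slightly toward the online parameters.
    polyak_update(q_net.parameters(), target_q_net.parameters(), tau=0.005)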