Neural Networks helpers

Utilities for Neural Networks.

class morl_baselines.common.networks.NatureCNN(observation_shape: ndarray, features_dim: int = 512)

CNN from DQN nature paper: Mnih, Volodymyr, et al. “Human-level control through deep reinforcement learning.” Nature 518.7540 (2015): 529-533.

Parameters:
  • observation_shape – Shape of the observation.

  • features_dim – Number of features extracted. This corresponds to the number of units in the last layer.

forward(observations: Tensor) Tensor

Predicts the features from the observations.

Parameters:

observations – The current observations.
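
A minimal usage sketch, assuming a channel-first Atari-style observation shape of (4, 84, 84) and a batch of 8 stacked-frame observations (both illustrative; the signature above annotates observation_shape as an ndarray, a plain shape tuple is used here for readability):

    import torch
    from morl_baselines.common.networks import NatureCNN

    # Illustrative channel-first observation shape (frames, height, width).
    cnn = NatureCNN(observation_shape=(4, 84, 84), features_dim=512)

    obs = torch.rand(8, 4, 84, 84)  # hypothetical batch of 8 observations
    features = cnn(obs)             # forward() extracts the features
    print(features.shape)           # expected: torch.Size([8, 512])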

morl_baselines.common.networks.get_grad_norm(params: Iterable[Parameter]) Tensor

Computes the gradient norm of the given parameters, in the same way as it is computed inside torch.nn.utils.clip_grad_norm_().

Parameters:

params – The parameters to compute the grad norm for.

Returns:

The grad norm.
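
A minimal sketch of logging the gradient norm after a backward pass (the model and loss below are hypothetical and only serve to produce some gradients):

    import torch
    import torch.nn as nn
    from morl_baselines.common.networks import get_grad_norm

    # Hypothetical model and loss, used only to populate .grad on the parameters.
    model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
    loss = model(torch.rand(32, 4)).pow(2).mean()
    loss.backward()

    grad_norm = get_grad_norm(model.parameters())
    print(float(grad_norm))  # total norm over all parameter gradients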

morl_baselines.common.networks.huber(x, min_priority=0.01)

Huber loss function.

Parameters:
  • x – The input tensor.

  • min_priority – Threshold controlling where the loss switches from quadratic to linear growth.

Returns:

The Huber loss.
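
For intuition, an illustrative re-implementation of this kind of loss (quadratic below min_priority, linear above); this is a sketch only, and the library's exact reduction and sign handling may differ:

    import torch

    def huber_sketch(x: torch.Tensor, min_priority: float = 0.01) -> torch.Tensor:
        # Quadratic for small errors, linear (slope min_priority) for large ones.
        abs_x = x.abs()
        return torch.where(abs_x < min_priority, 0.5 * x.pow(2), min_priority * abs_x).mean()

    td_errors = torch.tensor([0.005, 0.02, -0.5])
    print(huber_sketch(td_errors))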

morl_baselines.common.networks.layer_init(layer, method='orthogonal', weight_gain: float = 1, bias_const: float = 0) None

Initialize a layer with the given method.

Parameters:
  • layer – The layer to initialize.

  • method – The initialization method to use.

  • weight_gain – The gain for the weights.

  • bias_const – The constant for the bias.
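
A minimal usage sketch; the layer is hypothetical, 'orthogonal' is the documented default method, and a gain of sqrt(2) is a common (but here merely illustrative) choice for layers followed by ReLU:

    import torch.nn as nn
    from morl_baselines.common.networks import layer_init

    # Hypothetical layer used only for illustration.
    layer = nn.Linear(64, 64)
    layer_init(layer, method="orthogonal", weight_gain=2 ** 0.5, bias_const=0.0)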

morl_baselines.common.networks.mlp(input_dim: int, output_dim: int, net_arch: List[int], activation_fn: Type[nn.Module] = nn.ReLU, drop_rate: float = 0.0, layer_norm: bool = False) Sequential

Create a multi-layer perceptron (MLP): a stack of fully-connected layers, each followed by an activation function.

Parameters:
  • input_dim – Dimension of the input vector

  • output_dim – Dimension of the output vector

  • net_arch – Architecture of the neural net. It represents the number of units per layer. The length of this list is the number of layers.

  • activation_fn – The activation function to use after each layer.

  • drop_rate – Dropout rate

  • layer_norm – Whether to use layer normalization
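
A minimal usage sketch (the dimensions, architecture, and activation are illustrative; whether the output layer is also followed by the activation is determined by the implementation):

    import torch
    import torch.nn as nn
    from morl_baselines.common.networks import mlp

    # 10-dimensional input, two hidden layers of 256 units, 3 outputs.
    net = mlp(input_dim=10, output_dim=3, net_arch=[256, 256], activation_fn=nn.Tanh)

    out = net(torch.rand(32, 10))
    print(out.shape)  # expected: torch.Size([32, 3])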

morl_baselines.common.networks.polyak_update(params: Iterable[Parameter], target_params: Iterable[Parameter], tau: float) None

Polyak averaging for target network parameters.

Parameters:
  • params – The parameters to update.

  • target_params – The target parameters.

  • tau – The Polyak averaging coefficient (usually small).
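
A minimal usage sketch; the networks are hypothetical, and it assumes the conventional soft update target ← (1 − tau) · target + tau · online:

    import copy
    import torch.nn as nn
    from morl_baselines.common.networks import polyak_update

    # Hypothetical online network and its target copy.
    q_net = nn.Linear(8, 4)
    target_q_net = copy.deepcopy(q_net)

    # After each gradient step, move the target slightly toward the online parameters.
    polyak_update(q_net.parameters(), target_q_net.parameters(), tau=0.005)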