Neural Networks helpers
Utilities for Neural Networks.
- class morl_baselines.common.networks.NatureCNN(observation_shape: ndarray, features_dim: int = 512)
CNN from the DQN Nature paper: Mnih, Volodymyr, et al. “Human-level control through deep reinforcement learning.” Nature 518.7540 (2015): 529-533.
- Parameters:
observation_shape – Shape of the observation.
features_dim – Number of features extracted. This corresponds to the number of units in the last layer.
- forward(observations: Tensor) → Tensor
Predicts the features from the observations.
- Parameters:
observations – Current observations.
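A minimal usage sketch (the observation shape, channel ordering, and batch size below are illustrative assumptions, and a plain tuple is assumed to be accepted where the signature says ndarray):

```python
import torch

from morl_baselines.common.networks import NatureCNN

# Assumption: channel-first Atari-style stacked frames (4 frames of 84x84 pixels).
obs_shape = (4, 84, 84)
cnn = NatureCNN(observation_shape=obs_shape, features_dim=512)

obs = torch.rand((8, *obs_shape))  # batch of 8 dummy observations
features = cnn(obs)                # expected shape: (8, 512)
```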
- morl_baselines.common.networks.get_grad_norm(params: Iterable[Parameter]) → Tensor
Computes the gradient norm in the same way as torch.nn.utils.clip_grad_norm_().
- Parameters:
params – The parameters to compute the grad norm for.
- Returns:
The grad norm.
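A hedged sketch of how such a global gradient norm is typically computed, mirroring torch.nn.utils.clip_grad_norm_(); the function name below is a hypothetical stand-in, not the library implementation:

```python
import torch

def grad_norm_sketch(params):
    """Global L2 norm over all parameter gradients (norm of the per-parameter norms)."""
    grads = [p.grad for p in params if p.grad is not None]
    if not grads:
        return torch.tensor(0.0)
    return torch.norm(torch.stack([torch.norm(g.detach(), 2) for g in grads]), 2)

# Typically called after loss.backward(), e.g. to log the gradient norm:
# grad_norm = grad_norm_sketch(model.parameters())
```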
- morl_baselines.common.networks.huber(x, min_priority=0.01)
Huber loss function.
- Parameters:
x – The input tensor.
min_priority – Threshold at which the loss switches from quadratic to linear (acts as the Huber delta).
- Returns:
The Huber loss.
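A hedged sketch of a Huber-style loss that is quadratic below min_priority and linear above it; this is the standard formulation, and the library's exact scaling and reduction may differ:

```python
import torch

def huber_sketch(x: torch.Tensor, min_priority: float = 0.01) -> torch.Tensor:
    """Quadratic for |x| < min_priority, linear beyond it, averaged over the batch."""
    abs_x = x.abs()
    quadratic = 0.5 * x.pow(2)
    linear = min_priority * (abs_x - 0.5 * min_priority)
    return torch.where(abs_x < min_priority, quadratic, linear).mean()
```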
- morl_baselines.common.networks.layer_init(layer, method='orthogonal', weight_gain: float = 1, bias_const: float = 0) → None
Initialize a layer with the given method.
- Parameters:
layer – The layer to initialize.
method – The initialization method to use.
weight_gain – The gain for the weights.
bias_const – The constant for the bias.
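A hedged sketch of what the default orthogonal initialization amounts to for a linear layer; the function below is a hypothetical stand-in, and the real helper dispatches on the method argument:

```python
import torch.nn as nn

def layer_init_sketch(layer: nn.Module, weight_gain: float = 1.0, bias_const: float = 0.0) -> None:
    """Orthogonal weight initialization with a constant bias for linear layers."""
    if isinstance(layer, nn.Linear):
        nn.init.orthogonal_(layer.weight, gain=weight_gain)
        nn.init.constant_(layer.bias, bias_const)

# Usage: apply to every submodule of a network, e.g. net.apply(layer_init_sketch)
```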
- morl_baselines.common.networks.mlp(input_dim: int, output_dim: int, net_arch: List[int], activation_fn: Type[Module] = ReLU, drop_rate: float = 0.0, layer_norm: bool = False) → Sequential
Create a multi-layer perceptron (MLP), i.e. a stack of fully-connected layers, each followed by an activation function.
- Parameters:
input_dim – Dimension of the input vector.
output_dim – Dimension of the output vector.
net_arch – Architecture of the neural net. It represents the number of units per layer. The length of this list is the number of layers.
activation_fn – The activation function to use after each layer.
drop_rate – Dropout rate.
layer_norm – Whether to use layer normalization.
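A minimal usage sketch (the dimensions and hidden sizes are illustrative; whether an activation follows the output layer is left to the implementation):

```python
import torch
import torch.nn as nn

from morl_baselines.common.networks import mlp

# Two hidden layers of 256 units, mapping a 17-dimensional input to 6 outputs.
net = mlp(input_dim=17, output_dim=6, net_arch=[256, 256], activation_fn=nn.Tanh)

x = torch.rand((32, 17))
y = net(x)  # expected shape: (32, 6)
```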
- morl_baselines.common.networks.polyak_update(params: Iterable[Parameter], target_params: Iterable[Parameter], tau: float) → None
Polyak averaging for target network parameters.
- Parameters:
params – The parameters to update.
target_params – The target parameters.
tau – The Polyak averaging coefficient (usually small).
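A hedged sketch of the soft-update rule target ← (1 − tau) · target + tau · param applied in place; the function name below is a hypothetical stand-in, and the library's in-place formulation may differ:

```python
import torch

@torch.no_grad()
def polyak_update_sketch(params, target_params, tau: float) -> None:
    """Move each target parameter a fraction tau towards the online parameter."""
    for param, target in zip(params, target_params):
        target.mul_(1.0 - tau)
        target.add_(tau * param)

# Typical usage after each gradient step, with a small tau:
# polyak_update_sketch(net.parameters(), target_net.parameters(), tau=0.005)
```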