Sumo Environment

class sumo_rl.environment.env.SumoEnvironment(net_file: str, route_file: str, out_csv_name: str | None = None, use_gui: bool = False, virtual_display: typing.Tuple[int, int] = (3200, 1800), begin_time: int = 0, num_seconds: int = 20000, max_depart_delay: int = -1, waiting_time_memory: int = 1000, time_to_teleport: int = -1, delta_time: int = 5, yellow_time: int = 2, min_green: int = 5, max_green: int = 50, single_agent: bool = False, reward_fn: str | typing.Callable | dict = 'diff-waiting-time', observation_class: sumo_rl.environment.observations.ObservationFunction = <class 'sumo_rl.environment.observations.DefaultObservationFunction'>, add_system_info: bool = True, add_per_agent_info: bool = True, sumo_seed: str | int = 'random', fixed_ts: bool = False, sumo_warnings: bool = True, additional_sumo_cmd: str | None = None, render_mode: str | None = None)

SUMO Environment for Traffic Signal Control.

Class that implements a gym.Env interface for traffic signal control using the SUMO simulator. See https://sumo.dlr.de/docs/ for details on SUMO. See https://gymnasium.farama.org/ for details on gymnasium.

Parameters:
  • net_file (str) – SUMO .net.xml file

  • route_file (str) – SUMO .rou.xml file

  • out_csv_name (Optional[str]) – name of the .csv output with simulation results. If None, no output is generated

  • use_gui (bool) – Whether to run SUMO simulation with the SUMO GUI

  • virtual_display (Optional[Tuple[int,int]]) – Resolution of the virtual display for rendering

  • begin_time (int) – The simulation time (in seconds) at which the simulation starts. Default: 0

  • num_seconds (int) – Duration of the simulation, in simulated seconds on SUMO. Default: 20000

  • max_depart_delay (int) – Vehicles are discarded if they could not be inserted after max_depart_delay seconds. Default: -1 (vehicles are never discarded)

  • waiting_time_memory (int) – Number of seconds to remember the waiting time of a vehicle (see https://sumo.dlr.de/pydoc/traci._vehicle.html#VehicleDomain-getAccumulatedWaitingTime). Default: 1000

  • time_to_teleport (int) – Time in seconds to teleport a vehicle to the end of the edge if it is stuck. Default: -1 (no teleport)

  • delta_time (int) – Simulation seconds between actions. Default: 5 seconds

  • yellow_time (int) – Duration of the yellow phase. Default: 2 seconds

  • min_green (int) – Minimum green time in a phase. Default: 5 seconds

  • max_green (int) – Maximum green time in a phase. Default: 50 seconds. Warning: This parameter is currently ignored!

  • single_agent (bool) – If true, it behaves like a regular gym.Env. Else, it behaves like a MultiagentEnv (returns dict of observations, rewards, dones, infos).

  • reward_fn (str/function/dict) – String with the name of the reward function used by the agents, a reward function, or dictionary with reward functions assigned to individual traffic lights by their keys.

  • observation_class (ObservationFunction) – Subclass of ObservationFunction that defines both the observation function and the observation space.

  • add_system_info (bool) – If true, it computes system metrics (total queue, total waiting time, average speed) in the info dictionary.

  • add_per_agent_info (bool) – If true, it computes per-agent (per-traffic signal) metrics (average accumulated waiting time, average queue) in the info dictionary.

  • sumo_seed (int/string) – Random seed for sumo. If ‘random’ it uses a randomly chosen seed.

  • fixed_ts (bool) – If true, it will follow the phase configuration in the route_file and ignore the actions given in the step() method.

  • sumo_warnings (bool) – If true, it will print SUMO warnings.

  • additional_sumo_cmd (str) – Additional SUMO command line arguments.

  • render_mode (str) – Mode of rendering. Can be ‘human’ or ‘rgb_array’. Default: None
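
A minimal sketch of how these parameters might be passed together, including a callable reward_fn. The file paths are placeholders and average_speed_reward is a hypothetical name. It assumes the object handed to a callable reward_fn exposes get_average_speed(), which this page does not document. The import is deferred inside make_env so the configuration can be inspected without SUMO installed.

```python
def average_speed_reward(traffic_signal):
    """Hypothetical custom reward: favor higher average speed.

    Assumes the argument passed to a callable reward_fn exposes
    get_average_speed(); that method is not documented on this page.
    """
    return traffic_signal.get_average_speed()


# Keyword arguments named exactly as in the parameter list above.
env_kwargs = dict(
    net_file="path/to/network.net.xml",   # placeholder path
    route_file="path/to/routes.rou.xml",  # placeholder path
    out_csv_name=None,                    # no .csv output
    use_gui=False,
    num_seconds=3600,
    delta_time=5,
    yellow_time=2,
    min_green=5,
    single_agent=True,
    reward_fn=average_speed_reward,
    sumo_seed=42,
)


def make_env(**kwargs):
    """Build the environment; requires SUMO and sumo_rl to be installed."""
    from sumo_rl.environment.env import SumoEnvironment  # deferred import
    return SumoEnvironment(**kwargs)


# With one action every delta_time simulated seconds, an episode of
# num_seconds offers roughly this many decision points.
decisions_per_episode = env_kwargs["num_seconds"] // env_kwargs["delta_time"]
print(decisions_per_episode)  # 720
```

Calling make_env(**env_kwargs) would then start the simulation, provided the placeholder paths are replaced with real SUMO files.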

property action_space

Return the action space of a traffic signal.

Only used in case of single-agent environment.

action_spaces(ts_id: str) → Discrete

Return the action space of a traffic signal.

close()

Close the environment and stop the SUMO simulation.

encode(state, ts_id)

Encode the state of the traffic signal into a hashable object.
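
To illustrate why a hashable encoding is useful, the sketch below turns a numeric observation into a tuple and uses it as a key in a tabular Q-function. The function encode_state, the bin count, and the assumption that observation values lie in [0, 1] are illustrative only; this is not the library's implementation of encode().

```python
from collections import defaultdict


def encode_state(state, n_bins=10):
    """Hypothetical encoder: bucket each value in [0, 1] into n_bins bins
    and return a tuple, which (unlike a list or array) can be a dict key."""
    return tuple(min(int(x * n_bins), n_bins - 1) for x in state)


n_actions = 4  # illustrative number of actions
q_table = defaultdict(lambda: [0.0] * n_actions)

key = encode_state([0.0, 0.26, 1.0])
q_table[key][2] += 1.0   # update the value of action 2 in this state

print(key)               # (0, 2, 9)
print(q_table[key])      # [0.0, 0.0, 1.0, 0.0]
```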

property observation_space

Return the observation space of a traffic signal.

Only used in case of single-agent environment.

observation_spaces(ts_id: str)

Return the observation space of a traffic signal.

render()

Render the environment.

If render_mode is “human”, the environment will be rendered in a GUI window using pyvirtualdisplay.

reset(seed: int | None = None, **kwargs)

Reset the environment.

save_csv(out_csv_name, episode)

Save metrics of the simulation to a .csv file.

Parameters:
  • out_csv_name (str) – Path to the output .csv file. E.g.: “results/my_results”

  • episode (int) – Episode number to be appended to the output file name.

property sim_step: float

Return current simulation second on SUMO.

step(action: dict | int)

Apply the action(s) and then step the simulation for delta_time seconds.

Parameters:
  • action (Union[dict, int]) – action(s) to be applied to the environment. If single_agent is True, action is an int, otherwise it expects a dict with keys corresponding to traffic signal ids.
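
The two action formats can be sketched with a small helper that samples a random action of the right shape. It assumes the environment exposes single_agent and ts_ids attributes, which are not documented on this page, and it uses stand-in classes (_FakeEnv, _FakeDiscrete) so the snippet runs without SUMO.

```python
import random


def sample_action(env):
    """Return an int in single-agent mode, else a dict keyed by signal id.

    Assumes env.single_agent and env.ts_ids exist; they are not
    documented on this page.
    """
    if env.single_agent:
        return env.action_space.sample()
    return {ts_id: env.action_spaces(ts_id).sample() for ts_id in env.ts_ids}


class _FakeDiscrete:
    """Stand-in for a Discrete space with n actions."""

    def __init__(self, n):
        self.n = n

    def sample(self):
        return random.randrange(self.n)


class _FakeEnv:
    """Stand-in environment exposing only what sample_action needs."""

    def __init__(self, single_agent, ts_ids=("A", "B")):
        self.single_agent = single_agent
        self.ts_ids = list(ts_ids)
        self.action_space = _FakeDiscrete(4)

    def action_spaces(self, ts_id):
        return _FakeDiscrete(4)


single = sample_action(_FakeEnv(single_agent=True))
multi = sample_action(_FakeEnv(single_agent=False))
print(type(single).__name__)  # int
print(sorted(multi))          # ['A', 'B']
```

With a real environment, the value returned by sample_action(env) would be passed directly to env.step(...).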