Make your own environment

API

class highway_env.envs.common.abstract.AbstractEnv(config: dict = None)

A generic environment for various tasks involving a vehicle driving on a road.

The environment contains a road populated with vehicles, and a controlled ego-vehicle that can change lane and speed. The action space is fixed, but the observation space and reward function must be defined in the environment implementations.

metadata = {'render.modes': ['human', 'rgb_array']}

PERCEPTION_DISTANCE = 180.0

The maximum distance of any vehicle present in the observation [m]

__init__(config: dict = None) → None

Initialize self. See help(type(self)) for accurate signature.

action_type: ActionType = None

observation_type: ObservationType = None

automatic_rendering_callback: Optional[Callable] = None

classmethod default_config() → dict

Default environment configuration.

Can be overridden in environment implementations, or by calling configure().

Returns: a configuration dict
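The pattern described above can be sketched as follows. This is a minimal illustration that uses a stand-in base class rather than importing highway_env, so that it runs on its own. The class names `StandInEnv` and `MyEnv` and the configuration keys are assumptions made for the example, and the real `configure()` may behave differently.

```python
class StandInEnv:
    """Stand-in for AbstractEnv (hypothetical, not the library class)."""

    def __init__(self, config: dict = None) -> None:
        # Start from the class defaults, then apply user overrides.
        self.config = self.default_config()
        if config:
            self.configure(config)

    @classmethod
    def default_config(cls) -> dict:
        # Hypothetical keys, chosen for illustration only.
        return {"simulation_frequency": 15, "policy_frequency": 1}

    def configure(self, config: dict) -> None:
        # Assumed behaviour: merge the overrides into the current config.
        self.config.update(config)


class MyEnv(StandInEnv):
    @classmethod
    def default_config(cls) -> dict:
        # Extend the parent defaults instead of replacing them.
        config = super().default_config()
        config.update({"lanes_count": 3})  # hypothetical key
        return config


env = MyEnv({"policy_frequency": 5})
```

Extending the parent's dict in `default_config()` keeps inherited keys available, while per-instance overrides go through the constructor or `configure()`.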

seed(seed: int = None) → List[int]

Sets the seed for this env’s random number generator(s).

Note:

Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.

Returns: the list of seeds used in this env’s random number generators (list<bigint>). The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.
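The contract described above (return the seed actually used, even when none was provided) can be sketched with the standard library alone. `SeededEnv` is a hypothetical stand-in, and the library's own generator and seed-drawing logic are not shown here.

```python
import random
from typing import List, Optional


class SeededEnv:
    """Stand-in class (hypothetical) showing the seed() contract."""

    def seed(self, seed: Optional[int] = None) -> List[int]:
        # With seed=None, draw a fresh seed so the returned value can
        # still be passed back to seed() to reproduce the run.
        if seed is None:
            seed = random.SystemRandom().randrange(2**31)
        self.rng = random.Random(seed)
        return [seed]


env = SeededEnv()
(main_seed,) = env.seed()          # no seed given: one is drawn and returned
first_draw = env.rng.random()
env.seed(main_seed)                # replay with the reported main seed
replayed_draw = env.rng.random()
```

Passing the returned main seed back to `seed()` reproduces the same random draws, which is what a reproducer relies on.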

configure(config: dict) → None

define_spaces() → None

_reward(action: Union[int, numpy.ndarray]) → float

Return the reward associated with performing a given action and ending up in the current state.

Parameters: action – the last action performed
Returns: the reward

_is_terminal() → bool

Check whether the current state is a terminal state.

Returns: whether the state is terminal
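As an illustration of what an environment implementation might supply for `_reward()` and `_is_terminal()`, here is a minimal sketch. The class `LaneKeepingEnv`, its attributes, the 30 m/s normalisation, and the 40-step horizon are all assumptions made for the example, and the class does not derive from the library.

```python
class LaneKeepingEnv:
    """Stand-in environment (hypothetical); not derived from the library."""

    def __init__(self) -> None:
        self.speed = 25.0       # [m/s], hypothetical state
        self.crashed = False
        self.steps = 0

    def _reward(self, action: int) -> float:
        # Reward speed, normalised to [0, 1]; a crash yields zero.
        if self.crashed:
            return 0.0
        return min(self.speed / 30.0, 1.0)

    def _is_terminal(self) -> bool:
        # End on a crash or after a fixed horizon of 40 decisions.
        return self.crashed or self.steps >= 40


env = LaneKeepingEnv()
reward_before_crash = env._reward(1)
env.crashed = True
reward_after_crash = env._reward(1)
```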

_cost(action: Union[int, numpy.ndarray]) → float

A constraint metric, for budgeted MDP.

If a constraint is defined, it must be used with an alternate reward that doesn’t contain it as a penalty.

Parameters: action – the last action performed
Returns: the constraint signal, the alternate (constraint-free) reward

reset() → numpy.ndarray

Reset the environment to its initial configuration.

Returns: the observation of the reset state

step(action: Union[int, numpy.ndarray]) → Tuple[numpy.ndarray, float, bool, dict]

Perform an action and step the environment dynamics.

The action is executed by the ego-vehicle, and all other vehicles on the road perform their default behaviour for several simulation timesteps until the next decision-making step.

Parameters: action – the action performed by the ego-vehicle
Returns: a tuple (observation, reward, terminal, info)

_simulate(action: Union[int, numpy.ndarray, None] = None) → None

Perform several steps of simulation with constant action.
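The relation between one decision step and several simulation timesteps can be sketched as below. This is not the library's code. It assumes the number of frames per decision is the ratio of a simulation frequency to a policy frequency (the parameter names are assumptions), and that the action is applied once and then stays in effect.

```python
from typing import Callable, Optional


def simulate(action: Optional[int],
             act: Callable[[int], None],
             advance: Callable[[float], None],
             simulation_frequency: int = 15,
             policy_frequency: int = 1) -> int:
    """Apply one decision, then advance the dynamics for several frames."""
    frames = simulation_frequency // policy_frequency
    dt = 1 / simulation_frequency
    for frame in range(frames):
        # The action is forwarded once; it stays in effect afterwards.
        if action is not None and frame == 0:
            act(action)
        advance(dt)
    return frames


# Record what happens during one decision step.
log = []
frames = simulate(3,
                  act=lambda a: log.append(("act", a)),
                  advance=lambda dt: log.append(("step", dt)))
```

With the assumed defaults, one decision spans 15 simulation frames, so the log holds a single action entry followed by 15 dynamics updates.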

render(mode: str = 'human') → Optional[numpy.ndarray]

Render the environment.

Create a viewer if none exists, and use it to render an image.

Parameters: mode – the rendering mode

close() → None

Close the environment.

Will close the environment viewer if it exists.

get_available_actions() → List[int]

Get the list of currently available actions.

Lane changes are not available on the boundary of the road, and speed changes are not available at maximal or minimal speed.

Returns: the list of available actions
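The availability rules above can be sketched as a standalone function. The action indexing in `ACTIONS`, the convention that lane 0 is the leftmost lane, and the discrete speed index are assumptions made for this example and may not match the library.

```python
from typing import List

# Assumed action indexing, for illustration only.
ACTIONS = {0: "LANE_LEFT", 1: "IDLE", 2: "LANE_RIGHT", 3: "FASTER", 4: "SLOWER"}


def available_actions(lane: int, lanes_count: int,
                      speed_index: int, speed_count: int) -> List[int]:
    """List the action indices that are valid in the given state."""
    actions = [1]  # IDLE is always possible
    if lane > 0:
        actions.append(0)  # a lane exists on the left
    if lane < lanes_count - 1:
        actions.append(2)  # a lane exists on the right
    if speed_index < speed_count - 1:
        actions.append(3)  # not yet at maximal speed
    if speed_index > 0:
        actions.append(4)  # not yet at minimal speed
    return sorted(actions)


leftmost_slowest = available_actions(lane=0, lanes_count=3, speed_index=0, speed_count=3)
middle = available_actions(lane=1, lanes_count=3, speed_index=1, speed_count=3)
```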

_automatic_rendering() → None

Automatically render the intermediate frames while an action is still ongoing.

This allows rendering the whole video, not only the single steps corresponding to agent decision-making.

If a callback has been set, use it to perform the rendering. This is useful for the environment wrappers such as video-recording monitor that need to access these intermediate renderings.
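The callback mechanism can be sketched as follows. `RenderingEnv` is a hypothetical stand-in, and the fallback to `render()` when no callback is set is an assumption about the behaviour, not a statement of the library's implementation.

```python
from typing import Callable, List, Optional


class RenderingEnv:
    """Stand-in class (hypothetical) for the callback mechanism."""

    def __init__(self) -> None:
        self.automatic_rendering_callback: Optional[Callable] = None
        self.rendered: List[str] = []

    def render(self) -> None:
        self.rendered.append("viewer")

    def _automatic_rendering(self) -> None:
        # A registered callback takes precedence over direct rendering.
        if self.automatic_rendering_callback is not None:
            self.automatic_rendering_callback()
        else:
            self.render()


env = RenderingEnv()
env._automatic_rendering()                      # rendered by the env itself
captured: List[str] = []
env.automatic_rendering_callback = lambda: captured.append("frame")
env._automatic_rendering()                      # handed to the callback
```

A wrapper such as a video recorder would register its own frame-capturing function as the callback, so intermediate frames reach it without the environment knowing about the wrapper.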

simplify() → highway_env.envs.common.abstract.AbstractEnv

Return a simplified copy of the environment where distant vehicles have been removed from the road.

This is meant to lower the computational load of the policy while preserving the set of optimal actions.

Returns: a simplified environment state
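A minimal sketch of the idea, assuming vehicles are reduced to longitudinal positions and that the cut-off equals the 180.0 m perception distance listed above (the link between the two is an assumption). `Road` and `SimpleEnv` are hypothetical stand-ins.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Road:
    vehicle_positions: List[float] = field(default_factory=list)


@dataclass
class SimpleEnv:
    ego_position: float
    road: Road

    def simplify(self, distance: float = 180.0) -> "SimpleEnv":
        # Work on a copy so the original environment is left untouched.
        state = copy.deepcopy(self)
        state.road.vehicle_positions = [
            x for x in state.road.vehicle_positions
            if abs(x - self.ego_position) <= distance
        ]
        return state


env = SimpleEnv(ego_position=0.0, road=Road([10.0, -50.0, 500.0]))
simplified = env.simplify()
```

Because the method returns a copy, a planner can search over the lighter state while the real environment keeps every vehicle.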

change_vehicles(vehicle_class_path: str) → highway_env.envs.common.abstract.AbstractEnv

Change the type of all vehicles on the road.

Parameters: vehicle_class_path – the path of the class of behavior for other vehicles. Example: “highway_env.vehicle.behavior.IDMVehicle”
Returns: a new environment with modified behavior model for other vehicles
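A dotted class path like the one in the example is typically resolved with `importlib`. The helper name `class_from_path` is hypothetical, and whether the library resolves paths this way is an assumption.

```python
import importlib


def class_from_path(path: str) -> type:
    """Resolve a dotted path such as 'package.module.ClassName'."""
    module_name, class_name = path.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Demonstrated on a standard-library class so the sketch runs anywhere.
ordered_dict_class = class_from_path("collections.OrderedDict")
```

Passing a string instead of a class keeps the call easy to serialize, for example when the same change must be applied to copies of the environment in other processes.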

set_preferred_lane(preferred_lane: int = None) → highway_env.envs.common.abstract.AbstractEnv

set_route_at_intersection(_to: str) → highway_env.envs.common.abstract.AbstractEnv

set_vehicle_field(args: Tuple[str, object]) → highway_env.envs.common.abstract.AbstractEnv

__annotations__ = {'action_type': <class 'highway_env.envs.common.action.ActionType'>, 'automatic_rendering_callback': typing.Union[typing.Callable, NoneType], 'observation_type': <class 'highway_env.envs.common.observation.ObservationType'>}

__module__ = 'highway_env.envs.common.abstract'

call_vehicle_method(args: Tuple[str, Tuple[object]]) → highway_env.envs.common.abstract.AbstractEnv

randomize_behaviour() → highway_env.envs.common.abstract.AbstractEnv

to_finite_mdp()

__deepcopy__(memo)

Perform a deep copy but without copying the environment viewer.
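A deep copy that skips one attribute can be sketched as below. `CopyableEnv` and its attributes are hypothetical, and setting the copy's viewer to `None` is an assumption about how the viewer is excluded.

```python
import copy


class CopyableEnv:
    """Stand-in class (hypothetical) with a non-copyable viewer."""

    def __init__(self) -> None:
        self.config = {"lanes_count": 3}   # hypothetical key
        self.viewer = object()             # stands in for a window handle

    def __deepcopy__(self, memo):
        cls = self.__class__
        result = cls.__new__(cls)
        memo[id(self)] = result            # guard against reference cycles
        for key, value in self.__dict__.items():
            if key == "viewer":
                setattr(result, key, None)  # the copy starts without a viewer
            else:
                setattr(result, key, copy.deepcopy(value, memo))
        return result


env = CopyableEnv()
clone = copy.deepcopy(env)
```

Skipping the viewer avoids duplicating window or graphics resources when the environment is copied, for instance for planning.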