Make your own environment

API

class highway_env.envs.common.abstract.AbstractEnv(config: dict = None)[source]

A generic environment for various tasks involving a vehicle driving on a road.

The environment contains a road populated with vehicles, and a controlled ego-vehicle that can change lane and speed. The action space is fixed, but the observation space and reward function must be defined in the environment implementations.
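As the section title suggests, a new task is typically created by subclassing AbstractEnv. The sketch below is not taken from the library itself: the class name MyEnv is hypothetical, and the choice of methods to override (default_config(), reset(), _reward() and _is_terminal()) is inferred from the members documented below.

from typing import Union

import numpy as np
from highway_env.envs.common.abstract import AbstractEnv


class MyEnv(AbstractEnv):
    """A hypothetical custom task built on top of AbstractEnv."""

    @classmethod
    def default_config(cls) -> dict:
        config = super().default_config()
        # Extend the base configuration with task-specific entries (the key below is illustrative).
        config.update({"duration": 40})
        return config

    def reset(self) -> np.ndarray:
        # Build the scene here: populate self.road and self.vehicle (the controlled ego-vehicle),
        # then let the parent class return the observation of the initial state.
        ...
        return super().reset()

    def _reward(self, action: Union[int, np.ndarray]) -> float:
        # Task-specific reward, see _reward() below.
        return 0.0

    def _is_terminal(self) -> bool:
        # Task-specific termination condition, see _is_terminal() below.
        return False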

metadata = {'render.modes': ['human', 'rgb_array']}
PERCEPTION_DISTANCE = 180.0

The maximum distance of any vehicle present in the observation [m]

__init__(config: dict = None) → None[source]

Initialize self. See help(type(self)) for accurate signature.

action_type: ActionType = None
observation_type: ObservationType = None
automatic_rendering_callback: Optional[Callable] = None
classmethod default_config() → dict[source]

Default environment configuration.

Can be overloaded in environment implementations, or by calling configure().

Returns

a configuration dict
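A possible usage sketch: assuming that importing highway_env registers the bundled environments (such as "highway-v0") with gym, the defaults can be inspected and selectively overridden through configure() (documented below). The configuration keys shown are assumptions about the default configuration, not guaranteed names.

import gym
import highway_env  # importing is assumed to register the bundled environments
from highway_env.envs.common.abstract import AbstractEnv

# Inspect the class-level defaults.
print(AbstractEnv.default_config())

# Override a few entries of a concrete environment; call configure() before reset().
env = gym.make("highway-v0")
env.configure({
    "simulation_frequency": 15,  # physics steps per second (assumed key)
    "policy_frequency": 1,       # agent decisions per second (assumed key)
})
obs = env.reset()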

seed(seed: int = None) → List[int][source]

Sets the seed for this env’s random number generator(s).

Note:

Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.

Returns

list<bigint>: the list of seeds used in this env’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.
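Continuing the sketch above, reseeding before reset() makes the scene generation reproducible:

seeds = env.seed(42)  # reseed this env's random number generator(s)
print(seeds)          # the first entry is the "main" seed described above
obs = env.reset()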

configure(config: dict) → None[source]
define_spaces() → None[source]
_reward(action: Union[int, numpy.ndarray]) → float[source]

Return the reward associated with performing a given action and ending up in the current state.

Parameters

action – the last action performed

Returns

the reward
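A hedged sketch of an override inside the hypothetical MyEnv subclass from the first example; the crashed attribute of the controlled vehicle (self.vehicle) is an assumption about the vehicle API, and the shaping itself is purely illustrative.

def _reward(self, action) -> float:
    # Illustrative shaping only: a small living bonus minus a collision penalty.
    return 0.1 - (1.0 if self.vehicle.crashed else 0.0)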

_is_terminal() → bool[source]

Check whether the current state is a terminal state.

Returns

whether the state is terminal
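Correspondingly, a sketch of a termination test for the same hypothetical subclass; self.steps and the "duration" configuration entry (added in the MyEnv sketch above) are assumptions.

def _is_terminal(self) -> bool:
    # End the episode on collision of the ego-vehicle, or after a fixed number of decision steps.
    return self.vehicle.crashed or self.steps >= self.config["duration"]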

_cost(action: Union[int, numpy.ndarray]) → float[source]

A constraint metric, for budgeted MDP.

If a constraint is defined, it must be used with an alternate reward that doesn’t contain it as a penalty.

Parameters

action – the last action performed

Returns

the constraint signal, the alternate (constraint-free) reward
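For a budgeted-MDP setting, a minimal constraint signal could simply count collisions; again a sketch, not the library's own implementation.

def _cost(self, action) -> float:
    # One unit of constraint cost whenever the ego-vehicle has crashed.
    return float(self.vehicle.crashed)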

reset() → numpy.ndarray[source]

Reset the environment to its initial configuration.

Returns

the observation of the reset state

step(action: Union[int, numpy.ndarray]) → Tuple[numpy.ndarray, float, bool, dict][source]

Perform an action and step the environment dynamics.

The action is executed by the ego-vehicle, and all other vehicles on the road perform their default behaviour for several simulation timesteps until the next decision-making step.

Parameters

action – the action performed by the ego-vehicle

Returns

a tuple (observation, reward, terminal, info)
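Putting reset(), step() and render() together, a typical interaction loop looks like the following sketch (random actions; the env instance from the earlier configuration example is reused):

obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()           # replace with a policy
    obs, reward, done, info = env.step(action)   # one decision step, several simulation steps
    env.render()
env.close()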

_simulate(action: Union[int, numpy.ndarray, None] = None) → None[source]

Perform several steps of simulation with constant action.

render(mode: str = 'human') → Optional[numpy.ndarray][source]

Render the environment.

Create a viewer if none exists, and use it to render an image.

Parameters

mode – the rendering mode

close() → None[source]

Close the environment.

Will close the environment viewer if it exists.

get_available_actions() → List[int][source]

Get the list of currently available actions.

Lane changes are not available on the boundary of the road, and speed changes are not available at maximal or minimal speed.

Returns

the list of available actions
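With a discrete meta-action space, this can be used to sample only among currently feasible actions; a sketch:

import random

available = env.get_available_actions()
action = random.choice(available)  # e.g. lane changes are filtered out on the road boundary
obs, reward, done, info = env.step(action)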

_automatic_rendering() → None[source]

Automatically render the intermediate frames while an action is still ongoing.

This allows the whole video to be rendered, and not only the single steps corresponding to agent decision-making.

If a callback has been set, use it to perform the rendering. This is useful for environment wrappers, such as a video-recording monitor, that need to access these intermediate renderings.
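For instance, a wrapper that collects every intermediate frame could set the automatic_rendering_callback attribute; in this sketch the callback is assumed to take no arguments, and the frame list is purely illustrative.

frames = []
env.unwrapped.automatic_rendering_callback = lambda: frames.append(env.render(mode="rgb_array"))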

simplify() → highway_env.envs.common.abstract.AbstractEnv[source]

Return a simplified copy of the environment where distant vehicles have been removed from the road.

This is meant to lower the computational load of the policy while preserving the set of optimal actions.

Returns

a simplified environment state
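This is handy when running a planner on a copy of the scene; a sketch, where the planner itself is hypothetical:

state = env.simplify()          # lightweight copy of the environment without distant vehicles
action = my_planner.act(state)  # hypothetical planner working on the simplified copy
obs, reward, done, info = env.step(action)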

change_vehicles(vehicle_class_path: str) → highway_env.envs.common.abstract.AbstractEnv[source]

Change the type of all vehicles on the road

Parameters

vehicle_class_path – the path of the class of behavior for other vehicles. Example: “highway_env.vehicle.behavior.IDMVehicle”

Returns

a new environment with modified behavior model for other vehicles
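A usage sketch with the behavior class mentioned above; note that a modified copy of the environment is returned rather than the environment being changed in place.

env = env.change_vehicles("highway_env.vehicle.behavior.IDMVehicle")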

set_preferred_lane(preferred_lane: int = None) → highway_env.envs.common.abstract.AbstractEnv[source]
set_route_at_intersection(_to: str) → highway_env.envs.common.abstract.AbstractEnv[source]
set_vehicle_field(args: Tuple[str, object]) → highway_env.envs.common.abstract.AbstractEnv[source]
__annotations__ = {'action_type': <class 'highway_env.envs.common.action.ActionType'>, 'automatic_rendering_callback': typing.Union[typing.Callable, NoneType], 'observation_type': <class 'highway_env.envs.common.observation.ObservationType'>}
__module__ = 'highway_env.envs.common.abstract'
call_vehicle_method(args: Tuple[str, Tuple[object]]) → highway_env.envs.common.abstract.AbstractEnv[source]
randomize_behaviour() → highway_env.envs.common.abstract.AbstractEnv[source]
to_finite_mdp()[source]
__deepcopy__(memo)[source]

Perform a deep copy but without copying the environment viewer.