Highway

In this task, the ego-vehicle is driving on a multilane highway populated with other vehicles. The agent’s objective is to reach a high speed while avoiding collisions with neighbouring vehicles. Driving on the right side of the road is also rewarded.

https://raw.githubusercontent.com/eleurent/highway-env/gh-media/docs/media/highway.gif

Usage

import gym
import highway_env  # importing highway_env registers highway-v0 with gym

env = gym.make("highway-v0")
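A full episode can then be run with the standard gym interaction loop. This is only a minimal sketch, relying on the reset() and step() signatures documented in the API section below and using a random policy as a placeholder:

obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # placeholder for a trained policy
    obs, reward, done, info = env.step(action)
    env.render()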

Default configuration

{
    "observation": {
        "type": "Kinematics"
    },
    "action": {
        "type": "DiscreteMetaAction",
    },
    "lanes_count": 4,
    "vehicles_count": 50,
    "duration": 40,  # [s]
    "initial_spacing": 2,
    "collision_reward": -1  # The reward received when colliding with a vehicle.
    "simulation_frequency": 15,  # [Hz]
    "policy_frequency": 1,  # [Hz]
    "other_vehicles_type": "highway_env.vehicle.behavior.IDMVehicle",
    "screen_width": 600,  # [px]
    "screen_height": 150,  # [px]
    "centering_position": [0.3, 0.5],
    "scaling": 5.5,
    "show_trajectories": False,
    "render_agent": True,
    "offscreen_rendering": False
}

More specifically, it is defined in:

HighwayEnv.default_config() → dict

Default environment configuration.

Can be overloaded in environment implementations, or by calling configure().

Returns

a configuration dict
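For example, individual keys can be overridden before starting an episode. A minimal sketch, assuming configure() merges the provided keys into the current configuration and that changes take effect on the next reset(); the values below are illustrative only:

import gym
import highway_env

env = gym.make("highway-v0")
env.configure({
    "lanes_count": 3,      # illustrative values, not recommendations
    "vehicles_count": 20,
    "duration": 60,        # [s]
})
obs = env.reset()          # apply the new configuration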

API

class highway_env.envs.highway_env.HighwayEnv(config: dict = None)

A highway driving environment.

The vehicle is driving on a straight highway with several lanes, and is rewarded for reaching a high speed, staying on the rightmost lanes and avoiding collisions.

RIGHT_LANE_REWARD: float = 0.1

The reward received when driving on the rightmost lanes, linearly mapped to zero for other lanes.

HIGH_SPEED_REWARD: float = 0.4

The reward received when driving at full speed, linearly mapped to zero for lower speeds.

LANE_CHANGE_REWARD: float = 0

The reward received at each lane change action.

default_config() → dict

Default environment configuration.

Can be overloaded in environment implementations, or by calling configure().

Returns

a configuration dict

reset() → numpy.ndarray

Reset the environment to its initial configuration.

Returns

the observation of the reset state

step(action: int) → Tuple[numpy.ndarray, float, bool, dict]

Perform an action and step the environment dynamics.

The action is executed by the ego-vehicle, and all other vehicles on the road perform their default behaviour for several simulation timesteps until the next decision-making step. With the default configuration, simulation_frequency / policy_frequency = 15 / 1, so each call to step() spans 15 simulation frames, i.e. one second.

Parameters

action – the action performed by the ego-vehicle

Returns

a tuple (observation, reward, terminal, info)

_create_road() → None

Create a road composed of straight adjacent lanes.

_create_vehicles() → None

Create some new random vehicles of a given type and add them to the road.

_reward(action: Union[int, numpy.ndarray]) → float

The reward is defined to foster driving at high speed, on the rightmost lanes, and to avoid collisions.

Parameters

action – the last action performed

Returns

the corresponding reward
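Putting those terms together, a minimal sketch of how the documented components could combine is given below. It assumes simple linear mappings and that lane indices increase toward the rightmost lane; the exact normalisation is defined by the implementation of _reward(), so treat this purely as an illustration.

def reward_sketch(crashed: bool, lane: int, lanes_count: int,
                  speed: float, speed_min: float, speed_max: float) -> float:
    # collision_reward: received when colliding with a vehicle
    collision = -1.0 if crashed else 0.0
    # RIGHT_LANE_REWARD: maximal on the rightmost lane, linearly mapped
    # to zero for the other lanes (lane indices assumed to grow rightward)
    right_lane = 0.1 * lane / max(lanes_count - 1, 1)
    # HIGH_SPEED_REWARD: maximal at full speed, linearly mapped to zero
    # for lower speeds
    high_speed = 0.4 * (speed - speed_min) / max(speed_max - speed_min, 1e-6)
    return collision + right_lane + high_speed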

_is_terminal() → bool

The episode is over if the ego-vehicle has crashed or the episode duration has elapsed.

_cost(action: int) → float

The cost signal is the occurrence of a collision.