Parking¶
A goal-conditioned continuous control task in which the ego-vehicle must park in a given space with the appropriate heading.
Usage¶
import gym
import highway_env

env = gym.make("parking-v0")
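The environment can then be driven through the standard gym loop. A minimal random-policy rollout, as a sketch (the sampled action is random, not a trained agent; in this goal-conditioned setting the observation bundles the current state with the desired goal):
obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # random continuous action; replace with a trained policy
    obs, reward, done, info = env.step(action)
    env.render()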
Default configuration¶
{
"observation": {
"type": "KinematicsGoal",
"features": ['x', 'y', 'vx', 'vy', 'cos_h', 'sin_h'],
"scales": [100, 100, 5, 5, 1, 1],
"normalize": False
},
"action": {
"type": "ContinuousAction"
},
"simulation_frequency": 15,
"policy_frequency": 5,
"screen_width": 600,
"screen_height": 300,
"centering_position": [0.5, 0.5],
"scaling": 7
"show_trajectories": False,
"render_agent": True,
"offscreen_rendering": False
}
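Any of these keys can be overridden before an episode starts, for example (a sketch; the chosen values are illustrative, and configure() is described in the API below):
env = gym.make("parking-v0")
env.configure({
    "policy_frequency": 10,
    "screen_width": 1200,
})
obs = env.reset()  # the updated configuration takes effect on reset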
More specifically, it is defined in highway_env.envs.parking_env.
API¶
class highway_env.envs.parking_env.ParkingEnv(config: dict = None)[source]¶
A continuous control environment.
It implements a reach-type task, where the agent observes its position and speed and must control its acceleration and steering so as to reach a given goal.
Credits to Munir Jojo-Verge for the idea and initial implementation.
REWARD_WEIGHTS: numpy.ndarray = array([1., 0.3, 0., 0., 0.02, 0.02])¶
SUCCESS_GOAL_REWARD: float = 0.12¶
STEERING_RANGE: float = 0.7853981633974483¶
classmethod default_config() → dict[source]¶
Default environment configuration.
Can be overloaded in environment implementations, or by calling configure().
- Returns
a configuration dict
step(action: numpy.ndarray) → Tuple[numpy.ndarray, float, bool, dict][source]¶
Perform an action and step the environment dynamics.
The action is executed by the ego-vehicle, and all other vehicles on the road perform their default behaviour for several simulation timesteps until the next decision-making step.
- Parameters
action – the action performed by the ego-vehicle
- Returns
a tuple (observation, reward, terminal, info)
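With the default ContinuousAction type, the action is a 2-dimensional array of values in [-1, 1]. A sketch of a single step (the mapping of the two components to acceleration and steering is defined by the action type, so the interpretation in the comment is an assumption):
import numpy as np

obs, reward, done, info = env.step(np.array([0.5, -0.1]))  # assumed: moderate acceleration, slight steering
print(reward, done)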
reset() → numpy.ndarray[source]¶
Reset the environment to its initial configuration.
- Returns
the observation of the reset state
_create_road(spots: int = 15) → None[source]¶
Create a road composed of straight adjacent lanes.
- Parameters
spots – number of spots in the parking lot
_create_vehicles() → None[source]¶
Create some new random vehicles of a given type, and add them to the road.
compute_reward(achieved_goal: numpy.ndarray, desired_goal: numpy.ndarray, info: dict, p: float = 0.5) → float[source]¶
Proximity to the goal is rewarded.
We use a weighted p-norm.
- Parameters
achieved_goal – the goal that was achieved
desired_goal – the goal that was desired
info (dict) – any supplementary information
p – the exponent of the weighted L^p norm used in the reward. Use p < 1 to have high kurtosis for rewards in [0, 1]
- Returns
the corresponding reward
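A worked sketch of such a weighted p-norm goal reward, using the REWARD_WEIGHTS and default p = 0.5 listed above (the exact internal formula may differ slightly; the feature order [x, y, vx, vy, cos_h, sin_h] follows the observation configuration):
import numpy as np

REWARD_WEIGHTS = np.array([1., 0.3, 0., 0., 0.02, 0.02])

def weighted_goal_reward(achieved_goal, desired_goal, p=0.5, weights=REWARD_WEIGHTS):
    # Negative weighted distance between achieved and desired goal features,
    # raised to the power p; the reward is 0 exactly at the goal and negative elsewhere.
    return -np.power(np.dot(np.abs(achieved_goal - desired_goal), weights), p)

achieved = np.array([0.05, 0.02, 0., 0., 1., 0.])  # slightly off the target pose
desired = np.array([0., 0., 0., 0., 1., 0.])
print(weighted_goal_reward(achieved, desired))  # ≈ -0.24
With SUCCESS_GOAL_REWARD = 0.12, the goal can be considered reached once this reward rises above -0.12, i.e. once the weighted distance to the desired pose is small enough (this use of the threshold is an assumption based on the attribute name).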
_reward(action: numpy.ndarray) → float[source]¶
Return the reward associated with performing a given action and ending up in the current state.
- Parameters
action – the last action performed
- Returns
the reward
_is_terminal() → bool[source]¶
The episode is over if the ego-vehicle has crashed or the goal is reached.
__annotations__ = {'REWARD_WEIGHTS': <class 'numpy.ndarray'>, 'STEERING_RANGE': <class 'float'>, 'SUCCESS_GOAL_REWARD': <class 'float'>}¶
__module__ = 'highway_env.envs.parking_env'¶