.. _quickstart:

===============
Getting Started
===============

Making an environment
------------------------

Here is a quick example of how to create an environment:

.. jupyter-execute::

    import gym
    import highway_env
    from matplotlib import pyplot as plt
    %matplotlib inline

    env = gym.make('highway-v0')
    env.reset()
    for _ in range(3):
        action = env.action_type.actions_indexes["IDLE"]
        obs, reward, done, info = env.step(action)
        env.render()

    plt.imshow(env.render(mode="rgb_array"))
    plt.show()

All the environments
~~~~~~~~~~~~~~~~~~~~

Here is the list of all the environments available and their descriptions:

.. toctree::
  :maxdepth: 1

  environments/highway
  environments/merge
  environments/roundabout
  environments/parking
  environments/intersection

.. _configuration:

Configuring an environment
---------------------------

The :ref:`observations <observations>`, :ref:`actions <actions>`, :ref:`dynamics <dynamics>` and :ref:`rewards <rewards>`
of an environment are parametrized by a configuration, defined as a
:py:attr:`~highway_env.envs.common.abstract.AbstractEnv.config` dictionary.
After environment creation, the configuration can be accessed through the
:py:attr:`~highway_env.envs.common.abstract.AbstractEnv.config` attribute.

.. jupyter-execute::

    import pprint

    env = gym.make("highway-v0")
    pprint.pprint(env.config)

For example, the number of lanes can be changed with:

.. jupyter-execute::

    env.config["lanes_count"] = 2
    env.reset()
    plt.imshow(env.render(mode="rgb_array"))
    plt.show()
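Since :py:attr:`~highway_env.envs.common.abstract.AbstractEnv.config` is a plain dictionary, several entries can also be updated at once before resetting. The sketch below assumes that the keys ``vehicles_count`` and ``duration`` appear in the default ``highway-v0`` configuration printed above; the exact set of available keys depends on the environment.

.. code-block:: python

    import gym
    import highway_env

    env = gym.make("highway-v0")
    # config is a plain dict, so several entries can be updated in one call.
    # The keys below are assumed to be part of the default highway-v0 configuration.
    env.config.update({
        "lanes_count": 3,       # number of lanes on the road
        "vehicles_count": 20,   # number of other vehicles (assumed key)
        "duration": 60,         # episode duration (assumed key)
    })
    # The updated configuration takes effect when the environment is reset.
    env.reset()

This is equivalent to setting the entries one by one, as done above for ``lanes_count``.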
Training an agent
-------------------

Reinforcement Learning agents can be trained using libraries such as `rl-agents <https://github.com/eleurent/rl-agents>`_,
`baselines <https://github.com/openai/baselines>`_ or `stable-baselines <https://github.com/hill-a/stable-baselines>`_.

.. figure:: https://raw.githubusercontent.com/eleurent/highway-env/gh-media/docs/media/parking-env.gif

    The parking-v0 environment trained with HER.

.. code-block:: python

    import gym
    import highway_env

    from stable_baselines import HER, SAC

    env = gym.make("parking-v0")

    # Create 4 artificial transitions per real transition
    n_sampled_goal = 4

    # SAC hyperparams:
    model = HER('MlpPolicy', env, SAC, n_sampled_goal=n_sampled_goal,
                goal_selection_strategy='future',
                verbose=1, buffer_size=int(1e6),
                learning_rate=1e-3,
                gamma=0.95, batch_size=256,
                policy_kwargs=dict(layers=[256, 256, 256]))

    model.learn(int(2e5))
    model.save('her_sac_highway')

    # Load saved model
    model = HER.load('her_sac_highway', env=env)

    obs = env.reset()

    # Evaluate the agent
    episode_reward = 0
    for _ in range(100):
        action, _ = model.predict(obs)
        obs, reward, done, info = env.step(action)
        env.render()
        episode_reward += reward
        if done or info.get('is_success', False):
            print("Reward:", episode_reward, "Success?", info.get('is_success', False))
            episode_reward = 0.0
            obs = env.reset()

Examples on Google Colab
-------------------------

Use these notebooks to train driving policies on `highway-env`.

.. |parking_mb| image:: https://colab.research.google.com/assets/colab-badge.svg
  :target: https://colab.research.google.com/github/eleurent/highway-env/blob/master/scripts/parking_model_based.ipynb

.. |planning_hw| image:: https://colab.research.google.com/assets/colab-badge.svg
  :target: https://colab.research.google.com/github/eleurent/highway-env/blob/master/scripts/highway_planning.ipynb

.. |parking_her| image:: https://colab.research.google.com/assets/colab-badge.svg
  :target: https://colab.research.google.com/github/eleurent/highway-env/blob/master/scripts/parking_her.ipynb

- A Model-based Reinforcement Learning tutorial on Parking |parking_mb|

  A tutorial written for `RLSS 2019 <https://rlss.inria.fr/>`_, demonstrating the principle of model-based reinforcement learning on the `parking-v0` task.

- Trajectory Planning on Highway |planning_hw|

  Plan a trajectory on `highway-v0` using the `OPD` :cite:`Hren2008` implementation from `rl-agents <https://github.com/eleurent/rl-agents>`_.

- Parking with Hindsight Experience Replay |parking_her|

  Train a goal-conditioned `parking-v0` policy using the :cite:`Andrychowicz2017` implementation from `stable-baselines <https://github.com/hill-a/stable-baselines>`_.