Release Stable-Baselines3 v2.0.0: Gymnasium Support · DLR-RM/stable-baselines3

Warning

Stable-Baselines3 (SB3) v2.0 will be the last version supporting Python 3.7 (end of life in June 2023).
We highly recommend upgrading to Python >= 3.8.

SB3 Contrib (more algorithms): https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
RL Zoo3 (training framework): https://github.com/DLR-RM/rl-baselines3-zoo
Stable-Baselines Jax (SBX): https://github.com/araffin/sbx

To upgrade:

pip install stable_baselines3 sb3_contrib rl_zoo3 --upgrade

or simply (rl zoo depends on SB3 and SB3 contrib):

pip install rl_zoo3 --upgrade

Breaking Changes:

Switched to Gymnasium as the primary backend; Gym 0.21 and 0.26 are still supported via the shimmy package (@carlosluis, @arjun-kg, @tlpss)
The deprecated online_sampling argument of HerReplayBuffer was removed
Removed deprecated stack_observation_space method of StackedObservations
Renamed environment output observations in evaluate_policy to prevent shadowing the input observations during callbacks (@npit)
Upgraded wrappers and custom environment to Gymnasium
Refined the HumanOutputFormat file check: now it verifies if the object is an instance of io.TextIOBase instead of only checking for the presence of a write method.
Because of the new Gym API (0.26+), the random seed passed to vec_env.seed(seed=seed) will only take effect after the env.reset() call.
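
The stricter HumanOutputFormat file check listed above can be illustrated with the standard library alone. This is a minimal sketch, not SB3's code: the WriteOnly class and both helper functions are hypothetical names used for illustration. It shows which objects pass a "has a write method" test but fail an io.TextIOBase instance test.

```python
import io


class WriteOnly:
    """Hypothetical object that exposes write() but is not a text stream."""

    def write(self, text):
        return len(text)


def has_write_method(obj):
    # Looser check: any object with a write attribute passes.
    return hasattr(obj, "write")


def is_text_stream(obj):
    # Stricter check named in the release note: require io.TextIOBase.
    return isinstance(obj, io.TextIOBase)


candidates = {
    "StringIO": io.StringIO(),
    "BytesIO": io.BytesIO(),
    "WriteOnly": WriteOnly(),
}

# Maps each name to (passes loose check, passes strict check).
results = {
    name: (has_write_method(obj), is_text_stream(obj))
    for name, obj in candidates.items()
}

for name, (loose, strict) in results.items():
    print(f"{name}: write method={loose}, TextIOBase={strict}")
```

Under this sketch, only the io.StringIO object passes both checks. A binary stream or a duck-typed writer passes the write-method test but is rejected by the instance test.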

New Features:

Added Gymnasium support (Gym 0.21 and 0.26 are supported via the shimmy package)
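
For readers migrating custom environments, the sketch below shows the shape of the Gymnasium-style API: reset() accepts a seed and returns (observation, info), and step() returns five values with separate terminated and truncated flags. CoinFlipEnv and rollout are hypothetical names, and the class uses only the standard library. A real environment would subclass gymnasium.Env and define observation_space and action_space.

```python
import random


class CoinFlipEnv:
    """Toy env following Gymnasium's reset()/step() signatures.

    Assumption: stdlib-only stand-in for illustration; it does not
    subclass gymnasium.Env and defines no spaces.
    """

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self._rng = random.Random()
        self._steps = 0
        self._coin = 0

    def reset(self, *, seed=None, options=None):
        # Seeding happens at reset, not through a separate seed() call.
        if seed is not None:
            self._rng = random.Random(seed)
        self._steps = 0
        self._coin = self._rng.randint(0, 1)
        return self._coin, {}  # (observation, info)

    def step(self, action):
        reward = 1.0 if action == self._coin else 0.0
        self._steps += 1
        self._coin = self._rng.randint(0, 1)
        terminated = False  # this toy task has no terminal state
        truncated = self._steps >= self.max_steps  # time limit reached
        return self._coin, reward, terminated, truncated, {}


def rollout(env, seed):
    """Run one episode with a fixed action and collect the observations."""
    obs, info = env.reset(seed=seed)
    observations = [obs]
    done = False
    while not done:
        obs, reward, terminated, truncated, info = env.step(0)
        observations.append(obs)
        done = terminated or truncated
    return observations


env = CoinFlipEnv()
first = rollout(env, seed=42)
second = rollout(env, seed=42)
print(first == second)  # same seed at reset gives the same episode
```

Because the seed is consumed inside reset(), two rollouts started with the same seed produce identical observation sequences.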

SB3-Contrib

Fixed QRDQN update interval for multi envs

RL Zoo

Gym 0.26+ patches to continue working with pybullet and TimeLimit wrapper
Renamed CarRacing-v1 to CarRacing-v2 in hyperparameters
Huggingface push to hub now accepts a --n-timesteps argument to adjust the length of the video
Fixed record_video steps (before it was stepping in a closed env)
Dropped Gym 0.21 support

Bug Fixes:

Fixed VecExtractDictObs not handling terminal observation (@WeberSamuel)
Set NumPy version to >=1.20 due to use of numpy.typing (@troiganto)
Fixed loading a DQN model changing target_update_interval (@tobirohrer)
Fixed env checker to properly reset the env before calling step() when checking for Inf and NaN (@lutogniew)
Fixed HER truncate_last_trajectory() (@lbergmann1)
Fixed HER desired and achieved goal order in reward computation (@JonathanKuelz)

Others:

Fixed stable_baselines3/a2c/*.py type hints
Fixed stable_baselines3/ppo/*.py type hints
Fixed stable_baselines3/sac/*.py type hints
Fixed stable_baselines3/td3/*.py type hints
Fixed stable_baselines3/common/base_class.py type hints
Fixed stable_baselines3/common/logger.py type hints
Fixed stable_baselines3/common/envs/*.py type hints
Fixed stable_baselines3/common/vec_env/vec_monitor|vec_extract_dict_obs|util.py type hints
Fixed stable_baselines3/common/vec_env/base_vec_env.py type hints
Fixed stable_baselines3/common/vec_env/vec_frame_stack.py type hints
Fixed stable_baselines3/common/vec_env/dummy_vec_env.py type hints
Fixed stable_baselines3/common/vec_env/subproc_vec_env.py type hints
Upgraded docker images to use mamba/micromamba and CUDA 11.7
Updated env checker to reflect what subset of Gymnasium is supported and improve GoalEnv checks
Improved type annotations of wrappers
Test envs are now checked too
Added render test for VecEnv and VecEnvWrapper
Updated issue templates and env info saved with the model
Changed seed() method return type from List to Sequence
Updated env checker doc and requirements for tuple spaces/goal envs

Documentation:

Added Deep RL Course link to the Deep RL Resources page
Added documentation about VecEnv API vs Gym API
Upgraded tutorials to Gymnasium API
Made it more explicit when using VecEnv vs Gym env
Added UAV_Navigation_DRL_AirSim to the project page (@heleidsn)
Added EvalCallback example (@sidney-tio)
Updated custom env documentation
Added pink-noise-rl to projects page
Fixed custom policy example (ortho_init was ignored)
Added SBX page

Full Changelog: v1.8.0...v2.0.0
