This repository is the official implementation of the paper:
Impoola: The Power of Average Pooling for Image-based Deep Reinforcement Learning
Raphael Trumpp, Ansgar Schäfftlein, Mirco Theile, and Marco Caccamo.
Presented at: Reinforcement Learning Conference (RLC) 2025.
As image-based deep reinforcement learning tackles more challenging tasks, increasing model size has become an important factor in improving performance. Recent studies achieved this by focusing on the parameter efficiency of scaled networks, typically using Impala-CNN, a 15-layer ResNet-inspired network, as the image encoder. However, while Impala-CNN evidently outperforms older CNN architectures, potential advancements in network design for deep reinforcement learning-specific image encoders remain largely unexplored. We find that replacing the flattening of output feature maps in Impala-CNN with global average pooling leads to a notable performance improvement. This approach outperforms larger and more complex models in the Procgen Benchmark, particularly in terms of generalization. We call our proposed encoder model Impoola-CNN. A decrease in the network’s translation sensitivity may be central to this improvement, as we observe the most significant gains in games without agent-centered observations. Our results demonstrate that network scaling is not just about increasing model size—efficient network design is also an essential factor.
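The architectural change itself is small: where Impala-CNN flattens its final feature maps before the linear layers, Impoola-CNN applies global average pooling first. A minimal PyTorch sketch of the idea (illustrative only; the channel and feature sizes below are placeholders, not the exact encoder from the paper):

```python
import torch
import torch.nn as nn

# Illustrative encoder tail only; sizes are placeholders.
feature_maps = torch.randn(1, 32, 8, 8)  # (batch, channels, height, width)

# Impala-CNN style: flatten the spatial grid -> 32 * 8 * 8 = 2048 features,
# so the linear layer depends on spatial positions (translation-sensitive).
impala_head = nn.Sequential(nn.Flatten(), nn.Linear(32 * 8 * 8, 256))

# Impoola-CNN style: global average pooling -> 32 features, independent of
# the spatial resolution and with far fewer parameters in the head.
impoola_head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),  # average over the spatial dimensions
    nn.Flatten(),
    nn.Linear(32, 256),
)

print(impala_head(feature_maps).shape)   # torch.Size([1, 256])
print(impoola_head(feature_maps).shape)  # torch.Size([1, 256])
```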
- We recommend using a virtual environment for the installation:

  ```bash
  python -m venv impoola
  ```

- Activate the environment and install the following packages:

  ```bash
  source impoola/bin/activate
  pip install torch torchrl numpy tyro matplotlib torchinfo wandb torch-pruning procgen stable_baselines3 tqdm gym==0.26.2 gymnasium==0.28.1
  ```
The PPO agent can be trained with the following command:

```bash
python ppo_training.py
```

and the DQN agent with:

```bash
python dqn_training.py
```

We provide launch files for both PPO and DQN in the `benchmark_utils` folder, allowing experiments to be run with different seeds across multiple GPUs.
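If you prefer not to use the launch files, a minimal sketch for sweeping seeds sequentially could look like the following (it assumes the training scripts accept a `--seed` argument, which you should verify against the defined `tyro` arguments; multi-GPU scheduling is left to the launch files):

```python
# Minimal sketch: run several PPO seeds one after another.
# Assumes ppo_training.py exposes a --seed flag (verify in the script).
import subprocess

for seed in (0, 1, 2):
    subprocess.run(
        ["python", "ppo_training.py", "--env_id", "fruitbot", "--seed", str(seed)],
        check=True,  # stop the sweep if a run fails
    )
```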
Results will be logged to Weights & Biases, so make sure to set up your W&B account and API key.
Command-line arguments are managed with `tyro` and are defined in each `*_training.py` file.
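As a rough illustration of the `tyro` pattern (the field names and defaults below are hypothetical placeholders; the actual arguments live in the training scripts):

```python
# Hypothetical sketch of the tyro argument pattern; field names and defaults
# here are placeholders, not the real ones from the training scripts.
from dataclasses import dataclass
import tyro


@dataclass
class Args:
    env_id: str = "fruitbot"           # Procgen game to train on
    distribution_mode: str = "easy"    # "easy" or "hard"
    total_timesteps: int = 25_000_000  # e.g., 25M steps


if __name__ == "__main__":
    args = tyro.cli(Args)  # parses --env_id, --distribution_mode, ...
    print(args)
```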
For example, the Procgen Benchmark environment can be specified with the `--env_id` argument:

```bash
python ppo_training.py --env_id fruitbot
```

Our defaults target the generalization track with 200 training levels and 25M training steps.
The hard setting can be specified with the `--distribution_mode` argument:

```bash
python ppo_training.py --env_id fruitbot --distribution_mode hard
```

Tools for plotting results and analyzing the trained models are provided in the `plot_utils` folder.
They use a customized `rlops` package and `openrlbenchmark`.
If you find our work useful, please consider citing our paper:
```bibtex
@inproceedings{trumpp2025impoola,
  title={Impoola: The Power of Average Pooling for Image-based Deep Reinforcement Learning},
  author={Raphael Trumpp and Ansgar Sch{\"a}fftlein and Mirco Theile and Marco Caccamo},
  booktitle={Reinforcement Learning Conference},
  year={2025},
  url={https://openreview.net/forum?id=Kkw4nqaM9Y}
}
```

GNU General Public License v3.0 only (GPL-3.0) © raphajaner

