OfflineRL: Towards Real-world Decision-making

Offline Reinforcement Learning (OfflineRL) is a reinforcement learning setting that aims to learn a good policy solely from a static dataset (previously collected by a behavior policy), without further interaction with the deployment environment. OfflineRL has received widespread attention in recent years as a potential reinforcement learning paradigm for cost-sensitive applications where online sampling is expensive or risky.

An illustration of offline RL. One key component of Offline RL is the static dataset, which contains experience from past interactions. This experience can come from various sources: typically, datasets are collected using expert policies, medium-level players, scripted policies, or human demonstrations. In the second phase, we train a policy with an offline reinforcement learning algorithm. Finally, we deploy the learned policy directly in the real world.
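The three phases above can be sketched in code. The snippet below is a minimal, hypothetical illustration (not any specific OfflineRL algorithm from the list): it builds a static dataset from a toy chain MDP with a random behavior policy, trains with batch Q-learning over that fixed dataset only, and then extracts a greedy policy for deployment. The environment, dataset size, and hyperparameters are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2

# Toy chain MDP (hypothetical stand-in for a real environment):
# action 1 moves right, action 0 moves left; reaching the last state pays 1.
def step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == n_states - 1 else 0.0
    return s2, r, s2 == n_states - 1

# Phase 1: collect a static dataset with a random behavior policy.
# After this point, the learner never interacts with the environment.
dataset = []
for _ in range(200):
    s = int(rng.integers(n_states))
    a = int(rng.integers(n_actions))
    s2, r, done = step(s, a)
    dataset.append((s, a, r, s2, done))

# Phase 2: offline training -- Q-learning sweeps over the fixed batch only.
gamma, lr = 0.9, 0.1
Q = np.zeros((n_states, n_actions))
for _ in range(500):
    for s, a, r, s2, done in dataset:
        target = r + (0.0 if done else gamma * Q[s2].max())
        Q[s, a] += lr * (target - Q[s, a])

# Phase 3: deployment -- the greedy policy is used as-is, with no fine-tuning.
policy = Q.argmax(axis=1)
print(policy)  # the learned policy favors moving right, toward the reward
```

Note that naive batch Q-learning like this can fail badly when the dataset does not cover the state-action pairs the greedy policy visits; handling that distribution shift is exactly what the OfflineRL algorithms collected here address.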

This website collects the latest resources related to offline reinforcement learning. Currently, we classify the resources into the following parts: algorithms, benchmarks, reading list, software, competitions, datasets, and interesting news. Details for each part can be found via the following links:

Algorithms ↗

proposed OfflineRL algorithms, related papers, and open-source code (if released).


Benchmarks ↗

proposed benchmarks that compare SOTA OfflineRL algorithms on standard datasets/environments.

Reading List ↗

valuable surveys and tutorials.

Software ↗

published software, applications, and repositories based on OfflineRL.

Competitions ↗

competitions related to OfflineRL.

Dataset ↗

proposed datasets suited to the OfflineRL setting.

Related news

Interesting news about OfflineRL.

Contact us: