---
title: Negatively Correlated Ensemble RL
emoji: 🌹
colorFrom: red
colorTo: yellow
sdk: gradio
python_version: 3.9
app_file: app.py
pinned: false
---

# Negatively Correlated Ensemble RL
## Installation
Create the conda environment:
```bash
conda create -n ncerl python=3.9
```
Activate the conda environment:
```bash
conda activate ncerl
```
Install the dependencies:
```bash
pip install -r requirements.txt
```
Note: this program does not require a GPU, but PyTorch must be installed. If your GPU supports CUDA, install the CUDA build of PyTorch; otherwise install the CPU build. The CUDA build speeds up inference.
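To confirm which PyTorch build is installed, a quick sanity check (assuming the `ncerl` environment is active) is:
```bash
# Prints the installed PyTorch version and whether a CUDA-capable GPU is usable
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```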
## Quick Start
To try the demo interactively, run
```bash
python app.py
```
and open the link printed in the terminal.
Alternatively, run
```bash
python generate_and_play.py
```
and check `models/example_policy/samples.png` to see the generated results.
## Training
All training runs are launched via `train.py` with an option and arguments. For example, executing `python train.py ncesac --lbd 0.3 --m 5` trains NCERL with hyperparameters $\lambda = 0.3$ and $m = 5$.
The plotting script is `plots.py`.
* `python train.py gan`: to train a decoder which maps a continuous action to a game level segment.
* `python train.py sac`: to train a standard SAC as the policy for online game level generation
* `python train.py asyncsac`: to train a SAC with an asynchronous evaluation environment as the policy for online game level generation
* `python train.py ncesac`: to train an NCERL based on SAC as the policy for online game level generation
* `python train.py egsac`: to train an episodic generative SAC (see paper [*The fun facets of Mario: Multifaceted experience-driven PCG via reinforcement learning*](https://dl.acm.org/doi/abs/10.1145/3555858.3563282)) as the policy for online game level generation
* `python train.py pmoe`: to train a PMOE (see paper [*Probabilistic Mixture-of-Experts for Efficient Deep Reinforcement Learning*](https://arxiv.org/abs/2104.09122)) as the policy for online game level generation
* `python train.py sunrise`: to train a SUNRISE (see paper [*SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning*](https://proceedings.mlr.press/v139/lee21g.html)) as the policy for online game level generation
* `python train.py dvd`: to train a DvD-SAC (see paper [*Effective Diversity in Population Based Reinforcement Learning*](https://proceedings.neurips.cc/paper_files/paper/2020/hash/d1dc3a8270a6f9394f88847d7f0050cf-Abstract.html)) as the policy for online game level generation

For the training arguments, please refer to the help: `python train.py [option] --help`
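As a quick reference, a typical session reusing the example command above (hyperparameter values are illustrative; see `--help` for the full argument list) looks like:
```bash
# Train NCERL (SAC-based) with lambda = 0.3 and m = 5
python train.py ncesac --lbd 0.3 --m 5

# Show all arguments accepted by the ncesac option
python train.py ncesac --help
```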
## Directory Structure
```
NCERL-DIVERSE-PCG/
* analysis/
* generate.py  unused
* tests.py  used for evaluation
* media/  assets for the markdown files
* models/
* example_policy/  used for the generation demo
* smb/  Mario simulator and image assets
* src/
* ddpm/  DDPM model code
* drl/  DRL models and training code
* env/  Mario gym environment and reward functions
* gan/  GAN model and training code
* olgen/  online generation environment and policy code
* rlkit/  reinforcement learning building blocks
* smb/  components for interacting with the Mario simulator and the multi-process asynchronous pool
* utils/  utility functions
* training_data/  training data
* README.md  this file
* app.py  entry point for the Gradio demo
* generate_and_play.py  entry point for the non-Gradio demo
* train.py  training entry point
* test_ddpm.py  script for testing DDPM training
* requirements.txt  dependency list
```