Update README.md

b91a4da verified 6 months ago

363 Bytes

metadata

license: mit
base_model:
  - Qwen/Qwen2.5-3B-Instruct

This is the baseline checkpoint for paper: ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind, which is trained with RL but without theory of mind information.

Please refer to our Github Repo for usage details.