Post 180
https://x.com/OwainEvans_UK/status/1960359498515952039
It is weird that we can also understand humans better thanks to LLM research; human behavior has clear parallels. When we optimize for short-term pleasure (high time preference), we end up feeding the beast in us and become misaligned with other humans. But if we care about other humans (low space preference, a spatial analogue of time preference), we are more aligned with them. Feeding the ego has parallels to reward hacking, and overcoming the ego can be described as having a high human-alignment score.