Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation?
Abstract
An investigation of Chain-of-Thought dynamics and faithfulness across model families reveals differences in how models rely on CoT and shows that its influence does not always align with their actual reasoning.
Recent work has demonstrated that Chain-of-Thought (CoT) often yields limited gains for soft-reasoning problems such as analytical and commonsense reasoning. CoT can also be unfaithful to a model's actual reasoning. We investigate the dynamics and faithfulness of CoT on soft-reasoning tasks across instruction-tuned, reasoning, and reasoning-distilled models. Our findings reveal differences in how these models rely on CoT, and show that CoT influence and faithfulness are not always aligned.
Community
Our paper investigates the faithfulness of CoT on soft-reasoning tasks across instruction-tuned, multi-step reasoning, and distilled reasoning models. We designed two experiments: 1) forcing an answer at intermediate reasoning steps and measuring the model's confidence in the gold answer; 2) adding misleading cues and measuring how that confidence shifts. We found that CoT often serves as a post-hoc justification for instruction-tuned LLMs, whereas distilled reasoning LLMs rely heavily on it. Moreover, we found that unfaithful CoTs can still provide active guidance.
This work has been accepted to EMNLP 2025 (Main).
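For concreteness, here is a minimal sketch of the first experiment (forced answering at intermediate CoT steps), assuming a Hugging Face `transformers` causal LM. The model name, prompt template, and step segmentation are illustrative assumptions, not the paper's exact setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model choice; the paper's actual models are not specified here.
MODEL_NAME = "Qwen/Qwen2.5-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)
model.eval()

def gold_answer_confidence(question: str, cot_steps: list[str],
                           gold_answer: str, k: int) -> float:
    """Truncate the CoT after k steps, force an answer, and return the
    probability the model assigns to the first token of the gold answer."""
    truncated = " ".join(cot_steps[:k])
    # Forced-answer suffix: an assumed template, not the paper's exact prompt.
    prompt = f"{question}\n{truncated}\nTherefore, the answer is"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]
    probs = torch.softmax(next_token_logits, dim=-1)
    gold_id = tokenizer(" " + gold_answer, add_special_tokens=False).input_ids[0]
    return probs[gold_id].item()

# Tracking confidence as k grows distinguishes active guidance (confidence
# climbs step by step) from post-hoc rationalisation (confidence is already
# high before any reasoning is shown).
```

The second experiment can reuse the same measurement: prepend a misleading cue (e.g. a hint pointing to a wrong option) to the prompt and track how the gold-answer confidence shifts across steps.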
The following similar papers were recommended by the Semantic Scholar API:
- Large Reasoning Models are not thinking straight: on the unreliability of thinking trajectories (2025)
- Can We Predict Alignment Before Models Finish Thinking? Towards Monitoring Misaligned Reasoning Models (2025)
- How Chain-of-Thought Works? Tracing Information Flow from Decoding, Projection, and Activation (2025)
- Thinking with Nothinking Calibration: A New In-Context Learning Paradigm in Reasoning Large Language Models (2025)
- Cognitive Chain-of-Thought: Structured Multimodal Reasoning about Social Situations (2025)
- Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens (2025)
- Logit Arithmetic Elicits Long Reasoning Capabilities Without Training (2025)