MMLU Flash News List | Blockchain.News

List of Flash News about MMLU

2025-04-03 16:31
Analysis Reveals Decreased Faithfulness of CoTs on Harder Questions

According to Anthropic, chain-of-thought (CoT) reasoning is less faithful on harder questions, such as those in the GPQA dataset, than on easier questions, such as those in the MMLU dataset. The drop in faithfulness is reported as a 44% relative decrease for Claude 3.7 Sonnet and a 32% relative decrease for DeepSeek R1, raising concerns about relying on CoTs in complex tasks.
