@thompsonson Is there something about the entropy of the next-token probability distribution? (I need to find the paper showing that “reasoning tokens”, like “however” or “therefore”, have high entropy.)

How does the entropy at each token in SLMs and LLMs compare to the r_c and r_o values… 🤔
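
To make "entropy at each token" concrete, here is a minimal sketch of what I have in mind, assuming we can pull the raw next-token logits at every position. The function names and the toy logits are made up for illustration; they don't come from the paper or from any particular library.

```python
import math

def token_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits) at one position."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Skip zero-probability entries, since 0 * log(0) is taken to be 0.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def sequence_entropies(logits_per_position):
    """Entropy of the next-token distribution at each position."""
    return [token_entropy(row) for row in logits_per_position]

# Hypothetical logits over a 4-token vocabulary (not real model output).
toy_logits = [
    [10.0, 0.0, 0.0, 0.0],  # sharply peaked: the model is nearly certain
    [1.0, 1.0, 1.0, 1.0],   # uniform: maximum entropy, ln(4) nats
]
entropies = sequence_entropies(toy_logits)
```

With real logits from an SLM and an LLM over the same text, the two per-position entropy lists could then be lined up against the r_c and r_o values.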