Nouha Dziri

RL Grokking Recipe -- How Can We Enable LLMs to Solve Previously Unsolvable Tasks with RL?

Can RL actually teach large language models new algorithms—or does it only “sharpen” what’s already latent in the base model?

5 min read · September 25, 2025

2025
Can LLMs Reason Outside the Box in Math?

Evaluating Exploratory, Compositional, and Transformative Generalization

18 min read · June 24, 2025

2025
DeepSeek R1: Innovative Research and Engineering Can Rival Brute-Force Scaling

Nice display of engineering and research

9 min read · January 30, 2025

2025
Current Paradigms of LLMs Safety Alignment are superficial

Discover why LLM safety alignment methods are superficial.

14 min read · September 30, 2024

2024
Have o1 Models Cracked Human Reasoning?

A speculative exploration of how o1 models work and whether LLMs have cracked human reasoning.

18 min read · September 23, 2024

2024