Welcome to rlxf.ai
- Authors: Shixiang Shane Gu (@RLXFai)
Welcome to rlxf.ai — my personal research blog.
I'm Shixiang Shane Gu, a Senior Staff Research Scientist at Google DeepMind working on the Gemini Thinking team. This site is a place where I'll share research notes, tutorials, and ideas across my areas of interest:
- Reinforcement learning from human feedback (RLHF) and reward modeling
- Reasoning and self-improvement in large language models
- Generative modeling (with a soft spot for Gumbel-Softmax)
- Sample-efficient deep RL
Why a blog?
The research I care about most lives at the intersection of rigorous math, scalable systems, and things that actually matter. I want this blog to be a place for half-formed ideas, Distill-style deep dives, and interactive demos, not just links to arXiv papers.
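Expect posts with LaTeX like this: the Gumbel-Softmax reparameterization trick lets us write a relaxed sample from a categorical distribution with class probabilities $\pi_1, \dots, \pi_k$ as:

$$
y_i = \frac{\exp\left((\log \pi_i + g_i)/\tau\right)}{\sum_{j=1}^{k} \exp\left((\log \pi_j + g_j)/\tau\right)}
$$

where $g_i \sim \text{Gumbel}(0, 1)$ and $\tau$ is the temperature parameter.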
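For readers who prefer code to notation, here is a minimal sketch of Gumbel-Softmax sampling in plain Python. It is an illustration rather than a reference implementation: the function names `sample_gumbel` and `gumbel_softmax` are made up for this post, and it assumes every class probability is strictly positive so that the logarithm is defined.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample via the inverse-CDF transform."""
    # Clamp away from 0 so that log(u) stays finite.
    u = max(rng.random(), 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(probs, tau, rng):
    """Return a relaxed one-hot sample for class probabilities `probs`.

    Assumes every entry of `probs` is strictly positive and tau > 0.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Perturb each log-probability with Gumbel noise, then scale by 1/tau.
    logits = [(math.log(p) + sample_gumbel(rng)) / tau for p in probs]
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Calling `gumbel_softmax([0.2, 0.3, 0.5], tau=0.5, rng=random.Random(0))` returns three non-negative weights that sum to 1. Lower temperatures push the output toward a one-hot vector, and taking the argmax of the output recovers an exact categorical sample (the Gumbel-max trick).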
More posts coming soon. In the meantime, find me on X/Twitter or check out my work on Google Scholar.