Shrivu Shankar (@shrivu):


opus 4.6 does still struggle a decent amount with research 'creativity'

have been running my own claude-powered rl research loop (~@karpathy autoresearch stuff) for the past 3 weeks, and the one thing I have to keep nudging it on is trying more exotic things

no amount of subagents or markdown seems to be enough to pull myself out of that part of the loop

for now, I've settled on:

1. [weekend] align on a research plan (scope/budget, 'let's explore these 3 abstract questions, you get $1k of Fireworks AI credit')

2. [week] completely hands off, let it loop, run the plan and spin off sub-plans to get as much empirical research signal as possible, converting compute into signal and hillclimbing a scoring metric

3. [weekend] take all the logs/reports from the week and align on a new research plan that's about a week's worth of looping and $1k worth of gpu compute
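
a rough sketch of what step 2 could look like as code, to make the 'convert compute into signal' part concrete. everything here is an assumption for illustration: `Plan`, `WeekLog`, `propose`, `run_experiment` and `week_loop` are hypothetical names, the scoring function is a toy stand-in for a real training/eval run, and `propose` stands in for whatever the agent would actually suggest next. it is not the actual loop, just the budget-capped hill-climbing shape of it:

```python
import random
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Output of the weekend planning step (hypothetical structure)."""
    questions: list          # the abstract questions to explore
    budget_usd: float        # total compute budget for the week
    cost_per_run_usd: float  # assumed flat cost of one experiment


@dataclass
class WeekLog:
    """What gets handed back for the next weekend review."""
    best_score: float
    best_config: dict
    spent_usd: float
    runs: list = field(default_factory=list)  # (config, score) per run


def run_experiment(config, rng):
    # Toy stand-in for a real run: score peaks at lr = 0.3, plus small noise.
    return -(config["lr"] - 0.3) ** 2 + rng.gauss(0, 0.001)


def propose(config, rng):
    # Stand-in for the agent proposing a nearby variation of the best config.
    step = rng.uniform(-0.1, 0.1)
    return {"lr": min(1.0, max(0.0, config["lr"] + step))}


def week_loop(plan, start_config, seed=0):
    """Hill-climb the scoring metric until the budget is used up."""
    rng = random.Random(seed)
    score = run_experiment(start_config, rng)
    log = WeekLog(score, start_config, plan.cost_per_run_usd,
                  [(start_config, score)])
    # Only start another run if it still fits in the remaining budget.
    while log.spent_usd + plan.cost_per_run_usd <= plan.budget_usd:
        candidate = propose(log.best_config, rng)
        score = run_experiment(candidate, rng)
        log.spent_usd += plan.cost_per_run_usd
        log.runs.append((candidate, score))
        if score > log.best_score:  # keep only improvements
            log.best_score, log.best_config = score, candidate
    return log


if __name__ == "__main__":
    plan = Plan(["q1", "q2", "q3"], budget_usd=1000.0, cost_per_run_usd=50.0)
    log = week_loop(plan, {"lr": 0.9})
    print(len(log.runs), "runs, spent", log.spent_usd, "best", log.best_config)
```

note that a proposer that only perturbs the current best config is exactly the kind of loop that stays local and never tries anything 'exotic', which is the part that still needs the weekend nudge.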

aligns with my growing belief that ai is best for taking 'larger leaps', pushing human effort to higher and higher level loops

Mar 22

5:52 PM
