4 Comments
Nathan Lubchenco:

In this example, I wonder how important actual intelligence is versus things like determination, persistence, and a belief that the goal is possible. I'm thinking through this lens because I know many quite smart people, and many of them are not especially effective at achieving their goals in the world. But if determination matters more than raw intelligence, we should be all the more concerned about the difficulties and risks here, because AI agents already appear to demonstrate these traits strongly.

Steven Adler:

Yeah, that's an interesting thought. I know that with humans, for instance, knowing that a puzzle is solvable seems to make a pretty big difference in whether people find the solution. There's probably an interesting experiment to be run on AI and whether being told there's a solution has a similar effect.

Nathan Lubchenco:

Yeah, that's an interesting direction. I had meant more that models don't currently seem to have a good sense of what's possible and what's not, which in some contexts is a limitation, but in others is a strength. A human is much more likely to assess difficulty and effort before a task, while a model will just try things. I have in mind things like the AI Village or Claude Plays Pokémon. So in some sense, the models not knowing their limitations resembles something akin to determination, grit, or agency.

But now I'm curious about your suggestion (maybe particularly on hard problems, like Tier 4 of FrontierMath).

Steven Adler:

If you want to get more into the weeds on "How might we safely govern AI agents?", my teammates and I at OpenAI wrote a paper on exactly this: https://cdn.openai.com/papers/practices-for-governing-agentic-ai-systems.pdf