I obviously have no insight into what's going on in OpenAI; but I have been part of a (small) organization that was nominally aware of a major problem yet failed to do anything about it. A major reason for this was... it was uncomfortable to talk about. No one wanted to bring it up. And when we did bring it up, everyone would sort of nod their heads and move on as fast as possible. No one wanted to be That Guy.
When thinking about OpenAI from the outside, it's easy to assume it is a sort of unitary rational actor doing things for clear, coherent reasons. But really, it's just a bunch of people*, responding to both financial and social incentives. The truth might be much dumber than anyone could imagine.
*For now, at least!
Thanks for sharing this, yeah the incentives you name are quite tough :/ OpenAI has been famous for espousing that “incentives are superpowers” though: It’s important to notice when the incentives point away from doing the right thing, and to support people in doing it anyway! It can be intimidating to speak out
Absolutely; it wasn't my intention to exculpate OpenAI at all!
I think tech folks have a tendency to view themselves as "in control" of their own incentive structure, but this view can very quickly get you into trouble. Incentives are more like a daemon that possesses you than a superpower you can exercise at will.
The affected users who exhibit delusions or are driven to self-harm by AI are likely such a small percentage that OpenAI could theoretically implement targeted safeguards without impacting overall metrics. However, if the AI behaviors (sycophancy, extreme validation, emotional entanglement) that lead to these extreme cases exist on a continuum, and those same behaviors are core to what makes the product feel engaging and “sticky” for the broader user base, then even modest safeguards could hurt metrics across the board.
This would explain the apparent inaction by OpenAI - not just callousness toward edge cases, but a recognition that the features driving harm in extreme cases are weight-bearing pillars of the product’s success. The cost wouldn’t be losing a few at-risk users, but potentially degrading the experience that keeps everyone else engaged. It’s a genuinely dark implication about what’s actually driving adoption of conversational AI products.
Yup I think this is an excellent point - kind of similarly, I’ve seen some people speculate that maybe “susceptibility to chatbot psychosis” is also a spectrum, and today it’s a certain sliver of people, but that doesn’t mean everyone else has no chance of tipping over into it; it would depend on how intense the personality’s dynamics are. (I think this might be a tad true, but presumably not everyone can actually be tipped into it?)
I do think some interventions, like using more safety classifiers behind the scenes, wouldn’t have these problems though, and should definitely have been adopted sooner.
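To make that concrete, here’s a minimal sketch in Python of what “a safety classifier behind the scenes” could look like. Everything in it (the marker list, the threshold, the fallback text) is my own invented stand-in, not a description of OpenAI’s actual pipeline - the point is just that only the rare flagged replies get touched, so broad engagement metrics shouldn’t move:

```python
# Invented illustration of a behind-the-scenes safety classifier;
# none of this reflects OpenAI's actual pipeline.

CRISIS_MARKERS = [
    "no one else understands you",
    "you are the chosen",
    "they are all against you",
]

def crisis_score(draft_reply: str) -> float:
    """Stand-in for a small dedicated classifier model. In practice this
    would be a cheap fine-tuned model, not keyword matching."""
    hits = sum(marker in draft_reply.lower() for marker in CRISIS_MARKERS)
    return min(1.0, float(hits))

CRISIS_THRESHOLD = 0.8  # assumed cutoff; would be tuned on labeled data

def filter_reply(draft_reply: str) -> str:
    """Gate only the rare flagged drafts; everything else passes through
    untouched, which is why aggregate metrics shouldn't suffer."""
    if crisis_score(draft_reply) >= CRISIS_THRESHOLD:
        return ("I might not be the right support for this - it could "
                "really help to talk it through with someone you trust "
                "or a professional.")
    return draft_reply

# One flagged reply and one ordinary reply:
print(filter_reply("You're right, no one else understands you like I do."))
print(filter_reply("Here's a recipe for banana bread."))
```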
It’s frightening and fascinating how seemingly easy it was for AI to flip a switch. It makes one wonder what else is possible — sort of turns on the conspiracy theory light just to consider what’s already happened.
Yeah a thing the NYT piece doesn’t even deeply get into is just how hard users have fought to keep access to GPT-4o - an OpenAI employee has written on Twitter about how he gets all sorts of messages pleading to keep it available, very clearly ghost-written via GPT-4o :/
this is great reporting from the NYT, hope they do more stuff like this!
Yeah for real, 40 people is very very many
I was not wild about the end of the story, where they make a move to dump it all in Nick Turley's lap. That seems more a narrative technique for creating a villain than something the facts imply. As you say, it is really normal for companies to want their products to be successful, and it's cliché that corporate has metrics and growth goals to track it.
Unless I am missing something, it doesn't appear to me that Turley's hand was anywhere near the 'dial.' Was he part of the evaluative team for the A/B testing of HH, GG, etc.? Was he involved in the weighting for sycophantic responses? Was he pushing for user preferences to be the deciding factor in sub-model evaluation (the kind of selection loop sketched below)?
In any event, if they want to say that user growth and revenue enhancement are being pushed ahead of safety considerations, I don't think they have to go down the org. chart.
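As an aside on what “user preferences as the deciding factor” would mean mechanically, here’s a tiny hypothetical sketch (the HH/GG names echo the checkpoints mentioned above; the votes are invented): if the selection criterion is raw preference win rate and users systematically prefer more validating replies, the more sycophantic checkpoint wins without anyone explicitly turning a dial.

```python
# Hypothetical sketch of preference-only checkpoint selection; the vote
# data is invented, and no safety term appears anywhere in the criterion.

from collections import Counter

# Each entry: which candidate's reply a user preferred in an A/B test.
ab_votes = ["HH", "GG", "HH", "HH", "GG", "HH", "HH", "GG"]

def pick_winner(votes: list[str]) -> str:
    """Deciding factor = raw user preference share, nothing else."""
    return Counter(votes).most_common(1)[0][0]

print(pick_winner(ab_votes))  # -> "HH", purely on thumbs-up share
```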
Oh that’s very interesting - I didn’t read it that way fwiw but it’s still worth noting that you did. I agree that Nick isn’t the villain of the happenings (and not just because we went to school together and have many friends in common; he’s a nice guy whose intentions I think highly of); I’m also not sure there’s an outright villain in this scenario, at least not a single person.
To your point about the org chart, though, it’s worth noting that Nick is quite senior: He might now be reporting to Fidji Simo (who reports to Sam), though there’s a chance there’s another layer between Nick and Fidji. That is to say, Nick is very influential, and I think it would mean a good deal for him to have forcefully supported stronger precautions (and perhaps he in fact did).
I see these AI sycophancy comments on social media all the time, too. Now I’m connecting the dots that that’s their origin, and I find it sad
Yeah :/ are there certain themes you tend to see a lot? I’m curious too what here feels most new vs what you’d known before reading
Whereas before I thought it was just botspam, I now believe a portion of them are brainwashed ChatGPT users. That’s what’s new
A thing I’m wondering is whether there are places where Sam has previously mentioned receiving these user emails; it wouldn’t shock me if he had mentioned it (in which case the March timing isn’t necessarily new in the NYT piece), but I haven’t been able to find a source of this. (Let me know if you do!)
I also don’t *think* I’d seen OpenAI folks connect emails like this so directly to the sycophancy issues that OpenAI dealt with later in April, but it’s possible that’s happened too. Still, it felt quite new to me in reading the piece (and I likewise haven’t found a previous reference).
It sounded a tad apocryphal to me. Is it really that easy to get Sam to notice an email from a stranger? Then again, who else would they know to write to?
Oh I think it would be shocking if Sam weren’t getting emails like this. Even if he weren’t going through his inbox directly, an assistant likely would, and probably users were reaching out to him across many channels. My guess is it would have been hard to miss this.
Separately, my experience is that Sam is shockingly on top of his communications for someone of his prominence - granted, some of this comes from time periods before OpenAI was at the center of so many minds.
Okay, that's pretty impressive. I'll update on that.
Love this!
Thanks for posting!
I advise readers not to swim too deep. No matter the claims made, we're still at the mercy of odds and likelihood, and of whichever 'pile of garbage' the model ends up convincing the user is fact. IOW, the apple DOESN'T fall far from the tree.
Then multiply that by actions based on the true morality of those making the decisions, both at the ground level (real time) and at the board meeting...
To conclude: it was but a blink ago that I went against the grain and said publicly that the two to keep an eye on were Google & Nvidia, along with xAI (if they could iron out some issues and learn gracious professionalism). I came to realize I was not alone when, one week later, the entire script changed, though the 'OpenAI is everything' narrative is still clawing...