Discussion about this post

Drew Margolin

Two things.

First, on AI self-explanation. I had a long conversation with Claude (4.6) about this, and it was very clear: it does not retain any memory of its “mental state” while performing an action, as a human would (however erroneously). So when you ask it to explain why it picked Kelsey Piper, it’s not doing that. It’s analyzing the input and the output and constructing the most plausible explanation it can find for its actions after the fact. This would, I think, bias it toward explanations that are understandable. But an understandable explanation is almost certainly inaccurate, because the real answer is that “Kelsey Piper” as an answer minimized or maximized some calculation in a very complex matrix of numbers. In this way, AI is the ultimate CYA bullshitter.

Second, I wonder whether it is guessing you because you have played this guessing game before. In other words, 4.7 was trained on data that included Kelsey Piper asking Claude to guess authors, and that’s a rare behavior. So when text that sounds a bit Kelsey-ish comes in, 4.7 knows she’s a good guess, not because the prose perfectly matches her, but because she is the best match among known guessing-game players.

Clues to that are:

1. The high school essay. I have a hard time believing it sounds similar to you now. It could, but I mean, I was a dreadful writer in high school. My college teachers told me so.

2. The bigger clue is that it found your friend’s Discord, which you are also in, based on something unrelated they wrote. This gives me a strong hunch it’s starting with you as a search premise (“she plays this game, let’s check her first”).

Nick Luchs

I ran this myself, and at first I couldn't replicate your results (Opus 4.7, with and without adaptive thinking, with and without Claude's incognito mode). No joke, it kept giving me Scott Alexander and Matt Yglesias too, along with some other secondary guesses. But I realized most of its guesses were people in my overly long "personal preferences" section where I list a bunch of writers. I thought you must have made the same mistake, combined with maybe some weird caching behavior that affected your friend.

Since I realized that it was drawing from those preferences, I deleted them and tried one more time from scratch...and then, over and over with slight variations, it couldn't not get Kelsey Piper as an answer. Wow.

I'm _still_ hoping it's some super weird new caching behavior since this is sort of terrifying. But I'm looking forward to seeing other writers try this with their work (especially less prolific ones with less presence in the training corpus).
