Newcomb’s Problem, for those of you with social lives, is this. A superintelligent “Predictor” puts two opaque boxes on a table. The first contains either $1,000,000 or nothing, while the second contains $1,000. You have a choice: you can either open the first box or both boxes. Either way, you get to keep whatever you find.
But (duhhh…) there’s a catch: the Predictor has already predicted what you’ll do. If he predicted you’ll open both boxes, then he left the first box empty; if he predicted you’ll open the first box only, then he put $1,000,000 in the first box. Furthermore, the Predictor has played this game hundreds of times before, with you and other people, and has never once been wrong.
So what do you do? As Robert Nozick wrote, in a famous 1969 paper:
“To almost everyone, it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly.”
Actually, people confronted with Newcomb’s Problem tend to split into three camps: the one-boxers, the two-boxers, and the Wittgensteins.
The one-boxers figure they might as well trust the Predictor: after all, he’s never been wrong. According to the prediction, if you open the first box you’ll get $1,000,000, while if you open both you’ll only get $1,000. So it’s a no-brainer: you should open only the first box.
“But that’s stupid!” say the two-boxers. “By the time you’re making the choice, the $1,000,000 is either in the first box or it isn’t. Your choice can’t possibly change the past. And whatever you’d get by opening the first box, you’ll get $1,000 more by opening both. So obviously you should open both boxes.”
(Incidentally, don’t imagine you can wiggle out of this by basing your decision on a coin flip! For suppose the Predictor predicts you’ll open only the first box with probability p. Then he’ll put the $1,000,000 in that box with the same probability p. So your expected payoff is 1,000,000p² + 1,001,000p(1-p) + 1,000(1-p)² = 1,000,000p + 1,000(1-p), and you’re stuck with the same paradox as before.)
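(For the skeptical, the cancellation of the quadratic terms is easy to check numerically. Here’s a minimal Python sketch — function and variable names are mine, not part of the problem — that sums the four cases of the coin-flip strategy and confirms they collapse to the linear expression, so mixing buys you nothing beyond interpolating between the pure strategies:

```python
def expected_payoff(p):
    """Expected winnings if you one-box with probability p, and the
    Predictor (who knows your mixing probability, not your flip)
    independently fills the first box with probability p."""
    return (1_000_000 * p * p              # one-box, box full:  $1,000,000
            + 1_001_000 * (1 - p) * p      # two-box, box full:  $1,001,000
            + 0 * p * (1 - p)              # one-box, box empty: $0
            + 1_000 * (1 - p) * (1 - p))   # two-box, box empty: $1,000

# The p² terms cancel, leaving 1,000,000p + 1,000(1-p):
for p in [0.0, 0.25, 0.5, 0.75, 1.0]:
    assert abs(expected_payoff(p) - (1_000_000 * p + 1_000 * (1 - p))) < 1e-6
```

Plugging in the endpoints, p = 1 recovers the pure one-boxer’s $1,000,000 and p = 0 the pure two-boxer’s $1,000.)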
The Wittgensteins take a third, boring way out. “The whole setup is contradictory!” they say. “It’s like asking what happens if an irresistible force hits an immovable object. If the ‘Predictor’ actually existed, then you wouldn’t have free will, so you wouldn’t be making a choice to begin with. Your very choice implies that the Predictor can’t exist.”
I myself once belonged to the Wittgenstein camp. Recently, however, I came up with a new solution to Newcomb’s Problem — one that I don’t think has ever been discussed in the literature. (Please correct me if I’m wrong.) As I see it, my solution lets me be an intellectually fulfilled one-boxer: someone who can pocket the $1,000,000, yet still believe the future doesn’t affect the past. I was going to write up my solution for a philosophy journal, but what fun is that? Instead, I hereby offer it for the enlightenment and edification of Shtetl-Optimized readers.
We’ll start with a definition:
“You” are anything that suffices to predict your future behavior.
I know this definition seems circular, but it has an important consequence: that if some external entity could predict your future behavior as well as you could, then we’d have to regard that entity as “instantiating” another copy of you. In other words, just as a perfect simulation of multiplication is multiplication, I’m asserting that a perfect simulation of you is you.
Now imagine you’re standing in front of the boxes, agonizing over what to do. As the minutes pass, your mind wanders:
I wonder what the Predictor thinks I’ll decide? “Predictor”! What a pompous asshole. Thinks he knows me better than I do. He’s like that idiot counselor at Camp Kirkville — what was his name again? Andrew. I can still hear his patronizing voice: “You may not believe me now, but someday you’ll realize you were wrong to hide those candy bars under the bed. And I don’t care if you hate the cafeteria food! What about the other kids, who don’t have candy bars? Didn’t you ever think of them?” Well, you know what, Predictor? Let’s see how well you can track my thoughts. Opening only one box would be rather odd, wouldn’t you say? Camp Kirkville, Andrew, candy bar – that’s 27 letters in total. An odd number. So then that settles it: one box.
What’s my point? That reliably predicting whether you’ll take one or both boxes is “you-complete,” in the sense that anyone who can do it should be able to predict anything else about you as well. So by definition, the Predictor must be running a simulation of you so detailed that it’s literally a copy of you. But in that case, how can you possibly know whether you’re the “real” you, or a simulated version running inside the Predictor’s mind?
“But that’s silly!” you interject. “Here, I’ll prove I’m the ‘real’ me by pinching myself!” But of course, your simulated doppelganger says and does exactly the same thing. Let’s face it: the two of you are like IP and PSPACE, water and H2O, Mark Twain and Samuel Clemens.
If you accept that, then the optimal strategy is clear: open the first box only. Sure, you could make an extra $1,000 by opening both boxes if you didn’t lead a double life inside the Predictor’s head, but you do. That, and not “backwards-in-time causation,” is what explains how your decision can affect whether or not there’s $1,000,000 in the first box.
An important point about my solution is that it completely sidesteps the “mystery” of free will and determinism, in much the same way that an NP-completeness proof sidesteps the mystery of P versus NP. What I mean is that, while it is mysterious how your “free will” could influence the output of the Predictor’s simulation, it doesn’t seem more mysterious than how your free will could influence the output of your own brain! It’s six of one, half a dozen of the other. Or at least, that’s what the neural firings in my own brain have inexorably led me to believe.