The optimality argument has been refuted; see http://philsci-archive.pitt.edu/12429/1/soloccam.pdf

Thanks for the link! The paper actually does not agree with your conclusion, as per its footnote 5:

“This phrasing suggests a weaker property than reliability (i.e., convergence to the truth), namely optimality (convergence to predictions that are at least as good as those of any other prediction method). However, the proof that is referred to is about reliability.”

So the paper covers reliability, not optimality. As I initially said, any prior would converge with Bayesian inference, or as the paper says, “We can prove that any Bayesian predictor, operating under the inductive assumption of S, is reliable under the assumption that the data is indeed generated by some source S* ∈ S.”
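That reliability claim is easy to see in a toy example of my own devising (not from the paper): take a small hypothesis class S of coin biases, pick any non-dogmatic prior, however lopsided, and watch the posterior concentrate on the true source S* after enough data.

```python
import random

random.seed(0)

# Toy demonstration: with ANY prior over a hypothesis class S that gives
# every hypothesis nonzero weight, the Bayesian posterior concentrates on
# the true source S* in S -- the "reliability" property quoted above.
biases = [i / 10 for i in range(1, 10)]  # hypothesis class S: coin biases
true_bias = 0.7                          # S*, the actual data source

# An arbitrary, badly lopsided prior -- but every hypothesis gets > 0.
weights = [1, 1, 1, 1, 1, 1, 1, 1, 100]
posterior = [w / sum(weights) for w in weights]

for _ in range(2000):
    x = 1 if random.random() < true_bias else 0   # observe one coin flip
    likelihoods = [b if x == 1 else 1 - b for b in biases]
    unnorm = [p * l for p, l in zip(posterior, likelihoods)]
    z = sum(unnorm)
    posterior = [u / z for u in unnorm]           # Bayesian update

best = max(zip(posterior, biases))[1]
print(best)  # the posterior mode matches true_bias
```

Note that reliability here says nothing about *which* non-dogmatic prior converges fastest; that is where the optimality question, and Occam, comes in.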

I don’t think the optimality of Solomonoff induction is in dispute, and I think optimality is sufficient justification for Occam’s razor as a general recommendation for prediction: it’s simple, it’s explicable/communicable, it’s effective, and it’s optimal.

However, the paper also concludes that, “If there exists a way to salvage the argument at all, then it would have to consist in demonstrating anew that effectiveness as an inductive assumption does lead to a fundamental simplicity notion”.

That is, the paper does not definitively establish that effectiveness does not require simplicity, as you seem to imply; it concludes only that effectiveness is sufficient for Solomonoff induction, and that simplicity does not appear necessary unless it is itself required by effectiveness. That remains an open problem.

Wow, thanks for that link! I wasn’t expecting much when I clicked on it but it makes a good argument. The key point isn’t stated clearly up front, though, so let me summarize it for everyone else. (Warning, I may be butchering some of the technical details regarding different notions of computability a little.)

So, the idea is that a universal prior formalizes the principle of Occam’s Razor, and a universal prior works well (for some appropriate notion of that), so this shows that Occam’s Razor works, right? The thing is, there’s a theorem that *any* semicomputable prior on the set of semicomputable sequences (one that assigns nonzero probability to every semicomputable hypothesis) can be represented as a universal prior. So the idea that universal priors are some special Occamian subset of these priors is not true: if you demand semicomputability (and non-dogmatism), there’s nothing to contrast them with! And the argument “Occam’s Razor does well in a semicomputable world” just becomes “Assuming a semicomputable world (and being semicomputable yourself) does well in a semicomputable world”, which is less interesting.

OTOH, one could just reply “Yes, assuming a semicomputable world does well in a semicomputable world, but that’s equivalent to Occam’s Razor doing well in a semicomputable world, so what’s the problem? It still remains the case that Occam’s Razor does well in a semicomputable world, so unless you think the world might not be semicomputable, it’s still justified; it’s OK if one only implicitly uses Occam’s Razor rather than invoking it explicitly.” But, as mentioned above, this kind of requires acknowledging priors that don’t assume a semicomputable world (or that aren’t semicomputable themselves, or are dogmatic) as a live option to contrast against, so this reply seems to be on pretty shaky ground…

“Were you instead trying to argue NP=P^#P? If so, then for starters, why do you think NP=coNP? (I.e., if the number of satisfying assignments is zero, what’s a short proof of that fact that can be efficiently checked?)”

If you have a #P oracle, wouldn’t the proof that the number of satisfying assignments is zero simply be that the oracle tells you so?
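For concreteness, here is a toy sketch of my own (brute-force and exponential, so only an illustration, not an efficient oracle) of the point being made: a #SAT count of 0 is exactly an unsatisfiability answer, which is why a #P oracle can settle coNP questions directly.

```python
from itertools import product

# Toy #SAT "oracle" by brute force. A literal is an int: +i means
# variable x_i is true, -i means x_i is false; a formula is a list of
# clauses, each a list of literals (CNF). A returned count of 0 is
# precisely a verdict of unsatisfiability.

def count_sat(clauses, n):
    count = 0
    for assignment in product([False, True], repeat=n):
        if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count

# (x1 or x2) and (not x1) and (not x2) is unsatisfiable:
print(count_sat([[1, 2], [-1], [-2]], 2))  # 0 -> UNSAT
# (x1 or x2) alone has 3 satisfying assignments:
print(count_sat([[1, 2]], 2))  # 3
```

Of course, the quoted objection is about *short, efficiently checkable* proofs without such an oracle; given the oracle, the single count answers the question.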

A mathematical explanation of why Ockham’s Razor works is exactly what is in question.

Any prior would converge with Bayesian inference, but minimum-length description/Ockham’s razor is a *non-arbitrary* partial ordering that converges optimally, which is why it’s the universal prior in Solomonoff Induction.
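To make the length-weighted ordering concrete, here is a toy sketch of mine (real Solomonoff induction mixes over all programs of a universal Turing machine and is incomputable; this restricts "programs" to repeat-this-pattern hypotheses): each pattern of length k gets prior weight 2^-k, so shorter descriptions dominate the mixture prediction.

```python
# Toy MDL-style predictor: hypotheses are "repeat this bit pattern
# forever", and a pattern of length k gets prior weight 2**-k, echoing
# the universal prior's bias toward short descriptions.

def consistent(pattern, data):
    # Does repeating this pattern reproduce the observed data?
    return all(data[i] == pattern[i % len(pattern)] for i in range(len(data)))

def predict_next(data, max_len=8):
    # Mixture prediction for the next bit over all surviving patterns.
    weight_one = weight_total = 0.0
    for k in range(1, max_len + 1):
        for p in range(2 ** k):
            pattern = [(p >> i) & 1 for i in range(k)]
            if consistent(pattern, data):
                w = 2.0 ** -k
                weight_total += w
                if pattern[len(data) % k] == 1:
                    weight_one += w
    return weight_one / weight_total  # probability the next bit is 1

# After 010101, the short pattern "01" dominates, so 0 is strongly favored:
print(predict_next([0, 1, 0, 1, 0, 1]))  # well below 0.5
```

The point of the sketch: longer patterns consistent with the data still get weight, but exponentially less, which is the non-arbitrary ordering at work.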

Their experience of reality is only through this VR system. They would build a theory of how physics works in such a system, and come up with models of their own brains, but they would never realize that their own minds are actually tapping into a much more comprehensive reality, which can’t be modeled within the world they’re experiencing.

Another example would be to couple a quantum computer with a classical simulation.

But Turing machines can be constructed in Minecraft, fred. They would be confused by how powerful their minds could be. But they should be able to figure out that it’s all reducible to a Turing machine.

(Also, since Solomonoff induction is based on the notion of a Universal Turing Machine, the predictions it makes can vary drastically depending on which implementation of a UTM you use, since there is no unique “ideal” UTM known to date. But that’s another problem.)

This would mean that induction works necessarily, and that therefore there must exist some purely mathematical explanation of why induction works. E.g. a proof that (in some precise sense) “simpler” hypotheses are indeed more likely to be true, as Ockham’s razor conjectures.

Indeed, read up on Solomonoff Induction.
