Wednesday, October 21, 2009

MWI proposals that include modifications of physics

Previous: Decision Theory and other approaches to the MWI Born Rule, 1999-2009

The greatest appeal of the Everett-style Many-Worlds Interpretation of QM - that is, the wave equation alone, or standard MWI - is its simplicity in terms of mathematics and physics. While all other interpretations must add something extra - hidden variables with dynamics of their own, or modifications of the wave equation itself to include random collapse processes - the Everett MWI states that the standard wave equation alone can explain everything we observe.

Yet, despite numerous attempts and claims to the contrary (and putting aside the possibilities for my own approach for now), the Born Rule probabilities have not been derived from the Everett picture. Thus, it may prove necessary to add new physics to our description of QM after all.

However, some approaches attempt to retain much of the advantage in simplicity of the MWI, as well as its multiple-worlds character, while making such modifications. It's a promising idea, and these MWIs certainly take inspiration from the Everett-style MWI, but they add too much to the pure wave picture to satisfy true Everett-style MWI partisans.

a) Hidden variables introduce complexity not only because of the extra dynamic equations, but because they require some choice of initial conditions. The wavefunction of QM also requires initial conditions, of course, but there is reason to hope that some simple equation could govern those initial conditions; the Hartle-Hawking 'no boundary' condition for the wavefunction of the universe is a well-known example of such a proposal (though it has problems of its own). Particle-like hidden variables do not seem amenable to such simple specification of initial conditions. However, if all possible sets of hidden variable initial conditions are equally real, then the overall simplicity of initial conditions for the multiverse is restored.

'Continuum Bohmian Mechanics' (CBM) is the best-known example of this approach. Like the Pilot Wave Interpretation, it has particle-like hidden variables; but instead of just one set of them, it has a continuous distribution of such sets, which act much like a continuous fluid. In addition to the possibility of simpler initial conditions, CBM may be immune to the fatal flaw of the PWI, which is being 'many-worlds in denial'. In other words, in the 'single-world' PWI, with one set of hidden variables, most of the observers will end up being implemented by the many worlds of the wavefunction, so the hidden variables won't matter. With CBM, the number of hidden variable sets is also infinite, so a typical observer could depend on the hidden variables after all. (This latter claim still needs to be proven compatible with computationalist considerations, but it is plausible.)

The hidden variables in the PWI follow the Born Rule, so CBM should be OK in that regard. But CBM retains the other features of the PWI that many physicists dislike, namely non-locality and a preferred reference frame. It is also not clear how well a relativistic version of the PWI works, and CBM inherits such problems. (Granted, no physics that works for quantum general relativity is known yet.) Also, even CBM is not as simple as the standard MWI.

I regard CBM (and more generally, MWI's with hidden variables) as something useful to keep in mind, as it is one of the few interpretations of nonrelativistic QM that seems to actually work in terms of being compatible with the Born Rule and not having an 'MWI in denial' problem. But other possibilities must be thoroughly explored before I would consider endorsing CBM as being likely to be true.

b) Another approach is to retain a pure wavefunction picture as in Everett's MWI, but to make the wavefunction be discrete instead of continuous. Discrete space is not what is meant here, but rather a discrete nature of the wavefunction itself. Buniy et al advocate such an approach.

The most obvious way to do that might be to assume that the wavefunction is represented by an integer function on configuration space rather than a continuous function. (If configuration space is also discrete, that is one way an approximate discrete numerical representation of a continuous wave function might be done on a digital computer.)

Buniy et al propose a somewhat different assumption, in which wavefunctions separated by a term of less than some minimum squared amplitude are considered to be the same.

Because the wave function (or perhaps I should say its 'populated' region) is constantly and rapidly expanding into new areas of configuration space (e.g. as entropy increases), its numerical value is constantly imploding. I will call this the Wavefunction Value Implosion (WVI). If the universe is finite, then this effect will be finite, but exponentially large as a function of the number of particles in the universe. Thus, a discrete wavefunction could not be detected experimentally if its discrete nature is small enough, until such time as the WVI brings the populated part of the wavefunction below that scale, and then presumably time evolution will radically change or effectively stop; I will call this the Crash.
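
As a very rough illustration of the scale (a toy calculation with made-up numbers, not Buniy et al's estimate), suppose each branching event splits the populated region into two roughly equal sub-branches while the total squared amplitude is conserved. Then the typical squared amplitude per sub-branch after n splittings is about 2^-n, and the Crash occurs once that falls below the discreteness scale:

import math

# Toy WVI estimate: typical squared amplitude per branch after n equal binary
# splittings is ~ 2**-n (total squared amplitude conserved).  The discreteness
# scale is written as 10**(-k); the values of k below are made up.
def splittings_until_crash(k):
    # solve 2**-n = 10**-k  =>  n = k * log2(10)
    return k * math.log2(10)

for k in (100, 1000, 10**80):
    print(f"cutoff 1e-{k}: crash after ~{splittings_until_crash(k):.3g} splittings")

The point of the toy numbers is only that the discreteness scale would have to be fantastically small for the Crash to lie safely in the future.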

Discrete physics has a certain appeal to some people, independent of any possible role in quantum mechanics. Wolfram's book "A New Kind of Science" discusses such views. Also, the idea that all possible mathematical universes physically exist (the Everything Hypothesis, which will be discussed in a later post) may be somewhat more tractable if it is restricted to digital systems (though, despite its undeniable appeal, it still has problems even then).

If the wavefunction is discrete, would that help explain the Born probabilities? Buniy et al argue that it would, by shoring up the old "frequency operator" attempted derivation by extending it to finite numbers of measurements rather than infinite. This argument notes that, after repeated measurements, terms in the wavefunction which don't have the Born frequencies have much smaller amplitudes than the terms with the 'right' frequencies. With a minimum amplitude cutoff, most of the un-Born terms would be eliminated. This argument does not seem very satisfactory, as we are interested in situations with small numbers of measurements, and thus small factors of difference in amplitude, while the digital cutoff would have to be very far from a significant fraction of the total amplitude if the Crash is not yet upon us. In practical situations, other factors would affect amplitudes much more. For example, entropy production is not associated with low probability, but it results in numerous sub-branches each of which shares a fraction of the original squared amplitude.

Shoring up the 'Mangled Worlds' argument would seem a more promising approach. There are many sub-branches comprising each macroscopically distinguishable world, and they tend to have a log-normal distribution in squared amplitude. As Hanson showed, a cutoff in the right range of squared amplitudes would lead to Born Rule probabilities. This cutoff must be uniform across branches, which Hanson's 'mangling' mechanism by larger branches actually fails to provide, but a digital cutoff could provide it. I will tentatively say that this is a possible mechanism for the Born Rule, though I need to study it more before I can say for sure that there are no problems that would ruin it. In particular, if the number of worlds changes too much over time or the era in which the Born Rule holds is too short, that would indicate a problem.

c) Another mechanism that improves on 'Mangled Worlds' is my own idea in which random noise in the initial wavefunction means that larger volumes in configuration space per implemented computation are required for low-amplitude sub-branches, which can lead to the Born Rule. This requires new physics in the form of special initial conditions, but hopefully not in terms of time evolution. It is possible that this leads to a Boltzmann Brains problem. I will discuss this, as well as an alternative in which the Born Rule is due to a special way to count computations (which avoids new physics - if it can be justified) in later posts.

d) The Everything Hypothesis (that all possible mathematical structures exist) can be used directly in an attempt to predict what a typical observer would observe. Some have argued that this explains what we observe, including the Born Rule. The Everything Hypothesis will be discussed in a post of its own.

e) Other MW schemes for modifying physics have been proposed.

One example is Michael Weissman's idea involving sudden splitting of existing worlds into proposed new degrees of freedom, with a higher rate of such splitting events for higher amplitude worlds. The problem with it is that if new worlds are constantly being produced, then the number of observers would be growing exponentially. The probability of future observations, as far into the future as possible, would be much greater than that of our current observations. Thus, the scheme must be false unless we are highly atypical observers, which is highly unlikely.

David Strayhorn had an idea based on general relativity, in which different topologies correspond to different sub-branches of the wavefunction. This approach is not well-developed as of yet and it has problems that I think will prevent it from working. I discussed it in various posts on the OCQM yahoo group.

Saturday, September 26, 2009

Decision Theory & other approaches to the MWI Born Rule problem, 1999-2009

In the previous post, I explained the early attempts to derive the Born Rule for the MWI. These attempts required assumptions for which no justification was given; as a result, critics of the MWI pointed to the lack of justification for the Born Rule as a major weakness of the interpretation.

MWI supporters often had to resort to simply postulating the Born Rule as an additional law of physics. That is not as good as a derivation, which would be a great advantage for the MWI, but it at least puts the MWI on the same footing as most other interpretations. However, it is by no means clear that it is legitimate to do that, either. Many people think that branch-counting (or some form of observer-counting) must be the basis for probabilities in an MWI, as Graham had suggested. Since branch-counting gives the wrong probabilities (as Graham failed to realize), a critic might argue that experiments (which confirm the Born rule) show the MWI must be false.

Thus, MWI supporters were forced to argue that branch-counting did not, in fact, matter. The MWI still had supporters due to its mathematical simplicity and elegance, but when it came to the Born Rule, it was in a weak position.

In the famous Everett FAQ of 1995, Price cited the old 'infinite measurements frequency operator' argument. That was my own first encounter with the problem of deriving the Born Rule for the MWI, and despite being an MWI supporter, I immediately saw the finite-number-of-measurements hole in the infinite-measurements argument.

5) The decision-theoretic approach to deriving the Born Rule

In 1999, David Deutsch created a new approach to deriving the Born Rule for the MWI, based on decision theory. He wrote "Previous attempts ... applied only to infinite sets of measurements (which do not occur in nature), and not to the outcomes of individual measurements (which do). My method is to analyse the behaviour of a rational decision maker who is faced with decisions involving the outcomes of future quantum-mechanical measurements. I shall prove that if he does not assume [the Born Rule], or any other probabilistic postulate, but does believe the rest of quantum theory, he necessarily makes decisions as if [the Born Rule] were true."

Deutsch's approach quickly attracted both supporters and critics. David Wallace came out with a series of papers that defended, simplified and built on the decision theory approach, which is now known as the Deutsch-Wallace approach.

Deutsch's derivation contained an implicit assumption, which Wallace made explicit, and called 'measurement neutrality'. Basically, it means that the details of how a measurement is made don't matter. For example, if a second measurement is made along with the first, it is assumed that the probabilities for the outcomes of the first won't be affected. This implies that unitary transformations, which preserve the amplitudes, don't matter. That implies 'equivalence', which states that two branches of equal amplitudes have equal probabilities, and which is essentially equivalent to the Born Rule. The Born Rule is then derived from 'equivalence' using simple assumptions cast in the language of decision theory.

Wallace acknowledged that 'measurement neutrality' was controversial, admitting "The reasons why we treat the state/observable description as complete are not independent of the quantum probability rule." Indeed, if probabilities depend on something other than amplitudes, then clearly they can change under unitary transformations.

So he offered a direct defense of the 'equivalence' assumption, which formed the basis of the paper that was for a long time considered the best statement of the DW approach, certainly as of the 2007 conferences. New Scientist magazine proclaimed that his derivation of the Born Rule in the MWI was "rigorous" and was forcing people to take the MWI seriously.

His basic argument was that things that the person making a decision doesn't care about won't matter. This included the number of sub-branches, but he also took care to argue that the number of sub-branches can't matter because it is not well-defined.

Consider Albert's hypothetical fatness rule, in which probabilities are proportional both to the squared amplitudes and to the observer's mass. This obviously violates 'equivalence'. According to Wallace's argument, the decider should ignore his mass unless it comes into play for the decision, so that is impossible. But it is a circular argument; the decider should care about his mass if it in fact affects the probabilities.

My critique of Wallace's approach is presented in more detail here, where I also cover his more recent paper.

In his 2009 paper, Wallace takes a different approach. Perhaps recognizing that assuming 'equivalence' is practically the same as just assuming the Born Rule, he makes some other assumptions instead, couched in the language of decision theory, which allow him to derive 'equivalence'. The crucial new assumption is what he calls 'diachronic consistency'. In addition to consistency of desires over time, it contains the assumption of conservation of measure as a function of time, for which there is no justification. Of course, the classical version of diachronic consistency is unproblematic, and only a very careful reading of the paper would reveal the important difference if it were not for the fact that Wallace helpfully notes that Albert's fatness rule violates it.

6) Zurek's envariance

W. Zurek attempted to derive the Born Rule using symmetries that he called 'envariance' or environment-assisted invariance. While interesting, his assumptions are not justified. The most important assumption is that all parts of a branch, and all observers in a branch, have the same "probability". Albert's fatness rule provides an obvious counterexample. I also note that a substate with no observers in it can not meaningfully be assigned any effective probability.

He uses this, together with another unjustified assumption that is similar to locality of probabilities, to obtain what Wallace called 'equivalence' and then the Born Rule from that. Because the latter part of Zurek's derivation is similar to the DW approach, the two approaches are sometimes considered similar, although Zurek does not invoke decision theory.

7) Hanson's Mangled Worlds

Robin Hanson came up with a radical new attempt to derive the Born Rule in 2003. It was similar to Graham's old world-counting proposal in that Hanson proposed to count sub-branches of the wavefunction as the basis for the probabilities.

The new element Hanson proposed was that the dynamics of sub-branches of small amplitude would be ruined, or 'mangled', by interference from larger sub-branches of the wavefunction. Thus, rather than simply count sub-branches, he would count only the ones with large enough amplitude to escape the 'mangling'.

Due to microscopic scattering events, a log-normal squared-amplitude distribution of sub-branches arises, as it is a random walk in terms of multiplication of the original squared-amplitude. Interference ('mangling') from large amplitude branches imposes a minimum amplitude cutoff. If the cutoff is in the right numerical range and is uniform for all branches, then due to the mathematical form of the log-normal function, the number of branches above the cutoff is proportional to the square of the original amplitude, yielding the Born Rule.
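
Here is a small numerical sketch of that claim (my own toy check with invented parameter values, not Hanson's calculation). Two macroscopic branches are given log-normal sub-branch squared-amplitude distributions whose log-medians differ by the log of their original squared amplitudes; a single cutoff, placed in the 'right range' above the common median offset, is applied to both, and the surviving fractions are compared to the Born ratio:

import math

def upper_tail(x):
    # standard normal upper-tail probability Q(x)
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Toy parameters (all made up): each macroscopic branch splits into the same
# enormous number of sub-branches, so the median log squared amplitude of a
# sub-branch is ln(w) + offset, with the same offset for both branches.
sigma = 20.0                      # spread of the log squared amplitudes
offset = -1000.0                  # stands in for ln(1 / number of sub-branches)
ln_cutoff = offset + sigma ** 2   # a cutoff in the 'right range' above the median

def surviving_fraction(w):
    # fraction of this branch's sub-branches whose log squared amplitude
    # exceeds the cutoff, for a normal distribution of log squared amplitudes
    return upper_tail((ln_cutoff - (math.log(w) + offset)) / sigma)

w1, w2 = 0.9, 0.1                 # original squared amplitudes of two macroscopic branches
ratio = surviving_fraction(w1) / surviving_fraction(w2)
print("surviving sub-branch ratio:", round(ratio, 2), " Born ratio:", w1 / w2)

With the same number of sub-branches per macroscopic branch, the surviving-count ratio comes out close to 9, matching the Born ratio; moving the cutoff well outside that range spoils the agreement, which is why the placement and uniformity of the cutoff carry all the weight.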

Unfortunately, this Mangled Worlds picture relies on many highly dubious assumptions; most importantly, the uniformity of the ‘mangling’ cutoff. Branches will not interfere much with other branches unless they are very similar, so there will be no uniformity; small-amplitude main branches will have smaller sub-branches but also smaller interference from large main branches and thus a smaller cutoff.

Even aside from that, while the idea of branch-counting has some appeal, it is clear that observer-counting (with computationalism, implementation-counting) is what is fundamentally of interest. Nonetheless, 'Mangled Worlds' is an interesting proposal, and is the inspiration for a possible approach to attempt to count implementations of computations for the MWI, which will be discussed in more detail in later posts. That does require some new physics though, in the form of random noise in the initial conditions which acts to provide the uniform cutoff scale that is otherwise not present.

In the next post, proposals for MWIs that include modifications of physics will be discussed.

Wednesday, September 23, 2009

Early attempts to derive the Born Rule in the MWI

When Everett wrote his thesis in 1957 on the '"Relative State" Formulation of Quantum Mechanics', he certainly needed to address how the Born Rule probabilities fit into his new interpretation of QM. While the MWI remains provocative even today, it was not taken seriously in 1957 except by a few people, to the extent that Everett had to call it "Relative State" rather than "Many Worlds". So it is perhaps fortunate that he did not realize the true challenges of fitting the Born Rule into the MWI, which could have derailed his paper. Instead, he came up with a short derivation of the Born Rule, using assumptions that he did not realize lacked justification.

Of course, the Born Rule issue has long since returned to haunt the MWI. Historically, what has happened several times was that a derivation of the Born Rule that seemed plausible to MWI supporters was produced, but soon it attracted critics. After a few years it became clear to most physicists that the critics were right, and the MWI fell into disrespect until a new justification for the Born Rule was produced. This cycle continues today, with the decision-theoretic Deutsch-Wallace approach being considered the best by many, and now attracting growing (and deserved) criticism.

When considering claimed derivations of the Born Rule in the MWI, it is often useful to keep in mind an 'alternative rule' that is being ruled out, and to question the justification for doing so. Two useful ones are as follows:

a) The unification rule: All observations that exist have the same measure. In this case, branch amplitudes don't matter, as long as they are nonzero (and they always are, in practice).

b) David Albert's fatness rule: The measure of an observer is proportional to the squared amplitude (of the branch he's on) multiplied by his mass. Here, amplitudes matter, but so does something else. This one is especially interesting because it illustrates that not all observers necessarily have the same measure, even if they are on the same branch of the wavefunction. While it is obviously implausible, it's a useful stand-in for other possibilities that may seem more justifiable, such as using the number of neurons in the observer's brain instead of his mass, or any other detail of the wavefunction.

Another useful thing to keep in mind is the possibility of a modified counterpart to quantum mechanics, in which squared-amplitude would not be a conserved quantity. We would expect that the Born Rule might no longer hold, but some other Rule should, even in the absence of conserved quantities. Presumably, if the modification is small, so would be any departure from the Born Rule. Thus, one should not think that conserved quantities must have any special a priori importance without which no measure distribution is possible.

Let us examine a few of the early attempts to derive the Born Rule within the MWI:

1) Everett's original recipe

In Everett's 1957 paper, he models an observer in a fairly simple way, considering only a set of memory elements. This is a sort of rough approximation of a computational model, but without the dynamics (which are crucial for a well-defined account of computation). Thus, Everett was a visionary pioneer in applying computationalist thinking to quantum mechanics, but he never confronted the complexity of what would be required to do a satisfactory job of it.

He assumed that the measure of a branch would be a function of its amplitude only, and thus would not depend on the specific nature of that branch. This is a very strong assumption, and arguably contains his next assumption as a special case already. [A more general approach would allow other properties to be considered, such as in Albert's fatness rule.]

[Note: Everett's use of the term 'measure' is not stated to refer specifically to the amount of consciousness, but in this context, the role it plays is essentially the same as if it did. Some authors use 'measure of existence' to specifically mean the squared amplitude by definition; obviously Everett did not, since he wanted to prove that his measure was equal to the squared amplitude. I recommend avoiding overly suggestive terms (like 'weight') for the squared amplitude.]

Next, he assumed that measure is 'additive' in the sense that if two orthogonal branches are in superposition, they can be regarded as a single branch, and the same function of amplitude must give the same total measure in either case.

If the definition of a 'branch' is arbitrary in allowing combinations of orthogonal components, the 'additivity' assumption makes sense, since it means that it does not matter how the branches are considered to be divided up into orthogonal components. [An argument similar to that would be presented years later in Wallace's 2005 paper, in which Wallace defended the assumption of 'equivalence' (branches of equal amplitude must have equal measure) against the idea of sub-branch-counting, based on the impossibility of defining the specific number of sub-branches. Everett did not get into such detail.]

With the previous assumption, 'additivity' would only hold if the measure is proportional to the squared amplitude; thus, he concluded that the Born Rule holds.
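
In symbols, the step runs roughly as follows (my paraphrase, not Everett's own notation). If the measure of a branch of amplitude a is some function f(|a|), and a superposition of two orthogonal branches with amplitudes a and b can equally well be regarded as a single branch of amplitude sqrt(|a|^2 + |b|^2), then 'additivity' requires

f(|a|) + f(|b|) = f( sqrt(|a|^2 + |b|^2) )

Writing g(x) = f(sqrt(x)) turns this into g(|a|^2) + g(|b|^2) = g(|a|^2 + |b|^2), which is Cauchy's functional equation; its continuous, increasing solutions are g(x) = k x, so f(|a|) = k |a|^2 - measure proportional to squared amplitude.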

Everett considered the additivity requirement equivalent to saying that measure is conserved; thus, when a branch splits into two branches, the sum of the new measures is equal to the measure of the original branch. He gave no justification for the conservation of measure, perhaps considering it self-evident.

In classical mechanics, conservation of probability is self-evident because the probability just indicates something about what state the single system is likely to be in. If the probabilities summed to 2, for example, a single system couldn't explain it; perhaps there would have to be 2 copies instead of one. Yet the existence of multiple copies is precisely what the MWI of QM describes, and in this case, there is no a priori reason to believe that the total measure can not change over time.

Everett's attempted derivation of the Born Rule is not considered satisfactory even by other supporters of the MWI, because he did not justify his assumptions. Soon, other attempts to explain the probabilities emerged.

2) Gleason's Theorem

Also discovered in 1957, Gleason's theorem shows that if probabilities are non-contextual, meaning that the probability of a term in the superposition does not depend on what other terms are in the superposition, then the only formula which could give the probabilities is based on squared expansion coefficients. It is straightforward to argue that the correct expansion to use is that for the current wavefunction; thus, these coefficients are the amplitudes, which gives Born's Rule.

Unfortunately, there is no known justification for assuming non-contextuality of the probabilities. If measure is not conserved, the probabilities can not generally be noncontextual. Gleason's theorem is sometimes cited in attempts to show that the MWI yields the Born Rule, but it is not a popular approach since usually those attempts make (unjustified) assumptions which are strong enough to select the Born Rule without having to rely on the more complicated math required to prove Gleason's theorem.

3) The infinite-measurements limit and its frequency operator

The frequency operator is the operator associated with the observable that is the number of cases in a series of experiments that a particular result occurs, divided by the total number of experiments. If it is assumed that just the frequency itself is measured, and if the limit of the number of experiments is taken to infinity, the eigenvalue of this frequency operator is unique and equal to the Born Rule probability. The quantum system is then left in the eigenstate with that frequency; all other terms have zero amplitude, as shown by Finkelstein (1963) and Hartle (1968).

This scheme is irrelevant for two reasons. First, an infinite number of experiments can never be performed. As a result, terms of all possible frequencies remain in the superposition. Unless the Born Rule is assumed, there is no reason to discard branches of small amplitude. Assuming that they just disappear is equivalent to assuming collapse of the wavefunction.

Second, in real experiments, individual outcomes are recorded as well as the overall frequency. As a result, there are many branches with the same frequency and the amplitude of any one branch tends towards zero as the number of experiments is increased. If one discards branches that approach zero amplitude in the limit of infinite experiments, then all branches should be discarded. Furthermore, prior to taking the infinite limit, the very largest individual branch is the one where the highest amplitude outcome of each individual experiment occurred, if there is one.
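
Both points are easy to see numerically (a toy check of my own for a two-outcome measurement, with Born probability 0.7 assumed): the total squared amplitude of branches whose recorded frequency is near the Born value approaches 1 only in the infinite limit, while the squared amplitude of even the single largest branch goes to zero.

from math import comb

# Repeat a two-outcome measurement N times; p is the Born probability of "up".
# near_born: total squared amplitude of branches whose frequency k/N is within
# delta of p.  largest_single: squared amplitude of the single largest branch
# (the one where the likelier outcome occurred every time).
p, delta = 0.7, 0.05
for N in (10, 100, 1000):
    near_born = sum(comb(N, k) * p**k * (1 - p)**(N - k)
                    for k in range(N + 1) if abs(k / N - p) <= delta)
    largest_single = max(p, 1 - p) ** N
    print(N, round(near_born, 4), largest_single)

At any finite N, discarding the 'wrong-frequency' branches is precisely the step that needs justifying, and discarding every branch whose amplitude tends to zero would discard them all.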

A more detailed critique of the frequency operator approach is given here. The same basic approach of using infinite ensembles of measurements has been taken recently by certain Japanese physicists, Tanaka (who seems unaware of Hartle's work) and (separately) Wada. Their work contains no significant improvements on the old, failed approach.

4) Graham's branch counting

Neil Graham came out with a paper in 1973 that appears in the book "The Many Worlds Interpretation of Quantum Mechanics" along with Everett's papers and others.

Graham claimed that the actual number of fine-grained branches is proportional to the total squared amplitude of a coarse-grained macroscopically defined branch. Such sub-branches would be produced by splits due to microscopic scattering events and so on which act as natural analogues of measurements.

If it were true, it could also begin to give some insight into why the Born Rule would be true, beyond just a mathematical proof; that is, each fine-grained branch would presumably support the same number of copies of the observer. (That assumption would still need to be explained, of course.)

Unfortunately, and even aside from the lack of precise definition for fine-grained branches, he failed to justify his statistical claims, which stand in contradiction to straightforward counting of outcomes. He simply assumed that fine-grained branches would on average have equal amplitudes regardless of the amplitude of the macroscopic branch that they split from.
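
The contradiction is easy to see in a toy two-outcome example (my own illustration). If the fine-grained branches are taken to be the outcome sequences of N repeated measurements, their count depends only on combinatorics, not on the amplitudes, so straightforward counting always favors frequency 1/2:

from math import comb

# Count outcome sequences of N two-outcome measurements by their frequency of
# "up" results.  The count comb(N, k) is independent of the amplitudes, so the
# most common frequency is always 1/2, whatever |a|**2 is.
N = 100
most_common_k = max(range(N + 1), key=lambda k: comb(N, k))
print("sequence counting peaks at frequency", most_common_k / N)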

In the next post, the more recent attempts (other than my own) to derive the Born Rule within the MWI will be described.

Monday, September 21, 2009

Why 'Quantum Immortality' is false

In the previous posts, I explained that effective 'probabilities' in an MWI are proportional to the amount (measure) of consciousness that sees the various outcomes. Because this measure need not be a conserved quantity, this can lead to nonclassical selection effects, with 'probabilities' for a given outcome still changing as a function of time even after the outcomes have been observed and recorded. That can lead to an illusion of nonlocality, which can only be properly understood by thinking in terms of the measures directly, as opposed to thinking only in terms of 'probabilities'.

The most extreme example in which it is crucial to think in terms of the measures, rather than 'probabilities' only, is the so-called 'Quantum Suicide' (QS) experiment. Failure to realize this leads to a literally dangerous misunderstanding. The issue is explained at length in my eprint "Many-Worlds Interpretations Can Not Imply 'Quantum Immortality'".

The idea of QS is as follows: Suppose Bob plays Russian Roulette, but instead of using a classical revolver chamber to determine if he lives or dies, he uses a quantum process. In the MWI, there will be branches in which he lives, and branches in which he dies. The QS fallacy is that, as far as he is concerned, he will simply find himself to survive with no ill effects, and that the experiment is therefore harmless to him.

A common variation is for him to arrange a bet, such that he gets rich in the surviving branches only, which would thus seem to benefit him. Of course in the branches where he does not survive, his friends will be upset, and this is often cited as the main reason for not doing the experiment.

That it is a fallacy can be seen in several ways. Most basically, the removal of copies of Bob in some branches does nothing to benefit the copies in the surviving branches; they would have existed anyway. Their measure is no larger than it would have been without the QS - no extra consciousness magically flows into the surviving branches, while the measure in the dead branches is removed. If our utility function states that more human life is a good thing, then clearly the overall measure reduction is bad, just as killing your twin would be bad in a classical case.

It is true that the effective probability (conditional on Bob making an observation after the QS event) of the surviving branches becomes 1. That is what creates the QS confusion; in fact, it leads to the fallacy of "Quantum Immortality" - the belief that since there are some branches in which you will always survive, then for practical purposes you are immortal.

But such a conditional effective probability being 1 is not at all the same as saying that the probability that Bob will survive is 1. Effective probability is simply a ratio of measures, and while it often plays the role we would expect a probability to play, this is not a case in which such an assumption is justified.

We can get at what does correspond for practical purposes to the concept of 'the probability that Bob will survive' in a few equivalent ways. In a case of causal differentiation, it is simple: the fraction of copies that survive is the probability we want, since the initial copy of Bob is effectively a randomly chosen one.

A more general argument is as follows: Suppose Bob makes an observation at 12:00, undergoes a QS event at 12:30 which removes half of his measure (a 50% chance of death), and his surviving copies make an observation at 1:00. Given that Bob is observing at either 12:00 or 1:00, what is the effective probability that it is 12:00? (Perhaps he forgets the time, and wants to guess it in advance of looking at a clock, so that the Reflection Argument can be used here.) The answer is the measure ratio of observations at 12:00 to the total at both times, which is 1 / (1 + 1/2) = 2/3.

That is just what we would expect if Bob had a 50% chance to survive the QS: Since there are twice as many copies at 12:00 compared to 1:00, he is twice as likely to make the observation at 12:00.

Most of your observations will be made in the span of your normal lifetime. Thus QI is a fallacy; for practical purposes, people are just as mortal in the MWI as in classical models.

It's worth mentioning another argument against a person's measure being constant:

1) "MWI immortality" believers typically think that a person's total amount of consciousness does not change even if their quantum amplitude changes, while I argue that the contrary is true.

2) In the MWI, there are definitely some (very small but nonzero) amplitudes for branches that contain Boltzmann brains (brains formed by uncoordinated processes such as thermal fluctuations) very early on. The exact amplitudes are irrelevant to the point being made.

3) Once a Boltzmann brain that matches yours has some amplitude, you start to exist. It's true that evolution, much later, will also cause _much larger amplitude_ branches to also contain versions of you. But if the belief described in point #1 were true, that would _not_ mean that your amount of consciousness increased. Thus, you would still be on even footing with the other Boltzmann brains. That is not plausible, so the immortality belief is not plausible.

Next up: Early attempts to derive the Born Rule in the MWI

Wednesday, September 16, 2009

Measure of Consciousness versus Probability

In the last post, Meaning of Probability in an MWI, it was explained that in a deterministic Many-Worlds model, with known initial conditions, that which plays the role of a probability for practical purposes is the ratio

(the measure (amount) of consciousness which sees a given outcome)
/ (the total measure summed over outcomes)

I call that the effective probability of the outcome.

Although the effective probability is quite similar to what we normally think of as a probability in terms of its practical uses, there are also important differences, which will be explored here.

The most important differences stem from the fact that measure of consciousness need not be a conserved quantity. By definition, probabilities sum to 1, but that is not all there is to it. In a traditional, single-world model, a transfer of probability indicates causality, while the total measure remains constant over time. This is not necessarily so in a MW model.

For example, suppose there are two branches, A and B. A has 10 observers at all times. B starts off with 5 observers at T0, which increases to 10 observers at T1 and to 20 observers at T2. All observers have the same measure, and observe which branch they are in.

So the effective probability of A starts off at 2/3 at T0, while the effective probability of B is 1/3. At T1, A and B have effective probabilities of 1/2 each. At T2, the effective probability of A is 1/3 and that of B is 2/3.
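
The same numbers as a minimal calculation (just restating the example above):

# Observer counts per branch at each time, as in the example above.
observers = {"T0": {"A": 10, "B": 5},
             "T1": {"A": 10, "B": 10},
             "T2": {"A": 10, "B": 20}}

for t, counts in observers.items():
    total = sum(counts.values())
    effective = {branch: n / total for branch, n in counts.items()}
    print(t, "total measure:", total, "effective probabilities:", effective)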

There are two important effects here. First, the effective probability of B increased with time. In a single-world situation, that would mean that a system which was actually in A was more likely to change over to B as time passes. But in this MW model, there is no transfer of systems, just changes in B itself.

This means that probability changes that would require nonlocality in a single-world model don't necessarily mean nonlocality in a MW model. If A is localized at X1, and B is localized at X2 which is a light-year away, there need not be a year's delay before the effective probability of B suddenly increases.

In a single-world local hidden variable model, probability must be locally conserved, so that the change of probability in a region is equal to the transitions into and out of adjacent regions only. This need not be so in an MW model.

The second important effect of nonconservation of measure in a MW model is that total measure changes as a function of time. Observers can measure, not only what branch they are on, but also what time it is. They will be more likely to observe times with higher measure than with lower measure, just as with any other kind of observation.

A good example of this is a model proposed by Michael Weissman - a modification of physics designed to make world-counting yield the Born Rule. His scheme involved sudden splitting of existing worlds into proposed new degrees of freedom, with a higher rate of such splitting events for higher amplitude worlds. The problem with it is that if new worlds are constantly being produced, then the number of observers would be growing exponentially. The probability of future observations, as far into the future as possible, would be much greater than that of our current observations. Thus, the scheme must be false unless we are highly atypical observers, which is highly unlikely.

Edit (2/2/16): See however this post. If the SIA is correct, the above argument against Weissman's idea fails, since the SIA gives extra likelihood to theories with more observers, exactly cancelling out the effect of reducing the fraction of observers which have observations like ours. However, as discussed in that post, I don't think the SIA is the right thing to use for comparing MWIs.

It is important to realize that since changes in measure mean changes in the number of observers, decreases in measure are undesirable. This will be discussed further in the next post.

Friday, September 11, 2009

Meaning of Probability in an MWI

The quantitative problem of whether the Born Rule for quantum probabilities is consistent with the many-worlds interpretation is the key issue for interpretation of QM. Before addressing that, it is important to understand in general what probabilities mean in a many-worlds situation, because ideas from single-world thinking can lead to unjustified assumptions regarding how the probabilities must behave. Many failed attempts to derive the Born Rule make that mistake.

The issue of what probabilities mean in a Many-Worlds model is covered in greatest detail in my eprint "Many-Worlds Interpretations Can Not Imply 'Quantum Immortality'". Certain work by Hilary Greaves is directly relevant.

First, note that for a single-world, deterministic model, such as classical mechanics provides, probabilities are subjective. The classic example is tossing a coin: the outcome will depend deterministically on initial conditions, but since we don't know the details, we have to assign a subjective probability to each outcome. This may be 50%, or it may be different, depending on other information we may have such as the coin's weight distribution or a historical record of outcomes. Bayes' rule is used to update prior probabilities to reflect new information that we have.

In such a model, consciousness comes into play in a fairly trivial way: As long as we register the outcome correctly, our experienced outcome will be whatever the actual outcome was. Thus, if we are crazy and always see a coin as being heads up, then the probability that we see "up" is 100%. Physics must explain this, but the explanation will be grounded in details of our brain defects, not in the physics of coin trajectories.

By contrast, in any normal situation, the probability that we see "up" is simply equal to the probability that the coin lands face up. [Even this is really nontrivial: it means that randomly occurring "Boltzmann brains" are not as common as "normal people". As we will see, if we believe in computationalism, it also means that rocks don't compute everything that brains do, which is nontrivial to prove.]

In a many-worlds situation, it may still be the case that we don't know the initial conditions. However, even if we do know the initial conditions, as we do for many simple quantum systems, there would still be more than one outcome and there is some distribution of observers that see those outcomes.

Assume that we do know the initial conditions. The question of interest becomes (roughly speaking): 'What is the probability of being among the observers that see a particular given outcome?'

It is important to note that in a many-worlds situation, the total number of observers might vary with time, which can lead to observer selection effects not seen in single-world situations. Because of this, the fundamental quantity of interest is not probability as such, but rather the number, or quantity, of observers that see each outcome. The amount of consciousness that sees a given outcome will be called the measure (of consciousness) for that outcome.

In a deterministic MWI with known initial conditions, it will be seen that what plays the role of the “probability” of a given observation in various situations relates to the commonness of that observation among observers.

Define the 'effective probability' for a given outcome as (the measure of observers that see a given outcome) divided by (the total measure summed over observed outcomes).

1) The Reflection Argument

When a measurement has already been performed, but the result has not yet been revealed to the experimenter, he has subjective uncertainty as to which outcome occurred in the branch of the wavefunction that he is in.

He must assign some subjective probabilities to his expectations of seeing each outcome when the result is revealed. He should set these equal to the effective probabilities. For example, if 2/3 of his copies (or measure) will see outcome A while the other 1/3 see B, he should assign a subjective probability to A of 2/3.

Why? Because that way, the amount of consciousness seeing each outcome will be proportional to its subjective probability, just as one would expect on average for many trials with a regular probability.

See Why do Anthropic Arguments work? for more details.

2) Theory Confirmation

It may be that an experimental outcome is already known, but the person does not know what situation produced it. For example, suppose a spin is measured and the result is either “up” or “down”. The probability of each outcome depends on the angle that the preparation apparatus is set to. There are two possible preparation angles; angle A gives a 90% effective probability for spin up, while angle B gives 10%. Bob knows that the result is “up”, but he does not know the preparation angle.

In this case, he will probably guess that the preparation angle was A. In general, Bayesian updating should be used to relate his prior subjective probabilities for the preparation angle to take the measured outcome into account. For the conditional probability that he should use for outcome “up” given angle A, he should use the effective probability of seeing “up” given angle A, and so on.

This procedure is justified on the basis that most observers (the greatest amount of conscious measure) who use it will get the right answer. Thus, if the preparation angle really was B, then only 10% of Bob’s measure would experience the guess that A is more likely, and the other 90% will see a “down” result and correctly guess B is more likely.
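
As a minimal sketch of the updating (assuming, purely for illustration, a 50/50 prior over the two angles; the other numbers are from the example above):

# Bayesian update for the preparation angle, given an observed "up" result.
prior = {"A": 0.5, "B": 0.5}            # assumed prior over angles
eff_prob_up = {"A": 0.9, "B": 0.1}      # effective probability of "up" for each angle

unnormalized = {angle: prior[angle] * eff_prob_up[angle] for angle in prior}
total = sum(unnormalized.values())
posterior = {angle: u / total for angle, u in unnormalized.items()}
print(posterior)   # {'A': 0.9, 'B': 0.1}: Bob should guess angle A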

3) Causal Differentiation

It may be the case that some copies of a person have the ability to affect particular future events such as the fate of particular copies of the future person. The observer does not know which copy he is. Pure Causal Differentiation situations are the most similar to classical single-world situations, since there is genuine ignorance about the future, and normal decision theory applies. Effective probabilities here are equal to subjective probabilities just like in the Reflection Argument.

4) Caring Coefficients

As opposed to Causal Differentiation, which may not apply to the standard MWI, the most standard way to think of what happens to a person when a “split” occurs is that of personal fission. Perhaps this is the most interesting case when an experiment has not yet been performed. Decision theory comes into play here: In a single-world case, one would make a decision so as to maximize the average utility, where the probabilities are used to find the average. What is the Many-Worlds analogue?

If it is a deterministic situation and the decider knows the initial conditions, including his own place in the situation, it is important to note that he should not use some bastardized form of ‘decision theory in the presence of subjective uncertainty’ for this case. It is a case in which the decider would know all of the facts, and only his decision selects what the future will be among the options he has. He must maximize, not a probability-weighted average utility, but simply the actual utility for the decision that is chosen.

Rationality does not constrain utility functions, so at first glance it might seem that the decider’s utility function might have little to do with the effective probabilities. However, as products of Darwinian evolution and members of the human species, many people have common features among their utility functions. The feature that is important here is that of “the most good for the most people”. Typically, the decider will want his future ‘copies’ to be happy, and the more of them are happy the better.

In principle he may care about whether the copies all see the same thing or if they see different things, but in practice, most believers in the MWI would tend to adopt a utility function that is linear in the measures of each branch outcome:

U_total = Σ_i Σ_p m_ip[Choice] q_ip

where i labels the branch, p denotes the different people and other things in each branch, m_ip is the measure of consciousness of person (or animal) p which sees outcome i, and is a function of the Choice that the decider will make, and q_ip is the decider’s utility per unit measure (quality-of-life factor) for that outcome for that person.

The measures here can be called “caring measures” since the decider cares about the quality of life in each branch in proportion to them.

Utility here is linear in the measures. For cases in which measure is conserved over time, this is equivalent to adopting a utility function which is linear in the effective probabilities, which would then differ from the measures by only a constant factor. In such a case, effective probabilities are used to find the average utility in the same way that actual probabilities would have been used in a single-world model in which one outcome occurs randomly.
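
A minimal sketch of a decision made this way (the branches, measures and quality-of-life numbers are invented for illustration):

# Choose the option with the largest U_total = sum over branches and people
# of m_ip * q_ip.  Each option maps to a list of (measure, quality) pairs.
def total_utility(branches):
    return sum(m * q for m, q in branches)

options = {
    "take the bet": [(0.5, 2.0), (0.5, -3.0)],   # two equal-measure branches
    "skip the bet": [(1.0, 0.0)],
}
best = max(options, key=lambda name: total_utility(options[name]))
print(best)   # "skip the bet", given these made-up numbers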

Next: Measure of Consciousness versus Probability

Monday, September 7, 2009

Interlude: The 2007 Perimeter Institute conference Many Worlds @ 50

As explained in the previous post, I had long been anticipating a conference on the MWI in 2007, and attended the Perimeter Institute conference Many Worlds at 50, armed with a copy of my then-new eprint on the Many Computations Interpretation.

When I arrived at my hotel the night before the conference, an older couple was checking in at the same time as I was. Someone asked the clerk for directions to the Perimeter Institute. It turned out that this couple was also attending the conference, and they were a couple of the friendliest and most interesting people I met there.

George Pugh had worked with Hugh Everett (founder of the MWI) at a defense contractor, Lambda Corp. (The work Everett did there is not so famous as his MWI but was actually important during the Cold War.) George and his impressive wife Mary had talked about the MWI with Everett himself, and they support it. They asked me which side I was on, as both pro- and con- people were attending the conference. I told them I was in favor of the MWI. They liked to hear that. We ended up having meals together on several occasions over the course of the conference.

The conference itself consisted mostly of lectures in a classroom-like atmosphere, followed by questions from the audience. Appropriately, most of the talks focused on the question of probability in the MWI.

However, and unfortunately, they mainly focused on the attempt to derive the Born Rule from decision-theoretic considerations. That approach was proposed by David Deutsch in 1999, and further developed by Simon Saunders and especially by David Wallace. Saunders and Wallace gave talks that mainly reiterated what is in their papers. There were also talks that (correctly, though of course this was not accepted by Wallace's supporters) pointed out the failures of that approach, such as those by Adrian Kent and David Albert.

The only other approach to the Born Rule that was presented at a talk was that of W. Zurek, who talked about his (equally fallacious) 'envariance' approach. Most people seemed to agree that Zurek's approach was similar to Wallace's. There was little discussion of it beyond that. When Zurek was asked about Wallace's approach during an informal discussion, he basically said that he didn't know if Wallace's approach was correct also, but he didn't seem to think it mattered much, because his own approach showed that the Born Rule followed from the MWI. When I tried to point out to him why his approach fails - a task made all the more difficult by his somewhat intimidating large physical presence and lion-like bearded appearance - he didn't understand my point and soon ended the conversation.

Max Tegmark was a speaker, and he briefly discussed his hierarchy of many-worlds types, up to the Everything Hypothesis for which he is known.

Besides that, the only other controversy addressed in the talks was that of the legitimacy and meaning of talking about probability in the deterministic MWI, which is a separate question from the quantitative problem of deriving the Born Rule. This focused on Hilary Greaves' 'caring measure' approach. She is sometimes lumped in with the decision theoretic approach to the Born Rule, because she uses decision theory in another way, but in fact her ideas are independent of that and are basically correct though not the full story.

The official speakers were basically divided into two camps: Those MWI-supporters who supported Wallace's attempted derivation of the Born Rule or who were considered allies of it (like Zurek and Greaves), versus those who not only rejected it but also were against the MWI in general (like Kent and Albert). Tegmark was neither but his one talk was largely ignored, and he did not address the Born Rule controversy.

Among the attendees, however, the situation was more complicated. I was not the only one who supported some kind of MWI, and considered understanding the Born Rule to be the key issue of interest, but utterly rejected the approaches to the Born Rule that had been presented. The alternatives that we wanted to discuss involved some form of observer-counting as the basis for probabilities in an MWI, even if it required some new physics. This led to a minor rebellion, in which a few of us tried to talk about our ideas during a lunch period in the room set aside for the conference lunch. The only official speaker that we got any help from was Hilary Greaves. We were able to speak in the lunchroom for a little while, but it didn't get much attention.

There was another young woman by the name of Hillary, I think a physicist studying at the Institute, who also helped us set up the lunchtime discussion.

The 'counter' camp included Michael Weissman, who proposed a modification of physics in order for world-counting to yield the Born Rule. His scheme involved sudden splitting of existing worlds into proposed new degrees of freedom, with a higher rate of such splitting events for higher amplitude worlds. This was interesting, but I was skeptical, and after thinking about it for a while I found the fatal flaw in it. If new worlds were constantly being produced, then the number of observers would be growing exponentially. The probability of future observations, as far into the future as possible, would be much greater than that of our current observations. Thus, the scheme must be false unless we are highly atypical observers, which is highly unlikely. While false, Mike's model serves as a good way to discuss the need for approximate conservation of measure for a successful model. In any case, Mike proved to be a good guy to talk to.

Also among the 'counters' was David Strayhorn, who proposed that an indeterminacy in General Relativity could lead to a Many Worlds model in which spacetime topologies were distributed according to, and formed the basis for, the Born Rule. His ideas did not seem fully developed, and I was skeptical of them as well, but we had interesting discussions.

Another guy with us was Allan Randall. He supports Tegmark's Everything Hypothesis, and is also interested in transhumanism and immortality. As I explained to Allan and to Max Tegmark, I wasn't sure about the Everything hypothesis, because of the problem of what would determine a unique measure distribution, but I used to support it and still like it. I think it's important and maybe useful. After all, and like many supporters of the hypothesis, I discovered a version of it on my own long before I ever heard of Tegmark.

Which brings me to a subject that received little official mention at the conference, the 'Quantum Immortality / Quantum Suicide' fallacy which Tegmark had publicized. This is the belief, which many MWI supporters have come to endorse, that the MWI implies that people always survive because some copies of them survive in branches of the wavefunction. I had always regarded this as the worst form of crackpot thinking, and had hoped to discuss it at the conference as something that MWI supporters must crush before it gets out of hand. My brief discussions about it at the conference convinced me that it was not getting the condemnation that it deserves. This ultimately led me to write my own eprint against it, Many-Worlds Interpretations Can Not Imply 'Quantum Immortality', despite my misgivings that even discussing the subject could give the dangerous idea extra publicity.

I also had interesting discussions with Mark Rubin, who had shown an explicit local formulation of the MWI using the Heisenberg picture, which is something I still need to study more. Mark and I had dinner with the Pughs. I liked the Swiss Chalet restaurant and Canadian beer.

I also happened to run into a friend of mine from NYU, where I got my Ph.D. in physics. Andre is a Russian who came to the US to study, and he had a postdoc at the Perimeter Institute. He's not an MWI supporter or really into interpretation of QM, but he knew that I am, so he was not too surprised that I showed up at the conference. I was lucky to run into him, because the next day he was heading to England for a postdoc there, studying quark-gluon plasmas using the methods he learned from models of string theory. He said he might never return to the US.

All in all, it was certainly an interesting experience. Ultimately, though, it was disappointing because I didn't get to discuss my paper much, and I never was able to have a substantive discussion with the well-known figures in the field who were there to present their own work. It was largely a lecture series rather than an egalitarian discussion group. Some discussion took place on the sidelines, such as at meals, but that was limited in who you happened to be next to. Well-known people mainly talked to each other.

One thing that grew out of the discussions on observer-counting was that a group of us decided to continue the discussion on-line. This led to the creation of the OCQM yahoo group, which included David Strayhorn, Michael Weissman, Allan Randall, Robin Hanson, and myself. Robin had not been at the conference, but he was the originator of the Mangled Worlds approach to the Born Rule, and accepted our invitation to join the group. In practice, however, posts to the group largely came from just David and myself. We all supported some form of observer-counting, but our approaches were quite different. We had some very interesting discussions, and it was a good place to 'think out loud', but ultimately even David's posting to the group petered out and it seems dead at this point.

I gave the Pughs my printed copy of the MCI paper. They were compiling a book in which they would quote various people about why Everett's interpretation of QM was important, so I wrote a few lines for them. Ultimately they decided not to use it though. I think they didn't like my criticism of the current status of the Born Rule in the MWI.

Thursday, August 27, 2009

Interlude: Anticipating the 2007 Many Worlds conference

For many years, I knew it was coming. You just had to do the math: Hugh Everett III had published his thesis, which introduced the Many Worlds Interpretation (MWI) of quantum mechanics, in 1957. So, somewhere, there would be a 'Many Worlds at 50' conference in 2007. And I would be there.

-------------------------------------------------------------------------------------

Back in 2000, I attended the conference ‘One Hundred Years of the Quantum: From Max Planck to Entanglement’ at the University of Puget Sound, which commemorated Planck's paper that first introduced the concept of energy quantization, used to explain why the equilibrium density of thermal radiation is not infinite.

I had already started exploring the concepts behind the Many Computations Interpretation (MCI). [I called it the 'Computationalist Wavefunction Interpretation' (CWI) but that just didn't have the same ring to it.] It grew out of David Chalmers' suggestion, in the last chapter of his book The Conscious Mind, that applying computationalism to quantum mechanics was the right way to make sense of the MWI. But I knew that computationalism had to be made more precise before that could be done, and I knew that the Born Rule would be the key issue.

I submitted a short paper about it for the conference book. The paper is still available online at
http://www.finney.org/~hal/mallah1.html

At the conference I met a few well known physicists, the most famous of whom was James Hartle. At the time, the 'Consistent Histories' approach to interpretation of QM was getting a lot of attention, and Hartle and Murray Gell-Mann had written a book about it. As far as I was concerned, that approach was not of much interest, because it pretended that single-world-style probabilities could be assigned to terms in the wavefunction 'once decoherence occurred' despite the fact that decoherence is never truly complete. (Probabilities cannot generally be assigned because, prior to decoherence, interference effects can occur which can only be understood as showing the simultaneous existence of multiple terms in the wavefunction.)

It was also maddeningly vague about what exactly was supposed to really exist, and declared that some questions must not be asked. It was not clear whether it was really just the MWI in drag, deliberately using vague language so as not to scare away those who thought the MWI is too weird, or if it was some new variant of the single-world Copenhagen Interpretation. Its advocates publicly claimed inspiration from both sources!

I got the chance to ask Hartle a question. I asked him two things:

1) Is Consistent Histories the same as the MWI?

He said it is. That provoked a gasp from the audience! You see, Consistent Histories was looked on quite favorably by many physicists at the time, while the MWI was still largely dismissed as material for science fiction.

2) Is it the same as the Pilot Wave Interpretation?

He said it's not. The second question was necessary because some people, especially those who like the Copenhagen Interpretation, consider experimental predictions to be the only thing that matters - so that they would consider all interpretations which give the same predictions to be the same thing. Now I knew that was not the case with him, so the first answer really did mean something.

Anyway, after that conference I resolved to try to make my interpretation of QM precise in time to discuss it at the inevitable 2007 conference. Seven years should be enough time, right? Of course, it was never my day job, just a hobby of sorts.

--------------------------------------------------------------------------------------

In 2002 I attended ‘Towards a Science of Consciousness’ (TSC), a yearly philosophy conference which was held at the University of Arizona that year and every even year. That was interesting in its own right, as I met interesting people and learned about issues and thought experiments in philosophy of mind which I had not previously been exposed to. (I don't think it would be as interesting to attend another TSC, because many of the issues are the same every year, unless I have published something of my own that will be talked about. But it's not bad so perhaps I will.)

At that 2002 TSC, I participated in the poster session, with a poster called “What Does a Physical System Compute?” which laid out my ideas about an implementation criterion for computations. It got little attention, except that David Chalmers himself was kind enough to stop by and consider it. He made some comments and criticisms. I'd had many false starts at formulating a criterion, and had discussed it by email with him, so he knew what it was about. The criteria I listed weren't good enough, and we both knew it, but I believed it was a step in the right direction.

[Some of the other posters there were interesting, but I remember only one, because it stood out as being the most crackpot idea I'd yet encountered - and I'd encountered many on the usenet newsgroups. This guy was combining the kooky notion that humans only became conscious when language was invented, with the crazy idea that only consciousness causes wavefunction collapse, to argue that _the biblical age of the Earth is correct_ (a few thousand years) because that's when the first wavefunction collapse brought the universe into real existence! Quite a combination!]

--------------------------------------------------------------------------------------

So, years passed by and before I knew it the 2007 Perimeter Institute conference Many Worlds @ 50 was approaching. This was it; the conference I'd been looking forward to for so long, in which I hoped to discuss my ideas about the MWI with other supporters of the interpretation. Would I be ready? I'd had some success in refining my implementation ideas, and scrambled to write up what I had.

The Born Rule still eluded me, though. I had hoped that once I found the precise criteria for existence of an implementation, I could apply it to quantum mechanics and the Born Rule might pop out. After all, it's actually fairly easy to get the Born Rule to pop out if you impose certain simple requirements such as conservation of measure. People have been doing it for years without even realizing they'd made unjustified assumptions. All I had to do was find a reason to justify an assumption like that for the counting of implementations.

I didn't find that justification, and time was getting short. I turned to an unusual approach for inspiration - Robin Hanson's 'Mangled Worlds' papers. He had a rather innovative approach to the MWI, in which large terms in the wavefunction 'mangle' small ones, leading to an effective minimum amplitude, and he argued that the Born Rule followed from counting worlds (lumps of wavefunction) in the distribution of survivors. The world-counting appealed to me, as it could easily be translated into implementation-counting, but I did not believe his scheme could work: large worlds would not 'mangle' worlds they had decohered from nearly as much as Hanson had assumed.

To get that kind of thing to work, I had to assume new physics, contrary to Everett. But the new physics was fairly simple: random background noise in the wavefunction (which could be part of the initial conditions rather than new dynamics) could 'mangle small worlds', and if it does, the Born Rule pops out (in an interesting new way). There were still some real questions about whether this could work out right, so I explored a more direct approach as well in which I tried to rig the way implementations are to be counted in order for it to come out right. That turned out to be easier said than done, and it remains an open question whether it can or should be done, though I regard it more favorably now. All of this will be discussed in later posts.

I also discussed other alternatives, such as an MWI with hidden variables, and other ways that a minimum amplitude could be introduced. The basic conclusion was that computationalism strongly favors some kind of MWI over single-world interpretations, even if both have hidden variables, but the details are unknown (and might always remain so).

I wrote all this up and added criticisms of the incorrect attempts to derive the Born Rule in the MWI, including the one based on decision theory, which was widely considered the strongest of the attempted derivations although it had its critics. This became my MCI paper, which I placed on the preprint arxiv: http://arxiv.org/abs/0709.0544

I knew that I was cutting it close, so I emailed some of the people who had written about the MWI and who would attend the conference to tell them about my paper on the arxiv.

It was time to go to Canada and see if the 2007 MWI Perimeter Institute conference would live up to the anticipation.

Further Study

I'd like to wait for some comments for this one. What do you want to learn?

I assume you know how to search the web. The Stanford Encyclopedia is good for many topics, as is Wikipedia. Though as always, don't assume that something is true just because you read it there. You must develop an eye for controversial issues.

What I have attempted to do so far here is twofold: first, to provide an easy-to-understand overview of many issues surrounding interpretation of quantum mechanics, which should be useful to students who intend to pursue a serious interest in philosophy of physics; and second, to convey my own ideas about philosophy of physics, some of which require a lot of background in very specific issues to properly understand.

I will add references here on an irregular basis. Traffic on this 'blog' is not high as of yet so there is no typical reader. If that changes, I expect some requests. Unlike a typical blog, I edit these posts as needed to cover a topic, rather than just making new posts all the time.

You can email jackmallah@yahoo.com if you don't want to post a comment.

You can also add your own links in your comments.

From here on out, the focus of the 'blog' will change from review of QM to discussion of contemporary research topics related to the MWI, but still will hopefully be understandable.


Primers on Basic QM:

https://arxiv.org/abs/1803.07098

http://theoreticalminimum.com/courses/quantum-mechanics/2012/winter

Thursday, August 20, 2009

Studying Quantum Mechanics: Measurement and Conservation Laws

When you learned that the results of measurements in quantum mechanics are random, it may have raised a question in your mind: What about conservation laws? Do they only hold on average? For example, if you measure the energy of an atom, you might end up with a different amount of energy than the average, right? If there are random fluctuations in 'conserved' quantities, could the effect be used to violate conservation laws in a systematic way?

For example, consider a spin measurement for spin-1/2 particles. Each particle's spin carries an amount of angular momentum equal to hbar/2 in the direction it points. The particles are prepared so that their spins point in the +Z direction, and then sent into a Stern-Gerlach (SG) device, which we can rotate to measure spin along any direction. If we measure a spin in the X direction, the result is that the spin ends up in either the +X or -X direction. So it looks like we are violating conservation of angular momentum in a systematic way, destroying the +Z direction angular momentum we prepared the particles with. If that were true and the experiment is done in an isolated satellite, we could use it to build up a net angular momentum in the -Z direction.
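Here is a minimal numerical sketch of the apparent problem (just an illustration, assuming Python with numpy, and working in units where hbar = 1): prepare a +Z spin, then apply the naive collapse picture to an X measurement and average over the Born Rule outcomes. The +Z angular momentum seems to have vanished on average; the resolution, involving the quantum state of the SG device itself, is discussed below and is not modeled here.

import numpy as np

hbar = 1.0  # work in units where hbar = 1
Sz = 0.5 * hbar * np.array([[1, 0], [0, -1]], dtype=complex)   # spin-Z operator
Sx = 0.5 * hbar * np.array([[0, 1], [1, 0]], dtype=complex)    # spin-X operator

up_z = np.array([1, 0], dtype=complex)          # prepared +Z spin state
print(np.vdot(up_z, Sz @ up_z).real)            # <Sz> = +0.5 before measurement

# Naive collapse picture: an X measurement leaves the spin in +X or -X,
# each with Born Rule probability |<eigenstate|psi>|^2.
vals, vecs = np.linalg.eigh(Sx)
probs = np.abs(vecs.conj().T @ up_z) ** 2
avg_Sz_after = sum(p * np.vdot(v, Sz @ v).real
                   for p, v in zip(probs, vecs.T))
print(avg_Sz_after)   # ~0: the +Z angular momentum seems to have vanished on average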

If conservation laws mean anything, there must be something wrong with the above picture. Perhaps, one might think, there must be some back-action of the particles on the Stern-Gerlach device. That is, the missing angular momentum is being transferred into the SG device, as the particles exert torques on it with their magnetic moments as they come through.

The problem we run into next is that this seems to violate linearity: A +Z spin can be written as a superposition of a +X term and a -X term. After going through the SG device, there is decoherence (or as some people wrongly assume, wavefunction collapse), and what is observed is just a +X result or a -X result. Since QM is linear, the final wavefunction is a linear superposition of the terms that would have resulted if the original spins had been +X or -X. Such terms do not take the original +Z spin into account. So at least as far as an observer within such a term is concerned, there is no residual effect of the original spin direction, such as we would need if the SG device had received angular momentum that depended on that direction.

The solution to this puzzle, naturally, is to treat the measuring device as a fully quantum-mechanical system. That means that its angular orientation can not be precisely known, due to its finite uncertainty in angular momentum. (The uncertainty principle applies, limiting how small the product of the uncertainties of angle and angular momentum can get.) As a result, there will be very small 'error' terms in which the wrong spin outcome is measured, i.e. -X instead of +X, or an incoming spin is flipped.

This effect may seem negligible, but it is enough to allow the information about the original direction of the particle spin to be encoded in the final state of the SG device. It works out to be exactly enough of an effect to enforce the conservation law. The uncertainty in the SG device's angular momentum allows a sort of selection effect; in effect, the 'lost' angular momentum does end up in the SG device. The same kind of effect holds for all conservation laws. This is explained in detail in my eprint "There is No Violation of Conservation Laws in Quantum Measurement". It was first studied by Wigner in 1952, and is related to the Wigner-Araki-Yanase theorem (1960).

See also
"WAY beyond conservation laws"

Wednesday, August 12, 2009

Key definitions for QM: Part 3

Previous: Key definitions for QM: Part 2

In this post some additional QM terms will be defined. These often come up in applications of QM and might come into play for interpretation issues.

Hamiltonian: In classical mechanics, the Hamiltonian is a function of the coordinates and momenta that equals the energy of the system, giving the total energy as a sum of the various types of energy in the system. In QM, it is a corresponding linear operator on the wavefunction. The Hamiltonian appears in the Schrödinger equation. Its eigenstates have definite values for energy. A wavefunction that is an energy eigenstate will not undergo change as time passes except as a standing wave, undergoing phase rotations.
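A tiny illustration of that last point (a sketch, assuming numpy and hbar = 1, with a made-up 2x2 Hamiltonian): an energy eigenstate only picks up an overall phase as time passes, so its probability density never changes.

import numpy as np

hbar = 1.0
H = np.array([[1.0, 0.5], [0.5, 2.0]], dtype=complex)   # made-up 2x2 Hamiltonian
E, states = np.linalg.eigh(H)
psi0 = states[:, 0]                                      # an energy eigenstate, energy E[0]

t = 3.7
U = states @ np.diag(np.exp(-1j * E * t / hbar)) @ states.conj().T   # time-evolution operator
psi_t = U @ psi0

print(np.allclose(psi_t, np.exp(-1j * E[0] * t / hbar) * psi0))  # True: only an overall phase
print(np.allclose(np.abs(psi_t)**2, np.abs(psi0)**2))            # True: a standing wave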

commute: Let A and B be operators. They commute if A B psi = B A psi for any function psi. This is written as AB = BA or [A,B]=0.

If two operators don't commute, then measurements associated with one of them will change the probabilities for values of measurable quantities associated with the other, and they can not be measured simultaneously.

Position does not commute with momentum (which is mass times velocity). Spin measurements in different directions also don't commute.
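For instance, here is a quick numerical check (a sketch, assuming numpy; spin operators written in units of hbar) that spin operators along different directions don't commute:

import numpy as np

# Spin-1/2 operators in units of hbar: S_i = (Pauli matrix)_i / 2
Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * np.array([[0, -1j], [1j, 0]])
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)

commutator = Sx @ Sy - Sy @ Sx
print(commutator)                          # not zero
print(np.allclose(commutator, 1j * Sz))    # True: [Sx, Sy] = i Sz in these units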

(Heisenberg's) uncertainty principle: There is a minimum value for the product of the uncertainties of measurable quantities whose operators don't commute. This follows from the math (and the Born Rule). Most famously, (spread in position) (spread in momentum) >= hbar/2.

This can be understood roughly as follows: Momentum is related to the wavelength of sinusoidal patterns in the wavefunction - those are its eigenstates (actually there is an imaginary component as well - the wavefunction is complex-number-valued). If the wavefunction is concentrated near a point (small uncertainty in position), then it must be built up out of a superposition of a wide range of sinusoidal functions. If on the other hand it is in a nearly sinusoidal pattern, then it must be spread out over a large range of positions.

(The Born Rule comes into play because we assume the usual relation between probability and the square of the wavefunction.)
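Here is a numerical illustration of the position-momentum case (a sketch, assuming numpy and hbar = 1, with a made-up Gaussian wavepacket): put the wavepacket on a grid, get its momentum-space form from a Fourier transform, and check that the product of the two spreads comes out to about hbar/2, the minimum allowed.

import numpy as np

hbar = 1.0
x = np.linspace(-50, 50, 4096)
dx = x[1] - x[0]
sigma = 2.0
psi = np.exp(-x**2 / (4 * sigma**2))           # a Gaussian wavepacket
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)    # normalize

prob_x = np.abs(psi)**2
mean_x = np.sum(x * prob_x) * dx
spread_x = np.sqrt(np.sum((x - mean_x)**2 * prob_x) * dx)

# Momentum-space wavefunction via the Fourier transform; p = hbar * k
k = 2 * np.pi * np.fft.fftfreq(len(x), d=dx)
dk = k[1] - k[0]
phi = np.fft.fft(psi) * dx / np.sqrt(2 * np.pi)
prob_p = np.abs(phi)**2 / (np.sum(np.abs(phi)**2) * dk)
p = hbar * k
mean_p = np.sum(p * prob_p) * dk
spread_p = np.sqrt(np.sum((p - mean_p)**2 * prob_p) * dk)

print(spread_x * spread_p, hbar / 2)   # the product is (about) hbar/2, the minimum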

There is a similar uncertainty principle that relates uncertainty in energy and time.

Wikipedia's article has a more in-depth explanation.

boson: Particles with "integer spin" have spin component eigenvalues that are integer multiples of hbar. Such particles are bosons, which means that they have a tendency to occupy the same states as identical particles of the same type; technically, their wavefunctions are symmetric with respect to exchanging the particles. Photons (particles of light) are bosons, which lets them reinforce each other and produce the classical-seeming behavior of electromagnetic fields.

fermion: Particles with half-integer spin are fermions, which means they cannot occupy the same state as identical particles of the same type; technically, their wavefunctions are anti-symmetric with respect to exchanging the particles.
The connection between spin values and boson/fermion behavior is a consequence of relativistic quantum field theory (QFT). Actually in QFT there are no particles, just quantized excitations of the fields. It is not surprising that treating excitations of the fields as though they were particles (as is done in the nonrelativistic approximation, used very often in QM) would require some special treatment of the so-called particles with respect to the symmetry of exchanging them.

Pauli exclusion principle: This is the principle that, as mentioned above, no two fermions of the same type can occupy the same state. Electrons are fermions, so this is very important in atomic physics and chemistry. Atoms have various shells of electrons which can be thought of as built up by adding one electron at a time. When an inner shell is fully occupied, another electron can't occupy one of those states, so it will end up in the next shell out. (If placed in an even higher shell, it will fall to the innermost shell it can, emitting a photon to carry away the extra energy.)
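A toy illustration of the exchange symmetry (a sketch, assuming numpy, with made-up single-particle functions on a grid): build the symmetric (boson-like) and antisymmetric (fermion-like) two-particle combinations, and note that the antisymmetric one vanishes identically when both particles are put in the same state - the Pauli exclusion principle.

import numpy as np

def two_particle(f, g, sign):
    # Combined wavefunction on a grid: f(x1)g(x2) + sign * g(x1)f(x2)
    return np.outer(f, g) + sign * np.outer(g, f)

x = np.linspace(-5, 5, 200)
f = np.exp(-x**2)            # one single-particle state
g = x * np.exp(-x**2)        # a different single-particle state

bosonic   = two_particle(f, g, +1)   # symmetric under exchange
fermionic = two_particle(f, g, -1)   # antisymmetric under exchange
print(np.allclose(fermionic, -fermionic.T))   # True: swapping the particles flips the sign

# Pauli exclusion: two fermions in the *same* state give zero everywhere
print(np.allclose(two_particle(f, f, -1), 0))  # True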

Schrödinger Picture: This is the usual formalism in which the wavefunction varies with time while measurable quantities are associated with fixed linear operators. There is a global time.

Heisenberg Picture: This is a formalism in which the wavefunction is static but linear operators vary with time, giving the same Born Rule probabilities for measured outcomes. In relativistic quantum field theory, this picture has fewer problems of being mathematically well defined (with infinite renormalization, or re-scaling of certain quantities) than the Schrödinger picture, and is also the only local formulation of QM. (Local meaning that things defined at points in ordinary space only interact with their neighbors.)
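A sketch of the equivalence of the two pictures (assuming numpy and hbar = 1, with a made-up Hamiltonian and observable): evolving the state while keeping the operator fixed, or keeping the state fixed while evolving the operator as A(t) = U^dagger A U, gives the same expectation values.

import numpy as np

hbar = 1.0
H = np.array([[0.0, 1.0], [1.0, 0.5]], dtype=complex)    # made-up Hamiltonian
A = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)   # some observable
psi0 = np.array([1.0, 0.0], dtype=complex)                # initial state

t = 2.3
E, V = np.linalg.eigh(H)
U = V @ np.diag(np.exp(-1j * E * t / hbar)) @ V.conj().T  # time-evolution operator

# Schrödinger picture: evolve the state, keep the operator fixed
psi_t = U @ psi0
expectation_s = np.vdot(psi_t, A @ psi_t).real

# Heisenberg picture: keep the state fixed, evolve the operator
A_t = U.conj().T @ A @ U
expectation_h = np.vdot(psi0, A_t @ psi0).real

print(np.isclose(expectation_s, expectation_h))   # True: same predictions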

However, this formalism is harder to work with. It is also believed that infinite renormalization will not be necessary for a fundamental model that includes quantum gravity.

The Heisenberg picture has the strange feature that interactions carry labels with them of what has been interacted with, and these proliferate as more and more systems interact. Basically, at each point, a field operator encodes information about the correlations of the field at that point with the set of field configurations over all space. At each point in space the field operator must be capable of carrying an unlimited amount of information about all of space and updating it as time passes. Although there is no shortage of infinities in most models of physics, this would seem surprisingly inefficient for the fundamental working of nature. Of course, nature has surprised us before.

For more detailed and technical information about locality and label proliferation in the Heisenberg picture see
Locality in the Everett Interpretation of Heisenberg-Picture Quantum Mechanics
Locality in the Everett Interpretation of Quantum Field Theory

Next: Further study

Tuesday, August 11, 2009

Key definitions for QM: Part 2

In the last post, definitions Part 1, I explained some of the terms that commonly come up in interpretation of QM and described their roles in that context. Here, I will define some other useful terms; these are more technical and less key to understanding most interpretation issues, but still handy in that context (and fun!). You can look up the equations that are involved; my concern here is with what is relevant to interpretations.

spin: This refers to a property of individual particles that behaves like an intrinsic angular momentum. When measured, it has a constant magnitude, and the component of it in the measured direction can only take on a few discrete values.

Spin-1/2 particles, such as an electron, have two possible eigenvalues of their measured spin component: + or - 1/2 hbar. When not measured, they are in a superposition of the eigenstates with the allowed values (or in an entangled state). Such a superposition is always an eigenstate of the spin measurement operator in some other direction, though an entangled state is not.
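For example (a sketch, assuming numpy; spin operators in units of hbar): the equal superposition of the +Z and -Z eigenstates is itself the +X eigenstate, so it has a definite value for a spin measurement along the X direction.

import numpy as np

Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)

up_z = np.array([1, 0], dtype=complex)
down_z = np.array([0, 1], dtype=complex)

psi = (up_z + down_z) / np.sqrt(2)        # superposition of the two Sz eigenstates
print(np.allclose(Sx @ psi, 0.5 * psi))   # True: psi is the +1/2 eigenstate of Sx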

degenerate: While this term may refer to modern society, in the context of QM it means that more than one eigenfunction of a particular operator has the same eigenvalue. Measurements based on that operator will not cause degenerate eigenfunctions to decohere - for example, if you measure energy and there are two eigenstates with the same energy, those two states will remain in a coherent superposition and the observer will not single out a unique eigenstate.

bra and ket notation: Dirac invented this useful notation in which a function can be represented by a 'ket', or the second half of a bracket, written |label> where "label" is used to describe which function is being referred to. For example, if f(x) = sin(ax) exp(-bx^2), one could write |f_ab>.

A 'bra', or the first half of a bracket, represents the complex conjugate of the same function.

It is written <f_ab|

A 'bracket', such as < f | g >, represents the integral (sum over the configuration space) of the bra function (here, complex conjugate of f) multiplied by the ket function (here, g).

Often (though not always), bras and kets are normalized so that < f|f> = 1.

If you see two kets next to each other, such as |b>=|f>|g>, this means function b is the product of the function f that lives in the configuration space of one system and g which lives in another system: b(x,y) = f(x) g(y).
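Here is a discretized sketch of what the notation stands for (assuming numpy; the grid, and the values of a and b, are chosen just for illustration): <f|g> is the integral of the complex conjugate of f times g, <f|f> = 1 after normalization, and a two-system ket |f>|g> is just the product function f(x)g(y).

import numpy as np

x = np.linspace(-10, 10, 2000)
dx = x[1] - x[0]

def normalize(psi):
    return psi / np.sqrt(np.sum(np.abs(psi)**2) * dx)

def bracket(f, g):
    # <f|g>: integral of conj(f) * g over the configuration space
    return np.sum(np.conj(f) * g) * dx

f = normalize(np.sin(1.3 * x) * np.exp(-0.2 * x**2))  # the |f_ab> example, with a = 1.3, b = 0.2
g = normalize(np.exp(-0.5 * x**2))                    # some other normalized ket

print(bracket(f, f))          # ~1: normalized ket
print(bracket(f, g))          # ~0 here, since f is odd and g is even

b = np.outer(f, g)            # |b> = |f>|g>, i.e. b(x, y) = f(x) g(y)
print(b.shape)                # (2000, 2000): a function of two configuration variables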

quantized: Some measurements have discrete possible outcomes, and such quantities are called quantized. For example, the energy levels of an electron in a hydrogen atom are quantized, but the energy of an electron that escapes the atom can take on a continuous set of values so it is not quantized.

It can also refer to obtaining a quantum mechanical model from a classical one.

'Quantum mechanics' originally referred to quantized quantities but is now used to describe the whole branch of physics which deals with related phenomena such as the wavefunction.

Planck's constant, h: This is a constant that appears in the Schrödinger equation. It sets the scale at which quantum phenomena have noticeable direct effects. It has units of 'action', which are those of (mass)(velocity)(position). More commonly encountered is hbar, which is h/2 pi. The plain h is more useful for full oscillating cycles, which are common enough with waves, while hbar is useful for instantaneous rates of change.

h = 6.63 x 10^-34 kg m^2 / s

geometric optics limit: While there are many issues involved in deriving the appearance of a classical world from the wavefunction model, if we grant the validity of the Born Rule for probabilities then an important part of the derivation of classical mechanics becomes simple:

When the wavelength of a wave is much smaller than the size of whatever openings it goes through, the spreading out of the wave becomes negligible. This is the same reason that light waves can be treated as coming out of a flashlight in straight lines, while sound waves much more noticeably bend around corners. There are still small tails where a tiny portion of the wave's squared amplitude will spread off, which is why I invoked the Born Rule, since it lets us neglect that part.

Just as most of a light wave will move in a straight line, most of a quantum matter wave for a non-microscopic (macroscopic) object will follow the trajectory predicted by classical mechanics. The small bit that will not generally has a Born Rule probability so low that it is effectively impossible to ever measure.

Hilbert space: The state of a quantum system is given by the wavefunction, which is a function on configuration space. This can be thought of as representing a 'state vector' in an abstract space.

An ordinary vector in regular 3 dimensional space is a quantity which has both direction and magnitude - for example, velocity. It can be represented by x,y,z components: V = (vx, vy, vz). Thus, in a particular coordinate system, it is written as a function of a discrete index which can only take on 3 values; e.g. V(1) = vx, V(2) = vy, V(3) = vz.

A Hilbert space is a generalization of this to describe any function as a vector in some high-dimensional space.

Philosophically, thinking of the function as a vector implies that the particular coordinate system in which the components are spelled out is not very important. The physical nature of a velocity would be the same no matter what x,y,z coordinate system we are working in, but the components would look different.

quantum mechanical basis: This is analogous to choosing a coordinate system to write the vectors of the Hilbert space in. Changing a basis is analogous to rotating the directions of your coordinate system components.

Examples include 1) the position basis, in which the wavefunction is a function of position as expected, and 2) the momentum basis, in which the wavefunction is a function of particle momentum, which is (mass)(velocity). It may not seem at first that the two bases are equivalent, because in classical mechanics, knowing the momentum will not tell you the position. But in quantum mechanics, in which the wavefunction is a complex-number-valued function, knowing the wavefunction at every point in the momentum basis is enough to find it in the position basis and vice versa.
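A sketch of that last claim (assuming numpy; the wavefunction here is made up): the momentum-basis wavefunction is the Fourier transform of the position-basis one, and transforming back recovers the original exactly, so the two bases carry the same information.

import numpy as np

x = np.linspace(-20, 20, 1024)
psi_position = np.exp(-x**2) * np.exp(2j * x)   # some complex-valued wavefunction of position

# Momentum basis: the (discrete) Fourier transform of the position-basis wavefunction
psi_momentum = np.fft.fft(psi_position)

# Transforming back recovers the position-basis wavefunction
psi_back = np.fft.ifft(psi_momentum)
print(np.allclose(psi_back, psi_position))   # True: no information is lost in the change of basis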

Just as there is rotational symmetry in ordinary space which makes it impossible to know if there is in nature any actual, fundamental coordinate system in which vectors really have three components, it is impossible to know what the "actual basis that nature uses" is in QM. Under the Copenhagen interpretation, the assumption was that there is no such thing - only things we can measure were considered real.

As will be seen, more mechanistic, literal views of mathematical models (such as the MWI) seem to require some actual basis that nature would use.

Next: Key definitions for QM: Part 3

Friday, August 7, 2009

Key definitions for QM: Part 1

Before turning to the more advanced issues that I created this blog to discuss, it would be good to give at least a brief summary of a few of the basic terms related to QM that often come up in discussion of interpretations.

Here I will explain some of the terms that seem most likely to cause confusion and which are most directly involved in the interpretation of QM. Rather than technical definitions, here I am concentrating on their role in interpretations. In the next post, definitions Part 2, I'll define some slightly more technical terms which often arise in the context of discussions of QM but which are not as essential for the basic interpretation issues.

wavefunction: The mathematical model of QM, in its most standard formulation, deals with a complex-valued function on configuration space that undergoes wave-like motions. In a fit of inspired creativity, it was dubbed the wavefunction ;) It is often represented by the Greek letter psi, which looks like a U with a vertical line through the middle and extending below the curve.

measurement: This usually refers to an experiment in which a human observes a macroscopic instrument to determine the outcome. As such, it is an emergent phenomenon and can play no fundamental role in the mathematical model of the physical world. However, it certainly plays a fundamental epistemological role in our ability to learn about the world.

When a 'measurement' occurs, the result is one of the allowed results (an eigenvalue of the measurement operator) and subsequently the wavefunction appears to behave as if it had been placed in the corresponding eigenstate (see below).

eigenfunction/eigenvalue: These terms from German, now used in linear algebra, describe mathematical properties of certain functions. Eigen- means characteristic. Each measurable quantity (such as energy) corresponds to a linear operator. Each such operator A gives a spectrum of solutions to the equation A psi_i(x,..) = c_i psi_i(x,..) where c_i is a constant. Here i is an index for however many values work for that operator. The different constants are called eigenvalues, and I'll let you guess what the corresponding functions are called ;) A wavefunction that is an eigenfunction is called an eigenstate.

Copenhagen Interpretation: This was "the standard view" of most physicists during much of the 20th century. Essentially, it said that "when a measurement occurs", the wavefunction "collapses" to give one of the allowed outcomes. It was never possible to define the exact circumstances under which a measurement was supposed to occur, or to say what "collapse" was like. This interpretation has become widely recognized as incomplete at best, or less charitably, as ill defined and utterly implausible. In practice, it often meant "shut up and calculate" - it allowed the interpretation question to be swept under the rug so that physicists could work on practical problems instead, like making atomic bombs. It is no longer taken seriously by most philosophers of QM.

Schrödinger equation: This is a deterministic, linear equation that gives the time evolution of the wavefunction. In the MWI, this equation always holds true.

It does not produce any "collapse of the wavefunction" so certain other interpretations must modify it, either explicitly (continuous collapse models) or by hand waving talk about 'measurement' (Copenhagen).

"collapse of the wavefunction": It was long believed that another process, not described by the Shrodinger equation, must occur during 'measurement' in which a single random outcome is chosen - the so-called 'collapse of the wavefunction'. In 1957, Everett argued that no such 'collapse' is needed to explain what we see - he proposed the MWI.

Caution: Even people who believe the MWI sometimes use a sloppy terminology in which they talk about "collapse of the wavefunction" when it is supposed to be understood that they really only mean the illusion of such collapse due to decoherence. I dislike this misleading terminology.

Born Rule: When a 'measurement' occurs, the probability of each outcome is given by the absolute value of the square of the overlap integral of the wavefunction with the corresponding eigenstate. It is an open question as to whether and how the standard (Shrodinger equation only) MWI can explain this, or if not, what does. This is the key issue in interpretation of QM.
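Here is a small sketch tying the last few definitions together (assuming numpy, with a made-up 2x2 Hermitian operator standing in for a measurable quantity): find the eigenvalues and eigenstates of the measurement operator, then compute the Born Rule probability of each outcome as the squared magnitude of its overlap with the wavefunction.

import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]], dtype=complex)   # a made-up measurement operator (Hermitian)
eigenvalues, eigenstates = np.linalg.eigh(A)             # possible outcomes and their eigenstates

psi = np.array([1.0, 0.0], dtype=complex)                # the (normalized) wavefunction
overlaps = eigenstates.conj().T @ psi                    # overlap of each eigenstate with psi
probabilities = np.abs(overlaps)**2                      # Born Rule

print(eigenvalues)        # the allowed measurement results: [1., 3.]
print(probabilities)      # [0.5, 0.5]; the probabilities sum to 1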

linear superposition: One of the most important properties of the Schrödinger equation is that it is linear. This means that if F1 is a solution of the equation, and so is F2, then the sum (superposition) F3 = F1 + F2 is also a solution. (The actual solution that physically occurs depends on the initial conditions - the starting state.)

In a measurement-like situation, and more generally whenever two systems interact, the solution will usually have a branching type of behavior. For example, when a photon hits a half-silvered mirror, there will be part of the wave that is reflected and another part that passes through. The subsequent behavior of these two parts of the wave does not depend on what the other part is doing. If the two parts are brought back together, the resulting wavefunction is a linear superposition of the parts. This can result in an interference pattern.

Linearity forbids collapse because if F1(0) evolves to F1(t), and F2(0) evolves to F2(t), then F1(0) + F2(0) must evolve to F1(t) + F2(t). Collapse, by contrast, would mean replacing the sum by randomly selecting only one, either F1 or F2.
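A numerical statement of that point (a sketch, assuming numpy and hbar = 1, with a made-up Hamiltonian): time evolution is a linear operator, so evolving a sum gives the sum of the evolved parts; a linear map has no way to throw one of them away.

import numpy as np

hbar = 1.0
H = np.array([[0.0, 0.7], [0.7, 1.0]], dtype=complex)     # made-up Hamiltonian
E, V = np.linalg.eigh(H)
t = 1.5
U = V @ np.diag(np.exp(-1j * E * t / hbar)) @ V.conj().T  # Schrödinger evolution for time t

F1 = np.array([1.0, 0.0], dtype=complex)
F2 = np.array([0.0, 1.0], dtype=complex)

# Linearity: evolving F1 + F2 gives the same result as evolving F1 and F2 separately and adding
print(np.allclose(U @ (F1 + F2), U @ F1 + U @ F2))   # True: the superposition persists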

decoherence: This refers to the way in which different branches of the wavefunction stop interfering with each other. Linearity prevents different terms in a superposition from changing each other but it still permits cancellation or reinforcement between parts of the functions - interference patterns, e.g. a positive part of F1(x) cancelling a negative region in F2(x).

Decoherence usually means that a system becomes entangled (correlated) with the environment in a robust way. This is generally an irreversible process in the statistical sense, much like an increase in entropy in statistical mechanics.

Once entangled with the environment, interference patterns are no longer seen because the functions F1(x1,x2,...) and F2(x1,x2,...) now have most of their non-negligible regions in different parts of configuration space: They may still overlap in terms of the x1-dependence (the microscopic system under study), but they occur in different parts of the environment variables' space, e.g. x2.

In principle, an interference pattern could be restored if the x2,... dependence were also brought back into overlap. In practice, there are so many particles in the environment that doing this is not feasible. Thus, decoherence creates the illusion of irreversible 'collapse of the wavefunction'.

Also, it is now fairly well understood that in measurement-like situations, in which an interaction exists that tends to separate out components of the wavefunction that have different eigenvalues for what is being measured, the different eigenstates will tend to decohere. This explains part of the measurement puzzle from the MWI perspective.

entanglement: This means that the wavefunction has correlations between the states of two or more systems. For example, F(x1)G(x2) is a product state and is not entangled, but F1(x1)G1(x2)+F2(x1)G2(x2) is a correlated, entangled state. Entanglement with the environment results in decoherence and the illusion of "collapse of the wavefunction".
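Here is a sketch of how entanglement with an environment produces decoherence (assuming numpy; the states are toy two-level systems chosen for illustration): for a product state, the system's reduced density matrix keeps its off-diagonal (interference) terms, but once the system is entangled with distinct environment states those terms are suppressed.

import numpy as np

def reduced_density_matrix(state_matrix):
    # state_matrix[i, j] = amplitude for (system state i, environment state j);
    # tracing out the environment gives the system's reduced density matrix.
    return state_matrix @ state_matrix.conj().T

plus = np.array([1.0, 1.0]) / np.sqrt(2)

# Product (unentangled) state: system in |0>+|1>, environment in its own state |e>
e = np.array([1.0, 0.0])
product = np.outer(plus, e)
print(reduced_density_matrix(product))
# [[0.5, 0.5], [0.5, 0.5]] -- off-diagonal terms present, interference possible

# Entangled state: |0>|e0> + |1>|e1> with orthogonal environment states
e0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])
entangled = (np.outer([1.0, 0.0], e0) + np.outer([0.0, 1.0], e1)) / np.sqrt(2)
print(reduced_density_matrix(entangled))
# [[0.5, 0.0], [0.0, 0.5]] -- off-diagonals gone: decoherence, no interference pattern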

Entangled states between small numbers of controlled particles are also important, because they display various non-classical behaviors, such as violations of Bell's inequalities when measured, and are useful in quantum computing and quantum cryptography.

Next: Key definitions for QM: Part 2

Wednesday, August 5, 2009

Studying Quantum Mechanics: the Delayed Choice example

Most descriptions of QM are not very good. In particular, the configuration-space-wave-mechanical aspects of QM are usually not fully taken into account; instead, a nearly incomprehensible description is given in more classical terms.

Delayed Choice experiment:

For example, consider a delayed-choice thought experiment in which a photon can take two paths simultaneously. If the experimenter wants, he can "determine which path the photon took" by letting it hit a pair of detectors; it will register in only one detector, randomly chosen, as far as he can tell. The paths are laid out in such a way that in order for it to hit a detector, it must have taken the corresponding path. Taking the other path would cause it to sail past that detector and into the other one.

Or, he can insert a 'beamsplitter' (a half-silvered mirror) to recombine the beams, in a way that results in the photon always going to the rightmost detector due to wave interference - in which case it must have taken both paths. He can choose whether to insert the mirror just before the photon reaches the detectors, after most of the paths would have already occurred!

Mysterious stuff, right? It looks like the experimenter reached back in time, changing whether the photon took both paths or chose one randomly!

Sure - if you think about it the wrong way.

In terms of wave mechanics (which is the MWI), the photon took both paths in all cases. If the beam recombiner is not present, the photon becomes entangled with the detectors - that is, becomes correlated with their degrees of freedom in configuration space. In one 'branch' of the wavefunction, one detector clicked; in the other, the other did. There is no 'delayed choice' mystery. [There is only the standard question for the MWI of what explains the appearance of probabilities - the old Born Rule problem.]
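Here is a sketch of the two cases in terms of wave amplitudes (assuming numpy; the 2x2 matrix is one standard phase convention for a half-silvered mirror, chosen just for illustration): treat the two paths as a two-component state. Without the recombining beamsplitter each detector fires with probability 1/2; with it, the amplitudes interfere and one detector fires every time.

import numpy as np

# Half-silvered mirror acting on the (path 1, path 2) amplitudes
beamsplitter = np.array([[1, 1j],
                         [1j, 1]]) / np.sqrt(2)

photon_in = np.array([1.0, 0.0])          # photon enters along path 1

after_split = beamsplitter @ photon_in    # the photon wave is now on both paths

# Case 1: no recombining beamsplitter -- detectors sit on the two paths
print(np.abs(after_split)**2)             # [0.5, 0.5]: each detector fires half the time

# Case 2: recombining beamsplitter inserted just before the detectors
recombined = beamsplitter @ after_split
print(np.abs(recombined)**2)              # [0., 1.]: interference sends the photon to one detector every time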

Most of the mysterious aspects of QM make a lot more sense when viewed as just wave mechanics in configuration space. But it's hard (impossible?) to find an introductory treatment of QM that even mentions configuration space. An advanced treatment of QM is unlikely to be much better - the equations will be there, but with little explanation.

The next post will be a basic glossary of common QM terms such as 'entangled'. Then I should be getting on to start discussing MWIs in more detail.

on external links

I wrote in a comment on the "Why MWI?" post:

"External links can be very useful, and thanks for the tips, but there is one problem: There is liable to be something I disagree with at most links. For example, while the article on collapse interpretations that you gave a link for is good, it casts them in a more favorable light than I would. I mentioned collapse in my blog only to say why it is wrong, get it out of the way, and move on to the more interesting stuff :)"

The matter bears some discussion, and I would welcome comments about it, though those remarks might have scared off the guy I was responding to.

There are a few things I want to make clear:

1) I do not want to limit anyone's exploration of ideas or to railroad people into a particular conclusion. This isn't about that at all. What I want is to avoid sending people to read misleading articles until they are ready to detect the ways in which those articles are unintentionally misleading.

2) Now, the best way for you to know if an article omits important information, contains outright untruths, sweeps problems with a claim under the rug, or is otherwise misleading, is for you to read it and decide for yourself! However, in order to decide correctly, you often need considerable background information.

For example, suppose you see an article that says "Bell's theorem, and the experiments that have tested it, prove that nonlocality is a real feature of our world."

You are likely to see statements like this in many different, independent articles and sources. It's a common interpretation of Bell's theorem, even by respectable physicists (those who know little of the MWI). Should you therefore believe it?

No, it's false. You could know that if you read my post
http://onqm.blogspot.com/2009/07/simple-proof-of-bells-theorem.html
in which I mention

"Note: The theorem is often said to prove that QM is nonlocal, because a reasonable local model would not allow the direction chosen for a distant measurement to influence the result of the other measurement. That is not the whole story and you should be aware of the other possibilities. In particular, Many-Worlds interpretations do not suffer this limitation because all outcomes occur and correlations might be established only after local interactions; see http://arxiv.org/abs/0902.3827"

Now here I did give an external link, because I read it and it seemed fairly reliable. You can read the linked paper and decide for yourself.

But most links I could give to discussions of QM are liable to contain misleading statements.

3) So, I could prepare the reader in advance, by telling you what to look out for at a particular link, right? Not usually practical. If I give a link, it's so that I don't have to explain the whole thing myself, but I'd practically end up having to do it anyway. In some cases though this could work, if the problem area is relatively small or obvious.

4) If you read a long link, that could take a lot of time and interrupt the flow of what I am trying to say.

5) You should know upfront, this blog is my turf. I don't claim to take a neutral stance on the issues; I just present the correct stance as I see it. This is not a public school or a newspaper.

Comments?
