Comments
Description
Transcript
How “Quantum” is the D-Wave Machine?
How “Quantum” is the D-Wave Machine? Seung Woo Shin* , Graeme Smith† , John A. Smolin† , and Umesh Vazirani* arXiv:1401.7087v1 [quant-ph] 28 Jan 2014 * † Computer Science division, UC Berkeley, USA. IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA. Abstract Recently there has been intense interest in claims about the performance of the D-Wave machine. Scientifically the most interesting aspect was the claim in [7], based on extensive experiments, that the D-Wave machine exhibits large-scale quantum behavior. Their conclusion was based on the strong correlation of the input-output behavior of the D-Wave machine with a quantum model called simulated quantum annealing, in contrast to its poor correlation with two classical models: simulated annealing and classical spin dynamics. In this paper, we outline a simple new classical model, and show that on the same data it yields correlations with the D-Wave input-output behavior that are at least as good as those of simulated quantum annealing. Based on these results, we conclude that classical models for the D-Wave machine are not ruled out. Further analysis of the new model provides additional algorithmic insights into the nature of the problems being solved by the D-Wave machine. 1 Introduction In a future world of quantum devices, it will become increasingly important to test that these devices behave according to specification. While this is clearly a central issue in the context of quantum cryptography [4, 27] and certified random number generators [17, 26], it is also quite fundamental in the context of testing whether a claimed quantum computer is really quantum [2, 18, 14, 5]. Recently, this last issue has featured prominently in the context of the D-Wave machine [12, 6, 7, 1, 15], amidst questions about whether it should really be thought of as a quantum computer and whether it provides speedups over classical computers. One of the grounds for skepticism is that the decoherence time of D-Wave’s qubits is reported to be on the order of nanoseconds, which is comparable to the time for a single operation and much shorter than the time required to carry out a computation, which is on the order of microseconds. By now, claims about speedups over classical computers [16] have been largely refuted [7, 22, 19]. But one of the papers [7] that disproved the claims of quantum speedup 1 also claimed based on extensive experiments that the D-Wave machine exhibits large-scale quantum behavior. While falling far short of the existence of a working quantum computer, this conclusion is nonetheless very exciting, since it suggests that it might be possible to create large-scale entanglement without investing in stable, slowly decohering qubits. For this reason it is particularly important to closely examine the evidence presented in the paper. The paper recorded the input-output behavior of the machine D-Wave One on a thousand randomly chosen inputs and compared it to the input-output behavior obtained by simulating three models. The first model was simulated quantum annealing1 , and the other two were classical models — simulated annealing, and classical spin dynamics suggested in [23]. The paper reported that the input-output behavior of D-Wave One correlated highly with simulated quantum annealing, but despite considerable effort in choosing parameters did not correlate well with the two classical models. Based on this, the paper concluded that the D-Wave machine was indeed following quantum dynamics, and exhibited largescale quantum behavior. At a high level, the nature of this argument is very interesting, and reminiscent of the algorithmic approaches in [2, 18, 9, 14, 5]. In those papers, the quantum system to be tested is modeled as a black box, and elaborate protocols are designed whose execution guarantees that the untrusted quantum systems behaves according to specification. Given the limited access to the D-Wave machine, which is proprietary and with access controlled by the company, the approach of [7] may be seen as algorithmic in nature. The evidence of “quantumness” is based on broad algorithmic differences between the three models rather than detailed physical modeling of the system. In particular, the core of the argument is based on the finding that the D-Wave machine and simulated quantum annealing generally find the same set of instances to be “hard” and the same set of instances to be “easy.” The hardness of an instance is measured by the number of times that each candidate algorithm succeeds in finding the optimal solution out of one thousand trials. Of course, how convincing this evidence is depends upon how comprehensive the list of candidate classical models is. In this paper, we consider a third classical model for the D-Wave machine. The model is quite natural, and it replaces each spin in the D-Wave machine with a magnet pointing in some direction in the XZ plane. Each magnet is subject to both an external magnetic field as well as dipole-dipole couplings due to interactions with nearest neighbors on the so-called Chimera graph. The external magnetic field is attenuated at the same rate as in [8], while the couplings between magnets also follow the schedule from [8]2 . We consider 1 Simulated quantum annealing refers to a quantum Monte Carlo simulation of quantum annealing. In this sense, it is a classical algorithm. However, one can argue that the quantum Monte Carlo simulation is quite involved and the D-Wave machine is surely not carrying out such a complex computation. Therefore, it may be reasonable to regard this as a quantum process. 2 We note that the schedule we consider here is different from the one reported in [7]. We were informed [24] that the schedule in [7] was based on an error by D-Wave, and the correct schedule is reported 2 the input-output behavior of this model on the same set of one thousand inputs. Our simulations show that it achieves as good or better correlation with the D-Wave machine’s input-output behavior than simulated quantum annealing does. Based on these results, it is safe to say that classical models for the D-Wave machine are not ruled out, and it is premature to conclude that the D-Wave machine exhibits large-scale quantum behavior. Matthias Troyer suggested [24] a direct comparison between our model and simulated quantum annealing, which reveals an extremely high correlation of R ≈ 0.99 (Fig 9). One way to view this result is that our model is the classical analogue of a mean-field approximation to simulated quantum annealing, and that for the set of problems solved by D-Wave One, this approximation is very accurate. This helps explain why the correlation of the D-Wave data with simulated quantum annealing does not preclude a classical model for D-Wave. Further analysis of the algorithmic behavior of this new model also provides new insights into the nature of the computational problem being solved, namely the problem of finding the ground state of a classical Ising spin glass on the so-called Chimera graphs. The Chimera graph consists of super-nodes or clusters of 8 qubits (connected in a complete bipartite graph K4,4 ) arranged in a two-dimensional grid. The D-Wave machine consists of superconducting qubits arranged on the vertices of such a graph, with a z − z coupling between neighboring qubits. An analysis of the new model suggests that the existence of these super-nodes greatly reduces the effective size of the computational problem being solved, so that heuristically the size of the problem should be thought of as the number of super-nodes m = 16, rather than the number of qubits n = 108. To the extent that the model accurately represents the dynamics of the D-Wave machine, it therefore has great implications for the performance of the D-Wave machine as it is scaled up to larger numbers of qubits. 2 Background We start by defining the computational problem that D-Wave machines are designed to solve. The “native” problem of D-Wave machines is the problem of finding the ground state of a classical Ising spin glass, which can be described as follows. The input to the problem is a weighted graph with n vertices, where each vertex represents a spin and each edge represents interaction between the two incident vertices. The edge weights Jij can be either 1 or −1, corresponding to ferromagnetic and antiferromagnetic interaction respectively. The problemP is simply to assign a spin value zi ∈ {−1, 1} to each vertex i so that the energy H = − i<j Jij zi zj is minimized, i.e. finding the “ground state” of the given interaction graph. An additional constraint imposed by D-Wave machines is that the input graph must be a in [8]. Our results are reported for this new schedule. All the claims we report here apply equally well to the previous schedule in [7]. 3 Supplementary material for “Quantum annealing with more 0 4 8 12 16 20 24 28 1 5 9 13 17 21 25 29 2 6 10 14 18 22 26 30 II. THE CHIMER The qubits and cou thought of as the ve bipartite graph, called figure 1. This graph eight qubits each. W couplers realise a com each of the four qub the four on the right left is furthermore co in the unit cell above on the right is horizo ing qubits in the uni appropriate modificat the 128 qubits in the in the experiments ar between them are ma For our scaling an cedure for scaling of FIG. 1: Qubits and couplers in the D-Wave device. Figure 1: The “Chimera” graph structure implemented by the D-Wave One. Grey dots sidering the chimera D-Wave One Rainer consists 4 ⇥ Figure 4 unitiscells represent The the twenty non-working qubitschip in the D-WaveofOne. fromof [7]. with an eight-site un eight qubits, connected by programmable inductive couplers tions. The sizes we ty as shown by lines. are L = 1, . subgraph of the so-called Chimera graph, which is depicted in Figure 1. The Chimeraulations graph consists of copies of complete bipartite graphs K4,4 ’s which are laid out in 2D lattice8,as32, in 72, 128, 200, 288 Figure 1. We will refer to each copy of K4,4 as “supernode.” Note that the left qubits in each lated annealers and supernode are coupled vertically in the 2D lattice and the right qubits horizontally.above More we used a perf specifically, each left qubit is coupled with the corresponding left qubit in the supernodes 128 where we compar immediately above and below it, and each right qubit to the corresponding right qubits in qubits supernodes immediately to the right and left. On the coarser level, this structure can be within selectio the shown in fi viewed as a 2D lattice of size 4 × 4. While the problem of finding the ground state of graph a I. OVERVIEW In references [29, classical Ising spin glass is NP-hard, it is known to have a polynomial time approximation scheme when the input graph is a Chimera graph [20]. sation problem on a The D-Wave machine places a superconducting flux qubit at each node of the Chimera can be mapped to an graph with Npvertice Here we provide additional details in support of the tree width of N men main text. Section II shows details of the chimera graph this mapping. See S 4 used in our study and the choice of graphs for our simulaabout the tree width tions. Section III expands upon the algorithms employed in our study. Section IV presents additional success probability histograms for di↵erent numbers of qubits and for III. CLAS instances with magnetic fields, explains the origin of easy and hard instances, and explains how the final state can A. S be improved via a simple error reduction scheme. Section 3 7 11 15 19 23 27 31 32 36 40 44 48 52 56 60 33 37 41 45 49 53 57 61 34 38 42 46 50 54 58 62 35 39 43 47 51 55 59 63 64 68 72 76 80 84 88 92 65 69 73 77 81 85 89 93 66 70 74 78 82 86 90 94 67 71 75 79 83 87 91 95 96 100 104 108 112 116 120 124 97 101 105 109 113 117 121 125 98 102 106 110 114 118 122 126 99 103 107 111 115 119 123 127 1 Figure 2: The annealing schedule of the D-Wave One. Figure is from [8]. graph. The time-dependent Hamiltonian of D-Wave machines is given by X X H(t) = −A(t) σix − B(t) Jij σiz σjz i<j 1≤i≤n where A(t) and B(t) represent the “annealing schedule” of the machine, which is shown in Figure 2, and the process is carried out at a finite temperature. As mentioned in a footnote in the introduction, this schedule is different from the one reported in [7], which was based on an error by D-Wave [24], although this did not change their conclusions. Our paper is based entirely on the new corrected schedule [8]. We note that our conclusions also hold equally well for both schedules. If the entire process were carried out sufficiently slowly and at zero temperature, then it would be an implementation of adiabatic quantum optimization [13]. Such Pa process is guaranteed to end up in the ground state of the target Hamiltonian H = − i<j Jij Zi Zj . For this to happen, the total time for the process has to scale as Ω(1/∆2 ), where ∆ is the minimum spectral gap of the time-dependent Hamiltonian where the minimum is taken over the annealing schedule. It is known that in the worst case ∆ scales exponentially in the problem size [13, 25, 3]. Quantum annealing can be thought of as a noisy version of adiabatic quantum computing which is carried out at a finite temperature. Finally, we summarize the previous results on the question of whether D-Wave machines exhibit large-scale quantum behavior. We note that there were a few earlier papers that studied related questions [12, 6], but the first paper to study this question at a sufficiently large scale was [7], in which the behavior of the 108-qubit D-Wave One was compared to simulations of quantum and classical models. The method of comparison was as follows. First, one thousand instances of the Ising spin glass problem were generated randomly. 5 Number of instances 500 400 300 200 200 100 100 0 0.0 400 0.2 0.4 0.6 0.8 0 0.0 1.0 500 C) SA 400 300 300 200 200 100 100 0 0.0 0.2 0.4 0.6 0.8 Success probability B) SQA 400 300 500 Number of instances 500 A) DW 0 0.0 1.0 0.2 0.4 0.6 0.8 1.0 D) SD 0.2 0.4 0.6 0.8 Success probability 1.0 Figure 3: Histogram of success probabilities from [8]. It is observed that the histogram is bimodal for D-Wave, simulated quantum annealing, and classical spin dynamics, whereas it is unimodal for SA. This means that the former three algorithms divide the problem instances into two groups, namely “easy” and “hard.” They succeed almost always on the “easy” instances and fail almost always on the “hard” instances. The problem size was 108 for all of these instances, in order that all of the 108 working qubits of the D-Wave One are utilized. Then, each of the following three algorithms were run one thousand times on each of these one thousand problem instances: the D-Wave One, simulated quantum annealing, and simulated (classical) annealing. Based on this, the success probability of each algorithm was recorded for each instance, where success is defined to be the finding of the exact optimum, i.e. the ground state, for the instance. The histogram of success probabilities of each algorithm, shown in Figure 3, served as the main evidence for the paper’s conclusion, which was that the D-Wave One was consistent with simulated quantum annealing rather than the classical models and therefore D-Wave One is indeed performing quantum annealing. A parallel conclusion of the paper was that their simulated annealing code was in fact up to 6 times faster than the D-Wave One, indicating no speedup over classical computers. This result was immediately questioned by Smolin and Smith in [23], who pointed out that the unimodal histogram of simulated annealing was simply a consequence of statistical independence between different runs and that it was not a signature of all classical algorithms. They went on to propose a classical model for the D-Wave machine based on O(2) rotor model, which also exhibited bimodal signature in the success probability histogram. 6 3 20" instances" 40" 0" FIG. 3: Correlations. Panels A-C show scatter plots of correlations of the success probabilities p(s) obtained from di↵erent methods. The red lines indicate perfect correlation. Panel A is for the D-Wave device between two sets of eight di↵erent gauges. This data shows the baseline imperfections in the correlations due to calibration errors in the D-Wave device. Panel B is for the simulated quantum annealer (SQA) and the D-Wave device, with the latter averaged over 16 random gauges. This correlation is nearly as good as in panel A, indicating good correlations between the two methods.. Panel C is for the classical spin dynamics and the D-Wave device, and shows poor correlation. Panel D shows the correlation between success probability and the mean Hamming distance of excited states found at the end of the annealing for N = 108 spin instances with local random fields. Easy (hard) instances tend to have a small (large) Hamming distance. The colour scale indicates how many of the instances are found in a pixel of the plots. Figure 4: Scatterplot of success probabilities from [8]. The correlation between D-Wave and SQA is noticeably better than that between D-Wave and the classical models. FIG. 2: Energy-success distributions. Shown is the joint probability distribution p(s, ) (colour scale) of success probability s and the final state energy measured relative to the ground state. We find very similar results for the D-Wave device (panel A) and the simulated quantum annealer (panel B). The distribution for simulated classical annealing (panel C) matches poorly and for spin dynamics (panel D) matches only moderately. For the D-Wave device and SQA the hardest instances result predominantly in low-lying excited states, while easy instances result in ground states. For SA most instances concentrate around intermediate success probabilities and the ground state as well as low-lying excited states. For classical spin dynamics there is a significantly higher incidence of relatively high excited states than for DW, as well as far fewer excited states for easy instances. The histograms of figure 1, representing p(s), are recovered when summing these distributions over . SA distributions for di↵erent numbers of sweeps are shown in the supplementary material. The authors of [7] soon responded in [28] that their conclusions were in fact based on more detailed correlation analysis, showing that the correlation between the D-Wave success probabilities and SQA success probabilities is much higher than the correlation between the D-Wave success probabilities and success probabilities of the classical models (Figure In panel C) we show the correlation between the classi4). Along with these results, they also released the dataset from their experiments as a cal spin dynamics model and the device. Some instances are easily solved by the classical mean-field dynamics, service the community, to perform the studyand reported this paper. classicalto annealer (panel C) and spinwhich dynamicsallowed (panel D). us simulated quantum annealing, the device.in However, 3 The third test, shown in figure 3, is perhaps the most enlightening, as it plots the correlation of the success probabilities between the DW data and the other models. As a reference for the best correlations we may expect, we show in panel A) the correlations between two di↵erent sets of eight gauges (di↵erent embeddings of the same problem on the device, see Methods and supplementary material): no better correlations than the device with itself can be expected due to calibration errors. Panel B) shows a scatter plot of the hardness of instances for the simulated quantum annealer and the D-Wave device after gauge averaging. The high density in the lower left corner (hard for both methods) and the upper right corner (easy for both methods) confirms the similarities between the D-Wave device and a simulated quantum annealer. i for instances of interThe two are also well correlated mediate hardness. The similarity to panel A) suggests almost perfect correlation with SQA, to within calibration uncertainties. Results as can be expected from inspection of their respective distributions in figure 1, there is no apparent correlation between the hard instances for the spin dynamics model and the success probability on the device, nor does there appear to be a correlation for instances of intermediate hardness, in contrast to the correlations seen in panel A). Similarly, there are poor correlations [22] with a classical spin dynamics model of reference [23]. The correlations between the simulated classical annealer and the D-Wave device, shown in the supplementary i material, are significantly worse than between SQA and the device. We next provide evidence for the bimodality being due to quantum e↵ects. Our first evidence comes from the simulated quantum annealer. When lowering the temperature thermal updates are suppressed, quantum tunneling dominates thermal barrier crossing, and we observe a stronger bimodality; indeed a similar bimodal distribution arises also in an ensemble of (zero-temperature) In this paper, we consider the following classical model: each spin i is modeled by a classical magnet pointing in some direction θ in the XZ plane (since there is no σ y term in our time-dependent Hamiltonian, we can assume that there is no Y component). We assume that the angle θ is measured in relation to x̂. We further assume that there is an external magnetic field of intensity A(t) pointing in the x̂-direction, and that neighboring magnets are coupled via either ferromagnetic or anti-ferromagnetic coupling, according to the specification of the input graph. Our Hamiltonian can then be rewritten as follows: X X H(t) = −A(t) cos θi − B(t) Jij sin θi sin θj . i<j 1≤i≤n Thus, in the absence of noise, i.e. at zero temperature, each spin i willPsimply align with the net effective field at that location which is given by A(t)x̂ + B(t)ẑ j Jij sin θj . To model the noise in the system, we perform a Metropolis-type update inside this model. That is, at each time step, we do the following: 1. For each spin i, pick θi0 at random and compute the change in energy that would result if the spin were to change its state to θi0 : X ∆Ei = −A(t)(cos θi0 − cos θi ) − B(t) Jij sin θj (sin θi0 − sin θi ) j 7 300" Number'of'instances' 250" 200" 150" 100" 50" 0" 0.2" 0.4" 0.6" 0.8" Success'probability' Our"model" DWave" 1" Our model’s success probability Figure 5: Histogram and scatterplot of our classical model. The correlation coefficient R between the D-Wave One and our model is about 0.89. 2. Update spin i’s state to θi0 with probability e−∆Ei /T where T is the temperature of the system which is kept constant throught the simulation. Note that this model is related to the O(2) rotor model of [23], but differs in that we perform Metropolis-type updates instead of integrating the equation of motion. Indeed the update procedure is the direct analogue of the Metropolis algorithm in O(2) model. Our experimental results (Figure 5) show that this simple model not only exhibits a bimodal signature similar to that of the D-Wave One or simulated quantum annealing, but it also achieves a very high correlation with the success probabilities of the D-Wave One which were published in [28]. Our success probabilities were obtained by simulating the above model one thousand times on each instance, where each run consisted of 500,000 steps and the system temperature was fixed at T = 0.22. We note that the correlation of 0.89 achieved by our model is higher than the correlation achieved by simulated quantum annealing in [8]. 4 Discussion It is instructive to analyze the behavior of our classical model and to compare and contrast it with simulated annealing. First, if the transverse field were set to 0, i.e. if A(t) = 0 for all t, then the model simplifies to an O(2) analogue of simulated annealing. To see this, note that for the Metropolis acceptance probability function e−∆Ei /T , keeping the temperature T constant and increasing the coupling strength B(t) has the same effect as keeping B(t) constant and decreasing the temperature T over time. Simple experiments confirm that 8 ir%conclusions%were% sis.%In%parHcular,%the% er%than%the%correlaHon% 3 ing%rotor%models.% odel%for%D;Wave%that% g Woo Shin Role of transverse field What happ ! If%we%remove%the%transverse%field%A(t)%from%our%model,%we%obtain O(2)%simulated%annealing,%which%shows%unimodal%signature.% ! A%case%study:%the ! How%does%the%transverse%field%affect%the%evoluHon?% Umesh Vazirani t=0.1725%up%to%two Computer Science division ⇒ Provides%gradaHon%in%the%magnitude%of%z;components.% ! Comparison%of%th 3 orientaHon%betwee UC Berkeley transverse field weak bias in z direction with transverse field weak bias in z direction 1% 0.5% z;component% without transverse field 0% during%the%branchi ! As%t%increases,%sm ! Other%instances%s ;0.5% Rotor model ll figures are from [1] move%the%transverse%field%A(t)%from%our%model,%we%obtain% ;1% ated%annealing,%which%shows%unimodal%signature.% FIG. 3: Correlations. Panels A-C show scatter plots of correlations of the success probabilities p(s) obtained from es%the%transverse%field%affect%the%evoluHon?% di↵erent methods. The red lines indicate perfect correlation. strong bias in z-direction strong bias in z-direction Spins%(sorted%in%decreasing%order)% with%X%field% without%X%field% and%Smith%[2],%but% ! Reminiscent%of%the%relaHonship%between%cuts%and%eigenvectors% Panel A is for the D-Wave device between two sets6:ofRole eightof transverse field des%gradaHon%in%the%magnitude%of%z;components.% Figure her%than%integraHng% di↵erent gauges. This data shows the baseline imperfections of%graphs%in%spectral%graph%theory.% in the correlations due to calibration ansverse field with transverse field errors in1%the D-Wave de- vice. Panel B is for the simulated quantum annealer (SQA) and the D-Wave device, with the latter averaged over 16 rancatter plots ofThis correlation is nearly as good 0.5% as in panel A, dom gauges. ector%poinHng%at% obtained indicatingfrom good correlations between! the two methods.. Panel At%zero%temperature,%i.e.%in%the%absence%of%noise,%the%simulaHon% given%by% ect C correlation. is for the classical spin dynamicscan%be%viewed%as%a%fixed%point%iteraHon%procedure.%% and the D-Wave device, 0% wo sets of eight s in z direction weak bias in z direction and shows poor correlation. Panel D shows the correlation imperfections Je ij sin ✓isuccess sin ✓jprobability and the! By%a%theorem%in%topology,%all%fixed%points%(equilibria)%are% between mean Hamming distance the D-Wave de;0.5% of excited states found at the end of the annealing for N = 108 reachable%from%the%starHng%point.% annealer (SQA) spin instances with local random fields. Easy (hard) instances ns%with%the%net% ! Thus,%as%Hme%t%changes,%the%model%traces%the%fixed%points% edtend overto16have ran- a small (large) Hamming distance. The colour nite%temperature,% ;1% dter as in panel A, (equilibria)%of%the%Hme;dependent%update%funcHon.% scale indicates in a pixel plots of how many of the instances are foundSpins%(sorted%in%decreasing%order)% methods.. Panel strong bias in z-direction ofz-direction the plots. with%X%field% without%X%field% s in tained from D-Wave device, 5% correlation. Annealing'schedule'of'D/Wave'(approximate)' A(t)% the correlation 4.5% Figure 7: Snapshot of a typical run: z-components of the spins, sorted in decreasing order. sets of eight mming distance B(t)% 4% owing%update:%% In panel C) we show the correlation between the classi3.5% mperfections ng for N = 108 y%at%random%and% cal spin dynamics model and the device. Some instances 3% hard) instances D-Wave deour model indeed exhibits a unimodal signature if the transverse field is removed. This 2.5%mean-field E %%%%%%%%%if%the%spin%was% are easily solved by the classical dynamics, i ce. The colour ealer (SQA) 2% indicates that the transverse field is the very source of bimodal signature. simulated quantum annealing, and ound in aEpixel 1.5%the device. However, i To%confirm%our%in over 16 ranTo understand this better, a closer look at the model shows that an important! effect ase can Tbe expected from inspection 1% of their respective y%%%%%%%%%%%%%%%.%%% of transverse field is to provide gradation in the magnitude of z-components of spins. in panel A, in figure 1, there is no high%correlaHon%wi 0.5% apparent correlation distributions 0% illustrates typical states of spins with and without transverse field. Figure 6 schematically hods.. Panel also%achieves%high% between the hard instances for the spin dynamics 0% 0.05% 0.1% 0.15%model 0.2% 0.25% 0.3% 0.35% 0.4% 0.45% 0.5% 0.55% 0.6% 0.65% 0.7% 0.75% 0.8% 0.85% 0.9% 0.95% Note that in the absence of transverse field, each spin will simply align in the direction Wave device, Time%t% andthe theclassisuccess probability on the device, nor does there een of the z-component of the net field at that location. That is, the spin will want to point! Up%to%t=0.4, correlation appear to be a correlation for instances of intermediate ome instances ~0.08 provablyofdeterministic completely up or completely down regardless the actual strength of the field, as this is the! At%t=0.4,%let ing distance hardness, in contrast to the correlations seen in panel A). eld dynamics, configuration that minimizes energy. ~0.15 The empirically presence ofdeterministic a transverse field completely alters for NHowever, = 108there are Similarly, poor correlations [22] with a classical vice. branching of fixed pointsthe%temperatu this phenomenon, because the relative strengths of the z-0.15~0.4 and x- components determine dynamics model of reference [23].of the spin. behaves like SA 0.4~ d)spin instances non;evolving% heir respective the preferred direction z;component% Evolution of the system Energy%(GHz)% scent%of%the%relaHonship%between%cuts%and%eigenvectors% n%spectral%graph%theory.% Discussion on of the system emperature,%i.e.%in%the%absence%of%noise,%the%simulaHon% wed%as%a%fixed%point%iteraHon%procedure.%% orem%in%topology,%all%fixed%points%(equilibria)%are% from%the%starHng%point.% %Hme%t%changes,%the%model%traces%the%fixed%points% The correlations between the simulated classical anent correlation The colour theoretically equivalent to SA 0.6~ )%of%the%Hme;dependent%update%funcHon.% nealer and the D-Wave device, shown in the supplemennamics model nd in a pixel tary material, nor does there are significantly worse than between SQA9 and the device. f intermediate Annealing'schedule'of'D/Wave'(approximate)' next provide evidence for the bimodality being due en panel A). )% inWe model’s success probability the with classical e↵ects. Our first evidence comes 2.7 from 13.3 )% to aquantum n the classi(t=0.15) (t=0.20) simulated quantum annealer. When lowering the temper5.6 28.1 provided in [3], which is 1 instances thermal tunnel- (t=0.30) de ature classical an- updates are suppressed, (t=0.175) (t=0.10)quantum ments of [1]. ing dominates thermal barrier crossing, and we observe hedynamics, supplemen# of fixed points steps. a stronger bimodality; indeed a similar bimodal distribetween SQA . However, (empirical data) bout 0.89. bution arises also in an ensemble of (zero-temperature) 28.3 (t=0.40) ! Resume%the transverse%fie evoluHon%is%eq ! The%number%of%branches%is% at%an%already% surprisingly%low,%considering%the% problem%size%(N=108).% ! The%model%indica ! Olen%there%exists%a%strongly% stable%local%configu preferred%branch,%making%the% breaking%their%two process%even%more%determinisHc other%supernodes.% Figure 7, in which we take a snapshot of a typical simulation and plot z-components of the spins, confirms that the above intuition is true. The values of z-components are concentrated around 1 and −1 when there is no transverse field (red line), whereas they are more evenly distributed when there is transverse field (blue line). From an algorithmic viewpoint, it is worth drawing an analogy to spectral graph theory, in particular to the relationship between cuts and eigenvectors of graphs. In spectral graph theory, a cut of a graph is represented by a {−1, 1}-valued vector, while entries of an eigenvector of a graph can take on arbitrary real values. It is a well known fact that there exists a very interesting relationship between the two, which allows us to find a sparse cut in a given graph using information about its eigenvectors [11]. Why does the transverse field cause the model to behave more deterministically? Here we make the observation that a simulation of our model can be viewed as a fixed point iteration procedure which traces the equilibria of the time-dependent Hamiltonian. Note that there is a unique equilibrium at t = 0, namely the configuration where every spin points in the x direction. In fact, it is straightforward to prove that under the annealing schedule of Figure 2 and the assumption that the underlying interaction graph is a subgraph of the Chimera graph, H(t) has a unique equilibrium when t < 0.02. Moreover, we can invoke a theorem in topology [10, 21] to prove that all later equilibria are reachable from this unique equilibrium. If we set the noise term in our model to 0, the evolution of the system is completely deterministic and therefore the same equilibrium will be reached on each run. In this case, simulation results are similar to that of [23] and it fails on too many instances to correlate well with the D-Wave machine. When there is noise, however, different equilibria are reached with different probabilities. Interestingly, our results (Figure 5) show that success probabilities are greatly enhanced with the introduction of noise. To obtain an intuitive understanding of this phenomenon, we designed the following experiment: 1. Up to t = tthres , simulate our model as usual, with noise. 2. At t = tthres , equilibrate the system under H(tthres ). 3. Resume the original simulation from t = tthres , but with the transverse field A(t) completely turned off. Remarkably, we were able to obtain a very high correlation of R ≈ 0.99 between our original model and this modified model using tthres = 0.15. (The correlation with D-Wave One remained at about 0.88.) This result suggests the following picture for our model. In the first part of the evolution, i.e. when t < 0.15, we can understand the system to be making a probabilistic choice of one of the equilibria of H(tthres ). The probability with which each equilibrium is reached is governed by the behavior of the noise model. On the other hand, the second part of the evolution, i.e. when t > 0.15, can be understood as simulated annealing that begins at an effective temperature T /B(t) which is already quite low. This means that it can roughly be thought of as local search with small perturbations which explores a very small neighborhood of the given state. Therefore, this algorithm will 10 succeed if and only if the first part of the evolution brings the system into a state which is sufficiently close to the global optimum. Another result of our simulations was that the number of different equilibria reached in the first part of the process was often surprisingly low. Averaged across all instances, we only saw 20.2 distinct equilibria at t = 0.15, which indicates that the size of the state space explored in the first part of the evolution is not comparable with our usual expectations for a problem of size n = 108. Moreover, in most cases we found that there existed a few equilibria that were strongly preferred to the others, making the state space even smaller in effect and the process more deterministic. Together with the picture suggested in the previous paragraph, this offers an approximate high-level explanation of the bimodal behavior of our model. An interesting question that remains to be studied is what happens at the so-called “branching points,” namely at those values of t where the number of equilibria of H(t) jumps from 1 to 2, 2 to 3, and so on. In other words, what choices are being made at those branching points and in what ways do distinct “branches” differ? A case study on the instance 13-55-29 offers a very interesting insight into this question. The instance has two distinct equilibria at t = 0.054, up to the two-fold symmetry of z-flips of all spins. Figure 8 shows how the two equilibria differ from each other. The figure suggests that the choice being made at this particular branching point is the orientation between the “white” cluster and the “black” cluster. Note that if a wrong branch is selected here, the second part of the process will not be able to correct it, as it will have to flip the entire black cluster. This is because as we already saw, the second part of the process may be thought of as simulated annealing at low temperature, and in this case the number of spins that have to be flipped simultaneously is too large. Similar analysis on other instances and other values of t confirms this intuition; branching points always correspond to points at which the orientation between large clusters gets determined. Moreover, as t increases, we see smaller and smaller clusters become involved in the process. That is, we eventually see branching points which determine the orientation between one or two supernodes and the rest of the spins. From an algorithmic viewpoint, this phenomenon has significant implications. Namely, it indicates that supernodes with highly stable local configurations get fixed early in the process, and therefore the main challenge in finding the global optimum lies in breaking their two-fold symmetry based on interaction with other supernodes. Note that the energy contribution from interactions within a particular supernode remains unchanged if every spin in the supernode is flipped. However, the energy contribution from interactions between supernodes depends crucially on the relative orientation between those supernodes. Since simulated annealing performed in the second part of the process is not able to correct more than a handful of spins at once, the main difficulty of the problem lies in choosing a correct branch in the first part of the process. However, we have seen that different branches roughly correspond to different choice of orientation between supernodes. This observation has an important implication for the computational problem being solved, 11 Figure 8: “Branching” of equilibria (Instance 13-55-29, t = 0.054). Black dots indicate spins for which the sign of z-components differ between the two “branches.” Figure shows the less frequent of the two branches. Blue edge indicates ferromagnetic interaction and red edge antiferromagnetic interaction. because it provides a sense in which the effective problem size may be thought of as closer to the number of supernodes m = 16 rather than the number of spins n = 108. This could be one explanation of the fact that heuristic methods such as simulated annealing perform rather well on this type of problems despite the large problem size. Therefore, to the extent that our classical model accurately represents the dynamics of the D-Wave machine, it is reasonable to conjecture that the D-Wave machine, as well as any other heuristic method, will fail to solve these problems effectively when the number of supernodes m becomes sufficiently large. It is worth emphasizing that the goal of this paper is not to provide a classical model 12 Figure 9: Correlation between simulated quantum annealing of [8] and our classical model. The correlation coefficient R is about 0.99. for the D-Wave machine, but rather to critically examine the claim that there is no simple local classical model that correlates with its input-output behavior. The classical model introduced here is useful for the purposes of studying the large-scale algorithmic features of the D-Wave machine. The task of finding an accurate model for the D-Wave machine (classical, quantum or otherwise), would be better pursued with direct access, not only to programming the D-Wave machine, but also to its actual hardware. 5 Conclusions This paper examined the interesting claims from a recent paper by [7] based on extensive experiments, that the 108 qubit D-Wave One machine exhibits large-scale quantum behavior. The nature of the argument in that paper is algorithmic: a strong correlation was observed between the input-output behavior of the D-Wave machine with a quantum model called quantum simulated annealing, whereas it correlated poorly with two classical models: simulated annealing and classical spin dynamics. In this paper, we suggest a simple classical model, which treats each spin in the D-Wave machine as a classical magnet, which couples to neighboring magnets as well as an external field. The evolution of the system proceeds according to a Metropolis-type dynamics. Simulation of this model on the same instances reported in [7] yields higher correlation with the D-Wave input-output behavior than those of simulated quantum annealing. One way to view our model is as a classical mean field approximation to simulated quantum annealing. A direct comparison between our model and simulated quantum annealing [24] reveals an extremely high correlation of R ≈ 0.99 (Fig 9). Hence, the correlation of the D-Wave data with simulated 13 quantum annealing does not impose any barrier to a classical model for D-Wave. Further analysis of the new model indicates that the dynamics do not exhibit the the kind of combinatorial explosion one might expect from a 108-bit problem. The reason appears to be related to the 8-bit clusters in the interaction graph that appear to have the effect of reducing the effective problem size to 16 rather than 108. To the extent that this model captures the essential algorithmic features of the D-Wave One machine, this suggests that scaling to machines with larger numbers of qubits should lead to increasing problems with combinatorial explosion. Acknowledgments SWS and UV were supported by ARO Grant W911NF-09-1-0440 and NSF Grant CCF0905626. We thank Matthias Troyer, Sergio Boixo, and Troels F. Rønnow for useful discussions and for sharing their unpublished experimental data with us. References [1] S. Aaronson. D-Wave: Truth finally starts to emerge. http://www.scottaaronson. com/blog/?p=1400, 2013. [2] D. Aharonov, M. Ben-Or, and E. Eban. Interactive proofs for quantum computation. In Proceedings of Innovations of Computer Science, pages 453–469, 2010. [3] B. Altshuler, H. Krovi, and J. Roland. Anderson localization casts clouds over adiabatic quantum optimization. Proceedings of the National Academy of Sciences of the United States of America, 107(28):12446–12450, 2010. [4] J. Barrett, L. Hardy, and A. Kent. No signalling and quantum key distribution. Physical Review Letters, 95(010503), 2005. [5] S. Barz, J. F. Fitzsimons, E. Kashefi, and P. Walther. Experimental verification of quantum computation. Nature Physics, 9:727–731, 2013. [6] S. Boixo, T. Albash, F. M. Spedalieri, N. Chancellor, and D. A. Lidar. Experimental signature of programmable quantum annealing. Nature Communications, 4:2067, 2012. [7] S. Boixo, T. F. Rønnow, S. V. Isakov, Z. Wang, D. Wecker, D. A. Lidar, J. M. Martinis, and M. Troyer. Quantum annealing with more than one hundred qubits. E-print arXiv:1304.4595. http://arxiv.org/pdf/1304.4595v2.pdf (An updated version will appear in [8]), 2013. [8] S. Boixo, T. F. Rønnow, S. V. Isakov, Z. Wang, D. Wecker, D. A. Lidar, J. M. Martinis, and M. Troyer. Evidence for quantum annealing with more than one hundred qubits. Nature Physics, in press, 2014. 14 [9] A. Broadbent, J. F. Fitzsimons, and E. Kashefi. Universal blind quantum computation. In Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science, pages 517–526, 2009. [10] F. E. Browder. On continuity of fixed points under deformations of continuous mappings. Summa Brasil Math, 4:183–191, 1960. [11] F. Chung. Spectral Graph Theory. American Mathematical Society, 1992. [12] N. G. Dickson et al. Thermally assisted quantum annealing of a 16-qubit problem. Nature Communications, 4:1903, 2013. [13] E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser. Quantum computation by adiabatic evolution. E-print arXiv:quant-ph/0001106. http://arxiv.org/pdf/ quant-ph/0001106v1.pdf, 2000. [14] J. F. Fitzsimons and E. Kashefi. Unconditionally verifiable blind computation. E-print arXiv:1203.5217. http://arxiv.org/pdf/1203.5217v2, 2012. [15] N. Jones. Google and NASA snap up quantum computer D-Wave Two. http://www.scientificamerican.com/article.cfm?id= google-nasa-snap-up-quantum-computer-dwave-two, 2013. [16] C. C. McGeoch and C. Wang. Experimental evaluation of an adiabiatic quantum system for combinatorial optimization. In Proceedings of the 2013 ACM Conference on Computing Frontiers, 2013. [17] S. Pironio et al. Random numbers certified by Bell’s theorem. Nature, 464:1021–1024, 2010. [18] B. W. Reichardt, F. Unger, and U. Vazirani. Classical command of quantum systems. Nature, 496:456–460, 2013. [19] T. F. Rønnow, Z. Wang, J. Job, S. Boixo, S. V. Isakov, D. Wecker, J. M. Martinis, D. A. Lidar, and M. Troyer. Defining and detecting quantum speedup. E-print arXiv:1401.2910. http://arxiv.org/pdf/1401.2910v1.pdf, 2014. [20] R. Saket. A PTAS for the classical Ising spin glass problem on the Chimera graph structure. E-print arXiv:1306.6943. http://arxiv.org/pdf/1306.6943v2, 2013. [21] H. Schirmer. Fixed point sets of homotopies. 108(1):191–202, 1983. Pacific Journal of Mathematics, [22] A. Selby. D-Wave: comment on comparison with classical computers. http://www.archduke.org/stuff/d-wave-comment-on-comparison/, 2013. [23] J. A. Smolin and G. Smith. Classical signature of quantum annealing. arXiv:1305.4904. http://arxiv.org/pdf/1305.4904v1.pdf, 2013. E-print [24] M. Troyer. Personal communication, 2014. [25] W. van Dam, M. Mosca, and U. Vazirani. How powerful is adiabatic quantum computation? E-print arXiv:0206003. http://arxiv.org/pdf/quant-ph/0206003v1, 2002. 15 [26] U. V. Vazirani and T. Vidick. Certifiable quantum dice - or, exponential randomness expansion. In ACM Symposium on Theory of Computing, 2012. [27] U. V. Vazirani and T. Vidick. Fully device-independent quantum key distribution. In The 5th Innovations in Theoretical Computer Science, 2014. [28] L. Wang, T. F. Rønnow, S. Boixo, S. V. Isakov, Z. Wang, D. Wecker, D. A. Lidar, J. M. Martinis, and M. Troyer. Comment on: “classical signature of quantum annealing”. E-print arXiv:1305.5837. http://arxiv.org/pdf/1305.5837v1.pdf, 2013. 16