Introduction to localization by disorder
The integer quantum Hall effect (the quantization of the transverse conductance in units of $\frac{e^2}{h}$) depends on the existence of disorder, which violates the Lorentz-invariance argument above; for the noninteracting case, it is simple to show that even a periodic potential is insufficient to give the correct behavior, and some sort of disorder is required. We will now discuss the physics of localization by disorder in zero magnetic field, both as an interesting topic in its own right and as a prelude to the discussion of the IQHE and FQHE.
Consider the solutions of Schrödinger's equation for a single electron moving in a random potential. This problem may seem to have little to do with the sort of interacting clean problems we discussed previously, but actually the same field-theoretic methods are useful for both. We will not develop that machinery here; instead we will be content to combine some simple arguments in general dimensionality with a simple calculation, due to Halperin, in one dimension. The surprising feature is that even a weak disorder potential can lead to localization of electronic eigenstates: the early steps by Anderson and Mott in understanding this phenomenon were rewarded by the 1977 Nobel Prize.
Our simple picture for the behavior of eigenstates in a random potential would probably go as follows. At low energy, there should be some bound states near minima of the potential, while at high energy, we expect that there should be some free states, where the electron scatters occasionally off bumps in the potential but is unbound. A mathematical distinction can be made between "localized" eigenstates whose magnitude falls off exponentially at spatial infinity, and "extended" eigenstates which fall off more slowly. (Often the term "critical" is used for wavefunctions that fall off algebraically, and "extended" reserved for wavefunctions like plane waves that do not fall off at all.)
There is a simple argument due to Mott that shows that extended and localized states should not both exist at the same energy in a generic random potential. Therefore there is a "mobility edge": there is a particular energy $E_c$ above which eigenstates are extended, and below which eigenstates are localized. The argument is that if both extended and localized states were present at the same energy, then because the energy denominator vanishes, even a small perturbation would strongly mix the extended and localized states, giving two extended states.
Actually, a more mathematically useful way to proceed with this argument is to consider time evolution of a wavepacket, rather than energy eigenstates. Following the original 1958 paper of Anderson, consider the following tight-binding lattice model for single particles:
$$\begin{equation}
H = \sum_i U_i n_i - t \sum_{\langle ij \rangle} (c^\dagger_i c_j + {\rm h.c.}).
\end{equation}$$
Here the $U_i$ are some random variables sampled from a distribution whose details are relatively unimportant, as long as it falls off exponentially at energies far away from some central value $U_0$.
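To make the model concrete, here is a minimal numerical sketch of the 1D version of this Hamiltonian (Python; the box distribution of width $W$ and all parameter values are illustrative choices, not anything specified above). It builds $H$ as a matrix in the site basis and uses the inverse participation ratio to distinguish localized from extended eigenstates.

```python
import numpy as np

# 1D Anderson model: H = sum_i U_i n_i - t sum_<ij> (c^dag_i c_j + h.c.)
# for a single particle is just an N x N matrix in the site basis.
N, t, W = 400, 1.0, 2.0            # sites, hopping, disorder width (illustrative)
rng = np.random.default_rng(0)
U = W * (rng.random(N) - 0.5)      # random on-site energies U_i in [-W/2, W/2]

H = np.diag(U) - t * (np.eye(N, k=1) + np.eye(N, k=-1))   # open boundary conditions
E, psi = np.linalg.eigh(H)         # columns of psi are the eigenstates

# Inverse participation ratio sum_x |psi(x)|^4: of order 1/N for an extended state,
# of order 1/xi (independent of N) for a state localized on ~xi sites.
ipr = np.sum(np.abs(psi)**4, axis=0)
print("IPR near band center:", ipr[N // 2])
print("IPR of an extended standing wave would be ~", 1.5 / N)
```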
Suppose we start with one electron in the localized state at the origin $O$: $\psi(0) = c^\dagger_O |0\rangle.$ This state is not an eigenstate because of the hopping operator proportional to $t$, so over time the electron density spreads out. If there are some extended states, then one expects that at sufficiently long times, the density spread will be diffusive with some diffusion constant $D$:
$$\begin{equation}
\langle R^2 \rangle = \int |\psi({\bf r})|^2 r^2\,d^d r \sim D t.
\end{equation}$$
However, if all the states in the system are sharply localized, then at long times the density will have ceased to spread:
$$\begin{equation}
\lim_{t \rightarrow \infty} \langle R^2 \rangle = \xi^2.
\end{equation}$$
Here $\xi$ is some quantity with units of length, referred to as the "localization length."
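As a check on this picture, here is a short numerical sketch (same 1D Anderson chain and illustrative parameters as above, with $\hbar = 1$ and time measured in units of $1/t$): the particle is created on the central site and $\langle R^2 \rangle(t)$ is recorded. With disorder present it saturates at a value of order $\xi^2$, whereas for $U_i = 0$ it would keep growing.

```python
import numpy as np

# Wavepacket spreading in the 1D Anderson chain (illustrative parameters, hbar = 1).
N, t, W = 400, 1.0, 2.0
rng = np.random.default_rng(0)
U = W * (rng.random(N) - 0.5)
H = np.diag(U) - t * (np.eye(N, k=1) + np.eye(N, k=-1))
E, V = np.linalg.eigh(H)

x, x0 = np.arange(N), N // 2
phi0 = np.zeros(N); phi0[x0] = 1.0            # electron created at the origin O
c = V.T @ phi0                                 # expansion coefficients in eigenstates
for time in (1, 10, 100, 1000):
    phi_t = V @ (np.exp(-1j * E * time) * c)   # |phi(t)> = sum_n e^{-i E_n t} c_n |n>
    R2 = np.sum((x - x0)**2 * np.abs(phi_t)**2)
    print(f"t = {time:5d}   <R^2> = {R2:.1f}")
```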
It is perhaps worth pointing out that these are not the only two alternatives. It was understood only quite recently that if one adds a magnetic field and works within the lowest Landau level instead of using a lattice model, then $\langle R^2 \rangle \sim t^\alpha$ for some number $\alpha \approx 0.79$, but a calculable understanding of this number is still lacking.
Going back to the question of whether there is a mobility edge between extended and localized states: this picture is essentially correct in three dimensions, and there is a "localization transition" as the Fermi level moves through the mobility edge. However, in one and two dimensions, an amazing sort of quantum interference leads to localization of eigenstates at {\bf all} energies, even for a weak potential. We will see this explicitly in one dimension in the next lecture. Why is dimensionality so important, given that the above mobility edge picture didn't seem to depend on dimensionality?
Well, it is fairly easy to argue that extended states should survive above two dimensions. Consider diffusive propagation of electrons, as we would expect if they are scattering occasionally off potential fluctuations, without becoming localized. (Eigenstates can be "ballistic", like plane waves, "diffusive", like these scattering states, or "localized".)
Diffusive spreading means $\langle R^2 \rangle \sim Dt$: we can think of the electron density at time $t$ as concentrated in a sphere of radius proportional to $\sqrt{D t}$. (More realistically, of course, the probability distribution of the density would be Gaussian.)
Then, normalizing the overall density to 1, we have that the probability for the particle to be near the origin at time $t$ goes as $(D t)^{-d/2}$, where $d$ is the spatial dimensionality, since this is the reciprocal of the sphere's volume. Now we can ask, how many times is the electron expected to have returned to the origin by time $T$? The expected number of returns is
$$\begin{equation}
N = \int_{t_0}^T {1 \over (D t)^{d/2}} dt.
\end{equation}$$
Here we ignore any possible singularity at $t = 0$ (since we know that at short times our assumption of diffusive, Gaussian spreading breaks down anyway) and focus on the long-time behavior. The integral converges as $T \rightarrow \infty$ for $d>2$. So for $d>2$, the electron returns only a finite number of times to any particular fluctuation; if the fluctuations are weak enough, then the electron will not be localized, since different fluctuations are independent for a random potential with only short-ranged correlations.
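A quick Monte Carlo check of this counting argument (purely illustrative; the walker numbers and times are arbitrary): count how often a simple random walk on the $d$-dimensional cubic lattice revisits its starting point within $T$ steps. The mean count keeps growing with $T$ for $d = 1, 2$ but saturates for $d = 3$, mirroring the convergence of the integral above.

```python
import numpy as np

# Expected number of returns to the origin for a lattice random walk in d dimensions.
rng = np.random.default_rng(1)

def mean_returns(d, T, walkers=1000):
    pos = np.zeros((walkers, d), dtype=int)
    returns = np.zeros(walkers)
    for _ in range(T):
        axis = rng.integers(0, d, size=walkers)     # which coordinate moves
        step = rng.choice([-1, 1], size=walkers)    # and in which direction
        pos[np.arange(walkers), axis] += step
        returns += np.all(pos == 0, axis=1)         # did the walker return?
    return returns.mean()

for d in (1, 2, 3):
    print(f"d = {d}:", [round(mean_returns(d, T), 2) for T in (1000, 4000, 16000)])
# d = 1, 2: the count keeps growing with T; d = 3: it saturates at a finite value.
```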
What about in dimensionality 2 or below? Then a diffusive electron would see some potential fluctuations over and over again, which creates the possibility of constructive or destructive interference of electron waves. Note that localization depends crucially on phase coherence; destruction of phase coherence by inelastic scattering (off of phonons, for instance) leads to delocalization. An argument that constructive interference
leads to localization is as follows: consider two paths in the Feynman path integral over classical histories for $\Psi$ which differ in that one interior loop is traversed in one direction (say clockwise) by the first path, and in the opposite direction by the second path. In zero magnetic field, for any realization of the disorder these two paths contribute equal amplitudes. We have, writing $\psi_1$ and $\psi_2$ for the contributions from the two paths, a sort of constructive interference for such paths:
$$\begin{equation}
\langle |\Psi|^2 \rangle = \langle |\psi_1 + \psi_2|^2 \rangle = \langle |2 \psi_1|^2
\rangle = 4 \langle |\psi_1|^2 \rangle > \langle |\psi_1|^2 \rangle + \langle |\psi_2|^2 \rangle.
\end{equation}$$
This implies that paths with more self-intersections have relatively greater probability than paths with fewer self-intersections. Since total probability is conserved, this means that the electron is likely to stay close to the origin, since paths with many self-intersections tend to be less spatially extended. A detailed calculation showing that in two dimensions such interference leads to localization of all eigenstates is quite complicated; here we will be content to show localization in 1D.
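The factor-of-two enhancement behind this argument is easy to check numerically. The tiny sketch below is purely schematic (random unit-modulus amplitudes stand in for path amplitudes): a loop paired with its time-reversed partner, which carries the identical amplitude, is compared against two unrelated paths with independent random phases.

```python
import numpy as np

# Coherent (time-reversed pair) vs. incoherent (independent phases) return probability.
rng = np.random.default_rng(2)
M = 100000
psi1 = np.exp(1j * 2 * np.pi * rng.random(M))   # amplitude of a loop, random phase
psi2 = np.exp(1j * 2 * np.pi * rng.random(M))   # an unrelated path, independent phase

coherent   = np.mean(np.abs(psi1 + psi1) ** 2)  # time-reversed partner: same amplitude
incoherent = np.mean(np.abs(psi1 + psi2) ** 2)  # independent paths
print(coherent, incoherent)                     # ~4 vs ~2
```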
The term "weak localization" is used to describe perturbative calculations (perturbative in the strength of disorder) of incipient localization; then the same Green's functions techniques can be used as introduced before, although one has to calculate the two-body rather than the one-body Green's function.
We argued, based simply on the return probability of random walks, that weak disorder could not localize all states in dimensions greater than 2, because the expected number of returns to any individual point for $d>2$ is finite. In dimensions $d \leq 2$, it is at least possible that quantum interference leads to localization, and we gave an argument that such quantum interference is constructive for paths which have self-intersections. Since paths with many self-intersections are likely to stay closer to the origin than paths with few self-intersections, this shows that quantum mechanics should at least tend to localize states, although whether this interference is actually strong enough to localize all eigenstates requires a much stronger argument. A good reference for localization is the review article of P. A. Lee and T. V. Ramakrishnan, RMP {\bf 57}, 287 (1985).
Here we will start by giving a scaling argument which explains in another way why $d=2$ is special. The formal version of this argument is made using the renormalization group, but the basic idea is quite simple. Let us try to understand the behavior of the function $g(L)$ which gives the {\bf conductance} (not conductivity) of some material in a cube of side $L$. Our goal will be, given some initial value $g(L_0)$ at a short length scale $L_0$, to understand what happens when we go to larger scales; this is normally parametrized in terms of the $\beta$-function
$$\begin{equation}
\beta = {d\log g \over d\log L}.
\end{equation}$$
Suppose first of all that a scattering picture is correct: noninteracting electrons in the material move diffusively (rather than being localized or ballistic), and Ohm's law is satisfied. Then the conductance, once the cube is larger than the mean free path $l$, should go as
$$\begin{equation}
g(L) \approx \sigma L^{d-2}
\end{equation}$$
where $\sigma$ is the conductivity. Already we can see that $d=2$ is marginal: only in $d=2$ does it seem possible to have a scale-invariant conductance (we come soon to the quantization of the Hall conductance in units of $e^2/h$). In $d=1$, however, there is also a sort of conductance quantization in units of $e^2/h$; the resolution of this apparent paradox is that the finite 1D conductance results purely from the contacts, while transport is quasi-ballistic in the bulk of the system. We may say a bit more about this later.
Suppose now that instead of having diffusive electron motion, all electrons are in localized states. How then should $g$ behave? Well, if the longest localization length is $\xi$, with $\xi \gg l$ (the localization length is always longer than the mean free path), then for $L \gg \xi$ we expect
$$\begin{equation}
g(L) \sim \exp(-L/\xi).
\end{equation}$$
Here we are ignoring possible power-law factors which will be dominated for large $L$ by the exponential. For a particular realization of disorder, we expect that the microscopic conductance "flows" from its initial value $g(L_0)$ with increasing $L$ until reaching one of the above two asymptotic regimes. The challenge is now to justify this picture and understand the importance of dimensionality.
The main step taken by Abrahams et al. in 1979 was to conjecture that $\beta = {d \log g \over d \log L}$ can be taken to be only a function of $g$. We might think that other properties such as $L$, the details of disorder, etc. would be important, but at least in the long-length-scale limit, it seems that $\beta$ is indeed a function of $g$ alone: this is known as "one-parameter scaling". We write $\beta(g)$ henceforth to emphasize this.
The somewhat more rigorous justification of the above is made using the connection between the conductance and what is known as the Thouless ratio, which measures the change in energy of a block of size $L$ under a change in its boundary conditions. First, it is natural to say that the conductance of a large block of size $2L$ on each side can be determined from a knowledge of the electronic eigenstates inside each of the $2^d$ blocks of size $L$ within the large block. If we wanted to explicitly construct trial eigenstates for the large block from eigenstates of the small blocks (which form a complete set), we could use perturbation theory. The amount of admixture of small-block eigenstates with each other can be estimated as the ratio $O / \delta W$, where $O$ is an overlap integral and $\delta W$ is an energy spacing.
Here it is a little subtle what the overlap $O$ means, since we might think of eigenstates vanishing sharply at the boundary of the $L^d$ system. However, this would require an infinite potential at the boundary; in fact the eigenstates will go slightly outside the block, and it is the overlap of these tails with the states of the neighboring block that we want to estimate.
To be more precise, note that if we took a sample of size $L^d$ and repeated it in one direction to form an infinite chain, each individual eigenvalue would be broadened into a band, and the bandwidth can be used to estimate the overlap integral. A little thought will convince you that the bandwidth can be written as the change in energy between periodic and antiperiodic boundary conditions on one $L^d$ cube!
Localized states will have very small overlap (assuming that they are localized within the bulk of the $L^d$ cube), while extended states will have significant overlap.
Introducing the dimensionless conductance $g = G / (e^2/h)$, Thouless argued that this conductance at scale $L$ is a function of the overlap ratio estimated above:
$$\begin{equation}
g(L) = f\!\left({\Delta E \over \delta W}\right).
\end{equation}$$
Here $\Delta E$ is the change in energy of eigenstates at the Fermi level between periodic and antiperiodic boundary conditions. In one dimension it can be shown explicitly that the unknown function $f(x)$ is quadratic: $f(x) = c x^2$.
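The construction is easy to carry out numerically for the 1D Anderson chain. The sketch below uses illustrative parameters, and matching levels by their index near the band center is a crude but adequate estimate: $\Delta E$ is taken as the shift of the level closest to $E = 0$ when the boundary hopping changes sign, and $\delta W$ as the mean level spacing. The ratio collapses rapidly with $L$ when the states are localized.

```python
import numpy as np

# Thouless ratio Delta E / delta W for the 1D Anderson chain (illustrative parameters).
rng = np.random.default_rng(3)

def thouless_ratio(L, W, t=1.0):
    U = W * (rng.random(L) - 0.5)
    H = np.diag(U) - t * (np.eye(L, k=1) + np.eye(L, k=-1))
    Hp, Ha = H.copy(), H.copy()
    Hp[0, -1] = Hp[-1, 0] = -t        # periodic boundary conditions
    Ha[0, -1] = Ha[-1, 0] = +t        # antiperiodic: boundary hopping flips sign
    Ep, Ea = np.linalg.eigvalsh(Hp), np.linalg.eigvalsh(Ha)
    n = np.argmin(np.abs(Ep))          # level nearest the band center
    dE = abs(Ep[n] - Ea[n])            # shift under the change of boundary condition
    dW = (Ep[-1] - Ep[0]) / L          # mean level spacing
    return dE / dW

for L in (50, 100, 200, 400):
    print(L, np.mean([thouless_ratio(L, W=2.0) for _ in range(20)]))
```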
The idea of one-parameter scaling is that the properties of the system on scale $2L$ are determined by the effective level of disorder at scale $L$, and that the dimensionless conductance $g$ is a sufficient measure of this disorder: two types of microscopic disorder that give rise to the same conductance $g$ at a large scale $L$ then are predicted to give the same conductance at all larger length scales. This is still an assumption that needs to be tested, but one-dimensional calculations support this picture. It is also possible to justify some of the above using perturbation theory in the disorder strength, which is known as "weak localization" theory.
Returning to our above guesses for the asymptotic form of $\beta$, we now have
deep in the diffusive regime (high conductance, $g \gg 1$, $\log g > 0$)
$$\begin{equation}
\beta(g) = {d \log g \over d \log L} = (d-2).
\end{equation}$$
Deep in the localized regime (low conductance, $g \ll 1$, $\log g < 0$), we have
$$\begin{equation}
\beta(g) = {d \log g \over d \log L} = (- L / \xi) \approx \log g,
\end{equation}$$
which is negative in any dimension.
The point of one-parameter scaling is that now we can make a plot of $\beta(g)$ vs. $\log g$ and argue based on continuity that $d=1$ and $d=2$ are very different from $d=3$. If more parameters were needed (more dimensions to the plot), as occurs in a strong magnetic field, then the situation would be more complicated.
In $d=1$ and $d=2$, the simplest continuity assumption is that $\beta(g)$ is negative for all finite $g$, since it is negative as $g \rightarrow 0^+$ and zero or negative as $g \rightarrow \infty$.
In $d=3$, we have a more complicated situation because $\beta(g)$ must have a zero. Above this critical point, the flow is to the diffusive regime; below this critical point, the flow is to a localized regime. This picture corresponds roughly to our intuitive idea of a "mobility edge" separating extended and localized states in three dimensions.
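One can see how this plot organizes the flow by integrating the scaling equation with any smooth $\beta(g)$ that interpolates between the two asymptotic forms. The interpolation used below is an assumption chosen purely for illustration (it is not the Abrahams et al. result): $\beta(g) = \log\frac{g}{1+g} + (d-2)\frac{g}{1+g}$, which tends to $\log g$ for $g \ll 1$ and to $d-2$ for $g \gg 1$.

```python
import numpy as np

# Integrate d(log g)/d(log L) = beta(g) for an illustrative interpolating beta(g).
def beta(g, d):
    return np.log(g / (1 + g)) + (d - 2) * g / (1 + g)

def flow(g0, d, steps=500, dlogL=0.01):
    logg = np.log(g0)
    for _ in range(steps):
        logg += beta(np.exp(logg), d) * dlogL
    return np.exp(logg)

for d in (1, 2, 3):
    print(f"d = {d}:", [f"{flow(g0, d):.3g}" for g0 in (0.5, 2.0, 8.0)])
# d = 1, 2: every starting conductance flows toward g = 0 (localization);
# d = 3: small g flows to zero, large g grows; the unstable fixed point in between
# plays the role of the mobility edge in the scaling picture.
```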
A brief aside on why quenched disorder is hard: the impurity problems discussed here involve both a quantum-mechanical average and a disorder average, and the disorder average is not over a thermal distribution. The term "quenched disorder", from metallurgy, is used to describe random nonthermal disorder, while "annealed disorder" refers to disorder that can be treated simply as another thermal variable. Quenched disorder is very difficult to handle from a field-theory point of view essentially because, if for instance one wants to calculate the free energy, it requires disorder-averaging $\log Z$ rather than $Z$, where $Z$ is the partition function. That is, what one needs to describe the physical free energy of a disordered system is
$$\begin{equation}
\langle \log Z \rangle \not = \log \langle Z \rangle,
\end{equation}$$
but what is on the right side is much easier to calculate. Put in terms of correlation functions, what we need is
$$\begin{equation}
\left\langle {T\{\Psi_\alpha(t_1,x_1) \Psi_\beta(t_2,x_2) S \} \over S} \right\rangle
\not =
{\langle T\{\Psi_\alpha(t_1,x_1) \Psi_\beta(t_2,x_2) S \}\rangle \over \langle S \rangle}.
\end{equation}$$
(To convince yourself that a ratio of averages differs from an average of ratios, compute $\langle x^2 \rangle / \langle x \rangle$ for some smooth distribution and show that it need not equal $\langle x^2 / x \rangle = \langle x \rangle$.)
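A one-line numerical check makes the distinction vivid; here $Z$ is just a positive random variable standing in for a disorder-dependent partition function, and the lognormal choice is purely illustrative.

```python
import numpy as np

# <log Z> versus log <Z> for an ensemble of random "partition functions".
rng = np.random.default_rng(4)
Z = rng.lognormal(mean=0.0, sigma=1.0, size=100000)
print("<log Z> =", np.mean(np.log(Z)))   # ~0.0 for this ensemble
print("log <Z> =", np.log(np.mean(Z)))   # ~0.5 (= sigma^2 / 2 for a lognormal)
```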
Many deep theoretical methods have been developed to understand such quenched averages: the most famous are the "replica trick" and supersymmetry methods. The replica trick is very general: simply write
$$\begin{equation}
\log Z = \lim_{n\rightarrow 0} {(Z^n - 1) \over n},
\end{equation}$$
attempt to calculate the disorder average $\langle Z^n \rangle$ for positive integer $n$, and analytically continue down to $n=0$ (the identity follows from $Z^n = e^{n \log Z} = 1 + n \log Z + O(n^2)$). Sometimes this continuation works, and sometimes it runs into deep mathematical issues such as "replica symmetry breaking," which remain a subject of active debate.
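On the same toy ensemble used above one can watch the replica limit work numerically: since $\langle Z^n \rangle = \langle e^{n \log Z} \rangle = 1 + n \langle \log Z \rangle + O(n^2)$, the quantity $(\langle Z^n \rangle - 1)/n$ approaches $\langle \log Z \rangle$ as $n \rightarrow 0$.

```python
import numpy as np

# (<Z^n> - 1)/n  ->  <log Z>  as n -> 0, checked on the lognormal toy ensemble.
rng = np.random.default_rng(5)
Z = rng.lognormal(mean=0.0, sigma=1.0, size=100000)
for n in (0.5, 0.1, 0.01, 0.001):
    print(n, (np.mean(Z ** n) - 1) / n)
print("<log Z> =", np.mean(np.log(Z)))   # the n -> 0 values approach this
```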