- Holographic quantum criticality from multi-trace deformations We explore the consequences of multi-trace deformations in applications of gauge-gravity duality to condensed matter physics. We find that they introduce a powerful new "knob" that can implement spontaneous symmetry breaking, and can be used to construct a new type of holographic superconductor. This knob can be tuned to drive the critical temperature to zero, leading to a new quantum critical point. We calculate nontrivial critical exponents, and show that fluctuations of the order parameter are `locally' quantum critical in the disordered phase. Most notably the dynamical critical exponent is determined by the dimension of an operator at the critical point. We argue that the results are robust against quantum corrections and discuss various generalizations. 3 authors · Aug 9, 2010
1 Critical yielding rheology: from externally deformed glasses to active systems In the last decade many research efforts have focused on understanding the rheology of disordered materials, and several theoretical predictions have been put forward regarding their yielding behavior. Nevertheless, few experiments or molecular dynamics simulations have been dedicated to testing those theoretical predictions. Here we use computer simulations to study the yielding transition under two different loading schemes: standard simple shear dynamics, and self-propelled, dense active systems. In the active systems a yielding transition is observed, as expected, when the self-propulsion is increased. However, the range of self-propulsions in which a pure liquid regime exists appears to vanish upon approaching the so-called "jamming point" at which solidity of soft-sphere packings is lost. Such an "active yielding" transition shares similarities with the generic yielding transition for shear flows. A Herschel-Bulkley law is observed in both loading scenarios, with a clear difference in the critical scaling exponents between the two, suggesting the existence of different universality classes for the yielding transition under different driving conditions. In addition, we present direct measurements of length and time scales for both driving scenarios. A comparison with theoretical predictions from recent literature reveals poor agreement with our numerical results. 2 authors · Jan 28, 2021
- Gravity Duals of Lifshitz-like Fixed Points We find candidate macroscopic gravity duals for scale-invariant but non-Lorentz invariant fixed points, which do not have particle number as a conserved quantity. We compute two-point correlation functions which exhibit novel behavior relative to their AdS counterparts, and find holographic renormalization group flows to conformal field theories. Our theories are characterized by a dynamical critical exponent $z$, which governs the anisotropy between spatial and temporal scaling $t \to \lambda^z t$, $x \to \lambda x$; we focus on the case with $z=2$. Such theories describe multicritical points in certain magnetic materials and liquid crystals, and have been shown to arise at quantum critical points in toy models of the cuprate superconductors. This work can be considered a small step towards making useful dual descriptions of such critical points. 3 authors · Aug 13, 2008
- Towards strange metallic holography We initiate a holographic model building approach to `strange metallic' phenomenology. Our model couples a neutral Lifshitz-invariant quantum critical theory, dual to a bulk gravitational background, to a finite density of gapped probe charge carriers, dually described by D-branes. In the physical regime of temperature much lower than the charge density and gap, we exhibit anomalous scalings of the temperature and frequency dependent conductivity. Choosing the dynamical critical exponent $z$ appropriately we can match the non-Fermi liquid scalings, such as linear resistivity, observed in strange metal regimes. As part of our investigation we outline three distinct string theory realizations of Lifshitz geometries: from F theory, from polarised branes, and from a gravitating charged Fermi gas. We also identify general features of renormalisation group flow in Lifshitz theories, such as the appearance of relevant charge-charge interactions when $z \geq 2$. We outline a program to extend this model building approach to other anomalous observables of interest such as the Hall conductivity. 4 authors · Dec 5, 2009
- Comments on Fermi Liquid from Holography We investigate the signatures of Fermi liquid formation in the $N=4$ super Yang-Mills theory coupled to a fundamental hypermultiplet at nonvanishing chemical potential for the global $U(1)$ vector symmetry. At strong 't Hooft coupling the system can be analyzed in terms of the D7 brane dynamics in the $AdS_5 \times S^5$ background. The phases with vanishing and finite charge density are separated at zero temperature by a quantum phase transition. In the case of vanishing hypermultiplet mass, Karch, Son and Starinets discovered a gapless excitation whose speed equals the speed of sound. We find that this zero sound mode persists to all values of the hypermultiplet mass, and its speed vanishes at the point of phase transition. The value of the critical exponent and the ratio of the velocities of zero and first sounds are consistent with the predictions of Landau Fermi liquid theory at strong coupling. 2 authors · Aug 28, 2008
- Examples of renormalization group transformations for image sets Using the example of configurations generated with the worm algorithm for the two-dimensional Ising model, we propose renormalization group (RG) transformations, inspired by the tensor RG, that can be applied to sets of images. We relate criticality to the logarithmic divergence of the largest principal component. We discuss the changes in link occupation under the RG transformation, suggest ways to obtain data collapse, and compare with the two state tensor RG approximation near the fixed point. 4 authors · Jul 26, 2018
- Variational integrals on Hessian spaces: partial regularity for critical points We develop regularity theory for critical points of variational integrals defined on Hessian spaces of functions on open, bounded subdomains of $\mathbb{R}^n$, under compactly supported variations. The critical point solves a fourth order nonlinear equation in double divergence form. We show that for smooth convex functionals, a $W^{2,\infty}$ critical point with bounded Hessian is smooth provided that its Hessian has a small bounded mean oscillation (BMO). We deduce that the interior singular set of a critical point has Hausdorff dimension at most $n-p_0$, for some $p_0 \in (2,3)$. We state some applications of our results to variational problems in Lagrangian geometry. Finally, we use the Hamiltonian stationary equation to demonstrate the importance of our assumption on the a priori regularity of the critical point. 2 authors · Jul 3, 2023
- Instability of the solitary waves for the Generalized Benjamin-Bona-Mahony Equation In this work, we consider the generalized Benjamin-Bona-Mahony equation $\partial_t u+\partial_x u+\partial_x(|u|^p u)-\partial_t \partial_x^{2}u=0$, $(t,x) \in \mathbb{R} \times \mathbb{R}$, with $p>4$. This equation has the traveling wave solutions $\phi_{c}(x-ct)$, for any frequency $c>1$. It has been proved by Souganidis and Strauss [Strauss-1990] that there exists a number $c_{0}(p)>1$ such that solitary waves $\phi_{c}(x-ct)$ with $1<c<c_{0}(p)$ are orbitally unstable, while for $c>c_{0}(p)$, $\phi_{c}(x-ct)$ is orbitally stable. The linear exponential instability in the former case was further proved by Pego and Weinstein [Pego-1991-eigenvalue]. In this paper, we prove the orbital instability in the critical case $c=c_{0}(p)$. 2 authors · Sep 1, 2023
- Sharp seasonal threshold property for cooperative population dynamics with concave nonlinearities We consider a biological population whose environment varies periodically in time, exhibiting two very different "seasons": one is favorable and the other one is unfavorable. For monotone differential models with concave nonlinearities, we address the following question: the system's period being fixed, under what conditions does there exist a critical duration for the unfavorable season? By "critical duration" we mean that above some threshold, the population cannot sustain itself and goes extinct, while below this threshold, the system converges to a unique periodic and positive solution. We term this a "sharp seasonal threshold property" (SSTP, for short). Building upon a previous result, we obtain sufficient conditions for SSTP in any dimension and apply our criterion to a two-dimensional model featuring juvenile and adult populations of insects. 2 authors · Apr 20, 2018
- New type of solutions for a critical Grushin-type problem with competing potentials In this paper, we consider a critical Grushin-type problem with double potentials. By applying the reduction argument and local Pohozaev identities, we construct a new family of solutions to this problem, which are concentrated at points lying on the top and the bottom circles of a cylinder. 2 authors · Jun 29, 2024
- Building an AdS/CFT superconductor We show that a simple gravitational theory can provide a holographically dual description of a superconductor. There is a critical temperature, below which a charged condensate forms via a second order phase transition and the (DC) conductivity becomes infinite. The frequency dependent conductivity develops a gap determined by the condensate. We find evidence that the condensate consists of pairs of quasiparticles. 3 authors · Mar 22, 2008
- The Slepian model based independent interval approximation of persistency and zero-level exceedance distributions In physics and engineering literature, the distribution of the excursion-above-zero time (exceedance distribution) for a stationary Gaussian process has been approximated by a stationary switching process with independently distributed switching times. The approach matched the covariance of the clipped Gaussian process with the one for the stationary switching process and the distribution of the latter was used as the so-called independent interval approximation (IIA). The approach successfully assessed the persistency exponent for many physically important processes but left unanswered the question of when such an approach leads to a mathematically meaningful and proper exceedance distribution. Here we address this question by proposing an alternative matching of the expected values of the clipped Slepian process and the corresponding switched process initiated at the origin. The method has allowed resolving the mathematical correctness of the matching method for a large subclass of the Gaussian processes with monotonic covariance, for which we provide a sufficient condition for the validity of the IIA. Within this class, the IIA produces a valid distribution for the excursion time and is represented in an explicit stochastic form that connects directly to the covariance of the underlying Gaussian process. We compare the excursion level distributions as well as the corresponding persistency exponents obtained through the IIA method with numerically computed exact distributions, and the simulated distribution for several important Gaussian models. We also argue that for stationary Gaussian processes with a non-monotonic covariance, the IIA fails and should not be used. 2 authors · Jan 3, 2024
- Convergence of local times of stochastic processes associated with resistance forms In this paper, it is shown that if a sequence of resistance metric spaces equipped with measures converges with respect to the local Gromov-Hausdorff-vague topology, and certain non-explosion and metric-entropy conditions are satisfied, then the associated stochastic processes and their local times also converge. The metric-entropy condition can be checked by applying volume estimates of balls. Whilst similar results have been proved previously, the approach of this article is more widely applicable. Indeed, we recover various known conclusions for scaling limits of some deterministic self-similar fractal graphs, critical Galton-Watson trees, the critical Erdős-Rényi random graph and the configuration model (in the latter two cases, we prove for the first time the convergence of the models with respect to the resistance metric and also, for the configuration model, we overcome an error in the existing proof of local time convergence). Moreover, we derive new ones for scaling limits of uniform spanning trees and random recursive fractals. The metric-entropy condition also implies convergence of associated Gaussian processes. 1 authors · May 22, 2023
- Condensed matter and AdS/CFT I review two classes of strong coupling problems in condensed matter physics, and describe insights gained by application of the AdS/CFT correspondence. The first class concerns non-zero temperature dynamics and transport in the vicinity of quantum critical points described by relativistic field theories. I describe how relativistic structures arise in models of physical interest, present results for their quantum critical crossover functions and magneto-thermoelectric hydrodynamics. The second class concerns symmetry breaking transitions of two-dimensional systems in the presence of gapless electronic excitations at isolated points or along lines (i.e. Fermi surfaces) in the Brillouin zone. I describe the scaling structure of a recent theory of the Ising-nematic transition in metals, and discuss its possible connection to theories of Fermi surfaces obtained from simple AdS duals. 1 authors · Feb 16, 2010
1 Critical scaling law for the deposition efficiency of inertia-driven particle collisions with a cylinder in high Reynolds number air flow The Earth's atmosphere is an aerosol: it contains suspended particles. When air flows over an obstacle such as an aircraft wing or tree branch, these particles may not follow the same paths as the air flowing around the obstacle. Instead the particles in the air may deviate from the path of the air and so collide with the surface of the obstacle. It is known that particle inertia can drive this deposition, and that there is a critical value of this inertia, below which no point particles deposit. Particle inertia is measured by the Stokes number, $St$. We show that near the critical value of the Stokes number, $St_c$, the amount of deposition has the unusual scaling law of $\exp(-1/(St-St_c)^{1/2})$. The scaling is controlled by the stagnation point of the flow. This scaling is determined by the time for the particle to reach the surface of the cylinder varying as $1/(St-St_c)^{1/2}$, together with the distance away from the stagnation point (perpendicular to the flow direction) increasing exponentially with time. The scaling law applies to inviscid flow, a model for flow at high Reynolds numbers. The unusual scaling means that the amount of particles deposited increases only very slowly above the critical Stokes number. This has consequences for applications ranging from rime formation and fog harvesting to pollination. 2 authors · Jan 3, 2023
- Feature Learning and Generalization in Deep Networks with Orthogonal Weights Fully-connected deep neural networks with weights initialized from independent Gaussian distributions can be tuned to criticality, which prevents the exponential growth or decay of signals propagating through the network. However, such networks still exhibit fluctuations that grow linearly with the depth of the network, which may impair the training of networks with width comparable to depth. We show analytically that rectangular networks with tanh activations and weights initialized from the ensemble of orthogonal matrices have corresponding preactivation fluctuations which are independent of depth, to leading order in inverse width. Moreover, we demonstrate numerically that, at initialization, all correlators involving the neural tangent kernel (NTK) and its descendants at leading order in inverse width -- which govern the evolution of observables during training -- saturate at a depth of $\sim 20$, rather than growing without bound as in the case of Gaussian initializations. We speculate that this structure preserves finite-width feature learning while reducing overall noise, thus improving both generalization and training speed. We provide some experimental justification by relating empirical measurements of the NTK to the superior performance of deep nonlinear orthogonal networks trained under full-batch gradient descent on the MNIST and CIFAR-10 classification tasks. 3 authors · Oct 11, 2023
- 4+3 Phases of Compute-Optimal Neural Scaling Laws We consider the solvable neural scaling model with three parameters: data complexity, target complexity, and model-parameter-count. We use this neural scaling model to derive new predictions about the compute-limited, infinite-data scaling law regime. To train the neural scaling model, we run one-pass stochastic gradient descent on a mean-squared loss. We derive a representation of the loss curves which holds over all iteration counts and improves in accuracy as the model parameter count grows. We then analyze the compute-optimal model-parameter-count, and identify 4 phases (+3 subphases) in the data-complexity/target-complexity phase-plane. The phase boundaries are determined by the relative importance of model capacity, optimizer noise, and embedding of the features. We furthermore derive, with mathematical proof and extensive numerical evidence, the scaling-law exponents in all of these phases, in particular computing the optimal model-parameter-count as a function of floating point operation budget. 4 authors · May 23, 2024
- Symmetric Single Index Learning Few neural architectures lend themselves to provable learning with gradient based methods. One popular model is the single-index model, in which labels are produced by composing an unknown linear projection with a possibly unknown scalar link function. Learning this model with SGD is relatively well-understood, whereby the so-called information exponent of the link function governs a polynomial sample complexity rate. However, extending this analysis to deeper or more complicated architectures remains challenging. In this work, we consider single index learning in the setting of symmetric neural networks. Under analytic assumptions on the activation and maximum degree assumptions on the link function, we prove that gradient flow recovers the hidden planted direction, represented as a finitely supported vector in the feature space of power sum polynomials. We characterize a notion of information exponent adapted to our setting that controls the efficiency of learning. 2 authors · Oct 3, 2023
1 Explaining Neural Scaling Laws The population loss of trained deep neural networks often follows precise power-law scaling relations with either the size of the training dataset or the number of parameters in the network. We propose a theory that explains the origins of and connects these scaling laws. We identify variance-limited and resolution-limited scaling behavior for both dataset and model size, for a total of four scaling regimes. The variance-limited scaling follows simply from the existence of a well-behaved infinite data or infinite width limit, while the resolution-limited regime can be explained by positing that models are effectively resolving a smooth data manifold. In the large width limit, this can be equivalently obtained from the spectrum of certain kernels, and we present evidence that large width and large dataset resolution-limited scaling exponents are related by a duality. We exhibit all four scaling regimes in the controlled setting of large random feature and pretrained models and test the predictions empirically on a range of standard architectures and datasets. We also observe several empirical relationships between datasets and scaling exponents under modifications of task and architecture aspect ratio. Our work provides a taxonomy for classifying different scaling regimes, underscores that there can be different mechanisms driving improvements in loss, and lends insight into the microscopic origins of and relationships between scaling exponents. 5 authors · Feb 12, 2021
- Model scale versus domain knowledge in statistical forecasting of chaotic systems Chaos and unpredictability are traditionally synonymous, yet large-scale machine learning methods recently have demonstrated a surprising ability to forecast chaotic systems well beyond typical predictability horizons. However, recent works disagree on whether specialized methods grounded in dynamical systems theory, such as reservoir computers or neural ordinary differential equations, outperform general-purpose large-scale learning methods such as transformers or recurrent neural networks. These prior studies perform comparisons on few individually-chosen chaotic systems, thereby precluding robust quantification of how statistical modeling choices and dynamical invariants of different chaotic systems jointly determine empirical predictability. Here, we perform the largest to-date comparative study of forecasting methods on the classical problem of forecasting chaos: we benchmark 24 state-of-the-art forecasting methods on a crowdsourced database of 135 low-dimensional systems with 17 forecast metrics. We find that large-scale, domain-agnostic forecasting methods consistently produce predictions that remain accurate up to two dozen Lyapunov times, thereby accessing a new long-horizon forecasting regime well beyond classical methods. We find that, in this regime, accuracy decorrelates with classical invariant measures of predictability like the Lyapunov exponent. However, in data-limited settings outside the long-horizon regime, we find that physics-based hybrid methods retain a comparative advantage due to their strong inductive biases. 1 authors · Mar 12, 2023
- Parabolic-elliptic and indirect-direct simplifications in chemotaxis systems driven by indirect signalling Under relevant biological situations of the signalling process on a much faster time scale compared to the species diffusion and all interactions, we study two singular limits corresponding to $\varepsilon\to 0^+$ with a fixed $\tau>0$, and $(\varepsilon,\tau)\to (0^+,0^+)$ arising in the following indirect signalling chemotaxis system with no-flux across the boundary \begin{align*} \left\{\begin{array}{ll} \partial_t n=\Delta n-\nabla\cdot(n\nabla c) & \text{in } \Omega\times(0,\infty),\\ \varepsilon\partial_t c=\Delta c-c+w & \text{in } \Omega\times(0,\infty),\\ \varepsilon\partial_t w=\tau\Delta w-w+n & \text{in } \Omega\times(0,\infty),\\ (n,c,w)_{t=0}=(n_0,c_0,w_0) & \text{on } \Omega, \end{array}\right. \end{align*} up to the critical dimension $N=4$, called parabolic-elliptic and indirect-direct simplifications, respectively. We provide rigorous analysis for these simplifications, including passage to the limits, convergence rate estimates with the initial layer effect, and the convergence to critical manifolds. 4 authors · Aug 2
- Phase-space analysis of the viscous fluid cosmological models in the coincident f(Q) gravity In this article, we consider a newly proposed parameterization of the viscosity coefficient $\zeta$, specifically $\zeta=\zeta_0 {\Omega^s_m} H$, where $\zeta_0 = \zeta_0{{\Omega^s_{m_0}}}$ within the coincident $f(Q)$ gravity formalism. We consider a non-linear function $f(Q)= -Q +\alpha Q^n$, where $\alpha$ and $n$ are arbitrary model parameters, which is a power-law correction to the STEGR scenario. We find an autonomous system by invoking the dimensionless density parameters as the governing phase-space variables. We discuss the physical significance of the model corresponding to the parameter choices $n=-1$ and $n=2$ along with the exponent choices $s=0$, $0.5$, and $1.05$. We find that model I shows the stable de-Sitter type or stable phantom type (depending on the choice of exponent $s$) behavior with no transition epoch, whereas model II shows the evolutionary phase from the radiation epoch to the accelerated de-Sitter epoch via passing through the matter-dominated epoch. Hence, we conclude that model I provides a good description of the late-time cosmology but fails to describe the transition epoch, whereas model II modifies the description in the context of the early universe and provides a good description of the matter and radiation era along with the transition phase. 3 authors · Jan 9, 2024
- Critical Learning Periods Emerge Even in Deep Linear Networks Critical learning periods are periods early in development where temporary sensory deficits can have a permanent effect on behavior and learned representations. Despite the radical differences between biological and artificial networks, critical learning periods have been empirically observed in both systems. This suggests that critical periods may be fundamental to learning and not an accident of biology. Yet, why exactly critical periods emerge in deep networks is still an open question, and in particular it is unclear whether the critical periods observed in both systems depend on particular architectural or optimization details. To isolate the key underlying factors, we focus on deep linear network models, and show that, surprisingly, such networks also display much of the behavior seen in biology and artificial networks, while being amenable to analytical treatment. We show that critical periods depend on the depth of the model and structure of the data distribution. We also show analytically and in simulations that the learning of features is tied to competition between sources. Finally, we extend our analysis to multi-task learning to show that pre-training on certain tasks can damage the transfer performance on new tasks, and show how this depends on the relationship between tasks and the duration of the pre-training stage. To the best of our knowledge, our work provides the first analytically tractable model that sheds light into why critical learning periods emerge in biological and artificial networks. 3 authors · Aug 23, 2023
- Metastable Cosmological Constant and Gravitational Bubbles: Ultra-Late-Time Transitions in Modified Gravity The observed cosmological constant may originate as the minimum value $U_{min}$ of a scalar field potential, where the scalar field is frozen due to a large mass. If this vacuum is metastable, it may decay to a true vacuum either at present or in the future. Assuming its decay rate $\Gamma$ is comparable to the Hubble expansion rate $H_0$, we estimate the scale of true vacuum bubbles and analyze their evolution. We find that their initial formation scale is sub-millimeter and their tension causes rapid collapse if $m \gtrsim 1.7 \cdot 10^{-3}\,\mathrm{eV}$. For smaller masses, the bubbles expand at the speed of light. We extend our analysis to scalar-tensor theories with non-minimal coupling, finding that the nucleation scale of gravitational constant bubbles remains consistent with the sub-millimeter regime of General Relativity. The critical mass scale remains around $10^{-3}\,\mathrm{eV}$. A theoretical estimate at redshift $z_{obs} \sim 0.01$ suggests an observable bubble radius of $\sim 50$ Mpc, implying a gravitational transition triggered $\sim 300$ Myr ago, with a present-day size approaching 100 Mpc. Additionally, we explore mass ranges ($m < 10^{-3}\,\mathrm{eV}$) and non-minimal coupling $\xi$ ranges ($10^{-8}\,\mathrm{eV}^{2-n}$ - $10^{-1}\,\mathrm{eV}^{2-n}$) that lead to a variation $\Delta G/G_N$ within the 1%-7% range. We assume non-minimal coupling of the form $F(\phi)=1/\kappa - \xi \phi^n$, with $\kappa=8\pi G_N$ and $2 \leq n \leq 9$. Finally, we review various proposed solutions to the Hubble tension based on local physics and/or transitions, including ultra-late-time transitional models ($z \sim 0.01$), screened fifth-force mechanisms, and the $\Lambda_{\rm s}$CDM model, which features a transition at $z \sim 2$. We discuss observational hints supporting these scenarios and the theoretical challenges they face. 2 authors · Mar 14
- Gravity/Spin-model correspondence and holographic superfluids We propose a general correspondence between gravity and spin models, inspired by the well-known IR equivalence between lattice gauge theories and the spin models. This suggests a connection between continuous type Hawking-phase transitions in gravity and the continuous order-disorder transitions in ferromagnets. The black-hole phase corresponds to the ordered and the graviton gas corresponds to the disordered phases respectively. A simple set-up based on Einstein-dilaton gravity indicates that the vicinity of the phase transition is governed by a linear-dilaton CFT. Employing this CFT we calculate scaling of observables near T_c, and obtain mean-field scaling in a semi-classical approximation. In case of the XY model the Goldstone mode is identified with the zero mode of the NS-NS two-form. We show that the second speed of sound vanishes at the transition also with the mean field exponent. 1 authors · Jul 27, 2010
- Beyond IID weights: sparse and low-rank deep Neural Networks are also Gaussian Processes The infinitely wide neural network has been proven a useful and manageable mathematical model that enables the understanding of many phenomena appearing in deep learning. One example is the convergence of random deep networks to Gaussian processes that allows a rigorous analysis of the way the choice of activation function and network weights impacts the training dynamics. In this paper, we extend the seminal proof of Matthews et al. (2018) to a larger class of initial weight distributions (which we call PSEUDO-IID), including the established cases of IID and orthogonal weights, as well as the emerging low-rank and structured sparse settings celebrated for their computational speed-up benefits. We show that fully-connected and convolutional networks initialized with PSEUDO-IID distributions are all effectively equivalent up to their variance. Using our results, one can identify the Edge-of-Chaos for a broader class of neural networks and tune them at criticality in order to enhance their training. Moreover, they enable the posterior distribution of Bayesian Neural Networks to be tractable across these various initialization schemes. 3 authors · Oct 25, 2023
- Holographic Superconductors from Einstein-Maxwell-Dilaton Gravity We construct holographic superconductors from Einstein-Maxwell-dilaton gravity in 3+1 dimensions with two adjustable couplings, $\alpha$ and the charge $q$ carried by the scalar field. For the values of $\alpha$ and $q$ we consider, there is always a critical temperature at which a second order phase transition occurs between a hairy black hole and the AdS RN black hole in the canonical ensemble, which can be identified with the superconducting phase transition of the dual field theory. We calculate the electric conductivity of the dual superconductor and find that for the values of $\alpha$ and $q$ where $\alpha/q$ is small the dual superconductor has similar properties to the minimal model, while for the values of $\alpha$ and $q$ where $\alpha/q$ is large enough, the electric conductivity of the dual superconductor exhibits novel properties at low frequencies where it shows a "Drude Peak" in the real part of the conductivity. 2 authors · Jun 14, 2010
- On the minimal power of q in a Kazhdan-Lusztig polynomial For w in the symmetric group, we provide an exact formula for the smallest positive power q^{h(w)} appearing in the Kazhdan-Lusztig polynomial P_{e,w}(q). We also provide a tight upper bound on h(w) in simply-laced types, resolving a conjecture of Billey-Postnikov from 2002. 2 authors · Mar 23, 2023
- Singularities in Einstein-conformally coupled Higgs cosmological models The dynamics of the Einstein-conformally coupled Higgs field (EccH) system is investigated near the initial singularities in the presence of Friedmann-Robertson-Walker symmetries. We solve the field equations asymptotically up to fourth order near the singularities analytically, and determine the solutions numerically as well. We found all the asymptotic, power series singular solutions, which are (1) solutions with a scalar polynomial curvature singularity but bounded Higgs field (`Small Bang'), or (2) solutions with a Milne type singularity with bounded spacetime curvature and Higgs field, or (3) solutions with a scalar polynomial curvature singularity and diverging Higgs field (`Big Bang'). Thus, in the present EccH model there is a new kind of physical spacetime singularity (`Small Bang'). We also show that, in a neighbourhood of the singularity in these solutions, the Higgs sector does not have any symmetry breaking instantaneous vacuum state, and hence the Brout-Englert-Higgs mechanism does not work. The large scale behaviour of the solutions is investigated numerically as well. In particular, the numerical calculations indicate that there are singular solutions that cannot be approximated by power series. 2 authors · Feb 2, 2018
- The Fyodorov-Hiary-Keating Conjecture. I By analogy with conjectures for random matrices, Fyodorov-Hiary-Keating and Fyodorov-Keating proposed precise asymptotics for the maximum of the Riemann zeta function in a typical short interval on the critical line. In this paper, we settle the upper bound part of their conjecture in a strong form. More precisely, we show that the measure of those $T \leq t \leq 2T$ for which $\max_{|h| \leq 1} |\zeta(1/2 + it + ih)| > e^y \frac{\log T}{(\log\log T)^{3/4}}$ is bounded by $C y e^{-2y}$ uniformly in $y \geq 1$. This is expected to be optimal for $y = O(\log\log T)$. This upper bound is sharper than what is known in the context of random matrices, since it gives (uniform) decay rates in $y$. In a subsequent paper we will obtain matching lower bounds. 3 authors · Jul 2, 2020
- Predictive power of the Berezinskii-Kosterlitz-Thouless theory based on Renormalization Group throughout the BCS-BEC crossover in 2D superconductors Recent experiments on 2D superconductors allow the characterization of the critical temperature and of the phase diagram across the BCS-BEC crossover as a function of density. We obtain from these experiments the microscopic parameters of the superconducting state at low temperatures by the BCS mean-field approach. For Li_xZrNCl, the extracted parameters are used to evaluate the superconducting phase stiffness and the Berezinskii-Kosterlitz-Thouless (BKT) critical temperature throughout the BCS-BEC crossover, by implementing the corresponding Renormalization Group (RG) approach. In this way, we make a quantitative test of the predictive power of the BKT theory for evaluating the critical temperature. The RG flow equations turn out to give a sizable renormalization of the phase stiffness and of the critical temperature, which is crucial to obtain a satisfactory agreement between the BKT theory and the experiments, in particular in the BCS-BEC crossover regime. We predict the temperature range where phase stiffness renormalization can be measured in Li_xZrNCl across the BCS-BEC crossover. Contrary to other microscopic theories of superconductivity, we find that the BKT theory can be exploited to evaluate quantitatively the critical temperature of 2D superconductors in different pairing regimes. 4 authors · Mar 5, 2024
- The Numerical Stability of Hyperbolic Representation Learning Given the exponential growth of the volume of the ball w.r.t. its radius, the hyperbolic space is capable of embedding trees with arbitrarily small distortion and hence has received wide attention for representing hierarchical datasets. However, this exponential growth property comes at a price of numerical instability such that training hyperbolic learning models will sometimes lead to catastrophic NaN problems, encountering unrepresentable values in floating point arithmetic. In this work, we carefully analyze the limitations of two popular models for the hyperbolic space, namely, the Poincaré ball and the Lorentz model. We first show that, under the 64 bit arithmetic system, the Poincaré ball has a relatively larger capacity than the Lorentz model for correctly representing points. Then, we theoretically validate the superiority of the Lorentz model over the Poincaré ball from the perspective of optimization. Given the numerical limitations of both models, we identify one Euclidean parametrization of the hyperbolic space which can alleviate these limitations. We further extend this Euclidean parametrization to hyperbolic hyperplanes and exhibit its ability to improve the performance of hyperbolic SVM. 4 authors · Oct 31, 2022
- Schrödinger-Poisson systems with a general critical nonlinearity We consider a Schrödinger-Poisson system involving a general nonlinearity at critical growth and we prove the existence of positive solutions. The Ambrosetti-Rabinowitz condition is not required. We also study the asymptotics of solutions with respect to a parameter. 3 authors · Jan 6, 2015
1 NUPES : Non-Uniform Post-Training Quantization via Power Exponent Search Deep neural network (DNN) deployment has been confined to larger hardware devices due to their expensive computational requirements. This challenge has recently reached another scale with the emergence of large language models (LLMs). In order to reduce both their memory footprint and latency, a promising technique is quantization. It consists in converting floating point representations to low bit-width fixed point representations, usually by assuming a uniform mapping onto a regular grid. This process, referred to in the literature as uniform quantization, may however be ill-suited as most DNN weights and activations follow a bell-shaped distribution. This is even worse on LLMs whose weight distributions are known to exhibit large, high impact, outlier values. In this work, we propose an improvement over the most commonly adopted way to tackle this limitation in deep learning models quantization, namely, non-uniform quantization. NUPES leverages automorphisms to preserve the scalar multiplications. Such transformations are derived from power functions. However, the optimization of the exponent parameter and weight values remains a challenging and novel problem which could not be solved with previous post training optimization techniques which only learn to round up or down weight values in order to preserve the predictive function. We circumvent this limitation with a new paradigm: learning new quantized weights over the entire quantized space. Similarly, we enable the optimization of the power exponent, i.e. the optimization of the quantization operator itself during training by alleviating all the numerical instabilities. The resulting predictive function is compatible with integer-only low-bit inference. We show the ability of the method to achieve state-of-the-art compression rates in both, data-free and data-driven configurations. 3 authors · Aug 10, 2023
- Almost sure bounds for a weighted Steinhaus random multiplicative function We obtain almost sure bounds for the weighted sum $\sum_{n \leq t} \frac{f(n)}{\sqrt{n}}$, where $f(n)$ is a Steinhaus random multiplicative function. Specifically, we obtain the bounds predicted by exponentiating the law of the iterated logarithm, giving sharp upper and lower bounds. 1 author · Jul 2, 2023
- Phase transitions between Reissner-Nordstrom and dilatonic black holes in 4D AdS spacetime We study Einstein-Maxwell-dilaton gravity models in four-dimensional anti-de Sitter (AdS) spacetime which admit the Reissner-Nordstrom (RN) black hole solution. We show that below a critical temperature the AdS-RN solution becomes unstable against scalar perturbations and the gravitational system undergoes a phase transition. We show using numerical calculations that the new phase is a charged dilatonic black hole. Using the AdS/CFT correspondence we discuss the phase transition in the dual field theory both for non-vanishing temperatures and in the extremal limit. The extremal solution has a Lifshitz scaling symmetry. We discuss the optical conductivity in the new dual phase and find interesting behavior at low frequencies where it shows a "Drude peak". The resistivity varies with temperature in a non-monotonic way and displays a minimum at low temperatures which is reminiscent of the celebrated Kondo effect. 3 authors · Dec 17, 2009
1 A Neural Scaling Law from Lottery Ticket Ensembling Neural scaling laws (NSL) refer to the phenomenon where model performance improves with scale. Sharma & Kaplan analyzed NSL using approximation theory and predict that MSE losses decay as N^{-alpha}, alpha=4/d, where N is the number of model parameters, and d is the intrinsic input dimension. Although their theory works well for some cases (e.g., ReLU networks), we surprisingly find that a simple 1D problem y=x^2 manifests a different scaling law (alpha=1) from their predictions (alpha=4). We opened the neural networks and found that the new scaling law originates from lottery ticket ensembling: a wider network on average has more "lottery tickets", which are ensembled to reduce the variance of outputs. We support the ensembling mechanism by mechanistically interpreting single neural networks, as well as studying them statistically. We attribute the N^{-1} scaling law to the "central limit theorem" of lottery tickets. Finally, we discuss its potential implications for large language models and statistical physics-type theories of learning. 2 authors · Oct 3, 2023
1 Deep Learning Scaling is Predictable, Empirically Deep learning (DL) creates impactful advances following a virtuous recipe: model architecture search, creating large training data sets, and scaling computation. It is widely believed that growing training sets and models should improve accuracy and result in better products. As DL application domains grow, we would like a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements to advance the state-of-the-art. This paper presents a large scale empirical characterization of generalization error and model size growth as training sets grow. We introduce a methodology for this measurement and test four machine learning domains: machine translation, language modeling, image processing, and speech recognition. Our empirical results show power-law generalization error scaling across a breadth of factors, resulting in power-law exponents---the "steepness" of the learning curve---yet to be explained by theoretical work. Further, model improvements only shift the error but do not appear to affect the power-law exponent. We also show that model size scales sublinearly with data size. These scaling relationships have significant implications on deep learning research, practice, and systems. They can assist model debugging, setting accuracy targets, and decisions about data set growth. They can also guide computing system design and underscore the importance of continued computational scaling. 9 authors · Dec 1, 2017
- Concentrating solutions of the fractional (p,q)-Choquard equation with exponential growth This article deals with the following fractional (p,q)-Choquard equation with exponential growth: $\varepsilon^{ps}(-\Delta)_{p}^{s}u + \varepsilon^{qs}(-\Delta)_{q}^{s}u + Z(x)(|u|^{p-2}u + |u|^{q-2}u) = \varepsilon^{\mu-N}[|x|^{-\mu} * F(u)]f(u)$ in $\mathbb{R}^N$, where $s \in (0,1)$, $\varepsilon > 0$ is a parameter, $2 \leq p = \frac{N}{s} < q$, and $0 < \mu < N$. The nonlinear function $f$ has an exponential growth at infinity and the continuous potential function $Z$ satisfies suitable natural conditions. With the help of the Ljusternik-Schnirelmann category theory and variational methods, the multiplicity and concentration of positive solutions are obtained for $\varepsilon > 0$ small enough. In a certain sense, we generalize some previously known results. 3 authors · May 31
- Linear statistics for Coulomb gases: higher order cumulants We consider $N$ classical particles interacting via the Coulomb potential in spatial dimension $d$ and in the presence of an external trap, at equilibrium at inverse temperature $\beta$. In the large $N$ limit, the particles are confined within a droplet of finite size. We study smooth linear statistics, i.e. the fluctuations of sums of the form $\mathcal{L}_N = \sum_{i=1}^N f(\mathbf{x}_i)$, where the $\mathbf{x}_i$ are the positions of the particles and where $f(\mathbf{x}_i)$ is a sufficiently regular function. There exist at present standard results for the first and second moments of $\mathcal{L}_N$ in the large $N$ limit, as well as associated Central Limit Theorems in general dimension and for a wide class of confining potentials. Here we obtain explicit expressions for the higher order cumulants of $\mathcal{L}_N$ at large $N$, when the function $f(\mathbf{x}) = f(|\mathbf{x}|)$ and the confining potential are both rotationally invariant. A remarkable feature of our results is that these higher cumulants depend only on the value of $f'(|\mathbf{x}|)$ and its higher order derivatives evaluated exactly at the boundary of the droplet, which in this case is a $d$-dimensional sphere. In the particular two-dimensional case $d=2$ at the special value $\beta=2$, a connection to the Ginibre ensemble allows us to derive these results in an alternative way using the tools of determinantal point processes. Finally we also obtain the large deviation form of the full probability distribution function of $\mathcal{L}_N$. 4 authors · Oct 25, 2023
- Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective A burgeoning line of research leverages deep neural networks to approximate the solutions to high dimensional PDEs, opening lines of theoretical inquiry focused on explaining how it is that these models appear to evade the curse of dimensionality. However, most prior theoretical analyses have been limited to linear PDEs. In this work, we take a step towards studying the representational power of neural networks for approximating solutions to nonlinear PDEs. We focus on a class of PDEs known as nonlinear elliptic variational PDEs, whose solutions minimize an Euler-Lagrange energy functional $E(u) = \int_\Omega L(x, u(x), \nabla u(x)) - f(x) u(x)\,dx$. We show that if composing a function with Barron norm $b$ with partial derivatives of $L$ produces a function of Barron norm at most $B_L b^p$, the solution to the PDE can be $\epsilon$-approximated in the $L^2$ sense by a function with Barron norm $O\left(\left(dB_L\right)^{\max\{p \log(1/\epsilon),\, p^{\log(1/\epsilon)}\}}\right)$. By a classical result due to Barron [1993], this correspondingly bounds the size of a 2-layer neural network needed to approximate the solution. Treating $p$, $\epsilon$, $B_L$ as constants, this quantity is polynomial in dimension, thus showing neural networks can evade the curse of dimensionality. Our proof technique involves neurally simulating (preconditioned) gradient in an appropriate Hilbert space, which converges exponentially fast to the solution of the PDE, and such that we can bound the increase of the Barron norm at each iterate. Our results subsume and substantially generalize analogous prior results for linear elliptic PDEs over a unit hypercube. 4 authors · Oct 21, 2022
- Nonintrusive approximation of parametrized limits of matrix power algorithms -- application to matrix inverses and log-determinants We consider in this work quantities that can be obtained as limits of powers of parametrized matrices, for instance the inverse matrix or the logarithm of the determinant. Under the assumption of affine dependence in the parameters, we use the Empirical Interpolation Method (EIM) to derive an approximation for powers of these matrices, from which we derive a nonintrusive approximation for the aforementioned limits. We derive upper bounds of the error made by the obtained formula. Finally, numerical comparisons with classical intrusive and nonintrusive approximation techniques are provided: in the considered test-cases, our algorithm performs well compared to the nonintrusive ones. 4 authors · Oct 6, 2017
- Automated Search for Conjectures on Mathematical Constants using Analysis of Integer Sequences Formulas involving fundamental mathematical constants had a great impact on various fields of science and mathematics, for example aiding in proofs of irrationality of constants. However, the discovery of such formulas has historically remained scarce, often perceived as an act of mathematical genius by great mathematicians such as Ramanujan, Euler, and Gauss. Recent efforts to automate the discovery of formulas for mathematical constants, such as the Ramanujan Machine project, relied on exhaustive search. Despite several successful discoveries, exhaustive search remains limited by the space of options that can be covered and by the need for vast amounts of computational resources. Here we propose a fundamentally different method to search for conjectures on mathematical constants: through analysis of integer sequences. We introduce the Enumerated Signed-continued-fraction Massey Approve (ESMA) algorithm, which builds on the Berlekamp-Massey algorithm to identify patterns in integer sequences that represent mathematical constants. The ESMA algorithm found various known formulas for e, e^2, tan(1), and ratios of values of Bessel functions. The algorithm further discovered a large number of new conjectures for these constants, some providing simpler representations and some providing faster numerical convergence than the corresponding simple continued fractions. Along with the algorithm, we present mathematical tools for manipulating continued fractions. These connections enable us to characterize what space of constants can be found by ESMA and quantify its algorithmic advantage in certain scenarios. Altogether, this work continues in the development of augmenting mathematical intuition by computer algorithms, to help reveal mathematical structures and accelerate mathematical research. 6 authors · Dec 13, 2022
- Dynamical Cosmological Constant The dynamical realisation of the equation of state p +rho =0 is studied. A non-pathological dynamics for the perturbations of such a system mimicking a dynamical cosmological constant (DCC) requires to go beyond the perfect fluid paradigm. It is shown that an anisotropic stress must be always present. The Hamiltonian of the system in isolation resembles the one of a Pais-Uhlenbeck oscillator and linear stability requires that it cannot be positive definite. The dynamics of linear cosmological perturbations in a DCC dominated Universe is studied in detail showing that when DCC is minimally coupled to gravity no dramatic instability is present. In contrast to what happens in a cosmological constant dominated Universe, the non-relativistic matter contrast is no longer constant and exhibits an oscillator behaviour at small scales while it grows weakly at large scales. In the gravitational waves sector, at small scales, the amplitude is still suppressed as the inverse power of the scale factor while it grows logarithmically at large scales. Also the vector modes propagate, though no growing mode is found. 2 authors · Mar 5
27 Scaling Laws for Floating Point Quantization Training Low-precision training is considered an effective strategy for reducing both training and downstream inference costs. Previous scaling laws for precision mainly focus on integer quantization, which pay less attention to the constituents in floating-point quantization and thus cannot well fit the LLM losses in this scenario. In contrast, while floating-point quantization training is more commonly implemented in production, the research on it has been relatively superficial. In this paper, we thoroughly explore the effects of floating-point quantization targets, exponent bits, mantissa bits, and the calculation granularity of the scaling factor in floating-point quantization training performance of LLM models. While presenting an accurate floating-point quantization unified scaling law, we also provide valuable suggestions for the community: (1) Exponent bits contribute slightly more to the model performance than mantissa bits. We provide the optimal exponent-mantissa bit ratio for different bit numbers, which is available for future reference by hardware manufacturers; (2) We discover the formation of the critical data size in low-precision LLM training. Too much training data exceeding the critical data size will inversely bring in degradation of LLM performance; (3) The optimal floating-point quantization precision is directly proportional to the computational power, but within a wide computational power range, we estimate that the best cost-performance precision lies between 4-8 bits. 16 authors · Jan 4 2
- Detecting Fermi Surface Nesting Effect for Fermionic Dicke Transition by Trap Induced Localization Recently, the statistical effect of fermionic superradiance has been confirmed by a series of experiments both in free space and in a cavity. The Pauli blocking effect can be visualized by a 1/2 scaling of the Dicke transition critical pumping strength against particle number $N_{\mathrm{at}}$ for fermions in a trap. However, the Fermi surface nesting effect, which manifests the enhancement of superradiance by Fermi statistics, is still very hard to identify. Here we study the influence of localized fermions on the trap edge when both the pumping optical lattice and the trap are present. We find that, due to localization, the statistical effect in the superradiant transition is enhanced. Two new scalings of the critical pumping strength, 4/3 and 2/3, are observed for intermediate particle number, and the Pauli blocking scaling 1/3 (2d case) in the large particle number limit is unaffected. Further, we find that the 4/3 scaling is subject to a power-law increase with rising ratio between the recoil energy and the trap frequency in the pumping laser direction. The divergence of this scaling of the critical pumping strength against $N_{\mathrm{at}}$ in the $E_R/\omega_x \rightarrow +\infty$ limit can be identified as the Fermi surface nesting effect. Thus we find a practical experimental scheme for visualizing the long-desired Fermi surface nesting effect with the help of trap induced localization in a two-dimensional Fermi gas in a cavity. 2 authors · Mar 1, 2023
- Holographic Thermodynamics at Finite Baryon Density: Some Exact Results We use the AdS/CFT correspondence to study the thermodynamics of massive N=2 supersymmetric hypermultiplets coupled to N=4 supersymmetric SU(Nc) Yang-Mills theory in the limits of large Nc and large 't Hooft coupling. In particular, we study the theory at finite baryon number density. At zero temperature, we present an exact expression for the hypermultiplets' leading-order contribution to the free energy, and in the supergravity description we clarify which D-brane configuration is appropriate for any given value of the chemical potential. We find a second-order phase transition when the chemical potential equals the mass. At finite temperature, we present an exact expression for the hypermultiplets' leading-order contribution to the free energy at zero mass. 2 authors · Sep 5, 2007