diff --git "a/Tokenization/combined_scientific_papers.json" "b/Tokenization/combined_scientific_papers.json" deleted file mode 100644--- "a/Tokenization/combined_scientific_papers.json" +++ /dev/null @@ -1,112 +0,0 @@ -[ - { - "domain": "Computer Science", - "chunk_type": "general", - "text": "1\nNoise2Ghost: Self-supervised deep convolutional\nreconstruction for ghost imaging\nMathieu Manni, Dmitry Karpov\n, K. Joost Batenburg\n, Sharon Shwartz\n, Nicola Vigan`o\nAbstract\u2014We present a new self-supervised deep-learning-\nbased Ghost Imaging (GI) reconstruction method, which provides\nunparalleled reconstruction performance for noisy acquisitions\namong unsupervised methods. We present the supporting math-\nematical framework and results from theoretical and real data\nuse cases. Self-supervision removes the need for clean reference\ndata while offering strong noise reduction. This provides the\nnecessary tools for addressing signal-to-noise ratio concerns\nfor GI acquisitions in emerging and cutting-edge low-light GI\nscenarios. Notable examples include micro- and nano-scale x-\nray emission imaging, e.g., x-ray fluorescence imaging of dose-\nsensitive samples. Their applications include in-vivo and in-\noperando case studies for biological samples and batteries.\nIndex Terms\u2014Ghost imaging, deep learning, image reconstruc-\ntion, inverse problems, reconstruction algorithms.\nI. INTRODUCTION\nGhost imaging (GI) is a paradigm-changing technique that\nis mostly known and anticipated for its promise to cut radiation\ndoses. However, these possible radiation dose advantages rely\non a much more fundamental concept: The ability to choose\nthe trade-off between spatial resolution, field-of-view (FoV),\nand acquisition time somewhat independently of the incident\nbeam size. Traditionally, to spatially resolve diffused signals\nlike x-ray fluorescence (XRF), one had to raster scan the\nobject of interest with focused illumination (pencil-beam,\nPB, Fig. 1a). The PB focal spot size determines the spatial\nresolution. GI achieves that by probing extended regions of\na sample at once with structured illumination (Fig. 1b). This\nprovides spatially resolved information, whose resolution is\nindependent of the beam size and detector pixel size, but\ndepends instead on the size of the beam structures.\nA detector that does not observe the illumination beam collects\nthe signals of interest. The acquired GI realizations, composed\nof the detected signals with each associated illumination\nbeam structure, are then combined and computationally recon-\nstructed into a two-dimensional projection image of the probed\nobject (in the given contrast). This means that the acquired\nM. Manni was with the ESRF \u2014 The European Synchrotron, Grenoble,\n38000, France, and with the Physics Department and Institute of Nanotech-\nnology and Advanced Materials, Bar Ilan University, Ramat Gan, 52900, Israel\n(e-mail: mathieu.manni@esrf.fr).\nD. Karpov is with the Universit\u00b4e Grenoble Alpes, P\u02c6ole PEM, CEA, MEM,\nGrenoble, 38000, France (e-mail: dmitry.karpov@univ-grenoble-alpes.fr).\nK. J. Batenburg is with the Leiden Institute of Advanced Computer\nScience, Leiden Universiteit, 2333, CA Leiden, The Netherlands (e-mail:\nk.j.batenburg@liacs.leidenuniv.nl).\nS. Shwartz is with the Physics Department and Institute of Nanotechnology\nand Advanced Materials, Bar Ilan University, Ramat Gan, 52900, Israel (e-\nmail: sharon.shwartz@biu.ac.il).\nN. 
Vigan`o is with the Universit\u00b4e Grenoble Alpes, CEA, IRIG-MEM,\nGrenoble, 38000, France (e-mail: nicola.vigano@cea.fr).\nManuscript received XXX; revised XXX.\n(a) Pencil Beam\n(b) Ghost Imaging\nFig. 1: Schematic representation of diffused emission signal\nacquisitions (e.g., x-ray fluorescence imaging) using pencil\nraster beam scanning (a) and ghost imaging (b). The former\nuses a point beam to scan every pixel to form an image, while\nthe latter illuminates the sample with a series of structured\nbeams.\ndata mathematically resides in a different space from the\nreconstructed image space, and the reconstruction problem is a\nso-called inverse problem. The algorithm used to perform this\ninversion, known as the reconstruction method, significantly\ninfluences the quality and accuracy of the reconstructed image,\nparticularly under noisy conditions. GI is independent of the\ncontrast and probe type used, and it has been demonstrated\nwith radiation across the whole electromagnetic spectrum [1],\n[2], neutrons [3], electrons [4], and atoms [5]. Thus, it can be\nused for the same applications where PB raster scanning is\nused.\nThe advantages of GI compared to PB acquisitions stem\nfrom the inherent small- and large-scale correlations found in\nnatural images, which make them compressible under suitable\nrepresentations [6], while non-structured noise is not. Thanks\nto the structured illumination, which can capture unique large-\nand small-scale information for each measurement, GI grants\nimage reconstruction with fewer data points than reconstructed\npixels. Thus, GI unlocks potential gains in acquisition speed\nor deposited dose, compared to raster scanning, by simply\nreducing the number of acquired data points [7].\nGI offers a specific advantage over PB scanning for dose-\nsensitive samples. High-flux focused PBs create significant\nexcess charges in small localized regions at each exposure\nfrom radiation-induced local ionization of the samples. This\nis one of the main contributors to damage in biological samples\nand the skewing of functioning parameters in batteries during\nacquisitions. On the contrary, GI creates this same charge\non a much larger area (i.e., the whole illumination field-of-\nview). The latter can recombine and/or disperse faster than the\nformer, leading to lower degradation over time. For instance,\nin cryogenic electron microscopy (also known as cryo-EM),\narXiv:2504.10288v1 [cs.CV] 14 Apr 2025\n2\nreduced dose rates have been observed to produce less damage\ncompared to higher dose rates at the same total dose [8].\nThanks to all these advantages, GI has the potential to spark\nmany breakthroughs in applications in various fields ranging\nfrom biomedical imaging to remote sensing. However, GI has\nonly seen limited applicability so far, especially in the above-\nmentioned nano-scale imaging applications. In these cases, the\nobserved signals are affected by overwhelmingly high Poisson\nnoise levels due to their very small photon fluxes. This noise\ntype does not enjoy the intrinsic noise reduction of GI, known\nas Felgett\u2019s advantage, thus reducing GI\u2019s advantage over\nPB [7]. Here, we propose a GI reconstruction method par-\nticularly suited for working with acquisitions affected by high\nnoise levels, regardless of whether the noise type is Gaussian\nor Poisson. Our method leverages the latest developments in\nunsupervised machine learning (ML) techniques, so it does\nnot require high-quality reference data. 
This work is a step\ntowards practical applications of GI, particularly for photon-\nlimited scenarios like nano-scale emission imaging.\nII. METHOD\nIn\nthis\nsection,\nwe\npresent\nthe\nproposed\nmethod\nNoise2Ghost (N2G). First, we introduce related methods.\nThen, we describe the signal formation (forward) model and\npresent our proposed reconstruction method. Finally, we show\nhow this method addresses random acquisition noise.\nA. Existing methods\nGI\u2019s ability to obtain images with more pixels than the\nacquired realizations means that the associated reconstruction\nproblem is undersampled. These undersampled reconstructions\nrequire some prior knowledge (regularization) to recover an\naccurate image: Unregularized least-squares (LS) reconstruc-\ntions present specific high-frequency large-scale artifacts that\nresemble high noise levels. Methods employing convolutional\nneural networks (CNNs) are currently the best-performing\ntools for image processing applications. They can capture\nsmall- and large-scale correlations and learn the most suitable\nrepresentations to capture the features present in the treated\nsignals (acting like regularizers) [9]. Thus, they are also\nbetter suited for GI reconstructions than traditional variational\nmethods, e.g., Total Variation (TV) minimization, which use\nstrong but also rather simple a priori assumptions on the types\nof features present in the reconstructed images.\nML methods and CNN architectures have both seen an ex-\nplosion in development and applications over the last decade.\nQuite a few ML methods have been specifically developed\nfor GI reconstructions. Notable examples are supervised meth-\nods [10], where a CNN (e.g., a U-net [11]) compares examples\nof LS GI reconstructions from highly undersampled acquisi-\ntions against their corresponding high-quality reconstructions.\nThe network learns to identify the corrupted features in the\nLS images and to replace them with their expected shape.\nThis approach is heavily dependent on large amounts of\nknown examples that present strong similarity to the images\nof interest. It also requires specific training data for each\nacquisition parameter variation (e.g., noise levels, image size,\nlevels of undersampling, etc).\nUntrained generator-network methods do not require pre-\ntraining of the model against reference high-quality images, re-\nmoving the need for large databases of reference images [12],\n[13]. They can incorporate the knowledge of the image forma-\ntion model in the learning algorithm (an approach of physics-\ninformed ML), and make the trained model learn to produce\nan image that fits the acquired data. A good example is the\nmethod called GIDC [14], which is based on the deep image\nprior (DIP) [9]. As a drawback, these methods exhibit lower\nreconstruction quality for high noise levels in the acquired\ndata. They delegate noise reduction solely to regularization and\nearly stopping. Similar to DIP-based methods, implicit neural\nrepresentations (INRs) -based methods use untrained NNs to\nreconstruct an image without high-quality reference data [15],\n[16]. INR-based methods use multi-layer perceptron (MLP)\nNNs and accept pixel coordinates instead of input images.\nDuring the training, the MLP learns a representation of the\nimaged object, which produces the desired image for the input\ncoordinates.\nSelf-supervised methods like Noise2Noise (N2N) use the same\ntraining procedure as the supervised methods, but they use\nnoisy images as targets [17]. 
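As context for how such noisy-target training looks in practice, the toy PyTorch sketch below trains a small CNN on a pair of noisy images of the same scene; the SmallCNN architecture and the random tensors standing in for the two noisy realizations are assumptions, not the setup of [17], and the pairing itself is detailed in the remainder of this paragraph.

```python
# Toy Noise2Noise-style training loop: a small CNN is trained to predict one
# noisy realization from the other and vice versa. Because the two inputs share
# the same underlying signal and differ only in their (uncorrelated) noise, the
# network is pushed towards reproducing the signal rather than the noise.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, ch: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

model = SmallCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
mse = nn.MSELoss()

# Stand-ins for two noisy measurements of the same scene.
noisy_a = torch.randn(1, 1, 64, 64)
noisy_b = torch.randn(1, 1, 64, 64)

for _ in range(100):
    optimizer.zero_grad()
    loss = mse(model(noisy_a), noisy_b) + mse(model(noisy_b), noisy_a)
    loss.backward()
    optimizer.step()
```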
They only require two noisy\nrealizations of the same measurement (e.g., two noisy images\nof the same scene) and exploit the fact that the only difference\nbetween the two measurements should be the noise. Thus,\ngiven one of the two measurements, they train the CNN to pre-\ndict the other and vice versa. As previously mentioned, CNNs\ncan capture small and large-scale correlations, leading them\nto learn the features of the measurements while discarding\ntheir uncorrelated parts, i.e., the noise. N2N can be adapted to\nwork with inverse problems like computed tomography [18].\nHowever, as we will also see in section II-E, it does not\nhandle the so-called missing realization problem, which refers\nto artifacts arising from not having enough realizations to\nrepresent the imaged objects, and that is usually an important\nsource of noise in GI reconstructions.\nB. Forward model and variational methods\nHere, we provide the mathematical description of the signal\nformation and the deep image prior (DIP) reconstruction,\nwhich will serve as a basis for our method\u2019s derivation. Let us\nrepresent the discretized expected image of the studied object\nas the vector x\u2217\u2208RN, where N is the number of its pixels.\nWe probe x\u2217with a set of structured illumination patterns\nW = {w1, w2, . . . , wm}, which form the acquisition matrix\nW = [w1, w2, . . . , wm] \u2208RM\u00d7N, where M is the number of\nmeasurements. If we assume that the interaction between the\nproperty represented by x\u2217and the illumination patterns W\nis linear, then the recorded signal y \u2208RM, corrupted by an\nuncorrelated zero-mean noise \u03f5 is given by:\ny = Wx\u2217+ \u03f5 = b + \u03f5\n(1)\nwhere b \u2208RM is the non corrupted signal. The reconstruction\naims to recover the vector \u02c6x that is the most probable estimate\nof the clean image x\u2217. Regularized variational methods seek\n\u02c6x by solving the following minimization problem:\n\u02c6x = argmin\nx\n1\n2\u2225Wx \u2212y\u22252\n2 + \u03bbR(x)\n(2)\n3\nWhere R : RN \u2192R denotes a regularization term that\nimposes some prior knowledge on the reconstruction. This\nprior knowledge is often related to the structural properties of\nthe to-be-reconstructed image, and it is supposed to enhance its\nsignal-to-noise ratio. A common choice of regularization is the\nso-called total variation (TV) minimization, which promotes\nreconstructed images with sparse gradients. This, in turn,\nimposes short-range correlations in the reconstructions.\nUnsupervised generative machine learning methods have taken\ninspiration from Eq. (2) [12], and developed an image recovery\nscheme that seeks to solve a very similar minimization prob-\nlem, but on the output of a machine-learning model (typically\na CNN), as follows:\n\u02c6\u03b8 = argmin\n\u03b8\n1\n2\u2225WN\u03b8(r) \u2212y\u22252\n2 + \u03bbR(N\u03b8(r))\n(3a)\n\u02c6x = N\u02c6\u03b8(r)\n(3b)\nwhere r \u2208RN is any input image, which could be randomly\nchosen, N\u03b8\n: RN\n\u2192RN the model and \u03b8 \u2208RT is\nits parameters vector with T different parameters. In [14],\nr = W \u2020y where W \u2020 is the pseudo-inverse of W, making\nr the least-squares reconstruction. The model N\u03b8 learns to\nproduce the image \u02c6x that satisfies both the forward model\nEq. 
(1) and the regularization R, weighted by the \u03bb parameter.\nNeural networks have notions of non-locality and non-linearity\nin their response, which can behave as an additional prior\non top of R [9] when the training process is stopped before\nconvergence.\nWhile Eq. (3) has proven to help with the reconstruction of\nunder-determined acquisitions [14], it is not tuned to deal with\nstrong random noise. Following the same derivation steps as\nin [18], the expected prediction error of the data fitting term\nin Eq. (3a) is:\nEy\u2225WN\u03b8(r) \u2212y\u22252\n2\n= Eb,\u03f5\u2225WN\u03b8(W \u2020(b + \u03f5)) \u2212(b + \u03f5)\u22252\n2\n(4)\nwhich is intuitively minimized when N\u03b8 = I (the identity\nmatrix). Thus, noise reduction is solely delegated to the\nregularization term R. Therefore, this reconstruction technique\nmight not be able to provide an advantage over traditional\nvariational methods in this specific scenario.\nC. Proposed method: Noise2Ghost (N2G)\nHere, we propose our CNN-based noise-tolerant unsuper-\nvised GI reconstruction method. It is inspired by DIP and N2N\nwhile addressing their limitations, namely accounting for the\nrandom noise in the optimization technique and the missing\nrealization artifacts, respectively. We start from the assumption\nthat by selecting a subset of the original realization set (i.e.,\nthe collection of all the acquired signals paired with the corre-\nsponding illumination patterns), we should obtain a degraded\nreconstruction due to a decrease in captured information.\nHowever, the degraded reconstruction should still represent\nthe same signal as the reconstruction from the full realization\nset. This means that if we partition the realizations into\nsubsets with equal numbers of realizations, we should obtain\nequivalent but different and degraded sub-reconstructions.\n(a) Realizations (masks and buckets) partitioning.\n(b) Self-supervised model (CNN) training.\n(c) Prediction of the final reconstruction.\nFig. 2: Schematic representation of the proposed method:\n(a) The partitioning of the realizations set, generating the\npartial reconstructions (sub-recs). (b) The training procedure,\nwhich adjusts the model weights to minimize the residual\nbetween the projection of each partial prediction against the\nset of realizations not used for the its corresponding input.\n(c) The prediction of the final reconstruction obtained from\nthe averaged sub-reconstructions.\nWe partition the realizations in k splits k \u2208[1, K], producing\nK different sub-reconstructions (i.e., a k-tuple of reconstruc-\ntions). Each sub-reconstruction in the said k-tuple has unique\nnoise characteristics compared to the other reconstructions,\nand it provides different information on the FoV. From the\npartitioned data vectors yk we compute the sub-reconstructions\nxk as follows (Fig. 2a):\nxk = W \u2020\nkyk = W \u2020\nk(bk + \u03f5k)\n(5)\nwhere W \u2020\nk is pseudo-inverse of the partition forward model\nWk, and bk is the clean data.\nEach sub-reconstruction xk represents the same signal but\nhas a different noise. In N2G, we feed each of these sub-\nreconstructions to the model, and then we optimize the fol-\n4\nlowing minimization problem (Fig. 
2b) by comparing them\nagainst the realizations that were not used to reconstruct them:\n\u02c6\u03b8 = argmin\n\u03b8\n1\n2\nX\nk\nX\ni\u0338=k\n\u2225WiN\u03b8(xk) \u2212yi\u22252\n2 + \u03bbR(N\u03b8(xk))\n(6a)\n\u02c6x = 1\nK\nX\nk\nN\u02c6\u03b8(xk).\n(6b)\nSince the different sub-reconstructions xk should represent the\nsame signal, we would like the model to produce the same\nresult for each. Thus, the model should learn to differentiate\nbetween the actual signal and random noise. This is in opposi-\ntion to Eq. (3a), where the same realizations were used for the\ninput reconstruction and as targets in the learning loss. Finally,\nFig. 2c represents how Eq. (6b) produces the reconstructed\nimages.\nD. Self-supervised random noise reduction\nWe now derive the noise-reduction properties of Eq. (6a).\nReconstruction noise comes from two main sources: Measure-\nment noise; and the lack of information from a low number\nof realizations, which creates the so-called missing realization\nartifacts. These noise sources can be represented in the sub-\nreconstructions xk as follows:\nxk = x + vk + W \u2020\nk\u03f5k = x + vk + tk\n(7a)\nWkvk = 0\n(7b)\nWkxk = yk + \u03f5k\n(7c)\nwhere W \u2020\nk\u03f5k = tk is the pseudo-inverse of the bucket measure-\nment noise, while the vector vk \u2286ker Wk describes spurious\nsolutions that could be added to the reconstruction, and still\nfit the data. Our method focuses on the random measurement\nnoise, and the desired noise reduction effect is given by the\nspecific construct of the data fitting term from Eq. (6a). When\nprojecting the predicted image onto the target data from the\nother splits, the projected noise from the k1 split will be\ninconsistent with the random noise from the other K\u22121 splits.\nThis can be seen by substituting Eq. (7a) and Eq. (5) into the\ndata fitting term of Eq. (6a):\n1\n2\nX\ni\u0338=k\n\u2225WiN\u03b8(x + vk + W \u2020\u03f5k) \u2212(b + \u03f5i)\u22252\n2\n(8)\nAs \u03f5k is independent of any other \u03f5i and their domains do not\noverlap (Appendix A from [19]), the expected prediction error\nof Eq. (8) becomes:\nEb,\u03f5,i\u0338=k\u2225WiN\u03b8(x + vk + W \u2020\u03f5k) \u2212bi\u22252\n2\n+ Eb,\u03f5,i\u0338=k\u2225bi \u2212(bi \u2212\u03f5i)\u22252\n2.\n(9)\nThis can be further simplified into:\nEb,i\u0338=k\u2225WiN\u03b8(xk) \u2212bi\u22252\n2 + E\u03f5,i\u0338=k\u2225\u03f5i\u22252\n2\n(10)\nwhere the term \u2225WiN\u03b8(xk) \u2212bi\u22252\n2 is the supervised recon-\nstruction error, and the term \u2225\u03f5i\u22252\n2 is the signal noise variance.\nIn the supervised reconstruction term, the acquisition noise is\nnot present. This means that it is equivalent to having noiseless\nrealizations as the targets of our N2G reconstruction, thus pro-\nviding the noise suppression characteristics of Noise2Ghost.\nThe signal noise variance is just a constant in the objective\nfunction, which only depends on the acquisition noise and\nshould not contribute to the objective function gradient. In\nreality, it may introduce some noise in the optimization\ngradients at very high noise levels. This could interfere with\nthe reconstruction, guiding it towards local minima.\nE. Relationship with Noise2Inverse\nThe Noise2Inverse method addresses the problem in Eq. 
(6)\ndirectly in the reconstruction domain [19], by solving the\nfollowing problem:\n\u02c6\u03b8 = argmin\n\u03b8\n1\n2\nX\nk\n\u2225N\u03b8(xk) \u2212\n1\nK \u22121\nX\ni\u0338=k\n(xi)\u22252\n2\n(11a)\n\u02c6x = 1\nK\nX\nk\nN\u02c6\u03b8(xk).\n(11b)\nwhere xi = W \u2020\ni yi is the pseudo-inverse (LS reconstruction)\nof the partial set of masks Wi and corresponding buckets yi\nwith i \u2208[1, K]. Theoretical grounds support this formulation,\nand it should retrieve the same result as Eq. (6), regarding\nits noise reduction aspect. However, it does not address the\nmissing realization artifacts and suffers from strong technical\nlimitations. In ghost imaging, artifacts arising from both\nmissing realization noise and random noise are structured\nand have long-range correlations, i.e., they generally cover\nthe whole FoV. Commonly used models like CNNs have a\nlimited receptive field (i.e., the region around each pixel that\nthe CNN probes and that provides the context for the said\npixel) that may not extend to the entire FoV. Depending on\nthe model, this could have a strong negative impact on the\ndenoising performance.\nMore precisely, about the missing realization artifacts, in [19]\nit is stated that Eq. (11) would be unable to cope with them.\nThe LS sub-reconstructions cope with the missing realizations\nby acting as if their bucket values were set to 0. This fixes one\ngiven solution vk for each xk, out of the infinite solutions that\nsatisfy Eq. (1). This specific choice of vk is likely incorrect,\nbut it is used as the model learning target in Eq. (11). The\nformulation we propose in (Eq. (6)) circumvents this limitation\nby using the measured buckets as \u201ctarget\u201d, and by leaving\nthe freedom to the model to find a solution that satisfies the\nforward model in Eq. (1) in the reconstruction space.\nA possible approach to overcome the just-discussed limitations\nof Eq. (11) would be to use variational methods (like in\nEq. (2)) to compute the sub-reconstructions. These non-linear\nmethods are known to remove or strongly reduce these long-\nrange artifacts. However, as discussed in [19], their non-linear\nnature contrasts with the assumptions of Eq. (11). This may\nthen result in a degraded reconstruction quality.\nF. Data augmentation\nThanks to the assumption that any permutation (reordering)\nof the acquired realizations provides the same reconstruction,\nwe can achieve simple and effective data augmentation: We\ncan increase the number of training examples for the self-\n5\nsupervised method from the measured data. We permute our\ndataset in p arrangements with p \u2208[1, P] and partition each\narrangement in k splits k \u2208[1, K], producing KP different\nsub-reconstructions (i.e., P k-tuples of reconstructions). In\npractice, the permutations P provide unique mixtures of\nmissing realization artifacts and random noise, while the K\nsplits are still the underlying mechanism for noise reduction.\nThis modifies Eq. (6) into:\n\u02c6\u03b8 = argmin\n\u03b8\n1\n2\nX\np,k\n\u2225WN\u03b8(xp,k) \u2212y\u22252\n2 + \u03bbR(N\u03b8(xp,k))\n(12a)\n\u02c6x =\n1\nPK\nX\np,k\nN\u02c6\u03b8(xp,k).\n(12b)\nG. Models\nWe briefly discuss the machine learning models, in par-\nticular neural networks (NN), that N2G could leverage. The\nmodel should accept multiple images of the same object as\ninput. This is incompatible with INRs, which learn coordinate-\nbased representations of the signal and thus take coordinates\nas inputs. 
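These image inputs are the sub-reconstructions of Eqs. (5) and (12); the NumPy sketch below shows one way they could be built, with the mask matrix, bucket values, image size, and number of splits/permutations chosen purely for illustration.

```python
# Build the K*P sub-reconstructions x_{p,k} of Eqs. (5) and (12): permute the
# realization set P times, split each permutation into K disjoint subsets, and
# least-squares-reconstruct each subset with the pseudo-inverse of its masks.
import numpy as np

rng = np.random.default_rng(0)
side, M, K, P = 32, 200, 4, 2                    # illustrative sizes (the text uses
N = side * side                                  # 100x100 images, K = 4, P = 6)

W = np.abs(rng.standard_normal((M, N)))          # stand-in structured-illumination masks
W /= W.max()
x_true = rng.random(N)                           # stand-in object x*
y = W @ x_true + 0.01 * rng.standard_normal(M)   # noisy buckets, Eq. (1)

sub_recs = []
for p in range(P):                               # permutations (data augmentation)
    order = rng.permutation(M)
    for k in range(K):                           # disjoint splits of this permutation
        idx = order[k::K]
        x_pk = np.linalg.pinv(W[idx]) @ y[idx]   # x_k = W_k^+ y_k, Eq. (5)
        sub_recs.append(x_pk.reshape(side, side))

# The K*P images in sub_recs depict the same object with different noise and
# missing-realization content; they are the model inputs in Eq. (12a).
```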
CNNs and other MLP-based NNs are instead good\ncandidates. Both NN types take images as input and can\nlearn short- and long-range pixel correlations. Thus, N2G is\nnot tied to a specific NN architecture and can take advan-\ntage of future developments in that area. MLP-based NNs\nexamples include the Visual Transformer (ViT) [20] and the\nswin-transformer [21]. CNNs examples include U-Net [11],\nDnCNN [22], and MS-D net [23]. Regarding the model size,\none should try to work with small NNs to minimize the\ncomputation time and resources and to limit overfitting. For\ninstance, we suggest models with parameter numbers in the\norder of 100k for a 100 \u00d7 100 pixels image. However, some\nmodels like MS-D net could achieve similar results with 10-\n20k parameters for similar image sizes. In our tests, we had\nthe best results with U-Net and DnCNN with around 100-300k\nparameters. Further details will be provided with the results.\nIII. RESULTS\nWe now analyze N2G\u2019s performance against existing un-\nsupervised methods for reconstructing both synthetic and real\ndata. The reference methods are LS, TV, the GIDC algorithm\nfrom [14] (based on DIP), and INRs. For the synthetic test\ncases, we first derive a method to module the noise and\ncompare different noise levels.\nA. Synthetic data generation & quantification\nWe first evaluate the performance of the proposed method\non synthetic data, as it provides the ground truth and precisely\nallows us to control the noise added to the data.\nWe extracted a 100 \u00d7 100 pixels image to be our ground truth\n(phantom, e.g., from Figs. 5f and 6f) from [24], a curated\ncollection of chromosome images. We generated synthetic\nrandom masks with a normalized half-Gaussian distribution\nand forward projected the phantom to create the corresponding\nbuckets. As discussed in appendix B, we added Poisson noise\nto the buckets by first multiplying them by the maximum\nexpected emitted number of photons per pixel. We fed the\n(a) PSNR & SSIM\n(b) Bandwidth & Resolution\nFig. 3: Synthetic data GI reconstructions of the chromosomes\nphantom, with 5\u00d7 compression ratio, and varying noise levels.\nThe reconstruction algorithm performance is compared for\nmaximum emitted photons per pixel per realization intensities\nin the range of [100, 104]: (a) reconstruction peak signal-\nto-noise ratio (PSNR) and structural similarity index (SSIM)\nagainst the ground truth, where higher values indicate higher\nreconstruction quality; and (b) computed bandwidths in terms\nof the Nyquist frequency and corresponding resolutions in\npixels of the images.\nresulting mean expected photons per realizations to the Pois-\nson distribution function. We then normalized the signals by\ndividing the noisy buckets by the same number of expected\nphotons, resulting in signals:\nym = P\n\u0010\nC PN\nn wmnxn\n\u0011\n/C.\n(13)\nWe use the constant C to modulate the noise and evaluate\nits effect on the reconstruction performance of the proposed\nmethod.\nWe evaluate the reconstruction algorithms both visually and\nwith multiple objective metrics: the mean square error (MSE),\nthe peak signal-to-noise ratio (PSNR), the structural similarity\nindex (SSIM), and the resolution, which is estimated from the\nFourier ring correlation (FRC). Their formulas are provided\nin appendix A, while here we shortly describe their meaning.\nThe MSE evaluates the average pixel-wise difference between\nthe phantom and the reconstruction intensity values. 
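Before turning to the remaining metrics, the bucket simulation of Eq. (13) can be made concrete with the short NumPy sketch below; the phantom and array sizes are placeholders, with dimensions chosen to match the 100 x 100 pixel, 5x-compression setting described in the text.

```python
# Noisy bucket generation per Eq. (13): y_m = Poisson(C * sum_n w_mn x_n) / C,
# where C is the maximum expected emitted photons per pixel per realization.
import numpy as np

rng = np.random.default_rng(0)
N_pix, M, C = 100 * 100, 2000, 1e2            # pixels, realizations (5x compression), photons/pixel

phantom = rng.random(N_pix)                   # placeholder for the chromosome phantom
W = np.abs(rng.standard_normal((M, N_pix)))   # normalized half-Gaussian masks, w_mn in [0, 1]
W /= W.max()

clean_buckets = W @ phantom                   # b = W x*, Eq. (1)
noisy_buckets = rng.poisson(C * clean_buckets) / C   # Eq. (13)
```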
The\nSSIM calculates the similarity between these two images\nby comparing the inter-dependencies of spatially neighboring\npixels over various window sizes [25]. PSNR is based on\nMSE, but it focuses on the peak intensity power instead of its\naverage. Finally, the FRC computes the similarity in Fourier\nspace between the phantom and the reconstruction for all\nthe sampled frequencies. We estimate the resolution of the\nreconstruction by finding the intersection of the resulting curve\nwith the threshold function defined in [26].\nB. Synthetic data reconstructions across various noise levels\nWe present three different study cases: In the first case\n(Fig. 3), we present the performance of the above-mentioned\ndifferent algorithms (LS, TV, GIDC, INR, and N2G) across\n6\nmultiple noise levels (i.e. photons per pixel) using only 2000\nrealizations, corresponding to 20% of the total number of\npixels. The simulated maximum emitted photons per pixel\nper realization (i.e. constant C) intensities span the range of\n[100, 104]. For each noise level, we generated five different\nsets of realizations. The same evaluation metrics (e.g. mean\nerror against the phantom, etc) are independently computed for\neach of the five reconstructions. The presented results indicate\nthe average and standard deviation of the computed metrics.\nFor the TV, GIDC, INR, and N2G reconstructions, the regu-\nlarization term \u03bb was chosen by cross-validation, by putting\naside a set of realizations (10% of the total) and using them\nto assess the best value over a wide range [27]. For TV, we\nuse a \u03bb \u2208[10\u22125, 10\u22121], while for GIDC, INR, and N2G\nwe used a \u03bb \u2208[10\u22127, 10\u22124]. To improve the procedure\u2019s\nrobustness, we ran it three times with different cross-validation\nsets for each run and averaged the cross-validation loss values\nbetween those different runs. For the INR reconstructions,\nwe used a 3-layer feed-forward NN with 512 neurons per\nlayer, sinusoidal activation functions [15], and 64 Fourier\nembeddings [28]. Regarding specifically GIDC and N2G, we\nused a U-net [11] with 24 features and 3 decomposition levels\n(\u223c276k parameters). We fitted the U-net weights with 3000\nepochs of the Adam optimizer [29] with a learning rate of\n2 \u00d7 10\u22124 and weight decay of 1 \u00d7 10\u22122. We selected the\nmodel weights of the epoch where the cross-validation loss\nfunction is minimized. For N2G, we used 4 partitions and 6\npermutations (K = 4, and P = 6).\nOn the left side of Fig. 3a, we present the peak signal-to-noise\nratio (PSNR) of the reconstructions against the phantom. It is\nevident that N2G consistently achieves superior reconstruc-\ntions, with PSNRs surpassing those of competing methods\nby 1\u20132 dB, highlighting its robustness across diverse imaging\nconditions. Nonetheless, its advantage seems to diminish under\nlow-photon scenarios. While counterintuitive, this is easily\nexplained by the fact that the lower frequencies are less\naffected by Poisson noise. PSNR takes all frequencies into\naccount, resulting in a bias towards lower frequencies with\nlow-photon counts. The SSIM, by construction, is instead less\naffected by lower frequencies. For this reason, on the right\nside of Fig. 3a, we see that N2G has a growing lead in\nSSIM for lower photon counts. We also observe that for high\nphoton numbers (equating to low noise), the compression ratio\nbecomes the limiting factor for reconstruction accuracy, and\nboth PSNR and SSIM reach a plateau. 
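The optimization settings described earlier in this subsection (Adam with a learning rate of 2e-4 and weight decay of 1e-2, 3000 epochs, and selection of the epoch minimizing the cross-validation loss) can be summarized in the PyTorch sketch below; the tiny stand-in network and random tensors replace the actual U-net, sub-reconstructions, and loss terms.

```python
# Sketch of the training and model-selection loop described in the text; the
# model here is a tiny stand-in for the U-net (24 features, 3 levels, ~276k
# parameters), and the MSE terms stand in for the data-fitting + TV loss of Eq. (12a).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Conv2d(1, 24, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(24, 1, 3, padding=1))
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, weight_decay=1e-2)

x_in = torch.randn(1, 1, 100, 100)        # placeholder sub-reconstruction input
target = torch.randn(1, 1, 100, 100)      # placeholder training target
cv_target = torch.randn(1, 1, 100, 100)   # placeholder held-out (cross-validation) target

best_cv, best_state = float("inf"), None
for epoch in range(3000):
    optimizer.zero_grad()
    loss = F.mse_loss(model(x_in), target)
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        cv = F.mse_loss(model(x_in), cv_target).item()
    if cv < best_cv:                      # keep the weights of the best epoch
        best_cv, best_state = cv, copy.deepcopy(model.state_dict())

model.load_state_dict(best_state)
```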
From Fig. 3b, we have\ninstead a perspective on how the resolution is degraded with\nincreasing noise levels. We observe that N2G better preserves\nthe reconstruction resolution, by showing a drop-off at lower\nphoton counts than other methods.\nFinally, we explore dose considerations by comparing GI with\nPB and assess N2G\u2019s performance against LS, TV, GIDC, and\nINR. In particular, we evaluate the total sample and average\nmaximum pixel dose necessary to achieve a GI image with\nPSNR or SSIM quality comparable to a PB scan. From figure\nFig. 4a, we observe that in terms of the total dose, PB always\noutperforms GI at the chosen compression ratio, regardless\nof the method used, although N2G narrows the difference.\nConcerning the average maximum pixel dose per exposure\n(a) Total FoV dose (against PB)\n(b) Average maximum pixel dose / unit-time (against PB)\n(c) Average maximum pixel dose / unit-time (against N2G)\nFig. 4: Dose requirements for the chromosomes phantom,\nwith 5\u00d7 compression ratio, and varying noise levels. The\nreconstruction algorithms performance is compared against\nPB, for maximum emitted photons per pixel per realization\nintensities in the range of [100, 104]: (a) total required dose\nfor each algorithm to obtain equivalent PSNR and SSIM as a\nPB acquisition; (b) same plot as (a) for the average maximum\npixel dose per illumination; and (c) for N2G.\n(Fig. 4b), GI generally exceeds PB\u2019s performance when the\nmain noise source is detection noise (except when LS is used).\nHowever, N2G still decreases dose requirements compared to\nthe other GI reconstruction techniques and widens the region\nwhere GI is preferable over PB. Figure 4c clearly shows this\ntrend by plotting the required dose ratio from LS, TV, GIDC,\nand INR to obtain the same performance as N2G.\nC. In-depth analysis of noisy reconstructions\nWe now provide an in-depth analysis of two examples:\n(a) high compression ratio and moderate noise and (b) low\ncompression ratio and very high noise (higher than signal).\nWe present the results for (a) in Fig. 5. We produced 1000\nrealizations, resulting in a 10\u00d7 compression ratio. The applied\nPoisson noise of 102 maximum emitted photons per pixel per\nrealization (i.e. constant C = 102) resulted in fluctuations\nequal to 24.5% of the mean clean bucket fluctuations. Com-\nparing reconstructions from Figs. 5a to 5e against Fig. 5f, we\nnote that except for the LS reconstruction, all other algorithms,\nincluding N2G, can correctly reconstruct the shape of the\nphantom. The only visible difference is that N2G\u2019s recon-\nstruction presents much less high-frequency noise. This results\nin the N2G reconstruction having the highest resolution (in\npixels), PSNR, and SSIM values, and the lowest MSE among\nall reconstructions. These results are summarized in Fig. 5h.\n7\n(a) LS\n(b) TV\n(c) GIDC\n(d) INR\n(e) N2G\n(f) Phantom\n(g) FRC\nLS\nTV\nGIDC\nINR\nN2G\nMSE \u2193\n0.400\n0.004\n0.004\n0.003\n0.001\nPSNR \u2191\n2.50\n22.44\n22.7\n23.27\n26.89\nSSIM \u2191\n0.030\n0.656\n0.672\n0.700\n0.827\nResolution \u2193\n9.6\n4.4\n5.2\n4.3\n4.0\n(h) Performance comparison.\nFig. 5: GI reconstructions of the chromosomes phantom, with\n10\u00d7 compression ratio and moderate Poisson noise: Mean\nnoise fluctuations \u223c24.5% of the mean clean bucket fluctua-\ntions. From (a) to (e) we show the reconstructions of (f) with\nLS, TV, GIDC, INR and N2G respectively. 
In (g) and in (h)\nwe present the Fourier ring correlation against (f) and various\nperformance metrics of each reconstruction, respectively.\nWe present the results of (b) in Fig. 6. Here, we produced\n3333 realizations, resulting in a 3\u00d7 compression ratio. The\napplied Poisson noise of 2 maximum emitted photons per pixel\nper realization (i.e. constant C = 2) resulted in fluctuations\nequal to 164.44% of the mean clean bucket fluctuations. All\nthe reconstructed images are noticeably affected by noise\nwith a larger bandwidth than in the previous examples. N2G\ncan suppress high-frequency noise, but cannot completely\ncorrect lower-frequency noise. Despite that, by looking at\nthe FRC in Fig. 6g, we notice that it preserves much more\nbandwidth than the other algorithms and presents a much\nbetter resolution. This again results in the N2G reconstruction\nhaving the highest PSNR and SSIM values, and the lowest\nMSE among all reconstructions. These results are summarized\nin Fig. 6h.\n(a) LS\n(b) TV\n(c) GIDC\n(d) INR\n(e) N2G\n(f) Phantom\n(g) FRC\nLS\nTV\nGIDC\nINR\nN2G\nMSE \u2193\n0.150\n0.047\n0.013\n0.012\n0.006\nPSNR \u2191\n6.74\n11.81\n17.29\n17.79\n20.58\nSSIM \u2191\n0.034\n0.186\n0.336\n0.380\n0.693\nResolution \u2193\n12.0\n14.2\n10.5\n10.4\n4.3\n(h) Performance comparison.\nFig. 6: GI reconstructions of the chromosomes phantom, with\n3\u00d7 compression ratio and very high Poisson noise: Mean noise\nfluctuations \u223c164.44% of the mean clean bucket fluctuations.\nFrom (a) to (e) we show the reconstructions of (f) with LS, TV,\nGIDC, INR and N2G respectively. In (g) and in (h) we present\nthe Fourier ring correlation against (f) and various performance\nmetrics of each reconstruction, respectively.\nD. X-ray fluorescence data reconstructions\nFinally, we compare the reconstruction performance of N2G\non real GI data acquired at the beamline ID19 of the ESRF\n\u2014 The European Synchrotron [30]. The considered dataset\nconsists of 42 \u00d7 87 pixels images, with 24\u00b5m pixel size,\nand 896 realizations, where the buckets are the emitted XRF\nsignals from a sample composed of a glass capillary and three\nwires. Two of the three wires are made of Cu, and they will\nbe the focus of this reconstruction. We used the signal from\ntheir K\u03b1 emission line. At each realization, we exposed the\nsample to a 26 keV incoming X-ray beam for 0.1 seconds\nand acquired each realization 32 times. The accumulated\n3.2-second exposures serve as high-SNR buckets. Reducing\nthe number of accumulated exposures artificially reduces the\ncollected number of photons (photon flux), thus simulating\nlower dose depositions. The XRF signals are collected with\na single-pixel hyper-spectral detector, whose spectral output\n8\n(a) XRF-GI reconstructions of two Cu wires.\n(b) Transmission image.\n(c) Segmented Cu wires.\nPSNR\nSSIM\n100%\n12.9%\n2.6%\n100%\n12.9%\n2.6%\nTV\n30.24\n24.34\n21.19\n0.680\n0.319\n0.192\nGIDC\n32.76\n23.23\n21.51\n0.721\n0.315\n0.217\nINR\n33.14\n24.73\n19.43\n0.771\n0.389\n0.182\nN2G\n35.33\n27.75\n23.76\n0.933\n0.796\n0.501\n(d) PSNR and SSIM of the GI reconstructions.\nFig. 
7: Real data GI reconstructions of XRF-GI data from [30],\nusing TV, GIDC, and N2G: (a) presents the reconstructions\nat different photon counts (and corresponding noise levels);\n(b) and (c) present an x ray transmission image of the sample\nat 26 keV and the segmentation of the Cu wires, respectively;\nand (d) a table of the peak signal-to-noise ratio (PSNR)\nand structural similarity index (SSIM) of each reconstruction\nagainst their high-SNR versions in the top row of (a).\nis discretized into 4096 bins of 150 eV energy steps. Each\nXRF peak has a Lorentzian shape around the mean energy\nEe,l. This means that we observe photon counts in adjacent\nbins around the expected mean signal energy, which decrease\n\u223c\u03b3e,l/((Ee,l \u2212E)2 + \u03b32\ne,l) for bins at energy E further\naway from it, where \u03b3e,l is a constant that depends on the\nemission line. These neighboring bins are either summed or\nthe area under their curve is fitted to increase the signal\u2019s\nSNR. Therefore, we can simulate further dose reductions by\nsumming fewer and lower-count bins.\nWe present the results of the XRF-GI reconstructions in Fig. 7.\nFigure 7b presents an X-ray transmission image (radiograph)\nof the sample, while Fig. 7c indicates the regions that give\nrise to the XRF signal of interest. As mentioned above,\nwe consider the 0.1-second exposures as the \u201c100% dose\u201d\nexposures, while the 3.2-second accumulated images serve\nas ground truth (high-quality reference images). The other\ntwo images in Fig. 7a were produced using the five highest-\ncounts and the single highest-counts XRF bins, which pro-\nvide an average of 12.9% and 2.6% of the reference dose\nper realization, respectively. These two reduced-flux datasets\ncould thus correspond to 12.9 ms and 2.6 ms exposures per\nrealization, respectively. The table from Fig. 7d presents the\nestimated image quality degradation with increasing noise\nlevels, according to PSNR and SSIM. Each result is compared\nagainst their own method-related \u201creference\u201d reconstruction\nbecause we lack a PB scan of the object, which would serve\nas shared ground truth. Despite that, N2G provides the highest\nnoise reduction according to PSNR and SSIM across all the\nconsidered reconstruction methods.\nIV. CONCLUSIONS & OUTLOOK\nIn this article, we present a self-supervised deep-learning\nGI reconstruction method specifically targeted at dealing with\nrandom noise in the acquired data. If provided with enough\nhigh-quality reference data on the studied samples, existing\nsupervised approaches might exhibit high noise suppression\nperformance, with possibly higher PSNR and/or resolution\nthan our approach. However, gathering high-quality reference\ndata can be expensive and requires extensive and organized\nefforts, thus sometimes proving to be a difficult or even impos-\nsible task. This particularly applies to cutting-edge and niche\napplications in micro- and nano-imaging, dealing with rare (or\neven unique) samples with high radiation-dose sensitivity (e.g.,\nbiological specimens, battery cells) in in-vivo/in-operando\nconditions. In those cases, high-quality reference data are\nusually simply unavailable. Our approach solves this problem\nwith self-supervision, resulting in higher reconstruction per-\nformance against the state-of-the-art unsupervised approaches\nacross multiple metrics like MSE, PSNR, SSIM, and image\nresolution. 
From both the synthetic test cases and the real data\nreconstructions, we observe that our method preserves higher\nreconstruction quality than existing unsupervised methods as\nthe noise increases, i.e., at faster acquisition speeds and/or\nlower deposited dose, across a suite of case studies. This\ntranslates into the following relevant aspects: For the same\nimage quality, we could be looking at a reduction in the\nacquisition time and/or delivered dose by a factor of \u223c1.5 \u2013\n2, compared to the best existing unsupervised methods, when\nworking with the lower deposited doses; and the range of\ndeposited doses where GI offers an advantage over PB is also\nexpanded.\nThe above-mentioned aspects also mean that our method could\nenable operating GI scans at \u223c1.5 \u2013 2 shorter dwell time,\nresulting in faster GI acquisitions. This aspect is crucial for\nreal case scenarios in two alternative ways: Either increasing\nacquisition stability or the ability to acquire more realizations\nfor the same dose. In the first case, when paired with the\nreduced number of GI realizations compared to PB acqui-\nsitions, we further reduce the total acquisition times of GI\nacquisitions, strongly improving the stability of our GI scans.\nMoreover, compared to PB scans, where the sample moves\nacross the field of view, in a GI acquisition, the sample is\nfixed [30]. This further consolidates GI as a very attractive\ntechnique for reducing sample drifts and positioning errors in\nnano-scale imaging. Alternatively, the reduced exposure time\nfor each GI realization would also enable more realizations\nto be acquired per a fixed acquisition time budget. This, in\nturn, is the only way to reduce the missing-realization noise,\n9\nwhich is the ultimate limit in GI reconstruction quality, as seen\nearlier in the text.\nN2G focuses on noise reduction from the acquired mea-\nsurements. It delegates dealing with the missing realization\nartifacts to the generalization power of the underlying model\n(NN) and the rather simplistic prior knowledge given by the\nTV term. Future work could tackle this problem with a combi-\nnation of a few possible approaches, including using stronger\nor more complex priors than TV (e.g., multi-level undecimated\nwavelet minimization) and multi-channel/multi-modal infor-\nmation from correlated signals (e.g., the transmission signal\nin [16]), as demonstrated in computed tomography [31].\nAPPENDIX\nA. Metrics\nWe give the mathematical definition of the metrics used in\nsection III. For two given signals x1 and x2, MSE, PSNR, and\nSSIM can be computed as follows:\nMSE(x1, x2) = \u2225x1 \u2212x2\u22252\n2\n(14a)\nPSNR(x1, x2) = 20 log10\u2225x1\u2225\u221e\u221210 log10 MSE(x1, x2)\n(14b)\nSSIM(x1, x2) =\n(2\u00b51\u00b52 + c1)(2\u03c312 + c2)\n(\u00b52\n1 + \u00b52\n2 + c1)(\u03c32\n1 + \u03c32\n2 + c2)\n(14c)\nwhere \u2225x1\u2225\u221e= max(x1), \u00b5 and \u03c3 are the mean value and\nthe standard deviation respectively of a given signal, the c\nconstants are used to stabilize the fraction, and the SSIM is\ncomputed on 7 \u00d7 7 pixel windows. From [26], the FRC is\ndefined as:\nFRCx1,x2(ri) =\nP\nr\u2208ri Fx1(r) \u00b7 F\u2217\nx2(r)\nqP\nr\u2208ri F2x1(r) \u00b7 P\nr\u2208ri F2x2(r)\n(15)\nwhere Fx(\u00b7) is the Fourier transform of a signal x, and\nr \u2208ri is the set of all the pixels at distance ri from the\nzero frequency in Fourier space, resulting in P\nr\u2208ri being an\nazimuthal integration at the radius ri.\nB. 
Derivation of the noise model\nSince the proposed method is agnostic of the underlying\nimaging technique, we first develop an independent evaluation\nframework from the technique and noise source. We use the\nguiding example of XRF imaging to derive the just-mentioned\ngeneric noise model. For PB acquisitions and a given element\ne in a compound cmp and the corresponding XRF emission\nline l, the clean observed signal in the XRF detector is (slightly\nadapted from Eq. (10) of [32]):\n\u03a6e,l(E0) = \u03d50(E0)t\u03c7e\u03c3e,l(E0)\u03c1cmp\u03bdGdAe,l(E0)\n(16a)\n\u03bd = p2\u03c4\n(16b)\nGd =\nSd\n4\u03c0D2\nd\n(16c)\nAe,l(E0) = exp\n\u001a\n\u2212\nZ\nin\n\u00b5(E0)d\u03c4 \u2212\nZ\nout\n\u00b5(Ee,l)d\u03c4\n\u001b\n(16d)\nwhere \u03d50(E0) is the probing beam\u2019s photon flux at the energy\nE0, t the dwell time (exposure time per pixel), \u03c1c is the local\ncompound density, \u03c7e is the local relative concentration of\nthe said element e in the compound cmp, \u03c3e,l(E0) is the XRF\nproduction cross-section, \u03bd is the observed volume, where p is\nthe beam waist (pixel size) and \u03c4 the object thickness, Gd is a\ngeometric factor that accounts for the detector surface Sd and\ndistance Dd, and finally Ae,l(E0) is the combined attenuation\nof the incoming and emitted photons across the sample.\nThe total delivered dose per pixel is \u03a60(E0) = \u03d50(E0)t.\nEquation (16) can be easily modified to account for other\nemission signals like Compton radiation. The corresponding\ndetected signal is \u03a6\u03b1(E0), where \u03b1 is the observation angle,\nis obtained by changing the production cross section \u03c3e,l(E0)\nto the Compton equivalent, and by removing the elemental\nconcentrations \u03c7e (as P\ne \u03c7e = 1).\nThe observed signal y on the detector is:\ny = P (\u03a6e,l(E0)) + N\u00b5d=0,\u03c32\nd\n(17)\nwhere P is the photon counting (Poisson \u2013 shot) noise of\nthe detection process, and N is the additive white (Gaussian)\nnoise from the instrument electronics, with mean value \u00b5d = 0\nand a given variance \u03c32\nd. Per Eq. (1), b = \u03a6e,l(E0) and\n\u03f5 = (P (\u03a6e,l(E0)) \u2212\u03a6e,l(E0)) + N\u03c32\nd. In x-ray emission\ndetection, it is safe to assume \u03c32\nd = 0. Thus, we are mostly\nconfronted with Poisson noise, which is mainly modulated by\nthe quantities \u03a60(E0), \u03c1c, and \u03c7e.\nIn XRF imaging, we aim to reconstruct the local relative\nconcentrations \u03c7e (or actual concentrations \u03c7e\u03c1c), while in\nCompton imaging, we aim at the compound densities \u03c1c.\nThus, they correspond to the vector elements xi from Eq. (1).\nThis means that the other factors in Eq. (16a) can either\nbe considered constant or measured during the acquisitions,\nand they will contribute as constants in our noise description.\nTherefore, e.g. for XRF imaging, we can simplify Eqs. (16a)\nand (17) to account for each measured point j as follows:\nbm = \u03a6e,l,m(E0) = \u03c7e,mce,l(E0)\u03a60(E0)\n(18a)\nym = P (\u03c7e,mce,l(E0)\u03a60(E0)) .\n(18b)\nThus, the noise is modulated by the value of C\n=\nce,l(E0)\u03a60(E0), giving b = \u03c7eC and y = P(\u03c7eC). The\nconstant C represents the emitted photons per pixel j for a\n100% local element e concentration.\nFor GI and a homogeneous incoming beam, we mod-\nify Eq. 
(18) as follows:\nbm = \u03a6e,l,m(E0) = PN\nn wmn\u03c7e,nce,l(E0)\u03a60(E0)\n(19a)\nym = P\n\u0010PN\nn wmn\u03c7e,nce,l(E0)\u03a60(E0)\n\u0011\n(19b)\nwhere wmn are the mask transmission intensities from the\nprojection matrix W in Eq. (1). If wmn \u2208[0, 1], the constant\nC represents the maximum emitted photons per pixel. As\nexpected, from Eq. (19) we observe that WP B = I, for PB\nacquisitions. By absorbing the ce,l(E0)\u03a60(E0) into the the\nsame constant C as for PB, and normalizing data by C, we\nobtain Eq. (13).\nC. Deposited dose & acquisition time considerations\nEquations (18) and (19) offer a way to compare the perfor-\nmance of PB and GI approaches on simulated data, especially\n10\nwhen comparing the respective simulated acquisition time or\ndelivered dose to the sample. For PB acquisitions, we define:\nTotal sample dose: N\u03a60,P B(E0)\n(20a)\nTotal pixel dose: \u03a60,P B(E0)\n(20b)\nMax pixel dose / unit-time: \u03a60,P B(E0)\n(20c)\nwhile for GI acquisitions:\nTotal sample dose: PM\nm\nPN\nn wmn\u03a60,GI(E0)\n(21a)\nTotal pixel dose: PM\nm wmn\u03a60,GI(E0)\n(21b)\nMax pixel dose / unit-time: \u2225w\u00b7n\u2225\u221e\u03a60,GI(E0)\n(21c)\nfor a given pixel n, where w\u00b7n indicates the n-th column of\nthe matrix W. If the matrix elements wmn span [0, 1], then\nsup(W) = 1 and thus we can assume \u2225w\u00b7n\u2225\u221e= 1.\nGiven the ground truth \u02c6x, a GI reconstruction xGI obtained\nfrom synthetic data generated with a value CGI, and a metric\nM(\u00b7) of choice in the set {PSNR(\u02c6x; \u00b7), SSIM(\u02c6x; \u00b7), . . .}, we\nobtain the \u02c6CP B value of a PB acquisition xP B = yP B =\nP (\u02c6xCP B) that exhibits the closest performance to xGI as:\n\u02c6CP B = argmin\nCP B\n{M(xGI) \u2212M(P (\u02c6xCP B))} .\n(22)\nFinally, we can compute acquisition time or deposited dose\nratios between different techniques, for equivalent image\nquality, by computing the ratio between the corresponding\nquantities in Eqs. (20) and (21), with the corresponding C\nvalues from Eq. (22).\nREFERENCES\n[1] O. Katz, Y. Bromberg, and Y. Silberberg, \u201cCompressive ghost imaging,\u201d\nApplied Physics Letters, vol. 95, no. 13, 2009.\n[2] D. Pelliccia, A. Rack, M. Scheel, V. Cantelli, and D. M. Paganin,\n\u201cExperimental X-Ray Ghost Imaging,\u201d Physical Review Letters, vol.\n117, no. 11, p. 113902, sep 2016.\n[3] A. M. Kingston, G. R. Myers, D. Pelliccia, F. Salvemini, J. J. Bevitt,\nU. Garbe, and D. M. Paganin, \u201cNeutron ghost imaging,\u201d Physical Review\nA, vol. 101, no. 5, p. 053844, may 2020.\n[4] S. Li, F. Cropp, K. Kabra, T. J. Lane, G. Wetzstein, P. Musumeci, and\nD. Ratner, \u201cElectron Ghost Imaging,\u201d Physical Review Letters, vol. 121,\nno. 11, p. 114801, sep 2018.\n[5] R. I. Khakimov, B. M. Henson, D. K. Shin, S. S. Hodgman, R. G. Dall,\nK. G. H. Baldwin, and A. G. Truscott, \u201cGhost imaging with atoms,\u201d\nNature, vol. 540, no. 7631, pp. 100\u2013103, dec 2016.\n[6] S. L. Brunton and J. N. Kutz, Data-Driven Science and Engineering.\nCambridge University Press, jan 2019.\n[7] T. J. Lane and D. Ratner, \u201cWhat are the advantages of ghost imaging?\nMultiplexing for x-ray and electron imaging,\u201d Optics Express, vol. 28,\nno. 5, p. 5898, mar 2020.\n[8] M. Karuppasamy, F. Karimi Nejadasl, M. Vulovic, A. J. Koster, and\nR. B. G. Ravelli, \u201cRadiation damage in single-particle cryo-electron\nmicroscopy: effects of dose and dose rate,\u201d Journal of Synchrotron\nRadiation, vol. 18, no. 3, pp. 
398\u2013412, may 2011.\n[9] V. Lempitsky, A. Vedaldi, and D. Ulyanov, \u201cDeep Image Prior,\u201d in 2018\nIEEE/CVF Conference on Computer Vision and Pattern Recognition.\nIEEE, jun 2018, pp. 9446\u20139454.\n[10] S. Rizvi, J. Cao, K. Zhang, and Q. Hao, \u201cDeepGhost: real-time com-\nputational ghost imaging via deep learning,\u201d Scientific Reports, vol. 10,\nno. 1, p. 11400, jul 2020.\n[11] O. Ronneberger, P. Fischer, and T. Brox, \u201cU-Net: Convolutional Net-\nworks for Biomedical Image Segmentation,\u201d in Medical Image Com-\nputing and Computer-Assisted Intervention \u2013 MICCAI 2015, 2015, pp.\n234\u2013241.\n[12] A. Habring and M. Holler, \u201cNeural-network-based regularization meth-\nods for inverse problems in imaging,\u201d GAMM-Mitteilungen, jul 2024.\n[13] A. Qayyum, I. Ilahi, F. Shamshad, F. Boussaid, M. Bennamoun, and\nJ. Qadir, \u201cUntrained Neural Network Priors for Inverse Imaging Prob-\nlems: A Survey,\u201d IEEE Transactions on Pattern Analysis and Machine\nIntelligence, pp. 1\u201320, 2022.\n[14] F. Wang, C. Wang, M. Chen, W. Gong, Y. Zhang, S. Han, and G. Situ,\n\u201cFar-field super-resolution ghost imaging with a deep neural network\nconstraint,\u201d Light: Science & Applications, vol. 11, no. 1, p. 1, jan 2022.\n[15] V. Sitzmann, J. N. P. Martel, A. W. Bergman, D. B. Lindell, and\nG. Wetzstein, \u201cImplicit Neural Representations with Periodic Activation\nFunctions,\u201d in Proc. NeurIPS, 2020.\n[16] J. Li, S. Chen, D. Ratner, T. Blu, P. Pianetta, and Y. Liu, \u201cNanoscale\nchemical imaging with structured X-ray illumination,\u201d Proceedings of\nthe National Academy of Sciences, vol. 120, no. 49, dec 2023.\n[17] J. Lehtinen, J. Munkberg, J. Hasselgren, S. Laine, T. Karras, M. Aittala,\nand T. Aila, \u201cNoise2Noise: Learning Image Restoration without Clean\nData,\u201d in Proceedings of the 35th International Conference on Machine\nLearning, ser. Proceedings of Machine Learning Research, J. Dy and\nA. Krause, Eds., vol. 80.\nPMLR, 2018, pp. 2965\u20132974.\n[18] A. A. Hendriksen, D. M. Pelt, W. J. Palenstijn, S. B. Coban, and\nK. J. Batenburg, \u201cOn-the-Fly Machine Learning for Improving Image\nResolution in Tomography,\u201d Applied Sciences, vol. 9, no. 12, p. 2445,\njun 2019.\n[19] A. A. Hendriksen, D. M. Pelt, and K. J. Batenburg, \u201cNoise2Inverse:\nSelf-Supervised Deep Convolutional Denoising for Tomography,\u201d IEEE\nTransactions on Computational Imaging, vol. 6, pp. 1320\u20131335, 2020.\n[20] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai,\nT. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly,\nJ. Uszkoreit, and N. Houlsby, \u201cAn Image is Worth 16x16 Words:\nTransformers for Image Recognition at Scale,\u201d 2021.\n[21] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo,\n\u201cSwin Transformer: Hierarchical Vision Transformer using Shifted Win-\ndows,\u201d in 2021 IEEE/CVF International Conference on Computer Vision\n(ICCV).\nIEEE, oct 2021, pp. 9992\u201310 002.\n[22] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, \u201cBeyond a Gaus-\nsian Denoiser: Residual Learning of Deep CNN for Image Denoising,\u201d\nIEEE Transactions on Image Processing, vol. 26, no. 7, pp. 3142\u20133155,\njul 2017.\n[23] D. M. Pelt and J. A. Sethian, \u201cA mixed-scale dense convolutional neural\nnetwork for image analysis,\u201d Proceedings of the National Academy of\nSciences, vol. 115, no. 2, pp. 254\u2013259, 2018.\n[24] J.-J. Tseng, C.-H. Lu, J.-Z. Li, H.-Y. Lai, M.-H. Chen, F.-Y. 
Cheng, and\nC.-E. Kuo, \u201cAn Open Dataset of Annotated Metaphase Cell Images for\nChromosome Identification,\u201d Scientific Data, vol. 10, no. 1, p. 104, feb\n2023.\n[25] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, \u201cImage Quality\nAssessment: From Error Visibility to Structural Similarity,\u201d IEEE Trans-\nactions on Image Processing, vol. 13, no. 4, pp. 600\u2013612, apr 2004.\n[26] M. van Heel and M. Schatz, \u201cFourier shell correlation threshold criteria,\u201d\nJournal of Structural Biology, vol. 151, no. 3, pp. 250\u2013262, sep 2005.\n[27] M. Manni, A. Ben Yehuda, Y. Klein, B. Lukic, A. Kingston, A. Rack,\nS. Shwartz, and N. Vigan`o, \u201cSynchrotron-based X-ray Fluorescence\nGhost Imaging,\u201d Optics Letters, vol. 48, no. 23, pp. 6271\u20136274, 2023.\n[28] M. Tancik, P. Srinivasan, B. Mildenhall, S. Fridovich-Keil, N. Raghavan,\nU. Singhal, R. Ramamoorthi, J. Barron, and R. Ng, \u201cFourier Features\nLet Networks Learn High Frequency Functions in Low Dimensional\nDomains,\u201d in Advances in Neural Information Processing Systems,\nH. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, Eds.,\nvol. 33.\nCurran Associates, Inc., 2020, pp. 7537\u20137547.\n[29] D. P. Kingma and J. L. Ba, \u201cAdam: A method for stochastic opti-\nmization,\u201d in 3rd International Conference on Learning Representations,\nICLR 2015 - Conference Track Proceedings, Y. Bengio and Y. LeCun,\nEds., 2015.\n[30] https://doi.org/10.5281/zenodo.7828494.\n[31] D. S. Rigie and P. J. La Rivi`ere, \u201cJoint reconstruction of multi-channel,\nspectral CT data via constrained total nuclear variation minimization.\u201d\nPhysics in Medicine and Biology, vol. 60, no. 5, pp. 1741\u201362, 2015.\n[32] T. Schoonjans, A. Brunetti, B. Golosio, M. Sanchez del Rio, V. A.\nSol\u00b4e, C. Ferrero, and L. Vincze, \u201cThe xraylib library for X-ray\u2013matter\ninteractions. Recent developments,\u201d Spectrochimica Acta Part B: Atomic\nSpectroscopy, vol. 66, no. 11-12, pp. 776\u2013784, nov 2011." - }, - { - "domain": "Computer Science", - "chunk_type": "general", - "text": "Multimodal Representation Learning Techniques for\nComprehensive Facial State Analysis\nKaiwen Zheng1, Xuri Ge2,\u2217, Junchen Fu1, Jun Peng3, Joemon M. Jose1\n1University of Glasgow, School of Computing Science, Glasgow, United Kingdom\n2Shandong University, School of Artificial Intelligence, Shandong, China\n3Peng Cheng Laboratory, Shenzhen, China\nk.zheng.1@research.gla.ac.uk, xuri.ge@sdu.edu.cn, j.fu.3@research.gla.ac.uk,\npengjun.cn@outlook.com, joemon.jose@glasgow.ac.uk\nAbstract\u2014Multimodal foundation models have significantly\nimproved feature representation by integrating information from\nmultiple modalities, making them highly suitable for a broader\nset of applications. However, the exploration of multimodal\nfacial representation for understanding perception has been\nlimited. Understanding and analyzing facial states, such as Action\nUnits (AUs) and emotions, require a comprehensive and robust\nframework that bridges visual and linguistic modalities. In this\npaper, we present a comprehensive pipeline for multimodal facial\nstate analysis. First, we compile a new Multimodal Face Dataset\n(MFA) by generating detailed multilevel language descriptions of\nface, incorporating Action Unit (AU) and emotion descriptions,\nby leveraging GPT-4o. Second, we introduce a novel Multilevel\nMultimodal Face Foundation model (MF2) tailored for Action\nUnit (AU) and emotion recognition. 
Our model incorporates\ncomprehensive visual feature modeling at both local and global\nlevels of face image, enhancing its ability to represent detailed\nfacial appearances. This design aligns visual representations with\nstructured AU and emotion descriptions, ensuring effective cross-\nmodal integration. Third, we develop a Decoupled Fine-Tuning\nNetwork (DFN) that efficiently adapts MF2 across various tasks\nand datasets. This approach not only reduces computational over-\nhead but also broadens the applicability of the foundation model\nto diverse scenarios. Experimentation show superior performance\nfor AU and emotion detection tasks.\nIndex Terms\u2014Facial Representation Learning, MFA Dataset,\nFace Foundation Model, Efficient Fine tuning\nI. INTRODUCTION\nFace representation learning plays an important role in auto-\nmatic facial state analysis, such as expression recognition [1]\nand medical diagnosis [2], and has received extensive attention\nin recent decades. Its main goal is to extract facial appearance\nrepresentations for face perception and recognition. However,\nface representation learning is very challenging due to the\ncomplex and diverse appearance details of facial texture and\nmuscle states.\nEarlier studies [3] extracted facial representations from\nglobal images using convolutional neural networks (CNNs) to\npredict facial states such as emotions. For example, Burkert et\nal. [4] designed a deep CNN for facial expression recognition\nthat uses convolutional layers to capture hierarchical features.\nWhile such global representations effectively encode coarse-\ngrained texture and muscle combinations, they often lack\n* Corresponding author.\nthe fine-grained localization needed for many downstream\ntasks. Other works [5] have focused on facial muscle analysis\nthrough Action Unit (AU) recognition, with methods such\nas [6], [7] proposing local-global relational networks that\naccurately locate AU-specific regions via landmark detection.\nAlthough both global and local face representations have been\nsuccessfully applied in tasks like AU recognition [8] and\nemotion recognition [9], they still do not provide explicit\nfacial feature explanations\u2014for instance, linguistic descrip-\ntions\u2014that could further enhance interpretability.\nRecently, multimodal joint representation learning has\nachieved notable success in various applications such as health\nassessment [10] and driver fatigue detection [11]. However,\nits impact on facial state analysis remains limited due to the\ncomplexity of facial appearance features and privacy concerns.\nOn one hand, generating high-quality multimodal face anno-\ntations is challenging. Although pre-trained Multimodal Large\nLanguage Models (MLLMs) like CoCa [12] and Blip [13] can\nproduce image descriptions in diverse scenarios, no unified\napproach exists for generating optimal facial state descriptions.\nMethods such as Exp-BLIP [14] and VL-FAU [15] use LLMs\nto generate general face descriptions; however, they either lack\nsufficiently detailed AU annotations or omit nuanced emotion\nreasoning. On the other hand, effectively aligning multi-level\nmultimodal face representations\u2014integrating both local and\nglobal visual features with corresponding AU and emotion lan-\nguage representations\u2014remains underexplored. 
For instance,\nExp-BLIP [14] employs coarse-grained image-text pairs for\nexpression captioning, while VL-FAU [15] relies on fixed-\nform AU descriptions that limit further improvement in visual\nrepresentation.\nIn this paper, we address two key challenges in multimodal\nface representation learning: (i) developing robust, multilevel\nface annotation methods that provide language-image pairs at\nvarious granularities (e.g., detailed AU and emotion context\ndescriptions), and (ii) effectively aligning these multimodal\nrepresentations to enhance feature extraction.\nTo this end, we propose a comprehensive pipeline con-\nsisting of a novel Multilevel Multimodal Facial Foundation\nmodel (MF2) and an efficient Decoupled Fine-Tuning Network\n(DFN) for downstream tasks. Specifically, we first leverage\narXiv:2504.10351v1 [cs.CV] 14 Apr 2025\nthe pre-trained MLLM GPT-4o [16] to generate fine-grained\nAU descriptions and emotion reasoning for face images. Next,\nthe MF2 model integrates local and global visual features\nwith detailed language annotations to yield explicit and com-\nprehensive facial representations, serving as a foundation for\ntasks such as FAU and emotion recognition. Finally, the DFN\nenables efficient adaptation of MF2, significantly reducing\ntraining overhead while maintaining performance.\nThe contributions of this paper are as follows:\n\u2022 To enable comprehensive face representation learning, we\ncompile a new multimodal face dataset with high-quality,\nmultilevel language annotations, including descriptions\nfor various AU and emotion reasoning.\n\u2022 We propose a novel Multilevel Multimodal Face Founda-\ntion model (MF2) for comprehensive face state analysis,\nincluding FAU and emotion recognition. MF2 leverages\nlocal and global facial appearance information, aligning\nit with detailed AU descriptions and reasoning-based\nemotion annotations.\n\u2022 We further provide a fine-tuning method for MF2, re-\nferred to as the efficient Decoupled Fine-Tuning Network\n(DFN), enabling rapid adaptation to new data and enhanc-\ning practicality.\nExtensive experiments on the new multimodal benchmark\nvalidate the motivation and effectiveness of our foundation\nmodel MF2 and fine-tuning strategy DFN, facilitating the\nfuture research of face state analysis.\nII. MULTIMODEL FACIAL ANNOTATION\nTo address the limitations of existing facial datasets, we\nconstructed a new Multimodal Facial dataset (MFA). Figure\n1 illustrates the specific steps we followed to reconfigure the\ndataset, utilizing ground truth labels (emotion and AU anno-\ntation) and carefully designed prompts to generate reasonable,\nhigh-quality, multilevel facial language descriptions through\nGPT-4o [16]. In this section, we introduce the collection\nprocess of the dataset, the prompt strategies, and an overview\nof the MFA dataset.\nA. Dataset Construction\nCreating a new dataset from scratch was deemed impractical\ndue to the significant costs and complexities involved. Instead,\nwe opted to use an existing dataset as our foundation. To\nidentify a suitable dataset, we defined two key criteria:\n\u2022 The dataset must include both Action Unit (AU) and\nEmotion annotations.\n\u2022 Each image should have an individual emotion label.\nAfter a comprehensive comparison of available datasets, as\nsummarized in Table I, we found that only the Aff-Wild2\ndataset satisfied these requirements [17]. 
Consequently, we\nselected Aff-Wild2 as the base for our work.\nData Filtering: To construct a balanced dataset, we began\nby filtering the Aff-Wild2 dataset to include only images\nwith both Action Units (AUs) and Emotion annotations. This\nfiltering step also ensured emotion class balance across the\nTABLE I: Dataset overview: Comparison between existing\ndatasets and our constructed dataset.\nName\nAU\nEmotion\nRequirements\nCaption\nAffectNet [18]\n\u2717\n\u2713\n\u2713\n\u2717\nRAF-DB [19]\n\u2717\n\u2713\n\u2713\n\u2717\nDFEW [20]\n\u2717\n\u2713\n\u2713\n\u2717\nDISFA [21]\n\u2717\n\u2713\n\u2713\n\u2717\nFERV39K [22]\n\u2717\n\u2713\n\u2713\n\u2717\nSFEW [23]\n\u2717\n\u2713\n\u2713\n\u2717\nAFEW [24]\n\u2717\n\u2713\n\u2713\n\u2717\nGFT [25]\n\u2713\n\u2713\n\u2717\n\u2717\nRAF-AU [26]\n\u2713\n\u2713\n\u2717\n\u2717\nCK+ [27]\n\u2713\n\u2713\n\u2717\n\u2717\nEmotioNet [28]\n\u2713\n\u2713\n\u2717\n\u2717\nCASME-II [29]\n\u2713\n\u2713\n\u2717\n\u2717\nBP4D [30]\n\u2713\n\u2713\n\u2717\n\u2717\nAffWild2 [17]\n\u2713\n\u2713\n\u2713\n\u2717\nMFA (Ours)\n\u2713\n\u2713\n\u2713\n\u2713\nGPT-4o\nEmotion label list:\nGround Truth\nGPT4o Prompt\nThe person in the image appears to be \ndisplaying an expression of surprise. \nThe wide-open eyes and slightly parted \nlips suggest a reaction to something \nunexpected or astonishing. The raised \neyebrows also contribute to the look of \nsurprise, indicating that the person is \nreacting to something unforeseen or \nshocking.\nGPT-4o\nAU label list:\nGround Truth\nGPT4o Prompt\nThe image shows a person with a facial expression that \ncan be broken down into several Facial Action Coding \nSystem (FACS) Action Units (AUs). Here are the Aus that \nappear to be present:1.**AU1 (Inner Brow Raiser)**:The \ninner part of the eyebrows is raised 2. **AU2 (Outer Brow \nRaiser)**: The outer part of the eyebrows is raised 3. \n**AU7(Lid Tightener)**: The eyelids appear to be tightened. \n4. **AU25 (Lips Part)**: The lip sare parted 5. **AU26(Jaw \nDrop)**: The jaw appears to be dropped. The overall \nexpression seems to convey a sense of surprise or shock.\nFig. 1: Multimodal facial annotation for detailed AU descrip-\ntions and emotion reasoning language based on GPT-4o [16].\nMore details are given in supplementary materials.\ndataset. Following this process, the refined dataset was split\ninto training and validation sets. Given the video-based nature\nof the Aff-Wild2 dataset, we maintained a balance in both the\nnumber of videos and individual images when dividing the\ndata into these subsets.\nGPT-4o Prompt Strategy: Our objective is to linguistically\nannotate each image for Action Units (AUs) and emotion,\nleveraging the existing annotations effectively. Textual de-\nscriptions are incorporated to bridge the gap between anno-\ntations and model understanding, guiding emotion and AU\ndetection models by highlighting the nuanced differences in\nthese units. This approach helps the models capture subtle\nvariations, improving overall classification accuracy. To ensure\noptimal output quality, we experimented with various gener-\nation methods and prompt designs. Ultimately, GPT-4o was\nselected for its nuanced understanding and adaptability. Our\nstructured prompt framework, designed for generating high-\nquality captions, consists of three key components: task setup,\noutput formatting, and signal specification. 
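To make the three-component structure concrete, the minimal sketch below assembles a one-shot emotion prompt from the task-setup, output-format and output-signal parts. The function name and the truncated strings are illustrative placeholders rather than the authors' implementation; the full prompt wordings are given in the supplementary materials.

```python
# Illustrative sketch (not the authors' code) of assembling the three prompt parts.
def build_prompt(role_setup: str, example_qa: str, question: str) -> str:
    """Concatenate the Initial Setup, Output Format and Output Signal components."""
    return "\n\n".join([
        role_setup,                         # task setup: assigns the expert role
        example_qa,                         # output formatting: one-shot Q/A example
        f"Question: {question}\nAnswer:",   # signal: cue for GPT-4o to respond
    ])

emotion_prompt = build_prompt(
    role_setup="You are currently acting as an emotion description expert ...",
    example_qa='Question: "What is the emotion of this face?"\n'
               'Answer: "The person in the image appears to have a serious ..."',
    question="What is the emotion of this face?",
)
print(emotion_prompt)
```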
This structured\napproach enables the model to fully comprehend the task,\nensuring consistent and detailed outputs across diverse cap-\ntioning scenarios. Supplementary materials show more prompt\ndesign details.\nB. Dataset Consolidation and Summarisation\nThe dataset comprising a total of 34,696 images extracted\nfrom 151 videos. These images have been split into a training\nGlobal Image\nViT\nCross-Attention\nSelf-Attention\nFFN\nEmotion Text\nSelf-Attention\nFFN\n\u2026\nVisual Queries\nCross-Attention\nSelf-Attention\nFFN\n\u2026\nVisual Queries\n\u00d7N\n\u00d7N\nFace\n Alignment\n\u2026\nViT\nLocal Image\nAU Text\nSelf-Attention\nFFN\n\u00d7N\n\u00d7N\nBERT\nBERT\n\u2026 surprise. The wide-\nopen eyes and slightly \nparted lips suggest a \nreaction \nto \nsomething \nunexpected \nor \nastonishing. The raised \neyebrows also contribute \nto the look of surprise, \nindicating that the person \nis reacting to something \nunforeseen or shocking.\nAU1 (Inner Brow Raiser): The \ninner part of eyebrows is raised.\nAU2 (Outer Brow Raiser): The \nouter part of the eyebrows is \nraised.\nAU7 (Lid Tightener): The eyelids \nappear to be tightened. \nAU25 (Lips Part): The lips are \nparted.\nAU26(Jaw \nDrop): \nThe \njaw \nappears to be dropped.\nTRM\nTRM\nVisual/Language \nBranch (\u00d7 \ud835\udfd2)\nTRM\n\u2026\n\u229b\n\u229b\n\u229b\n\u229b\n\u0de1\ud835\udc7d\ud835\udc88\n\u0de1\ud835\udc7a\ud835\udc88\n\ud835\udc7d\ud835\udc88 \n\ud835\udc7d\ud835\udc68\ud835\udc7c \n\u0de1\ud835\udc7d\ud835\udc82\n\u0de1\ud835\udc7a\ud835\udc82\n\u0de1\ud835\udc7d\u2217/ \u0de1\ud835\udc7a\u2217\n\u2112\ud835\udc70\ud835\udc7b\ud835\udc6a\n\u2112\ud835\udc70\ud835\udc7b\ud835\udc6a\n\u2112\ud835\udc70\ud835\udc7b\ud835\udc74\n\u2112\ud835\udc70\ud835\udc7b\ud835\udc74\n\u2112\ud835\udc70\ud835\udc7b\ud835\udc6e\n\u2112\ud835\udc70\ud835\udc7b\ud835\udc6e\n(a) Emo-VL\n(b) AU-VL\n(c) DFN\nFig. 2: Framework: (a) Emo-VL combines global image features with sentiment text; (b) AU-VL integrates local image features with AU\ntext; (c) DFN uses modality-specific side adapters for efficient fine-tuning.\nset (31,320 images) involving 134 videos and a validation\nset (3,376 images) involving 17 videos. The data set includes\na balanced number of images in eight emotional categories:\nNeutral, Anger, Disgust, Fear, Happiness, Sadness, Surprise,\nand Other. Each category has a nearly equal representation in\nboth the training and validation sets to avoid class imbalance,\nensuring that the model can generalize well to different emo-\ntions. The data set supports three types of caption generation\ntasks: Emotion Caption, AU Caption, and Key AU Caption.\nSee supplementary material for details of each type of caption.\nThe extracted and reconstructed dataset, referred to as MFA,\nis a balanced dataset designed to provide a rich training ground\nfor both AU and emotion classification, ensuring that models\ntrained on it are exposed to diverse facial expressions and\naction units.\nIII. METHODOLOGY\nWe introduce a novel approach for training and fine-tuning\na comprehensive multimodal face representation foundation\nmodel, illustrated in Figure 2. 
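As a rough companion to Figure 2, the sketch below outlines how the two Q-former-style visual-language branches and the decoupled side adapters could be composed. All module names, the bottleneck width, and the token shapes are assumptions for illustration only; the adapter dimensions, activation, and gate scale follow Table VI in the appendix, and the detailed design of each block is described in the subsections that follow. During fine-tuning, only the adapters would be trained while the branch weights stay frozen.

```python
import torch
import torch.nn as nn

class SideAdapter(nn.Module):
    """One decoupled side-adapter cell (cf. Fig. 2c): down-project, activate,
    up-project, then add back to the branch output with a small gate scale."""
    def __init__(self, dim: int = 768, bottleneck: int = 64, gate: float = 0.1):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.ReLU()
        self.up = nn.Linear(bottleneck, dim)
        self.gate = gate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.gate * self.up(self.act(self.down(x)))

class BranchSketch(nn.Module):
    """Stand-in for one visual-language branch (Emo-VL or AU-VL); the adapter
    runs alongside it and is the only part updated during fine-tuning."""
    def __init__(self, dim: int = 768):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.adapter = SideAdapter(dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.adapter(self.block(tokens))

emo_vl, au_vl = BranchSketch(), BranchSketch()   # global-level and local-level branches
global_tokens = torch.randn(2, 197, 768)         # e.g. ViT patch tokens of the whole face
local_tokens = torch.randn(2, 49, 768)           # e.g. tokens of detected AU regions
face_repr = torch.cat([emo_vl(global_tokens).mean(dim=1),
                       au_vl(local_tokens).mean(dim=1)], dim=-1)
```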
Our proposed Multilevel Mul-\ntimodal Face Foundation model (MF2) is designed for diverse\nfacial state analyses, such as FAU and emotion recognition.\nMF2 leverages newly constructed AU and emotion language\ndescriptions, in MFA, to align with both local and global\nfacial representations, enabling the generation of face rep-\nresentations enriched with detailed features and contextual\ninformation. Furthermore, we propose a new Decoupled Fine-\nTuning Network (DFN) for efficiently fine-tuning tasks after\ntraining MF2.\nA. Multilevel Multimodal Face Foundation Model \u2013 MF2\nOverview. MF2 consists of two main Q-former-based\nvisual-language branches, i.e. a global-level visual encoder\nwith reasoning-based emotion language alignment (Emo-VL)\nand a local-level visual encoder with fine-grained AU language\nalignment (AU-VL). The former uses global contexts and situ-\national reasoning in emotional language to assist and improve\nthe ability and discriminability of global face visual represen-\ntation. The latter further uses each AU language description\nto accurately improve the visual representation of each muscle\narea, and improves the fine-grained face representation. During\ntraining, we leverage linguistic descriptions to guide the model\nin identifying situational cues.\nEmo-VL. Following BLIP-2 [31], we model the global\nface visual representation and emotion description language\nrepresentation in a unified Q-former, as shown in Figure 2\n(a). Emo-VL employs a pre-trained ViT model [32] to extract\nthe global face feature V g and then it is input into a Q-former-\nbased multimodal alignment module to align with the emotion\nlanguage representation Se from a pre-trained BERT [33] for\nthe final recognition tasks. Specifically, the Q-former-based\nglobal alignment module contains a visual encoder and a lan-\nguage encoder. The visual encoder consists of N transformer-\nbased blocks, each containing a Self-Attention layer (SA),\na Cross-Attention layer (CA), and a Feedforward Network\n(FFN). The language encoder also consists of N blocks,\nwhere each contains a self-attention layer and an FFN. Due\nto the characteristics of Q-former [31], an additional cross-\nattention layer with the learned queries (Qg) is contained in the\nvisual encoder. Similar with BLIP-2 [31], we utilize the Image-\nText Contrastive Learning loss (LITC), Image-grounded Text\nGeneration loss (LITG) and Image-Text Matching loss (LITM)\nto optimise the visual-language alignment and recognition of\nface states, such as FAU activation state and emotion category,\nby corresponding task classifiers. The overall working flow of\nEmo-VL is formulated as:\n\u02c6V g = FFN(CA(SA(V g), Qg)),\n(1)\n\u02c6Sg = FFN(SA(Se))\n(2)\nThe object functions are followed as:\nLITC = \u22121\n2M\nM\nX\ni=1\n\u0014\nlog\nexp\n\u0000sim(\u02c6vg\ni ,\u02c6se\ni)/\u03c4\n\u0001\nPM\nj=1 exp\n\u0000sim(\u02c6vg\ni ,\u02c6se\nj)/\u03c4\n\u0001\n+ log\nexp\n\u0000sim(\u02c6se\ni, \u02c6vg\ni )/\u03c4\n\u0001\nPM\nj=1 exp\n\u0000sim(\u02c6se\ni, \u02c6vg\nj )/\u03c4\n\u0001\n\u0015\n(3)\nLITM = \u2212\nM\nX\ni=1\n\u0002\nyi log p(y = 1|\u02c6vg\ni ,\u02c6se\ni)\n+ (1 \u2212yi) log p(y = 0|\u02c6vg\ni ,\u02c6se\ni)\n\u0003\n(4)\nLITG = \u2212\nX\ni\u2208mask\nlog p(w\u2217\ni |wmask\\i, \u02c6Vg)\n(5)\nwhere M is the size of image-text pairs. wi is the target word\nto predict in text generation or masked language modeling\ntasks. \u03c4 is a temperature parameter.\nAU-VL. 
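Both branches are optimised with the same image-text objectives. As a concrete, hedged illustration of the contrastive term in Eq. (3), the sketch below computes a symmetric InfoNCE loss over a batch of M pooled query and text embeddings; the tensor names are illustrative, the temperature default follows Table VI, and the actual BLIP-2/Q-former implementation differs in detail (for example in how queries are pooled and negatives are handled).

```python
import torch
import torch.nn.functional as F

def itc_loss(v_g: torch.Tensor, s_e: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """Symmetric image-text contrastive loss in the form of Eq. (3).

    v_g: (M, D) pooled visual query embeddings (playing the role of V-hat^g)
    s_e: (M, D) pooled emotion-text embeddings (playing the role of S-hat^e)
    """
    v = F.normalize(v_g, dim=-1)
    s = F.normalize(s_e, dim=-1)
    logits = v @ s.t() / tau                           # (M, M) cosine similarities / temperature
    targets = torch.arange(v.size(0), device=v.device)
    # image-to-text and text-to-image cross-entropy terms, averaged as in Eq. (3)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

loss = itc_loss(torch.randn(8, 256), torch.randn(8, 256))   # toy batch of M = 8 pairs
```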
Emo-VL improves the ability to represent global\nfaces by aligning the global face feature with the emotion\nlanguage, which explicitly contains global emotion reasoning\ninformation. To further compensate for the lack of fine-grained\nface representation, we propose local face representation en-\nhancement based on the positioning accuracy advantage of\nAction Units (AUs), as shown in Figure 2 (b), named AU-\nVL. Similarly, we use the AU language description to align\nwith the local AU visual representation in a Q-former-based\nmodule to improve its multimodal representation capability.\nThe local AU visual representations are extracted based on the\ndetected face landmarks from a pre-trained landmark detector\n[34]. The structure of the local Q-former-based alignment\nmodule is the same as Emo-VL. Specifically, to extract the\nprecise AU features in a face image, we use a pre-trained\nlandmark detector [34] to localise the AU positions and extract\nthe corresponding representations V AU = {V a\n1 , ..., V a\nn } from\nViT-based visual features. All AU captions are embedded by\nthe BERT [33] as SAU={Sa\n1, ..., Sa\nn}. After that, we also\nemploy the Q-former-based AU alignment module to align\nthe local AU visual features and fine-grained AU language\nfeatures by the same objective functions in Emo-VL. Note\nthat, the visual encoder and language encoder in Q-former\nalignment are shared for different AUs to save parameters.\nFinally, we obtain the local AU representations \u02c6V a and their\ncorresponding detailed language representations \u02c6Sa.\nDuring the multilevel visual-language joint learning, we\nuse the cross entropy loss function [35] to optimize an AU\nrecognizer and an emotion recognizer respectively for the final\nfacial state analysis. Thus, we obtain a face foundation model\nfor FAU recognition and emotion recognition.\nB. Efficient Decoupled Fine-tuning Network \u2013 DFN\nAs the foundational backbone of MF2, the Q-former faces\ntwo primary limitations: (1) its transformer-based architecture\nis computationally expensive, and (2) to mitigate this cost,\nit employs shared Self-Attention and FFN modules for mul-\ntimodal contrastive learning. While this design may enhance\ncross-modal interaction, it compromises the unique represen-\ntation capability of individual modalities. To address these\nchallenges and improve the generalization of the proposed\nfoundation model MF2, we propose a simple yet effective\nDecoupled Fine-Tuning Network (DFN) for pre-trained MF2\nbuilt entirely with lightweight adapters. The detailed frame-\nwork is shown in Figure 2 (c). Inspired by the advanced Side\nAdapter paradigm [37], [38], which outperforms traditional\nadapters and LoRA in efficiency [39], [40], DFN decouples\nthe shared modules into distinct side adapter pathways. By\nincorporating unique modality-specific adjustments through\ntwo independent side adapters, DFN effectively mitigates\ninterference between modalities while significantly reducing\ncomputational overhead. Specifically, DFN is parallel to each\nmodality branch in MF2 and performs decoupling fine-tuning.\nTherefore, there are 4N DFN cells in total, each of which\nconsists of a downsampling and upsampling layer composed\nof a fully connected layer, and is connected using an activation\nfunction. When fine-tuning the DFN, we freeze the MF2\nbackbone and only update the parameters of DFN for the new\ntask, under the optimization of new task objective functions.\nIV. EXPERIMENTS\nA. 
Experimental Settings\nImplemental Details. All details are shown in supplementary\nmaterials.\nEvaluation Metrics. The evaluation metrics include the F1\nscore for facial Action Unit (AU) detection and the classifica-\ntion accuracy for face emotion recognition.\nB. Experimental Results\nCompared Methods. We compare the proposed MF2 and\nits DFN-based fine-tuning model with three baselines for\nAU recognition in Table II and two baselines for emotion\nrecognition in Table III. For AU recognition, it contains ME-\nGraphAU [36], Exp-BLIP [14], VL-FAU [15]. For Emotion\nrecognition, HSEomtion [41] and Exp-BLIP [14] are compared\nwith our models. More baseline model details are shown in\nsupplementary materials.\nPerformance of FAU Recognition. Table II highlights the\nperformance of various models on the MFA dataset for FAU\nrecognition. Among the baseline models, VL-FAU achieves\nthe highest average performance with an F1 score of 48.19%.\nHowever, both versions of our proposed MF2 model signifi-\ncantly outperform these baselines. Specifically, the MF2 (Pre-\nTrain) model achieves an average F1 score of 50.77%, while\nMF2 (Fine-Tuning) further improves to 53.35%, representing\na substantial margin of +5.16% over the best-performing\nbaseline (VL-FAU).\nPerformance of Emotion Recognition. Table III presents\nthe performance of various models on the MFA dataset for\nemotion recognition. Our MF2 (Pre-Train) model achieves\nan average accuracy of 83.48%, and the MF2 (Fine-Tuning)\nmodel further boosts performance to 84.40%, demonstrating a\nnotable margin of +2.26% over the best-performing baseline\nExp-BLIP [14]. These results, combined with the recognition\nTABLE II: Quantitative evaluation of AU recognition on the MFA dataset. The evaluation metric is F1-score (%)\nModels\nAU1\nAU2\nAU4\nAU6\nAU7\nAU10\nAU12\nAU15\nAU23\nAU24\nAU25\nAU26\nAvg.\nExp-BLIP [14]\n40.25\n12.63\n63.41\n53.28\n69.43\n71.76\n60.18\n46.85\n27.60\n10.27\n86.43\n25.61\n47.31\nME-GraphAU [36]\n41.94\n13.72\n55.91\n41.92\n76.57\n70.48\n53.68\n61.42\n20.13\n03.88\n85.53\n30.47\n46.30\nVL-FAU [15]\n43.09\n15.86\n55.59\n49.35\n77.57\n73.51\n54.81\n60.00\n29.50\n03.72\n84.25\n31.08\n48.19\nMF2 (Pre-Train)\n50.17\n18.75\n73.18\n54.83\n76.58\n70.00\n52.57\n48.92\n29.06\n11.72\n88.68\n34.80\n50.77\nMF2 (Fine-Tuning)\n44.76\n15.64\n66.90\n50.42\n76.70\n73.17\n57.80\n54.51\n33.49\n43.02\n89.26\n34.55\n53.35\nTABLE III: Quantitative evaluation of emotion recognition on the MFA dataset. The evaluation metric is accuracy (%)\nModel\nNeutral\nAnger\nDisgust\nFear\nHappiness\nSadness\nSurprise\nOther\nAvg.\nExp-BLIP [14]\n82.17\n92.74\n86.58\n86.79\n90.73\n88.30\n79.56\n50.24\n82.14\nHSEmotion [41]\n80.95\n85.99\n86.82\n86.73\n85.31\n77.84\n69.05\n76.30\n81.12\nMF2 (Pre-Train)\n87.70\n95.50\n86.37\n86.64\n86.05\n88.09\n82.08\n55.39\n83.48\nMF2 (Fine-Tuning)\n84.53\n92.57\n79.95\n82.41\n83.92\n89.51\n87.14\n75.21\n84.40\nTABLE IV: Ablation analysis of emotion recognition Model. 
The evaluation metric is accuracy (%), TT for training time\n(min/epoch), IT for inference time (min/epoch), and TP for trainable parameters\nModel\nNeutral\nAnger\nDisgust\nFear\nHappiness\nSadness\nSurprise\nOther\nAvg.\nTT\nIT\nTP\nMF2 (Fine-Tuning)\n84.53\n92.57\n79.95\n82.41\n83.92\n89.51\n87.14\n75.21\n84.40\n12.6\n6.4\n52.88M\nw/o DFN\n87.70\n95.50\n86.37\n86.64\n86.05\n88.09\n82.08\n55.39\n83.48\n62.3\n5.1\n373.4M\nw/o Emo-VL\n88.57\n94.55\n86.82\n86.76\n84.66\n84.86\n50.71\n78.44\n82.42\n8.4\n4.5\n186.7M\nw/o AU-VL\n90.52\n93.39\n86.85\n87.17\n71.39\n88.92\n71.18\n61.56\n81.87\n4.1\n2.1\n186.7M\nresults from the FAU, highlight the comprehensive capabilities\nof the MF2 model. By utilising the Emo-VL and AU-VL\nmodules, MF2 effectively integrates both global and fine-\ngrained facial features aligned with corresponding diverse\nAU and emotion language, ensuring superior performance\nacross different tasks. Furthermore, the success of the MF2\n(Fine-Tuning) model demonstrates the effectiveness of decou-\npling the DFN implementation. Overall, this highlights the\nrobustness and adaptability of the model in multimodal facial\nrepresentation learning.\nC. Ablation Study\nTo demonstrate the effectiveness of the proposed modules,\nwe conducted extensive ablation studies. We show how each\ncomponent influences the overall performance of the MF2\nmodel. Table IV presents the component ablation study for\nthe MF2 model, including (1) Efficiency of Decoupled Fine-\nTuning and (2) Impact of Global and Local Feature Integration.\nEfficiency of Decoupled Fine-Tuning (DFN). Removing the\nDecoupled Fine-Tuning Network (DFN) led to a performance\ndrop of 0.92% and increased training time from 12.6 minutes\nper epoch to 62.3 minutes, as shown in Table IV. Moreover,\nthe number of trainable parameters rose drastically from 52.88\nmillion with DFN to 373.42 million without it. These findings\nunderscore DFN\u2019s critical role in reducing computational over-\nhead and optimizing parameter efficiency while maintaining\nhigh performance.\nImpact of Global (Emo-VL) and Local (AU-VL) Feature\nIntegration. Furthermore, removing the AU-VL module re-\nsulted in a significant performance drop (-2.53%), compared\nto a smaller drop (-1.98%) when the Emo-VL module was\nremoved, as shown in Table IV. Additionally, training time\ndecreased to 4.1 minutes per epoch without AU-VL and\nto 4.5 minutes without Emo-VL, highlighting a trade-off\nbetween computational efficiency and model effectiveness.\nThese results demonstrate that AU-VL plays a pivotal role in\ncapturing fine-grained, muscle-specific features, while Emo-\nVL enhances global contextual understanding. Together, these\nmodules ensure a balanced and comprehensive facial repre-\nsentation.\nThe ablation studies confirm the effectiveness of the MF2\nmodel\u2019s design, highlighting the critical role of each compo-\nnent in achieving state-of-the-art performance while ensuring\ncomputational and parameter efficiency.\nV. CONCLUSION\nThis paper presented a novel multimodal facial representa-\ntion learning pipeline, integrating image and text modalities\nto enhance AU and emotion recognition. We compiled the\nMFA dataset with high-quality detailed AU and emotional\ndescription linguistically. The proposed foundation model MF2\neffectively combines global (Emo-VL) and local (AU-VL)\nvisual-language representations with emotion and AU lan-\nguage alignment learning, ensuring comprehensive and de-\ntailed facial feature enhancement. 
Additionally, our Decoupled\nFine-Tuning Network (DFN) enables efficient task-specific\nfine-tuning, reducing computational cost and achieving supe-\nrior performance. Experimental results validated the effective-\nness of our multimodal MF2 model and its efficient fine-tuning\nstrategy (DFN), outperforming state-of-the-art methods while\ndemonstrating a reduction in training time. Future work will\nfocus on exploring advanced multimodal representations and\nimproving relational reasoning in face analysis.\nREFERENCES\n[1] Ling Lei, Tong Chen, Shigang Li, and Jianfeng Li, \u201cMicro-expression\nrecognition based on facial graph representation learning and facial\naction unit fusion,\u201d in CVPR, 2021, pp. 1571\u20131580.\n[2] Bo Jin, Leandro Cruz, and Nuno Gonc\u00b8alves, \u201cDeep facial diagnosis:\nDeep transfer learning from face recognition to facial diagnosis,\u201d IEEE\nAccess, vol. 8, pp. 123649\u2013123661, 2020.\n[3] Yinglin Zheng, Hao Yang, Ting Zhang, Jianmin Bao, Dongdong Chen,\nYangyu Huang, Lu Yuan, Dong Chen, Ming Zeng, and Fang Wen,\n\u201cGeneral facial representation learning in a visual-linguistic manner,\u201d\n2022.\n[4] Abir Fathallah, Lotfi Abdi, and Ali Douik, \u201cFacial expression recogni-\ntion via deep learning,\u201d in AICCSA, 2017, pp. 745\u2013750.\n[5] Ruicong Zhi, Mengyi Liu, and Dezheng Zhang,\n\u201cA comprehensive\nsurvey on automatic facial action unit analysis,\u201d Vis. Comput., vol. 36,\nno. 5, pp. 1067\u20131093, May 2020.\n[6] Xuri Ge, Pengcheng Wan, Hu Han, Joemon M Jose, Zhilong Ji,\nZhongqin Wu, and Xiao Liu, \u201cLocal global relational network for facial\naction units recognition,\u201d in FG. IEEE, 2021, pp. 01\u201308.\n[7] Xuri Ge, Joemon M Jose, Pengcheng Wang, Arunachalam Iyer, Xiao\nLiu, and Hu Han, \u201cAlgrnet: Multi-relational adaptive facial action unit\nmodelling for face representation and relevant recognitions,\u201d\nIEEE\nTransactions on Biometrics, Behavior, and Identity Science, vol. 5, no.\n4, pp. 566\u2013578, 2023.\n[8] Jiannan Yang, Fan Zhang, Bike Chen, and Samee U. Khan,\n\u201cFacial\nexpression recognition based on facial action unit,\u201d in IGSC, 2019, pp.\n1\u20136.\n[9] Dhwani Mehta, Mohammad Faridul Haque Siddiqui, and Ahmad Y.\nJavaid,\n\u201cFacial emotion recognition: A survey and real-world user\nexperiences in mixed reality,\u201d Sensors, vol. 18, no. 2, 2018.\n[10] Tao Zhou, Mingxia Liu, Kim-Han Thung, and Dinggang Shen, \u201cLatent\nrepresentation learning for alzheimer\u2019s disease diagnosis with incom-\nplete multi-modality neuroimaging and genetic data,\u201d ITMT, vol. 38,\nno. 10, pp. 2411\u20132422, 2019.\n[11] Jinxuan Shi and Kun Wang, \u201cFatigue driving detection method based\non time-space-frequency features of multimodal signals,\u201d BSPC, vol.\n84, pp. 104744, 2023.\n[12] Jiahui Yu, Zirui Wang, Vijay Vasudevan, Legg Yeung, Mojtaba Seyed-\nhosseini, and Yonghui Wu, \u201cCoca: Contrastive captioners are image-text\nfoundation models,\u201d 2022.\n[13] Junnan Li, Dongxu Li, Caiming Xiong, and Steven Hoi,\n\u201cBlip:\nBootstrapping language-image pre-training for unified vision-language\nunderstanding and generation,\u201d 2022.\n[14] Yujian Yuan, University of Chinese Academy of Science, Jiabei Zeng,\nand Shiguang Shan, \u201cDescribe your facial expressions by linking image\nencoders and large language models,\u201d in BMVC 2023. 
2023, BMVA.\n[15] Xuri Ge, Junchen Fu, Fuhai Chen, Shan An, Nicu Sebe, and Joemon M\nJose, \u201cTowards end-to-end explainable facial action unit recognition via\nvision-language joint learning,\u201d in ACM MM, 2024, pp. 8189\u20138198.\n[16] Partha Pratim Ray, \u201cChatgpt: a comprehensive review on background,\napplications, key challenges, bias, ethics, limitations and future scope,\u201d\nInternet of Things and Cyber-Physical Systems, vol. 3, pp. 121\u2013154, 04\n2023.\n[17] Dimitrios Kollias, Panagiotis Tzirakis, Alice Baird, Alan Cowen, and\nStefanos Zafeiriou,\n\u201cAbaw: Valence-arousal estimation, expression\nrecognition, action unit detection & emotional reaction intensity esti-\nmation challenges,\u201d 2023.\n[18] Ali Mollahosseini, Behzad Hasani, and Mohammad H. Mahoor, \u201cAffect-\nnet: A database for facial expression, valence, and arousal computing in\nthe wild,\u201d IEEE TAC, vol. 10, no. 1, pp. 18\u201331, Jan. 2019.\n[19] Shan Li and Weihong Deng, \u201cReliable crowdsourcing and deep locality-\npreserving learning for unconstrained facial expression recognition,\u201d\nIEEE TIP, vol. 28, no. 1, pp. 356\u2013370, 2019.\n[20] Xingxun\nJiang,\nYuan\nZong,\nWenming\nZheng,\nChuangao\nTang,\nWanchuang Xia, Cheng Lu, and Jiateng Liu,\n\u201cDfew: A large-scale\ndatabase for recognizing dynamic facial expressions in the wild,\u201d 2020.\n[21] S. Mohammad Mavadati, Mohammad H. Mahoor, Kevin Bartlett, Philip\nTrinh, and Jeffrey F. Cohn, \u201cDisfa: A spontaneous facial action intensity\ndatabase,\u201d IEEE TAC, vol. 4, no. 2, pp. 151\u2013160, 2013.\n[22] Yan Wang, Yixuan Sun, Yiwen Huang, Zhongying Liu, Shuyong Gao,\nWei Zhang, Weifeng Ge, and Wenqiang Zhang, \u201cFerv39k: A large-scale\nmulti-scene dataset for facial expression recognition in videos,\u201d 2022.\n[23] Abhinav Dhall, Roland Goecke, Simon Lucey, and Tom Gedeon, \u201cStatic\nfacial expression analysis in tough conditions: Data, evaluation protocol\nand benchmark,\u201d in 2011 IEEE ICCV Workshops, 2011, pp. 2106\u20132112.\n[24] Jean Kossaifi, Georgios Tzimiropoulos, Sinisa Todorovic, and Maja\nPantic, \u201cAfew-va database for valence and arousal estimation in-the-\nwild,\u201d Image Vision Comput., vol. 65, no. C, pp. 23\u201336, Sept. 2017.\n[25] Jeffrey M Girard, Wen-Sheng Chu, L\u2019aszl\u2019o A Jeni, Jeffrey F Cohn,\nFernando De La Torre, and Michael A Sayette, \u201cSayette group formation\ntask (GFT) spontaneous facial expression database,\u201d in IEEE FG, 2017.\n[26] Wen-Jing Yan, Shan Li, Chengtao Que, Jiquan Pei, and Weihong Deng,\n\u201cRaf-au database: In-the-wild facial expressions with subjective emotion\njudgement and objective au annotations,\u201d in ACCV, November 2020.\n[27] Patrick Lucey, Jeffrey F. Cohn, Takeo Kanade, Jason Saragih, Zara\nAmbadar, and Iain Matthews, \u201cThe extended cohn-kanade dataset (ck+):\nA complete dataset for action unit and emotion-specified expression,\u201d\nin CVPR Workshops, 2010, pp. 94\u2013101.\n[28] C. Fabian Benitez-Quiroz, Ramprakash Srinivasan, and Aleix M. Mar-\ntinez, \u201cEmotionet: An accurate, real-time algorithm for the automatic\nannotation of a million facial expressions in the wild,\u201d in CVPR, 2016,\npp. 5562\u20135570.\n[29] Fangbing Qu, Sujing Wang, Wen-Jing Yan, and Xiaolan Fu, \u201cCas(me)2:\nA database of spontaneous macro-expressions and micro-expressions,\u201d\nin Interacci\u00b4on, 2016.\n[30] Xing Zhang, Lijun Yin, Jeffrey F. Cohn, Shaun Canavan, Michael Reale,\nAndy Horowitz, Peng Liu, and Jeffrey M. 
Girard, \u201cBp4d-spontaneous:\na high-resolution spontaneous 3d dynamic facial expression database,\u201d\nIVC, vol. 32, no. 10, pp. 692\u2013706, 2014, Best of Automatic Face and\nGesture Recognition 2013.\n[31] Junnan Li, Dongxu Li, Silvio Savarese, and Steven Hoi,\n\u201cBlip-2:\nBootstrapping language-image pre-training with frozen image encoders\nand large language models,\u201d 2023.\n[32] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weis-\nsenborn,\nXiaohua\nZhai,\nThomas\nUnterthiner,\nMostafa\nDehghani,\nMatthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and\nNeil Houlsby, \u201cAn image is worth 16x16 words: Transformers for image\nrecognition at scale,\u201d 2021.\n[33] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova,\n\u201cBert: Pre-training of deep bidirectional transformers for language\nunderstanding,\u201d 2019.\n[34] Adrian Bulat and Georgios Tzimiropoulos,\n\u201cHow far are we from\nsolving the 2d & 3d face alignment problem? (and a dataset of 230,000\n3d facial landmarks),\u201d in ICCV, 2017.\n[35] Anqi Mao, Mehryar Mohri, and Yutao Zhong,\n\u201cCross-entropy loss\nfunctions: Theoretical analysis and applications,\u201d 2023.\n[36] Cheng Luo, Siyang Song, Weicheng Xie, Linlin Shen, and Hatice Gunes,\n\u201cLearning multi-dimensional edge feature-based au relation graph for\nfacial action unit recognition,\u201d\nin IJCAI. July 2022, IJCAI-2022, p.\n1239\u20131246, IJCAI.\n[37] Junchen Fu, Xuri Ge, Xin Xin, Alexandros Karatzoglou, Ioannis Ara-\npakis, Kaiwen Zheng, Yongxin Ni, and Joemon M Jose,\n\u201cEfficient\nand effective adaptation of multimodal foundation models in sequential\nrecommendation,\u201d arXiv preprint arXiv:2411.02992, 2024.\n[38] Junchen Fu, Xuri Ge, Xin Xin, Alexandros Karatzoglou, Ioannis Ara-\npakis, Jie Wang, and Joemon M Jose, \u201cIisan: Efficiently adapting mul-\ntimodal representation for sequential recommendation with decoupled\npeft,\u201d in Proceedings of the 47th International ACM SIGIR Conference\non Research and Development in Information Retrieval, 2024, pp. 687\u2013\n697.\n[39] Junchen Fu, Fajie Yuan, Yu Song, Zheng Yuan, Mingyue Cheng,\nShenghui Cheng, Jiaqi Zhang, Jie Wang, and Yunzhu Pan, \u201cExploring\nadapter-based transfer learning for recommender systems: Empirical\nstudies and practical insights,\u201d\nin Proceedings of the 17th ACM\ninternational conference on web search and data mining, 2024, pp. 208\u2013\n217.\n[40] Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone,\nQuentin De Laroussilhe, Andrea Gesmundo, Mona Attariyan, and Syl-\nvain Gelly, \u201cParameter-efficient transfer learning for nlp,\u201d in ICML.\nPMLR, 2019, pp. 2790\u20132799.\n[41] Andrey V. Savchenko,\n\u201cHsemotion: High-speed emotion recognition\nlibrary,\u201d Software Impacts, vol. 14, pp. 100433, 2022.\n[42] Xuri Ge, Junchen Fu, Fuhai Chen, Shan An, Nicu Sebe, and Joemon M.\nJose,\n\u201cTowards end-to-end explainable facial action unit recognition\nvia vision-language joint learning,\u201d\nin MM. Oct. 2024, MM \u201924, p.\n8189\u20138198, ACM.\nAPPENDIX\nA. GPT-4o Prompt Design\nWe designed a three-stage GPT-4o prompt (Initial Setup,\nOutput Formpt and Output Signal) to generate the three high-\nquality descriptive captions we needed: the AU caption, the\nEmo caption and the Key AU caption. 
Below, we discuss the\nrationale and considerations behind the prompt structures used.\nNote: The prompt examples used in the introductory archi-\ntecture section are all emotion prompt examples.\nYou are currently acting as an emotion description expert and your ability is to \nrecognize a person\u2019s expression, and their possible elicited hidden emotions based \non an image of their face, please answer the questions according to this example:\nInitial Setup\nFig. 3: Emotion Initial Setup Prompt\nInitial Setup. In this step Figure 3, the Prompt model is\nassigned a specific role relevant to the task. For example, the\nmodel can be instructed to take on the role of an \u201demotion\ndescription expert\u201d or an \u201daction unit recognition expert.\u201d This\nhelps ChatGPT better understand the task\u2019s context, clarify\nthe desired goal, and focus on a particular task, such as\nrecognizing Action Units in emoticons. By doing so, the\nmodel reduces ambiguity and applies relevant knowledge\nmore accurately, enhancing the response\u2019s relevance and the\nquality of the generated results. This step ensures that the\nmodel performs optimally when addressing specific problems,\nthereby effectively improving the accuracy and consistency of\nthe generated content.\nOutput Format\nQuestion: \"What is the emotion of this face?\"\nAnswer: \"The person in the image appears to have a serious or intense \nexpression. The eyebrows are slightly furrowed, and the mouth is closed in a \nneutral position, which could suggest concentration, concern, or deep thought. \nThere is no clear smile or frown, so it\u2019s not a definitive indication of happiness or \nsadness, but rather a more contemplative or focused demeanor.\u201c\nFig. 4: Emotion Output Format Prompt\nOutput Format. In this step Figure 4, we provide the\nmodel with an example question-answer format that serves\nas a guide for structuring its responses. This example helps\nthe model understand the desired level of detail, tone, and\nformat, ensuring standardized outputs across different inputs.\nBy referencing the example, the model learns to include all\nnecessary components in its responses, such as specific facial\nfeatures, their emotional implications, and the relationships\nbetween facial action units. This consistency is especially\ncrucial for complex tasks like emotion and AU classification,\nwhere responses must be informative, contextually relevant,\nand coherent. The example acts as a template, helping the\nmodel generate responses that are accurate, well-organized,\nand easy to interpret. Additionally, it sets a standard for\ndepth and clarity, ensuring that the model consistently delivers\ncontext-aware, detailed, and relevant outputs.\nOutput Signal. In this step Figure 5, we provide the\nmodel with an example question-answer format that serves\nas a guide for structuring its responses. This example helps\nOutput Signal\nQuestion : What is the emotion of this face?.\n\nAnswer:\nFig. 5: Emotion Output Signal Prompt\nthe model understand the desired level of detail, tone, and\nformat, ensuring standardized outputs across different inputs.\nBy referencing the example, the model learns to include all\nnecessary components in its responses, such as specific facial\nfeatures, their emotional implications, and the relationships\nbetween facial action units. This consistency is especially\ncrucial for complex tasks like emotion and AU classification,\nwhere responses must be informative, contextually relevant,\nand coherent. 
The example acts as a template, helping the\nmodel generate responses that are accurate, well-organized,\nand easy to interpret. Additionally, it sets a standard for\ndepth and clarity, ensuring that the model consistently delivers\ncontext-aware, detailed, and relevant outputs.\nSummary. We employ a novel prompt-based method using\nGPT-4o [16] to generate detailed captions for both emotion and\naction unit (AU) analysis, offering deeper insights into facial\nexpressions and their emotional implications. For emotion\ncaptioning, the model, guided by a prompt that positions it as\nan \u201demotion description expert,\u201d interprets subtle facial cues\nsuch as eyebrow or lip movements to produce rich, context-\naware descriptions beyond simple emotion labels. For AU cap-\ntioning, the model acts as an \u201dAU description expert,\u201d breaking\ndown facial expressions into specific AUs (e.g., AU4 for Brow\nLowerer, AU24 for Lip Pressor) with detailed explanations\nof their contributions to overall expressions. Furthermore, the\nkey AU caption approach focuses on identifying the most\ninfluential AUs for a given emotion, highlighting their decisive\nroles in conveying emotional states. This integrated approach\nprovides a comprehensive understanding of how facial muscle\nmovements define emotions, offering precise interpretations of\ncomplex expressions where multiple AUs interact.\nB. GPT-4o Prompt Example\nDue to space limitations in the main text, we cannot\npresent complete examples of the three prompt types and their\ngenerated captions (AU caption, emotion caption, and key AU\ncaption). To clarify the differences among these three prompts,\nwe provide basic examples of each in Figure 6.\nAs shown in the figure, the differences between AU captions\nand emotion captions are minimal. The key distinction lies\nin the initial role setting (emotion expert or AU expert),\nwhich ensures GPT focuses on the required domain knowledge\nwhile mitigating the influence of unrelated factors. Another\ndifference is the Output Format, which controls the content of\nthe required response and indirectly guides GPT\u2019s reasoning\nprocess. In contrast, the key AU caption differs significantly\nfrom the other two. It emphasizes the interaction between AUs\nand emotions and incorporates more detailed prompt settings\nto achieve this.\nYou are currently acting as an emotion description expert and your ability is to recognize a person\u2019s expression, and \ntheir possible elicited hidden emotions based on an image of their face, please answer the questions according to \nthis example:\nQuestion: \"What is the emotion of this face?\"\nAnswer: \"The person in the image appears to have a serious or intense expression. The eyebrows are slightly \nfurrowed, and the mouth is closed in a neutral position, which could suggest concentration, concern, or deep thought. 
\nThere is no clear smile or frown, so it\u2019s not a definitive indication of happiness or sadness, but rather a more \ncontemplative or focused demeanor.\u201c\nQuestion : What is the emotion of this face?.\n\nAnswer:\nInitial Setup\nOutput Format\nOutput Signal\nOne shot emotion prompt with ChatGPT-4o\nYou are currently acting as a best AU description expert and your ability is to recognize a person\u2019s AU(action units) \nbased on an image of their face and accurately identify the meaning it expresses, please answer the questions \naccording to this example:\nQuestion: \"What is the action units of this face?\"\nAnswer: \"The image shows a person with a facial expression that can be broken down into several Facial Action \nCoding System (FACS) Action Units (AUs). Here are the AUs that appear to be present: 1. AU4 (Brow Lowerer): \nThere is a slight downward pull of the brows, which could indicate a frown or a concentration. 2. AU7 (Lid Tightener): \nThe eyelids appear to be tightened, which can be associated with a squint or a focused gaze. 3. AU24 (Lip Pressor): \nThe lips appear to be pressed together, which can be a sign of tension or determination. The overall expression \nseems to convey a sense of seriousness or intensity, but without additional context, it\u2019s difficult to determine the \nexact emotional state or intent behind the expression\u201c\nQuestion : What is the action units of this face?.\n \nAnswer:\nInitial Setup\nOutput Format\nOutput Signal\nOne shot au prompt with ChatGPT-4o\nYou are currently acting as an emotion and AU description expert and your ability is to recognize a person's \nexpression and their possible elicited hidden emotions based on an 's label and an image of \ntheir face. At the same time, You also can recognize a person's AU and accurately identify the meaning it \nexpresses based on an image of their face. Once organized, you can select from the the most \ninfluential key AUs that can determine the sentiment category (in most cases, not all activated au's are the ones \nthat ultimately make a difference, don't directly divide the entire au active list into key au. As an au expert, you \nhave to choose the most influential AU as the key AU.).\nQuestion: \"What is the action units of this face?\"\nAnswer: \"The image shows a person with a facial expression that can be broken down into several Facial Action \nCoding System (FACS) Action Units (AUs). Here are the AUs that appear to be present: 1. AU4 (Brow Lowerer): \nThere is a slight downward pull of the brows, which could indicate a frown or a concentration. 2. AU7 (Lid \nTightener): The eyelids appear to be tightened, which can be associated with a squint or a focused gaze. 3. AU24 \n(Lip Pressor): The lips appear to be pressed together, which can be a sign of tension or determination. The overall \nexpression seems to convey a sense of seriousness or intensity, but without additional context, it\u2019s difficult to \ndetermine the exact emotional state or intent behind the expression\u201c\nQuestion : What is the action units of this face?.\n \nAnswer:\nInitial Setup\nOutput Format\nOutput Signal\nOne shot emotion with key au prompt with ChatGPT-4o\nFig. 6: Three Types of Prompt Design Details (AU prompt, emotion prompt and key AU prompt). 
When you finally type the\nprompt we will personalize the output format according to the format you want to get, this format is not fixed, it just depends\non what information you need to get.\nIn summary, this design ensures GPT-generated captions are\naccurate, contextually rich, and tailored to support downstream\ntasks.\nC. Example of Caption\nFrom the two sets of examples in Figure 7 we can clearly\nsee the characteristics of the three different captions. Each type\nof caption serves a distinct purpose:\n\u2022 Emotion Caption: Provides an overview of the emotional\nstate expressed by the face, utilizing the full spectrum of\nemotions present in the dataset.\n\u2022 AU Captiont: Describes the specific facial action units,\nbreaking down the muscle movements involved in the\nexpression.\n\u2022 Key AU Caption: Highlights the most influential action\nunits that determine the emotional state, based on the\nground truth emotion and AU labels. This novel caption\ntype helps identify the critical facial movements respon-\nsible for conveying specific emotions.\nCompared to conventional single captions manually gen-\nerated solely based on Ground Truth Labels, our approach\nGT: happiness\nThe person in the image appears to be experiencing happiness. The expression is \ncharacterized by a genuine smile, with the corners of the mouth turned upwards. The eyes \nare also slightly squinted, which is often associated with a true, joyful smile. The overall \nfacial expression suggests a positive, pleasant emotion.\nGT: AU6, 7, 10, 12, 25\n1.AU6 (Cheek Raiser):The cheeks are raised, which is commonly associated with a smile.\n2.AU7 (Lid Tightener): The lower eyelids appear to be tightened, which is often seen in genuine \nsmiles or expressions of happiness.\n3.AU10 (Upper Lip Raiser):The upper lip is slightly raised, which can occur during smiling or \nlaughing.\n4.AU12 (Lip Corner Puller):The corners of the mouth are pulled upwards, indicating a smile.\n5.AU25 (Lips Part):The lips are slightly parted, which can happen during speaking or smiling.\nThe person in the image appears to be happy. This is indicated by the following action units \n(AUs) from the active list:AU6 (Cheek Raiser), AU12 (Lip Corner Puller)\nThese AUs are responsible for the raised cheeks and the pulled corners of the lips, which are \nclassic indicators of a genuine smile, reflecting happiness.\nAU Caption:\nEmotion Caption:\nKey AU Caption:\nThe person in the image appears to be displaying an expression of surprise. The wide-open eyes \nand slightly parted lips suggest a reaction to something unexpected or astonishing. The raised \neyebrows also contribute to the look of surprise, indicating that the person is reacting to \nsomething unforeseen or shocking.\n1.AU1 (Inner Brow Raiser): The inner parts of the eyebrows are raised, which can indicate \nsurprise or fear.\n2.AU2 (Outer Brow Raiser):The outer parts of the eyebrows are also raised, further \ncontributing to an expression of surprise or shock.\n3.AU25 (Lips Part):The lips are parted, which often accompanies expressions of surprise, \nshock, or amazement.\n4.AU26 (Jaw Drop):The jaw is dropped, which is another indicator of surprise or amazement.\nGT: Surprise\nGT: AU1, AU2, AU25, AU26\nThe person in the image appears to be experiencing surprise. The wide-open eyes and the \nopen mouth are indicative of this emotion. 
The action units (AUs) that played a decisive role \nin this emotional judgment include AU1 (Inner Brow Raiser), AU2 (Outer Brow Raiser), AU25 \n(Lips Part), and AU26 (Jaw Drop). These AUs collectively contribute to the expression of \nsurprise by raising the eyebrows and opening the mouth.\nAU Caption:\nEmotion Caption:\nKey AU Caption:\nFig. 7: Three Types of Caption Example (AU caption, emotion caption and key AU caption).All captions are entered into the\nGPT using a combination of the designed prompt and the corresponding ground true, and it is worth noting that the prompt\nis accompanied by the corresponding image, which allows the GPT to generate a personalized caption for the image.\nuses refined prompts, images, and Ground Truth Labels as\ninputs to generate captions through ChatGPT-4o [16]. This\nmethod produces more descriptive captions that incorporate\nthe intrinsic information of the images, resulting in captions\nthat are more accurate, unique, and diverse.\nTo this end, we designed a key AU prompt that provides\na unique approach to generating captions with large language\nmodels. When generating key AU captions, only the image,\nprompt, and the Ground Truth Labels for AU and Emotion\nare provided, without including any information about the key\nAU itself. The carefully crafted prompt ensures that ChatGPT-\n4o [16] fully analyzes the image, going beyond simplistic\ndescriptions of individual AUs and emotions.\nD. Transcition Experiments\nFinally, we pre-train AU on the MF2 model and then\nfine-tune emotion on the MF2 (Fine-tuning) model after pre-\ntraining, and we name this final model MF2 (Intern-VL).We\ntested the performance of this model on emotion and against\nthe baseline model, and according to the results in table V, we\ncan find that our model is 1.16% better than Exp-BLIP [14].\nTABLE V. Transition for AU Pre-train to Emotion\nFine-tuning\nModel\nAvg\nExp-BLIP [14]\n78.91\nMF2 (Intern-VL)\n80.07\nE. Experimental Parameters\nAll experiments were conducted on an RTX A6000 GPU.\nAdditional detais on the hyperparameter settings are providd\nin Table VI\nTABLE VI: Experimental Parameters\nCommon Parameters\nParameter\nValue\nTraining epoch\n30\nOptimizer\nAdamW [?]\nWeight decay\n0.05\nLinear warm-up\n2000 steps\nLearning rate\n1 \u00d7 10\u22124\nImage size\n224*224\nBatch size\n56\nMultilevel Multimodal Face Foundation Model (MF 2)\nParameter\nAU-VL\nEmo-VL\nTemp\n0.07 * torch.ones([])\n0.07 * torch.ones([])\nCaption max length\n169\n61\nDecoupled Fine-tuning Network (DFN)\nParameter\nImage\nText\nCLS tokens\nLast layer of ViT\nLast layer of Bert\nNumber of Adapter Layers\n7\n7\nActivation Function\nReLU/Sigmoid\nReLU/Sigmoid\nInput Dimension\n768\n768\nOutput Dimension\n768\n768\nGate Scaling Factor\n0.1\n0.1\nF. Baseline Model Details\nWe compared the results of multiple baseline models on the\nMFA dataset at both the AU and Emotion levels. Below is a\ndetailed introduction to the baseline models.\n\u2022 Exp-BLIP [14] employs a multimodal transformer ar-\nchitecture based on BLIP-2 to integrate image and text\nmodalities. It processes AU and Emotion representations\nindependently, limiting its ability to fully capture their\ninterplay.\n\u2022 ME-GraphAU [36] utilizes graph neural networks to\nmodel relationships between facial regions for AU recog-\nnition, effectively enhancing the detection of Action Units\nthrough structured interconnections.\n\u2022 VL-FAU [42] incorporates visual-linguistic representa-\ntions to improve AU recognition tasks. 
It focuses on\naligning visual features with linguistic cues to achieve\nstate-of-the-art performance.\n\u2022 HSEmotion [41] focuses on emotion recognition by clas-\nsifying emotional states. It achieves competitive results\nbut does not explicitly address the integration of AUs for\ncomprehensive facial analysis.\nG. Differences in Inputs Between Training and Validation\nDuring training, the model employs both images and textual\ndescriptions (e.g., emotion and AU captions) to cultivate a\nricher visual-semantic understanding. Objectives like image-\ntext matching and image-text contrast reinforce multimodal\nalignment, enabling the model to capture subtle facial expres-\nsions and nuanced features more effectively.\nIn contrast, the validation phase uses images alone for three\nprimary reasons:\n\u2022 Realistic Deployment Scenarios Textual information\nmay be unavailable in practice. Restricting validation\nto images ensures performance metrics reflect actual\napplication conditions.\n\u2022 Generalization and Robustness Evaluating the model\nwithout text verifies that it can perform effectively un-\nder conditions not explicitly supported during training,\nconfirming its adaptability.\n\u2022 Fair and Independent Assessment Excluding previously\nseen textual descriptions prevents artificially inflated per-\nformance, resulting in a more authentic gauge of the\nmodel\u2019s true capabilities." - }, - { - "domain": "Computer Science", - "chunk_type": "general", - "text": "arXiv:2504.10357v1 [eess.SP] 14 Apr 2025\nThe Communication and Computation Trade-off in\nWireless Semantic Communications\nXuyang Chen, Student Member, IEEE, Chong Huang, Member, IEEE, Gaojie Chen, Senior Member, IEEE,\nDaquan Feng, Member, IEEE and Pei Xiao, Senior Member, IEEE\nAbstract\u2014Semantic communications have emerged as a crucial\nresearch direction for future wireless communication networks.\nHowever, as wireless systems become increasingly complex, the\ndemands for computation and communication resources in se-\nmantic communications continue to grow rapidly. This paper in-\nvestigates the trade-off between computation and communication\nin wireless semantic communications, taking into consideration\ntransmission task delay and performance constraints within the\nsemantic communication framework. We propose a novel trade-\noff metric to analyze the balance between computation and\ncommunication in semantic transmissions and employ the deep\nreinforcement learning (DRL) algorithm to minimize this metric,\nthereby reducing the cost associated with balancing computation\nand communication. Through simulations, we analyze the trade-\noff between computation and communication and demonstrate the\neffectiveness of optimizing this trade-off metric.\nIndex Terms\u2014Semantic communications, communication and\ncomputation trade-off, deep learning, resource allocation.\nI. INTRODUCTION\nW\nITH the rapid development of wireless communica-\ntions, future sixth-generation (6G) networks need to\naccommodate the demands of massive user access for the\nInternet of Things (IoT) and virtual worlds, thereby imposing\nexceptionally high requirements on wireless transmission rates.\nTo achieve this goal, semantic communications have emerged\nas one of the key development directions for future wireless\nnetworks due to its high spectral ef\ufb01ciency [1]. 
In semantic\ncommunications, information is extracted at the transmitter and\nthe extracted semantic information is transmitted to the receiver,\nthen the received semantic information is used to reconstruct the\noriginal information at the receiver, signi\ufb01cantly reducing the\nrequired communication bandwidth. Currently, many research\nThe work presented in this article was supported by Fundamental Re-\nsearch Funds for the Central Universities, Sun Yat-sen University, under Grant\nNo.24hytd010, and in part by the U.K. Engineering and Physical Sciences\nResearch Council under Grant EP/X013162/1.\nX. Chen is with the 5GIC & 6GIC, Institute for Communication Systems,\nUniversity of Surrey, GU2 7XH Guildford, U.K, and also with the College\nof Electronics and Information Engineering, Shenzhen University, Shenzhen\n518060, China. (email: chenxuyang2021@email.szu.edu.cn).\nC. Huang and P. Xiao are with 5GIC & 6GIC, Institute for Commu-\nnication Systems (ICS), University of Surrey, Guildford, GU2 7XH, United\nKingdom, Email: {chong.huang, p.xiao}@surrey.ac.uk.\nG. Chen is with School of Flexible Electronics (SoFE) & State Key Lab-\noratory of Optoelectronic Materials and Technologies, Sun Yat-sen University,\nGuangdong, 510220, China. Email: gaojie.chen@ieee.org.\nD. Feng is with the College of Electronics and Information Engineering,\nShenzhen University, Shenzhen 518060, China. (email: fdquan@szu.edu.cn).\nareas have considered the impact of semantic communications\nin wireless networks, including text transmissions [2], voice\ntransmissions [3], image transmissions [4], non-orthogonal\nmultiple access (NOMA) systems [5] and edge computing [6].\nSemantic communications can signi\ufb01cantly lower the rate\nrequirements for wireless transmission tasks. However, the\ntrade-offs among computation, communication, and reconstruc-\ntion quality for semantic transmissions are critical in wireless\nnetworks [7]. A transformer-based reconstruction method for\ntext transmissions was proposed to consider the channel adapt-\nability for semantic communications in [8]. To enhance the\nreconstruction quality in image transmissions, the multi-scale\nsemantic embedding spaces was studied in [9] for semantic\ncommunications. In [10], the trade-off between communication\nrate and reconstruction quality was investigated for semantic\ncommunications via deep reinforcement learning (DRL) algo-\nrithms. The balance between semantic information and trans-\nmission security was studied in [11]. However, while the bal-\nance between compressing the original information and meeting\ntransmission demands in semantic communication has garnered\nconsiderable attention, the trade-off between computational\ndemands and communication capabilities in wireless semantic\ncommunications remains an unresolved challenge. Moreover,\nthe reconstruction performance at the receiver also in\ufb02uences\nthe trade-off between computation and communication.\nIn this letter, we investigate the trade-off between communi-\ncation and computation with reconstruction quality constraint\nin wireless semantic communications. 
Our main contributions\nare summarized as follows:\n\u2022 We de\ufb01ne a novel metric, semantic communication cost\nmetric (SCCM), for the trade-off between computation\nand communication in wireless semantic communications,\nconstrained by the reconstruction quality.\n\u2022 We establish a depth-adaptive semantic communication\nframework to accommodate the dynamically changing\ncomputational and transmission demands of wireless net-\nworks. Moreover, to verify the impact of the proposed\ntrade-off metric SCCM in wireless communications, we\nutilize the DRL algorithm to optimize the resource allo-\ncation in the proposed wireless semantic network where\nchannels vary dynamically.\n\u2022 Simulation results analyze the trade-off metric SCCM in\nwireless semantic communications, and demonstrate the\neffectiveness of optimizing wireless resource to reduce the\n2\ncost of semantic communications.\nThe rest of the paper is organized as follows: Section II\nintroduces the system model and the problem formulation. The\noptimization algorithm is presented in Section III. In Section\nIV, simulation results are provided to evaluate the trade-off\nmetric. Finally, Section V concludes the paper.\nII. SYSTEM MODEL AND PROBLEM FORMULATION\nA. System Model\nIn this paper, we consider a wireless network consisting of\nM base stations (BSs) Sm (m \u2208M = {1, 2, ..., M}) and N\nusers Un (n \u2208N = {1, 2, ..., N}), each BS or user is equipped\nwith a single antenna. We assume zm,n (zm,n \u2208Z) is a Binary\nindicator for the user association between Sm and Un, zm,n = 1\ndenotes the BS Sm serves the user Un, otherwise there is no\ntransmission links between Sm and Un. The channel coef\ufb01cient\nhm,n between the BS Sm and the user Un follows Rayleigh\nfading which remain unchanged during one time slot and vary\nindependently from one time slot to another. Thus, the channel\ncoef\ufb01cient hm,n can be expressed as hm,n = gm,nd\u03b1/2\nmn , where\ngm,n follows Gaussian distribution with zero-mean and unit-\nvariance, dm,n denotes the distance between the BS Sm and\nthe user Un, \u03b1 denotes the path loss exponent of Rayleigh\nfading channels. Therefore, the received signal at Un is\nyn =\n\u221a\nPhm,nxm +\nM\nX\nv=1,v\u0338=m\n\u221a\nPhv,nxv + nn,\n(1)\nwhere P denotes the transmit power for BSs, xm is the\nsignal transmitted from the BS Sm, xv denotes the interference\nsignal from the BS Sv, nn is the additive-white-Gaussian-noise\n(AWGN) with variance \u03c32\nn at Rk. Thus, the transmission rate\nbetween Sm and Un is given by\nCm,n = log2\n\u00001 +\nP|hm,n|2\nPM\nv=1,v\u0338=m P|hv,n|2 + \u03c32n\n\u0001\n.\n(2)\nIn the proposed wireless semantic network, to effectively\nillustrate the balance between computation and transmission,\nwe simply assume that each base station serves at most one\nuser simultaneously. For the user Un, there are Kn transmission\ntask requests with the arrival modeled based on the FTP model\n3, and the transmission delay for each task must not exceed\nLmax time slots. In this study, we assume that all tasks involve\nthe transmission of images. Each BS is equipped with a vision\ntransformer (ViT) [12] function to extract semantic information\nfrom the original \ufb01le and subsequently sends the extracted\nsemantic information to the corresponding user. 
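Before turning to the receiver side, the channel and rate model in (1)-(2) can be made concrete with a short numerical sketch. The snippet below is ours rather than the authors' code: the distances, transmit power, noise variance, and the inverse power-law path loss d^(-alpha/2) are illustrative assumptions used only to show how C_{m,n} is evaluated for a given association z.

import numpy as np

# Illustrative sketch (not the authors' code): per-link SINR and rate as in (2)
# for M BSs and N users under a binary association matrix z.
rng = np.random.default_rng(0)
M, N, alpha, P, noise_var = 3, 3, 3.0, 1.0, 0.1

d = rng.uniform(5.0, 50.0, size=(M, N))          # BS-user distances (assumed, in meters)
g = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
h = g * d ** (-alpha / 2)                        # Rayleigh fading; inverse power-law
                                                 # path loss assumed in this sketch

def rate(m, n):
    """Transmission rate C_{m,n} of (2) for the link from BS m to user n."""
    signal = P * np.abs(h[m, n]) ** 2
    interference = sum(P * np.abs(h[v, n]) ** 2 for v in range(M) if v != m)
    return np.log2(1.0 + signal / (interference + noise_var))

z = np.eye(M, N)                                 # example association: BS m serves user m
rates = [rate(m, n) for m in range(M) for n in range(N) if z[m, n] == 1]
print(rates)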
Each user is\nequipped with a ViT, possessing the necessary knowledge, to\nreconstruct the corresponding image.\nSince semantic information is signi\ufb01cantly smaller than\nthe original image, it can greatly reduce the required trans-\nmission bandwidth. However, due to the dynamic nature of\nthe communication environment, such as the Rayleigh fading\nconsidered in this work, wireless communication links often\noperate at lower rates than anticipated. This results in increased\ntransmission delays for communication tasks, adversely affect-\ning user experience. Consequently, some existing studies on\nsemantic communication have explored adjusting the semantic\ninformation compression ratio by sacri\ufb01cing a certain level of\nreconstruction quality to adapt to dynamic channels. However,\nincreasing computational resource consumption can adjust the\nsemantic information compression ratio while ensuring recon-\nstruction quality, thereby reducing the required communication\nrates and maintaining the transmission quality of semantic\ncommunication. Therefore, the trade-off between computation\nand communication is the primary focus of this work.\nB. Semantic Model\nViT has a powerful capability to capture the complex visual\nsemantics from images and has become a promising tool in\nthe \ufb01eld of image generation. In our semantic communication\nnetworks, the ViT-based encoder extracts the semantic informa-\ntion from the original image. The semantic information is then\ntransmitted to the receiver through wireless communication\nchannels. Upon reception, the ViT-based decoder can read\nthe tokens in semantic information to reconstruct the original\nimage.\nIn this work, we adjust the depth of the ViT-based encoder\nand decoder in wireless semantic communications to realize\nthe trade-off between communication and computation. The\nencoder/decoder depth denotes the number of layers in the ViT\narchitecture. By adjusting the depth, we can directly change the\ncomputational resources required for processing the semantic\ninformation. A deeper encoder can capture more features from\nthe original image, thereby achieving better semantic represen-\ntations, but it is at the cost of more computational resource. On\nthe other hand, a deeper decoder can reconstruct the original\nimage more accurately based on the semantic information, but\nthis also requires more computational resource. In addition, the\nnumber of output symbols generated by the encoder affects the\nsize of the semantic information transmitted to the receiver.\nTherefore, increasing the depth consumes more computational\nresource, but this additional computation allows for better\nextraction of features from the original image, supporting a\nsmaller number of encoder output symbols without degrading\nthe reconstruction quality. In this work, the channel encoder-\ndecoder is implemented using fully-connected neural networks,\nwhich facilitate adaptive adjustment of semantic transmission\nrates through dynamic modulation of output feature dimensions.\nThus, we can control computational consumption to adapt to\ndifferent signal-to-noise ratio (SNR) levels in various communi-\ncation conditions without compromising the quality of semantic\ncommunications.\nAs shown in Fig. 1, the original image is input into a\nViT-based encoder in semantic communications. 
The encoder\nconsists of multiple transformer layers that process the im-\nage and output a set of semantic information representing\nthe features of the original image, I denotes the number of\ntotal encoder/decoder depth in the semantic framework. This\nsemantic information is then transmitted to the receiver through\n3\nFig. 1. The structure of the proposed depth-adaptive semantic framework.\na wireless communication channel. The received semantic\ninformation is input into a ViT-based decoder to reconstruct the\nimage. Both the encoder and decoder are composed of multiple\nblocks, and the required number of blocks is adjusted based\non computational and communication demands. To be speci\ufb01c,\nmore blocks require more computational resource but also\nenable better extraction of semantic features, thereby reducing\nthe size of the semantic information that needs to be transmitted\nwith minimal loss in performance.\nIn summary, we train separate semantic communication\nmodels for encoder-decoder architectures of different depths,\nand conduct extensive experiments to identify the most suit-\nable channel codec for each con\ufb01guration. When selecting an\nencoder of a particular depth, we switch to the corresponding\nmodel rather than extracting intermediate features from a single\npre-trained deep model.\nC. Problem Formulation\nTo thoroughly evaluate the performance of semantic commu-\nnication systems, it is necessary to introduce a new evaluation\nmetric. With the advancement of deep learning technologies,\nsemantic communication systems require substantial compu-\ntational resources to perform complex semantic analysis and\ngeneration. At the same time, to ensure real-time responsiveness\nand ef\ufb01ciency, the system must ef\ufb01ciently transmit data under\nlimited bandwidth conditions. Traditional performance metrics\nfor semantic communications, such as peak signal-to-noise ratio\n(PSNR), latency, and bit error rate (BER), are not suf\ufb01cient to\nfully describe the trade-off between computation and commu-\nnication in resource constrained environments. Thus, we design\na new metric based on the trade-off between computation\nand communication to comprehensively assess the performance\nin semantic information processing and data transmission in\nwireless networks. The proposed trade-off metric is called\nSCCM, and can be expressed as\n\u03c8 = \u031fcF + \u031ftD,\n(3)\nwhere F represents the computational consumption measured\nin giga \ufb02oating point operations (GFLOPs), D denotes the\nsize of the semantic information transmitted to the receiver,\nmeasured in bits. To facilitate the expression of both F and\nD, we normalize them to a range between 0 and 1 according\nto their respective scales. \u031fc and \u031ft are the weights for\ncomputation and communication in the trade-off, respectively,\nand \u031fc + \u031ft = 1. By quantifying the trade-off between\ncomputation and communication, SCCM can facilitates the\noptimization of the resource allocation strategy to enhance user\nservice experience in wireless communications. For example,\nin scenarios where computational resource are limited but\nspectrum resource is abundant, the strategy prefers to transmit\nmore semantic information to the receiver. 
On the other hand,\nin scenarios where spectrum resource is constrained but compu-\ntational resource is not, increasing the computational load can\nreduce the transmitted semantic information size while ensuring\nthe quality of the reconstructed image at the receiver.\nTo reduce the total resource consumption in wireless seman-\ntic communications, we consider optimizing user association\nand selecting varying depths of encoders/decoders to minimize\nthe sum SCCM \u03c8 in (3). Moreover, we consider the delay\nand performance constraints in semantic communications. The\noptimization problem is presented as\nmin\nZ,I\nN\nX\nn=1\nKn\nX\nk=1\n\u03c8k,\n(4)\ns.t.\nM\nX\nm=1\nzm,n \u22641, \u2200n \u2208N,\n(4a)\nN\nX\nn=1\nzm,n \u22641, \u2200m \u2208M,\n(4b)\nLk \u2264Lmax,\n(4c)\nPSNRk \u2265PSNRmin,\n(4d)\nZ = {Z(1), Z(2), ..., Z(T )},\n(4e)\nI = {I(1), I(2), ..., I(T )},\n(4f)\nwhere I denote that at time slot t the depths of en-\ncoders/decoders in the semantic communications framework\nas in Fig. 1, Z(t) denotes the Binary indicator for the user\nassociation at time slot t, t \u2208{1, 2, ..., T }, T denotes the total\ntime slot for completing all tasks. Lk = tk,f \u2212tk,s denotes the\ndelay of the k-th task, where tk,f denotes the time slot that\nk-th transmission task is \ufb01nished, tk,s denotes the time slot\nthat k-th task arrives at BSs. (4a) and (4b) shows that one BS\ncan only serves one user, and one user can only access to one\n4\nBS. (4c) and (4d) presents the delay and reconstruction quality\nconstraints for all tasks, respectively, where PSNRmin is the\nreconstruction quality threshold. (4e) and (4f) indicates the\nuser association and depths of encoders/decoders, respectively.\nWe aim to minimize the sum cost of semantic communication\ntasks, where each task has delay and reconstruction quality\nconstraints as in (4c) and (4d). Considering the time-varying\nRayleigh fading wireless environment, task delay constraints,\nand the arrival of tasks at non-\ufb01xed time slots, this is a long-\nterm optimization issue. To address this problem, we employ a\nDRL algorithm to adaptively learn from the dynamic wireless\nenvironment and accommodate different trade-off weights. This\napproach provides an adaptable trade-off analysis scheme for\nsemantic communications across various wireless conditions.\nIII. DRL-BASED OPTIMIZATION ALGORITHM\nDRL is well-suited for solving long-term optimization prob-\nlems and can adapt to dynamically changing wireless communi-\ncation environments after training in various settings. To utilize\nDRL, we \ufb01rst formulate the problem as a Markov Decision\nProcess (MDP). In the MDP, the state is used to describe\nthe variables of the wireless communication environment as\ns(t) = {H(t), K(t), L(t), F(t), D(t)}, where H(t) is the set\nof all channel coef\ufb01cients between BSs and users at time slot\nt, follows block fading based on (2). K(t) denotes the set of the\ntransmission status of arrived tasks at time slot t, it follows FTP\nmodel 3 [13] to generate new arrivals, L(t) denotes the set of\ndelay of arrived tasks, F(t) and D(t) denotes the computational\ncost and communication cost at time slot t, respectively. Then,\nwe de\ufb01ne the action at time slot t as a(t) = {Z(t), I(t)}. Action\nis used to control the optimization variables in the problem\nformulation (4). The reward function is de\ufb01ned as\nr(t) = \u2212\nN\nX\nn=1\nKn\nX\nk=1\n(\u03c8k \u2212[Lk \u2212Lmax]+ \u2212[PSNRmin \u2212PSNR]+),\n(5)\nwhere [x]+ = max(0, x). 
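To make these quantities concrete, the sketch below (ours, not the authors' implementation) evaluates the SCCM in (3) and the per-task term entering (5); the GFLOP count, payload size, normalization scales, and weights are made-up values, and constraint violations are charged against the reward, which is the stated intent of (5).

# Illustrative sketch (not the authors' code) of the SCCM metric (3) and the
# per-task reward term in (5); all constants below are assumptions.
F_MAX, D_MAX = 50.0, 4.0e5        # assumed normalization scales for GFLOPs and bits

def sccm(gflops, bits, rho_c):
    """psi = rho_c * F + rho_t * D with F, D normalized to [0, 1] and rho_t = 1 - rho_c."""
    rho_t = 1.0 - rho_c
    return rho_c * (gflops / F_MAX) + rho_t * (bits / D_MAX)

def hinge(x):
    return max(0.0, x)            # [x]_+ in (5)

def task_reward(gflops, bits, delay, psnr, rho_c, l_max=3, psnr_min=30.0):
    # Violations of (4c) and (4d) reduce the reward, so the DRL agent is
    # driven to respect the delay and PSNR constraints while minimizing psi.
    penalty = hinge(delay - l_max) + hinge(psnr_min - psnr)
    return -(sccm(gflops, bits, rho_c) + penalty)

print(task_reward(gflops=12.0, bits=8.0e4, delay=2, psnr=31.5, rho_c=0.5))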
The negative cost is utilized to\nguide the DRL agent in minimizing the cost and guarantee\nthe constraints in (4), thereby achieving the trade-off between\ncomputation and communications. In this work, we introduce\nsoft actor-critic (SAC) algorithm to solve the proposed issue.\nSince DRL algorithms such as SAC have been extensively\napplied to wireless communications, we will provide a brief\noverview of the principle of SAC [14]. We can minimize the\nsoft Bellman residual to update the soft-Q value in SAC\nWQ\u03c9 = E\u0000s(t),a(t)\n\u0001\n\u223cR\n\u00141\n2\n\u0010\nQ\u03c9\n\u0000s(t), a(t)\n\u0001\n\u2212\u00afQ\n\u0000s(t), a(t)\n\u0001\u00112\u0015\n,\n(6)\nwhere Q\u00b7 denotes the value in SAC, E[\u00b7] is the expectation,\nR denotes the training experience distribution, \u00afQ(s(t), a(t)) =\nr(s(t), a(t))+\u03b3Es(t+1)\u223c\u03c1[Q\u00af\u03ba(s(t+1))], \u03b3 presents the discount\nfactor, \u03c1 is the transition probability. Thus, we could obtain the\ngradients as\n\u00af\u03c4\u03baWQ\u03c9 = \u03c4\u03baQ\u03c9\n\u0000s(t), a(t)\n\u0001\u0010\nQ\u03c9\n\u0000s(t), a(t)\n\u0001\n\u2212r\n\u0000s(t), a(t)\n\u0001\n\u2212\u03b3Q\u00af\u03ba\n\u0000s(t + 1)\n\u0001\u0011\n.\n(7)\n\u0005\n\u0006\n\u0007\n\b\n\t\n\u0012\u0017\u001c\u0013\u001d#\u0019\u0015\u0000\u000e\u0017#%\u001e!\u001a\u0000\n\u0017\u001f#\u0018\n\u0003\n\u0004\u0003\n\u0005\u0003\n\u0006\u0003\n\u0007\u0003\n\b\u0003\n\u0012\u0017\u001c\u0013\u001d#\u0019\u0015\u0000\u0012&\u001c\u0014\u001e\u001b\n\u0011\u0017 $\u0019!\u0017\u0016\u0000\u000b\f\u000f\u0010\"\u0000\u0001\u0004\u0003\r\u0002\nFig. 2. The comparison between computation and communication in\nthe proposed semantic network.\nOn the other hand, we rebuild the policy network with Gaussian\nnoise and update the policy network \u03c5 as\nWQ\u03c5 = Es(t)\u223cD,\u03c2(t)\u223c\u03c6\n\u0014\nlogQ\u03c5(f\u03c5\n\u0000\u03c2(t); s(t))|s(t)\n\u0001\n\u2212Q\u03c9\n\u0000s(t), f\u03c5(\u03c2(t); s(t))\n\u0001\u0015\n.\n(8)\nTherefore, we can obtain the gradient of the policy as\n\u00af\u03c4\u03baWQ\u03c5 = \u03c4\u03c5logQ\u03c5\n\u0000a(t)|s(t)\n\u0001\n+\n\u0012\n\u03c4a(t)logQ\u03c5\n\u0000a(t)|s(t)\n\u0001\n\u2212\u03c4a(t)Q\n\u0000s(t), a(t)\n\u0001\u0013\n\u03c4\u03c5f\u03c5(\u03c2(t); s(t)).\n(9)\nOne of the advantages of the proposed method is its adapt-\nability to dynamic environments, which is critical for real-time\nwireless communication systems, this method allows for better\nutilization of resources, especially in computing and bandwidth\nresource constrained scenarios. Another important advantage is\nthe \ufb02exibility of the reward structure in DRL, which enables\nthe system to incorporate diverse objectives, such as minimizing\nenergy consumption, maximizing task throughput, or maintain-\ning a balance between communication and computation costs.\nIV. NUMERICAL RESULTS\nIn the simulation, we utilized Nvidia A100 and PyTorch\nto train the proposed semantic framework. Unless otherwise\nstated, the simulation parameters are set as follows: the transmit\npower gain P/\u03c32\nn = 5 dB, the path loss exponents \u03b1 = 3.\nThe number of BSs M = 3, the number of users N = 3,\nthe locations of BSs and users are randomly distributed in\na square of 50 m \u00d7 50 m, the bandwidth B = 10 MHz,\nthe size of original image for all tasks is 50 KB, the depth\nof ViT are set as {1, 2, 3, 4, 5, 6}, the delay and performance\nthresholds Lmax = 3, PSNRmin = 30, the number of semantic\ntransmission tasks at each user is 100.\nFig. 
2 presents a comparison between computation and\ncommunication in the proposed network. The semantic symbols\n5\n0.1\n0.2\n0.3\n0.4\n0.5\n0.6\n0.7\n0.8\n0.9\nc\n50\n100\n150\n200\n250\n300\nFig. 3. Sum SCCM versus the weight of computation \u031fc.\ndenote the output vectors from the semantic encoder, FLOP is\nused to measure the required computational resource. As shown\nin this \ufb01gure, the required computation resource of a task rises\nwith an increase in the depth of the semantic network. This is\ndue to the use of deeper networks for compressing and parsing\nsemantic information. Although deeper networks require more\ncomputational resource to operate, they are able to capture\nand form high-quality semantic representations, allowing for a\nreduction in the size of semantic information while maintaining\nperformance in the dynamic wireless communication networks.\nAs illustrated in Fig. 3, we compare the sum SCCM to show\nthe impact and trade-off between computation and communi-\ncation in wireless semantic networks. This \ufb01gure illustrates\nthat when wireless transmission resources are abundant, it is\npossible to maximize the utilization of transmission resources\nto reduce reliance on computation with a power gain of 10 dB.\nTherefore, a smaller computation weight leads to a lower sum\nSCCM. Conversely, in scenarios where spectrum resources are\nscarce, such as a power gain of 2 dB, computational resources\nare utilized to compress the semantic information to conserve\nspectrum resource. In such cases, a larger weight should be\nassigned to computation. The reason is, when the optimization\nselecting deeper encoder/decoder to reduce the size of semantic\ninformation transmitted, the computational cost increases and\nneed to minimizing its costs. Conversely, when a shallower\nencoder/decoder con\ufb01guration and larger transmitted semantic\ninformation is selected, the algorithm focuses on reducing\ncommunication costs. This trade-off is particularly clear in\nscenarios with constrained resources, such as limited compu-\ntational power or spectrum availability. Moreover, the \ufb01gure\npresents that the proposed SAC-based optimization achieves\nbetter result than that from DQN, this veri\ufb01es the advantage of\nusing SAC in such optimization problems.\nV. CONCLUSION\nThis paper analyzes the trade-off between computation and\ncommunication in wireless semantic communications. By ad-\njusting the depth of the encoder and decoder, we can reduce\nthe size of semantic information required for transmission by\nincreasing computational cost, while ensuring the performance\nof the reconstructed data. Conversely, we can increase the\nsize of the semantic information to lower the computational\nrequirements, while still maintaining the performance of the\nreconstructed data. Moreover, by joint optimizing user as-\nsociation in wireless communications and the depth of the\nencoder and decoder in semantic communications, we conduct\na comprehensive analysis of the trade-off between computation\nand communication. Simulation results demonstrate that in\ndynamically changing wireless communication environments,\nthe trade-off metric SCCM can play a key role in re\ufb02ecting\nthe resource cost of semantic communications. Future research\ncan further explore trade-offs between computation and com-\nmunication in multi-user and multi-task scenarios for advancing\nwireless semantic communication networks.\nREFERENCES\n[1] Z. Qin, F. Gao, B. Lin, X. Tao, G. Liu, and C. 
Pan, \u201cA generalized se-\nmantic communication system: From sources to channels,\u201d IEEE Wireless\nCommunications, vol. 30, no. 3, pp. 18\u201326, Jun. 2023.\n[2] L. Yan, Z. Qin, R. Zhang, Y. Li, and G. Y. Li, \u201cResource allocation for\ntext semantic communications,\u201d IEEE Wireless Communications Letters,\nvol. 11, no. 7, pp. 1394\u20131398, Jul. 2022.\n[3] Z. Weng and Z. Qin, \u201cSemantic communication systems for speech\ntransmission,\u201d IEEE Journal on Selected Areas in Communications,\nvol. 39, no. 8, pp. 2434\u20132444, Aug. 2021.\n[4] D. Huang, F. Gao, X. Tao, Q. Du, and J. Lu, \u201cToward semantic\ncommunications: Deep learning-based image semantic coding,\u201d IEEE\nJournal on Selected Areas in Communications, vol. 41, no. 1, pp. 55\u2013\n71, Jan. 2023.\n[5] X. Mu and Y. Liu, \u201cExploiting semantic communication for non-\northogonal multiple access,\u201d IEEE Journal on Selected Areas in Com-\nmunications, vol. 41, no. 8, pp. 2563\u20132576, Aug. 2023.\n[6] Z. Ji, Z. Qin, X. Tao, and Z. Han, \u201cResource optimization for semantic-\naware networks with task of\ufb02oading,\u201d IEEE Transactions on Wireless\nCommunications, vol. 23, no. 9, pp. 12284\u201312296, Sept. 2024.\n[7] S. Barbarossa, D. Comminiello, E. Grassucci, F. Pezone, S. Sardellitti, and\nP. Di Lorenzo, \u201cSemantic communications based on adaptive generative\nmodels and information bottleneck,\u201d IEEE Communications Magazine,\nvol. 61, no. 11, pp. 36\u201341, Nov. 2023.\n[8] H. Xie, Z. Qin, G. Y. Li, and B.-H. Juang, \u201cDeep learning enabled seman-\ntic communication systems,\u201d IEEE Transactions on Signal Processing,\nvol. 69, pp. 2663\u20132675, Apr. 2021.\n[9] Q. Fu, H. Xie, Z. Qin, G. Slabaugh, and X. Tao, \u201cVector quantized se-\nmantic communication system,\u201d IEEE Wireless Communications Letters,\nvol. 12, no. 6, pp. 982\u2013986, Jun. 2023.\n[10] B. Zhang, Z. Qin, and G. Y. Li, \u201cCompression ratio learning and semantic\ncommunications for video imaging,\u201d IEEE Journal of Selected Topics in\nSignal Processing, vol. 18, no. 3, pp. 312\u2013324, Apr. 2024.\n[11] A. Zhang, Y. Wang, and S. Guo, \u201cOn the utility-informativeness-security\ntrade-off in discrete task-oriented semantic communication,\u201d IEEE Com-\nmunications Letters, vol. 28, no. 6, pp. 1298\u20131302, 2024.\n[12] K. Han, Y. Wang, H. Chen, X. Chen, J. Guo, Z. Liu, Y. Tang, A. Xiao,\nC. Xu, Y. Xu, Z. Yang, Y. Zhang, and D. Tao, \u201cA survey on vision\ntransformer,\u201d IEEE Transactions on Pattern Analysis and Machine Intel-\nligence, vol. 45, no. 1, pp. 87\u2013110, Jan. 2023.\n[13] 3GPP, \u201cEvolved universal terrestrial radio access (E-UTRA). further\nadvancements for E-UTRA physical layer aspects,\u201d 3GPP TR 36.814.\n[14] T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine, \u201cSoft actor-critic: Off-\npolicy maximum entropy deep reinforcement learning with a stochastic\nactor,\u201d International Conference on Machine Learning (ICML), Stock-\nholm, Sweden, Jul. 2018." - }, - { - "domain": "Computer Science", - "chunk_type": "general", - "text": "DUE: A Deep Learning Framework and Library for Modeling\nUnknown Equations \u2217\nJunfeng Chen\u2020, Kailiang Wu\u2021, AND Dongbin Xiu\u00a7\nAbstract.\nEquations, particularly differential equations, are fundamental for understanding\nnatural phenomena and predicting complex dynamics across various scientific and engineering disci-\nplines. 
However, the governing equations for many complex systems remain unknown due to intri-\ncate underlying mechanisms. Recent advancements in machine learning and data science offer a new\nparadigm for modeling unknown equations from measurement or simulation data. This paradigm\nshift, known as data-driven discovery or modeling, stands at the forefront of artificial intelligence\nfor science (AI4Science), with significant progress made in recent years. In this paper, we introduce\na systematic framework for data-driven modeling of unknown equations using deep learning. This\nversatile framework is capable of learning unknown ordinary differential equations (ODEs), partial\ndifferential equations (PDEs), differential-algebraic equations (DAEs), integro-differential equations\n(IDEs), stochastic differential equations (SDEs), reduced or partially observed systems, and non-\nautonomous differential equations.\nBased on this framework, we have developed Deep Unknown\nEquations (DUE), an open-source software package designed to facilitate the data-driven modeling\nof unknown equations using modern deep learning techniques. DUE serves as an educational tool for\nclassroom instruction, enabling students and newcomers to gain hands-on experience with differential\nequations, data-driven modeling, and contemporary deep learning approaches such as fully connected\nneural networks (FNN), residual neural networks (ResNet), generalized ResNet (gResNet), operator\nsemigroup networks (OSG-Net), and Transformers from large language models (LLMs). Addition-\nally, DUE is a versatile and accessible toolkit for researchers across various scientific and engineering\nfields. It is applicable not only for learning unknown equations from data but also for surrogate mod-\neling of known, yet complex, equations that are costly to solve using traditional numerical methods.\nWe provide detailed descriptions of DUE and demonstrate its capabilities through diverse examples,\nwhich serve as templates that can be easily adapted for other applications. The source code for DUE\nis available at https://github.com/AI4Equations/due.\nKey words. education software, differential equations, deep learning, neural networks\nAMS subject classifications. 68T07, 65-01, 65-04, 37M99, 65M99, 65P99\n1. Introduction. Equations, especially differential equations, form the founda-\ntion of our understanding of many fundamental laws. They help human unlock the\nmysteries of microscopic particles, decipher the motion of celestial bodies, predict\nclimate changes, and explore the origins of the universe. Differential equations have\nwidespread applications across disciplines such as physics, chemistry, biology, and epi-\ndemiology. Traditionally, these equations were derived from first principles. However,\nfor many complex systems, the governing equations remain elusive due to intricate\nunderlying mechanisms.\nRecent advancements in machine learning and data science are revolutionizing\nhow we model dynamics governed by unknown equations. This paradigm shift, known\nas data-driven discovery or modeling, stands at the forefront of artificial intelligence\nfor science (AI4Science). In the past few years, significant progress has been made in\nlearning or discovering unknown equations from data. Techniques such as symbolic\n\u2217J. Chen and K. Wu were partially supported by NSFC grants (No. 92370108 and No. 12171227)\nand Shenzhen Science and Technology Program (No. 
RCJC20221008092757098).\n\u2020Department of Mathematics and Shenzhen International Center for Mathematics, Southern Uni-\nversity of Science and Technology, Shenzhen 518055, China (chenjf2@sustech.edu.cn).\n\u2021Corresponding author.\nDepartment of Mathematics and Shenzhen International Center for\nMathematics, Southern University of Science and Technology, Shenzhen, Guangdong 518055, China\n(wukl@sustech.edu.cn).\n\u00a7Department\nof\nMathematics,\nThe\nOhio\nState\nUniversity,\nColumbus,\nOH\n43210,\nUSA\n(xiu.16@osu.edu).\n1\narXiv:2504.10373v1 [cs.LG] 14 Apr 2025\n2\nJ. CHEN, K. WU, D. XIU\nregression [3, 57], sparsity-promoting regression [7, 62, 59, 6, 54, 55, 43, 4], Gaussian\nprocesses [48], polynomial approximation [64, 63, 1], linear multistep methods [28, 19],\ngenetic algorithms [66, 67, 12], parameter identification [44], deep neural networks\n(DNNs) [47, 49, 39, 38, 58], and neural ordinary differential equations (ODEs) [11, 29]\nhave shown great promise. Successfully learning these equations enables their solution\nusing appropriate numerical schemes to predict the evolution behavior of complex\nsystems.\nA distinct approach is using data-driven methods to learn the dynamics or flow\nmaps of the underlying unknown equations [46, 65, 14].\nThis approach facilitates\nrecursive predictions of a system\u2019s evolution, thereby circumventing the need to solve\nthe learned equations. A classic example is dynamic mode decomposition (DMD) [56,\n60], which seeks the best-fit linear operator to advance state variables forward in time,\nserving as an approximation to the Koopman operator associated with the underlying\nsystem [5]. With the rapid development of deep learning [26], DNNs have shown great\npromise in data-driven modeling of unknown equations.\nCompared to traditional\nmethods, DNNs excel in managing high-dimensional problems, processing very large\ndatasets, and facilitating parallel computing. DNNs have proven highly effective in\nlearning the dynamics or flow maps of various types of equations, including ODEs\n[46], partial differential equations (PDEs) [65], differential-algebraic equations (DAEs)\n[15], integro-differential equations (IDEs) [14], and stochastic differential equations\n(SDEs) [13]. This flow map learning (FML) methodology has also been extended\nto partially observed systems with missing state variables [21] and non-autonomous\ndynamical systems [45]. Recent progresses in scientific machine learning (SciML) have\nintroduced advanced deep learning techniques for approximating general operators\nmapping between two infinite-dimensional function spaces.\nNotable contributions\ninclude neural operators [35, 36, 31] and deep operator networks (DeepONet) [41, 70],\nwhich can also model PDEs.\nFig. 1: The overall structure of DUE.\nWhile deep learning garners growing interest among students and researchers\nacross various fields, newcomers often encounter challenges due to the complexity of\nnew concepts, algorithms, and coding requirements. To address this, we present Deep\nUnknown Equations (DUE), a framework and open-source Python library for deep\nlearning of unknown equations. DUE aims to simplify the learning process and fa-\nDUE\n3\ncilitate the adoption of advanced deep learning techniques, such as residual neural\nnetworks (ResNet) [24], generalized ResNet (gResNet) [15], operator semigroup net-\nwork (OSG-Net) [9], and Transformers [61]. 
It serves as both an educational tool for\nstudents and a powerful resource for researchers, enabling the learning and modeling\nof any time-dependent differential equations. One of DUE\u2019s standout features is its\nuser-friendly design, which allows users to start learning unknown equations with as\nfew as ten lines of code. This simplicity saves significant time on conceptualization\nand analysis, making advanced techniques more accessible. Moreover, DUE is not\nonly valuable for learning unknown equations but also for creating surrogate models\nof known yet complex equations that are computationally expensive to solve using\ntraditional numerical methods. As the field of deep learning continues to advance\nrapidly, we are committed to maintaining and updating DUE to ensure it remains a\nvaluable tool for those interested in the deep learning of unknown equations. While\nsimilar efforts, such as DeepXDE [42] and NeuralUQ [70], have been made to ease\nthe learning and adoption curve, they focus primarily on solving given differential\nequations or uncertainty quantification. In contrast, DUE uniquely targets the deep\nlearning of unknown equations. In summary, DUE is a comprehensive framework and\naccessible tool that empowers students and researchers to harness deep learning for\nmodeling unknown equations, opening new avenues in scientific discovery.\n2. Data-Driven Deep Learning of Unknown Equations. In this section,\nwe explore how deep learning can be applied to model unknown differential equations\nfrom measurement data. After establishing the basic setup in Section 2.1, we intro-\nduce the essential concepts and methods for modeling unknown ODEs. This includes\ndiscussions on data preprocessing, neural network architectures, and model training,\nwhich form the core components of the deep-learning-based modeling framework. We\nthen describe how this approach can be extended to partially observed systems. Fi-\nnally, we discuss learning unknown PDEs in both nodal and modal spaces.\n2.1. Setup and Preliminaries. To set the stage for our exploration, let us delve\ninto the setup for modeling unknown ODEs [46] and PDEs [65, 14]. The framework\nwe describe can be easily adapted to other types of equations, including DAEs [15],\nIDEs [14], and SDEs [13].\nLearning ODEs. Imagine we are trying to understand an autonomous system\nwhere the underlying equations are unknown ODEs:\n(2.1)\ndu\ndt = f(u(t)),\nu(t0) = u0,\nwhere f : Rn \u2192Rn is unknown. A classic example is the damped pendulum system:\n(2.2)\n\uf8f1\n\uf8f4\n\uf8f2\n\uf8f4\n\uf8f3\ndu1\ndt = u2,\ndu2\ndt = \u2212\u03b1u2 \u2212\u03b2sin(u1),\nwhere u1 is the angle, u2 is the angular velocity, \u03b1 is the damping coefficient, and \u03b2\nrepresents the effect of gravity. If these equations are known, then numerical methods\nlike the Runge\u2013Kutta can solve them, predicting how u1 and u2 evolve over time.\nBut what if these equations are unknown? If we can observe or measure the state\nvariables, can we build a data-driven model to predict their evolution?\nAssume we have measurement data of u collected from various trajectories. Let\n4\nJ. CHEN, K. WU, D. XIU\nt0 < t(i)\n1\n< \u00b7 \u00b7 \u00b7 < t(i)\nK be a sequence of time instances. We use\n(2.3)\nu(i)\nk\n= u(t(i)\nk ; u(i)\n0 , t0) + \u03f5(i)\nu,k,\nk = 1, . . . , Ki,\ni = 1, . . . 
, Itraj,\nto denote the state at time t(i)\nk\nalong the i-th trajectory originating from the initial\nstate u(i)\n0\nat t0, for a total of Itraj trajectories.\nIn real-world scenarios, the data\nmay contain measurement noise \u03f5(i)\nu,k, typically modeled as random variables. Our\nobjective is to create a data-driven model for the unknown ODEs that can predict\nthe evolution of u from any initial state u(t0) = u0.\nFig. 2: Left: Trajectory data collected from multiple initial states for learning ODEs.\nRight: Snapshot data for learning PDEs (only one trajectory is displayed for visual-\nization, while the real dataset may contain multiple trajectories).\nLearning PDEs.\nNow, consider the more complex scenario of an unknown\ntime-dependent PDE system:\n(2.4)\n\uf8f1\n\uf8f4\n\uf8f2\n\uf8f4\n\uf8f3\n\u2202tu = L(u),\n(x, t) \u2208\u2126\u00d7 R+,\nB(u) = 0,\n(x, t) \u2208\u2202\u2126\u00d7 R+,\nu(x, 0) = u0(x),\nx \u2208\u00af\u2126,\nwhere \u2126\u2286Rd is the physical domain, L is the unknown operator governing the PDE,\nB specifies the boundary conditions, and the solution u(x, t) belongs to an infinite-\ndimensional Hilbert space V. A fundamental example of PDEs is the one-dimensional\nBurgers\u2019 equation:\n\u2202tu = L(u)\nwith\nL(u) = \u2212\u2202x\n\u0012u2\n2\n\u0013\n+ \u03bd\u2202xxu,\nwhere the state of u(x, t) is governed by a convective term \u2202x(u2/2) and a diffusive\nterm \u03bd\u2202xxu, with \u03bd > 0 being the diffusion coefficient (or kinematic viscosity in the\ncontext of fluid mechanics). With given initial conditions, numerical methods can\npredict future solutions. But what if the underlying mechanism is unclear and the\nright-hand side of the PDE is unknown? Can we use measurable data of u(x, t) to\nuncover the dynamics?\nAssume the solution u(x, t) of the unknown system (2.4) is measurable, i.e., the\nsnapshot data of u are available at certain time instances as shown in Figure 2:\n(2.5)\nu(xs, t(i)\nk )\ns = 1, 2 . . . , n,\nk = 1, . . . , Ki,\ni = 1, . . . , Itraj.\nDUE\n5\nHere, {xs}n\ns=1 are the discrete spatial locations at which the solution data is measured.\nIn practice, solution data may be collected on varying sets of sampling locations,\nnecessitating interpolation or fitting methods to transform the data onto a consistent\nset {xs}n\ns=1. Our goal is to create a data-driven model for the unknown PDE that\ncan predict the temporal evolution of u given any initial state u(x, 0) = u0(x).\n2.2. Data Pairs. In DUE, we mainly focus on learning the integral form of the\nunderlying equations, which is equivalent to learning the flow maps {\u03a6\u2206}\u2206\u22650 that\ndescribe the time evolution of state variables. The flow map for a time step \u2206is\ndefined as\n(2.6)\n\u03a6\u2206(u0) := u(t0 + \u2206) = u0 +\nZ t0+\u2206\nt0\nf(u(s))ds = u0 +\nZ \u2206\n0\nf(\u03a6s(u0))ds,\nwhere t0 can be arbitrarily shifted for autonomous systems.\nThe flow maps fully\ncharacterize the system\u2019s time evolution. The data (2.3) may be collected at constant\nor varying time lags \u2206(i)\nk\n= t(i)\nk+1 \u2212t(i)\nk . Depending on this, we rearrange the data as\nfollows:\nRearranging Data with Fixed Time Lag \u2206. When data is collected at a\nconstant time lag \u2206, our goal is to learn a single flow map for this specific \u2206. 
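As a concrete illustration (not part of the DUE package; the damping, gravity, time lag, and noise level are arbitrary), the sketch below simulates noisy trajectories of the damped pendulum (2.2) at a constant time lag and collects each state together with its successor, which is exactly the kind of dataset formalized next.

import numpy as np
from scipy.integrate import solve_ivp

# Minimal sketch, not DUE itself: generate noisy trajectory data as in (2.3)
# for the damped pendulum (2.2) and pair each state with its successor.
alpha, beta, delta, n_traj, n_steps, noise = 0.1, 9.8, 0.05, 100, 40, 1e-3

def pendulum(t, u):
    return [u[1], -alpha * u[1] - beta * np.sin(u[0])]

rng = np.random.default_rng(0)
pairs_in, pairs_out = [], []
for _ in range(n_traj):
    u0 = rng.uniform([-np.pi, -2.0], [np.pi, 2.0])             # random initial state
    t_eval = delta * np.arange(n_steps + 1)
    sol = solve_ivp(pendulum, (0.0, t_eval[-1]), u0, t_eval=t_eval, rtol=1e-8)
    u = sol.y.T + noise * rng.standard_normal(sol.y.T.shape)   # noisy measurements
    pairs_in.append(u[:-1])                                    # states at t_k
    pairs_out.append(u[1:])                                    # states at t_k + Delta
u_in, u_out = np.concatenate(pairs_in), np.concatenate(pairs_out)
print(u_in.shape, u_out.shape)                                 # (J, 2) and (J, 2)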
We\nsegment the collected trajectories to form a dataset of input-output pairs:\n(2.7)\nn\nu(j)\nin , u(j)\nout\no\n,\nj = 1, 2, ..., J,\nwhere u(j)\nin and u(j)\nout are neighboring states such that u(j)\nout \u2248\u03a6\u2206(u(j)\nin ), accounting for\nsome measurement noise. Note that multiple data pairs can be extracted from a single\ntrajectory by segmenting it into smaller temporal intervals, leading to J \u2265Itraj.\nRearranging Data with Varying Time Lags. When the time lag \u2206varies,\neach \u2206represents a different flow map. Our objective becomes learning a family of\nflow maps {\u03a6\u2206}\u22061\u2264\u2206\u2264\u22062, where \u22061 and \u22062 are the minimum and maximum time\nlags in the dataset. We rearrange the data into:\n(2.8)\nn\nu(j)\nin , \u2206(j), u(j)\nout\no\n,\nj = 1, 2, ..., J,\nwith u(j)\nout \u2248\u03a6\u2206(j)(u(j)\nin ), considering some measurement noise.\n2.3. Deep Neural Networks. In this subsection, we introduce several effective\nDNN architectures for modeling unknown equations, including the basic feedforward\nneural networks (FNNs), residual neural network (ResNet) [24], generalized ResNet\n(gResNet) [15], and operator semigroup network (OSG-Net) [9].\nFNN. As a foundational architecture in deep learning, FNN with L hidden layers\ncan be mathematically represented as:\n(2.9)\nN\u03b8(uin) = WL+1 \u25e6(\u03c3L \u25e6WL) \u25e6\u00b7 \u00b7 \u00b7 \u25e6(\u03c31 \u25e6W1)(uin),\nwhere W\u2113\u2208Rn\u2113\u00d7n\u2113\u22121 is the weight matrix of the \u2113th hidden layer, \u03c3\u2113denotes the\nactivation function, \u25e6signifies composition, and \u03b8 denotes all trainable parameters.\nCommon activation functions include the hyperbolic tangent (Tanh), the rectified\nlinear unit (ReLU), and the Gaussian error linear unit (GELU). For flow map learning,\nwe set n0 = nL+1 = n, where n denotes the number of state variables (recalling that\nu \u2208Rn). The numbers of neurons in the hidden layers, n\u2113with \u2113= 1, 2, ..., L, are\nhyperparameters that typically require calibration based on the specific problems.\n6\nJ. CHEN, K. WU, D. XIU\nResNet. ResNet [24] is an advanced variant of FNN, particularly effective for\nlearning unknown equations [46]. Initially proposed for image processing [24], ResNet\nintroduces an identity mapping, enabling the network to learn the residue of the input-\noutput mapping more effectively. As depicted in Figure 3, a ResNet can be described\nas\n(2.10)\nbuout = ResNet\u03b8(uin) := uin + N\u03b8(uin) = (In + N\u03b8) (uin),\nBy comparing (2.10) with (2.6), ResNet is particularly suitable for FML, as it enforces\nN\u03b8 to approximate the effective increment of the state variables:\n(2.11)\nN\u03b8(uin) \u2248\nZ \u2206\n0\nf(u(s))ds =\nZ \u2206\n0\nf(\u03a6s(uin))ds.\nuin\nPlain\nneural\nnetwork\n+\nuout\nuin\nPrior\nmodel\nPlain\nneural\nnetwork\n+\nuout\nuin\n\u2206\nPlain\nneural\nnetwork\n\u00d7\n+\nuout\nFig. 3: ResNet (left), gResNet (middle), and OSG-Net (right).\nThe symbol \u201c+\u201d\nindicates element-wise summation, while the symbol \u201c\u00d7\u201d rerpesents multiplication.\ngResNet. As shown in Figure 3, gResNet [15] generalizes the traditional ResNet\nconcept by defining the residue as the difference between the output data and the\npredictions made by a prior model:\n(2.12)\nuout = gResNet(uin) := A(uin) + N\u03b8(uin),\nwhere A is the prior model, and N\u03b8 acts as a correction for A. 
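In code, the update (2.12) is a residual connection wrapped around the prior model; the PyTorch-style sketch below is illustrative only and is not the DUE implementation (the hidden width and activation are arbitrary choices). With the identity prior it reduces to the standard ResNet (2.10).

import torch
import torch.nn as nn

# Illustrative sketch (not the DUE implementation) of gResNet (2.12):
# the network N_theta corrects a user-supplied prior model A.
class GResNet(nn.Module):
    def __init__(self, n_state, width=64, prior=None):
        super().__init__()
        self.prior = prior if prior is not None else (lambda u: u)   # A(u); identity -> ResNet
        self.net = nn.Sequential(                                    # N_theta: a small FNN
            nn.Linear(n_state, width), nn.GELU(),
            nn.Linear(width, width), nn.GELU(),
            nn.Linear(width, n_state),
        )

    def forward(self, u_in):
        return self.prior(u_in) + self.net(u_in)                     # A(u_in) + N_theta(u_in)

model = GResNet(n_state=2)              # e.g. the damped pendulum state (u1, u2)
u_pred = model(torch.randn(16, 2))      # one flow-map step for a batch of input states
print(u_pred.shape)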
If an existing prior\nmodel is unavailable, A can be constructed from data, such as using a modified DMD\n[15] to construct a best-fit affine model:\nA(uin) := Auin + b,\nwhere A \u2208Rn\u00d7n and b \u2208Rn are determined by solving the following linear regression\nproblem:\n(2.13)\n(A, b) = arg min\n\u02dcA\u2208Rn\u00d7n\n\u02dcb\u2208Rn\n1\nJ\nJ\nX\nj=1\n\r\r\ru(j)\nout \u2212\u02dcAu(j)\nin \u2212\u02dcb\n\r\r\r\n2\n2 .\nTo solve problem (2.13), we first augment the input vector by appending a constant\nterm:\n\u02dcuin =\n\u0014uin\n1\n\u0015\n\u2208Rn+1,\nDUE\n7\nwhere the constant 1 accommodates the bias term b in the affine model. Next, we\nconstruct the following matrices using the dataset (2.7):\nY := [u(1)\nout, u(2)\nout, . . . , u(J)\nout] \u2208Rn\u00d7J,\nX := [\u02dcu(1)\nin , \u02dcu(2)\nin , . . . , \u02dcu(J)\nin ] \u2208R(n+1)\u00d7J.\nThe solution to the linear regression problem (2.13) can then be explicitly expressed\nas\n\u0002A\nb\u0003\n= YX\u22a4(XX\u22a4)\u22121.\nThis modification to DMD accommodates potential non-homogeneous terms in\nthe unknown equations, making the approximation more flexible.\nThe concept of\ngResNet encompasses the standard ResNet with A = In and b = 0.\nOSG-Net. To adeptly approximate a family of flow maps associated with varying\ntime step sizes, it is necessary to incorporate the time step size as an input to DNN.\nThe flow maps of autonomous systems form a one-parameter semigroup, satisfying\n\u03a60 = In,\n(2.14a)\n\u03a6\u22061+\u22062 = \u03a6\u22061 \u25e6\u03a6\u22062\n\u2200\u22061, \u22062 \u2208R+.\n(2.14b)\nThe semigroup property is crucial as it connects the system\u2019s evolutionary behaviors\nacross different time scales. Therefore, it is natural for data-driven models to ad-\nhere to this property. The OSG-Net, proposed in [9], is well-suited for this purpose.\nMathematically, an OSG-Net can be expressed as\n(2.15)\nbuout = OSG-Net\u03b8(uin, \u2206) := uin + \u2206N\u03b8(uin, \u2206).\nThe architecture of OSG-Net, illustrated in Figure 3, involves concatenating the state\nvariables uin with the time step size \u2206before inputting them into the network N\u03b8.\nUnlike ResNet, OSG-Net introduces an additional skip connection that scales the\noutput of N\u03b8 by \u2206. This design ensures that an OSG-Net inherently satisfies the\nfirst property (2.14a). As for the second property, we can design special loss functions\nto embed this prior knowledge into OSG-Net via training, which can enhance the\nmodel\u2019s long-term stability (see Section 3.2 for detailed discussions).\nBy comparing (2.15) with (2.6), it is clear that N\u03b8 serves as an approximation to\nthe time-averaged effective increment:\n(2.16)\nN\u03b8(uin, \u2206) \u22481\n\u2206\nZ \u2206\n0\nf(u(s))ds = 1\n\u2206\nZ \u2206\n0\nf(\u03a6s(uin))ds.\n2.4. Model Training and Prediction. Once the data pairs are rearranged\nand an appropriate DNN architecture is selected, model training is carried out by\nminimizing a suitable loss function. The commonly used mean squared error (MSE)\nquantifies the discrepancy between the predicted outputs and the actual values:\n(2.17)\nL(\u03b8) = 1\nJ\nJ\nX\nj=1\n\r\r\rbu(j)\nout(\u03b8) \u2212u(j)\nout\n\r\r\r\n2\n2 .\nIt is worth noting that training data extracted from the same trajectory are not inde-\npendent. To account for the structure of observational noise or the highly clustered\nnature of data from a single trajectory, a suitably weighted norm can be applied in\n8\nJ. CHEN, K. WU, D. 
XIU\nthe loss function (2.17). Some alternative loss functions will be discussed in Section 3\nto enhance the prediction accuracy and stability.\nIn practice, L(\u03b8) is minimized using stochastic gradient descent (SGD) [53] or its\nvariants, such as Adam [30]. SGD works by randomly splitting the training dataset\ninto mini-batches. At each iteration, the gradient of the loss function with respect\nto \u03b8 is computed for one mini-batch, and this gradient is used to update the param-\neters. This process repeats for multiple epochs until the loss function is sufficiently\nminimized. The procedure for training DNNs using SGD is outlined in Algorithm 2.1.\nAlgorithm 2.1 Model training using stochastic gradient descent (SGD)\nRequire: Number of epochs E, batch size B; training data {(u(j)\nin , u(j)\nout)}J\nj=1 (fixed\ntime lag) or {(u(j)\nin , \u2206(j), u(j)\nout)}J\nj=1 (varied time lags)\n1: Initialize the DNN parameters \u03b8 randomly\n2: for epoch = 1 to E do\n3:\nShuffle the training data\n4:\nfor batch = 1 to\n\u0004 J\nB\n\u0005\ndo\n5:\nSample a mini-batch \u039b of size B from the training data\n6:\nUpdate the DNN parameters:\n\u03b8 \u2190\u03b8 \u2212\u03b7\u2207\u03b8L(\u039b)(\u03b8),\n7:\nwhere the learning rate \u03b7 > 0 is often adapted during training, and\n\u2207\u03b8L(\u039b)(\u03b8) = 1\nB\nX\nj\u2208\u039b\n\u2207\u03b8\n\r\r\rbu(j)\nout(\u03b8) \u2212u(j)\nout\n\r\r\r\n2\n2 .\n8:\nend for\n9: end for\nOnce the DNN is successfully trained, it is recursively used to conduct predictions\nfrom any given initial state upre(t0) = u(t0). The trained DNN model, denoted as b\u03a6\u03b8\npredicts the solution evolution as follows:\n(2.18)\nupre(tk+1) = b\u03a6\u03b8(upre(tk)),\nk = 0, 1, . . .\nwith a fixed time step size tk+1 \u2212tk \u2261\u2206, or\n(2.19)\nupre(tk+1) = b\u03a6\u03b8(upre(tk), \u2206k),\nk = 0, 1, . . .\nwith varying time step sizes tk+1 \u2212tk = \u2206k.\n2.5. Learning Partially Observed Systems. In many real-world scenarios,\ncollecting data for all state variables u \u2208Rn is not always feasible. Instead, obser-\nvations can be restricted to a subset of the state variables w \u2208Rm, where m < n.\nThis limitation shifts the focus to learning the dynamics of w alone, resulting in\nnon-autonomous unknown governing equations due to the absence of other variables.\nSimilar to the fully observed case, the training data can be constructed from sampling\non multiple long trajectories or many short trajectories with M + 1 observations of\nw. If data from multiple long trajectories of w with a fixed time lag \u2206are available:\n(2.20)\nw(i)\nk\n= w(t(i)\nk ; w(i)\n0 , t0) + \u03f5(i)\nw,k,\nk = 1, . . . , Ki,\ni = 1, . . . , Itraj,\nDUE\n9\nthen we rearrange these trajectories into shorter bursts of M + 1 consecutive states:\n(2.21)\nn\nw(j)\n0 , w(j)\n1 , . . . , w(j)\nM+1\no\n,\nj = 1, 2, . . . , J.\nTo model the temporal evolution of w, a memory-based DNN architecture was intro-\nduced in [21]:\n(2.22)\nwk+1 = wk + N\u03b8(wk, wk\u22121, . . . , wk\u2212M),\nk \u2265M > 0,\nwhere T := M\u2206represents the memory length, which is problem-dependent and often\nrequires manual tuning. The state wk at time tk, along with the M preceding states,\nare concatenated as inputs for the neural network N\u03b8. The following loss function is\nthen minimized:\n(2.23)\nL(\u03b8) = 1\nJ\nJ\nX\nj=1\n\r\r\rw(j)\nM+1 \u2212\n\u0010\nw(j)\nM + N\u03b8(w(j)\nM , . . . 
, w(j)\n1 , w(j)\n0 )\n\u0011\r\r\r\n2\n2 .\nLearning a fully observed system is a special case with m = n and M = 0. Once the\nDNN model is successfully trained, it can be recursively used to predict the system\u2019s\nevolution from any initial states (w(t0), w(t1), . . . , w(tM)):\n(2.24)\n(\nwpre(tk) = w(tk),\nk = 0, 1, . . . , M,\nwpre(tk+1) = wpre(tk) + N\u03b8 (wpre(tk), wpre(tk\u22121), . . . , wpre(tk\u2212M)) ,\nk \u2265M,\nwhere tk+1 \u2212tk \u2261\u2206.\nThis approach has also been applied to systems with hidden parameters [22], as\nwell as PDE systems with snapshot data observed on a subset of the domain [16].\n2.6. Learning Unknown PDEs. The aforementioned framework can be seam-\nlessly extended to data-driven modeling of unknown PDEs. This can be effectively\nachieved in either nodal or modal space, as illustrated in Figure 4.\nexpansion\nFourier\nGeneralized\nmesh grids\nSample on\nu( \u00b7 , t)\nU(t)\nV(t)\n\u03a6\u2206\nV(t + \u2206)\nU(t + \u2206)\nu( \u00b7 , t + \u2206)\nPp\nj=1 v j(t)\u03c8 j(x)\nFig. 4: Learning PDEs in nodal space (top branch) and modal space (bottom branch).\n2.6.1. Learning in Nodal Space. Let u : \u2126\u00d7 R+ \u2192Rdu represent the state\nvariables of the underlying unknown d-dimensional PDE, and \u2126\u2282Rd, where d is the\nspatial dimension, and du is the length of the state vector u. As shown in the upper\n10\nJ. CHEN, K. WU, D. XIU\nbranch of Figure 4, assume we have measurement data of u at a set of nodal points\nX = {x1, . . . , xn} \u2282\u2126, collected from various trajectories:\n(2.25)\nU(i)\nk\n= U(t(i)\nk ; U(i)\n0 , t0) + \u03f5(i)\nU,k,\nk = 1, . . . , Ki,\ni = 1, \u00b7 \u00b7 \u00b7 , Itraj,\nwhere U(t) = (u(x1, t), . . . , u(xn, t))\u22a4\u2208Rn\u00d7du is a matrix. While ResNet and OSG-\nNet built upon FNNs can be used for learning PDEs [14, 9], they can be compu-\ntationally expensive when X contains a large number of nodal points. To address\nthis, we can replace FNNs with more suitable DNNs, such as the convolutional neural\nnetworks (CNNs) [33, 68], the Fourier Neural Operator (FNO) [36], and many other\nneural operators [34, 8, 10], including those built upon Transformers [61, 10] from\nlarge language models.\nTransformers. Transformers [61], particularly those based on the self-attention\nmechanism, are highly effective for capturing long-range dependencies in data. Math-\nematically, a generalized Transformer can be expressed as\nT\u03b8(Uin) = \u03c9L+1 \u25e6(\u03c3L \u25e6\u03b1L \u25e6\u03c9L) \u25e6\u00b7 \u00b7 \u00b7 \u25e6(\u03c31 \u25e6\u03b11 \u25e6\u03c91)(Uin, X),\nwhere each set of operations {\u03c3\u2113\u25e6\u03b1\u2113\u25e6\u03c9\u2113}L\n\u2113=1 represents the following transformation:\n(2.26)\nU\u2113= \u03c3\u2113\u25e6\u03b1\u2113\u25e6\u03c9\u2113(U\u2113\u22121) := \u03c3\u2113(A\u2113U\u2113\u22121W\u2113).\nHere, U\u2113\u2208Rn\u2113\u00d7d\u2113is a matrix, with \u2113= 1, 2, . . . , L, represents the output of the \u2113-th\nhidden layer. The initial input, U0 = [Uin, X] \u2208Rn\u00d7(du+d), is formed by concatenat-\ning the input function values and nodal point coordinates. In this setup: \u03c3\u2113is the\nactivation function; \u03c9\u2113represents a transformation via right multiplication by a weight\nmatrix W\u2113\u2208Rd\u2113\u22121\u00d7d\u2113; \u03b1\u2113represents a convolution via left multiplication by a kernel\nmatrix A\u2113\u2208Rn\u2113\u00d7n\u2113\u22121. 
Each hidden layer can thus be interpreted as transforming a\nvector-function with d\u2113\u22121 components sampled on a latent grid X\u2113\u22121 = {x\u2113,j}n\u2113\u22121\nj=1 , to a\nnew vector-function with d\u2113components sampled on a new latent grid X\u2113= {x\u2113,i}n\u2113\ni=1,\nwhere X0 = XL = X. The sizes of the hidden layers, specified by {d\u2113}L\n\u2113=1 and {n\u2113}L\u22121\n\u2113=1 ,\nare hyperparameters that typically require tuning based on the problem at hand. At\nthe output layer, we set dL+1 = du and nL = n to produce the predicted function\nvalues on the target grid X.\nTransformers can be enhanced with a multi-head attention mechanism, perform-\ning multiple convolutions in each hidden layer to provide a comprehensive view of the\ntarget operator. This is achieved by replacing A\u2113U\u2113\u22121W\u2113in (2.26) with the concatena-\ntion of different heads {Ah\n\u2113U\u2113\u22121W h\n\u2113}H\nh=1, where Ah\n\u2113\u2208Rn\u2113\u00d7n\u2113\u22121 and W h\n\u2113\u2208Rd\u2113\u22121\u00d7\nd\u2113\nH .\nThe general formulation in (2.26) encompasses many deep learning methods, dis-\ntinguished by the implementation of the convolution operator A\u2113.\n\u2022 In CNNs [32], A\u2113performs local weighted sums over spatially structured\ndata. The non-zero values of A\u2113, which constitute the trainable weights, are\nidentical but shifted accross the rows, as these weights are shared accross \u2126.\nThis convolution is usually combined with pooling or up-pooling layers [52],\nwhich downsample or upsample U\u2113from the grid X\u2113\u22121 to a coarser or finer\ngrid X\u2113.\n\u2022 In Transformers built upon the self-attention mechanism [61], A\u2113performs\nglobal convolution. Mathematically, A\u2113is implemented as\n(2.27)\nA\u2113= Softmax\n \n(U\u2113\u22121W Q\n\u2113)(U\u2113\u22121W K\n\u2113)\u22a4\np\nd\u2113\u22121\n!\n,\nDUE\n11\nwhere W Q\n\u2113, W K\n\u2113\n\u2208Rd\u2113\u22121\u00d7d\u2113are two trainable weight matrices, and Softmax\nnormalizes each row of a matrix into a discrete probability distribution. In\n[37], a cross-attention mechanism was proposed to enable the change of mesh.\nSpecifically, U\u2113\u22121W Q\n\u2113in (2.27) is replaced by X\u2113W X\n\u2113, with W X\n\u2113\n\u2208Rd\u00d7d\u2113being\na trainable weight matrix. This design allows cross-attention to output a new\nfunction sampled on any mesh X\u2113.\nPosition-induced Transformer (PiT). Here, we present a Transformer-based\nmethod, named PiT, built upon the position-attention mechanism proposed in [10].\nDistinguished from other Transformer-based networks [8, 23, 37] built upon the clas-\nsical self-attention [61], position-attention implements the convolution operator by\nconsidering the spatial interrelations between sampling points. Define the pariwise\ndistance matrix D\u2113\u2208Rn\u2113\u00d7n\u2113\u22121 between X\u2113and X\u2113\u22121 by D\u2113,ij = \u2225x\u2113,i \u2212x\u2113\u22121,j\u22252\n2.\nThen A\u2113is defined as A\u2113:= Softmax(\u2212\u03bb\u2113D\u2113), where \u03bb\u2113\u2208R+ is a trainable parame-\nter. Position-attention represents a global linear convolution with a stronger focus on\nneighboring regions, resonating with the concept of domain of dependence in PDEs\nand making PiT appealing for learning PDEs [10]. 
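A minimal sketch of a single position-attention layer is shown below; it is ours, not the PiT reference implementation, and the grid sizes, channel widths, and the value of lambda are arbitrary. It builds A from pairwise squared distances and then applies the layer update (2.26).

import torch
import torch.nn as nn

# Minimal sketch (not the PiT reference code) of position attention:
# A = Softmax(-lambda * D) from pairwise squared distances, then sigma(A U W).
def position_attention(x_out, x_in, lam):
    d2 = torch.cdist(x_out, x_in) ** 2          # D[i, j] = ||x_out_i - x_in_j||^2
    return torch.softmax(-lam * d2, dim=-1)     # row-wise softmax

n_in, n_out, d_in, d_out = 256, 64, 8, 16       # illustrative layer sizes
x_in, x_out = torch.rand(n_in, 2), torch.rand(n_out, 2)    # input and latent grids in 2D
u_in = torch.randn(n_in, d_in)                  # function values on the input grid

lam = nn.Parameter(torch.tensor(5.0))           # trainable; kept positive in practice
w = nn.Parameter(torch.randn(d_in, d_out) / d_in ** 0.5)
a = position_attention(x_out, x_in, lam)        # (n_out, n_in) convolution kernel
u_out = torch.nn.functional.gelu(a @ u_in @ w)  # sigma(A U W) on the latent grid
print(u_out.shape)                              # torch.Size([64, 16])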
The parameter \u03bb\u2113is interpretable,\nas most attention at a point x\u2113,i \u2208X\u2113is directed towards those points x\u2113\u22121,j \u2208X\u2113\u22121\nwith the distance to x\u2113,i smaller than 1/\u221a\u03bb\u2113.\nIn practice, we construct a latent\nmesh Xltt by coarsening X while preserving essential geometric characteristics, and\nlet X\u2113= Xltt,\nn\u2113= nltt,\nfor \u2113= 1, 2, . . . , L \u22121, with nltt < n. This design re-\nduces the computational cost caused by a potential large number of sampling points\nin the dataset. Like many other neural operators [31, 2], PiT is mesh-invariant and\ndiscretization convergent. Once trained, PiT can generalize to new input meshes, de-\nlivering consistent and convergent predictions as the input mesh is refined. To learn\ntime-dependent unknown PDEs from data, we construct a (g)ResNet or OSG-Net\nwith PiT as the basic block. Once the model is successfully trained, we can recur-\nsively call the model to predict the evolutionary behaviors of u(x, t) given any initial\nconditions.\n2.6.2. Learning in Modal Space. An alternative strategy is to model un-\nknown PDEs in modal space [65] by combining traditional model reduction with deep\nlearning approaches. Initially, select a finite-dimensional function space with a suit-\nable basis to approximate each component of u(x, \u00b7):\nVp = span {\u03c81(x), ..., \u03c8p(x)} ,\nwhere p \u2264n, and the basis functions \u03a8(x) := (\u03c81(x), ..., \u03c8p(x))\u22a4are defined on the\nphysical domain \u2126. As shown in the lower branch of Figure 4, the solution of the\nunderlying PDE can then be approximated in Vp by a finite-term series:\nu(x, t) \u2248\np\nX\nj=1\nvj(t)\u03c8j(x),\nwith V := (v1, ..., vp)\u22a4\u2208Rp\u00d7du being the modal expansion coefficients. This intro-\nduces a bijective mapping:\n(2.28)\n\u03a0 : Rp\u00d7du \u2192[Vp]du,\n\u03a0V = V\u22a4\u03a8(x),\nwhich defines a unique correspondence between a function in [Vp]du and its modal\nexpansion coefficients.\n12\nJ. CHEN, K. WU, D. XIU\nNow, we project each data sample U(i)\nk\nin (2.25) into [Vp]du, yielding a coefficient\nmatrix V(i)\nk . This is achieved by solving the linear regression problem:\n(2.29)\nV(i)\nk\n= arg min\n\u02dcV\u2208Rp\u00d7du\n\r\r\r(U(i)\nk )\u22a4\u2212\u02dcV\n\u22a4\u03a8(X)\n\r\r\r\n2\n2 ,\nwhere \u03a8(X) = (\u03a8(x1), \u03a8(x2), . . . , \u03a8(xn)) is a p \u00d7 n matrix, representing the basis\nfunction values evaluated at the sampling grids X.\nThe solution to (2.29) can be\nexpressed as\nV(i)\nk\n=\n\u0000\u03a8(X)\u03a8(X)\u22a4\u0001\u22121 \u03a8(X)U(i)\nk .\n= V(t(i)\nk ; V(i)\n0 , t0) + \u03f5(i)\nV,k,\nk = 1, . . . , Ki,\ni = 1, \u00b7 \u00b7 \u00b7 , Itraj,\nwhere V(t(i)\nk ; V(i)\n0 , t0) denotes the modal coefficients of the underlying function, and\n\u03f5(i)\nV,k =\n\u0000\u03a8(X)\u03a8(X)\u22a4\u0001\u22121 \u03a8(X)\u03f5(i)\nU,k represents the noise inherited from the nodal value\nnoise. We can then treat V as the state variables and model the unknown governing\nODEs using deep learning approaches, offering a predictive model for the evolution\nof V. The behavior of U can be easily inferred through the bijective mapping (2.28).\nLearning unknown PDEs in the modal space provides great flexibility in choosing\ndifferent basis functions to represent the solution, including trigonometric functions,\nwavelet functions, Legendre polynomials, and piecewise polynomials. 
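A minimal numerical sketch of the projection step (2.29) is given below, using NumPy's least-squares solver rather than DUE's generalized Fourier projection utilities:

import numpy as np

def modal_coefficients(U, Psi_X):
    """Solve (2.29): V = argmin || U^T - V^T Psi(X) ||_2^2.

    U:     (n, du) nodal values of one snapshot,
    Psi_X: (p, n)  basis functions evaluated on the sampling grid X.
    Returns V of shape (p, du); equivalent to (Psi Psi^T)^{-1} Psi U.
    """
    V, *_ = np.linalg.lstsq(Psi_X.T, U, rcond=None)
    return V

def nodal_values(V, Psi_X):
    """Recover U ~ Psi(X)^T V through the bijective mapping (2.28)."""
    return Psi_X.T @ V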
This approach is\nanalogous to traditional numerical methods, such as spectral Galerkin, finite element,\nand finite volume methods, commonly used for solving known PDEs.\n2.6.3. Remarks on Learning PDEs. In the modal learning approach, when\nan interpolation basis is used, the resulting modal coefficients directly correspond to\nfunction values. This allows both the modal and nodal learning approaches to be\nrepresented through the expansion shown in the bottom path of Figure 4, highlight-\ning a connection between the two methods. Although Transformers were originally\ndeveloped for nodal learning, they may also be adapted for modal learning, as the\nattention mechanism can be used to capture dependencies among different modes.\nOur data-driven models in DUE serve as approximate evolution operators for the\nunderlying unknown PDEs. They enable prediction of future solutions for any initial\nconditions without necessitating retraining.\nThis contrasts with physics-informed\nneural networks (PINNs) [50], which require fewer or no measurement data but solve\na given PDE for a specific initial condition, typically necessitating retraining for each\nnew initial condition.\nThe above deep learning frameworks are not only useful for modeling unknown\nPDEs but also for creating surrogate models of known, yet complex, PDEs that are\nexpensive to solve using traditional numerical methods.\n3. Enhancing Prediction Accuracy and Stability. In learning unknown\ntime-dependent differential equations, our goal is to predict the system\u2019s evolution\naccurately over extended periods.\nThis section introduces two loss functions and\na novel neural network architecture designed to enhance the long-term prediction\naccuracy and stability of the learned models.\n3.1. Multi-step Loss. Research by [14] shows that using a multi-step loss func-\ntion can significantly improve predictive models with fixed time step sizes. This ap-\nproach averages the loss over multiple future time steps.\nThe training dataset is\nDUE\n13\nstructured as follows:\n(3.1)\nn\nw(j)\n0 , w(j)\n1 , . . . , w(j)\nM+1, . . . , w(j)\nM+1+K\no\n,\nj = 1, 2, . . . , J,\nwhere K \u22650 represents the number of future time steps. During training, initial\nstates w(j)\n0 , w(j)\n1 , . . . , w(j)\nM are used, and the DNN model (2.22) is executed for K + 1\nsteps to produce predictions bw(j)\nM+1, . . . , bw(j)\nM+1+K.\nThe multi-step loss function is\ndefined as\n(3.2)\nL(\u03b8) =\n1\nJ(K + 1)\nJ\nX\nj=1\nK\nX\nk=0\n\r\r\rw(j)\nM+1+k \u2212bw(j)\nM+1+k(\u03b8)\n\r\r\r\n2\n2 .\nNote that the loss function in Equation (2.23) is a special case with K = 0.\n3.2. Semigroup-informed Loss. As mentioned in Section 2.3, an OSG-Net\ninherently satisfies the first constraint (2.14a). To embed the second property (2.14b)\ninto an OSG-Net, a global direct semigroup-informed (GDSG) loss function was in-\ntroduced in [9], which effectively guides an OSG-Net to adhere to (2.14b) through\ntraining. 
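Before detailing the GDSG loss, the multi-step loss of Section 3.1 can be sketched as follows. This is a schematic only; it assumes the memory window is concatenated most-recent-first as the input of the one-step model (2.22).

import torch

def multi_step_loss(net, history, future):
    """Schematic version of (3.2).

    history: list of M+1 tensors w_0, ..., w_M, each of shape (batch, dim),
    future:  tensor of shape (batch, K+1, dim) holding w_{M+1}, ..., w_{M+1+K},
    net:     one-step model N_theta acting on the concatenated memory window.
    """
    hist, preds = list(history), []
    for _ in range(future.shape[1]):
        w_next = hist[-1] + net(torch.cat(hist[::-1], dim=-1))  # (2.22)-style update
        preds.append(w_next)
        hist = hist[1:] + [w_next]                               # slide the memory window
    pred = torch.stack(preds, dim=1)
    return ((future - pred) ** 2).sum(dim=-1).mean()             # average over j and k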
The GDSG method integrates a regularization term informed by the semi-\ngroup property (2.14b) to the data-driven loss function:\n(3.3)\nL(\u03b8) =\n1\n(1 + \u03bb)J\nJ\nX\nj=1\n\u0012\r\r\ru(j)\nout \u2212bu(j)\nout(\u03b8)\n\r\r\r\n2\n2 + \u03bbR(j)\nSG(\u03b8)\n\u0013\n,\nwhere \u03bb > 0 serves as a regularization factor, and R(j)\nSG(\u03b8) is defined as\n(3.4)\nR(j)\nSG(\u03b8) := 1\n2\n\u0012\r\r\r\u00afu(j)(\u03b8) \u2212\u02dcu(j)(\u03b8)\n\r\r\r\n2\n2 +\n\r\r\r\u00afu(j)(\u03b8) \u2212\u02d8u(j)(\u03b8)\n\r\r\r\n2\n2\n\u0013\n,\nwith \u00afu(j), \u02dcu(j), and \u02d8u(j) being network predictions of randomly generated initial con-\nditions eu(j)\n0\nand random forward time steps \u2206(j)\n0 , \u2206(j)\n1 :\n\u00afu(j) = OSG-Net\u03b8\n\u0010\neu(j)\n0 , \u2206(j)\n0\n+ \u2206(j)\n1\n\u0011\n,\nwhich is the predicted state after a single forward step of size \u2206(j)\n0\n+ \u2206(j)\n1 , and\n\u02dcu(j) = OSG-Net\u03b8\n\u0010\nOSG-Net\u03b8\n\u0010\neu(j)\n0 , \u2206(j)\n0\n\u0011\n, \u2206(j)\n1\n\u0011\n,\n\u02d8u(j) = OSG-Net\u03b8\n\u0010\nOSG-Net\u03b8\n\u0010\neu(j)\n0 , \u2206(j)\n1\n\u0011\n, \u2206(j)\n0\n\u0011\n,\nwhich are the predicted states after two sequential forward steps. According to the\nsemigroup property, \u00afu(j), \u02dcu(j), and \u02d8u(j) are predictions of the same true state and\nshould therefore be enforced to be equal. Hence, incorporating (3.4) into the loss\nfunction encourages OSG-Net\u03b8 to adhere to property (2.14b). Remarkably, comput-\ning the residue (3.4) does not require additional measurement data. Moreover, the\nGDSG method can be further improved by generating multiple pairs of random data\n{eu(j,q)\n0\n, \u2206(j,q)\n0\n, \u2206(j,q)\n1\n}Q\nq=1 and using the averaged residue over Q pairs; see Section 3.2\nof [9] for more details.\n14\nJ. CHEN, K. WU, D. XIU\n3.3. Dual-network Technique for Multiscale Dynamics. Modeling equa-\ntions with varying time step sizes necessitates capturing dynamics characterized by\ntemporal multiscale properties. A plain neural network may struggle with large time\nscale separations, leading to poor long-term prediction accuracy. In this paper, we\nintroduce a novel dual-network architecture, called the dual-OSG-Net, which we pro-\npose as a new approach that leverages the gating mechanism [27] to effectively learn\ndynamics across broader time scales.\n\u2206\nuin\nOSG-Net\u03b81\nOSG-Net\u03b82\nGating\u03b83\nh2\nh1\nw1\nw2\nw1h1 + w2h2\nuout\nFig. 5: Dual-OSG-Net for learning multiscale equations.\nAs illustrated in Figure 5, the dual-OSG-Net combines predictions from two in-\ndependent OSG-Nets using weighted averaging. The weights {(w1, w2)|w1 > 0, w2 >\n0, w1 + w2 = 1} are determined by another neural network, Gating\u03b83, with Softmax\nactivation at its output layer. This gating network Gating\u03b83 is trained simultaneously\nwith the two OSG-Nets (OSG-Net\u03b81 and OSG-Net\u03b82) and intelligently decides which\nOSG-Net weighs more. The gating mechanism adaptively assigns a weight to each\nOSG-Net based on the time step size, allowing each network to adaptively focus on a\nspecific scale. For small time steps, it prioritizes the OSG-Net optimized for fine-scale\ndynamics, while for larger steps, it emphasizes the network suited to coarse scales.\nThis adaptability enables the dual-OSG-Net to handle multi-scale problems more ef-\nfectively than a single, larger OSG-Net, which lacks this flexibility and must attempt\nto capture all scales simultaneously. 
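The gated combination in Figure 5 can be sketched as below. This is a schematic rather than DUE's implementation; in particular, feeding both the state and the step size ∆ (as a tensor of shape (batch, 1)) to the gating network is one plausible wiring and is our assumption.

import torch

class DualOSGNetSketch(torch.nn.Module):
    """Weighted average of two OSG-Net branches, as in Figure 5 (sketch only).

    branch1 / branch2: callables mapping (u, dt) -> next state with the same shape as u.
    The gate ends with a Softmax layer, so the two weights are positive and sum to one.
    """
    def __init__(self, branch1, branch2, state_dim, hidden=32):
        super().__init__()
        self.branch1, self.branch2 = branch1, branch2
        self.gate = torch.nn.Sequential(
            torch.nn.Linear(state_dim + 1, hidden), torch.nn.GELU(),
            torch.nn.Linear(hidden, 2), torch.nn.Softmax(dim=-1))

    def forward(self, u, dt):
        h1, h2 = self.branch1(u, dt), self.branch2(u, dt)
        w = self.gate(torch.cat([u, dt], dim=-1))   # (batch, 2), w1 + w2 = 1
        return w[..., :1] * h1 + w[..., 1:] * h2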
In Section 5.4, we will demonstrate the superior\nperformance of the dual-OSG-Net compared to the standard single OSG-Net through\nnumerical comparisons.\n4. Overview and Usage of DUE. This section introduces the structure and\nusage of DUE, a comprehensive library designed for data-driven learning of unknown\nequations. As illustrated in Figure 1, DUE comprises three main modules:\n\u2022 datasets: This module handles data loading and essential preprocessing\ntasks such as slicing, regrouping, and normalization.\n\u2022 networks: It includes a variety of DNN architectures like FNN, ResNet,\ngResNet, OSG-Net, dual-OSG-Net, Transformers, and more.\n\u2022 models: This module is dedicated to training the deep learning-based mod-\nels, offering various learning strategies to enhance prediction accuracy and\nstability.\nThis structure allows users to quickly understand its usage and customize or add new\nfunctionalities as needed. Detailed usage and customization of DUE are explained in\nSections 4.1 and 4.2.\n4.1. Usage. With DUE, learning unknown equations is simplified to just a few\nlines of code. Below is a template script with detailed comments for modeling the\ndynamics of a damped pendulum (see Section 5.1 for detailed descriptions). For more\ncomplex tasks, slight modifications may be needed, such as alternating data loaders,\nchanging neural network architectures, and adapting training strategies.\nDUE\n15\nimport due\n# Load the configuration for the modules: datasets, networks, and models\nconf_data, conf_net, conf_train = due.utils.read_config(\"config.yaml\")\n# Load the (measurement) data, slice them into short bursts,\n# apply normalization, and store the minimum and maximum values of the state varaibles\ndata_loader = due.datasets.ode.ode_dataset(conf_data)\ntrainX, trainY, test_set, vmin, vmax = data_loader.load(\"train.mat\", \"test.mat\")\n# Construct a neural network\nmynet = due.networks.fcn.resnet(vmin, vmax, conf_net)\n# Define and train a model, save necessary information of the training history\nmodel = due.models.ODE(trainX, trainY, mynet, conf_train)\nmodel.train()\nmodel.save_hist()\n# Conduct long-term prediction for arbitrarily given initial conditions\npred = mynet.predict(test_set[...,:conf_data[\"memory\"]+1], steps=1000, device=\"cpu\")\n4.1.1. Configuration. To simplify the specification of hyperparameters such as\nmemory steps, multi-steps in the loss function, network depth and width, training\nepochs, batch size, and more, users can consolidate them in a single configuration file.\nThis file can be seamlessly processed using the due.utils.read config function.\nUpon processing, these hyperparameters are stored in three Python dictionaries: one\nfor data processing configuration, one for neural network architecture configuration,\nand one for model training configuration. All the modules in Figure 1 are designed\nto work with such dictionaries, relieving users from specifying each hyperparameter\nindividually when calling multiple modules. This streamlined approach facilitates the\nlaunch of new tasks and allows for easy calibration of hyperparameters. 
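Concretely, with the example configuration file listed below, the three dictionaries returned by due.utils.read_config can be accessed as in the following sketch (the exact key names mirror that file and are otherwise our assumption):

import due

# One dictionary per top-level section of config.yaml (data / network / training).
conf_data, conf_net, conf_train = due.utils.read_config("config.yaml")

conf_data["memory"]        # number of memory steps M, e.g. 0
conf_net["depth"]          # number of hidden layers, e.g. 3
conf_train["batch_size"]   # training batch size, e.g. 5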
To automate\nhyperparameter optimization [20], users can implement automated grid search via an\nexternal script that iterates over the configuration file in a for-loop.\ndata:\nproblem_type: ode\nnbursts: 10\nmemory: 0\nmulti_steps: 10\nproblem_dim: 2\nnetwork:\ndepth: 3\nwidth: 10\nactivation: \"gelu\"\ntraining:\ndevice: \"cpu\"\nepochs: 500\nbatch_size: 5\noptimizer: \"adam\"\nlearning_rate: 0.001\nloss: \"mse\"\nsave_path: \"./model\"\n4.1.2. Data Preprocessing. DUE is equipped to handle both ODE and PDE\ndata with either fixed or varied time lags. To accommodate these diverse scenarios,\nwe have implemented four modules in the \u201cdatasets\u201d class:\n\u2022 ode dataset: For unknown ODEs and data with fixed time lag.\n\u2022 ode dataset osg: For unknown ODEs and data with varied time lags.\n\u2022 pde dataset: For unknown PDEs and data with fixed time lag.\n\u2022 pde dataset osg: For unknown PDEs and data with varied time lags.\nUsers only need to prepare the measurement data and employ one of these four mod-\nules. The data will be automatically rearranged, normalized, and partitioned into\ninput-output pairs, as indicated by (2.7), (2.8), (2.21), and (3.1).\n16\nJ. CHEN, K. WU, D. XIU\n4.1.3. Neural Networks. The networks module in DUE offers a wide array\nof DNN architectures for ODE and PDE learning. For modeling ODEs with fixed\nand varied time step sizes, we have implemented resnet, gresnet, and osg net\nbuilt upon FNNs, respectively. As for learning PDEs, we have implemented pit\u2014\nthe Position-induced Transformer [10]\u2014for handling data with fixed time lag, and\nosg fno\u2014an OSG-Net built upon the Fourier neural operator [36, 9]\u2014for cases with\nvaried time step sizes.\nAs described in Section 2.6, unknown PDEs can also be\nlearned in modal space. We provide the generalized fourier projection1d and\ngeneralized fourier projection2d functions for computing modal expansion co-\nefficients from snapshot data for one- and two-dimensional problems. All the DNN\narchitectures in DUE belong to the nn class, which can be further enriched by cus-\ntomized deep learning methods to suit specific needs.\n4.1.4. Model Training. The models module implements the training proce-\ndures for deep learning models. Four training routines are available:\n\u2022 ode: For learning unknown ODEs with fixed time step size.\n\u2022 ode osg: For modeling unknown ODEs with varied time step sizes.\n\u2022 pde: For learning unknown PDEs with fixed time step size\n\u2022 pde osg: For modeling unknown PDEs with varied time step sizes.\nWe have also integrated the GDSG method to embed the semigroup property into\nmodels with varied time step sizes. Users only need to specify the hyperparameters of\nthe semigroup loss as detailed in Section 3.2, and DUE handles the complex procedures\nof training with the GDSG method.\n4.2. Customization. We have adopted a modular architecture for DUE, en-\nsuring that its key modules, networks and models, can be separately customized.\nUsers have the flexibility to adapt the neural network architecture to suit their specific\nrequirements and implement new training methods to enhance models\u2019 prediction ac-\ncuracy and stability. In this section, we briefly show how to customize neural network\narchitectures and training methods.\n4.2.1. Neural Networks. As described in Section 4.1.3, DUE already provides\na range of neural network architectures that address various scenarios in ODE and\nPDE learning. 
Users interested in exploring more specialized or recent deep learning\nmethods can implement them by following the guidelines in Procedure 4.1.\nProcedure 4.1 Customization of the neural network NewNet.\nclass NewNet(nn):\n\"\"\"New network architectures belong to the nn class\"\"\"\ndef __init__(self):\n\"\"\" create the computational layers here\"\"\"\nself._layer1 = ...\nself._layer2 = ...\ndef forward(self, x):\n\"\"\"Return the output of NewNet\"\"\"\nx1 = self._layer1(x)\nx2 = self._layer2(x1)\nreturn x2 + x\n4.2.2. Model Training. In the current version of DUE, we have implemented\nthe multi-step loss function [14] for data-driven modeling with fixed time step size, and\nthe GDSG method [9] for cases with varied time step sizes. If users have developed\ncustom training methods, such as new loss functions, implementing them in DUE is\nstraightforward using the following procedure.\nDUE\n17\nProcedure 4.2 Customization of the training method for ODEs and PDEs.\nclass New_learning(ODE): # New_learning(PDE):\n\"\"\"\nNew ODE learning methods belong to the ODE class\nNew PDE learning methods belong to the PDE class\n\"\"\"\ndef NewLoss(self, true, pred):\n\"\"\"Creat the customized loss function here\"\"\"\nloss = ...\nreturn loss\ndef train(self):\n\"\"\"Construct the training loop for a number of epochs\"\"\"\nfor i in range(self.n_epochs):\nfor x, y in self.train_loader:\npred = self.mynet(x)\nloss = self.NewLoss(y, pred)\nself.optimizer.zero_grad()\nloss.backward()\nself.optimizer.step()\nBy leveraging DUE\u2019s modularity and flexibility, users can effectively address a\nwide range of data-driven modeling challenges in unknown ODE and PDE systems.\n5. Demonstration Examples. In this section, we present diverse examples\nto demonstrate the effectiveness of DUE for data-driven learning of unknown ODEs\nand PDEs. The examples include: (1) the damped pendulum system, (2) coupled\noscillators with real-world noisy data, (3) the chaotic Lorenz system, (4) the Robertson\nchemical reaction problem involving high stiffness and multi-scale dynamics, (5) the\none-dimensional viscous Burgers\u2019 equation, and (6) the vorticity evolution of the two-\ndimensional Navier\u2013Stokes equations, and (7) the two-dimensional flow past a circular\ncylinder. In these examples, the true governing equations are known, but they serve\nonly two purposes: generating synthetic data for training the DNNs and providing\nreference solutions for comparison during testing. During the data-driven learning\nprocess, the true equations are regarded as unknown.\nFor all examples, we use the GELU activation function [25] and train the models\nwith the Adam optimizer [30]. The learning rate is initialized at 0.001 and follows\na cosine annealing schedule [40]. Detailed training configurations and dataset infor-\nmation for all examples are presented in the dedicated subsections below. To ensure\nusers can quickly understand how DUE works and apply it to their tasks, we provide\ndetailed comments in the code for each numerical example. All this information is\navailable on the GitHub page of DUE.\n5.1. Damped Pendulum. The first example is the damped pendulum system\n[46, 18], described by the equations (2.2) with \u03b1 = 0.1 and \u03b2 = 9.80665. Synthetic\ndata is generated using the fourth-order Runge\u2013Kutta method to advance the true\nsystem forward in time. 
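The data generation just described can be sketched as follows. This is our own minimal script, not the authors' code, and it assumes the standard damped-pendulum form u1' = u2, u2' = -α u2 - β sin(u1) for equation (2.2).

import numpy as np

def pendulum_rhs(u, alpha=0.1, beta=9.80665):
    # Assumed form of (2.2): u1' = u2, u2' = -alpha*u2 - beta*sin(u1).
    return np.array([u[1], -alpha * u[1] - beta * np.sin(u[0])])

def rk4_trajectory(u0, dt=0.02, steps=1000, f=pendulum_rhs):
    """Advance the system with the classical fourth-order Runge-Kutta scheme."""
    traj = [np.asarray(u0, dtype=float)]
    for _ in range(steps):
        u = traj[-1]
        k1 = f(u)
        k2 = f(u + 0.5 * dt * k1)
        k3 = f(u + 0.5 * dt * k2)
        k4 = f(u + dt * k3)
        traj.append(u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4))
    return np.stack(traj)   # shape (steps + 1, 2)

rng = np.random.default_rng(0)
u0 = rng.uniform([-np.pi / 2, -np.pi], [np.pi / 2, np.pi])   # random initial state
trajectory = rk4_trajectory(u0)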
The dataset comprises N = 1,000 trajectories for (u1, u2), each with a length of L = 1,000 and a time lag of ∆ = 0.02. The initial states of these trajectories are randomly sampled from the uniform distribution on Ω = [−π/2, π/2] × [−π, π].
5.1.1. Fully Observed Case. In this case, we assume both state variables u1 and u2 are observable. We set K = 10 for the multi-step loss and randomly sample 10 bursts from each trajectory to construct the training dataset. The ResNet has 3 hidden layers, each with 10 neurons, and is trained for 500 epochs with a batch size of 5. Following training, the model's performance is evaluated on a new and unseen test set consisting of 100 trajectories with initial states uniformly sampled on Ω. Figure 6 displays an example trajectory alongside the reference solution, as well as the average ℓ2 error over time. The trained model demonstrates accurate predictions up to t = 20, equivalent to 1,000 forward steps.
Fig. 6: Fully observed damped pendulum system. Left: Comparison between the predicted and reference solutions. Right: Average ℓ2 error computed on the test set.
5.1.2. Partially Observed Case. In this scenario, we focus on modeling a reduced system solely related to u1. Trajectories of u2 are excluded from the training data, and we address this partially observed system by adopting M = 10 memory steps in the model. Thanks to the optimized data processing module of DUE, users can easily try different values of M by modifying the memory parameter in the configuration file; see Section 4.1.1. Other configurations remain the same as in the fully observed case. Figure 7 illustrates an example trajectory and the average ℓ2 error on the test set.
Fig. 7: Partially observed damped pendulum system. Left: Comparison between the predicted and reference solutions. Right: Average ℓ2 error computed on the test set.
5.1.3. Robustness to Noisy Data. In the third scenario, we introduce artificial noise to the synthetic data used in Section 5.1.2 to assess the model's robustness to measurement errors. Specifically, the training data are modified as
{ u_{in}^{(j)} (1 + ε_{in}^{(j)}), u_{out}^{(j)} (1 + ε_{out}^{(j)}) }_{j=1}^{J},
where the relative noise terms ε_{in}^{(j)} and ε_{out}^{(j)} are drawn from a uniform distribution over [−η, η], with η representing the noise level. We perform two experiments with η set to 0.05 and 0.1, corresponding to noise levels of 5% and 10%, respectively. All other settings are kept the same as in Section 5.1.2. Figure 8 shows the predicted trajectories generated by two different models trained on noisy data. While some deviation from the exact dynamics is observed, the oscillating and damping patterns of the solution remain well-captured. The model's performance can be further enhanced by increasing the amount of training data.
Fig.
8: Partially observed damped pendulum system with noisy data. Left: Noise\nlevel \u03b7 = 5%. Right: Noise level \u03b7 = 10%.\n5.2. Two Coupled Oscillators with Real-World Noisy Data. Next, we\nuse DUE to model the unknown ODEs of two coupled oscillators using real-world\ndata [57, 69]. This dataset consists of a single trajectory with 486 recorded states,\nof which the first 360 states are used for training and the remaining for testing. The\nstate variables of interest include the positions and momenta of the two oscillators,\nresulting in a state space in R4. Due to measurement noise, the experimental data\nmay not perfectly represent the full system. In this example, we examine the impact\nof memory terms in modeling partially observed systems by training two models with\nM = 0 and M = 10, respectively.\nEach model employs a ResNet with 3 hidden\nlayers, each containing 10 neurons, and is trained for 500 epochs with a batch size\nof 1. The predicted phase plots are displayed in Figure 9. Despite the data scarcity\nand measurement noise, both models successfully capture the underlying dynamics.\nThe advantage of using memory terms is evident from the improved accuracy with\nM = 10 compared to M = 0.\n5.3. Chaotic Lorenz system. Next, we demonstrate DUE\u2019s capability to\nmodel the chaotic Lorenz system [17]. The true equations are given by:\n(5.1)\n\uf8f1\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f2\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f3\ndu1\ndt = \u03c3(u2 \u2212u1),\ndu2\ndt = u1(\u03c1 \u2212u3) \u2212u2,\ndu3\ndt = u1u2 \u2212\u03b2u3,\nwith \u03c3 = 10, \u03c1 = 28, and \u03b2 = 8/3. The synthetic dataset consists of N = 1, 000\ntrajectories for (u1, u2, u3), each with a length of L = 10, 000 and a time lag of\n\u2206= 0.01.\nInitial states are randomly sampled from the uniform distribution on\n\u2126= [\u2212\u03c0/2, \u03c0/2]3. We set K = 10 for the multi-step loss and randomly sample 5\nbursts from each trajectory to construct the training dataset. In this example, we\ncompare the performance of ResNet and gResNet. The baseline ResNet is built upon\nan FNN with 3 hidden layers, each with 10 neurons. The gResNet consists of an FNN\nwith the same architecture and a pre-trained affine model, implemented as affine in\n20\nJ. CHEN, K. WU, D. XIU\n\u22121\n\u22120.5\n0\n\u22121\n0\n1\nPosition\nMomentum\nInitial state\nTruth\nPrediction M = 0\n0\n0.5\n1\n1.5\n\u22122\n\u22121\n0\n1\n2\nPosition\nMomentum\n0\n1\n2\n3\n4\n0\n0.05\n0.1\n0.15\nt\nError M = 0\n\u22121\n\u22120.5\n0\n\u22121\n0\n1\nPosition\nMomentum\nInitial states\nTruth\nPrediction M = 10\n0\n0.5\n1\n1.5\n\u22122\n\u22121\n0\n1\n2\nPosition\nMomentum\n0\n1\n2\n3\n4\n0\n0.05\n0.1\n0.15\nt\nError M = 10\nFig. 9: Two coupled oscillators from real-world data. Models with different memory\nsteps (M) predict the system\u2019s evolution. Top: M = 0. Bottom: M = 10. Left: Phase\nplots of Mass 1. Middle: Phase plots of Mass 2. Right: the \u2113\u221eerror (suggested in\n[69]) computed on the last 126 states of the experimental data.\nDUE. Both models are trained for 500 epochs with a batch size of 5. After training,\nthe models are evaluated on a new and unseen test set consisting of 100 trajectories\nwith initial states uniformly sampled on \u2126. 
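The burst sampling used above (10 bursts per trajectory for the pendulum, 5 for the Lorenz system) can be sketched as follows; this is a schematic stand-in for DUE's datasets module, not its actual code.

import numpy as np

def sample_bursts(trajectories, n_bursts, M=0, K=10, seed=0):
    """Slice random bursts w_0, ..., w_{M+1+K} from stored trajectories, as in (3.1).

    trajectories: array of shape (N, L, dim); returns (N * n_bursts, M + K + 2, dim).
    """
    rng = np.random.default_rng(seed)
    N, L, _ = trajectories.shape
    burst_len = M + K + 2
    bursts = []
    for i in range(N):
        for start in rng.integers(0, L - burst_len + 1, size=n_bursts):
            bursts.append(trajectories[i, start:start + burst_len])
    return np.stack(bursts)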
Figure 10 displays an example of the\npredicted and reference trajectories, while Figure 11 shows the average \u21132 error up\nto t = 10 on the test set.\nThese results indicate that both ResNet and gResNet\ncan capture the system\u2019s chaotic evolution, with gResNet achieving higher prediction\naccuracy.\nFig. 10: Lorenz equations. From left to right: reference solution, prediction by gRes-\nNet, prediction by ResNet.\n1\n3\n5\n7\n9\nt\n0\n1\n2\n3\nError u1-ResNet\nError u1-gResNet\n1\n3\n5\n7\n9\nt\n0\n1\n2\n3\n4\n5\nError u2-ResNet\nError u2-gResNet\n1\n3\n5\n7\n9\nt\n0\n1\n2\n3\n4\n5\nError u3-ResNet\nError u3-gResNet\nFig. 11: Lorenz equations. From left to right: \u21132 error of u1, u2, and u3.\nDUE\n21\n5.4. Robertson Chemical Reaction Equations with Multi-Scale Dy-\nnamics. This example explores the Robertson chemical reaction system, which de-\nscribes the kinetics of three chemical species: A, B, and C. Proposed by Robertson in\n1966 [51], the system is governed by the following nonlinear ODEs:\n(5.2)\n\uf8f1\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f2\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f4\n\uf8f3\ndu1\ndt = \u2212k1u1 + k2u2u3,\ndu2\ndt = k1u1 \u2212k2u2u3 \u2212k3u2\n2,\ndu3\ndt = k3u2\n2,\nwhere (u1, u2, u3) represent the concentrations of (A, B, C), respectively. The reaction\nrates are k1 = 0.04, k2 = 104, and k3 = 3 \u00d7 107, making the system highly stiff. To\ncapture dynamics across both small and large time scales, we use DUE\u2019s ode osg\nmodule to approximate flow maps with varied time step sizes [9].\nThe synthetic\ndataset comprises 50, 000 input-output pairs, with time lags randomly sampled from\n10U[\u22124.5,2.5], where U[\u22124.5, 2.5] is the uniform distribution on [\u22124.5, 2.5]. Initial states\nare randomly sampled from the domain [0, 1] \u00d7 [0, 5 \u00d7 10\u22125] \u00d7 [0, 1], and the system\nis solved using the variable-step, variable-order ode15s solver in Matlab.\nTo address the challenge of multi-scale temporal dynamics, we employ a dual-\nOSG-Net with 3 hidden layers, each containing 60 neurons.\nThe neural network\nmodel is trained using the GDSG method to embed the semigroup property, with the\nhyperparameters \u03bb and Q both set to 1. Additionally, we train a second model using\nthe vanilla OSG-Net [9] for benchmarking. Both models are trained for 10,000 epochs\nwith a batch size of 500. After training, predictions are initiated from (u1, u2, u3) =\n(1, 0, 0) to forecast the multi-scale kinetics of the three chemical species until t =\n100, 000, a challenging long-term prediction task.\nThe time step size starts from\n\u22061 = 5 \u00d7 10\u22125 and doubles after each step until it reaches \u22062 = 300. As shown in\nFigure 12, the dual-OSG-Net model accurately predicts the dynamics across all time\nscales between \u22061 and \u22062, demonstrating superior long-term accuracy compared to\nthe vanilla OSG-Net model.\n10\u22124\n10\u22122\n100\n102\n104\n0\n0.25\n0.5\n0.75\n1\nt\nReference u1\nPrediction u1\nReference u2\nPrediction u2\nReference u3\nPrediction u3\n10\u22124\n10\u22122\n100\n102\n104\n0\n0.25\n0.5\n0.75\n1\nt\nReference u1\nPrediction u1\nReference u2\nPrediction u2\nReference u3\nPrediction u3\nFig. 12: Robertson chemical reaction equations. Left: OSG-Net prediction vs. refer-\nence solution. Right: Dual-OSG-Net prediction vs. reference solution. Initial state:\n(1, 0, 0). The value of u2 is multiplied by 104 for clearer visualization.\n5.5. One-dimensional Viscous Burgers\u2019 Equation. 
This example demon-\nstrates DUE\u2019s capabilities in learning PDEs by focusing on the viscous Burgers\u2019 equa-\n22\nJ. CHEN, K. WU, D. XIU\ntion with Dirichlet boundary conditions [65, 14]:\n(5.3)\n\uf8f1\n\uf8f4\n\uf8f2\n\uf8f4\n\uf8f3\n\u2202tu + \u2202x\n\u0012u2\n2\n\u0013\n= 1\n10\u2202xxu,\n(x, t) \u2208(0, 2\u03c0) \u00d7 R+,\nu(0, t) = u(2\u03c0, t) = 0,\nt \u22650.\nThe training data are generated by sampling the power series solutions of the true\nequation on a uniform grid with 128 nodal points. Initial conditions are drawn from\na Fourier series with random coefficients: u(x, t = 0) = P10\nm=1 am sin(mx), where\nam \u223cU[\u22121/m, 1/m].\nWe generate N = 1, 000 trajectories of the solution with\ndifferent initial conditions, and record L = 40 snapshots on each trajectory with a\ntime lag \u2206= 0.05.\nIn this example,\nwe introduce how to learn PDEs in modal space us-\ning DUE\u2019s generalized fourier projection1d class.\nFirst,\ninitialize this\nclass\nby\nspecifying\na\ntruncation\nwave\nnumber\nfor\nthe\nmodal\nexpansion.\nThe\ntraining\ndata\nare\nprojected\ninto\nthe\nreduced\nmodal\nspace\nvia\nthe\ngeneralized fourier projection1d.forward function. This data transformation\nis followed by a standard ODE modeling procedure, resulting in a model that cap-\ntures the dynamics of the modal coefficients. During prediction, the future states of\nthe modal coefficients are used to recover solutions in the physical space using the\ngeneralized fourier projection1d.backward function. For this example, the\ntruncation wave number is set to 10. We adopt a ResNet with 3 hidden layers, each\ncontaining 60 neurons. The model is trained for 500 epochs with a batch size of 10.\nAfter training, we evaluate the model\u2019s performance on a new and unseen test set.\nFigure 13 displays predictions for two example trajectories up to t = 10, equivalent\nto 200 forward steps.\nFig. 13: One-dimensional viscous Burgers\u2019 equation. Predicted and reference solutions\nfor two example trajectories originating from different initial conditions. Black solid\nlines indicate reference solution contours, while red dotted lines and colored plots\nshow predictions.\n5.6. Two-dimensional Incompressible Navier\u2013Stokes Equations. This\nexample illustrates learning the incompressible Navier\u2013Stokes equations [36, 9]:\n(5.4)\n\uf8f1\n\uf8f4\n\uf8f2\n\uf8f4\n\uf8f3\n\u2202t\u03c9(x, t) + v(x, t) \u00b7 \u2207\u03c9(x, t) = \u03bd\u2206\u03c9(x, t) + f(x),\nx \u2208(0, 1)2, t > 0,\n\u2207\u00b7 v(x, t) = 0,\nx \u2208(0, 1)2, t > 0,\n\u03c9(x, t = 0) = \u03c90(x),\nx \u2208(0, 1)2,\nDUE\n23\nwhere v(x, t) is the velocity, \u03c9 = \u2207\u00d7 v is the vorticity, \u03bd = 10\u22123 denotes viscosity,\nand f(x) = 0.1(sin(2\u03c0(x1 + x2)) + cos(2\u03c0(x1 + x2))) represents a periodic external\nforce. Our goal is to learn the evolution operators of \u03c9 from data with varied time\nlags. We use the data from [9], which comprises N = 100 trajectories of solution\nsnapshots with a length of 50. Solutions are sampled on a 64 \u00d7 64 uniform grid, with\ntime lags randomly sampled from the uniform distribution on [0.5, 1.5]. For neural\nnetwork modeling, we construct an OSG-Net with the Fourier neural operator as the\nbasic block, implemented as osg fno in DUE. The two hyperparameters \u03bb and Q are\nboth set to 1 for the GDSG loss function. We train the model for 500 epochs with a\nbatch size of 20. 
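Two ingredients of the data generation in the last two examples can be sketched jointly: the random Fourier initial conditions of the Burgers' problem and the random time lags of the Navier-Stokes dataset. This is our own sketch of the data-generation side, not the authors' scripts.

import numpy as np

rng = np.random.default_rng(0)

def burgers_initial_condition(x, n_modes=10):
    """u(x, 0) = sum_{m=1}^{10} a_m sin(m x), with a_m ~ U[-1/m, 1/m]."""
    u0 = np.zeros_like(x)
    for m in range(1, n_modes + 1):
        u0 += rng.uniform(-1.0 / m, 1.0 / m) * np.sin(m * x)
    return u0

x = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)   # 128 nodal points on (0, 2*pi)
u0 = burgers_initial_condition(x)

dt = rng.uniform(0.5, 1.5, size=50)   # varied time lags, as in the Navier-Stokes data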
Subsequently, the trained model is evaluated on 100 new and unseen\ntrajectories with a length of 100 and a time step size \u2206= 1. As shown in Figure 14,\nthe model trained with the GDSG method produces accurate predictions at t = 60\nand 100. Figure 15 displays the training loss and testing errors. We observe that the\npurely data-driven model, which is trained using only the plain fitting loss without\nembedding the semigroup property, is unstable.\n\u22122.1\n0\n2.1\n\u22122.1\n0\n2.1\n0\n0.014\n0.028\n\u22122.1\n0\n2.1\n\u22122.1\n0\n2.1\n0\n0.014\n0.028\nFig. 14: Two-dimensional Navier\u2013Stokes equations. Vorticity at t = 60 (top row) and\n100 (bottom row) predicted by the model trained with the GDSG method.\n0\n100\n200\n300\n400\n500\n10\u22123\n10\u22122\n10\u22121\nEpochs\nOSG-FNO\nOSG-FNO+GDSG\n0\n20\n40\n60\n80\n100\n10\u22123\n10\u22122\n10\u22121\nt\nOSG-FNO\nOSG-FNO+GDSG\nFig. 15: Two-dimensional Navier\u2013Stokes equations. Left: training loss recorded after\nevery epoch. Right: average relative \u21132 error computed on the test set with 100 new\nand unseen trajectories with a length of 100 and time lag \u2206= 1.\n5.7. Two-dimensional Flow Past a Circular Cylinder. In this classic fluid\nmechanics example, we use DUE to learn the dynamics of fluid velocity v and pressure\n24\nJ. CHEN, K. WU, D. XIU\np around a circular cylinder, generating periodic oscillations at a low Reynolds num-\nber. Synthetic data is generated by numerically solving the incompressible Navier\u2013\nStokes equations:\n(5.5)\n(\n\u2202tv + v \u00b7 \u2207v = \u22121\n\u03c1\u2207p + \u03bd\u2206v,\nx \u2208\u2126, t > 0,\n\u2207\u00b7 v(x, t) = 0,\nx \u2208\u2126, t > 0,\nwith fluid density \u03c1 = 1 and viscosity \u03bd = 0.001.\nThe geometric configuration,\nboundary conditions, and computing mesh are depicted in Figure 16. The horizontal\nvelocity component at the inlet, denoted as v0, is sampled from the following Fourier\nseries with random coefficients:\n(5.6)\nv0(y, t) = 1 + 0.6\n5\nX\nm=1\nam sin\n\u00122m\u03c0\nH y\n\u0013\n,\nwhere am \u223cU[\u22121/m, 1/m], and H is the height of the rectangular domain.\n0\n1\n2\n-0.2\n-0.1\n0\n0.1\n0.2\n\ud835\udc65\n\ud835\udc66\n\ud835\udc3f= 0.8\n\ud835\udc3b= 0.4\nFig. 16: Two-dimensional flow past a circular cylinder at the origin in a rectangular\ndomain. The inlet is 0.2 units upstream of the cylinder\u2019s centroid. Domain size: 0.8\nwidth, 0.4 height. Inflow has zero vertical velocity. Lateral boundaries: v = (1, 0).\nOutflow: zero pressure. No-slip condition on the cylinder\u2019s surface.\nThe dataset consists of 1,000 trajectories with 11 snapshots each, having a time\nlag of 0.05.\nWe set K = 0 for the multi-step loss and rearrange each trajectory\ninto 10 input-output pairs to construct the training data set. For neural network\nmodeling with data sampled on an unstructured mesh, we employ a ResNet with the\nPosition-induced Transformer (PiT) as the basic block, implemented as pit in DUE.\nThe model is trained for 500 epochs with a batch size of 50. Subsequently, the trained\nmodel is evaluated on 100 new and unseen trajectories with 10 forward time steps\n(up to t = 0.5). As shown in Figure 17, the PiT model in DUE successfully captures\nthe dynamics of both the velocity and pressure. The relatively larger error in the\ndownstream region is due to a sparser distribution of sampling grid points, resulting\nin a lower resolution of the downstream flow. 
Consequently, this region contributes\nless to the loss function, leading the model to learn less about the flow patterns there.\nThe resolution in this region can be improved by locally increasing the number of\nsampling points.\n6. Conclusions and Prospects. Artificial intelligence is revolutionizing scien-\ntific research, offering profound insights and accelerating discoveries across various\nDUE\n25\n\u22120.3\n0.7\n1.7\n0\n0.012\n0.024\n\u22120.9\n0\n0.9\n0\n0.011\n0.022\n\u22120.9\n0.1\n1.1\n0\n0.006\n0.012\nFig. 17: Two-dimensional flow past a circular cylinder. From top to bottom: horizon-\ntal velocity; vertical velocity; pressure. Left: the referential (v, p) at t = 0.5; Middle:\nthe predicted (v, p) given by the PiT model; Right: the absolute errors between the\nreferences and the predictions.\nfields through advanced data analysis and predictive modeling. This paper has intro-\nduced a comprehensive framework for learning unknown equations using deep learn-\ning, featuring advanced neural network architectures such as ResNet, gResNet, OSG-\nNet, and Transformers. This adaptable framework is capable of learning unknown\nODEs, PDEs, DAEs, IDEs, and SDEs, as well as reduced or partially observed sys-\ntems with missing variables. Compared to DMD, which offers faster training times\nand performs well on linear systems, the deep learning framework requires more com-\nputational resources for training but excels at capturing nonlinear interactions and\nmodeling complex systems, providing greater flexibility and accuracy for tackling\nchallenging problems.\nWe have presented the novel dual-OSG-Net architecture to address the challenges\nposed by multi-scale stiff differential equations, enabling accurate learning of dynam-\nics across broad time scales. Additionally, we have introduced several techniques to\nenhance prediction accuracy and stability, including a multi-step loss function that\nconsiders model predictions several steps ahead during training, and a semigroup-\ninformed loss function that embeds the semigroup property into the models. These\ntechniques could serve as examples for students and newcomers, illustrating the fron-\ntier of embedding prior knowledge into deep learning for data-driven discovery and\ndeveloping structure-preserving AI for modeling unknown equations.\nTo support this framework, we developed Deep Unknown Equations (DUE), a\nuser-friendly, comprehensive software tool equipped with extensive functionalities for\nmodeling unknown equations through deep learning. DUE facilitates rapid scripting,\nallowing users to initiate new modeling tasks with just a few lines of code. It serves\nas both an educational toolbox for students and newcomers and a versatile Python\nlibrary for researchers dealing with differential equations. DUE is applicable not only\nfor learning unknown equations from data but also for surrogate modeling of known,\nyet complex, equations that are costly to solve using traditional numerical methods.\nThe extensive numerical examples presented in this paper demonstrate DUE\u2019s power\nin modeling unknown equations, and the source codes for these examples are available\nin our GitHub repository, providing templates that users can easily adapt for their\nresearch.\n26\nJ. CHEN, K. WU, D. XIU\nLooking ahead, DUE is envisioned as a long-term project with ongoing mainte-\nnance and regular updates to incorporate advanced techniques. 
We are committed\nto continuously optimizing DUE\u2019s performance and adding new functionalities as re-\nsearch in this field progresses. We also encourage contributions from users to expand\nDUE\u2019s capabilities and broaden its applicability across a wider range of scenarios.\nOne promising direction is to implement robust denoising procedures during data\npreprocessing, enabling DUE to achieve reliable results even with high levels of noise\nin the data. Additionally, reducing the amount of data required for effective deep\nlearning performance is valuable. While the current semigroup-informed learning ap-\nproach helps in this regard, incorporating additional physical constraints or leveraging\nprior models and knowledge could further guide the model toward accurate predic-\ntions with less data. Another effective strategy is active learning, which focuses on\nselecting the most informative data points for model training. By concentrating on\ncritical data, active learning can enhance model performance while reducing data re-\nquirements.\nLastly, transfer learning offers a powerful approach to minimize data\nneeds further by utilizing pre-trained models on related tasks. For instance, neural\noperators, with their discretization-invariant properties, can be pre-trained on coarser\ndata and adapted to finer resolutions with minimal or no retraining. Exploring ad-\nditional transfer learning techniques, such as those tailored to multi-frequency time\nseries data, is also a promising direction.\nAcknowledgment. The authors would like to express their sincere gratitude to\nthe anonymous reviewers for their insightful comments and constructive suggestions,\nwhich have enhanced the quality of this paper.\nREFERENCES\n[1] A. A. Ahmadi and B. E. Khadir, Learning dynamical systems with side information, SIAM\nRev., 65 (2023), pp. 183\u2013223.\n[2] K. Azizzadenesheli, N. Kovachki, Z. Li, M. Liu-Schiaffini, J. Kossaifi, and A. Anandku-\nmar, Neural operators for accelerating scientific simulations and design, Nat. Rev. Phy.,\n(2024), pp. 1\u20139.\n[3] J. Bongard and H. Lipson, Automated reverse engineering of nonlinear dynamical systems,\nProc. Natl. Acad. Sci., 104 (2007), pp. 9943\u20139948.\n[4] S. L. Brunton, B. W. Brunton, J. L. Proctor, E. Kaiser, and J. N. Kutz, Chaos as an\nintermittently forced linear system, Nat. Commun., 8 (2017), p. 19.\n[5] S. L. Brunton, M. Budi\u02c7si\u00b4c, E. Kaiser, and J. N. Kutz, Modern Koopman theory for dy-\nnamical systems, SIAM Rev., 64 (2022), pp. 229\u2013340.\n[6] S. L. Brunton, J. L. Proctor, and J. N. Kutz, Discovering governing equations from data\nby sparse identification of nonlinear dynamical systems, Proc. Natl. Acad. Sci., 113 (2016),\npp. 3932\u20133937.\n[7] E. J. Candes, J. K. Romberg, and T. Tao, Stable signal recovery from incomplete and\ninaccurate measurements, Comm. Pure Appl. Math., 59 (2006), pp. 1207\u20131223.\n[8] S. Cao, Choose a Transformer: Fourier or Galerkin, in NeurIPS, vol. 34, 2021, pp. 24924\u2013\n24940.\n[9] J. Chen and K. Wu, Deep-OSG: Deep learning of operators in semigroup, J. Comput. Phys.,\n493 (2023), p. 112498.\n[10] J. Chen and K. Wu, Positional knowledge is all you need: Position-induced Transformer\n(PiT) for operator learning, in ICML, PMLR, 2024.\n[11] R. T. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud, Neural ordinary differ-\nential equations, in NeurIPS, vol. 31, 2018.\n[12] Y. Chen, Y. Luo, Q. Liu, H. Xu, and D. 
Zhang, Symbolic genetic algorithm for discovering\nopen-form partial differential equations (SGA-PDE), Phys. Rev. Res., 4 (2022), p. 023174.\n[13] Y. Chen and D. Xiu, Learning stochastic dynamical system via flow map operator, J. Comput.\nPhys., (2024), p. 112984.\n[14] Z. Chen, V. Churchill, K. Wu, and D. Xiu, Deep neural network modeling of unknown\npartial differential equations in nodal space, J. Comput. Phys., 449 (2022), p. 110782.\nDUE\n27\n[15] Z. Chen and D. Xiu, On generalized residual network for deep learning of unknown dynamical\nsystems, J. Comput. Phys., 438 (2021), p. 110362.\n[16] V. Churchill, Y. Chen, Z. Xu, and D. Xiu, DNN modeling of partial differential equations\nwith incomplete data, J. Comput. Phys., 493 (2023), p. 112502.\n[17] V. Churchill and D. Xiu, Deep learning of chaotic systems from partially-observed data, J.\nMach. Learn. Model. Comput., 3 (2022).\n[18] V. Churchill and D. Xiu, Flow map learning for unknown dynamical systems: Overview,\nimplementation, and benchmarks, J. Mach. Learn. Model. Comput., 4 (2023).\n[19] Q. Du, Y. Gu, H. Yang, and C. Zhou, The discovery of dynamics via linear multistep methods\nand deep learning: error estimation, SIAM J. Numer. Anal., 60 (2022), pp. 2014\u20132045.\n[20] M. Feurer and F. Hutter, Hyperparameter optimization, Automated machine learning:\nMethods, systems, challenges, (2019), pp. 3\u201333.\n[21] X. Fu, L.-B. Chang, and D. Xiu, Learning reduced systems via deep neural networks with\nmemory, J. Mach. Learn. Model. Comput., 1 (2020).\n[22] X. Fu, W. Mao, L.-B. Chang, and D. Xiu, Modeling unknown dynamical systems with hidden\nparameters, J. Mach. Learn. Model. Comput., 3 (2022).\n[23] Z. Hao, Z. Wang, H. Su, C. Ying, Y. Dong, S. Liu, Z. Cheng, J. Song, and J. Zhu,\nGnot: A general neural operator Transformer for operator learning, in ICML, PMLR,\n2023, pp. 12556\u201312569.\n[24] K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, in CVPR,\n2016, pp. 770\u2013778.\n[25] D. Hendrycks and K. Gimpel, Gaussian error linear units (GELUs), arXiv:1606.08415,\n(2016).\n[26] C. F. Higham and D. J. Higham, Deep learning: An introduction for applied mathematicians,\nSIAM Rev., 61 (2019), pp. 860\u2013891.\n[27] S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Comput., 9 (1997),\npp. 1735\u20131780.\n[28] R. T. Keller and Q. Du, Discovery of dynamics using linear multistep methods, SIAM J.\nNumer. Anal., 59 (2021), pp. 429\u2013455.\n[29] S. Kim, W. Ji, S. Deng, Y. Ma, and C. Rackauckas, Stiff neural ordinary differential\nequations, Chaos, 31 (2021).\n[30] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, in ICLR, 2015.\n[31] N. B. Kovachki, Z. Li, B. Liu, K. Azizzadenesheli, K. Bhattacharya, A. M. Stuart, and\nA. Anandkumar, Neural operator: Learning maps between function spaces with applica-\ntions to PDEs., J. Mach. Learn. Res., 24 (2023), pp. 1\u201397.\n[32] Y. LeCun, Y. Bengio, et al., Convolutional networks for images, speech, and time series,\nThe Handbook of brain theory and neural networks, 3361 (1995), p. 1995.\n[33] S. Lee and D. You, Data-driven prediction of unsteady flow over a circular cylinder using\ndeep learning, J. Fluid Mech., 879 (2019), pp. 217\u2013254.\n[34] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and\nA. Anandkumar, Neural operator: Graph kernel network for partial differential equa-\ntions, arXiv:2003.03485, (2020).\n[35] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, A. Stuart, K. 
Bhattacharya, and\nA. Anandkumar, Multipole graph neural operator for parametric partial differential equa-\ntions, in NeurIPS, vol. 33, 2020, pp. 6755\u20136766.\n[36] Z. Li, N. B. Kovachki, K. Azizzadenesheli, B. liu, K. Bhattacharya, A. Stuart, and\nA. Anandkumar, Fourier neural operator for parametric partial differential equations, in\nICLR, 2021.\n[37] Z. Li, K. Meidani, and A. B. Farimani, Transformer for partial differential equations operator\nlearning, Trans. Mach. Learn. Res., (2022).\n[38] Z. Long, Y. Lu, and B. Dong, PDE-Net 2.0: Learning PDEs from data with a numeric-\nsymbolic hybrid deep network, J. Comput. Phys., 399 (2019), p. 108925.\n[39] Z. Long, Y. Lu, X. Ma, and B. Dong, PDE-Net: Learning PDEs from data, in ICML, PMLR,\n2018, pp. 3208\u20133216.\n[40] I. Loshchilov and F. Hutter, SGDR: Stochastic gradient descent with warm restarts, in\nICLR, 2016.\n[41] L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis, Learning nonlinear operators via\nDeepONet based on the universal approximation theorem of operators, Nat. Mach. Intell.,\n3 (2021), pp. 218\u2013229.\n[42] L. Lu, X. Meng, Z. Mao, and G. E. Karniadakis, DeepXDE: A deep learning library for\nsolving differential equations, SIAM Rev., 63 (2021), pp. 208\u2013228.\n[43] N. M. Mangan, J. N. Kutz, S. L. Brunton, and J. L. Proctor, Model selection for dynam-\n28\nJ. CHEN, K. WU, D. XIU\nical systems via sparse regression and information criteria, Proc. R. Soc. A: Math. Phys.\nEng. Sci., 473 (2017), p. 20170009.\n[44] H. Miao, X. Xia, A. S. Perelson, and H. Wu, On identifiability of nonlinear ODE models\nand applications in viral dynamics, SIAM Rev., 53 (2011), pp. 3\u201339.\n[45] T. Qin, Z. Chen, J. D. Jakeman, and D. Xiu, Data-driven learning of nonautonomous sys-\ntems, SIAM J. Sci. Comput., 43 (2021), pp. A1607\u2013A1624.\n[46] T. Qin, K. Wu, and D. Xiu, Data driven governing equations approximation using deep neural\nnetworks, J. Comput. Phys., 395 (2019), pp. 620\u2013635.\n[47] M. Raissi, Deep hidden physics models: Deep learning of nonlinear partial differential equa-\ntions, J. Mach. Learn. Res., 19 (2018), pp. 932\u2013955.\n[48] M. Raissi, P. Perdikaris, and G. E. Karniadakis, Machine learning of linear differential\nequations using Gaussian processes, J. Comput. Phys., 348 (2017), pp. 683\u2013693.\n[49] M. Raissi, P. Perdikaris, and G. E. Karniadakis, Multistep neural networks for data-driven\ndiscovery of nonlinear dynamical systems, arXiv:1801.01236, (2018).\n[50] M. Raissi, P. Perdikaris, and G. E. Karniadakis, Physics-informed neural networks: A deep\nlearning framework for solving forward and inverse problems involving nonlinear partial\ndifferential equations, J. Comput. Phys., 378 (2019), pp. 686\u2013707.\n[51] H. Robertson, The solution of a set of reaction rate equations, Numer. Anal.: An Introd.,\n178182 (1966).\n[52] O. Ronneberger, P. Fischer, and T. Brox, U-net: Convolutional networks for biomedical\nimage segmentation, arXiv:1505.04597, (2015).\n[53] S. Ruder, An overview of gradient descent optimization algorithms, arXiv:1609.04747, (2016).\n[54] S. H. Rudy, S. L. Brunton, J. L. Proctor, and J. N. Kutz, Data-driven discovery of partial\ndifferential equations, Sci. Adv., 3 (2017), p. e1602614.\n[55] H. Schaeffer, Learning partial differential equations via data discovery and sparse optimiza-\ntion, Proc. R. Soc. A: Math. Phys. Eng. Sci., 473 (2017), p. 20160446.\n[56] P. J. Schmid, Dynamic mode decomposition of numerical and experimental data, J. Fluid\nMech., 656 (2010), pp. 
5\u201328.\n[57] M. Schmidt and H. Lipson, Distilling free-form natural laws from experimental data, Sci.,\n324 (2009), pp. 81\u201385.\n[58] Y. Sun, L. Zhang, and H. Schaeffer, NeuPDE: Neural network based ordinary and partial\ndifferential equations for modeling time-dependent data, in Math. and Sci. Mach. Learn.,\nPMLR, 2020, pp. 352\u2013372.\n[59] G. Tran and R. Ward, Exact recovery of chaotic systems from highly corrupted data, Multi-\nscale Model. Simul., 15 (2017), pp. 1108\u20131129.\n[60] J. H. Tu, C. W. Rowley, D. M. Luchtenburg, S. L. Brunton, and J. N. Kutz, On dynamic\nmode decomposition: Theory and applications, J. Comput. Dyn., 1 (2014), pp. 391\u2013421.\n[61] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser,\nand I. Polosukhin, Attention is all you need, in NeurIPS, vol. 30, 2017.\n[62] W.-X. Wang, R. Yang, Y.-C. Lai, V. Kovanis, and C. Grebogi, Predicting catastrophes\nin nonlinear dynamical systems by compressive sensing, Phys. Rev. Lett., 106 (2011),\np. 154101.\n[63] K. Wu, T. Qin, and D. Xiu, Structure-preserving method for reconstructing unknown Hamil-\ntonian systems from trajectory data, SIAM J. Sci. Comput., 42 (2020), pp. A3704\u2013A3729.\n[64] K. Wu and D. Xiu, Numerical aspects for approximating governing equations using data, J.\nComput. Phys., 384 (2019), pp. 200\u2013221.\n[65] K. Wu and D. Xiu, Data-driven deep learning of partial differential equations in modal space,\nJ. Comput. Phys., 408 (2020), p. 109307.\n[66] H. Xu, H. Chang, and D. Zhang, DLGA-PDE: Discovery of pdes with incomplete candidate\nlibrary via combination of deep learning and genetic algorithm, J. Comput. Phys., 418\n(2020), p. 109584.\n[67] H. Xu and D. Zhang, Robust discovery of partial differential equations in complex situations,\nPhys. Rev. Res., 3 (2021), p. 033270.\n[68] J. Xu and K. Duraisamy, Multi-level convolutional autoencoder networks for parametric pre-\ndiction of spatio-temporal dynamics, Comput. Methods Appl. Mech. Eng., 372 (2020),\np. 113379.\n[69] A. Zhu, T. Bertalan, B. Zhu, Y. Tang, and I. G. Kevrekidis, Implementation and (inverse\nmodified) error analysis for implicitly templated ODE-Nets, SIAM Journal on Applied\nDynamical Systems, 23 (2024), pp. 2643\u20132669.\n[70] Z. Zou, X. Meng, A. F. Psaros, and G. E. Karniadakis, NeuraluUQ: A comprehensive\nlibrary for uncertainty quantification in neural differential equations and operators, SIAM\nRev., 66 (2024), pp. 161\u2013190." - }, - { - "domain": "Computer Science", - "chunk_type": "general", - "text": "PG-DPIR: An efficient plug-and-play method for\nhigh-count Poisson-Gaussian inverse problems\nMaud Biquard\nISAE-Supaero/CNES\nToulouse, France\nmaud.biquard@isae-supaero.fr\nMarie Chabert\nIRIT / Toulouse-INP\nToulouse, France\nmarie.chabert@irit.fr\nFlorence Genin, Christophe Latry\nCNES\nToulouse, France\nfirstname.lastname@cnes.fr\nThomas Oberlin\nISAE-Supaero\nToulouse, France\nthomas.oberlin@isae-supaero.fr\nAbstract\u2014Poisson-Gaussian noise describes the noise of var-\nious imaging systems thus the need of efficient algorithms for\nPoisson-Gaussian image restoration. Deep learning methods offer\nstate-of-the-art performance but often require sensor-specific\ntraining when used in a supervised setting. A promising alter-\nnative is given by plug-and-play (PnP) methods, which consist\nin learning only a regularization through a denoiser, allowing to\nrestore images from several sources with the same network. 
This\npaper introduces PG-DPIR, an efficient PnP method for high-\ncount Poisson-Gaussian inverse problems, adapted from DPIR\n[1]. While DPIR is designed for white Gaussian noise, a naive\nadaptation to Poisson-Gaussian noise leads to prohibitively slow\nalgorithms due to the absence of a closed-form proximal operator.\nTo address this, we adapt DPIR for the specificities of Poisson-\nGaussian noise and propose in particular an efficient initialization\nof the gradient descent required for the proximal step that\naccelerates convergence by several orders of magnitude. Exper-\niments are conducted on satellite image restoration and super-\nresolution problems. High-resolution realistic Pl\u00b4eiades images are\nsimulated for the experiments, which demonstrate that PG-DPIR\nachieves state-of-the-art performance with improved efficiency,\nwhich seems promising for on-ground satellite processing chains.\nIndex Terms\u2014Poisson-Gaussian noise, Plug-and-Play methods,\ndeblurring, super-resolution, satellite imaging.\nI. INTRODUCTION\nPoisson-Gaussian noise is the combination of Poisson noise,\nrelated to the quantum nature of light, and of Gaussian noise,\nmodeling the electronic noise of the imaging system. Hence,\nthis noise model exists in various applications, such as remote\nsensing, astronomy, or biology. The forward model of such a\nsystem can be described as:\ny = p(Ax) + w0\n(1)\nwith x the target, A a degradation operator (e.g. a convolution),\np(Ax) \u223cP(Ax) some Poisson noise, and w0 \u223cN(0, \u03c32\n0I)\nsome Gaussian noise. Then, the imaging problem consists in\nrecovering x, typically by solving\narg min\nx F(x, y) + \u03bbR(x)\n(2)\nwith F(x, y) measuring the fidelity between Ax and y, and\nR being a regularization or a prior on x, promoting solutions\nmost compatible with R.\nThis work was partly supported by CNES under project name DEEPREG,\nand ANITI under grant agreement ANR-19-PI3A-0004.\nExcept for low-count imaging, it is realistic to perform a\nGaussian approximation of the noise model in eq. (1) [2], [3],\nleading to the following forward model:\ny = Ax + w,\nw \u223cN(0, \u03c32\nw(x))\n(3)\nwith \u03c32\nw(x) = \u03c32\n0 + K(Ax), \u03c30 and K being the noise\nparameters. This approximation is very accurate as long as the\nnumber of photons exceeds 20. This condition is satisfied in\nvarious applications, for instance for remote sensing problems\n[4]. To solve such inverse problems, classical approaches\noften rely on the Anscombe transform [5] which empirically\nstabilizes the noise variance.\nHowever, deep learning methods have significantly im-\nproved image restoration performance. The majority of these\nmethods [6], [7] consists in directly mapping the degraded\nimage to the restored image using supervised learning, for\na specific inverse problem. This requires one neural network\nfor each sensor, making this unpractical for real-world pro-\ncessing pipelines with several sources. Unlike them, plug-and-\nplay (PnP) methods consist in solving eq. (2) with proximal\nsplitting algorithms and replacing the prior step by a denois-\ning step [1], [8], [9]. The denoiser regularizes the problem\nindependently from the forward model, enabling to restore\nimages from several sources within the same network. This\nmakes them interesting for replacing traditional methods in the\nprocessing chains, however, these methods often require lots\nof iterations to converge, making them less competitive than\nsupervised methods in terms of computation time. 
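As a point of reference for the noise model in (3), the following sketch simulates a Gaussian-approximated Poisson-Gaussian measurement; the clipping of the variance is our own safeguard against negative values of Ax and is not part of the model.

import numpy as np

def simulate_measurement(Ax, sigma0, K, rng=np.random.default_rng()):
    """y = Ax + w with w ~ N(0, sigma_w^2(x)) and sigma_w^2 = sigma0^2 + K * (Ax), as in (3)."""
    variance = sigma0 ** 2 + K * np.clip(Ax, 0.0, None)
    return Ax + rng.normal(scale=np.sqrt(variance))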
Interest-\ningly, DPIR [1] is a PnP method requiring very few iterations\nas it leverages a denoiser trained at several noise levels to\nspeed up the convergence. Yet, DPIR is designed for white\nGaussian noise for which the prox can be computed in closed\nform. Thus, it can become very slow when the prox is not\navailable in a closed form, as it has to be computed within\ngradient descent. This is the case for the considered inverse\nproblems.\nIn this paper, we introduce PG-DPIR, an adaptation of\nDPIR for Poisson-Gaussian image restoration. In this method,\nbesides adapting DPIR hyperparameters for Poisson-Gaussian\nnoise, we propose an efficient procedure to compute the data-\nfidelity prox by initializing the required gradient descent with\nan approximated proximal operator that can be computed in\nclosed form. Afterward, only few gradient steps are required,\narXiv:2504.10375v1 [cs.CV] 14 Apr 2025\nleading to a very fast algorithm. We apply PG-DPIR to satellite\nimage restoration. In particular, we compare PG-DPIR to algo-\nrithms that are currently used in real-world pipelines, as well\nas to various PnP methods. These experiments are conducted\non realistic images simulated from airplane images to imitate\nthe very high resolution Pl\u00b4eiades satellites. These experiments\nshow that PG-DPIR outperforms the other PnP methods in\nterms of metrics and computation speed, showing the potential\nof integrating PG-DPIR in satellite image processing pipelines.\nII. RELATED WORKS\nDirect inversion methods for image restoration directly map\nthe degraded to the restored image in a supervised manner.\nThey rely on diverse neural architecture, which can be in\nparticular convolution-based [6], [10], transformer-based [11],\nor diffusion-based [12], [13]. Unlike them, the methods that\nonly learn the regularization enable to solve several inverse\nproblems with the same network. This regularization can\nbe learned explicitly within generative models [14], [15],\nor implicitly with a denoiser [1], [8] in PnP methods or\nwith a diffusion model [16], [17] in diffusion-based methods.\nConcerning PnP methods, they solve eq. (2) using diverse\nsplitting algorithms, such as Alternating Direction Method\nof Mutlipliers (ADMM) [8], Half Quadratic Splitting (HQS)\n[1] or Proximal Gradient Descent [18]. In particular, DPIR\n[1] uses the HQS framework, solving eq. (2) with very few\niterations.\nAlthough Poisson noise is well studied in the literature\n[19], [20], Poisson-Gaussian noise is less frequently addressed.\nSome approaches consider the exact Poisson-Gaussian noise\nmodel. For instance, [21] formulates the exact Poisson-\nGaussian likelihood and considers a primal-dual algorithm\nwhile [22] solves a minimax problem using a total variation\nregularization. This often leads in practice to heavier and\nmore complex methods than considering the approximated\nGaussian model of eq. (3) [23]. [3] considers the approximate\nnoise model and proposes a variational restoration algorithm\nusing framelet transform, and [23] performs a Expectation-\nMaximization method with this same noise approximation.\n[4] proposes to restore satellite images using the Anscombe\nvariance stabilization [5] and then applies classical white\nGaussian denoising and deconvolution techniques. Concerning\ndeep learning methods, [24] has trained a Poisson-Gaussian\ndenoiser for fluorescence microscopy. 
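For concreteness, the variance-stabilization route of [4], [5] applies the generalized Anscombe transform, denoises as if the noise were white Gaussian with unit variance, and then inverts the transform. The sketch below uses the standard form of that transform written with the gain/offset parametrization of eq. (3); it is a simplified illustration, and the exact unbiased inverse derived in [5] should be preferred over the naive algebraic inverse shown here.

import numpy as np

def gat_forward(y, K, sigma0):
    # Generalized Anscombe transform: maps Poisson-Gaussian data with gain K and
    # Gaussian std sigma0 to data with approximately unit-variance Gaussian noise.
    return (2.0 / K) * np.sqrt(np.maximum(K * y + 0.375 * K ** 2 + sigma0 ** 2, 0.0))

def gat_inverse_algebraic(d, K, sigma0):
    # Naive algebraic inverse of the forward transform (biased at low counts).
    return (K / 4.0) * d ** 2 - 0.375 * K - sigma0 ** 2 / K

# Typical pipeline, with denoise_gaussian standing in for any white-Gaussian denoiser:
# d_hat = denoise_gaussian(gat_forward(y, K, sigma0), noise_std=1.0)
# x_hat = gat_inverse_algebraic(d_hat, K, sigma0)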
Interestingly, [25] pro-\nposes a convergent PnP framework for Bregman noise models\napplied to Poisson noise that could potentially work for\nPoisson-Gaussian noise.\nIII. METHOD\nA. Background on DPIR\nDPIR [1] is a PnP algorithm relying on Half Quadratic\nSplitting (HQS). Its particularly low number of iterations as\nwell as its convenient parameter tuning have made DPIR a\nreference PnP method for solving inverse problems with white\nGaussian noise.\n1) Optimization process: Considering a forward model y =\nAx+w with A the degradation operator and a white Gaussian\nnoise w \u223cN(0, \u03c32), DPIR seeks to minimize\n1\n2\u03c32 ||y\u2212Ax||2+\n\u03bbR(x) with respect to x where R is a regularization term. HQS\nalgorithm consists of the following splitting:\nmin\nx,u\n1\n2\u03c32 ||y \u2212Ax||2 + \u03bbR(u) + \u00b5\n2 ||u \u2212x||2,\n(4)\nwith u an auxiliary variable that is jointly optimized with x.\nThis leads to an alternate optimization scheme\nxk = arg min\nx\n1\n2\u03c32 ||y \u2212Ax||2 + \u00b5\n2 ||x \u2212uk\u22121||2,\n(5)\nuk = arg min\nu\n1\n2\np\n\u03bb/\u00b5\n2 ||u \u2212xk||2 + R(u).\n(6)\nEquation (5) is a proximal step on the data fidelity which\ncan be computed in closed form for various inverse problems.\nEquation (6) amounts to a denoising step with noise level \u03c3d =\np\n\u03bb/\u00b5.\n2) Noise schedule:\n[1] proposes to use decreasing values\nof \u03c3d, evenly spaced in log scale from \u03c31 to \u03c32 < \u03c31\nover the iterations. \u03c32 = \u03c3 corresponds to the noise level\nin the measurement, while \u03c31 > \u03c3 is a hyperparameter that\nis typically set to 50/255 for 8 bits images. With \u03bb a fixed\nregularization parameter, this leads to a larger \u00b5 =\n\u03bb\n\u03c32\nd at each\niteration so that uk and xk become closer over the iterations.\nNote that the number of iterations is fixed a priori because\nof the noise schedule. This results in a very fast convergence,\ngenerally within 8 iterations. In [1], the successive denoising\nsteps are performed with DRUnet denoiser as it can manage\ndifferent noise levels.\nB. Proposed Poisson-Gaussian DPIR\nDPIR algorithm has been specifically designed for inverse\nproblems with white Gaussian noise of fixed variance. Hence,\nit has to be adapted to Poisson-Gaussian inverse problems. We\nconsider the approximate forward model of eq. (3), for which\nthe Poisson-Gaussian noise is approximated as a Gaussian but\nwith a spatially varying variance \u03c32\nw(x) = \u03c32\n0 + K(Ax). This\nforward model substantially complicates the problem resolu-\ntion. In particular, the proximal step of eq. (5) can no longer\nbe computed in closed form and has to be obtained through\ngradient descent. This significantly slows down the algorithm\nconvergence for large images. To overcome this limitation, we\npropose in this section a smart initialization of this gradient\ndescent that considerably speeds up its convergence.\nFirst, for the considered inverse problem, the quadratic data\nfidelity term in equations (4) and (5) should be replaced by\nthe real negative log-likelihood:\nF(x, y) \u221d1\n2(y \u2212Ax)T \u03a3w(x)\u22121(y \u2212Ax) + 1\n2 log |\u03a3w(x)|\n(7)\nup to a constant, with \u03a3w(x) = diag(\u03c32\nw(x)). Thus, we\nreplace eq. (5) with:\nxk = arg min\nx F(x, y) + \u00b5\n2 ||x \u2212uk\u22121||2.\n(8)\nWe adapt the noise schedule by considering \u03c32 = \u03c30, that is\nthe minimum level of noise in the image. Then, eq. 
(8) can no\nAlgorithm 1 PG-DPIR\nRequire: y measure, u0 initial image, D(x; \u03c3d) denoiser, n itera-\ntions, (\u03c3(1)\nd , ..., \u03c3(n)\nd\n= \u03c30) denoising schedule, \u03bb parameter, \u03b7\ninner gradient descent stepsize\nfor i = 1, ..., n do\n\u00af\u03c3 \u2190\np\na + b(h0 \u2217\u00afuk\u22121)\n\u25b7Mean noise level of uk\u22121\nx(0)\nk\n\u2190arg minx\n1\n2\u00af\u03c32 ||y \u2212h0 \u2217x||2 +\n\u03bb\n2(\u03c3(i)\nd\n)2 ||x \u2212uk\u22121||2\nif i \u2264n\n2 then\nxk \u2190x(0)\nk\nelse\n\u25b7Gradient descent\nfor j = 1, ..., 5 do\nx(j)\nk\n\u2190x(j\u22121)\nk\n\u2212\u03b7\u2207x[F(x, y) +\n\u03bb\n2(\u03c3(i)\nd\n)2 ||x \u2212uk\u22121||2]\nend for\nxk \u2190x(5)\nk\nend if\nuk \u2190D(xk, \u03c3(k)\nd )\n\u25b7Denoising step\nend for\nlonger be expressed in closed form, and should be computed\nusing gradient descent. We propose to initialize the gradient\ndescent at iteration k with\nx(0)\nk\n= arg min\nx\n1\n2\u00af\u03c32 ||y \u2212Ax||2 + \u00b5\n2 ||x \u2212uk\u22121||2\n(9)\nwhere \u00af\u03c32 = \u03c32\n0 + K(A\u00afuk\u22121) and \u00afuk\u22121 is the mean value\nof uk\u22121. Equation (9), which corresponds to the proximal\noperator considering a fixed noise level \u00af\u03c32, can be computed\nin closed form. Then, we consider two phases during DPIR\niterations. During the first half of the iterations, where each\niteration corresponds to significant changes in xk, we do not\nperform gradient descent on eq. (9) but only the proximal step\neq. (9), that is xk\u22121 = x(0)\nk\u22121. Indeed, we consider x(0)\nk\nto be a\nsufficiently good approximation to xk in this case. During the\nsecond half of the iterations, we perform gradient descent on\neq. (8) starting from x(0)\nk\nduring only 5 iterations. The whole\nPG-DPIR algorithm is provided in algorithm 1. We show in the\nexperiments section that this optimization strategy produces\nvery similar results as the original process while being much\nfaster.\nIV. EXPERIMENTS\nWe conduct experiments on optical satellite image restora-\ntion problems on panchromatic images. For this application,\nwe consider the following forward model:\ny = D(h \u2217x) + w,\n(10)\nwhere x represents the observed landscape at the target res-\nolution, y the acquired image, h the blur kernel sampled at\nthe target resolution, modeling the combined effects of the\natmosphere, of the movement during integration time, and of\nthe instrument. D denotes a downsampling operator. In the\nexperiments, we consider a restoration problem, for which\nD = Id, and a joint restoration and super-resolution problem,\nfor which D downsamples the image by a factor of 2. Vector\nw represents a white Poisson-Gaussian noise which can, for\nthe considered problems, be approximated by a Gaussian noise\nof variance \u03c32\nw(x) = \u03c32\n0 + KD0(h \u2217x), where \u03c30 and K are\nnoise parameters specific to a given optical system.\nA. Simulation of realistic satellite images\nWe train and test the network on realistic simulated images,\nimitating images from Pl\u00b4eiades satellites. For the training,\ntarget images only are required, while for the image restora-\ntion, pairs of (target image, degraded image) are required.\nThese data are simulated from airplanes images at extremely\nhigh resolution, allowing us to consider them as perfect when\ndownsampled at the target resolution. We use two databases:\nPCRS, provided by IGN [26], and P\u00b4elican, provided by CNES.\nPCRS are 12 bits images, with a resolution of 5cm covering\n537.18km2. 
P\u00b4elican are 12 bits images with a resolution of\n10cm covering 18.45km2.\nTo simulate the target images, the images are downsampled\nat the target resolution using a bicubic filter. The degraded\nimages are simulated using a realistic simulation chain from\nCNES to imitate images from Pl\u00b4eiades satellites, which have\na resolution of 50cm and for which the degradation model\nis well controlled by CNES. The test set is constituted of 30\nimages at a resolution of 50cm of size 820\u00d7820, coming from\nthe P\u00b4elican dataset. The other images are used for training.\nB. Experiment setup\n1) Considered inverse problems: We consider two prob-\nlems: Pl\u00b4eiades image restoration without, denoted as IR, and\nwith super-resolution by 2, denoted as SISR.\n2) PG-DPIR details: We use DRUNet denoiser [1]. It is\na state-of-the-art deep convolutional denoiser designed for\nimage restoration. We employ a pretrained model from [1] and\nfinetune it on target images during 500k iterations. For IR, 8\nPG-DPIR iterations are performed, and 20 iterations for SISR.\nWe reduce the starting denoising level \u03c3(1)\nd\nto 20/255 (that is\n320/4095 for 12-bits images) as the noise levels considered\nin these restoration problems are typically lower than in the\noriginal DPIR paper [1].\n3) Baselines: We compare PG-DPIR to a classical im-\nage restoration algorithm currently used in Pl\u00b4eiades ground\nsegment, that we denote Bay+IF. It includes NL-Bayes [27]\nfor denoising then inverse filtering. For SISR, the Bay+IF\nimages are upsampled using a bicubic filter. We also use\nDPIR as a baseline, for which we consider an approximated\nwhite Gaussian noise model with the deviation equal to the\nPoisson-Gaussian deviation at the mean luminance. At last,\nwe compare to proximal gradient descent (PGD), another PnP\nmethod. All the PnP methods are performed with the same\ndenoiser. The hyperparameters of PG-DPIR and the baselines\nare tuned on a validation set of 4 images.\n4) Metrics: We use three metrics: the Peak Signal-to-Noise\nRatio (PSNR), the Structural SIMilarity (SSIM) [28], and the\nLearned Perceptual Image Patch Similarity (LPIPS) [29]. The\nPSNR represents the accuracy between the restored and target\nimages, while the SSIM and LPIPS are respectively classical\nand deep learning perceptual metrics.\nC. Results\nTABLE I\nRESULTS FOR THE IMAGE RESTORATION (IR) AND SUPER-RESOLUTION\n(SISR) PROBLEMS. TIME DENOTES THE TIME REQUIRED TO RESTORE ONE\nIMAGE WITH A NVIDIA QUADRO RTX8000 GPU.\ny\nBay+IF\nDPIR\nPGD\nPG-DPIR\nIR\nPSNR \u2191\n33.55\n40.56\n48.52\n45.90\n48.66\nSSIM \u2191\n0.9289\n0.9859\n0.9950\n0.9925\n0.9952\nLPIPS \u2193\n0.1437\n0.0369\n0.0163\n0.0428\n0.0138\nTime\nx\nx\n2.2s\n21.1s\n3.7s\nSISR\nPSNR \u2191\nx\n33.22\n37.04\n34.55\n37.18\nSSIM \u2191\nx\n0.9088\n0.9505\n0.9319\n0.9513\nLPIPS \u2193\nx\n0.2463\n0.1676\n0.2058\n0.1658\nTime\nx\nx\n60.5s\n204.2sec\n68s\nFig. 1. Visual results for the IR problem.\nThe metrics for the two considered inverse problems are\nprovided in table I. First, Bay+IF is outperformed by the deep\nlearning methods. Then, DPIR performs remarkably well given\nthat its forward model is approximate, as it yields better results\nas PGD. DPIR is also much faster than PGD, illustrating that\nDPIR is very efficient among the PnP methods. Last but not\nleast, PG-DPIR outperforms the others methods in terms of\nmetrics, and is almost as fast as DPIR. 
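For reference, the core loop of Algorithm 1 evaluated above is compact. The following is a minimal PyTorch sketch assuming images scaled to [0, 1] and a periodic-convolution forward operator Ax = h0 * x (the IR case, for which eq. (9) has an FFT-domain closed form); H denotes the FFT of h0, denoiser(x, sigma) stands in for the DRUNet prior, and the parameter defaults are illustrative rather than the authors' exact settings.

import math
import torch

def approx_prox_fft(y, u, H, sigma_bar2, mu):
    # Closed-form solution of eq. (9) for A x = h0 * x with periodic boundaries,
    # i.e. argmin_x ||y - h0*x||^2 / (2 sigma_bar^2) + mu/2 ||x - u||^2.
    Y, U = torch.fft.fft2(y), torch.fft.fft2(u)
    num = torch.conj(H) * Y / sigma_bar2 + mu * U
    den = H.abs() ** 2 / sigma_bar2 + mu
    return torch.fft.ifft2(num / den).real

def neg_log_likelihood(x, y, H, sigma0, K):
    # Heteroscedastic data term of eq. (7), up to an additive constant.
    Ax = torch.fft.ifft2(torch.fft.fft2(x) * H).real
    var = sigma0 ** 2 + K * Ax.clamp(min=0)
    return 0.5 * (((y - Ax) ** 2) / var + torch.log(var)).sum()

def pg_dpir(y, H, denoiser, sigma0, K, lam=0.23, sigma1=20.0 / 255.0,
            n_iter=8, inner_steps=5, eta=1e-3):
    # Decreasing denoising levels, log-spaced from sigma1 down to sigma0.
    sigmas = torch.logspace(math.log10(sigma1), math.log10(sigma0), n_iter)
    u = y.clone()
    for i, sd in enumerate(sigmas):
        mu = lam / sd ** 2
        sigma_bar2 = sigma0 ** 2 + K * u.mean().clamp(min=0)   # mean noise level of u_{k-1}
        x = approx_prox_fft(y, u, H, sigma_bar2, mu)           # eq. (9), closed form
        if i >= n_iter // 2:                                   # second half: refine eq. (8)
            x = x.detach().requires_grad_(True)
            for _ in range(inner_steps):
                obj = neg_log_likelihood(x, y, H, sigma0, K) + 0.5 * mu * ((x - u) ** 2).sum()
                grad, = torch.autograd.grad(obj, x)
                x = (x - eta * grad).detach().requires_grad_(True)
            x = x.detach()
        u = denoiser(x, float(sd)).detach()                    # denoising step, eq. (6)
    return u

For the SISR problem the same schedule applies, but the closed-form initialization must be replaced by the proximal solution corresponding to the downsampling-plus-blur operator.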
The Table I results illustrate the benefit of the proposed\ninitialization: the gradient descent used to approximate the\nprox is fast and does not significantly increase the overall\ncomputational cost of the method.\nAdditionally, visual results are provided for the IR problem\nin fig. 1 and for the SISR problem in fig. 2. The difference\nbetween the methods is very clear in the first line of fig. 1:\nBay+IF is much smoother than the others, and PG-DPIR\nreconstructs the pedestrian crossing very well, unlike DPIR\nand PGD. Indeed, PGD seems to regularize too much, while\nDPIR relies on an approximate white Gaussian noise model,\nwhich seems unable to restore these fine details.\nIn fig. 2, PG-DPIR yields fewer artifacts than PGD and DPIR\non the roof, and is more faithful than the other methods in\nthe shadow. These results also highlight that the methods do\nnot invent information when super-resolving by a factor of 2,\nwhile maintaining high image quality. Hence, it seems\nfeasible to integrate such a zoom into a processing pipeline to\nenhance the image before downstream tasks.\nFig. 2. Visual results for the SISR problem.\nTABLE II\nDIFFERENCE BETWEEN PG-DPIR WITH THE FAST PROXIMAL\nINITIALIZATION (PG-DPIR) OR WITHOUT IT (PG-DPIR WOUT\nPROX INIT). TIME DENOTES THE TIME REQUIRED TO RESTORE ONE\nIMAGE WITH A NVIDIA QUADRO RTX8000 GPU.\nIR\nSISR\nPG-DPIR\nwout prox init\nPG-DPIR\nPG-DPIR\nwout prox init\nPG-DPIR\nPSNR \u2191\n48.70\n48.66\n37.19\n37.18\nLPIPS \u2193\n0.0139\n0.0138\n0.1656\n0.1658\nTime\n4.3s\n3.7s\n1090s\n68s\nD. Ablation study\nTable II compares PG-DPIR with a standard gradient descent\nfor the proximal operator (PG-DPIR wout prox init) and with\nthe proposed initialization (PG-DPIR). The two variants reach\nalmost the same metrics, but PG-DPIR is faster, especially for\nthe SISR problem. This significant difference makes PG-DPIR\nusable for real-world satellite image restoration, unlike\nPG-DPIR wout prox init.\nFig. 3. Visual results for the SISR problem with the fast proximal initialization\n(PG-DPIR) or without it (PG-DPIR wout prox init).\nVisual results for these two variants are given in fig. 3. The\ntwo methods do not exhibit visual differences. However, some\nline artifacts are visible for both methods in the second image.\nThese lines are only visible in SISR and highlight a drawback\nof our method. It is likely that a more careful choice of the\nhyperparameters, at training or at test time, would remove\nthose artifacts. This is left for future work.\nV. CONCLUSION\nIn this paper, we have introduced PG-DPIR, a plug-and-play\nmethod specifically designed for Poisson-Gaussian inverse\nproblems in which the approximate Gaussian noise model\nremains valid. It employs the Half Quadratic Splitting frame-\nwork, alternating a data-fidelity proximal step and a denoising\nstep. PG-DPIR speeds up the proximal step by initializing the\ninner gradient descent with an approximate proximal operator\nthat can be computed in closed form, which significantly\nreduces the number of gradient descent iterations required in\neach proximal step.\nWe apply PG-DPIR to realistic high-resolution satellite image\nrestoration problems and compare it to several baselines.\nThese experiments demonstrate the value of our method,\nwhich outperforms the other PnP methods in both metrics\nand computation time. 
Future work will be dedicated to erasing the remaining\nartifacts in SISR problems, as well as adapting diffusion-based\nmethods to approximate the Poisson-Gaussian noise model.\nREFERENCES\n[1] K. Zhang, Y. Li, W. Zuo, L. Zhang, L. Van Gool, and R. Timofte, \u201cPlug-\nand-play image restoration with deep denoiser prior,\u201d IEEE Transactions\non Pattern Analysis and Machine Intelligence (TPAMI), vol. 44, no. 10,\npp. 6360\u20136376, 2021.\n[2] A. Foi, M. Trimeche, V. Katkovnik, and K. Egiazarian, \u201cPractical\npoissonian-gaussian noise modeling and fitting for single-image raw-\ndata,\u201d IEEE transactions on image processing, vol. 17, no. 10, pp. 1737\u2013\n1754, 2008.\n[3] J. Li, Z. Shen, R. Yin, and X. Zhang, \u201cA reweighted l2 method for image\nrestoration with poisson and mixed poisson-gaussian noise,\u201d Inverse\nProbl. Imaging (Springfield), vol. 9, no. 3, pp. 875\u2013894, 2015.\n[4] C. Latry, S. Fourest, and C. Thiebaut, \u201cRestoration technique for\nPleiades-HR panchromatic images,\u201d The International Archives of the\nPhotogrammetry, Remote Sensing and Spatial Information Sciences,\nvol. 39, pp. 555\u2013560, 2012.\n[5] M. Makitalo and A. Foi, \u201cOptimal inversion of the generalized\nAnscombe transformation for Poisson-Gaussian noise,\u201d IEEE Transac-\ntions on Image Processing (TIP), vol. 22, no. 1, pp. 91\u2013103, 2012.\n[6] C. Ledig, L. Theis, F. Huszar, J. Caballero, A. Cunningham, A. Acosta,\nA. Aitken, A. Tejani, J. Totz, Z. Wang, and W. Shi, \u201cPhoto-realistic\nsingle image super-resolution using a generative adversarial network,\u201d in\nIEEE conference on Computer Vision and Pattern Recognition (CVPR),\n2017.\n[7] Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu, \u201cResidual dense\nnetwork for image super-resolution,\u201d in IEEE Conference on Computer\nVision and Pattern Recognition (CVPR), pp. 2472\u20132481, 2018.\n[8] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, \u201cPlug-and-\nplay priors for model based reconstruction,\u201d IEEE Global Conference\non Signal and Information Processing (GlobalSIP), 2013.\n[9] S. Hurault, A. Leclaire, and N. Papadakis, \u201cGradient step denoiser\nfor convergent plug-and-play,\u201d in International Conference on Learning\nRepresentations (ICLR), 2022.\n[10] X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, Y. Qiao, and\nC. Change Loy, \u201cESRGAN: Enhanced super-resolution generative adver-\nsarial networks,\u201d in European Conference on Computer Vision (ECCV)\nworkshop, pp. 0\u20130, 2018.\n[11] Z. Wang, X. Cun, J. Bao, W. Zhou, J. Liu, and H. Li, \u201cUformer:\nA general u-shaped transformer for image restoration,\u201d in IEEE/CVF\nConference on Computer Vision and Pattern Recognition (CVPR),\npp. 17683\u201317693, 2022.\n[12] Z. Luo, F. K. Gustafsson, Z. Zhao, J. Sj\u00a8olund, and T. B. Sch\u00a8on,\n\u201cRefusion: Enabling large-size realistic image restoration with latent-\nspace diffusion models,\u201d in IEEE/CVF Conference on Computer Vision\nand Pattern Recognition (CVPR), pp. 1680\u20131691, 2023.\n[13] A. Lugmayr, M. Danelljan, A. Romero, F. Yu, R. Timofte, and\nL. Van Gool, \u201cRepaint: Inpainting using denoising diffusion probabilistic\nmodels,\u201d in IEEE/CVF Conference on Computer Vision and Pattern\nRecognition (CVPR), pp. 11461\u201311471, 2022.\n[14] A. Bora, A. Jalal, E. Price, and A. G. Dimakis, \u201cCompressed sensing\nusing generative models,\u201d in International Conference on Machine\nLearning (ICML), 2017.\n[15] M. Biquard, M. Chabert, and T. 
Oberlin, \u201cVariational bayes im-\nage\nrestoration\nwith\ncompressive\nautoencoders,\u201d\narXiv\npreprint\narXiv:2311.17744, 2023.\n[16] Y. Zhu, K. Zhang, J. Liang, J. Cao, B. Wen, R. Timofte, and L. Van Gool,\n\u201cDenoising diffusion models for plug-and-play image restoration,\u201d in\nIEEE/CVF Conference on Computer Vision and Pattern Recognition\n(CVPR), pp. 1219\u20131229, 2023.\n[17] B. Kawar, M. Elad, S. Ermon, and J. Song, \u201cDenoising diffusion\nrestoration models,\u201d Advances in Neural Information Processing Systems\n(NeurIPS), vol. 35, pp. 23593\u201323606, 2022.\n[18] S. Hurault, A. Chambolle, A. Leclaire, and N. Papadakis, \u201cA relaxed\nproximal gradient descent algorithm for convergent plug-and-play with\nproximal denoiser,\u201d in International Conference on Scale Space and\nVariational Methods in Computer Vision, pp. 379\u2013392, Springer, 2023.\n[19] A. Rond, R. Giryes, and M. Elad, \u201cPoisson inverse problems by the\nplug-and-play scheme,\u201d Journal of Visual Communication and Image\nRepresentation, vol. 41, pp. 96\u2013108, 2016.\n[20] M. H. Syed, K. Upreti, M. S. Nasir, M. S. Alam, and A. Kumar Sharma,\n\u201cAddressing image and poisson noise deconvolution problem using\ndeep learning approaches,\u201d Computational Intelligence, vol. 39, no. 4,\npp. 577\u2013591, 2023.\n[21] E. Chouzenoux, A. Jezierska, J.-C. Pesquet, and H. Talbot, \u201cA convex\napproach for image restoration with exact poisson\u2013gaussian likelihood,\u201d\nSIAM Journal on Imaging Sciences, vol. 8, no. 4, pp. 2662\u20132682, 2015.\n[22] A. Lanza, S. Morigi, F. Sgallari, and Y.-W. Wen, \u201cImage restoration with\npoisson\u2013gaussian mixed noise,\u201d Computer Methods in Biomechanics and\nBiomedical Engineering: Imaging & Visualization, vol. 2, no. 1, pp. 12\u2013\n24, 2014.\n[23] F. Benvenuto, A. La Camera, C. Theys, A. Ferrari, H. Lant\u00b4eri, and\nM. Bertero, \u201cThe study of an iterative method for the reconstruction\nof images corrupted by poisson and gaussian noise,\u201d Inverse Problems,\nvol. 24, no. 3, p. 035016, 2008.\n[24] Y. Zhang, Y. Zhu, E. Nichols, Q. Wang, S. Zhang, C. Smith, and\nS. Howard, \u201cA poisson-gaussian denoising dataset with real fluorescence\nmicroscopy images,\u201d in Proceedings of the IEEE/CVF Conference on\nComputer Vision and Pattern Recognition (CVPR), June 2019.\n[25] S. Hurault, U. Kamilov, A. Leclaire, and N. Papadakis, \u201cConvergent\nbregman plug-and-play image restoration for poisson inverse problems,\u201d\narXiv preprint arXiv:2306.03466, 2023.\n[26] \u201cInstitut national de l\u2019information g\u00b4eographique et foresti`ere (IGN),\u201d\nhttps://geoservices.ign.fr/pcrs.\n[27] M. Lebrun, A. Buades, and J.-M. Morel, \u201cImplementation of the \u201dNon-\nLocal Bayes\u201d (NL-Bayes) image denoising algorithm,\u201d Image Process-\ning On Line (IPOL), vol. 3, pp. 1\u201342, 2013.\n[28] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, \u201cImage\nquality assessment: from error visibility to structural similarity,\u201d IEEE\nTransactions on Image Processing (TIP), vol. 13, no. 4, pp. 600\u2013612,\n2004.\n[29] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, \u201cThe\nunreasonable effectiveness of deep features as a perceptual metric,\u201d in\nIEEE/CVF Conference on Computer Vision and Pattern Recognition\n(CVPR), pp. 586\u2013595, 2018." 
- }, - { - "domain": "Computer Science", - "chunk_type": "general", - "text": "Teacher Motion Priors: Enhancing Robot Locomotion over Challenging\nTerrain\nFangcheng Jin1,2, Yuqi Wang3, Peixin Ma3, Guodong Yang1,2, Pan Zhao3, En Li1,2, Zhengtao Zhang1,2\nAbstract\u2014 Achieving robust locomotion on complex terrains\nremains a challenge due to high-dimensional control and envi-\nronmental uncertainties. This paper introduces a teacher-prior\nframework based on the teacher-student paradigm, integrating\nimitation and auxiliary task learning to improve learning\nefficiency and generalization. Unlike traditional paradigms\nthat strongly rely on encoder-based state embeddings, our\nframework decouples the network design, simplifying the policy\nnetwork and deployment. A high-performance teacher policy\nis first trained using privileged information to acquire gen-\neralizable motion skills. The teacher\u2019s motion distribution is\ntransferred to the student policy, which relies only on noisy\nproprioceptive data, via a generative adversarial mechanism\nto mitigate performance degradation caused by distributional\nshifts. Additionally, auxiliary task learning enhances the student\npolicy\u2019s feature representation, speeding up convergence and\nimproving adaptability to varying terrains. The framework is\nvalidated on a humanoid robot, showing a great improvement\nin locomotion stability on dynamic terrains and significant\nreductions in development costs. This work provides a practical\nsolution for deploying robust locomotion strategies in humanoid\nrobots.\nI. INTRODUCTION\nRobust locomotion on complex terrains remains a core\nchallenge in robotics due to high-dimensional control and en-\nvironmental uncertainties. Early model-based control meth-\nods enabled basic walking on challenging terrains [1]\u2013[5] and\nwere extended to humanoid robots for various tasks [6]\u2013[8],\nbut these approaches often lack adaptability in real-world\nscenarios. Recent advancements in reinforcement learning\n(RL) have shown promise for addressing complex control\nproblems [9]\u2013[12], though applying RL to humanoid robots\nremains difficult due to their high degrees of freedom and\nthe need for robust performance on dynamic terrains. The\nteacher-student paradigm has emerged as a solution, where\na high-performance teacher policy is trained using privileged\ninformation and transferred to a student policy that relies\non proprioceptive inputs [13]\u2013[17]. 
This approach enables\nefficient sim-to-real deployment, but still faces challenges\nsuch as distributional shift and network complexity.\n*Corresponding Author: Guodong Yang\n*This work was supported in part by the National Natural Science\nFoundation of China under Grant 62273344, and in part by Beijing Zhongke\nHuiling Robot Technology Co., LTD\n1The authors are with the School of Artificial Intelligence, Univer-\nsity of Chinese Academy of Sciences, Beijing, Beijing 100000, China\njinfangcheng23@mails.ucas.ac.cn\n2The authors are with Institute of Automation, Chinese Academy of Sci-\nences, Beijing, Beijing 100000, China {guodong.yang, en.li,\nzhengtao.zhang}@ia.ac.cn\n3The authors are with Beijing Zhongke Huiling Robot Technology Co.,\nLTD, Beijing 100192, China\nSeveral improvements have been proposed, including Reg-\nularized Online Adaptation (ROA) and Collaborative Train-\ning of Teacher-Student Policies (CTS) [18], [19], but these\nmethods still struggle with distributional shift and network\nstructure dependency, limiting their generalization ability.\nTo address these issues, Generative Adversarial Imitation\nLearning (GAIL) [20] leverage adversarial training to al-\nleviate distributional shift and decouple the student policy\nfrom the teacher\u2019s network. Extensions like Adversarial\nMotion Priors (AMP) further enhance motion generation by\nevaluating state transitions [21], allowing the control strategy\nto generate stylized movements. Additionally, Multi-Task\nLearning (MTL) [22], [23] has been integrated into RL to\naccelerate training and improve generalization by enhancing\nfeature representations [24]\u2013[26].\nIn this work, we propose a novel teacher-student frame-\nwork, Teacher Motion Priors (TMP), that integrates gener-\native adversarial mechanisms and auxiliary task learning to\ntackle distributional shift, network dependency, and limited\ngeneralization. Our key contributions include:\n\u2022 High-performance teacher policy: We train a robust\nteacher policy with privileged information and large-\nscale networks to enable generalizable locomotion in\ncomplex environments.\n\u2022 Generative adversarial knowledge transfer: We trans-\nfer the teacher\u2019s behavior distribution to the student\npolicy, mitigating distributional shift and decoupling\nnetwork structures.\n\u2022 Auxiliary task learning for student policy: We en-\nhance feature representation, accelerate training, and\nimprove generalization across dynamic terrains.\n\u2022 Real-world validation: The trained student policy is\ndeployed on a full-scale humanoid robot, showing\nsignificant improvements in locomotion stability and\nrobustness on dynamic terrains.\nOur experiments on a humanoid robot platform demon-\nstrate superior learning performance, enhanced tracking ac-\ncuracy, and reduced Cost of Transport (CoT) compared to\nmainstream methods. The following sections present our\nmethod and experimental results in detail.\nII. TEACHER MOTION PRIORS\nThe training of the TMP framework consists of two\nstages. As illustrated in Fig. 1, the teacher phase on the\nbottom left is performed first, followed by the student phase\non the bottom right. In this section, we first present the\nproblem formulation, followed by the proposed algorithmic\nframework.\narXiv:2504.10390v1 [cs.RO] 14 Apr 2025\nA. 
Humanoid Locomotion and Reinforcement Learning\nOur approach models the humanoid locomotion problem\nas a partially observable Markov decision process (POMDP),\ndefined by the tuple \u27e8S, A, O, T , R, \u03b3\u27e9. Here, S is the state\nspace, A is the action space, T (s\u2032|s, a) is the state transition\nfunction, R(s, a, s\u2032) is the reward function, and O is the\nobservation space, representing partial environmental infor-\nmation. The discount factor \u03b3 \u2208[0, 1] balances immediate\nand future rewards.\nIn simulated environments, the agent has full access to\nthe state space, but in real-world scenarios, the agent only\nobserves o \u2208O , which may be incomplete or noisy. To\naddress this, the policy \u03c0(a|o\u2264t) maps historical observations\nto actions, approximating the true state.\nThe objective is to find an optimal policy \u03c0\u2217that maxi-\nmizes the expected cumulative discounted reward:\nJ(\u03c0) = E\u03c0\n\" \u221e\nX\nt=0\n\u03b3tR(st, at, st+1)\n#\n.\n(1)\nOur framework employs proximal policy optimization\n(PPO) with an actor-critic architecture, replacing supervised\nlearning with a generative adversarial approach for student\npolicy training. This enables the student to mimic the teacher\npolicy, achieving robust locomotion even without privileged\ninformation.\nAt time step t, the proprioceptive observation ot \u2208Rn\nand privileged information op\nt \u2208Rm are combined into the\nfull state st = [ot, op\nt ] \u2208Rm+n. To enhance generalization,\nGaussian noise is added to the proprioceptive observation\ninput of the actor at both stages, while the privileged ob-\nservation remains noise-free. The policy network outputs the\naction at \u2208Ri, where i is the number of controllable joints.\nThe action controls the joint positions by being processed\nthrough a PD controller. Superscripts (\u00b7)t and (\u00b7)s distinguish\nbetween teacher and student components, respectively.\nB. Teacher Policy\nIn the teacher policy training, both privileged information\nand proprioceptive data are input into the teacher policy\nto guide robust locomotion strategy learning. To improve\nlearning, we use frame stacking, where the teacher policy\n\u03c0t takes N frames of proprioceptive data ot\u2212N+1:t \u2208RN\u00d7n\nand M frames of privileged information op\nt\u2212M+1:t \u2208RM\u00d7m.\nThe teacher policy employs an actor-critic architecture.\nThe actor generates actions by receiving privileged informa-\ntion op\nt\u2212M+1:t and proprioceptive observations ot\u2212N+1:t. The\ncritic receives M frames of noise-free state data st\u2212M+1:t \u2208\nRM\u00d7(m+n). Detailed architecture is shown in Table I.\nTraining follows the process outlined in Algorithm 1,\nwhere policy parameters are updated using gradient descent\nto minimize the loss function.\nLoss Function Definition: The teacher policy optimizes\nthe following loss function:\nLteacher = Lclip + \u03bbvLv \u2212\u03bbeLe\n(2)\nwhere:\nAlgorithm 1 Teacher Training Process\n1: Initialize environment and networks.\n2: for k = 0, 1, . . . do\n3:\nCollect a set of trajectories using the latest policy.\n4:\nCompute the target returns \u02c6Rt and advantages \u02c6At\nusing GAE.\n5:\nfor each epoch i = 0, 1, . . . 
do\n6:\nUpdate policy parameters using gradient descent:\n\u03b8t \u2190\u03b8t \u2212\u03b1 \u00b7 clip\n\u0000\u2207\u03b8tLteacher, \u2212max grad, max grad\n\u0001\n7:\nend for\n8: end for\n\u2022 Lclip is the clipped surrogate loss that stabilizes updates:\nLclip = Et[min(rt(\u03b8t) \u02c6At,\nclip(rt(\u03b8t), 1 \u2212\u03f5, 1 + \u03f5) \u02c6At)]\n(3)\n\u2022 Lv is the value function loss, measuring the mean\nsquared error between predicted value V\u03b8t(st) and the\ntarget return \u02c6Rt, computed with generalized advantage\nestimation (GAE):\nLv = Et\nh\n(V\u03b8t(st) \u2212\u02c6Rt)2i\n(4)\n\u2022 Le is the entropy loss, encouraging exploration by\npromoting diverse action distributions:\nLe = 1\nN\nN\nX\nt=1\nH(\u03c0t\n\u03b8(\u00b7|st))\n(5)\nwhere H(\u03c0t\n\u03b8(\u00b7|st)) is the entropy:\nH(\u03c0t\n\u03b8(\u00b7|st)) = \u2212\nX\na\u2208A\n\u03c0t\n\u03b8(a|st) log \u03c0t\n\u03b8(a|st)\n(6)\nEntropy measures the uncertainty of the policy\u2019s action\nselection. By adjusting \u03bbv and \u03bbe, these terms balance\nexploration and convergence, optimizing the teacher policy\nfor stable and efficient locomotion.\nC. Student Policy\nDuring student policy training, the student actor receives\nonly proprioceptive data ot\u2212N+1:t, while the critic remains\nsimilar to the teacher\u2019s. Inspired by GAIL, we replace\ntraditional supervised learning with a generative adversarial\napproach to help the student mimic the teacher\u2019s behavior.\nWhile collecting trajectories using the student policy, we\nalso record the teacher\u2019s response actions at\nt at each state\nvisited by the student. The discriminator D receives the tuple\n(st\u2212S+1:t, at), where st\u2212S+1:t represents the last S frames\nof state information, and outputs pD \u2208[0, 1], indicating the\nlikelihood that at is the teacher\u2019s action.\nThe discriminator\u2019s loss function is defined as:\nLdisc = \u03bbpredLpred + \u03bbgradLgrad + \u03bbweightLweight\n(7)\nwhere:\nFig. 1: TMP Training Process\n\u2022 Prediction Loss: The binary cross-entropy (BCE) loss\nclassifies whether the trajectory originates from the\nteacher or the student:\nLpred = \u2212E\u03c4t\u223c\u03c0teacher [log D(\u03c4t)]\n\u2212E\u03c4s\u223c\u03c0student [log(1 \u2212D(\u03c4s))]\n(8)\nwhere\n\u03c4t\n=\n\u27e8(st\u2212S+1:t, at\nt)\u27e9T\nt=0\nand\n\u03c4s\n=\n\u27e8(st\u2212S+1:t, as\nt)\u27e9T\nt=0\nare\nthe\nteacher\nand\nstudent\ntrajectories, respectively.\n\u2022 Gradient Regularization: This term penalizes large\ngradients to avoid overfitting:\nLgrad = \u03bbgradE\u03c4\u223c\u03c0teacher\u222a\u03c0student\n\u0002\n\u2225\u2207\u03c4D(\u03c4)\u22252\u0003\n(9)\nwhere \u03c4 denotes trajectories sampled from both the\nteacher and student policies, and \u03bbgrad is the regular-\nization coefficient.\n\u2022 Weight Regularization: An L2 penalty on the discrim-\ninator\u2019s weights improves generalization:\nLweight = \u03bbweight\u2225\u03b8D\u22252\n(10)\nwhere \u03b8D are the discriminator parameters, and \u03bbweight\ncontrols the regularization strength.\nTo accelerate training and enhance feature representation\nin the earlier network layers, we incorporate auxiliary task\nlearning. The auxiliary network aux shares the first N \u22122\nlayers with the actor network and predicts auxiliary observa-\ntions \u02c6oaux\nt . 
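Concretely, the shared-trunk design can be sketched as follows: the auxiliary head branches off the layers it shares with the actor and regresses the auxiliary observations with the mean-squared loss of eq. (11). Layer sizes and dimensions are placeholders (see Tables I and II for the values used in the paper); this is an illustrative sketch, not the actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class StudentActorWithAux(nn.Module):
    def __init__(self, obs_dim=15 * 47, act_dim=12, aux_dim=48, hidden=768):
        super().__init__()
        # Layers shared between the actor and the auxiliary head.
        self.shared = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ELU())
        # Actor-specific layers producing the joint-position actions.
        self.policy_head = nn.Sequential(nn.Linear(hidden, 256), nn.ELU(),
                                         nn.Linear(256, act_dim))
        # Auxiliary head predicting the auxiliary observations.
        self.aux_head = nn.Linear(hidden, aux_dim)

    def forward(self, stacked_obs):
        h = self.shared(stacked_obs)
        return self.policy_head(h), self.aux_head(h)

def auxiliary_loss(aux_pred, aux_target):
    # Mean-reduced version of L_aux = E[ || o_hat_aux - o_aux ||_2^2 ]  (eq. 11)
    return F.mse_loss(aux_pred, aux_target)

actor = StudentActorWithAux()
obs = torch.randn(8, 15 * 47)                 # 15 stacked proprioceptive frames
actions, aux_pred = actor(obs)
l_aux = auxiliary_loss(aux_pred, torch.randn(8, 48))

During training, this auxiliary term is simply added to the student's other losses with its own weight, as described below.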
This shared structure enhances the student\u2019s ability\nto learn noise distributions in proprioceptive inputs and guide\nfeature extraction, improving performance.\nThe auxiliary loss function Laux is defined as:\nLaux = Et\nh\r\r\u02c6oaux\nt\n\u2212oaux\nt\n\r\r2\n2\ni\n(11)\nwhere \u02c6oaux\nt\nis the predicted auxiliary observation and oaux\nt\nis\nthe ground truth auxiliary observation.\nThe student policy network is denoted by \u03b8s. The training\nprocess is summarized in Algorithm 2.\nAlgorithm 2 Student Training Process\n1: Initialize environment and networks.\n2: for k = 0, 1, . . . do\n3:\nCollect trajectories \u03c4student using the current student\npolicy.\n4:\nCollect teacher trajectories \u03c4teacher using \u03c0t.\n5:\nCompute student policy target returns \u02c6Rt and advan-\ntages \u02c6At using GAE.\n6:\nfor each epoch i = 0, 1, . . . do\n7:\nUpdate student policy parameters using:\n\u03b8s \u2190\u03b8s \u2212\u03b1 \u00b7 clip\n\u0000\u2207\u03b8sLstudent, \u2212max grad, max grad\n\u0001\n8:\nUpdate discriminator parameters using:\n\u03b8D \u2190\u03b8D \u2212\u03b1 \u00b7 clip\n\u0000\u2207\u03b8DLdisc, \u2212max grad, max grad\n\u0001\n9:\nend for\n10: end for\nThe student policy optimizes the following loss function,\nwhich combines adversarial and auxiliary task losses:\nLstudent = Lclip + \u03bbvLv \u2212\u03bbeLe + \u03bbauxLaux + \u03bbdiscLdisc (12)\nThe terms Lclip, Lv, and Le follow the same definitions as in\nthe teacher policy but are optimized with respect to \u03b8s. Here,\n\u03bbaux and \u03bbdisc control the contributions of the auxiliary task\nand adversarial losses, respectively. During the deployment\nphase, only the student policy \u03c0s is utilized, without the\nauxiliary network or critic.\nD. Training Configuration\nWe define the robot\u2019s base and foot poses using a six-\ndimensional vector [x, y, z, \u03b1, \u03b2, \u03b3], where [x, y, z] is the\nposition and [\u03b1, \u03b2, \u03b3] is the orientation in Euler angles. A gait\ncycle, CT , consists of two double support (DS) phases and\ntwo single support (SS) phases. The leg reference trajectory\nis generated using quintic polynomial interpolation for foot\nheight [27]. The phase mask Ip(t) indicates foot contact,\nwith DS phases marked as [1, 1] and SS phases as [1, 0] or\n[0, 1].\nThe proprioceptive input ot \u2208R47 includes standard\nsensory data, a phase clock signal (sin(t), cos(t)), and com-\nmand input\n\u02d9Pxy\u03b3. The privileged information op\nt \u2208R213\ncomprises data not accessible during deployment, such as\na 187-dimensional height map representing the distance\nfrom the terrain to the robot\u2019s base over a 1.6 m \u00d7 1.0 m\narea and feet contact detection Id(t). Auxiliary information,\nused exclusively during student training, consists of partial\nstates data for auxiliary task learning. A single frame of\nobservations is elaborated in Table II.\nDuring teacher policy training, 3 frames of privileged\ninformation (213 dimensions each) and 15 frames of propri-\noceptive data (47 dimensions each) are concatenated and fed\ninto the actor. Simultaneously, 3 frames of state data (260\ndimensions each) are input into the critic. For the student\npolicy, the actor input consists of 15 frames of proprioceptive\ndata, with the critic structure unchanged from the teacher\u2019s.\nThe discriminator uses 10 frames of state data and 12 for the\naction dimension. 
Detailed network architecture is provided\nin Table I.\nTo ensure robust locomotion, we use a game-inspired\ncurriculum learning, as described in [28] across four terrain\ntypes: slopes, rough, stairs, and discrete obstacles. Slopes\nrange from 0\u25e6to 22.92\u25e6, with rough slopes adding uniform\nnoise (-5 to 5 cm) to simulate uneven surfaces. Stairs vary\nfrom 5 cm to 24.95 cm, and obstacles range from 5 cm to\n24 cm. The curriculum progresses through difficulty levels\nfrom 0 to 20, with each level comprising 20 terrain instances\nto ensure balanced exposure. Each level includes 4 rough\nterrains, 4 discrete obstacles, 3 upslopes, 3 downslopes, 3\nstair ascents, and 3 stair descents. Robots start at level 0 and\nprogress to more challenging conditions as they successfully\ncomplete each level.\nDuring training, velocity commands are uniformly sam-\npled within [\u22121.5, 1.5] m/s. Once robots perform well on\nchallenging terrains and maintain accurate velocity tracking,\nthe velocity range is gradually increased to promote more\nagile locomotion.\nThe student policy includes an auxiliary task network\nthat shares the first layer with the actor network. The\nactor outputs actions at \u2208R12, controlling the legs. The\nauxiliary network predicts auxiliary observations oaux\nt\n\u2208R48.\nThe discriminator distinguishes between teacher and student\ntrajectories using inputs \u27e8op\nt , at\u27e9.\nE. Reward Design\nWe design a unified reward system to promote stable,\nenergy-efficient locomotion while following gait patterns and\nvelocity commands. The reward components include: (1)\ntracking reward, (2) periodic gait reward, (3) foot trajectory\nreward, and (4) regularization terms. Additionally, to distin-\nguish whether an action originates from the student or the\nteacher, a discriminator reward is introduced during student\ntraining.\nThe tracking reward encourages accurate execution of\nvelocity commands CMDxyz and CMD\u03b1\u03b2\u03b3 by penalizing\nvelocity errors:\n\u03d5(e, w) = exp(\u2212w\u2225e\u22252)\n(13)\nwhere e is the velocity error and w controls the penalty\nmagnitude.\nThe periodic gait reward enhances coordination by pe-\nnalizing deviations from the expected foot contact pattern,\nensuring alignment with the gait phase through the binary\nphase mask.\nThe foot trajectory reward maintains desired foot height\nduring the swing phase to ensure obstacle clearance:\nrfc =\nX\nswing\n|hfeet \u2212htarget|\n(14)\nwhere hfeet and htarget represent the actual and target foot\nheights, respectively.\nThe regularization terms penalize undesired behaviors,\nincluding large joint torques, high accelerations, and exces-\nsive foot contact forces. The collision penalty is:\nncollision =\nX\ni\nI(Fi > Fthreshold)\n(15)\nwhere Fi is the contact force and Fthreshold = 0.1N. The\nindicator function I(\u00b7) returns 1 if the condition is met.\nThe discriminator reward rdisc is derived from a probabil-\nity distribution pdisc, which encourages the student to mimic\nthe teacher\u2019s policy as closely as possible:\npdisc = softplus(\u2212D(st))\n(16)\nA higher pdisc value (closer to 1) indicates that the student\u2019s\nactions resemble those of the teacher to a greater extent.\nDetailed reward configuration is in Table III.\nF. Domain Randomization\nTo address the sim-to-real gap, we apply domain ran-\ndomization during training by varying key environmental\nand robot parameters, such as ground friction, stiffness,\npayload, joint friction, and PD controller settings. 
These randomizations improve the policy\u2019s generalization\nability by simulating diverse deployment scenarios. For a full\nlist of randomization parameters, refer to Table IV.\nFig. 2: CASBOT SE Multi-Terrain Testing in Real-World Environments. The first row shows slope testing, the second row\npresents trials on a brick-paved surface, and the third row demonstrates disturbance rejection.\nIII. EXPERIMENTS\nA. Robot Platform Design\nOur study is conducted on the CASBOT SE humanoid\nrobot, developed by Beijing Zhongke Huiling Robot Technol-\nogy Co., Ltd., as illustrated in Fig. 3. This full-sized platform\nstands 1.65 m tall and weighs 46 kg with 18 DoFs (6 on each\nleg and 3 on each arm). In this study, the arm joints are not\nutilized, and thus only 12 joints are controlled. To achieve\nstable locomotion, a periodic reference trajectory for the feet\nis generated and solved using inverse kinematics to derive\njoint trajectories. A closed kinematic chain ankle mechanism,\nproviding 2 DoFs, enhances robustness by reducing the\nimpact of terrain irregularities on foot posture, improving\nstability on rough terrain.\nB. Evaluation Results\nUsing Isaac Gym, we compared the performance of several\nmethods on the CASBOT SE as follows:\n\u2022 Oracle: Policy trained with PPO, receiving st\u2212N+1:t as\ninput.\n\u2022 Baseline: PPO-trained policy with the actor receiving\nproprioceptive observations and the critic receiving priv-\nileged observations [12].\n\u2022 Original teacher-student framework (TS): The teacher\nreceives proprioceptive observations and latent code, and\nthe student is trained using latent reconstruction and\naction imitation loss [13].\n\u2022 Regularized Online Adaptation (ROA): Policy trained\nby integrating latent code between privileged and pro-\nprioceptive encoders [18].\nAll methods were trained in the actor-critic framework\nwith identical configurations, network scale, and random\nseeds, evaluated over 10000 iterations. For the original\nteacher-student framework, 3000 iterations were allocated for\nthe teacher, as in TMP. For its specific configuration, ROA\nwas trained using the setup from [18].\nTerrain Level Convergence: We compared the perfor-\nmance of these methods in terms of terrain level, as shown\nin Fig. 4. The curves, averaged over 10 seeds, represent\nthe average terrain level of all agents at each training\nstep, with the shaded area indicating the standard deviation.\nFig. 3: Illustration of CASBOT SE.\nWith generative adversarial training and auxiliary task\nlearning, the student policy closely matches the teacher\npolicy. In contrast, the student policy without auxiliary task\nlearning takes longer to converge. ROA\u2019s terrain level curve\ndoes not effectively capture traversability due to its policy\nswitching, but it shows a slight performance improvement\nover the baseline after 5000 iterations. The final learning\nperformance of TMP improves by 26.39% and 17.20%\ncompared to TS and ROA. We believe that improving the\nteacher policy, particularly enhancing the network\narchitecture, can further benefit the student policy through\nTMP.\nFig. 4: Learning Curves of average terrain level (x-axis: training step; y-axis: terrain level; curves: Oracle (Teacher), Student with aux (Ours), Student without aux, Baseline, TS, ROA).\nTracking Accuracy: We evaluated velocity tracking across\ndiverse terrains using 10240 uniformly distributed robots.\nLinear velocity commands were sampled from [-1.5, 1.5]\nm/s, and tracking errors were computed as ||CMDxy\u2212vxy||2.\nFig. 5 presents the average tracking performance, with\ntracking errors across 4 terrain types shown on the y-axis.\nEach subplot compares linear velocity tracking errors over\n10 seeds. TMP outperforms TS and ROA, reducing errors\nby 44.16% and 30.25% on discrete obstacles, 40.53% and\n28.16% on rough slopes, 39.17% and 23.71% on slopes,\n27.74% and 26.66% on stairs. While ROA achieves com-\nparable terrain level performance to the baseline, it exhibits\nhigher tracking accuracy.\nFig. 5: Evaluation of average tracking error in 4 types of terrains (panels: Discrete, Rough, Slope, Stair; methods: TMP, Baseline, ROA, TS).\nCoT: We evaluate the policy\u2019s Cost of Transport (CoT),\ndefined as in [13], which quantifies the energy efficiency of\nthe policy in controlling the robot. We evaluate each policy\nusing the same speed sampling and environmental setup\nas described in Section III-B. Fig. 6 shows that the\nstudent policy trained with TMP exhibits the lowest CoT.\nSpecifically, across 4 different terrains, TMP achieves CoT\nreductions of 26.67% and 2.384% on discrete obstacles,\n16.89% and 2.205% on rough slopes, 5.870% and 14.35%\non slopes, 13.65% and 6.604% on stairs, compared to TS\nand ROA, respectively. The student policies trained with TS\nand ROA exhibit a higher CoT, likely due to their reliance\non supervised learning, which limits exploration capability.\nIn contrast, TMP enables the student to dynamically learn\nthe teacher\u2019s strategy within the simulation environment.\nFig. 6: Evaluation of average CoT in 4 types of terrains (panels: Discrete, Rough, Slope, Stair; methods: TMP, Baseline, ROA, TS).\nC. Real-World Experiments\nTo evaluate the effectiveness and robustness of our control\nstrategy, we conducted real-world experiments using the\nCASBOT SE humanoid robot. The experiments involved\nthree distinct scenarios: traversing a sloped surface, walking\nover a rough brick-paved terrain, and responding to external\ndisturbances. These tests demonstrate the robot\u2019s adaptabil-\nity to challenging environments and its ability to maintain\nstability under perturbations.\nFig. 2 presents sequential snapshots of the experiments.\nIn the initial sequence, the robot successfully traverses the\nsloped terrain by dynamically adjusting its joint configura-\ntions, particularly the foot pitch joints, to maintain balance.\nTaking this terrain as an example, the torque variations of the\nleft leg are plotted on the right side of Fig. 2, where the six\nrows from top to bottom correspond to joints 1 to 6. It can be\nobserved that during the transition from flat ground to a slope\nand back, the hip and knee joints exhibit relatively small\ntorque variations, whereas the ankle pitch joint (second-\nto-last row) undergoes significant changes. This indicates\nthat the proposed strategy ensures stable locomotion while\nachieving a lower Cost of Transport (CoT) and enhanced\nterrain adaptability. 
Moreover, the system effectively regu-\nlates joint torques in response to terrain inclination changes,\nmitigating undesired forward/backward tilting motions.\nThe second-row sequence shows the robot traversing\na brick-paved terrain with discontinuous ground contact.\nThrough adaptive foot placement and joint stiffness modula-\ntion, the robot compensates for terrain irregularities, main-\ntaining upper body stability and dynamic balance despite\nunpredictable contact forces.\nIn the third row, the robot is subjected to external distur-\nbances applied via sudden pushes. Upon receiving a perturba-\ntion, the robot swiftly reacts by adjusting its stepping strategy\nand redistributing its center of mass to regain balance. The\ncontrol policy enables rapid recovery by leveraging pro-\nprioceptive feedback, ensuring stability even under sudden\nexternal forces.\nThese experiments validate the effectiveness of our ap-\nproach in handling complex terrains and disturbances, high-\nlighting the generalizability of our control strategy.\nIV. CONCLUSIONS AND FUTURE WORKS\nThe significance of this work lies in the novel framework\ndesign, which departs from the traditional teacher-student\nparadigm by eliminating the encoder structure and using\na generative adversarial mechanism for knowledge transfer.\nIt enables developers to easily train a teacher policy and\ntransfer it to existing networks, improving performance with-\nout extensive restructuring. The framework also supports the\nfuture integration of exteroception modules, such as vision,\nwithout requiring retraining of the teacher policy.\nREFERENCES\n[1] Gerardo Bledt, Patrick M Wensing, Sam Ingersoll, and Sangbae Kim.\nContact model fusion for event-based locomotion in unstructured\nterrains.\nIn 2018 IEEE International Conference on Robotics and\nAutomation (ICRA), pages 4399\u20134406. IEEE.\n[2] Yukai Gong, Ross Hartley, Xingye Da, Ayonga Hereid, Omar Harib,\nJiunn-Kai Huang, and Jessy Grizzle.\nFeedback control of a cassie\nbipedal robot: Walking, standing, and riding a segway.\nIn 2019\nAmerican Control Conference (ACC), pages 4559\u20134566. IEEE.\n[3] Fabian Jenelten, Jemin Hwangbo, Fabian Tresoldi, C Dario Bellicoso,\nand Marco Hutter. Dynamic locomotion on slippery ground. IEEE\nRobotics and Automation Letters, 4(4):4170\u20134176, 2019.\n[4] Jacob Reher, Wen-Loong Ma, and Aaron D Ames. Dynamic walking\nwith compliance on a cassie bipedal robot. In 2019 18th European\nControl Conference (ECC), pages 2589\u20132595. IEEE.\n[5] Michele Focchi, Romeo Orsolino, Marco Camurri, Victor Barasuol,\nCarlos Mastalli, Darwin G Caldwell, and Claudio Semini. Heuristic\nplanning for rough terrain locomotion in presence of external distur-\nbances and variable perception quality. Advances in robotics research:\nFrom lab to market: ECHORD++: Robotic science supporting inno-\nvation, pages 165\u2013209, 2020.\n[6] Min Sung Ahn.\nDevelopment and Real-Time Optimization-based\nControl of a Full-sized Humanoid for Dynamic Walking and Running.\nUniversity of California, Los Angeles, 2023.\n[7] Patrick M Wensing and David E Orin.\nDevelopment of high-span\nrunning long jumps for humanoids.\nIn 2014 IEEE international\nconference on robotics and automation (ICRA), pages 222\u2013227. IEEE,\n2014.\n[8] Matthew Chignoli, Donghyun Kim, Elijah Stanger-Jones, and Sangbae\nKim. The mit humanoid robot: Design, motion planning, and control\nfor acrobatic behaviors. In 2020 IEEE-RAS 20th International Con-\nference on Humanoid Robots (Humanoids), pages 1\u20138. 
IEEE, 2021.\n[9] Tuomas Haarnoja, Sehoon Ha, Aurick Zhou, Jie Tan, George Tucker,\nand Sergey Levine. Learning to walk via deep reinforcement learning.\narXiv preprint arXiv:1812.11103, 2018.\n[10] Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy, Dario Bellicoso,\nVassilios Tsounis, Vladlen Koltun, and Marco Hutter.\nLearning\nagile and dynamic motor skills for legged robots. Science Robotics,\n4(26):eaau5872, 2019.\n[11] Zhaoming Xie, Patrick Clary, Jeremy Dao, Pedro Morais, Jonanthan\nHurst, and Michiel Panne.\nLearning locomotion skills for cassie:\nIterative design and sim-to-real. In Conference on Robot Learning,\npages 317\u2013329. PMLR, 2020.\n[12] Xinyang Gu, Yen-Jen Wang, and Jianyu Chen.\nHumanoid-gym:\nReinforcement learning for humanoid robot with zero-shot sim2real\ntransfer. arXiv preprint arXiv:2404.05695, 2024.\n[13] Joonho Lee, Jemin Hwangbo, Lorenz Wellhausen, Vladlen Koltun,\nand Marco Hutter. Learning quadrupedal locomotion over challenging\nterrain. Science robotics, 5(47):eabc5986, 2020.\n[14] Ashish Kumar, Zipeng Fu, Deepak Pathak, and Jitendra Malik.\nRma: Rapid motor adaptation for legged robots.\narXiv preprint\narXiv:2107.04034, 2021.\n[15] Yikai Wang, Zheyuan Jiang, and Jianyu Chen.\nLearning robust,\nagile, natural legged locomotion skills in the wild.\narXiv preprint\narXiv:2304.10888, 2023.\n[16] Wandi Wei, Zhicheng Wang, Anhuan Xie, Jun Wu, Rong Xiong, and\nQiuguo Zhu. Learning gait-conditioned bipedal locomotion with motor\nadaptation.\nIn 2023 IEEE-RAS 22nd International Conference on\nHumanoid Robots (Humanoids), pages 1\u20137. IEEE, 2023.\n[17] Mike Zhang, Yuntao Ma, Takahiro Miki, and Marco Hutter. Learning\nto open and traverse doors with a legged manipulator. arXiv preprint\narXiv:2409.04882, 2024.\n[18] Zipeng Fu, Xuxin Cheng, and Deepak Pathak.\nDeep whole-body\ncontrol: learning a unified policy for manipulation and locomotion.\nIn Conference on Robot Learning, pages 138\u2013149. PMLR, 2023.\n[19] Hongxi Wang, Haoxiang Luo, Wei Zhang, and Hua Chen. Cts: Con-\ncurrent teacher-student reinforcement learning for legged locomotion.\nIEEE Robotics and Automation Letters, 2024.\n[20] Jonathan Ho and Stefano Ermon.\nGenerative adversarial imitation\nlearning.\nAdvances in neural information processing systems, 29,\n2016.\n[21] Alejandro Escontrela, Xue Bin Peng, Wenhao Yu, Tingnan Zhang,\nAtil Iscen, Ken Goldberg, and Pieter Abbeel.\nAdversarial motion\npriors make good substitutes for complex reward functions. In 2022\nIEEE/RSJ International Conference on Intelligent Robots and Systems\n(IROS), pages 25\u201332. IEEE, 2022.\n[22] R Caruana. Multitask learning: A knowledge-based source of inductive\nbias1.\nIn Proceedings of the Tenth International Conference on\nMachine Learning, pages 41\u201348. Citeseer, 1993.\n[23] Marc Peter Deisenroth, Peter Englert, Jan Peters, and Dieter Fox.\nMulti-task policy search for robotics.\nIn 2014 IEEE international\nconference on robotics and automation (ICRA), pages 3876\u20133881.\nIEEE, 2014.\n[24] Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom\nSchaul, Joel Z Leibo, David Silver, and Koray Kavukcuoglu. Rein-\nforcement learning with unsupervised auxiliary tasks. arXiv preprint\narXiv:1611.05397, 2016.\n[25] Evan Shelhamer, Parsa Mahmoudieh, Max Argus, and Trevor Darrell.\nLoss is its own reward: Self-supervision for reinforcement learning.\narXiv preprint arXiv:1612.07307, 2016.\n[26] Jan Matas, Stephen James, and Andrew J Davison. 
Sim-to-real rein-\nforcement learning for deformable object manipulation. In Conference\non Robot Learning, pages 734\u2013743. PMLR, 2018.\n[27] Xinyang Gu, Yen-Jen Wang, Xiang Zhu, Chengming Shi, Yanjiang\nGuo, Yichen Liu, and Jianyu Chen. Advancing humanoid locomotion:\nMastering challenging terrains with denoising world model learning.\narXiv preprint arXiv:2408.14472, 2024.\n[28] Nikita Rudin, David Hoeller, Philipp Reist, and Marco Hutter. Learn-\ning to walk in minutes using massively parallel deep reinforcement\nlearning. In Conference on Robot Learning, pages 91\u2013100. PMLR,\n2022.\nAPPENDIX\nTABLE I: Structure of TMP Policy Networks. The parts\nmarked with an asterisk in the Aux network indicate the\nlayers shared with the Actor.\nNetwork\nStructure\nTeacher\nActor\n[1440, 768, 512, 256, 128, 64]\nCritic\n[768, 256, 128]\nStudent\nActor\n[1440, 768, 256]\nCritic\n[768, 256, 128]\nAux\n[1440\u2217, 768]\nDisc\n[256, 256, 128]\nTABLE II: Summary of Observation Space. The table con-\ntains proprioception observations, privileged observations,\nand auxiliary observations. The table also details their di-\nmensions.\nComponent\nDims\nProp.\nPriv.\nAux.\nClock Input\n2\n\u2713\nCommand\n3\n\u2713\nLast Actions\n12\n\u2713\nJoint Position\n12\n\u2713\n\u2713\nJoint Velocity\n12\n\u2713\n\u2713\nBase Angular Velocity\n3\n\u2713\n\u2713\nEuler Angles\n3\n\u2713\n\u2713\nAction Difference\n12\n\u2713\n\u2713\nBase Linear Velocity\n3\n\u2713\n\u2713\nFriction Coefficient\n1\n\u2713\n\u2713\nContact Phase\n2\n\u2713\n\u2713\nDisturbance Force\n2\n\u2713\nDisturbance Torque\n3\n\u2713\nGait Phase\n2\n\u2713\nBody Weight\n1\n\u2713\nHeight Map\n187\n\u2713\nTABLE III: Overview of Reward Function Composition.\nThis indicates the formula of the reward function and the\ncorresponding weight coefficients. The parts marked in red\nare used only during the student training phase.\nReward Term\nFormula\nWeight\nBase Orientation\n\u03d5(P b\n\u03b1\u03b2, 5)\n0.5\nDefault Joint Position\n\u03d5(\u03b8t \u2212\u03b80, 2)\n0.8\nBase Height Tracking\n\u03d5(P b\nz \u22121.1, 100)\n0.2\nVelocity Mismatch\n\u03d5( \u02d9P b\nz,\u03b3,\u03b2 \u2212CMDz,\u03b3,\u03b2, 5)\n0.5\nLin. Velocity Tracking\n\u03d5( \u02d9P b\nxyz \u2212CMDxyz, 5)\n1.4\nAng. Velocity Tracking\n\u03d5( \u02d9P b\n\u03b1\u03b2\u03b3 \u2212CMD\u03b1\u03b2\u03b3, 5)\n1.1\nContact Forces\nmax(FL,R \u2212400, 0, 100)\n-0.05\nContact Pattern\n\u03d5(Ip(t) \u2212Id(t), \u221e)\n1.4\nFeet Clearance\nrfc\n1.6\nCollision\nncollision\n-0.5\nAction Smoothness\n\u2225at \u22122at\u22121 + at\u22122\u22252\n-0.003\nJoint Acceleration\n\u2225\u00a8\u03b8\u22252\n2\n-1e-9\nJoint Torque\n\u2225\u03c4\u22252\n2\n-1e-9\nJoint Power\n|\u03c4|\u2225\u02d9\u03b8\u2225\n2 \u00b7 10\u22125\nDisc. Reward\nrdisc\n2 \u00b7 10\u22124\nTABLE IV: Overview of Domain Randomization. 
Additive\nrandomization increments the parameter by a value within\nthe specified range while scaling randomization adjusts it by\na multiplicative factor from the same range.\nRandomized Variable\nUnit\nRange\nOperation\nFriction\n-\n[0.2, 1.3]\nScaling\nRestitution\n-\n[0.0, 0.4]\nAdditive\nPush Interval\nseconds\n[8, \u221e]\nScaling\nPush Velocity (XY)\nm/s\n[0, 0.4]\nAdditive\nPush Angular Velocity\nrad/s\n[0, 0.6]\nAdditive\nBase Mass\nkg\n[-4.0, 4.0]\nAdditive\nCOM Displacement\nmeters\n[-0.06, 0.06]\nAdditive\nStiffness Multiplier\n%\n[0.8, 1.2]\nScaling\nDamping Multiplier\n%\n[0.8, 1.2]\nScaling\nTorque Multiplier\n%\n[0.8, 1.2]\nScaling\nLink Mass Multiplier\n%\n[0.8, 1.2]\nScaling\nMotor Offset\nradians\n[-0.035, 0.035]\nAdditive\nJoint Friction\n-\n[0.01, 1.15]\nScaling\nJoint Damping\n-\n[0.3, 1.5]\nScaling\nJoint Armature\n-\n[0.008, 0.06]\nScaling\nLag Timesteps\nsteps\n[5, 20]\nAdditive\nObservation Motor Lag\nsteps\n[5, 20]\nAdditive\nObservation Actions Lag\nsteps\n[2, 5]\nAdditive\nObservation IMU Lag\nsteps\n[1, 10]\nAdditive\nCoulomb Friction\n-\n[0.1, 0.9]\nScaling\nViscous Friction\n-\n[0.05, 0.1]\nScaling\nTABLE V: Algorithm Environment Parameters. The parts\nmarked in red are used only during the student training phase.\nEnvironment Setting\nValue\nObservation Frames\n15\nPrivileged Observation Frames\n3\nNumber of Single Observation\n47\nNumber of Single Privileged Observation\n213\nNumber of Single Auxiliary Observation\n48\nHeight Measurement Range\n1.6m \u00d7 1m\nNumber of Actions\n12\nNumber of Environments\n10240\nStatic Friction Coefficient\n0.6\nDynamic Friction Coefficient\n0.6\nTerrain Block Size\n8m \u00d7 8m\nTerrain Levels\n20\nNumber of Terrains per Level\n20\nTABLE VI: Algorithm Framework Parameters. The parts\nmarked in red are used only during the student training phase.\nAlgorithm Parameter Setting\nValue\nBatch Size\n10240 \u00d7 24\nMini-batch Size\n10240 \u00d7 4\nValue Function Loss Coefficient \u03bbv\n1.0\nEntropy Coefficient \u03bbe\n0.001\nDisc. Loss Coefficient \u03bbdisc\n0.05\nAux. Loss Coefficient \u03bbaux\n0.1\nPrediction Loss Coefficient \u03bbpred\n0.5\nGradient Penalty Coefficient \u03bbgrad\n0.05\nWeight Decay Coefficient \u03bbweight\n0.5\nLearning Rate \u03b1\n1e-3\nLearning Rate Adjustment\nadaptive / fixed\nDesired KL Divergence\n0.01\nClip Parameter\n0.1\nGradient Clipping Max Norm max grad\n1.0\nLearning Iterations per Epoch\n2 / 4\nDiscount Factor \u03b3\n0.995s\nGAE Discount Factor\n0.95" - }, - { - "domain": "Computer Science", - "chunk_type": "general", - "text": "1\nSatellite Federated Fine-Tuning for Foundation\nModels in Space Computing Power Networks\nYan Zhu, Graduate Student Member, IEEE, Jingyang Zhu, Graduate Student Member, IEEE,\nTing Wang, Senior Member, IEEE, Yuanming Shi, Senior Member, IEEE, Chunxiao Jiang, Fellow, IEEE,\nand Khaled B. Letaief, Fellow, IEEE\nAbstract\u2014Advancements in artificial intelligence (AI) and low-\nearth orbit (LEO) satellites have promoted the application of\nlarge remote sensing foundation models for various downstream\ntasks. However, direct downloading of these models for fine-\ntuning on the ground is impeded by privacy concerns and limited\nbandwidth. Satellite federated learning (FL) offers a solution\nby enabling model fine-tuning directly on-board satellites and\naggregating model updates without data downloading. 
Never-\ntheless, for large foundation models, the computational capacity\nof satellites is insufficient to support effective on-board fine-\ntuning in traditional satellite FL frameworks. To address these\nchallenges, we propose a satellite-ground collaborative federated\nfine-tuning framework. The key of the framework lies in how\nto reasonably decompose and allocate model components to\nalleviate insufficient on-board computation capabilities. During\nfine-tuning, satellites exchange intermediate results with ground\nstations or other satellites for forward propagation and back\npropagation, which brings communication challenges due to the\nspecial communication topology of space transmission networks,\nsuch as intermittent satellite-ground communication, short dura-\ntion of satellite-ground communication windows, and unstable\ninter-orbit inter-satellite links (ISLs). To reduce transmission\ndelays, we further introduce tailored communication strategies\nthat integrate both communication and computing resources.\nSpecifically, we propose a parallel intra-orbit communication\nstrategy, a topology-aware satellite-ground communication strat-\negy, and a latency-minimalization inter-orbit communication\nstrategy to reduce space communication costs. Simulation re-\nsults demonstrate significant reductions in training time with\nimprovements of approximately 33%.\nIndex Terms\u2014Satellite federated learning, fine-tuning, founda-\ntion models, edge learning, and satellite communications.\nI. INTRODUCTION\nThe proliferation of low-earth orbit (LEO) satellites, driven\nby advancements in satellite technology, has established them\nas indispensable tools for acquiring high-resolution imagery\nof the Earth\u2019s surface. These data play a pivotal role in Earth\nobservation applications such as environmental monitoring [1]\nYan Zhu, and Ting Wang are with the MoE Engineering Research-\nCenter of Software/Hardware Co-design Technology and Application, the\nShanghai Key Lab. of Trustworthy Computing, East China Normal Uni-\nversity, Shanghai 200062, China (e-mail: 51275902041@stu.ecnu.edu.cn;\ntwang@sei.ecnu.edu.cn).\nYuanming Shi and Jingyang Zhu are with the School of Information Science\nand Technology, ShanghaiTech University, Shanghai 201210, China (e-mail:\nshiym@shanghaitech.edu.cn; zhujy2@shanghaitech.edu.cn).\nChunxiao Jiang is with Beijing National Research Center for Information\nScience and Technology, Tsinghua University, Beijing 100084, China (e-mail:\njchx@tsinghua.edu.cn).\nKhaled B. Letaief is with the Department of Electronic and Computer\nEngineering, The Hong Kong University of Science and Technology, Hong\nKong (e-mail: eekhaled@ust.hk).\nand land cover classification [2]. Simultaneously, artificial\nintelligence (AI) has experienced remarkable growth over\nrecent decades, substantially contributing to the development\nof remote sensing data interpretation. In the remote sensing\nfield, the most commonly used backbone of the learning\nmodels can be categorized into convolutional neural networks\n(CNNs) and Transformers [3]. CNN-based models, with the\nunique architecture of convolutional layers, can automatically\nextract hierarchical and discriminative features and preserve\nspatial information from remote sensing images, such as scene\nclassification [4]. Transformer-based models, such as variants\nof Vision Transformer, capture long-range dependency better\nthan the CNN-based models [5]. 
Transformer-based models\nare applied in many scenarios in the remote sensing field,\nincluding generalist geospatial AI (e.g., Prithvi [6]), visual\ntasks (e.g., SatMAE [7] and SatMAE++ [8]), and spectral data\nanalysis (e.g., SpectralGPT [9]). These foundation models are\npre-trained on large-scale remote sensing datasets and fine-\ntuned on specific tasks to enhance their performance in various\ndownstream tasks.\nTraditional data-driven model training has limitations as\nit needs to download raw satellite data to terrestrial cloud\nservers. Firstly, satellite-ground communication faces short\ncommunication durations due to satellite movement, which\nposes a challenge to the real-time transmission of the original\ndata. Secondly, the high resolution of remote sensing images\nhas made the satellite-ground links (SGL) with only a few\nhundred Mbps become the transmission bottleneck. For in-\nstance, transmitting a scene at 0.5-meter resolution over a 30-\nkilometer view width results in approximately 40 GB of data\ntransfer between satellites and ground stations (GSs). Finally,\nthe direct transmission of original data exposes sensitive\ninformation, leading to serious privacy and security issues.\nAt the same time, the communication and on-orbit computing\ntechnologies of satellite networks are becoming increasingly\nmature, which lays a foundation for the development of\nspace computing power networks (Space-CPN). In Space-\nCPN, satellites are combined with terrestrial devices to form\na network. By performing real-time data processing on-board,\nSpace-CPN can detect and analyze real-time environmental\nchanges, such as forest fires, floods, and climate change.\nHowever, fully leveraging the potential of Space-CPN requires\nan efficient approach to training AI models across distributed\nsatellite nodes while minimizing communication overhead.\nA promising solution to this challenge is satellite federated\nlearning (SFL), which enables satellites to collaboratively\narXiv:2504.10403v1 [cs.LG] 14 Apr 2025\n2\nFig. 1. Architecture and deployment of remote sensing (RS) foundation models. The architecture of the RS foundation models is organized into three distinct\ncomponents: the embedding layer, the backbone network, and the task-specific head. The embedding layer and task-specific head are deployed on satellites,\nwhile the backbone network, which handles the primary computation load, is deployed on the ground server.\ntrain AI models without requiring raw data to be transmitted\nto ground stations [10], [11], [12]. While SFL has been\nexplored in prior studies, most existing work focuses on\nsmall-scale models. Despite the rapid development of Space-\nCPN, onboard computation capabilities remain significantly\nconstrained compared to terrestrial AI clusters, making the\nfine-tuning of large models a major challenge. Satellite feder-\nated learning, as a solution for distributed training, supports\ncollaborative training among satellites [10], [11]. To solve this\nproblem, it is a feasible solution to decompose and deploy the\ncomputing tasks of fine-tuning large models in Space-CPN\nand use the distributed computation capabilities for federated\nfine-tuning [13], [14]. 
According to the distribution of compu-\ntation capabilities in Space-CPN, the modules are deployed as\nfollows: the embedding layer is deployed on satellites because\nsatellites are data sources, the backbone network is deployed\non the terrestrial server, and the classification layer generating\noutputs is deployed on satellites, as illustrated in Fig. 1.\nIn this deployment framework, the primary challenge lies\nin the communication process. During forward propagation,\nsatellites compute local embedding vectors and perform intra-\norbit aggregation. Notably, while raw satellite data can reach\ngigabyte levels in total size, the transmitted embedding vectors\nfor each sample are only a few megabytes, significantly allevi-\nating communication overhead. Once computed, these embed-\nding vectors are transmitted to the ground for feature vector\ncomputation. In this process, satellite-ground communication\nfaces some special challenges compared with transmission on\nthe ground [14]. Firstly, due to the movement of satellites,\nespecially LEO satellites that circle the Earth at high speeds,\nSGLs are dynamic and unstable. Secondly, it takes much time\nto access the GS for each satellite. Besides, the insufficient\nbandwidth of SGLs leads to slower data rates and higher\nlatency. Then, feature vectors are sent back to satellites for\noutput generation, which faces similar questions as embedding\nvector downloading. Finally, satellites update local models and\naggregate a global model through ISLs. The dynamic ISLs\nmake link selection and allocation extremely complex in terms\nof network management.\nTo address the above-mentioned challenges, in this paper,\nwe propose a satellite-ground collaborative federated fine-\ntuning framework, which decomposes the modules of remote\nsensing foundation model based on the distribution of com-\nputing power between space and ground networks. During\nthe federated fine-tuning procedures, we address challenges\nduring satellite-ground communication for embedding vector\ndownloading and feature vector uploading, intra-orbit com-\nmunication for embedding vector aggregation, and inter-orbit\ncommunication for global head aggregation. We design the\ncorresponding collaborative transmission method combined\nwith the process of computation tasks. The major contributions\nof this paper are summarized as follows:\n1) We propose an efficient satellite-ground collaborative\nfederated fine-tuning framework that strategically decom-\nposites the fine-tuning task for remote sensing foundation\nmodels by allocating the modules between GSs and\nsatellites. This framework offers a practical solution for\non-board remote sensing foundation model fine-tuning,\nensuring data privacy and alleviating satellite-ground\ncommunication bottlenecks.\n2) We introduce customized communication strategies for\nsatellite-ground and inter-satellite transmission accord-\ning to the inherent characteristics of transmission links\n(e.g., intermittent SGLs, stable intra-orbit ISLs, and time-\nvarying inter-orbit ISLs). For sporadic satellite-ground\ncommunication, we propose a topology-aware communi-\ncation strategy that allocates transmission tasks for SGLs\naccording to real-time topology and link states. For intra-\norbit communication, we propose a parallel communi-\ncation strategy based on Ring Allreduce for the ring\ntopology of each orbit. For inter-orbit communication, we\npropose a communication algorithm to minimize latency\naccording to link capacity. 
These strategies integrate\ncommunication with computation processes, taking into\naccount factors such as bandwidth limitations, sparse\nconnection, and dynamic topology, optimize data trans-\nmission efficiency, and accelerate model convergence.\n3) We conduct extensive simulations to evaluate the per-\nformance of the proposed framework and communica-\ntion strategies. The simulation results demonstrate that\nthe proposed framework, coupled with the optimized\ncommunication strategies, significantly mitigates training\n3\nlatency, reduces transmission overhead, and improves the\nconvergence speed of the model.\nThe remainder of the paper is organized as follows. Sec-\ntion II introduces relevant research on satellite FL. Section III\nformulates satellite-ground collaborative federated learning,\nintroduces communication networks, and outlines the module\ndecomposition and deployment framework. The workflow of\nthe proposed framework is described in Section IV. In Sec-\ntion V, communication algorithms are designed to optimize\ntransmission procedures. Section VI presents simulation re-\nsults, followed by the conclusion in Section VII.\nII. RELATED WORKS\nA. Centralized Satellite FL\nIn centralized satellite FL, a coordinator, such as a PS,\ncollects model updates from satellites. Satellites perform local\ntraining on their datasets and then send the model updates\nto the central coordinator. Some algorithms employ only\nSGLs during training, where models are trained on-board\nand transmitted via SGLs for global model aggregation at\nthe PS. These algorithms are usually designed to address the\nchallenges of time-varying and unstable SGLs. For instance,\nthe algorithm proposed in [15] adjusted aggregation intervals\nbased on the density of satellite-ground connections to increase\nthe frequency of model updates. [16] proposed an over-the-\nair computation based scheme and optimized both uplink and\ndownlink communications through beamforming. Considering\nthe variations in connection time and connection density\namong satellites, asynchronous FL is introduced to mitigate the\neffects of stragglers, allowing satellites to send model updates\nto the server asynchronously [17]. A compensation mechanism\nproposed in [18] addressed gradient staleness by adjusting the\nupdates according to discrepancies.\nSatellite FL with SGLs presents significant drawbacks, such\nas high latency due to satellite-ground communication delays.\nSo some work adopts ISLs to enable model aggregation among\nsatellites, thereby reducing the data transmission via SGLs.\nSome studies use high-altitude platforms as the coordinator\nfor improved communication efficiency and a more favorable\ncommunication environment, such as [19]. FedISL lever-\nages stable intra-orbit ISLs for intra-orbit model aggregation,\npredicts satellite movement and sends intra-orbit aggregated\nmodels through SGLs for global model aggregation [20].\nAsyncFLEO utilizes satellites as relays, employs asynchronous\nFL, and proposes a model propagation algorithm that incorpo-\nrates satellite grouping and stale update discounts to accelerate\nconvergence, improve model accuracy, and balance the idle\ntime and model staleness [21].\nB. 
Decentralized Satellite FL\nTo eliminate the reliance on the central coordinator, de-\ncentralized satellite FL is proposed, where satellites exchange\nmodel parameters directly with other satellites through ISLs.\nIn this context, the primary focus is on designing inter-orbit\ntransmission schemes aimed at minimizing transmission time\nor energy consumption. The scheme in [22] aggregates a\nglobal model at each training round and strives to minimize\nenergy consumption. Some decentralized Satellite FL incorpo-\nrates the gossip learning, where satellites communicate with\ntheir neighbors for partial model aggregation at each training\nround [23], [24], [25]. [24] offloads data through ISLs to bal-\nance data distribution and optimize delay and accuracy within\ncomputation and communication power constraints. DFedSat\nin [25] reduces communication costs and improves system\nrobustness by implementing a self-compensation mechanism\nto address packet loss . [26] proposed a number control scheme\nto minimize the convergence time of decentralized satellite FL\nthrough optimization of constellation configurations.\nNumerous efforts have been made to accelerate the con-\nvergence of satellite FL by addressing communication chal-\nlenges such as intermittent satellite-ground communication\nand dynamic inter-satellite communication. However, most\nexisting works rely on satellites to train models and ignore the\nchallenges of on-board large model training stemming from\ninsufficient computation capacity. To overcome these limita-\ntions, we propose a satellite-ground collaborative federated\nfine-tuning framework to train large models.\nIII. SYSTEM MODEL\nConsider a general Walker constellation, which has remark-\nable advantages in global coverage and is widely utilized\nin various communication and LEO systems like Starlink,\nOneWeb, and Kuiper [27]. The Walker constellation is for-\nmulated by the parameters (M/P/F, H, I), where I is the\ninclination, M is the total number of satellites, P is the\nnumber of orbit planes, F is the phase factor of two adjacent\norbits, and H is the altitude of orbits. In the LEO satellite\nconstellation, satellites orbit the Earth at an altitude ranging\napproximately from 500 to 1500 kilometers. Orbit planes and\nsatellites are evenly spaced as depicted in Fig. 2. Each orbit\nencompasses N\n= M/P satellites, and the satellites are\nindexed by the set S = {S11, \u00b7 \u00b7 \u00b7 , S1N, \u00b7 \u00b7 \u00b7 , SP 1, \u00b7 \u00b7 \u00b7 , SP N},\nwhere Sp,n, p \u2208{1, . . . , P}, n \u2208{1, . . . , N} denotes the n-th\nsatellite in the p-th orbit.\nA. Satellite Federated Fine-Tuning\nIn Space-CPN, each satellite stores massive remote sensing\ndata within a Walker constellation, which can be harnessed\nto train AI models with federated fine-tuning for downstream\nremote sensing tasks, including environmental monitoring and\nland cover classification. A satellite FL system comprises\na ground PS responsible for coordination and global model\naggregation, alongside multiple GSs tasked with facilitating\ndata transmission between satellites and the PS. Each satellite\ntrains its local model on the local dataset without transmitting\nthe raw dataset to the GSs. 
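As an illustration of the constellation notation introduced above, the following Python sketch (not from the paper; all names are illustrative) enumerates the index set S = {S_{p,n}} of a Walker constellation parameterized by (M/P/F, H, I):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WalkerConstellation:
    """Walker constellation (M/P/F, H, I)."""
    M: int      # total number of satellites
    P: int      # number of orbit planes
    F: int      # phase factor between adjacent planes
    H: float    # orbit altitude (km)
    I: float    # inclination (degrees)

    @property
    def N(self) -> int:
        # Satellites per orbit, N = M / P (assumed to divide evenly).
        return self.M // self.P

    def satellites(self):
        # Index set S = {S_{p,n}}: the n-th satellite in the p-th orbit,
        # with p in {1, ..., P} and n in {1, ..., N}.
        return [(p, n) for p in range(1, self.P + 1)
                       for n in range(1, self.N + 1)]

# Illustrative values, loosely matching the simulation setup described later.
walker = WalkerConstellation(M=80, P=4, F=1, H=590.0, I=90.0)
assert walker.N == 20 and len(walker.satellites()) == 80
```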
The goal of satellite FL is to\ncollaboratively train a global model \u02c6w to minimize the global\nloss, which is mathematically expressed as:\n\u02c6w = min\nw\nP\nX\ni=1\nN\nX\nj=1\nmi,j\nm L(w; Di,j),\n(1)\nwhere Di,j represents the local dataset on satellite Si,j, mi,j =\n|Di,j| is the size of the local dataset, m = PP\ni=1\nPN\nj=1 mi,j is\nthe total number of data samples, and the function L(\u00b7) denotes\n4\nGS\nGS\nFig. 2. Overview of the walker constellation. Satellites move around the Earth\nin circular orbits. Multiple GSs are located at different locations.\nthe local loss function dependent on the local dataset Di,j and\nthe local model wi,j. Conventional satellite FL adheres to the\nsubsequent steps:\n1) Initialization: The ground PS transmits the initialized\nmodel w(0) to all satellites, and each satellite initializes its\nlocal model by\nwi,j(0) \u2190w(0).\n(2)\n2) Local Update: At the training round r, each satellite\nupdates its local model through the stochastic gradient descent\n(SGD) algorithm, which is given by\nwi,j(r + 1) = wi,j(r) \u2212\u03b7 \u02dc\u2207L(wi,j(r)),\n(3)\nwhere \u03b7 represents the learning rate and \u02dc\u2207L(\u00b7) denotes the\nstochastic gradient oracle.\n3) Global aggregation: The local models from different\nsatellites are first transmitted to GSs and subsequently relayed\nto the ground PS for global aggregation [28]. The global\naggregation formula is\nw(r + 1) =\nP\nX\ni=1\nN\nX\nj=1\nmi,j\nm wi,j(r + 1).\n(4)\nB. Communication Networks\nIn Space-CPN, communication serves as the fundamental\nenabler for data exchange, resource coordination, and over-\nall system operation. Without communication, the individual\ncomputing nodes in space would be isolated, so we analyze\nthe characteristics of communication networks.\n1) Ground Communication Network:\nDuring federated\nfine-tuning, training information is exchanged between satel-\nlites and GSs for global aggregation. As depicted in Fig. 2,\nthere are G GSs and a PS situated on the ground, and all\nGSs are connected with the PS. The PS serves as the central\ncoordinator in this system, while the GSs act as relays that\nfacilitate communication between the satellites and the PS.\nThe communication links between the GSs and the PS are\nfixed and characterized by high-speed fabric connectivity,\nensuring stable and fast transmission [29]. To represent the\ncharacteristics of the ground communication network, we\nmodel it as a stable star network, as illustrated in Fig. 3(b),\nwhere the PS is the central node and transmission between the\ncentral node and other nodes is fast.\n2) Satellites Communication Network: LEO satellites are\ncapable of communicating with each other through laser ISLs.\nThere exist two distinct types of laser ISLs: intra-orbit ISLs\nfor communication between two adjacent satellites within the\nsame orbit, and inter-orbit ISLs for communication between\ntwo satellites in adjacent orbits. Each satellite Sij typically\nestablishes four ISLs: two intra-orbit ISLs established with\nSi,j\u22121 and Si,j+1 and two inter-orbit ISLs with Si\u22121,j1 and\nSi+1,j2. Laser ISLs are widely utilized as primary inter-\nsatellite communication links by many constellations such\nas Kuiper, Telesat, and Starlink [30]. The power received is\nexpressed by\nPR = PT GT GRLpsLpt,\n(5)\nwhere PT is the transmitting power, GR and GT are the\nreceiving and transmitting efficiencies, Lps is the pointing\nloss, and Lpt is the path loss [31]. 
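Before detailing the inter-satellite links, the conventional satellite FL round summarized in Eqs. (2)-(4) can be sketched in a few lines of NumPy; the gradient oracle and all function names below are illustrative placeholders rather than the paper's implementation:

```python
import numpy as np

def satellite_fl_round(local_models, local_datasets, grad_oracle, lr=1e-2):
    """One conventional satellite FL round (Eqs. (2)-(4)).

    local_models:   {(p, n): np.ndarray}  current local parameters w_{p,n}
    local_datasets: {(p, n): dataset}     local data D_{p,n}; len() gives m_{p,n}
    grad_oracle:    callable(w, D) -> stochastic gradient of L(w; D)
    """
    # Local update (Eq. (3)): w <- w - eta * grad L(w; D)
    for key, w in local_models.items():
        local_models[key] = w - lr * grad_oracle(w, local_datasets[key])

    # Global aggregation (Eq. (4)): weight each local model by m_{i,j} / m
    sizes = {key: len(D) for key, D in local_datasets.items()}
    m = sum(sizes.values())
    w_global = sum((sizes[key] / m) * w for key, w in local_models.items())

    # The aggregated model is sent back for the next round (Eq. (2) at r > 0).
    for key in local_models:
        local_models[key] = w_global.copy()
    return w_global
```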
The pointing loss due to\nmisalignment from satellite jitter and tracking noise can be\nexpressed as\nLps = e\u2212GT \u03b82\nT \u2212GR\u03b82\nR,\n(6)\nwhere \u03b8T and \u03b8R are the pointing error angles. The free-space\npath loss is calculated as\nLpt =\n\u0012 \u03bb\n4\u03c0l\n\u00132\n,\n(7)\nwhere \u03bb is the wavelength and l is the distance. The noise\nis the additive white Gaussian noise with variance \u03c32, so the\nachievable data rate of laser ISLs is given by\nR = B log2(1 + SNR) = B log2(1 + PR\n\u03c32 ),\n(8)\nwhere B is the bandwidth and SNR is the signal-to-noise\nratio [32]. Laser ISLs have high data rates and can pro-\nvide a transmission speed of 100 Gbps [33]. The satellite\ncommunication network composed of ISLs has the following\ncharacteristics:\n\u2022 Stable intra-orbit communication. The intra-orbit ISLs\nbetween adjacent satellites ensure stable and consistent\nconnectivity, as these adjacent satellites within the same\norbit possess identical velocities and are situated at rel-\natively short distances from one another. All satellites\nwithin the same orbit are interconnected end to end,\nforming a stable ring structure.\n\u2022 Unstable inter-orbit communication. Some satellites\nin adjacent orbits move in opposite directions in the\nWalker constellation, as depicted in Fig. 2. Inter-orbit\nISLs between these satellites, known as cross-seam inter-\norbit ISLs, face challenges such as the Doppler shift and\nlimited communication windows due to the rapid move-\nment of satellites [34]. Consequently, cross-seam ISLs are\n5\n(a) Satellite network: a set of rings.\n(b) Ground network: star topology.\n(c) Satellite-ground network: time-varying topology.\ngenerally not utilized for inter-satellite communication\nbecause of their instability.\nLeveraging these characteristics, satellites within the same\norbit can be modeled as a ring topology. The LEO constella-\ntion comprises a collection of such rings, with adjacent rings\ninterconnected by inter-orbit ISLs, excluding cross-seam ISLs,\nas depicted in Fig. 3(a).\n3) Satellite-Ground Communication Network:\nSatellite-\nground communication is essential for cooperation during fine-\ntuning. Satellite-ground channels can be modeled as frequency\nselective channels with path loss and attenuation [35]. The\ndefinition of free-space path loss remains consistent with\nEquation 7. The attenuation due to rain is presented as\nLrain = KR\u03b1\nr lr,\n(9)\nwhere K and \u03b1 are the rain attenuation coefficients, Rr is the\nrain rate, and lr is the path length in the rain area [36]. The\nreceived power for each path is expressed as\nPR(i) = \u02c6Pt\n \u02c6\u03bb\n4\u03c0\n!2 \f\f\f\f\nrie\u2212\u2206\u03d5i\nli\n\f\f\f\f\n2\n,\n(10)\nwhere \u02c6Pt is the transmit power of SGLs, \u02c6\u03bb is the wavelength\nof satellite-ground signals, ri is the reflection coefficient, and\n\u2206\u03d5i is the phase difference respect to the direct path. The\ntotal multipath received power for N paths is derived as\nPmp = 10 log10\n\uf8ee\n\uf8f0Pt\n\u0012 \u03bb\n4\u03c0\n\u00132 \f\f\f\f\f\n1\nl1\n+\nN\nX\ni=2\nrie\u2212j\u2206\u03d5i\nli\n\f\f\f\f\f\n2\uf8f9\n\uf8fb.\n(11)\nThe total received power is then:\n\u02c6Pr = Pmp \u2212Lpt \u2212Lrain.\n(12)\nSGLs can provide transmission speeds on the order of Mbps,\nwhich is insufficient for satellite data transmission. Due to the\nmovement of satellites, the satellite-ground topology exhibits\nthe following characteristics:\n\u2022 Intermittent connectivity. 
A satellite becomes visible to\na ground station (GS) when it orbits overhead. The period\nduring which the satellite is visible to a GS is termed the\ncommunication window. Since satellites are not always\nwithin the visible range of GSs due to their movement, it\nmay take hours to move from one visible range to another.\nConsequently, SGLs experience intermittent connectivity,\nresulting in sparse satellite-ground links.\n\u2022 Short communication window. Satellites only commu-\nnicate with the GS when it is in the visible range. How-\never, due to their rapid movement, LEO satellites quickly\npass through this range, leading to short communication\nwindows, typically lasting only a few minutes.\nWhen analyzing SGLs, we sample some time points dis-\ncretely T = {T0, T1, T2, ...} over a period. A satellite and a\nGS can communicate between Ti and Ti+1 if at least one link\nis feasible at Ti. The interval between two adjacent time points\nis 1 minute. At different time points, the topology of SGLs\nvaries. Thus, we model satellite-ground communication links\nas time-varying topology, as depicted in Fig. 3(c).\nC. Satellite Federated Fine-Tuning for Remote Sensing Foun-\ndation Models\nRemote sensing foundation models (RS FMs) are specif-\nically designed to analyze RS data from remote sensing\ndevices, such as satellites and drones. These huge FMs are\npre-trained and can be fine-tuned to enhance their accuracy\nand efficiency in specific downstream tasks. Typically, RS\nFMs consist of the embedding layer, backbone network, and\ntask-specific head, as shown in Fig. 1. Take SpectralGPT\nas an example, the workflow is as follows [9]. Images are\nsplit into patches and the embedding layer embeds image\npatches, adds position embeddings, and outputs embedding\nvectors. Then, the embedding vectors are fed into the backbone\nnetwork to extract feature vectors. The computation load of the\nbackbone network accounts for the vast majority of the total\ncomputation load of the model. Finally, for each downstream\ntask, a learnable task-specific head is connected to the model.\nThis head receives feature vectors as inputs and outputs results.\nDuring fine-tuning, only head parameters wH are updated,\nwhile other parameters remain fixed in each training round r.\nwH\ni,j(r + 1) = wH\ni,j(r) \u2212\u03b7\u2207L(wH\ni,j(r)),\n(13)\nwhere \u03b7 represents the learning rate, and \u2207L() denotes the\ngradient of the loss function. Since only the head parameters\nare updated, only these parameters need to be aggregated:\nwH(r + 1) =\nP\nX\ni=1\nN\nX\nj=1\nmi,j\nm wH\ni,j(r + 1).\n(14)\nThe goal of federated fine-tuning is to find optimal head\nparameters, similar to satellite FL.\n\u02c6wH = min\nwH\nP\nX\ni=1\nN\nX\nj=1\nmi,j\nm L(wH; Di,j).\n(15)\nOur primary objective is to address the computing power\nconstraints of fine-tuning FMs on satellites. To achieve this,\nwe employ a computation task decomposition technique within\nour framework, effectively splitting the FM into distinct\ncomponents and deploying them based on their computation\nrequirements and distributed computation network. To protect\n6\ndata privacy, we deploy the embedding layer and the task-\nspecific head on the satellites. The backbone network, which\nbears most of the computational workload (approximately\n99%), is deployed on the ground. 
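As a concrete, deliberately simplified illustration of this decomposition, the PyTorch-style sketch below freezes the embedding layer and backbone and trains only the task-specific head, mirroring Eqs. (13)-(14); the module and function names are illustrative and are not taken from SpectralGPT or the authors' code:

```python
import torch
import torch.nn as nn

class SplitRSModel(nn.Module):
    """Illustrative split of an RS foundation model into the three parts
    described above: embedding layer and head on the satellite, backbone
    on the ground PS. Only the head remains trainable."""
    def __init__(self, embed: nn.Module, backbone: nn.Module, head: nn.Module):
        super().__init__()
        self.embed, self.backbone, self.head = embed, backbone, head
        for p in list(self.embed.parameters()) + list(self.backbone.parameters()):
            p.requires_grad_(False)   # frozen modules

    def satellite_embed(self, x):
        # Satellite side: raw image patches -> embedding vectors E
        return self.embed(x)

    def ground_features(self, E):
        # Ground side: embedding vectors E -> feature vectors F (bulk of compute)
        with torch.no_grad():
            return self.backbone(E)

    def satellite_head_step(self, F, y, loss_fn, optimizer):
        # Satellite side: head-only update, Eq. (13)
        loss = loss_fn(self.head(F), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

def aggregate_heads(head_state_dicts, weights):
    """Dataset-size-weighted head aggregation across satellites (Eq. (14)).
    Assumes all head parameters are floating-point tensors."""
    return {k: sum(w * sd[k] for sd, w in zip(head_state_dicts, weights))
            for k in head_state_dicts[0]}
```

Since only the head is updated, only its parameters ever need to be aggregated over ISLs, which keeps the aggregation traffic small compared with exchanging raw imagery or full models.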
This significantly reduces\nthe computational burden on the satellites, allowing them to\nhandle only about 1\u2030 of the overall computational tasks.\nDuring the forward propagation process, embedding vectors\nand feature vectors are exchanged between the satellites and\nthe ground. Notably, the size of these vectors is approximately\n20 MB for a sample, a reduction of about a thousand times\ncompared to the raw satellite imagery. This drastic reduction\nin data size not only enhances communication efficiency but\nalso alleviates bandwidth requirements, making the system\nmore cost-effective and practical for deployment in space-\nconstrained environments.\nBased on the scheme of task decomposition, the fine-tuning\nprocess can be systematically divided into two distinct stages:\nfeature extraction from satellites to the ground and model\nupdate from the ground to satellites.\n1) Feature Extraction: Satellites begin by processing their\nrespective local data through the local embedding layer. The\ngenerated embedding vectors are then transmitted from the\nsatellites to the PS located on the ground. The PS further ex-\ntracts features from embedding vectors by transformer blocks.\nThis ensures that only processed data is sent over the com-\nmunication link, thereby conserving bandwidth and enhancing\nsecurity. Overall, this phase focuses on transforming raw\nsatellite imagery into actionable features, with data flowing\nfrom satellites to the ground.\n2) Model Update: The feature vectors extracted by the\ntransformer blocks at the ground PS are sent back to the\nrespective satellites. Upon receiving the feature vectors, each\nsatellite processes these vectors and outputs results, com-\nputes local loss, and updates local task-specific heads for\ndownstream tasks. Finally, all the updated task-specific head\nparameters from various satellites are aggregated through ISLs.\nIn this phase, data flows from the ground back to the satellites,\nfacilitating a closed-loop learning process.\nRepeat these two stages until the model converges. By\ndecomposing the fine-tuning process into feature extraction\nand model update stages, we effectively manage computational\nresources, ensure data privacy, and ensure efficient model\nsynchronization across a distributed satellite network.\nIV. SATELLITE-GROUND COLLABORATIVE FEDERATED\nFINE-TUNING\nThis section provides an overview of the proposed frame-\nwork, designed to address computational constraints in fine-\ntuning large foundation models on satellites through a task\ndecomposition mechanism. Large FMs require substantial\ncomputational resources for fine-tuning, and the limitation\nin on-board computing power becomes particularly apparent\nwhen fine-tuning these models. To address these challenges,\nwe propose a task decomposition mechanism to partition the\nmodel and allocate different parts of the model to satellites\nand the PS, respectively. The trainable task-specific head and\nfrozen embedding layer are deployed on satellites, while the\nfrozen backbone network is deployed on the ground. This\n(a) Feature extraction.\n(b) Model update.\nFig. 4. Workflow of satellite-ground collaborative federated fine-tuning.\nmechanism significantly reduces the on-board computation\nload and optimizes the combination of transmission and\ncomputation process. We elaborate on the two stages briefly\nintroduced in the section III-C. Each training round is divided\ninto two phases, feature extraction, and model update, as\nillustrated in Fig. 4.\nA. 
Feature Extraction\n1) Intra-Orbit Embedding Vector Broadcast: First, each\nsatellite Sp,n computes embedding vectors Ep,n utilizing data\nstored locally. Then, all computed embedding vectors are\nbroadcast within each orbit so that any satellite connected to\na GS can transmit intra-orbit data to the ground for forward\npropagation. This transmission scheme significantly reduces\nwaiting times compared to the conventional transmission ap-\nproach where each satellite only sends its local embedding\nvectors to the ground. We visualize the connection windows\nbetween satellites and GSs throughout the day in Fig. 5 for\ncomparison. Each connection window lasts for approximately\n5 \u223c10 minutes. If data transmission starts at 23:30, some\nsatellites establish their first satellite-ground communication\naround three hours later. So conventional methods waste the\ncommunication resources of other satellites. In contrast, our\nproposed method allows other intra-orbit satellites to transmit\ndata immediately without waiting.\nDue to the instability of inter-orbit ISLs and the volume\nof intra-orbit embedding vectors, embedding vectors are not\nexchanged between different orbits. To optimize this process,\nwe have designed a parallel intra-orbit broadcast strategy,\ndetailed in Section V-A, to make full use of each satellite\u2019s\ncommunication capabilities. After the broadcast, each satellite\nstores the concatenated intra-orbit embedding vectors, repre-\nsented as:\nEp = [Ep,1, ..., Ep,N].\n(16)\n7\n16:24:00 20:24:00 00:24:00 04:24:00 08:24:00 12:24:00 16:24:00\nsatellite 0\nsatellite 1\nsatellite 2\nsatellite 3\nsatellite 4\nsatellite 5\nsatellite 6\nsatellite 7\nsatellite 8\nsatellite 9\nsatellite 10\nsatellite 11\nsatellite 12\nsatellite 13\nsatellite 14\nsatellite 15\nsatellite 16\nsatellite 17\nsatellite 18\nsatellite 19\nFig. 5. Connection windows between satellites and GSs.\n2) Embedding Vectors Transmission: Intra-orbit embedding\nvectors are transmitted from satellites to the PS in each orbit. A\nkey challenge is that satellite-ground communication topology\nis time-varying, as illustrated in section III-B. To address\nthis, we design a topology-aware satellite-ground transmis-\nsion strategy in section V-B, which allows multiple orbits to\nindependently utilize this strategy for parallel transmission.\nThe algorithm takes into account the heterogeneity of different\nSGLs and allocates different transmission tasks to SGLs. Thus,\nthe total transmission time Tem is determined by the maximum\ntransmission time across all orbits:\nTem = max(T 1\nem, \u00b7 \u00b7 \u00b7 , T P\nem).\n(17)\n3) Terrestrial Feature Extraction: After receiving embed-\nding vectors, the PS feeds them into the backbone network.\nThe backbone network, consisting of dozens of transformer\nblocks, identifies the most relevant information from the input\nand outputs feature vectors:\nFp = f(Ep) = [Fp,1, ..., Fp,N]\n(18)\nDue to the high computational speed of the PS, the time cost\nof feature extraction is significantly reduced.\n4) Feature Vectors Delivery: Finally, the feature vectors are\ndelivered back to the satellites for the last step of forward\npropagation. The task-specific head receives these feature\nvectors and produces the output. This output serves as the\nbasis for further processing in applications. For instance, in\nclassification tasks, the model selects the class with the highest\nprobability as its final prediction. This process also involves\nsatellite-ground communication. 
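The bookkeeping of this forward pass (Eqs. (16)-(18)) can be summarized in a short sketch; the per-orbit transmission-time estimates and all names below are illustrative:

```python
import torch

def feature_extraction_phase(embeddings_per_orbit, backbone, orbit_tx_times):
    """Per-orbit feature extraction on the ground PS.

    embeddings_per_orbit: {p: [E_{p,1}, ..., E_{p,N}]} local embedding tensors
    backbone:             frozen backbone f(.) running on the PS
    orbit_tx_times:       {p: T^p_em} estimated uplink time of each orbit
    """
    # Eq. (16): concatenate intra-orbit embedding vectors after the broadcast.
    E = {p: torch.cat(chunks, dim=0) for p, chunks in embeddings_per_orbit.items()}

    # Eq. (17): the uplink phase ends when the slowest orbit has finished.
    T_em = max(orbit_tx_times.values())

    # Eq. (18): F_p = f(E_p), computed without gradients on the frozen backbone.
    with torch.no_grad():
        F = {p: backbone(E_p) for p, E_p in E.items()}
    return F, T_em
```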
Similar to the embedding\nvector transmission, we use a topology-aware communication\nstrategy for parallel transmission from the PS to satellites in\nthe orbit p. Receivers then send the corresponding parts Fp,k\nto other satellites Sp,k in the same orbit.\nB. Model Update\n1) Local Model Update: Subsequently, the output is uti-\nlized for back propagation in the model. The loss is computed\nto quantify the difference between the predicted output and\nthe true labels. Once the loss is determined, back propagation\nis performed to calculate the gradients. These gradients are\nthen used to update the model parameters, typically using an\noptimization algorithm such as stochastic gradient descent.\nAfter this step, the local fine-tuning process for a single\nsatellite is completed.\n2) Intra-orbit Head Aggregation: After local fine-tuning,\nthe updated task-specific heads from all satellites within the\nsame orbit are averaged to ensure model convergence in\ncollaborative learning. Satellites exchange information with\neach other in a decentralized way to aggregate these local\nheads. This head aggregation process does not rely on a\nterrestrial coordinator and accounts for the intermittent nature\nof satellite-ground communication. To obtain the global model,\nwe perform hierarchical aggregation through ISLs. Head pa-\nrameters are first aggregated within each orbit\n\u00afwH\np = 1\nN\nN\nX\nn=1\n\u00afwH\npn,\n(19)\nand then aggregated across different orbits. Due to the\nmore stable communication conditions maintained by satellites\nwithin the same orbit, intra-orbit head aggregation employs a\nparallel communication strategy similar to that used in the\nembedding vector broadcast. This approach ensures efficient\nand lower-cost transmission times.\n3) Inter-orbit Head Aggregation: Intra-orbit heads from all\norbits are aggregated to form a global head.\n\u00afwH = 1\nP\nP\nX\np=1\n\u00afwH\np .\n(20)\nHead parameters are transmitted from orbit 1 to orbit P\nthrough inter-orbit ISLs for global aggregation. To ensure\nefficient convergence and communication, the inter-orbit ISLs\nare carefully selected for transmission between two adjacent\norbits. A path selection algorithm is proposed in section V-C,\nwhich can adapt to dynamic inter-orbit ISLs and automatically\nselect the optimal path to minimize transmission latency.\n4) Global Head Broadcast: Finally, the global model is\ntransmitted back from orbit P to orbit 1 via inter-orbit ISLs.\nWithin each orbit, the satellite receiving the global head then\ndisseminates it to other satellites for the next round of training,\nwhere satellites continue to train using their local data. This\nstep ensures that every satellite has access to the latest version\nof the global model. Notably, this can be efficiently achieved\nby simply reversing the direction of transmission during inter-\norbit head aggregation.\nIn conclusion, each training round of the fine-tuning process\ncomprises two primary phases: feature extraction and model\nupdate. These stages are repeated until the model converges.\nThe entire fine-tuning procedure is delineated in Algorithm 1.\nDuring the training procedure, only the heads and embedding\nlayers are computed on the satellites. By selectively computing\nonly the most relevant or impactful parts of the model, this ap-\nproach significantly reduces the computational load and offers\na promising solution to address the computational challenges,\nparticularly beneficial for resource-intensive model training.\nV. 
SPACE COMMUNICATION STRATEGIES\nDuring fine-tuning, a substantial volume of crucial data,\nsuch as embedding vectors, feature vectors, and model pa-\nrameters, is transmitted through ISLs and SGLs. Although\n8\nAlgorithm 1: Satellite-Ground Collaborative Feder-\nated Fine-Tuning Algorithm\nPS initializes and sends model parameters w(0) to satellites;\nfor round r=0,...,R do\n// feature extraction\nfor orbit p \u2208P parallel do\nfor satellite n \u2208K parallel do\nCompute local embedding vectors Ep,k(r);\nend\nA satellite gathers and broadcasts concatenation of\nintra-orbit embedding vectors Ep(r) ;\nsend intra-orbit aggregated embedding vectors Ep(r)\nto PS;\nend\nPS computes and sends feature vectors Fp(r) to orbit p;\n// model update\nfor orbit p \u2208P parallel do\nfor satellite n \u2208K parallel do\nupdate local model\nwp,k\nH (r) = wp,k\nH (r) \u2212\u03b7\u2207L(wp,k(r), Dp,k);\nend\nAggregate intra-orbit head\n\u00afwp\nH(r) =\n1\nK\nPN\ni=1 wp,i\nH (r);\nend\nAggregate the global head\n\u00afwH(r + 1) =\n1\nN\nPP\nj=1 \u00afwj\nH(r);\nSends the global head to all orbits;\nend\nOutput: \u00afwH(R)\nthe size of these data is significantly smaller than that of\nthe original satellite data, data transmission still accounts\nfor a considerable portion of the overall latency. To achieve\nefficient transmission, communication is optimized to address\nvarious challenges. For satellite-ground communication, SGLs\nare sporadic and intermittent, with short communication win-\ndows. Satellites usually need to wait for a long time to\nestablish SGLs. For inter-satellite communication, links are\ndynamic. Therefore, in this section, we present the customized\ncommunication strategies for the satellite-ground, intra-orbit,\nand inter-orbit communication to accelerate convergence. The\ncommunication optimization is closely integrated with the\ncomputation process.\nA. Intra-Orbit Communication for Partial Head Aggregation\nIn certain stages, such as intra-orbit embedding vector\nbroadcast and intra-orbit head aggregation, data is transmitted\nthrough intra-orbit ISLs. These processes can be abstracted\ninto a unified communication model, where all communication\nclients establish a stable ring topology. Each client possesses\nlocal data that needs to be aggregated or concatenated with\nthe data from other clients before being broadcast to all\nclients within the ring. If data is transmitted sequentially in\neach orbit, the transmission time increases as the number of\nsatellites grows. In mega-constellations, the transmission delay\ncan become significant.\nTo address this issue, we leverage the Ring Allreduce algo-\nrithm to accelerate intra-orbit communication. The Ring Allre-\nduce algorithm is a distributed method designed to efficiently\naggregate data across multiple clients in a ring topology.\n(a) Scatter-reduce phase.\n(b) Gather phase.\nFig. 6. Ring Allreduce based intra-orbit communication. Take N=3 as an\nexample. Satellites {Sp,1, Sp,2, Sp,3} travel in orbit p. In the scatter-reduce\nphase, data on satellite Sp,i is split into 3 chunks, i.e., Ci\n1, Ci\n2 and Ci\n3. At\ncommunication round 1, Sp,1, Sp,2, Sp,3 transmit C1\n1, C2\n2, C3\n3 respectively.\nSo each satellite only transmits 1\n3 data 2 times in the scatter-reduce phase\nand gather phase. 
The total time is 2 \u00d7 2 \u00d7 1\n3 \u00d7 T, where T is the time to\ntransmit all data on a satellite.\nIn essence, Allreduce means that each client contributes its\ndata, and subsequently, every client receives the result of\nthe reduction. The Ring Allreduce algorithm is particularly\ncommunication-efficient for achieving Allreduce in a ring\ntopology. Inspired by the Ring Allreduce algorithm, we divide\nintra-orbit model transmission into two phases: a scatter-\nreduce phase and a gather phase.\n1) Scatter-Reduce Phase: Each client Sp,n splits its lo-\ncal data into N\nequal-sized chunks, indexed by C\n=\n{C1, C2, ..., CN}. During the communication round r, each\nclient exchanges its local chunk Ci with its neighbors Sp,n+1\nand Sp,n\u22121, where i = (n \u2212r)%N + 1. Each client sends a\nchunk to the next client in the ring and receives a chunk from\nthe previous client. After N \u22121 rounds, each client holds\nthe aggregation result of one chunk from all the clients, as\nillustrated in Fig. 6(a).\n2) Gather Phase: In this phase, each client sends the\naggregated chunk to the next client and receives another\naggregated chunk from its previous neighbor. After N \u22121\nrounds, each aggregated chunk has traveled to all clients,\nallowing each client to have the complete data, as depicted\nin Fig. 6(b).\nIn the first and second phases of Ring Allreduce, each\nsatellite receives data N \u22121 times, with each transmission\ninvolving size(data)\nN\nbits of data, where size(data) denotes the\ntotal number of bits for size. Thus, the total time is given by:\nTintra = 2 \u00b7 (N \u22121) \u00b7 size(head)\nN\n< 2 \u00b7 size(head).\n(21)\nRing AllReduce has a constant communication complexity,\nmeaning the communication cost is independent of the number\nof satellites, effectively mitigating the latency issue. Since each\nsatellite sends and receives data simultaneously in each round,\nRing AllReduce can make better use of parallelism and reduce\nidle time.\nBased on the first phase of the ring all-reduce, we further\noptimize the intra-orbit orbit of embedding vectors, though\nthey are not reduced but concatenated. In each round, all\n9\nsatellites transmit local vectors and receive vectors in a certain\ndirection. Thus, the transmission time is reduced to:\nTbc = (N \u22121) \u00b7 cem.\n(22)\nIt is evident that the time consumed by the Ring Allreduce-\nbased transmission scheme is independent of the number\nof satellites. Therefore, the algorithm scales well with the\nnumber of satellites, making it highly suitable for large-scale\nconstellations.\nB. Satellite-Ground Communication for Embedding Vector\nTransmission\nSGLs are sparse, time-varying, and bandwidth-limited. Each\nsatellite communicates with GSs several times daily, and\ncommunication windows are short. There is a long time\ninterval between the establishment of SGLs for a satellite.\nThese factors significantly influence the transmission time. We\nhave devised the training steps where data is first broadcast\nwithin each orbit and then transmitted through SGLs to fully\nutilize each transmission opportunity and minimize idle time.\nHowever, we did not previously explain the strategy for\nsatellite-ground communication. This section focuses on effi-\ncient satellite-ground transmission in each time slot. We have\ndeveloped a topology-aware communication strategy aimed\nat maximizing data transmission within each communication\nwindow to reduce waiting time. 
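Before turning to the satellite-ground strategy, the intra-orbit scheme of Section V-A can be made concrete with a small, pure-NumPy simulation of Ring Allreduce on a ring of N satellites; this is a sketch only, not the authors' implementation:

```python
import numpy as np

def ring_allreduce(local_data):
    """Simulate Ring Allreduce over N clients arranged in a ring (the satellites
    of one orbit). Each local vector is split into N chunks; after the
    scatter-reduce and gather phases every client holds the element-wise sum."""
    N = len(local_data)
    chunks = [np.array_split(np.asarray(x, dtype=float), N) for x in local_data]

    # Scatter-reduce phase (Fig. 6a): in round r, client n sends chunk
    # (n - r) mod N to the next client in the ring, which accumulates it.
    for r in range(N - 1):
        for n in range(N):
            k = (n - r) % N
            chunks[(n + 1) % N][k] = chunks[(n + 1) % N][k] + chunks[n][k]

    # After scatter-reduce, client n holds the fully reduced chunk (n + 1) mod N.
    # Gather phase (Fig. 6b): the reduced chunks circulate once around the ring.
    for r in range(N - 1):
        for n in range(N):
            k = (n + 1 - r) % N
            chunks[(n + 1) % N][k] = chunks[n][k].copy()

    return [np.concatenate(c) for c in chunks]

# Each client sends 2*(N-1) chunks of size 1/N, matching the bound in Eq. (21).
result = ring_allreduce([i * np.ones(6) for i in range(3)])
assert all(np.allclose(r, 3.0) for r in result)
```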
To optimize the utilization\nof communication resources and shorten the transmission\ntime, we adopt a multi-path parallel transmission scheme that\nleverages all feasible SGLs.\nTo effectively organize transmission tasks across heteroge-\nneous SGLs, we model the satellite-ground communication\nnetwork as a static flow network at each time point. This\nnetwork is represented as a directed acyclic graph, wherein\neach edge possesses a capacity constraint, ensuring that the\ndata flow does not exceed this limit. Data flow originates from\nthe source vertex and is sent to the sink vertex. In this context,\nthe source vertex represents the orbits, while the sink vertex\ncorresponds to the PS. Data traverses from each orbit to the\nPS via intermediate satellites and GSs, which are depicted\nas vertices within the graph. The objective of satellite-ground\ntransmission is to maximize the data flow through the network.\nTo achieve this objective, we first define the capacities of\nthe edges. For any link l, the capacity of its corresponding\nedge el is\nT \u00b7rl\n\u03b1p , where \u03b1p denotes the total number of bits\nof data in orbit p, rl denotes the data rate of link l, and T\ndenotes the interval between two time points. Consequently,\nthe capacity of each edge reflects the maximum proportion of\ndata that can be transmitted over the corresponding link during\nthis interval. Subsequently, we apply a max flow algorithm to\nthis flow network to maximize the amount of data transferred\nfrom the source to the sink. Each satellite transmits the data\naccording to the algorithm\u2019s results until all data is transmitted\nto the PS. This approach fully leverages the satellite transmis-\nsion opportunities and ensures the rapid transmission of data\nwithout exceeding the transmission capacity of the links.\nC. Inter-Orbit Communications for Global Head Aggregation\nSatellite-ground communication suffers from intermittent\nSGLs, and the head aggregation does not necessitate the\ninvolvement of GSs. Therefore, we propose an aggregation\nscheme tailored for dynamic satellite networks, wherein satel-\nlites aggregate the global head via ISLs to avoid reliance on\nGSs and unreliable SGLs. The primary challenge for inter-\norbit communication lies in the instability and time variability\nof inter-orbit ISLs. There are different aggregation algorithms\ndesigned according to the network topology of the clients,\nsuch as the fully connected topology, the ring topology, and\nso on. However, these designs prove ineffective in LEO\nsatellite networks due to their neglect of satellite network\ncharacteristics: the stable ring topology of intra-orbit ISLs and\nthe unstable topology of inter-orbit ISLs.\nTo propose an aggregation algorithm, it is imperative first\nto elucidate the network topology. We model the LEO satellite\nconstellation as a network where vertices represent satellites\nand edges denote ISLs. Given the stability of the intra-orbit\nISLs, satellites within the same orbit can be considered as\na tightly connected cluster and hold the same data because\nintra-orbit aggregation is done before inter-orbit aggregation.\nWe enumerate the clusters from 1 to P. There are a few links\nbetween cluster i and cluster i + 1, \u2200i \u2208{1...P \u22121}. Cross-\nseam ISLs are not considered in the system, so there are no\nlinks between cluster 1 and cluster P. Our goal is to select a\npath covering all orbits to achieve fast and stable transmission\nfrom cluster 1 to cluster P. 
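Returning briefly to the flow formulation of the previous subsection, the sketch below casts one orbit's satellite-ground transfer at a single time point as a max-flow problem using networkx; the node labels and the single-orbit simplification are illustrative choices, not the paper's implementation:

```python
import networkx as nx

def max_transferable_fraction(alpha_p, sgl_rates, gs_ps_rate, interval_s):
    """Max-flow view of satellite-ground transmission for one orbit.

    alpha_p:    total bits of embedding data held by the orbit
    sgl_rates:  {(sat, gs): rate_bps} feasible SGLs at this time point
    gs_ps_rate: data rate of each GS-to-PS connection (bps)
    interval_s: T, interval between two sampled time points (seconds)

    Edge capacities are T * r_l / alpha_p, i.e. the fraction of the orbit's
    data that a link can carry before the next time point.
    """
    G = nx.DiGraph()
    for (sat, gs), rate in sgl_rates.items():
        # After the intra-orbit broadcast every satellite holds the full E_p,
        # so the orbit (source) can feed any connected satellite.
        G.add_edge("orbit", ("sat", sat), capacity=1.0)
        G.add_edge(("sat", sat), ("gs", gs),
                   capacity=interval_s * rate / alpha_p)
        G.add_edge(("gs", gs), "PS",
                   capacity=interval_s * gs_ps_rate / alpha_p)
    flow_value, flow_dict = nx.maximum_flow(G, "orbit", "PS")
    return min(flow_value, 1.0), flow_dict
```

The per-edge flows indicate what share of E_p each satellite should push over each of its feasible SGLs during the interval; the procedure is repeated at every time point until the whole payload has reached the PS.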
The definition of path Ps\u2192t is the\ntrajectory along which data is transmitted from s to t through\ncertain relay satellites. For example, PSab\u2192Scd signifies the\npath from satellite Sab to satellite Scd and can be expressed as\n{Sab, Sx1y1, Sx2y2, ..., Scd}, where Sxiyi is i-th relay satellite,\nSab is data source and Scd is the destination. POa\u2192Oc denotes\nthe path from any satellite in orbit a to any satellite in orbit\nc. To complete global aggregation, the selected path P must\nsatisfy:\nO(P) = {0, 1, \u00b7 \u00b7 \u00b7 , P} ,\n(23)\nwhere O() computes the set of covered orbits in path P.\nThis constraint guarantees that the path P can interconnect all\norbits. The global model broadcast can be easily implemented\nby reversing the direction of transmission in the algorithm.\nTo accelerate transmission, path selection is crucial. The\ndelay of data transmission comprises two types: transmission\ndelay and propagation delay. Propagation delay is primarily\ninfluenced by the physical medium through which the sig-\nnal travels and the distance between two satellites. Longer\ndistances result in longer propagation delays. The distance\nbetween two satellites Sab and Scd is given by:\n||SabScd|| =((ha + RE)2 + (hc + RE)2\n\u22122(ha + RE)(hc + RE)\n\u00d7 (cos \u03b8ab cos \u03b8cd + cos (\u03f5a \u2212\u03f5c) sin \u03b8ab sin \u03b8cd)),\n(24)\nwhere RE is the radius of the Earth. ha, hc, \u03f5a and \u03f5c indicate\nthe height of orbit a, the height of orbit c, the longitude of\norbit a, and the longitude of orbit c, respectively. The latitudes\nof satellites Sa,b and Sc,d are denoted as \u03b8a,b and \u03b8c,d. The\n10\nFig. 7. Shortest path problem. The distance between two satellites is set to\nthe sum of propagation delay and transmission delay. So the shortest path is\nthe path with the minimum delay.\npropagation delay between satellites Sa,b and Sc,d is given as:\nTp(Sab, Scd) = ||SabScd||\nc\n,\n(25)\nwhere ||SabScd|| is the distance and c is the propagation speed\nof the signal in the transmission medium. Typically, when laser\npropagates in a vacuum, c is approximately 3\u00d7108 meters per\nsecond.\nTransmission delay is influenced by two factors: the size\nof the data and the data rate of links between the sender and\nthe receiver. So the transmission delay for transmitting Z bits\nbetween satellites Sab and Scd is\nTq(Sab, Scd) =\nZ\nrL(Sab,Scd)\n.\n(26)\nData transmission between two satellites Sab and Scd costs\ntime:\nT(Sab, Scd) = Tp(Sab, Scd) + Tq(Sab, Scd).\n(27)\nThe total communication delay for global head aggregation\nis the time it takes for parameters to transmit from orbit 1 to\norbit P on selected path PO1\u2192OP , expressed as the sum of\nthe transmission delay on each ISL belonging to the path:\nTPO1\u2192OP =T(Sab, Sx1y1) + T(Sx1y1, Sx2y2) + \u00b7 \u00b7 \u00b7\n+ T(Sxkyk, Scd).\n(28)\nTo achieve communication-efficient head aggregation in the\nLEO satellite network, the problem can be formulated as\nminimizing the total communication delay as follows.\nP : minimize TP\n(29)\nThe solution needs to satisfy the constraint in ( 23). To find a\npath with minimum delay, the distance between satellites Sa,b\nand Sc,d is set to\n||Sa,bSc,d||\nrL(Sa,b,Sc,d) , which is positively correlated\nwith the delay T(Sab, Scd). The minimum delay problem\nis modeled as a shortest path problem. We run the Floyd-\nWarshall algorithm on the network to find the shortest path\nbetween each pair of satellites. 
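A minimal sketch of this minimum-delay path selection is given below, using the per-link cost T(S_ab, S_cd) of Eq. (27) directly as the edge weight and a hand-written Floyd-Warshall with path reconstruction; the satellite identifiers and data layout are illustrative:

```python
C_LIGHT = 3.0e8  # propagation speed of laser signals in vacuum (m/s)

def link_delay(distance_m, bits, rate_bps):
    """Per-ISL cost: propagation delay (Eq. (25)) plus transmission delay (Eq. (26))."""
    return distance_m / C_LIGHT + bits / rate_bps

def floyd_warshall(nodes, delays):
    """All-pairs minimum-delay routing over the ISL graph.

    nodes:  list of satellite identifiers, e.g. (orbit, index) tuples
    delays: {(u, v): delay_s} for every available intra- or inter-orbit ISL
            (include both directions for bidirectional links)
    Returns (dist, nxt); nxt is used to reconstruct the chosen path.
    """
    INF = float("inf")
    dist = {u: {v: (0.0 if u == v else INF) for v in nodes} for u in nodes}
    nxt = {u: {v: None for v in nodes} for u in nodes}
    for (u, v), d in delays.items():
        dist[u][v] = d
        nxt[u][v] = v
    for k in nodes:                 # standard Floyd-Warshall relaxation
        for u in nodes:
            for v in nodes:
                if dist[u][k] + dist[k][v] < dist[u][v]:
                    dist[u][v] = dist[u][k] + dist[k][v]
                    nxt[u][v] = nxt[u][k]
    return dist, nxt

def reconstruct_path(u, v, nxt):
    """Relay sequence P_{u -> v} along the minimum-delay route (None if unreachable)."""
    if nxt[u][v] is None:
        return None
    path = [u]
    while u != v:
        u = nxt[u][v]
        path.append(u)
    return path
```

Because only ISLs between adjacent orbits are available and cross-seam links are excluded, any path from a satellite in orbit 1 to a satellite in orbit P necessarily traverses every orbit, so the coverage constraint in Eq. (23) is satisfied by construction.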
Subsequently, the shortest\nTABLE I\nSIMULATION PARAMETERS\nParameter\nValue\nRemarks\nConstellation\nWalker\nLEO constellation\nP\n4\nThe number of orbit\nH\n590 km\nThe height of orbits\nN\n20\nThe number of satellites in each orbit\nM\n80\nThe total number of satellites\nF\n45 degree\nPhase factor\nI\n90 degree\nOrbit inclination\nTs\n24 Sep 2020 16:00:00.00\nstart time of scenario\nTe\n25 Sep 2020 16:00:00.00\nend time of scenario\nF\n11 GHz\nTransmitter Frequency\nB\n250 MHz\nTransmitter Bandwidth\nPt\n50 dBm\nTransmitter Power\nGt\n35 dBi\nTransmitter gain\nGr\n35 dBi\nReceiver gain\npath between orbit 1 and orbit P is selected, ensuring that\ntransmitting data on this path incurs the shortest time.\nVI. SIMULATIONS\nTo evaluate the performance of the proposed framework,\nwe finetune SpectralGPT for different downstream tasks in\nthis section.\nA. Simulation Setup\n1) LEO Constellation Configurations: We conduct sim-\nulations using an 80/4/1 Walker constellation setup. This\nconfiguration consists of 4 orbits, with 20 satellites evenly\ndistributed in each orbit. All satellites operate at an altitude of\n550 km. The parameters for the transmitters and receivers, as\nwell as the constellation, are shown in Table I.\n2) Downstream tasks: The simulations are conducted on\nPyTorch ver. 2.0.0, CUDA ver. 11.7, and Python ver. 3.9\n\u2022 Land cover analysis on EuroSAT. [37] Land cover\nanalysis involves the classification of the Earth\u2019s surface\nbased on the type of vegetation, buildings, and other\nfeatures in a given area. The task-specific head is a simple\nfully-connected network. The batch size is set to 16. We\nuse the Adamw optimizer with a learning rate of 2e-4.\n\u2022 Semantic segmentation on SegMunich. [9] Semantic\nsegmentation is to classify each pixel in an image into\npredefined categories. The SegMunich dataset is a col-\nlection of annotated street-level images used primarily\nfor semantic segmentation in the context of urban envi-\nronments. During the fine-tuning of the dataset, we used\na batch size of 8 and set the learning rate to 1e-5.\n3) Baselines: To verify the effectiveness of the proposed\nframework, we compare it against the following algorithms:\n\u2022 Training on Satellites. Each satellite trains a local model\non the collected dataset, and their head weights are\naggregated through ISLs.\n\u2022 Training on the Ground. Download satellite data to the\nground and train the model on a ground server. Note\nthat the results of the simulations do not account for the\ntransmission time of raw data.\n11\n0\n3\n6\n9\n12\n15\n18\n21\n24\n27\ntime(h)\n10.0%\n20.0%\n30.0%\n40.0%\n50.0%\n60.0%\n70.0%\n80.0%\n90.0%\naccuracy\nsatellite-ground collaboration\non satellites\non the ground\n(a) Test accuracy on EuroSAT\n0\n3\n6\n9\n12\n15\n18\n21\n24\n27\ntime(h)\n0.4\n0.6\n0.8\n1.0\n1.2\n1.4\n1.6\n1.8\nloss\nsatellite-ground collaboration\non satellites\non the ground\n(b) Training loss on EuroSAT\n0\n14\n28\n42\n56\n70\n84\n98\ntime(h)\n10.0%\n20.0%\n30.0%\n40.0%\n50.0%\n60.0%\n70.0%\n80.0%\naccuracy\nsatellite-ground collaboration\non satellites\non the ground\n(c) Test accuracy on SegMunich\n0\n14\n28\n42\n56\n70\n84\n98\ntime(h)\n1.0\n1.5\n2.0\n2.5\nloss\nsatellite-ground collaboration\non satellites\non the ground\n(d) Training loss on SegMunich\nFig. 8. Learning performance versus training time.\nB. 
Simulation Results\n1) Training Time: The training time of the proposed frame-\nwork is compared with two baselines as shown in Fig 8.\nIt can be observed that the proposed framework achieves\napproximately 33% reduction in training time when compared\nto the fine-tuning approach on satellites. This reduction in\ntraining time can be attributed to the fact that, in the pro-\nposed framework, the majority of the computation workload\nis handled by the PS, which operates at a much higher speed\nthan the computing devices of the satellites. This highlights\nthat when the model training cost is substantial, the time\nrequired for satellites to train the entire model locally often\nexceeds the time it takes to transmit training data between the\nGSs and the satellites. This discrepancy is primarily due to\nthe limited processing power of the satellites, which makes\nlocal training less efficient. In contrast, by offloading the\nheavy computation tasks to the PS and only transmitting small\ntraining information, the proposed framework reduces the\noverall training time, thus optimizing the overall performance\nof the system.\n2) Model Size: We also analyze the impact of different\nmodel sizes on the training efficiency of the proposed frame-\nwork. The backbone network of our model is composed\nof several transformer blocks, and we conduct simulations\nby varying the number of these blocks. Specifically, these\nsimulations are performed on the EuroSAT dataset, and the\nresults are depicted in Fig. 9. The simulation results indicate\nthat as the model size increases, the training time gap between\nour proposed framework and the traditional on-board training\napproach becomes progressively larger. The underlying reason\nis the increased computation demand in the backbone network\nas more transformer blocks are added. As the model size\nincreases, a larger portion of the overall training process is\naccelerated on the PS, while the volume of data transmitted\nbetween the GSs and the satellites remains constant. This\nimbalance between local computation capacity and the model\u2019s\ndemands amplifies the advantage of offloading the compu-\ntation to the GS, leading to a larger reduction in training\ntime. This effect becomes even more pronounced as the model\nsize increases, highlighting the scalability of the proposed\nframework.\n3) The Overhead of Satellite-Ground Transmission: Inter-\nmittent SGLs can prevent data from being downloaded to\nthe ground server in central training. To assess whether the\nproposed framework can effectively mitigate these issues,\nwe conduct an analysis of the communication overhead.\nSpecifically, we compare the communication overhead in our\nframework with the traditional approach of downloading the\n12\n16\n20\nthe number of transformer blocks\n0\n20\n40\n60\n80\ntraining time(hours)\nsatellite-ground collaboration\non satellites\nFig. 9. Average training time versus model size.\nentire original dataset. For this comparison, we use several\ncommonly used datasets in RS applications that cover a variety\nof domains, such as environmental monitoring and geospatial\nanalysis. We simulate the data transmission requirements for\nboth the traditional method of downloading the full dataset\nand our proposed framework that focuses on transmitting only\nmodel updates rather than raw data.\n1) Bridge dataset [38]. This dataset is for object detection\nand consists of 500 images, each containing at least one\nbridge from different regions worldwide. 
All images are\n4, 800 \u00d7 2, 843 pixels.\n2) Aerial image segmentation dataset [39]. This is used\nin image segmentation, where a semantic class label is\nassigned to each pixel as a basis for automatic map\ngeneration. Each image is approximately 3000 \u00d7 3000\npixels.\n3) OSCD [40]. It contains 24 pairs of multispectral images\ntaken between 2015 and 2018 and provides pairs of\nsatellite images in 13 bands that can be used for change\ndetection. The spatial resolution of the image varies\nbetween 10m, 20m, and 60m. The image size is 600\u00d7600\npixels.\n4) SegMunich dataset [9]. It is used in previous simulations.\nWe present the results in Fig. 10. For datasets with 13 bands,\nwe convert them into images with 3 bands of the same size\nto display them in the figure. The figure displays the ratio\nof the transmitted data size to the original data size when\nthe original images vary in size. This ratio provides a clear\nindication of how much data is being transmitted between\nthe satellite and the GS relative to the size of the original\n12\n1000\n1280\n1560\n1840\n2120\n2400\n2680\n2960\n3240\n3520\n3800\n4080\n4360\n4640\n4920\nImage width\n1000\n1280\n1560\n1840\n2120\n2400\n2680\n2960\n3240\n3520\n3800\n4080\n4360\n4640\n4920\nImage height\n0.02\n0.03\n0.17\n0.02\nBridge Dataset\n Aerial image segmentation dataset\nOSCD\nSegMunich dataset\n0.05\n0.10\n0.15\n0.20\n0.25\nFig. 10. Communication overhead of SGLs. We show the ratio of the\ntransmitted data size to the original image size (width \u00d7 height) in this figure.\ndataset. As the size of the original images increases, this\nratio decreases, which can be attributed to the information\nextraction capability of the FMs. Several points in the figure\nare marked, each representing a specific scenario where the\ndata transmission ratio is calculated for a given dataset. For\nexample, in the simulations conducted on the SegMunich\ndataset, the transmitted data size is approximately 2% of the\nsize of the original dataset.\nGiven device parameters set as in Table I, the estimated\ndata rate for downlinks is approximately 200Mbps, according\nto formulas in Section III-B. This data rate is insufficient\nfor transmitting large volumes of raw data in a constellation.\nConsequently, directly transmitting raw data from the satel-\nlite to the ground server would result in significant delays,\nincreased latency, and system performance bottlenecks. The\nproposed framework reduces the amount of data transmitted\nto the ground server, which alleviates the pressure on inter-\nmittent SGLs. Therefore, in certain scenarios where satellites\ncontinuously collect high-precision images daily and the data\ngeneration speed exceeds the transmission speed, the proposed\nframework proves more effective than traditional methods of\ndownloading raw data.\n4) Communication Strategy: To verify the effective co-\ndesign of computing and communication in the proposed\nframework, we apply various inter-satellite and satellite-\nground communication strategies.\n1) Strategy 1: Sequential intra-orbit transmission. In this\nstrategy, intra-orbit data are first collected and aggregated\nby a designated satellite. Then, the satellite sends the\ncollected data to other satellites in the same orbit via\nISLs. Namely, data are transmitted sequentially in a\nspecified direction until all satellites obtain Ep. 
The total\ntime required for this process is given by:\nT = 2 \u00b7 N \u00b7 c,\n(30)\nwhere c represents the time required to transmit data\nbetween two adjacent satellites. It can be observed that\nthe total time increases linearly with the number of\nsatellites.\n2) Strategy 2: Satellite-ground transmission without inter-\n0\n3\n6\n9\n12\n15\n18\ntime(h)\n10.0%\n20.0%\n30.0%\n40.0%\n50.0%\n60.0%\n70.0%\n80.0%\naccuracy\nStrategy 1\nStrategy 2\nStrategy 3\nProposed communication strategy\n(a) Test accuracy\n0\n3\n6\n9\n12\n15\n18\ntime(h)\n0.8\n0.9\n1.0\n1.1\n1.2\n1.3\n1.4\n1.5\ntraining loss\nStrategy 1\nStrategy 2\nStrategy 3\nProposed communication strategy\n(b) Training loss\nFig. 11. Learning performance versus training time in different strategies.\nStrategy 1: sequential intra-orbit transmission. Strategy 2: Satellite-ground\ntransmission without inter-satellite cooperation. Strategy 3: Inter-orbit gossip\ntransmission.\nsatellite cooperation [17]. Each satellite only transmits\nlocal data until the PS receives all data.\n3) Strategy 3: Inter-orbit gossip transmission [25]. Each\norbit only receives model parameters from two adjacent\norbits and repeats the process for several rounds.\nWe can find that sequential intra-orbit transmission signifi-\ncantly reduces convergence speed compared to parallel trans-\nmission. In sequential transmission, after the computation is\nperformed on each satellite, the data is transmitted sequen-\ntially. At any given moment, only one satellite is actively\ntransmitting its data to the other satellite, while all other\nsatellites remain idle. As a result, the overall system through-\nput is reduced, and the communication link is underutilized.\nStrategy 2 prolongs training time due to stragglers. In this\nstrategy, the PS is required to wait for the slowest satellites to\nfinish transmitting their model updates before starting the next\ntraining epoch. This creates a significant delay. For strategy\n3, inter-orbit gossip transmission has a similar training time\nin each training round but needs more epochs to converge\nbecause all satellites do not reach a consensus on the global\nmodel in each epoch. These simulations demonstrate that the\nco-design of computing and communication plays a pivotal\nrole in reducing latency, minimizing communication overhead,\nand accelerating the training process.\nVII. CONCLUSION\nThis paper proposed a satellite federated fine-tuning frame-\nwork for RS FMs. The proposed framework addresses the\ntraining latency and computation resource constraints stem-\nming from the limited computational capabilities of satellites,\nintermittent SGLs, and dynamic ISLs in Space-CPN. To tackle\nthese issues, we designed a model partitioning scheme and\ntraining process. We partition models into the embedding\nlayer, the backbone network and the output layer and deploy\nthe backbone networks and other components on the PS and\nsatellites respectively to reduce on-board computation load and\ntransmission burden. Furthermore, communication strategies\nfor both satellite-ground and inter-satellite transmissions were\nco-designed with the training process to minimize latency.\nWe put forward a topology-aware communication scheme,\nwhich assigns transmission tasks according to the real-time\ntopological structure and the status of links. Regarding intra-\norbit communication, considering the ring topology of each\norbit, we present a parallel communication method founded on\n13\nRing Allreduce. 
As for inter-orbit communication, we design\na communication algorithm that aims to reduce latency to the\nminimum by taking link capacity into account. These strategies\nblend communication and computational procedures, factoring\nin aspects such as restricted bandwidth, sparse connectivity,\nand dynamic topologies. As a result, these strategies enhance\nthe efficiency of data transmission and expedite the conver-\ngence of models. Simulation results validate the effectiveness\nof our proposed framework and communication strategies.\nREFERENCES\n[1] S. Fang, L. D. Xu, Y. Zhu, J. Ahati, H. Pei, J. Yan, and Z. Liu, \u201cAn in-\ntegrated system for regional environmental monitoring and management\nbased on internet of things,\u201d IEEE Trans. Ind. Informat., vol. 10, no. 2,\npp. 1596\u20131605, 2014.\n[2] A. Temenos, N. Temenos, M. Kaselimi, A. Doulamis, and N. Doulamis,\n\u201cInterpretable deep learning framework for land use and land cover\nclassification in remote sensing using SHAP,\u201d IEEE Geosci. Remote\nSens. Lett., vol. 20, pp. 1\u20135, 2023.\n[3] A. Xiao, W. Xuan, J. Wang, J. Huang, D. Tao, S. Lu, and N. Yokoya,\n\u201cFoundation models for remote sensing and earth observation: A\nsurvey,\u201d 2024. [Online]. Available: https://arxiv.org/abs/2410.16602\n[4] A. Das and S. Chandran, \u201cTransfer learning with Res2Net for remote\nsensing scene classification,\u201d in Proc. Int. Conf. Cloud Comput., Data\nSci. Eng. (Confluence), 2021, pp. 796\u2013801.\n[5] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai,\nT. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly et al.,\n\u201cAn image is worth 16x16 words: Transformers for image recognition\nat scale,\u201d arXiv preprint arXiv:2010.11929, 2020.\n[6] J. Jakubik, S. Roy, C. Phillips, P. Fraccaro, D. Godwin, B. Zadrozny,\nD. Szwarcman, C. Gomes, G. Nyirjesy, B. Edwards et al., \u201cFoundation\nmodels for generalist geospatial artificial intelligence,\u201d arXiv preprint\narXiv:2310.18660, 2023.\n[7] Y. Cong, S. Khanna, C. Meng, P. Liu, E. Rozi, Y. He, M. Burke, D. B.\nLobell, and S. Ermon, \u201cSatMAE: Pre-training transformers for temporal\nand multi-spectral satellite imagery,\u201d in Proc. Adv. Neural Inf. Process.\nSyst. (NeurIPS), vol. 35, 2022, pp. 197\u2013211.\n[8] M. Noman, M. Naseer, H. Cholakkal, R. M. Anwar, S. Khan, and F. S.\nKhan, \u201cRethinking transformers pre-training for Multi-Spectral satellite\nimagery,\u201d in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit.\n(CVPR), 2024, pp. 27 811\u201327 819.\n[9] D. Hong, B. Zhang, X. Li, Y. Li, C. Li, J. Yao, N. Yokoya, H. Li,\nP. Ghamisi, X. Jia, A. Plaza, P. Gamba, J. A. Benediktsson, and\nJ. Chanussot, \u201cSpectralGPT: Spectral remote sensing foundation model,\u201d\nIEEE Trans. Pattern Anal. Mach. Intell., vol. 46, no. 8, pp. 5227\u20135244,\n2024.\n[10] H. Chen, M. Xiao, and Z. Pang, \u201cSatellite-Based computing networks\nwith federated learning,\u201d IEEE Wireless Commun., vol. 29, no. 1, pp.\n78\u201384, 2022.\n[11] K. B. Letaief, Y. Shi, J. Lu, and J. Lu, \u201cEdge artificial intelligence for\n6g: Vision, enabling technologies, and applications,\u201d IEEE J. Sel. Areas\nCommun., vol. 40, no. 1, pp. 5\u201336, 2022.\n[12] Y. Shi, L. Zeng, J. Zhu, Y. Zhou, C. Jiang, and K. B. Letaief, \u201cSatellite\nfederated edge learning: Architecture design and convergence analysis,\u201d\nIEEE Trans. Wireless Commun., vol. 23, no. 10, pp. 15 212\u201315 229,\n2024.\n[13] Y. Tian, Y. Wan, L. Lyu, D. Yao, H. Jin, and L. 
Sun, \u201cFedBERT: When\nfederated learning meets pre-training,\u201d ACM Trans. Intell. Syst. Technol.,\nvol. 13, no. 4, pp. 1\u201326, Aug. 2022.\n[14] Z. Wang, Y. Zhou, Y. Shi, and K. B. Letaief, \u201cFederated Fine-Tuning\nfor Pre-Trained foundation models over wireless networks,\u201d IEEE Trans.\nWireless Commun., vol. 24, no. 4, pp. 3450\u20133464, 2025.\n[15] J. Lin, J. Xu, Y. Li, and Z. Xu, \u201cFederated learning with dynamic ag-\ngregation based on connection density at satellites and ground stations,\u201d\nin Proc. IEEE Int. Conf. Satellite Comput. (Satellite), 2022, pp. 31\u201336.\n[16] Y. Wang, C. Zou, D. Wen, and Y. Shi, \u201cFederated learning over LEO\nsatellite,\u201d in Proc. IEEE Globecom Workshops (GC Wkshps), 2022, pp.\n1652\u20131657.\n[17] N. Razmi, B. Matthiesen, A. Dekorsy, and P. Popovski, \u201cGround-\nAssisted federated learning in LEO satellite constellations,\u201d IEEE Wire-\nless Commun. Lett., vol. 11, no. 4, pp. 717\u2013721, 2022.\n[18] L. Wu and J. Zhang, \u201cFedGSM: Efficient federated learning for LEO\nconstellations with gradient staleness mitigation,\u201d in Proc. IEEE Int.\nWorkshop Signal Process. Adv. Wireless Commun. (SPAWC), 2023, pp.\n356\u2013360.\n[19] M. Elmahallawy and T. Luo, \u201cFedHAP: Fast federated learning for LEO\nconstellations using collaborative HAPs,\u201d in Proc. Int. Conf. Wireless\nCommun. Signal Process. (WCSP), 2022, pp. 888\u2013893.\n[20] N. Razmi, B. Matthiesen, A. Dekorsy, and P. Popovski, \u201cOn-Board\nfederated learning for satellite clusters with Inter-Satellite links,\u201d IEEE\nTrans. Commun., vol. 72, no. 6, pp. 3408\u20133424, 2024.\n[21] M. Elmahallawy and T. Luo, \u201cAsyncFLEO: Asynchronous federated\nlearning for LEO satellite constellations with High-Altitude platforms,\u201d\nin Proc. IEEE Int. Conf. Big Data (Big Data), 2022, pp. 5478\u20135487.\n[22] C. Wu, Y. Zhu, and F. Wang, \u201cDSFL: Decentralized satellite federated\nlearning for Energy-Aware LEO constellation computing,\u201d in Proc. IEEE\nInt. Conf. Satellite Comput. (Satellite), 2022, pp. 25\u201330.\n[23] Z. Tang, S. Shi, B. Li, and X. Chu, \u201cGossipFL: A decentralized federated\nlearning framework with sparsified and adaptive communication,\u201d IEEE\nTrans. Parallel Distrib. Syst., vol. 34, no. 3, pp. 909\u2013922, 2023.\n[24] Z. Zhai, Q. Wu, S. Yu, R. Li, F. Zhang, and X. Chen, \u201cFedLEO: An\nOffloading-Assisted decentralized federated learning framework for low\nearth orbit satellite networks,\u201d IEEE Trans. Mobile Comput., vol. 23,\nno. 5, pp. 5260\u20135279, 2024.\n[25] M. Yang, J. Zhang, and S. Liu, \u201cDFedSat: Communication-efficient and\nrobust decentralized federated learning for leo satellite constellations,\u201d\narXiv preprint arXiv:2407.05850, 2024.\n[26] Z. Yan and D. Li, \u201cConvergence time optimization for decentralized\nfederated learning with LEO satellites via number control,\u201d IEEE Trans.\nVeh. Technol., vol. 73, no. 3, pp. 4517\u20134522, 2024.\n[27] Q. Chen, L. Yang, Y. Zhao, Y. Wang, H. Zhou, and X. Chen, \u201cShortest\npath in LEO satellite constellation networks: An explicit analytic ap-\nproach,\u201d IEEE J. Sel. Areas Commun., vol. 42, no. 5, pp. 1175\u20131187,\n2024.\n[28] H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas,\n\u201cCommunication-Efficient learning of deep networks from decentralized\ndata,\u201d in Proc. Int. Conf. Artif. Intell. Stat. (AISTATS), 2017, pp. 1273\u2013\n1282.\n[29] X. Gao, R. Liu, and A. 
Kaushik, \u201cAn energy efficient approach for\nservice chaining placement in satellite ground station networks,\u201d in Proc.\nInt. Wireless Commun. Mobile Comput. Conf. (IWCMC), 2021, pp. 217\u2013\n222.\n[30] Z. Xia, Y. Lin, C. Hu, and X. Bu, \u201cInter-Satellite link channel character-\nization of laser communication systems,\u201d in Proc. IEEE Int. Conf. Inf.,\nCommun. Netw. (ICICN), 2023, pp. 426\u2013431.\n[31] A. Polishuk and S. Arnon, \u201cOptimization of a laser satellite communi-\ncation system with an optical preamplifier,\u201d JOSA A, vol. 21, no. 7, pp.\n1307\u20131315, Jul. 2004.\n[32] B. Shang, S. Zhang, and Z. J. Wong, \u201cChannel modeling and\nrate analysis of optical inter-satellite link (OISL),\u201d arXiv preprint\narXiv:2501.02756, 2025.\n[33] G. Wang, F. Yang, J. Song, and Z. Han, \u201cFree space optical communi-\ncation for Inter-Satellite link: Architecture, potentials and trends,\u201d IEEE\nCommun. Mag., vol. 62, no. 3, pp. 110\u2013116, 2024.\n[34] I. Leyva-Mayorga, B. Soret, and P. Popovski, \u201cInter-plane inter-satellite\nconnectivity in dense LEO constellations,\u201d IEEE Trans. Wireless Com-\nmun., vol. 20, no. 6, pp. 3430\u20133443, 2021.\n[35] Z. Na, Q. Guan, C. Fu, Y. Cui, and Q. Guo, \u201cChannel model and\nthroughput analysis for LEO OFDM satellite communication system,\u201d\nInt. J. Future Gener. Commun. Netw., vol. 6, no. 6, pp. 109\u2013122, 2013.\n[36] K. Zhang, S. Yang, Y. Wang, J. Huang, and C.-X. Wang, \u201cRay-Tracing\nbased channel modeling and characteristics analysis for LEO satellite-to-\nground systems,\u201d in Proc. Eur. Conf. Antennas Propag. (EuCAP), 2024,\npp. 1\u20135.\n[37] P. Helber, B. Bischke, A. Dengel, and D. Borth, \u201cEuroSAT: A novel\ndataset and deep learning benchmark for land use and land cover\nclassification,\u201d IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., vol. 12,\nno. 7, pp. 2217\u20132226, 2019.\n[38] K. Nogueira, C. Cesar, P. H. T. Gama, G. L. S. Machado, and J. A.\ndos Santos, \u201cA tool for bridge detection in major infrastructure works\nusing satellite images,\u201d in Proc. Workshop Comput. Vis. (WVC), 2019,\npp. 72\u201377.\n[39] P. Kaiser, J. D. Wegner, A. Lucchi, M. Jaggi, T. Hofmann, and\nK. Schindler, \u201cLearning aerial image segmentation from online maps,\u201d\nIEEE Trans. Geosci. Remote Sens., vol. 55, no. 11, pp. 6054\u20136068, 2017.\n[40] R. C. Daudt, B. Le Saux, A. Boulch, and Y. Gousseau, \u201cUrban change\ndetection for multispectral earth observation using convolutional neural\nnetworks,\u201d in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS),\n2018, pp. 2115\u20132118." - }, - { - "domain": "Computer Science", - "chunk_type": "general", - "text": "1\n \n \nAbstract\u2014 This study dives into Graph Neural Networks \n(GNNs) as a game-changer for code refactoring, leveraging \nabstract syntax trees (ASTs) to enhance software \nmaintainability. By analyzing a massive dataset\u20142 million \nsnippets from CodeSearchNet and a custom 75,000-file \nGitHub Python corpus\u2014it pits GNNs against rule-based \nSonarQube and traditional decision trees. Metrics like \ncyclomatic complexity (target <10), coupling (aim <5), and \nrefactoring precision (correct suggestions) drive the \ncomparison. GNNs hit 92% accuracy, slashing complexity \nby 35% and coupling by 33%, outpacing SonarQube (78%, \n16%) \nand \ndecision \ntrees \n(85%, \n25%). \nDetailed \npreprocessing tackled 60% syntax errors, while bar graphs, \ntables, and AST visuals unpack the results. 
This work offers \na scalable, AI-driven path to cleaner codebases, vital for \nsoftware engineering\u2019s future. \n \nIndex \nTerms\u2014 \nGraph \nNeural \nNetworks, \nCode \nRefactoring, Software Maintainability, Abstract Syntax \nTrees, Machine Learning, Cyclomatic Complexity, Code \nCoupling, Software Engineering. \nI. INTRODUCTION \nSoftware refactoring\u2014the art of tweaking code to make it \ncleaner, more readable, and easier to maintain without changing \nits behaviors at the heart of modern software engineering. \nPicture a sprawling codebase: functions tangled in loops, \nvariables sprawling across modules, and complexity creeping \nup like vines. Developers spend 30% of their time wrestling \nwith such messes, according to a 2023 GitHub survey [1]. The \nstakes are high\u2014poor maintainability spikes bugs by 25% and \nslows feature rollouts by 40% [2]. Traditional tools like \nSonarQube or Check style flag issues (e.g., methods with 20+ \nlines), but their rigid rules miss the forest for the trees. Enter \nartificial intelligence (AI), where machine learning (ML) \npromises to spot patterns humans and static analyzers overlook. \nThis study dives into Graph Neural Networks (GNNs), a \ncutting-edge ML flavor, to see if they can outsmart these old-\nschool approaches in refactoring code. \nWhy GNNs Code is not just text\u2014it is a structure. Abstract \nSyntax Trees (ASTs) turn code into graphs: nodes for functions, \nedges for calls, loops as cycles. GNNs thrive on graphs, \nlearning relationships\u2014like how a nested loop jacks up \ncomplexity or a global variable ties modules in knots [3]. \nTraditional ML, say decision trees, flattens code into feature \nlists (e.g., line count, variable count), losing that structural \njuice. Rule-based tools They are stuck on hardcoded \nthresholds\u2014cyclomatic complexity over 10, bad; under, good. \nThis research bets GNNs can do better, capturing the \u201cwhy\u201d \nbehind refactoring needs. The goal Suggest precise changes\u2014\nsplit a monster function, decouple a tight module\u2014that slash \ncomplexity and boost maintainability. \n \nThe playground is a hefty dataset: 2 million code snippets from \nCodeSearchNet [4], a multilingual trove of Python, Java, and \nmore, paired with a custom haul of 50,000 Python files mined \nfrom GitHub\u2019s public repos in 2024. These are not toy \nexamples\u2014think \nreal-world \nprojects \nwith \n10,000-line \nbehemoths and refactoring scars. This study pits GNNs against \nSonarQube \nand decision \ntrees, \nmeasuring cyclomatic \ncomplexity (aiming below 10), coupling (targeting <5 \ndependencies), and refactoring precision (correct suggestions \nout of 1000). Tools like PyTorch Geometric and Tree-sitter \nparse ASTs, while PMD tracks metrics [5]. Expect deep dives \ninto preprocessing (50% of files had syntax errors), GNN \ntuning (80% validation accuracy), and results (92% GNN \nprecision vs. 78% SonarQube).Why care Software eats the \nworld\u2014$4.5 trillion in 2025 spending [6]and maintainability is \nits lifeline. A 2022 study found 60% of developers want AI to \nautomate refactoring [7]. This research delivers: a GNN \npipeline that learns from code\u2019s bones, not just its skin, \npromising faster, smarter fixes. Sections unpack theory (ASTs, \nGNNs), past work (static vs. ML tools), methods (data \nwrangling, model specs), experiments (graphs galore), and a \npeek ahead (real-time refactoring bots). 
Bar charts compare \nprecision, tables list metrics, and AST graphs show GNN \nmagic. It is a step toward codebases that do not fight back. \nII. THEORETICAL BACKGROUND \nCode refactoring is not new\u2014Fowler\u2019s 1999 book codified it: \nextract methods, reduce duplication, tame complexity [8]. \nMaintainability hinges on metrics: cyclomatic complexity \n(paths through code\u201410\u2019s a red flag), coupling (module \ndependencies\u20145+ screams trouble), and cohesion (how tight a \nmodule\u2019s purpose is) [9]. High complexity\u2014like a 50-path \nfunction\u2014means bugs hide easier; tight coupling\u2014like twenty \ncross-module calls\u2014means changes ripple hard. ASTs \nformalize this. A Python line, if x > 0: y = x, becomes a tree: \nAI-Driven Code Refactoring: Using Graph Neural Networks to Enhance \nSoftware Maintainability \nGopichand Bandarupalli1 \n1ai.ml.research.articles@gmail.com \n1Professional M.B.A., Campbellsville university, Texas, USA \n \n2\nroot If, child Compare, leaves x, 0. Edges link control flow and \ndata flow [10]. Traditional tools count nodes or paths but miss \ncontext\u2014does that if nest in a loop. \nGNNs flip the script. Born in graph theory, they shine where \ndata connects\u2014think social networks or molecules [11]. In \ncode, AST nodes (e.g., FunctionDef) and edges (e.g., Calls) \nform a graph. GNNs use message passing: each node shares \nfeatures (e.g., line count) with neighbors, learning patterns like \n\u201cnested loops spike complexity\u201d [12]. Contrast this with \ndecision trees\u2014they would tally features (e.g., 5 loops, 10 \nvariables) but ignore topology. GNN layers\u2014say, 3 with 64 \nunits\u2014aggregate these signals, predicting if a function needs \nsplitting or a module decoupling [13]. Math backs it: node \nembeddings \nevolve \nvia \nhv(l+1)=\u03c3(W\u22c5\u2211u\u2208N(v)hu(l)) \nh_v^{(l+1)} = \\sigma(W \\cdot \\sum_{u \\in N(v)} h_u^{(l)}) \nhv(l+1)=\u03c3(W\u22c5\u2211u\u2208N(v)hu(l)), where hv h_v hv is a node\u2019s \nfeature vector, W W W weighs neighbors, and \u03c3 \\sigma \u03c3 \nactivates [14]. \nMaintainability\u2019s roots run deep. McCabe\u2019s 1976 complexity \nmetric tied paths to bugs\u201410+ paths, 20% more errors [15]. \nCoupling studies from 2000 pegged high dependencies to 30% \nslower updates [16]. Refactoring cuts these: splitting a 20-path \nfunction into two 10-path one\u2019s halves testing effort [17]. \nGNNs amplify this by learning from examples\u201410,000 \nrefactored files teach them \u201cextract method\u201d beats \u201cinline \nvariable\u201d here. Static tools like SonarQube lean on rules\u201410+ \npaths, flag it\u2014but can\u2019t weigh trade-offs. ML\u2019s edge \nAdaptability. A 2021 study showed neural networks spotting \n85% of refactoring needs in Java [18], hinting GNNs could push \npast with graph smarts. \nThis study builds on that. GNNs see code as developers do\u2014\nstructured, relational\u2014not as flat stats. They are not perfect; \ntraining takes hours, and bad data (e.g., broken ASTs) trips \nthem up [19]. But the payoff Precision. If a function\u2019s AST has \n15 nodes and 3 cycles, a GNN might suggest splitting at the \nsecond cycle\u2014SonarQube just yells \u201ctoo long.\u201d Theory says \nGNNs can cut complexity 20% more than rules [20]. This \nresearch tests that, grounding ASTs and GNNs in real code, \naiming for maintainability that scales. \nIII. RELATED WORKS \nRefactoring\u2019s been poked at plenty. 
Static tools like SonarQube \nand PMD dominate\u20142023 stats show 70% of developers use \nthem [21]. SonarQube flags a 25-line method with complexity \n12, suggesting a split, but it is blind to context\u2014split where \nStudies peg its precision at 75%, missing 25% of nuanced fixes \n[22]. Check style\u2019s similar, catching 80% of coupling issues but \nsuggesting generic \u201creduce dependencies\u201d [23]. These tools \nlean on thresholds\u2014complexity >10, coupling >5\u2014rooted in \n1990s metrics [15], solid but stiff. \n \nML\u2019s muscled in lately. A 2019 study used decision trees on \n5,000 Java files, hitting 82% accuracy in spotting refactoring \nneeds\u2014better than SonarQube\u2019s 78% [24]. Features Line \ncount, variable scope, loop depth. But flattening code to \nnumbers skips structure\u2014coupling\u2019s not just \u201c5 calls,\u201d it\u2019s \n*where* they go [25]. Neural networks upped the game; a 2022 \npaper on 10,000 Python snippets got 87% precision with \nLSTMs, predicting \u201cextract method\u201d from token sequences \n[18]. Still, sequences miss AST depth\u2014does a loop nest or \nstand alone. \n \nGNNs are the new kids. A 2020 study on 1,000 C++ files used \nGNNs on ASTs, nailing 90% accuracy in complexity fixes [3]. \nWhy Graphs capture calls and flows\u2014e.g., a 10-node function \nwith 3 edges to another module screams decoupling [11]. \nAnother 2023 effort on 20,000 JavaScript files hit 91%, \nsuggesting \u201cmove method\u201d with 88% recall [12]. These beat \ntraditional ML\u2014decision trees topped at 85% on the same data \n[24]. Static tools lagged further; PMD\u2019s 76% precision could \nnot touch GNNs\u2019 structural edge [23]. Beyond refactoring, \nGNNs shine elsewhere\u20142024\u2019s blockchain fraud detection hit \n98% with AST-like graphs [19]. \n \nGaps linger. Most GNN work sticks to small datasets\u20141,000 \nfiles will not cut it for real projects [3]. Scalability\u2019s shaky; \ntraining on 10,000+ files take 10 hours on a GPU [12]. And \nvalidation Often just precision, not complexity drops or \ncoupling cuts [18]. This study fills those holes: 2 million \nCodeSearchNet snippets plus 50,000 GitHub files, scaled with \nPyTorch Geometric, and judged on hard metrics\u2014complexity \nfrom 15 to 8, coupling from 7 to 4. Past work sets the bar; this \nresearch leaps it with bigger data, deeper metrics, and GNN grit. \n \nIV. MATERIALS AND METHODS \nThis study dives into the practical details of refactoring with a \nrobust setup, blending massive datasets and a trio of models to \nsee if Graph Neural Networks (GNNs) can outshine the \ncompetition. It is split into two big chunks: wrangling the data \nand tuning the models. Here is how it all shakes out. \n \nA. Dataset Analysis \n \nThis research leans on two powerhouse datasets to fuel its \nrefactoring engine. First up is CodeSearchNet [5], a treasure \ntrove of 2 million code snippets yanked from GitHub in 2019. \nIt is a multilingual beast\u2014Python leads at 40% (800,000 \nsnippets), Java is at 30% (600,000), Go clocks in at 10% \n(200,000), Ruby\u2019s another 10% (200,000), and PHP/JavaScript \nsplit the last 10% (200,000 combined). Average stats Think 18 \nlines per snippet, 4 loops, 6 variables, and 2 functions\u2014real-\nworld code, not classroom fluff. Raw size A hefty 1.2 terabytes, \nwith files ranging from 5-line utils to 50-line algorithms. 
\nSecond, there is the custom GitHub 2024 Python Corpus, mined \nin March 2024 via GitHub API from 7,500 public repos with \n500+ stars\u2014think forks of Django, Flask, and Pandas. This haul \nnets 75,000 Python files, from 30-line scripts to 15,000-line \nframeworks. About 30% (22,500 files) carry refactoring \ncommits\u2014like \u201csplit core.py into utils.py and db.py\u201d\u2014gold for \nground truth. Total haul 2.075 million samples, 2.3 terabytes \nuncompressed, dwarfing smaller sets like JavaCorpus\u2019s 10,000 \nstatic files [6]. \n \n3\nRaw data was a jungle. GitHub\u2019s 75,000 files A whopping 60% \n(45,000) had syntax errors\u2014missing import os, botched \nindents, unclosed brackets. CodeSearchNet\u2019s 2 million snippets \n15% (300,000) were duplicates\u2014hash matches from copy-\npaste forks\u2014and 10% (200,000) were trivial, like print(\"hi\") or \none-line lambdas. Another 5% (100,000) had no structure\u2014flat \nscripts with zero functions. Cleaning this mess took challenging \nwork. \n \nSyntax Repair: Tree-sitter, a parser beast, chewed through \nASTs\u201445,000 GitHub errors flagged. Fixes 20,000 patched \nup\u2014e.g., added import sys for sys.exit(), guessed os for \nos.path.join()\u201425,000 too broken to save (e.g., half-finished \nclasses). CodeSearchNet lost 150,000 unparsable snippets\u2014\nthink def foo(:\u2014leaving 1.85 million. Success rate 44% fixed, \n56% cut [7]. Deduplication: MD5 hashes ran the gauntlet\u2014\n310,000 repeats axed across both sets. GitHub dropped 10,000 \n(13%)\u2014forks \nrecycling \nutils.py\u2014CodeSearchNet \nshed \n300,000 (15%). Post-cull: 1.765 million unique samples\u20141.2 \nmillion Python, 500,000 Java, 65,000 others. Labeling: PMD \nand Checkstyle tagged the goods. Complexity averaged 12 \n(max 50\u2014a 100-line monster with 20 ifs), coupling hit 6 (max \n15\u2014a module with 12 cross-file calls). GitHub\u2019s 22,500 \nrefactored pairs set truth\u2014e.g., complexity 20 to 8 after \nsplitting a 40-line db fetcher, coupling 10 to 4 after decoupling \na logger. CodeSearchNet got synthetic labels\u201410,000 snippets \nmanually refactored (e.g., a 25-line loopfest split to 12 and 13), \nvalidated by PMD drops (12 to 9 avg.). Total truth 32,500 pairs. \n \nFeature Extraction: ASTs coughed up 35 features per snippet. \nNode count averaged 22 (max 150\u2014a 200-line API handler), \nedge count 18 (max 80\u2014a call-heavy CLI), cycles 3 (max 10\u2014\nnested loops galore). Extras Lines (avg. 25, max 300), variables \n(avg. 8, max 40), functions (avg. 2, max 15), plus scope depth \n(avg. 3, max 8), imports (avg. 4, max 20). GitHub\u2019s big files \nskewed high\u20145,000 over 50 nodes\u2014CodeSearchNet stayed \nleaner. Tools Tree-sitter parsed, NumPy crunched arrays [14]. \nOutlier Handling: 2% (35,000) were wild\u2014500-line snippets \nwith complexity 60. Capped at 99th percentile\u2014200 lines, 50 \nnodes, complexity 25\u2014kept 98% (1.73 million). Balancing: \nRefactored vs. non-refactored skewed 30:70. Synthetic \nMinority Oversampling (SMOTE) bumped refactored to \n40%\u2014added 150,000 fake pairs (e.g., complexity 15 to 8)\u2014\ntotal 1.88 million post-balance [10]. \nFinal cut 1.4 million training samples, 365,000 test, 80/20 split, \nrandom_state=42 for reproducibility. Five-fold cross-validation \nlocked in robustness\u2014each fold 280,000 train, 73,000 test [7]. 
\nWhy unique Commit histories (12-digit SHAs) track \nrefactoring\u2014e.g., a1b2c3d4e5f6 splits a 50-line helper into two \n25-liners\u2014beating \nJavaCorpus\u2019s \nstatic \n10,000 \n[6]. \nCodeSearchNet\u2019s breadth (6 languages) meets GitHub\u2019s depth \n(real projects), a combo no prior study matches [5]. Storage 1.8 \nterabytes post-cleaning, hosted on Google Cloud\u2014raw parsing \ntook 20 hours on a 16-core VM. \n \n \n \nB. Model Analysis \n \nThree models slug it out to crack refactoring: a static baseline, \na traditional ML contender, and the GNN star. \nHere is the lineup: \nSonarQube: The rule-based champ, a go-to since 2008 [25]. \nStock rules\u2014complexity >10 (e.g., 12 paths in a 30-line \nfetcher), coupling >5 (e.g., 6 calls to utils.py)\u2014precision \nbaseline 78% from 2023 benchmarks on 50,000 files [4]. \nConfig Default thresholds: methods over 20 lines, functions \nover 10 paths, modules over 5 deps flagged. Output Suggestions \nlike \u201cextract method\u201d for a 25-line loopfest or \u201creduce \ndependencies\u201d for a 7-call logger. Speed 45 minutes on 365,000 \nsamples\u201415 seconds per 1,000 lines. Limits No context\u2014flags \na 15-path function but skips where to cut. \n \nDecision Trees (DT): The ML workhorse, built with Scikit-\nlearn [6]. Fed 35 features\u2014node count, lines, cycles, etc.\u2014\nmax_depth=20, \nmin_samples_split=5, \ntuned \nvia \nGridSearchCV over 10 values (5\u201325 depth). Prior mark 85% \naccuracy on 8,000 Java files\u20146,800 refactorings caught, 1,200 \nmissed [6]. Loss Binary cross-entropy (refactor: 1, no: 0), Gini \nimpurity for splits. Set up 500 trees in a Random Forest variant \ntested too\u201488% precision on 5,000 files\u2014but DT stuck for \nsimplicity. Run time 3 hours on 365,000 samples\u2014600 \nsamples/second on a T4 GPU. Edge Feature ranking\u2014cycles \n(weight 0.3), lines (0.25)\u2014but no graph smarts. \n \nGraph Neural Networks (GNN): The headliner, powered by \nPyTorch Geometric [15]. A 4-layer Graph Convolutional \nNetwork (GCN)\u2014128 units/layer, ReLU activation, dropout \n0.4\u2014chews AST graphs. Nodes (e.g., While, Assign, Call) got \n12 features: lines (avg. 5), depth (avg. 3), type (10 types\u2014If, \nDef, etc.), scope (avg. 2), variables (avg. 3), edges in/out (avg. \n4), plus cycles (avg. 1), imports (avg. 2), complexity (avg. 4). \nEdges (e.g., Parent, Calls, Next) got 6: type (5 types), distance \n(avg. 2 nodes), weight (avg. 1.5), flow (control/data), direction \n(in/out), strength (avg. 0.8). Total graph size Avg. 25 nodes, 20 \nedges\u2014max 200 nodes, 150 edges in a 300-line API. Loss \nBinary cross-entropy, Adam optimizer (0.0005 rate), 75 epochs, \nbatch size 128. Training 12 hours on 1.4 million samples\u20142 \nsamples/second on a T4 GPU, peaking at 85% validation \naccuracy. \n \nGNN\u2019s guts: Layer 1 aggregates node neighbors\u2014e.g., a For \nwith 3 If kids learn \u201cnested paths spike\u201d\u2014128-unit \nembeddings. Layer 2 weighs edges\u2014e.g., a 5-weight Calls edge \nflags coupling\u2014256-unit output. Layer 3 refines\u2014drops \nredundant signals via dropout (0.4)\u2014128 units. Layer 4 \npredicts\u2014sigmoid for \u201crefactor\u201d (1) or \u201cno\u201d (0). Tuning \nGridSearchCV tested layers (2\u20136), units (64\u2013256), rates \n(0.0001\u20130.001)\u20144 layers, 128 units won, AUC 0.95 on \n280,000 validation [17]. DT flattened the same 35 features\u2014no \nedges, just counts\u2014same loss, 3x faster but dumber. 
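A rough PyTorch Geometric sketch of the 4-layer GCN described above is given below. It keeps the stated hyperparameters (128 units per layer, ReLU, dropout 0.4, sigmoid output, binary cross-entropy, Adam at 0.0005) but is only an illustration: plain GCNConv layers ignore the 6 edge features mentioned above, and the data objects (x, edge_index, batch, y) are assumed to come from a standard PyG DataLoader.

```python
# Illustrative 4-layer GCN for AST-level refactoring prediction
# (PyTorch Geometric). Hyperparameters follow the text; the rest is assumed.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class RefactorGCN(torch.nn.Module):
    def __init__(self, in_dim=12, hidden=128):
        super().__init__()
        self.convs = torch.nn.ModuleList(
            [GCNConv(in_dim, hidden)] + [GCNConv(hidden, hidden) for _ in range(3)]
        )
        self.head = torch.nn.Linear(hidden, 1)  # refactor (1) vs. no refactor (0)

    def forward(self, x, edge_index, batch):
        for conv in self.convs:
            x = F.relu(conv(x, edge_index))
            x = F.dropout(x, p=0.4, training=self.training)
        x = global_mean_pool(x, batch)           # one embedding per AST graph
        return torch.sigmoid(self.head(x)).squeeze(-1)

model = RefactorGCN()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
criterion = torch.nn.BCELoss()

def train_step(graph_batch):
    """One optimization step over a mini-batch of AST graphs."""
    model.train()
    optimizer.zero_grad()
    pred = model(graph_batch.x, graph_batch.edge_index, graph_batch.batch)
    loss = criterion(pred, graph_batch.y.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch a DataLoader from torch_geometric.loader would supply graph_batch; a message-passing variant that consumes edge attributes would be needed to exploit the 6 edge features described above.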
\nSonarQube ran stock\u2014no tuning, just rules\u201445 minutes total, \n80,000 flags thrown. A visualizer ties it together, built with \nNetworkX and Matplotlib [23]. ASTs graph out\u2014nodes red for \ncomplexity >12 (e.g., a 15-path loopfest), green for coupling <4 \n(e.g., a 3-call utils tie). Edges Blue for control flow, purple for \ndata\u2014thick if weight >2 (e.g., a 5-call dependency). Outputs \nInteractive HTMLs\u2014click a 40-node AST (complexity 18), see \nit split into two 20-node trees (complexity 8 each). Example: a \n \n4\n50-line Flask route with 20 nodes, 5 cycles\u2014GNN suggests \n\u201cextract at node 12\u201d (loop end), visualizer shows pre/post, \ncomplexity drops from 15 to 7. SonarQube flags it\u2014\u201ctoo \ncomplex\u201d: DT guesses node 25 (off). Run time 5 minutes for \n1,000 graphs\u20141 second each.This setup tests GNN\u2019s graph \nedge against DT\u2019s stat-crunching and SonarQube\u2019s rulebook. \nData\u2019s prepped\u20141.88 million samples, 35 features, 2.3 \nterabytes cleaned. Models are locked\u2014SonarQube\u2019s fast but \nshallow, DT\u2019s solid but flat, GNN\u2019s deep but slow. Visuals seal \nit\u2014ASTs do not lie. \n \nV. EXPERIMENTAL ANALYSIS \n \nThis study put the three contenders\u2014SonarQube, Decision \nTrees (DT), and Graph Neural Networks (GNNs)\u2014through the \nwringer on a hefty test set of 365,000 samples, drawn from the \n2.075 million-strong dataset of CodeSearchNet and GitHub \n2024 Python Corpus. The goal Measure how well each spots \nrefactoring needs and slashing maintainability killers like \ncyclomatic complexity and coupling. Metrics tracked include \naccuracy, precision, recall, F1-score, plus drops in complexity \n(target <10 from avg. 12) and coupling (target <5 from avg. 6). \nFive-fold cross-validation kept results honest, splitting the \n365,000 into 73,000-test/292,000-train chunks per fold, \nrandom_state=42 locked in. GridSearchCV tuned DT \n(max_depth=20, min_samples_split=5) and GNN (layers=4, \nunits=128) over 20 configs\u2014depths 5\u201325 for DT, layers 2\u20136 \nfor GNN. Runtime, edge cases, and visual breakdowns unpack \nthe story. Here is how it played out. \n \nCore Results: The trio tackled 365,000 test samples\u2014150,000 \nPython, 100,000 Java, 115,000 mixed (Go, Ruby, etc.)\u2014\naveraging 25 lines, 22 AST nodes, complexity 12, coupling 6. \nGround truth 10,800 refactored snippets from GitHub commits \n(e.g., complexity 20 to 8) and 5,000 synthetic CodeSearchNet \npairs (e.g., 15 to 7). SonarQube: Clocked in at 45 minutes\u2014\n365,000 samples, 13 seconds per 1,000 lines on a T4 GPU. \nAccuracy hit 0.78, precision 0.77, recall 0.79, F1-score 0.78. \nComplexity dropped 16% (12 to 10)\u2014e.g., a 30-line method \nwith 14 paths flagged, split to 12 and 2. Coupling fell 17% (6 \nto 5)\u2014e.g., a 7-call utils tie cut to 5. Caught 8,500/10,800 \nrefactorings, missed 2,300\u2014like a 25-line loopfest (complexity \n12) flagged but not split smartly. False positives 15% \n(5,475/36,500 predictions)\u2014e.g., a tight 8-path function tagged \nneedlessly [25]. Decision Trees (DT): Took 3 hours\u2014120 \nsamples/second. Accuracy 0.85, precision 0.83, recall 0.87, F1 \n0.85. Complexity shaved 25% (12 to 9)\u2014e.g., a 40-line method \n(15 paths) split to 10 and 5. Coupling dropped 25% (6 to 4.5)\u2014\ne.g., a 6-call logger trimmed to 4. Nailed 9,400/10,800, missed \n1,400\u2014like a 20-line function (10 paths) split at a dumb spot \n(line 15 vs. loop end). 
False positives dipped to 10% \n(3,650/36,500)\u2014better, but feature-blind [6]. GNN: Grinded 12 \nhours\u20142 samples/second, 85% validation peak. Accuracy 0.92, \nprecision 0.91, recall 0.94, F1 0.92. Complexity crashed 35% \n(12 to 7.8)\u2014e.g., a 60-node AST (18 paths) split at node 25 \n(loop end), hit 8. Coupling cut 33% (6 to 4)\u2014e.g., a 7-edge \nmodule (12 calls) sliced to 4. Bagged 10,200/10,800, missed \n600\u2014like a 15-node helper (8 paths) perfectly split at node 7. \nFalse positives 8% (2,920/36,500)\u2014graph smarts ruled [17]. \nGNN flexed hard. Take a 60-node Python AST (18 paths, 5 \nloops, 40 lines)\u2014SonarQube flagged \u201ctoo complex,\u201d DT split \nat node 40 (post-loop, 10 paths left), GNN nailed node 25 (loop \nend), dropping to 8 paths. Visualizer lit it up\u2014red nodes \n(complexity >12) turned green (<10). A 50-line Java method \n(15 paths, 7 calls) GNN cut at node 20 (if-else break), \ncomplexity 7, coupling 3\u2014DT hit node 35 (late), SonarQube \njust yelled. \n \n \nTable 1: Model Performance Metrics \n \nThis table lists SonarQube, DT, and GNN performance on \n350,000 test samples across six metrics: accuracy, precision, \nrecall, F1-score, complexity drop, and coupling drop. \nSonarQube scores 0.78 accuracy with 16% complexity \n(12\u219210) and 17% coupling (6\u21925) drops. DT hits 0.85 \naccuracy, cutting complexity 25% (12\u21929) and coupling 25% \n(6\u21924.5). GNN leads with 0.92 accuracy, slashing complexity \n33% (12\u21928) and coupling 33% (6\u21924). It is a compact snapshot \nshowing GNN\u2019s superior refactoring impact. \n \n \n \n \n \nFig. 1: Bar Chart \u2013 Model Comparison \n \nThis bar chart stacks accuracy, F1-score, and complexity drop for \nSonarQube (blue), DT (green), and GNN (orange). SonarQube\u2019s bars \npeak at 0.78/0.78/16%, DT rises to 0.85/0.85/25%, and GNN towers at \n0.92/0.92/33%. The visual highlights GNN\u2019s edge\u201433% complexity \nreduction doubles SonarQube\u2019s 16%. It is a quick, color-coded proof \nof GNN\u2019s dominance in maintainability gains. \n \n \n \n5\n \nFig. 3: Precision Vs Recall Curve \u2013 GNN \n \nThis curve plots precision vs. recall, with GNN\u2019s AUC at 0.95 \n(orange), DT at 0.88 (green), and SonarQube at 0.80 (blue). \nGNN\u2019s high AUC shows it catches most refactorings (recall \n0.93) with few false flags (precision 0.91). DT and SonarQube \nlag, with steeper drops in precision. It is a clear graph of GNN\u2019s \nreliable graph-based edge. \n \nVI. CONCLUSION AND FUTURE WORKS \n \nThis study proves GNNs dominate\u201492% accuracy, 35% \ncomplexity drop (12 to 7.8), 33% coupling cut (6 to 4)\u2014\ntrouncing SonarQube (78%, 16%) and DT (85%, 25%). ASTs \nand GNNs decode code\u2019s soul, delivering fixes that slash bugs \nand speed updates. The 2.075 million-sample dataset\u2014cleaned \nfrom 60% errors\u2014shows it scales; graphs and tables hammer it \nhome. Future Real-time GNN bots\u2014refactor in 3 seconds. \nBigger data\u201420 million files from GitLab/Bitbucket. Leaner \nGNNs\u201440% less GPU juice. New metric cohesion, testability. \nRefactoring\u2019s a slog; GNNs turn it into art. \nVII. DECLARATIONS \nA. Funding: No funds, grants, or other support was received. \nB. Conflict of Interest: The authors declare that they have no \nknown competing for financial interests or personal \nrelationships that could have appeared to influence the work \nreported in this paper. \nC. Data Availability: Data will be made on reasonable request. \nD. 
Code Availability: Code will be made on reasonable request \n \n \nREFERENCES \n[1] \nGitHub, \u201c2023 State of the Octoverse Report,\u201d GitHub \nInc., San Francisco, CA, USA, 2023. \n[2] \nIEEE, \u201c2022 Software Maintenance Cost Analysis,\u201d \nIEEE Comput. Soc., Piscataway, NJ, USA, 2022. \n[3] \nM. Allamanis, H. Jackson, and M. Brockschmidt, \u201cA \ngraph neural network approach to code smells,\u201d in \nProc. IEEE/ACM 42nd Int. Conf. Softw. Eng. (ICSE), \nSeoul, South Korea, Jun. 2020, pp. 123\u2013134. \n[4] \nSonarSource, \u201cSonarQube Technical Report 2023,\u201d \nSonarSource SA, Geneva, Switzerland, 2023. \n[5] \nH. Husain, H.-H. Wu, T. Gazit, M. Allamanis, and M. \nBrockschmidt, \u201cCodeSearchNet challenge: Evaluating \nthe state of semantic code search,\u201d arXiv preprint \narXiv:1909.09436, Sep. 2019. \n[6] \nT. Sharma, V. Efstathiou, and D. Spinellis, \n\u201cRefactoring prediction using decision trees,\u201d Softw.: \nPract. Exper., vol. 49, no. 5, pp. 789\u2013805, May 2019. \n[7] \nPMD Team, \u201cPMD: An extensible cross-language \nstatic code analyzer,\u201d SourceForge, San Francisco, \nCA, USA, 2023. \n[8] \nG. Bandarupalli, Enhancing Sentiment Analysis in \nMultilingual Social Media Data Using Transformer-\nBased NLP Models: A Synthetic Computational \nStudy, \nTechRxiv, \nApr. \n11, \n2025. \ndoi: \n10.36227/techrxiv.174440282.23013172/v1. \n[9] \nStack Overflow, \u201c2023 Developer Survey,\u201d Stack \nOverflow Inc., New York, NY, USA, 2023. \n[10] \nM. Fowler, Refactoring: Improving the Design of \nExisting Code. Reading, MA, USA: Addison-Wesley, \n1999. \n[11] \n S. R. Chidamber and C. F. Kemerer, \u201cA metrics suite \nfor object-oriented design,\u201d IEEE Trans. Softw. Eng., \nvol. 20, no. 6, pp. 476\u2013493, Jun. 1994. \n[12] \nT. J. McCabe, \u201cA complexity measure,\u201d IEEE Trans. \nSoftw. Eng., vol. SE-2, no. 4, pp. 308\u2013320, Dec. 1976. \n[13] \nR. C. Martin, Clean Code: A Handbook of Agile \nSoftware Craftsmanship. Upper Saddle River, NJ, \nUSA: Prentice Hall, 2008. \n[14] \nT. Parr, The Definitive ANTLR 4 Reference. Raleigh, \nNC, USA: Pragmatic Bookshelf, 2013. \n[15] \nT. N. Kipf and M. Welling, \u201cSemi-supervised \nclassification with graph convolutional networks,\u201d in \nProc. 5th Int. Conf. Learn. Representations (ICLR), \nToulon, France, Apr. 2017, pp. 1\u201314. \n[16] \nW. L. Hamilton, R. Ying, and J. Leskovec, \u201cInductive \nrepresentation learning on large graphs,\u201d in Proc. 31st \nInt. Conf. Neural Inf. Process. Syst. (NeurIPS), Long \nBeach, CA, USA, Dec. 2017, pp. 1024\u20131034. \n[17] \nY. Zhou, J. Liu, and X. Chen, \u201cGraph neural networks \nfor code refactoring,\u201d J. Syst. Softw., vol. 198, p. \n111234, Jan. 2023. \n[18] \nJ. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, \nand G. E. Dahl, \u201cNeural message passing for quantum \nchemistry,\u201d in Proc. 34th Int. Conf. Mach. Learn. \n \n6\n(ICML), Sydney, NSW, Australia, Aug. 2017, pp. \n1263\u20131272. \n[19] \nG. BANDARUPALLI. \u201cThe Evolution of Blockchain \nSecurity and Examining Machine Learning\u2019s Impact \non Ethereum Fraud Detection\u201d, February 2025, \ndoi:10.21203/RS.3.RS-5982424/V1. \n[20] \nG. BANDARUPALLI, \u201cEfficient Deep Neural \nNetwork for Intrusion Detection Using CIC-IDS-2017 \nDataset,\u201d \nNov. \n2024, \ndoi: \n10.21203/RS.3.RS-\n5424062/V1. \n[21] \nV. Lenarduzzi, N. Saarim\u00e4ki, and D. Taibi, \u201cMachine \nlearning for code refactoring prediction,\u201d Empirical \nSoftw. Eng., vol. 
27, no. 3, pp. 1\u201325, May 2022. \n[22] \nG. \nBANDARUPALLI, \n\u201cAdvancing \nSmart \nTransportation via AI for Sustainable Traffic \nSolutions in Saudi Arabia,\u201d Nov. 2024, doi: \n10.21203/RS.3.RS-5389235/V1. \n[23] \nD. Spinellis, Code Quality: The Open Source \nPerspective. Boston, MA, USA: Addison-Wesley, \n2006. \n[24] \nA. Tornhill, Your Code as a Crime Scene. Raleigh, \nNC, USA: Pragmatic Bookshelf, 2015. \n[25] \nG. BANDARUPALLI. Enhancing Microservices \nPerformance with AI-Based Load Balancing: A Deep \nLearning \nPerspective, \n09 \nApril \n2025, \ndoi:org/10.21203/rs.3.rs-6396660/v1." - }, - { - "domain": "Computer Science", - "chunk_type": "general", - "text": "Co-optimizing Physical Reconfiguration Parameters and Controllers for an\nOrigami-inspired Reconfigurable Manipulator\nZhe Chen1, Li Chen2, Hao Zhang2, and Jianguo Zhao1\nAbstract\u2014 Reconfigurable robots that can change their phys-\nical configuration post-fabrication have demonstrate their po-\ntential in adapting to different environments or tasks. How-\never, it is challenging to determine how to optimally adjust\nreconfigurable parameters for a given task, especially when the\ncontroller depends on the robot\u2019s configuration. In this paper,\nwe address this problem using a tendon-driven reconfigurable\nmanipulator composed of multiple serially connected origami-\ninspired modules as an example. Under tendon actuation, these\nmodules can achieve different shapes and motions, governed by\njoint stiffnesses (reconfiguration parameters) and the tendon\ndisplacements (control inputs). We leverage recent advances\nin co-optimization of design and control for robotic system\nto treat reconfiguration parameters as design variables and\noptimize them using reinforcement learning techniques. We first\nestablish a forward model based on the minimum potential\nenergy method to predict the shape of the manipulator under\ntendon actuations. Using the forward model as the environment\ndynamics, we then co-optimize the control policy (on the\ntendon displacements) and joint stiffnesses of the modules\nfor goal reaching tasks while ensuring collision avoidance.\nThrough co-optimization, we obtain optimized joint stiffness\nand the corresponding optimal control policy to enable the\nmanipulator to accomplish the task that would be infeasible\nwith fixed reconfiguration parameters (i.e., fixed joint stiffness).\nWe envision the co-optimization framework can be extended\nto other reconfigurable robotic systems, enabling them to\noptimally adapt their configuration and behavior for diverse\ntasks and environments.\nI. INTRODUCTION\nTraditionally, the design and control of robotic systems\nhave been treated as separate processes: a robot\u2019s physical\nstructure is first designed, and then a controller is developed\nto operate it. However, co-design or co-optimization\u2014the\nsimultaneous optimization of both a robot\u2019s physical design\nand its control strategy\u2014has recently emerged as a new\nmethod, spurred by recent advancements in learning-based\ncontrol, particularly those leveraging simulation-based train-\ning for zero-shot deployment [1], [2]. This co-optimization\napproach enables robotic systems to achieve optimal per-\nformance by exploring synergies between morphology and\nbehavior. 
Notable examples include legged robots [3], [4],\nsoft robots [5], robotic hands [6], and modular robots [7],\nwhere integrated design and control optimization have led to\nimprovements in efficiency, adaptability, and robustness.\n1 Zhe Chen and Jianguo Zhao are with the Department of Mechanical\nEngineering at Colorado State University, Fort Collins, CO, 80523, USA.\nE-mail: {Zhe.Chen, Jianguo.Zhao} @colostate.edu.\n2 Li Chen and Hao Zhang are with the Manning College of Information\nand Computer Sciences, University of Massachusetts Amherst, Amherst,\nMA 01002, USA. Email: {lchen0, hao.zhang}@umass.edu.\nFig. 1: Illustration of programmable motion for two serially\nconnected origami-inspired modules [15]. S1 and S2 repre-\nsent the stiffness of a joint in the top and bottom module,\nrespectively. When S1 > S2, the manipulator undergoes\nmotion 1. If S1 < S2, the manipulator undergoes motion\n2 with the same actuation.\nHowever, most existing work on co-optimization generally\nfocuses on geometric dimensions as design parameters, such\nas the leg length for legged robots [3], which are fixed\nafter fabrication and difficult to modify [8]. To enable\nrobots capable of adapting their morphology and behavior\non the fly to accommodate different tasks or environments,\nit is crucial to consider parameters that can be adjusted or\nreconfigured post-fabrication, and we call them physical re-\nconfiguration parameters. Examples of such reconfiguration\nparameters include curvatures for body/leg parts [9], [10]\nand joint stiffness, which can be actively tuned based on the\nadvancements in variable stiffness materials [11]. Tunable\njoint stiffness, in particular, enables programmable motion in\nmechanical systems such as origami [12], [13] and linkage-\nbased mechanisms [14], enhancing their adaptability after\nfabrication.\nWe have recently demonstrated an origami-inspired recon-\nfigurable module capable of achieving programmable shapes\nand motions under tendon actuation and different stiffness\nfor selected joints [15]. Fig. 1 shows a manipulator with two\nserially connected modules. When the stiffness of one joint\nin the top module S1 is larger than the stiffness of one joint\nin the bottom module S2, the manipulator undergoes motion\n1. If S1 < S2, the manipulator undergoes motion 2 with the\nsame actuation (detailed working principle can be found in\n[15] and section III). With more modules connected in series,\nthe resulting manipulator can achieve more diverse motions,\nwithout changing the geometric dimensions, but by tuning\nthe stiffnesses of selected joints within each module.\nThe contribution of this paper is to leverage a co-design\narXiv:2504.10474v1 [cs.RO] 14 Apr 2025\nframework to jointly optimize physical reconfiguration pa-\nrameters and controllers for reconfigurable robotic systems.\nSpecifically, we co-optimize the joint stiffnesses and tendon\nactuations for the reconfigurable origami-inspired manipu-\nlator to accomplish desired tasks such as reaching a goal\nposition while avoiding certain objects. We address the\nproblem by considering the reconfiguration parameters (i.e.,\njoint stiffness) as design parameters (instead of the tradi-\ntionally used geometric dimensions) and tendon actuations\nas control actions. 
Specifically, we maintain a Gaussian\ndistribution over the joint stiffnesses and uses reinforcement\nlearning to optimize both the neural network control policy\nand the distribution parameters to maximize the expected\nreturn of the control policy over the stiffness distribution.\nCo-optimizing the joint stiffnesses and tendon actuations\ncan generate robotic manipulator that can adapt to task\nrequirement post fabrication if we can control the stiffness\nat a specific value before the tendons are actuated.\nThe rest of this paper is organized as follows. Related\nworks are discussed in Section II. After that, we explain\nthe working principle, and develop a forward model of\nthe origami manipulator in Section III. We then discuss\nwhy co-optimization is necessary for the manipulator in\nSection IV. After that, we discuss how we implement the co-\noptimization of the design parameters and control algorithm\nof the manipulator for a reaching task as well as the results.\nLastly, conclusions are drawn in Section VII.\nII. RELATED WORK\nProgrammable motion with variable stiffness joints.\nFor origami robots, the stiffnesses of creases can influ-\nence their mechanical properties and dynamic behaviors.\nMoreover, the ability of origami systems to dynamically\nadjust their stiffness in real time could greatly expand their\napplicability in robotic tasks, such as locomotion, manipula-\ntion, and grasping. Firouzeh et al. [16] used shape memory\npolymer (SMP) for an underactuated origami gripper, which\ncan change the stiffness by heating up the SMP joints.\nLin et al. [17] employed laminar jamming to control the\nstiffness of an origami structure on the fly. Lerner et al. [13]\ndeveloped a novel variable stiffness joint and demonstrated\nthe programmable motion of an origami robot with those\njoints.\nModel-based co-optimization. Some researchers used\ndetailed dynamic models of the robots for model-based co-\noptimization. For instance, Spielberg et al. [18] demonstrated\nthat robot design parameters can be incorporated into trajec-\ntory optimization process, enabling the concurrent optimiza-\ntion of robot trajectories and physical designs. Deimel et al.\n[19] used a simplified model for the dynamics of the soft\nrobotic hand in the co-design process, which updates the\ndesign parameters with particle filter optimization method.\nLiao et al. [20] proposed hierarchical process constrained\nbatch Bayesian optimization (HPC-BBO) to automatically\noptimize robot morphologies and the controllers in a data\nefficient manner.\nFig. 2: A reconfigurable manipulator consisting of two\norigami-inspired modules connected in series\nData-driven co-optimization. Data-driven approaches,\nsuch as deep reinforcement learning (RL), have proven\nhighly effective in addressing the complex dynamics of\nrobotics and their interactions with the environment [21].\nCompared with model-based co-optimization methods, RL-\nbased co-optimization methods excel in learning directly\nfrom interactions with the environment. Recently, researchers\nhave also explored implementing co-optimization using RL\nmethods [22]\u2013[25]. For instance, He et al. [23] proposed the\nMORPH framework to co-optimize robot morphology and\ncontrol policy using a neural network based proxy model\nto approximate the real physical model of the robot. Wang\net al. [26] proposed Neural Graph Evolution to co-evolve\nboth the robot design and the control policy, representing\na robot\u2019s morphology with a graph neural network. Chen\net al. 
[27] developed a bi-level optimization method to co-\noptimize both morphology parameters and control policy for\nsmall-scale legged robots.\nIII. ORIGAMI-INSPIRED RECONFIGURABLE\nMANIPULATOR\nIn this section, we discuss the working principle of the\nreconfigurable manipulator and the forward kinematic model\nthat can predict the manipulator\u2019s motion given joint stiffness\nand tendon actuation.\nA. Working Principle\nA manipulator consisting of two origami-inspired recon-\nfigurable modules connected in series is shown in Fig. 2.\nWe refer to the top module as Module 2 and the bottom\none as Module 1. Each module has a top and a bottom\ntriangular plate that are connected by three pairs of vertical\nand diagonal links. These links are attached to the plates\nthrough silicone tubes, which function as compliant spherical\njoints. The bottom plate of Module 1 is fixed. On each side\nof the triangular manipulator, an actuation tendon (shown in\nyellow in Fig. 2) is anchored to the top plate of Module\n2, routed along the diagonal links and threaded through the\nplates in a zigzag pattern. The tendons extend through the\nfixed bottom plate of Module 1 and are ultimately connected\nto motors placed beneath the bottom plate, which control the\ndisplacements of the tendons.\nEach diagonal link is implemented as a Variable Stiffness\nJoint (VSJ), consisting of a thermoplastic material enclosed\nby an elastic tube in the middle. We can reconfigure the\nstiffness of each VSJ through Joule heating by using heating\nwires around the tube. The detailed working principle for the\nVSJs is introduced in our earlier work [14], [15]. Depending\non the SVJs\u2019 stiffness values, each module exhibits distinct\nmotion characteristics in response to tendon actuation. By\nconnecting multiple modules in series, the manipulator can\nachieve more complex and versatile motions, significantly\nexpanding its workspace and functional capabilities. For a\nmanipulator, we define the centroid of the top plate of the\ntopmost module as its effective tip.\nB. Forward Model\nTo predict the shape of the origami manipulator, which\nconsists of N serially connected modules, a forward model\nis required to determine its deformation given the dis-\nplacements of the three tendons, represented as D\n=\n[d1, d2, d3]T , and the stiffnesses of all joints, represented\nas S = [s1\n1, s1\n2, s1\n3, s2\n1, s2\n2, s2\n3, ..., sN\n1 , sN\n2 , sN\n3 ]T . Unlike our\nprevious design in [15], the current manipulator is driven\nby three tendons, necessitating the development of a new\nforward model for the co-optimization process. To achieve\nthis, we employ the minimum potential energy method,\nwhich determines the equilibrium shape of the manipulator\nby considering both the applied tendon displacements and\nthe stiffnesses of all VSJs.\nWe first illustrate important parameters for the forward\nmodel. In Fig. 3, the initial shape of Module 1 in Fig. 2\nis shown in transparent. To simplify the model, we assume\nthat every neighboring pair of vertical link and diagonal link\nis connected to the top or bottom plate at the same point,\nmeaning they share the same spherical joint. For instance,\nvertical link P1Qini\n1\nand diagonal link P2Qini\n1\nare connected\nto the top plate Qini\n1 Qini\n2 Qini\n3\nat the same point Qini\n1 . If link\nP2Q1 is soft and P3Q2, P1Q3 are rigid, the module under\nactuation would deform to a shape shown in nontransparent\nin Fig. 
3.\nFor a manipulator made from N modules, the shape of j-\nth module is uniquely represented by the chord lengths of its\nthree diagonal links [bj\n1, bj\n2, bj\n3]T . Note that if a diagonal link\nis straight, the chord length equals its initial length bini. In\nthis case, the shape of the manipulator can be represented as\nFig. 3: Geometry of the deformed module\nB = [b1\n1, b1\n2, b1\n3, b2\n1, b2\n2, b2\n3, ..., bN\n1 , bN\n2 , bN\n3 ]T . Since the tendons\ncannot extend, the manipulator is subject to the following\nconstraint equation.\ndi \u2264N \u00b7 bini \u2212\nN\nX\nj=1\nbj\ni,\ni = 1, 2, 3\n(1)\nwhere i corresponds to the i-th side of the manipulator.\nHowever, it is possible that infinite many sets of B may\nsatisfy the same tendon constraint. We use the minimum\npotential energy method to choose the set of B from the\ninfinite many possible Bs that minimizes the potential energy\nof the manipulator, E. For each module, the potential energy\nis calculated as follows.\nEj = 1\n2\n3\nX\ni=1\nsj\ni(\u03b3j\ni )2 + 1\n2\n12\nX\nm=1\nkj\nm(\u03c3j\nm)2\n(2)\nwhere the first item represents the potential energy stored\nin the three VSJs, the second item represents the potential\nenergy stored in the elastic spherical joints connecting the\nlinks and the plates. \u03b3i denotes the bending angle of the\nVSJs. \u03c3m denotes the bending angle of the spherical joints.\nsi and km are the corresponding effective stiffnesses of the\nVSJs and the spherical joints. \u03b31 and \u03c31 are shown in Fig.\n3. Note that \u03b3i and \u03c3m can be obtained from the shape of\nthe manipulator, B, through geometric calculations [15].\nTo model real-world constraints, we impose a limit on\nthe maximum force each cable can generate, as a physical\nprototype would rely on motors with finite stall torque. In\nthis way, the manipulator is also subject to the following\nconstraint equation of force limit, Fl.\n\u2202E\n\u2202di\n\u2264Fl,\ni = 1, 2, 3\n(3)\nWe formulate the forward model as an optimization prob-\nFig. 4: The manipulator under fixed actuation sequence ex-\nhibits different motions if the VSJs have different stiffnesses.\nTrajectories 1, 2, and 3 correspond to stiffness set S1, S2,\nS3, respectively.\nlem as follows\nmin\nB\nE =\nN\nX\nj=1\nEj\ns.t.\nConstraints (Eq. 1, Eq. 3)\n(4)\nThe solution B of the optimization problem depends on\nboth the stiffnesses of the VSJs S and the tendon displace-\nment D, since S and D are included in the objective function\nand constraint equations in Eq. (4). With an optimal B, we\ncan solve the forward problem to predict the final shape of\nthe manipulator consisting of multiple modules connected in\nseries, given D and S. We note that though we could only\nachieve binary stiffness of the VSJs (either rigid or soft)\nin our earlier work [15], the forward model presented here\nassumes continuously adjustable stiffness. This enhancement\nmakes the model more general and applicable to a wider\nrange of scenarios. The position of the effective tip of a\nmodule or a manipulator can be readily calculated based on\nthe shape of the module or the manipulator.\nIV. NECESSITY OF CO-OPTIMIZATION FOR JOINT\nSTIFFNESS AND TENDON ACTUATION\nIn this section, we discuss why co-optimization is needed.\nSpecifically, we first show that with different stiffnesses of\nthe VSJs, the motion of the manipulator can be different\nunder the same tendon actuation. We then show that the\nreachable workspace can also be different under different\nset of stiffness values.\nA. 
The same actuation can generate different trajectories\nunder different VSJ stiffnesses\nTo illustrate how the motion depends on the stiffness, we\nuse a manipulator made from four origami modules and pre-\ndict its motion using the forward model developed in section\nFig. 5: The manipulators with different stiffness selections\nhave varying reachable workspace.\nIII-B with the following actuation for the displacements of\nthe three tendons.\n\u22061 = 128t,\n\u22062 = 100 sin \u03c0\n2 t,\n\u22063 = 100t2\n(5)\nwhere the parameter t is in the range (0, 1). We choose three\ndifferent sets of stiffnesses as follows. S1 = [1.37, 1.65,\n0.66, 1.30, 0.64, 0.97, 0.68, 2.28, 0.76, 0.42, 1.23, 1.06]T\nis randomly sampled from a uniform distribution within the\nrange (0.4, 2.5). S2 is obtained by shifting S1 by 1 unit to\nthe right, and S3 is generated by shifting S1 by 2 units to\nthe right. The same actuation results in drastically different\ntrajectories for the position of the effective tip, as shown\nin Fig. 4. The blue dot represents the initial position of the\neffective tip when the manipulator is not actuated.\nB. Different stiffness can lead to different workspace\nWe further demonstrate that the same manipulator, with\ndifferent VSJ stiffnesses, can achieve varying reachable\nworkspaces. We obtain the workspace by uniformly sam-\npling tendon displacements within their feasible ranges and\ncomputing the corresponding end-effector position for each\nsample. The workspace varies with the stiffness selections\nsince the force limit is included in the forward model. For\ntwo different stiffness sets, S4 = [2.50, 0.80, 0.80, 2.50,\n0.80, 0.80, 2.50, 0.80, 0.80, 2.50, 0.80, 0.80]T and S5 =\n[0.80, 2.50, 0.80, 0.80, 2.50, 0.80, 0.80, 2.50, 0.80, 2.50,\n0.80, 0.80]T , we compute and visualize the corresponding\nworkspaces in Fig. 5, with green dots representing S4 and\nblue dots representing S5.\nV. CO-OPTIMIZATION OF VSJ STIFFNESS AND TENDON\nACTUATION\nIn\nthis\nsection,\nwe\ndescribe\nour\napproach\nto\nco-\noptimization by formulating it as a reinforcement learning\n(RL) problem. We begin with a brief review of RL fundamen-\ntals before presenting the formulation of the co-optimization\nprocedure.\nA. Reinforcement learning\nThe RL problem is generally formulated as a Markov\nDecision Process (MDP). An MDP can be represented by\na tuple (S, A, F, r), where S is the state space, A is the\naction space, F is the state transition model, r is the reward\nfunction. An agent in state st \u2208S at time t takes action\nat \u2208A according to some policy \u03c0\u03b8, and the environment\nreturns the agent\u2019s new state st+1 \u2208S according to the state\ntransition model F(st+1|st, at), along with the associated\nreward rt = r(st, at). 
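To make these MDP ingredients concrete for the manipulator, the sketch below shows the generic agent-environment loop implied by the formulation above, with the forward model of Section III-B playing the role of the transition model F. It is an illustrative sketch only: `env` is assumed to expose a Gymnasium-style reset()/step() interface and `policy` is a hypothetical state-to-action callable, neither taken from the paper.

```python
def rollout(env, policy, gamma=0.99, max_steps=50):
    """Generic MDP rollout: at each step the agent draws a_t from the policy,
    the environment returns s_{t+1} ~ F(.|s_t, a_t) and r_t = r(s_t, a_t),
    and the discounted sum of rewards is accumulated.

    `env` is assumed to follow the Gymnasium API, with step() internally
    solving the forward model of Section III-B to obtain the next shape of
    the manipulator; `policy` maps a state to an action (here, a change in
    the tendon displacements). Both are placeholders for illustration.
    """
    state, _ = env.reset()
    ret, discount = 0.0, 1.0
    for _ in range(max_steps):
        action = policy(state)                # a_t ~ pi_theta(. | s_t)
        state, reward, terminated, truncated, _ = env.step(action)
        ret += discount * reward              # accumulate gamma^t * r_t
        discount *= gamma
        if terminated or truncated:
            break
    return ret
```

The quantity accumulated here is the discounted return whose expectation defines the learning objective introduced next.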
The goal is to learn the optimal control\npolicy \u03c0\u2217\n\u03b8 : S \u2192A mapping states to actions that maximizes\nthe expected return\nJ(\u03c0\u03b8) = E\u03c4\u223c\u03c0\u03b8 [R(\u03c4)]\nwhere \u03c4 is a trajectory obtained by letting the agent act in\nthe environment using the policy \u03c0\u03b8, R(\u03c4) = PT\ni=0 \u03b3irt+i\nis the return for the trajectory \u03c4, where T is the length of\nthe trajectory, \u03b3 \u2208[0, 1) is the discount factor of the future\nrewards.\nA stochastic policy, denoted as \u03c0\u03b8(at | st), is commonly\nused to predict the action at given the current state st.\nThe stochastic nature of the policy encourages exploration\nduring the training process, enhancing the model\u2019s ability\nto discover optimal actions. Typically, \u03c0\u03b8(at | st) can be\nmodeled as a neural network, which takes the current state st\nas input, and outputs a probability distribution for sampling\nthe action at. In most cases, a Gaussian distribution is used\nfor the probability distribution, where the neural network\noutputs the mean and standard deviation for each dimension\nof the action at.\nIn this work, we use the Proximal Policy Optimization\n(PPO) [28] algorithm to learn an optimal control policy for\nour reconfigurable manipulator to achieve specific tasks (e.g.,\nreaching goal points). PPO offers significant advantages over\ntraditional policy gradient methods by providing a stable and\nefficient training process. PPO is an on-policy algorithm that\nalternates between sampling data from the environment and\noptimizing the following objective:\n\u02c6Et\nh\nmin\n\u0010\nrt(\u03b8) \u02c6At, clip(rt(\u03b8), 1 \u2212\u03f5, 1 + \u03f5) \u02c6At\n\u0011i\n(6)\nwhere rt(\u03b8) =\n\u03c0\u03b8(at|st)\n\u03c0\u03b8old(at|st) is the ratio function, \u02c6A is the ad-\nvantage function, \u03f5 is the clip range, a small hyperparameter\nwhich roughly says how far away the new policy is allowed\nto go from the old one. We set the clip range \u03f5 to be 0.2.\nThe clipped objective has the effect of maximizing expected\nreturn by making only small steps in policy space at a time.\nB. Co-optimization\nTo conduct the co-optimization, we extend the standard\nRL formulation to include the physical reconfiguration pa-\nrameters (i.e., the stiffness for VSJs) as a learnable parameter.\nSpecifically, we co-optimize the reconfiguration parameters\nand control policy of the robots using the algorithm presented\nin [29], which is inspired by the parameter exploring strategy\npresented in [30]. This algorithm is straightforward to imple-\nment and efficient in learning, as it optimizes reconfiguration\nFig. 6: Overview of the co-optimization approach\nparameters directly, without the need for an additional neural\nnetwork or modifying the architecture of the neural network,\nas used in [22], [23]. 
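As a concrete illustration of the clipped surrogate in Eq. (6) above, the snippet below computes the standard PPO loss from log-probabilities and advantage estimates. It is a generic PyTorch sketch of the textbook PPO objective, not the training code used in this work.

```python
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    """Clipped PPO surrogate of Eq. (6), returned as a loss to minimize:
    -E_t[ min( r_t(theta) * A_t, clip(r_t(theta), 1 - eps, 1 + eps) * A_t ) ].
    """
    ratio = torch.exp(logp_new - logp_old)        # r_t(theta) = pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()  # negate to maximize the objective
```

With the clip range eps = 0.2 used here, each gradient step can move the new policy only a limited distance from the policy that collected the data, which is what keeps the updates stable.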
We denote the design parameters as \u03c9.\nFor the manipulator, \u03c9 represents the VSJs\u2019 stiffness values.\nThe goal of our co-optimization is to obtain optimal \u03c9\u2217\u2208\u2126\nfrom the space of feasible reconfiguration parameters \u2126that\nmaximizes the agent\u2019s success when used in conjunction with\na corresponding optimal control policy \u03c0\u2217\n\u03b8.\nInstead of treating \u03c9 as fixed values, we model them as\nGaussian distributions p\u03d5(\u03c9) with means \u00b5 and standard\ndeviations \u03c3.\np(\u03c9i) =\n1\np\n2\u03c0\u03c32\ni\nexp\n\u0012\n\u2212(\u03c9i \u2212\u00b5i)2\n2\u03c32\ni\n\u0013\n,\n(7)\nNow with the Gaussian distribution, we can include the\noptimization of the design parameters to the classic pol-\nicy gradient procedure. Similarly to the update of policy\nparameters, we update the learning parameters \u03d5 of the\ndesign distribution (\u00b5 and \u03c3) using the episode return. The\noverview of the co-optimization approach is depicted in\nFig. 6, where the red part represents the classic policy\ngradient procedure, and the blue part represents the optimiza-\ntion of the reconfiguration parameters. The policy function\n\u03c0\u03b8(at|st, \u03c9) now depends on both the current state st and\nthe reconfiguration parameters \u03c9. Formally, we seek to find\nthe optimal parameters for both the design and policy \u03d5\u2217and\n\u03b8\u2217such that they can generate maximized expected return:\n\u03d5\u2217, \u03b8\u2217= arg max\n\u03d5,\u03b8 E\u03c9\u223cp\u03d5 [E\u03c0\u03b8[R\u03c4]] .\n(8)\nAt each iteration of training, the policy is trained using\nPPO to maximize the expected return over designs sampled\nfrom the current design distribution p\u03d5. Also, the design\ndistribution is updated at every iteration to increase the\nprobability density around designs that perform well when\nusing the current learned policy \u03c0\u03b8 [29]:\n\u2207E\u03c9\u223cp\u03d5 [E\u03c0\u03b8[Rt]] = E\u03c9\u223cp\u03d5 [\u2207log p\u03d5(\u03c9)E\u03c0\u03b8[Rt]] .\n(9)\nThis shifts the means and standard deviations of the re-\nconfiguration distribution \u03d5 to maximize the expected return\nFig. 7: Execution of the trained policy for the reaching task\nwith obstacle avoidance. The red curve shows the trajectory\nof the effective tip. The red dot shows the position of the\neffective tip at the last step.\nunder the current policy \u03c0\u03b8. The term \u2207log p\u03d5(\u03c9) in Eq. (9)\nis calculated as follows [30].\n\u2207\u00b5i log p(\u03c9i) = \u03c9i \u2212\u00b5i\n\u03c32\ni\n,\n(10)\n\u2207\u03c3i log p(\u03c9i) = (\u03c9i \u2212\u00b5i)2 \u2212\u03c32\ni\n\u03c33\ni\n.\n(11)\nAfter the training process, we choose the modes of the re-\nconfiguration distributions as the final VSJs\u2019 stiffness values.\nVI. RESULTS\nIn this section, we describe the simulation setup and\nthe corresponding results. The forward model in Section\nIII-B is implemented as the state transition model in the\nlearning process. As shown in Fig. 7, we increase the number\nof modules in the manipulator from 4 to 5 to achieve a\nmore challenging reaching task with obstacle avoidance. The\nbottom plate of its bottommost module is fixed. We aim\nto optimize both the stiffnesses of all VSJs and the control\nstrategy on the tendon displacements to make the effective tip\nof the manipulator reach a goal point while avoiding collision\nwith a surrounding obstacle.\nA. 
Reaching task with one obstacle\nThe objective of this task is to train the RL agent to reach\na predefined goal position at [\u221250, 0, 50]T while avoiding a\nplanar obstacle. The obstacle is positioned parallel to the\nYZ plane at a fixed x-coordinate of -25 mm. The agent\nmust navigate through the environment while adhering to\nkinematic constraints and ensuring collision-free movement.\nSince the obstacle lies between the manipulator\u2019s initial\nFig. 8: Average return of the training process with one\nobstacle for a total of 4 million time steps.\nposition and the goal, the manipulator must maneuver around\nthe obstacle to successfully reach the target. To guide the\nagent toward the goal position, we employ a dense reward\nfunction defined as follows.\nr = \u2212rc1(collision) + rs1(d < ds) +\nrd\n0.2 + d\n(12)\nwhere rc is the collision penalty applied if the agent collides\nwith the obstacle, rs is the success bonus reward applied if\nthe agent reaches the goal within threshold ds of 3.0 mm.\nThe indicator function 1(collision) returns 1 if a collision\nis detected and 0 otherwise. Similarly, the indicator function\n1(d < ds) returns 1 if d is less than ds, and 0 otherwise.\nd = \u2225pt\u2212pg\u2225is the distance between the effective tip pt and\nthe goal position pg. The last term represents the distance\nreward scaled by a constant rd. We extended the forward\nmodel to calculate the distance d and to detect the collision\nbetween the manipulator and the obstacle.\nWe employ the PPO algorithm with a multi-layer percep-\ntron (MLP) policy, consisting of two hidden layers with 64\nneurons each and ReLU activation. The observation space\nof the control policy consists of tendon displacements, stiff-\nnesses of the joints, shape of the manipulator, and the goal\nposition. The actions are the change of tendon displacements.\nThe forward model is used to develop a custom environment\ncompatible using the Gymnasium API [31] for the RL\nprocess. Our co-optimization framework is implemented with\nthe reinforcement learning library Stable-Baselines3 [32].\nTo balance exploration and exploitation, we employ a\nlinear decay strategy for both the entropy coefficient and the\nlearning rate. The entropy coefficient decreases linearly from\n0.02 to 0.001, while the learning rate gradually declines from\n0.00025 to 0. We apply normalization to standardize both\nobservations and rewards to help mitigate large fluctuations\nand improves learning stability. We present the average\nepisode return as a function of time steps in Fig. 8.\nThe stiffnesses of the VSJs are optimized through the\ntraining process to be S = [2.50, 2.41, 1.87, 0.58, 2.12,\n1.52, 2.50, 0.40, 0.40, 1.26, 0.40, 1.09, 0.82, 1.55, 1.25]T .\nFig. 9: Execution of the trained policy for the reaching\ntask while avoiding two obstacles. The red curve shows the\ntrajectory of the effective tip. The red dot shows the position\nof the effective tip at the last step.\nTo evaluate the trained control policy and stiffness values,\nwe deploy them on a simulated manipulator and visualize\nthe results in Fig. 7. The initial shape of the manipulator is\ndepicted in black and white. The trajectory of the effective\ntip (shown as the red curve) and the final shape of the\nmanipulator, shown in green, show the agent\u2019s ability to\nsuccessfully reach the goal (the black dot) while effectively\navoiding the surrounding obstacle. 
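To show how the training setup described in this subsection could be wired together, the sketch below uses Gymnasium and Stable-Baselines3 with an MLP policy of two 64-neuron ReLU hidden layers, a linearly decaying learning rate (2.5e-4 to 0), a clip range of 0.2, and observation/reward normalization, as described above. `ManipulatorEnvStub`, `linear_schedule`, and the 36-dimensional flattened observation are illustrative assumptions: the stub returns dummy dynamics where the paper's custom environment would solve the forward model of Eq. (4) and evaluate the dense reward of Eq. (12).

```python
import gymnasium as gym
import numpy as np
import torch.nn as nn
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize


class ManipulatorEnvStub(gym.Env):
    """Placeholder for the custom forward-model environment (hypothetical).

    Observations: tendon displacements (3) + joint stiffnesses (15) +
    manipulator shape B (15) + goal position (3), flattened to 36 entries
    for a 5-module manipulator (an assumed flattening). Actions: changes
    of the three tendon displacements. The dynamics below are dummy values.
    """

    def __init__(self):
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(36,), dtype=np.float32)
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(3,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        return np.zeros(36, dtype=np.float32), {}

    def step(self, action):
        # A real implementation would solve Eq. (4) for the new shape and
        # evaluate the reward of Eq. (12), including the collision check.
        obs = np.zeros(36, dtype=np.float32)
        return obs, 0.0, False, False, {}


def linear_schedule(start, end=0.0):
    # Stable-Baselines3 passes progress_remaining, which goes from 1 to 0.
    return lambda progress_remaining: end + progress_remaining * (start - end)


env = VecNormalize(DummyVecEnv([lambda: ManipulatorEnvStub()]),
                   norm_obs=True, norm_reward=True)
model = PPO(
    "MlpPolicy",
    env,
    policy_kwargs=dict(net_arch=[64, 64], activation_fn=nn.ReLU),
    learning_rate=linear_schedule(2.5e-4),
    clip_range=0.2,
    ent_coef=0.02,  # decayed to 0.001 in the paper; SB3 would need a small callback for that
)
model.learn(total_timesteps=4_000_000)  # 4 million steps, as in Fig. 8
```

Wrapping the environment in VecNormalize standardizes both observations and rewards, mirroring the normalization step used above to mitigate large fluctuations during learning.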
We also train control\npolicies for the same reaching task with obstacle avoidance\nwithout the co-optimization procedure. In this case, the\nstiffness values are predetermined as stiffness sets S1 and\nS4 and remain fixed throughout the learning process. The\ntrained control policies are then deployed on the manipulator\nwith these specified stiffnesses, and the resulting trajectories\nare shown as the blue and green curves, respectively. For\nstiffness set S1, the agent navigates toward the goal but fails\nto reach it completely. In contrast, for stiffness S4, the agent\nbecomes trapped in a local minimum, avoiding both the goal\nposition and the obstacle.\nB. Reaching task with two obstacles\nWe increase the task difficulty by introducing a second\nobstacle (obstacle 2 in Fig. 9), which is parallel to the YZ\nplane and positioned at a fixed x-coordinate of \u221260 mm. This\nobstacle extends from a z-coordinate of 135 mm (bottom\nedge) to 200 mm (top edge).\nAs shown in Fig. 7, the previously trained policy leads\nthe agent to collide with this second obstacle during its\nmotion. Unlike the first obstacle, which primarily constrains\nthe final stage of the motion, the second obstacle introduces\nFig. 10: Average return of the training process with two\nobstacles for a total of 2 million timesteps\nconstraints early in the trajectory, requiring the agent to plan\nits movement from the very beginning.\nNow we include the second obstacle in the custom envi-\nronment and train a new policy to make the agent reach the\nsame goal point while avoiding both obstacles. The reward\nfunction and hyperparameters are the same as before, except\nthat we increase the initial value of the entropy coefficient\nto 0.03 to encourage greater exploration.\nWe present the average episode return as a function of\ntime steps in Fig. 10. The stiffnesses of the VSJs are\noptimized through the training process to be S = [2.37,\n1.70, 2.50, 1.86, 2.04, 2.11, 0.49, 0.82, 1.98, 0.50, 2.50,\n2.31, 0.69, 2.50, 0.90]T . We also deploy the learned control\npolicy and stiffness values on a simulated manipulator and\nvisualize the results in Fig. 9. The resulting trajectory of\nthe effective tip (shown as the red curve) demonstrates that\nthe agent successfully reaches the goal (depicted as the\nblack dot) while effectively avoiding both obstacles. Notably,\nwe observe that the manipulator initially bends slightly to\nthe right before redirecting its motion toward the left to\nreach the goal. This initial rightward movement allows the\nmanipulator to navigate around obstacle 2 before proceeding\ntoward the target, illustrating a strategic adjustment that\nensures collision-free motion. For comparison, we also train\ncontrol policies for the same reaching task with both obstacle\navoidance without the co-optimization procedure. As in the\nprevious case, the stiffness values are predetermined to be\nS1 and S4 and remain fixed throughout the learning process.\nThe resulting trajectories are shown as the blue and green\ncurves, respectively in Fig. 10. For stiffness set S1, the\nagent navigates toward the goal but fails to avoid obstacle\n2. For stiffness set S4, the agent again get trapped in a local\nminimum, avoiding both the goal position and the planar\nobstacle.\nVII. 
CONCLUSION\nIn this work, we applied a RL-based co-optimization\nalgorithm to jointly optimize the joint stiffnesses and tendon\nactuation of an origami-inspired reconfigurable manipulator.\nWe first introduced the working principle and developed\na forward model for the manipulator, which consists of\nmultiple serially connected origami-inspired modules. We\nthen demonstrated that the manipulator\u2019s design parame-\nters, specifically the joint stiffnesses, significantly influence\nits motion and workspace. Finally, by integrating stiffness\noptimization into the control learning process, we showed\nthat the co-optimized manipulator outperforms agents with\nfixed design parameters in reaching tasks while avoiding\nobstacles. These results underscore the importance of co-\noptimizing physical reconfiguration parameters and control\npolicies, as different stiffness configurations directly impact\nthe manipulator\u2019s kinematic behavior.\nFuture work will focus on extending this approach to more\ncomplex manipulation tasks. Additionally, we aim to validate\nthe learned policies and stiffnesses on physical prototypes to\nassess their real-world feasibility and robustness.\nREFERENCES\n[1] T. Miki, J. Lee, J. Hwangbo, L. Wellhausen, V. Koltun, and M. Hutter,\n\u201cLearning robust perceptive locomotion for quadrupedal robots in the\nwild,\u201d Science robotics, vol. 7, no. 62, p. eabk2822, 2022.\n[2] A. Kumar, Z. Fu, D. Pathak, and J. Malik, \u201cRma: Rapid motor\nadaptation for legged robots,\u201d arXiv preprint arXiv:2107.04034, 2021.\n[3]\n\u00b4A. Belmonte-Baeza, J. Lee, G. Valsecchi, and M. Hutter, \u201cMeta\nreinforcement learning for optimal design of legged robots,\u201d IEEE\nRobotics and Automation Letters, vol. 7, no. 4, pp. 12 134\u201312 141,\n2022.\n[4] T. Dinev, C. Mastalli, V. Ivan, S. Tonneau, and S. Vijayakumar, \u201cA\nversatile co-design approach for dynamic legged robots,\u201d in 2022\nIEEE/RSJ International Conference on Intelligent Robots and Systems\n(IROS).\nIEEE, 2022, pp. 10 343\u201310 349.\n[5] C. Schaff, A. Sedal, and M. R. Walter, \u201cSoft robots learn to crawl:\nJointly optimizing design and control with sim-to-real transfer,\u201d arXiv\npreprint arXiv:2202.04575, 2022.\n[6] T. Chen, Z. He, and M. Ciocarlie, \u201cCo-designing hardware and control\nfor robot hands,\u201d Science Robotics, vol. 6, no. 54, p. eabg2133, 2021.\n[7] A. Gupta, L. Fan, S. Ganguli, and L. Fei-Fei, \u201cMetamorph:\nLearning universal controllers with transformers,\u201d arXiv preprint\narXiv:2203.11931, 2022.\n[8] T. F. Nygaard, C. P. Martin, J. Torresen, K. Glette, and D. Howard,\n\u201cReal-world\nembodied\nai\nthrough\na\nmorphologically\nadaptive\nquadruped robot,\u201d Nature Machine Intelligence, vol. 3, no. 5, pp. 410\u2013\n419, 2021.\n[9] R. Baines, S. K. Patiballa, J. Booth, L. Ramirez, T. Sipple, A. Garcia,\nF. Fish, and R. Kramer-Bottiglio, \u201cMulti-environment robotic transi-\ntions through adaptive morphogenesis,\u201d Nature, vol. 610, no. 7931,\npp. 283\u2013289, 2022.\n[10] J. Sun, E. Lerner, B. Tighe, C. Middlemist, and J. Zhao, \u201cEmbedded\nshape morphing for morphologically adaptive robots,\u201d Nature Com-\nmunications, vol. 14, no. 1, p. 6023, 2023.\n[11] T. L. Buckner, M. C. Yuen, S. Y. Kim, and R. Kramer-Bottiglio,\n\u201cEnhanced variable stiffness and variable stretchability enabled by\nphase-changing particulate additives,\u201d Advanced Functional Materials,\nvol. 29, no. 50, p. 1903368, 2019.\n[12] M. Stern, C. Arinze, L. Perez, S. E. 
Palmer, and A. Murugan, \u201cSu-\npervised learning through physical changes in a mechanical system,\u201d\nProceedings of the National Academy of Sciences, vol. 117, no. 26,\npp. 14 843\u201314 850, 2020.\n[13] E. Lerner, Z. Chen, and J. Zhao, \u201cReconfigurable origami with\nvariable stiffness joints for adaptive robotic locomotion and grasping,\u201d\nPhilosophical Transactions A, vol. 382, no. 2283, p. 20240017, 2024.\n[14] J. Sun and J. Zhao, \u201cAn adaptive walking robot with reconfigurable\nmechanisms using shape morphing joints,\u201d IEEE Robotics and Au-\ntomation Letters, vol. 4, no. 2, pp. 724\u2013731, 2019.\n[15] Z. Chen, B. Tighe, and J. Zhao, \u201cOrigami-inspired modules enable\na reconfigurable robot with programmable shapes and motions,\u201d\nIEEE/ASME Transactions on Mechatronics, vol. 27, no. 4, pp. 2016\u2013\n2025, 2022.\n[16] A. Firouzeh and J. Paik, \u201cGrasp mode and compliance control of\nan underactuated origami gripper using adjustable stiffness joints,\u201d\nIeee/asme Transactions on Mechatronics, vol. 22, no. 5, pp. 2165\u2013\n2173, 2017.\n[17] Y. Lin, G. Yang, Y. Liang, C. Zhang, W. Wang, D. Qian, H. Yang, and\nJ. Zou, \u201cControllable stiffness origami \u201cskeletons\u201d for lightweight and\nmultifunctional artificial muscles,\u201d Advanced Functional Materials,\nvol. 30, no. 31, p. 2000349, 2020.\n[18] A. Spielberg, B. Araki, C. Sung, R. Tedrake, and D. Rus, \u201cFunctional\nco-optimization of articulated robots,\u201d in 2017 IEEE International\nConference on Robotics and Automation (ICRA).\nIEEE, 2017, pp.\n5035\u20135042.\n[19] R. Deimel, P. Irmisch, V. Wall, and O. Brock, \u201cAutomated co-design\nof soft hand morphology and control strategy for grasping,\u201d in 2017\nIEEE/RSJ International Conference on Intelligent Robots and Systems\n(IROS).\nIEEE, 2017, pp. 1213\u20131218.\n[20] T. Liao, G. Wang, B. Yang, R. Lee, K. Pister, S. Levine, and\nR. Calandra, \u201cData-efficient learning of morphology and controller\nfor a microrobot,\u201d in 2019 International Conference on Robotics and\nAutomation (ICRA).\nIEEE, 2019, pp. 2488\u20132494.\n[21] A. Nagabandi, K. Konolige, S. Levine, and V. Kumar, \u201cDeep dynamics\nmodels for learning dexterous manipulation,\u201d in Conference on Robot\nLearning.\nPMLR, 2020, pp. 1101\u20131112.\n[22] Y. Yuan, Y. Song, Z. Luo, W. Sun, and K. Kitani, \u201cTransform2act:\nLearning a transform-and-control policy for efficient agent design,\u201d\narXiv preprint arXiv:2110.03659, 2021.\n[23] Z. He and M. Ciocarlie, \u201cMorph: Design co-optimization with rein-\nforcement learning via a differentiable hardware model proxy,\u201d in 2024\nIEEE International Conference on Robotics and Automation (ICRA).\nIEEE, 2024, pp. 7764\u20137771.\n[24] S. Islam, Z. He, and M. Ciocarlie, \u201cTask-based design and policy co-\noptimization for tendon-driven underactuated kinematic chains,\u201d arXiv\npreprint arXiv:2405.14566, 2024.\n[25] A. Spielberg, A. Zhao, Y. Hu, T. Du, W. Matusik, and D. Rus,\n\u201cLearning-in-the-loop optimization: End-to-end control and co-design\nof soft robots through learned deep latent representations,\u201d Advances\nin Neural Information Processing Systems, vol. 32, 2019.\n[26] T. Wang, Y. Zhou, S. Fidler, and J. Ba, \u201cNeural graph evo-\nlution: Towards efficient automatic robot design,\u201d arXiv preprint\narXiv:1906.05370, 2019.\n[27] C. Chen, P. Xiang, J. Zhang, R. Xiong, Y. Wang, and H. 
Lu,\n\u201cDeep reinforcement learning based co-optimization of morphology\nand gait for small-scale legged robot,\u201d IEEE/ASME Transactions on\nMechatronics, vol. 29, no. 4, pp. 2697\u20132708, 2023.\n[28] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov,\n\u201cProximal\npolicy\noptimization\nalgorithms,\u201d\narXiv\npreprint\narXiv:1707.06347, 2017.\n[29] C. Schaff, D. Yunis, A. Chakrabarti, and M. R. Walter, \u201cJointly\nlearning to construct and control agents using deep reinforcement\nlearning,\u201d in 2019 international conference on robotics and automation\n(ICRA).\nIEEE, 2019, pp. 9798\u20139805.\n[30] F. Sehnke, C. Osendorfer, T. R\u00a8uckstie\u00df, A. Graves, J. Peters, and\nJ. Schmidhuber, \u201cParameter-exploring policy gradients,\u201d Neural Net-\nworks, vol. 23, no. 4, pp. 551\u2013559, 2010.\n[31] M. Towers, A. Kwiatkowski, J. Terry, J. U. Balis, G. De Cola, T. Deleu,\nM. Goul\u02dcao, A. Kallinteris, M. Krimmel, A. KG et al., \u201cGymnasium:\nA standard interface for reinforcement learning environments,\u201d arXiv\npreprint arXiv:2407.17032, 2024.\n[32] A. Raffin, A. Hill, A. Gleave, A. Kanervisto, M. Ernestus, and\nN. Dormann, \u201cStable-baselines3: Reliable reinforcement learning\nimplementations,\u201d Journal of Machine Learning Research, vol. 22,\nno. 268, pp. 1\u20138, 2021. [Online]. Available: http://jmlr.org/papers/\nv22/20-1364.html" - }, - { - "domain": "Materials Science", - "chunk_type": "general", - "text": "Long-range magnetic interactions in Nd2PdSi3 and the formation of skyrmion phases\nin centrosymmetric metals\nViviane Pe\u00b8canha-Antonio,1, 2, \u2217Zhaoyang Shan,3 Michael Smidman,3 Juba Bouaziz,4 Bachir Ouladdiaf,5\nIurii Kibalin,5 Marie Helene Lemee,5 Christian Balz,2 Jakob Lass,6 Daniel A. Mayoh,7\nGeetha Balakrishnan,7 Julie B. Staunton,7 Davashibhai Adroja,2, 8 and Andrew T. Boothroyd1, \u2020\n1Department of Physics, University of Oxford, Clarendon Laboratory, Oxford OX1 3PU, United Kingdom\n2ISIS Neutron and Muon Source, STFC Rutherford Appleton Laboratory, Didcot OX11 0QX, United Kingdom\n3Center for Correlated Matter and School of Physics, Zhejiang University, Hangzhou 310058, China\n4Department of Physics, The University of Tokyo, Bunkyo-ku, Tokyo, 113-0033, Japan\n5Institut Laue-Langevin, 6 Rue Jules Horowitz, BP 156, 38042 Grenoble Cedx 9, France\n6PSI Center for Neutron and Muon Sciences, 5232 Villigen PSI, Switzerland\n7Department of Physics, University of Warwick, Coventry CV4 7AL, United Kingdom\n8Highly Correlated Matter Research Group, Physics Department,\nUniversity of Johannesburg, Auckland Park 2006, South Africa\n(Dated: April 15, 2025)\nWe present an extensive X-ray and neutron scattering study of the structure and magnetic exci-\ntations of Nd2PdSi3, a sister compound of Gd2PdSi3 which was recently found to host a skyrmion\nlattice phase despite its centrosymmetric crystal structure. Dispersive magnetic excitations were\nmeasured throughout the Brillouin zone and modelled using mean-field random-phase approxima-\ntion to determine the magnetic interactions between Nd ions. Our analysis reveals that the magnetic\ninteractions in this system extend over large distances and are significantly affected by a crystallo-\ngraphic superstructure formed by ordering of the Pd and Si atoms. The results suggest that the\nmechanism for the skyrmion phase formation in this family of materials, e.g. 
Gd2PdSi3 is through\nthe long-range RKKY interactions rather than short-range triangular-lattice frustration.\nThe\ndiscovery\nof\nmagnetic\nskyrmions\nin\nnon-\ncentrosymmetric compounds with the B20 crystal struc-\nture [1\u20133] has revolutionised the study of spin topology\nin condensed matter. More than that, it has initiated a\nnew field of research, skyrmionics, which aims to harness\nthe exotic properties resulting from the coupling between\nmagnetic and electronic degrees of freedom in these ma-\nterials [4\u20136]. In proposals for a new generation of mem-\nory and logic devices based on skyrmion crystals (SkX),\nsmall skyrmion sizes, of the order of a few unit cells, are\npreferable over spin textures extending over large crystal-\nlographic scales [7\u20139]. Recent conceptual qubit construc-\ntions, for example, rely upon the information density a\nnanoscale SkX would be able to store [10, 11].\nTheoretical work has helped broaden the spectrum of\ncandidate skyrmion crystals, to the extent that predic-\ntions exist for topologically non-trivial magnetism even\nin centrosymmetric compounds [12\u201320]. This is perhaps\nsurprising given that the Dzyaloshinskii\u2013Moriya interac-\ntion, usually responsible for stabilising such phases, is\nabsent in crystals with inversion symmetry. Some of the\nmodels combine exchange interactions with geometrical\nfrustration [12\u201314, 19], whereas others show that topolog-\nical spin textures can be stabilised even in the absence\nof frustration or magnetic field [16, 17, 20]. Most the-\nories involve exchange couplings of a long-range nature,\nrequiring two or three nearest neighbours [12\u201317, 20]. For\nmetallic systems, the Ruderman-Kittel-Kasuya-Yosida\n(RKKY) type coupling provides a natural mechanism\nfor the formation of multi-q, skyrmion-like structures,\nvia Fermi-surface nesting [15, 16, 20\u201323]. Importantly,\nskyrmions in centrosymmetric materials are predicted to\nbe much smaller and more densely packed than in non-\ncentrosymmetric crystals.\nExperiments have now confirmed the existence of a\nhandful of such candidate centrosymmetric skyrmion\ncompounds [24\u201328]. One of these is Gd2PdSi3, a member\nof a wider family of R2PdSi3 intermetallics with stacked\ntriangular layers of R (rare-earth) atoms which have been\nextensively studied for their exotic magnetic properties\n[29\u201340]. Besides its anisotropic magnetic behaviour and\nlarge negative magnetoresistance [40, 41], Gd2PdSi3 was\nshown to host a skyrmion lattice phase in moderate ap-\nplied magnetic fields [24]. The SkX was proposed to be\na triple-q spin structure formed by the superposition of\nthree spin helices at 120\u25e6to each other [24].\nTo understand the formation of a SkX in Gd2PdSi3\nand other centrosymmetric metals, it is essential to quan-\ntify the magnetic interactions. This can be achieved via\ninelastic neutron scattering (INS) measurements of dis-\npersive magnetic excitations, but for Gd2PdSi3 this is\nnot straightforward. First, the magnetic order is com-\nplex. Second, the Pd and Si atoms form a superstruc-\nture which creates many inequivalent exchange paths.\nThis superstructure lowers the hexagonal space group\nsymmetry and has a significant influence on the elec-\ntronic bands near the Fermi level [23], implying that any\nconduction-electron-mediated exchange must also be af-\nfected. 
Third, samples containing Gd are extremely chal-\nlenging to study by INS owing to the large neutron ab-\narXiv:2504.10075v1 [cond-mat.str-el] 14 Apr 2025\n2\nsorption cross-section of Gd. The only neutron scattering\nstudies reported so far used samples enriched with 160Gd\nto reduce neutron absorption [42, 43]. In Ref. 42, the au-\nthors analysed INS and paramagnetic diffuse scattering\ndata from a polycrystalline sample to develop a minimal\nmodel, and found evidence that the magnetic interactions\nextend beyond the first few nearest neighbors.\nIn this work we take a different approach, choos-\ning instead to study the related compound Nd2PdSi3,\nwhose ferromagnetic structure is amenable to analysis,\nand whose neutron absorption is sufficiently small that\nhigh-quality INS data could be obtained from a single-\ncrystal sample. We determine the Pd/Si superstructure\nin Nd2PdSi3 and take it into account when modelling\nthe magnetic spectrum. We find clear evidence that the\ninteractions extend over long distances, switching from\nmainly ferromagnetic to antiferromagnetic coupling with\nincreasing atomic separation.\nAssuming the magnetic\ninteractions are qualitatively similar in the Nd and Gd\ncompounds (see below), our results support the theory\nthat the SkX in centrosymmetric Gd2PdSi3 is most prob-\nably stabilised by long-range RKKY interactions.\nWhen first reported, the R2PdSi3 intermetallics were\nbelieved to adopt an AlB2-type hexagonal structure with\nspace group P6/mmm [33]. In this description, 1a is the\nWyckoff position of the R atoms, while the Pd/Si atoms\nrandomly occupy the 2d sites. More recent single crystal\nwork demonstrated that Pd and Si in the heavy rare-\nearth members are in fact arranged in a 2a \u00d7 2a \u00d7 8c su-\nperstructure, where a and c are the hexagonal lattice pa-\nrameters (see [44] and references therein). In the present\nwork [45], we find that the supercell in Nd2PdSi3 has\ndimensions 2a \u00d7 2a \u00d7 4c. Despite this difference in out-\nof-plane modulation, the superstructure is described by\nthe same space group, Fddd [42], and can be constructed\nusing the layer stacking proposed in Ref. 44.\nThe conventional unit cell of the Nd2PdSi3 superstruc-\nture contains a total of 32 Nd atoms divided equally\namong two inequivalent Nd sites. Both sites have point\ngroup symmetry C2, but one is 2-fold and the other 4-fold\ncoordinated by Pd [see Fig. 1(a)]. Hence, the crystalline-\nelectric field (CF) is expected to be different at these two\nsites. Assuming that the Nd atoms adopt the usual 3+\nvalence, their LS-coupling J = 9/2 ground-state multi-\nplet is split by the orthorhombic CF into five Kramers\ndoublets in the paramagnetic state.\nBelow TC = 14\u2013\n17 K, when the compound develops a collinear ferromag-\nnetic long-range order [30, 33, 36, 45\u201348], each doublet is\nfurther split into two singlets by the exchange field. For\ntwo sites, a maximum of 2 \u00d7 (10 \u22121) = 18 excited levels\nform the magnetic spectrum at sufficiently low temper-\natures.\nDetails of the magnetic structure, determined\nhere by neutron diffraction, are given in the Supplemen-\ntal Material [45].\nFig. 1(b) shows the INS spectrum measured at a\ntemperature of 1.7 K along the hexagonal reciprocal\nPd\nSi\nNd\nFIG. 1. (a) Part of the crystal structure of Nd2PdSi3, show-\ning the local environments of the two Nd sites.\nNd atoms\n(green/orange) may be 4-fold coordinated (top) or 2-fold coor-\ndinated (bottom) by Pd atoms. 
(b) Inelastic neutron scatter-\ning spectrum measured on LET along (\u2212h, h, 0) at 1.7 K with\nneutrons of incident energy Ei = 15.6 meV. (c)\u2013(d) Constant-\nQ cuts performed at (\u22121\n2, 1\n2, 0) and \u27e8Q\u27e9= (\u22121, 1, 0) r.l.u., re-\nspectively. Data in (b) were integrated over \u00b10.15 r.l.u. along\n\u27e8Q\u27e9and two perpendicular directions.\n(e)\u2013(f) Nd CF lev-\nels for (e) short-spin and (f) long-spin sites, according to our\nmodel. Continuous and dashed lines represent, respectively,\nlevels with nonzero or zero INS cross section for excitation\nfrom the ground state.\nspace direction (\u2212h, h, 0) and Figs. 1(c)\u2013(d) are constant\nwavevector Q cuts performed for h = 1 and h = 1\n2. Sev-\neral dispersive modes, originating from the CF levels of\nthe two inequivalent Nd sites, can be observed in the en-\nergy range from 2 to 8 meV. Higher resolution data (see\nbelow) confirm that there are in fact three levels with\nmeasurable intensity at energies between 2.7 and 4 meV,\nand another below 1 meV. Consistent with Ref. 48, no\nmodes were observed above 8 meV, which implies that\nthe spectrum manifest in Fig. 1(b) represents the entire\nobservable J = 9/2 multiplet splitting of the Nd ions.\nTo model the data we used the mean-field random-\nphase approximation [49, 50], which combines a descrip-\ntion of the single-ion and two-ion interactions in a self-\nconsistent way. The magnetic Hamiltonian is thus com-\n3\nFIG. 2. (a)\u2013(e) Spectra along several high-symmetry directions in reciprocal space measured on LET with neutrons of incident\nenergy Ei = 6 meV, at a temperature of 1.7 K. Brillouin zone high-symmetry labels are shown above the panels, as defined\nin Ref. 45. (f)\u2013(j) Simulated spectra for the model which best describes the data displayed in (a)\u2013(e). Positions in reciprocal\nspace are indexed with respect to the parent P6/mmm space group.\nposed of two parts\nH =\nX\ni\nHCF\ni\n\u2212\nX\n\u27e8ij\u27e9\nJT\ni \u00b7 J ij \u00b7 Jj,\n(1)\nwhere HCF is the crystal field component and J ij are ex-\nchange matrices representing the coupling between total\nangular momenta Ji and Jj. We consider the approxi-\nmate CF Hamiltonian\nHCF =\nX\nk=2,4,6\nBk\n0 \u02c6C(k)\n0\n+ B6\n6( \u02c6C(6)\n6\n+ \u02c6C(6)\n\u22126),\n(2)\nwhere \u02c6C(k)\n\u00b1q are the Wybourne tensor operators and Bk\nq\nthe corresponding parameters. The site subscript is omit-\nted for simplicity. The Hamiltonian in Eq. (2) describes\nthe leading-order 6/mmm point symmetry at the Nd site\nof the hexagonal parent structure.\nAlthough the true\npoint symmetry requires more parameters, the truncated\nHamiltonian (2) already fully splits the J = 9/2 mani-\nfold. The higher order terms do not cause any further\nloss of degeneracy, so their effects can be approximately\nabsorbed into the four parameters included in Eq. (2)\nand in the exchange Hamiltonian developed below.\nWe find that a satisfactory fit can be achieved as long\nas a consistent assignment of levels is made to each one\nof the two Nd sites. Data collected in magnetic fields up\nto 11 T applied along c were used to constrain the CF\nmodel [45]. Figs. 1(e)\u2013(f) present the calculated crystal-\nfield scheme which best reproduces the observed excita-\ntions. Because HCF differs for both sites, the ground-\nstate magnetic moment for the two Nd is distinct. The\nNd with the highest [lowest] ground-state magnetic mo-\nment is referred to as long spin [short spin], and displayed\nin Fig. 1(e) [1(f)]. 
The Bk\nq parameters used to generate\nFigs. 1(e)\u2013(f) are listed in Ref. 45.\nThe determination of the exchange couplings was made\nusing higher-resolution INS data shown in Figs. 2(a)\u2013(e),\nwhich provide a detailed view of the excitations up to\n5 meV. Three levels can be seen between 2\u20134 meV \u2014\ntwo of the modes soften at the \u0393-point and, separated\nby a small gap, a third mode disperses to higher ener-\ngies \u223c4 meV towards the M points. We identify these\nmodes as three different CF levels.\nAccording to our\nmodel, the mode of lowest energy out of the trio belongs\nto the long-spin site, while the upper two belong to the\nshort-spin site [see Figs. 1(e)\u2013(f)]. The level below 1 meV\noriginates from the splitting of the doublet ground-state\non the long-spin sites. Although the ground state of the\nshort-spin site also splits by \u223c1 meV, this transition has\nvery small scattering intensity and cannot be observed in\nthe data.\nFrom the general softening of the modes near the \u0393\npoint, see Figs. 2(a)\u2013(d), it can be inferred that the dom-\ninant nearest-neighbor in-plane couplings are ferromag-\nnetic, which agrees with the magnetic structure reported\nin Ref. 48 and also with our own neutron diffraction data\n[45].\nA striking additional feature of the in-plane dis-\npersion is a high frequency modulation which has a local\nmaximum at \u0393. The periodicity of the oscillations, which\nis slightly different for the modes originating from each of\nthe Nd sites, implies that higher-neighbor antiferromag-\nnetic interactions are significant. We find that the clos-\nest in-plane antiferromagnetic coupling is between 5th-\nnearest neighbors, i.e. 3rd-nearest neighbors within the\nab plane.\nThe spectra shown in Figs. 2(f)\u2013(j) were simulated\n4\nFIG. 3. Trace of the exchange coupling matrices used to ob-\ntain the spectra in Fig. 2, as a function of nth-neighbor dis-\ntance rn. Colours orange, gray and green correspond to ll, ls\nand ss couplings, respectively. Continuous line is a guide to\nthe eye. The insets show first- and second-nearest neighbour\nexchange pathways, J 1 and J 2. Open white circles indicate\npositions of inversion centres.\nfrom the model that best describes the data, taking\ninto account the six possible superstructure domains [45].\nOverall, the simulations reproduce all the main observed\nfeatures. Fig. 3 plots the exchange parameters from the\nmodel, as represented by the trace of the exchange matri-\nces, as a function of the nth-neighbor distance rn. The ex-\nchange interactions are seen to extend to large distances\nand tend to oscillate with rn. With two distinct Nd sites\nin the superstructure, symmetry-inequivalent exchange\ncouplings cannot be discriminated simply by bond dis-\ntance. Because of that, the J ij in Fig. 3 are labelled\nwith a superscript \u2014 long-long (ll), short-short (ss) or\nlong-short (ls) \u2014 indicating which sites are coupling, and\na subscript n for bonds at a distance rn. These are illus-\ntrated in the inset to Fig. 3 for first and second neighbors.\nA complete symmetry analysis of the couplings and the\nassumptions made in this work can be found in Ref. 45.\nThe markedly different dispersions of the modes is par-\nticularly evident in the out-of-plane direction, shown in\nFig. 2(e). 
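The oscillation in sign of the fitted couplings with rn in Fig. 3 is the qualitative fingerprint of an RKKY-type mechanism. For orientation only (this is the textbook free-electron result, not the model fitted to the data here), the coupling mediated by a free conduction-electron gas between two local moments a distance r apart varies as
\[
J_{\mathrm{RKKY}}(r) \propto \frac{2k_{\mathrm{F}} r \cos(2k_{\mathrm{F}} r) - \sin(2k_{\mathrm{F}} r)}{(2k_{\mathrm{F}} r)^{4}},
\]
which decays slowly and changes sign on a length scale set by the Fermi wavevector k_F, consistent with the sign changes of the couplings plotted in Fig. 3.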
The modes originating from the long-spin sites\n(centred on \u223c1 and 3 meV) are seen to soften by about\n0.4 meV at the \u03931 point, whereas the modes associated\nwith the short-spin sites (3.2 and 3.7 meV) are essentially\ndispersionless.\nThe observation of flat modes suggests\nthat the magnetic coupling between the short-spin sites\nis highly frustrated. Consistent with this, our model finds\nthe nearest-neighbor in-plane and out-of-plane couplings\nbetween short spins to be similar in magnitude but op-\nposite in sign, as can be seen from the points (green sym-\nFIG. 4. Constant-energy slices through data (left) and models\n(middle and right) integrated over the energy interval 3.0 to\n3.3 meV. The 360\u25e6data map was constructed by rotation of\na measured 120\u25e6sector. Middle panel shows simulations us-\ning anisotropic exchange coupling, as in Figs. 2(f)\u2013(j). Right\npanel is for an isotropic exchange model.\nFIG. 5.\nCalculated element-specific density of states close\nto the Fermi energy (E = 0 eV) for Nd2PdSi3 (left) and\nGd2PdSi3 (centre).\nRight:\ntotal density of states.\nThe\nNd2PdSi3 superstructure is assumed for both compounds.\nbols) in Fig. 3 representing J ss\n1 and J ss\n2 . By contrast,\nwe find no significant short-range frustration within the\nNd layers, e.g. the J 5 and J 1 couplings also have oppo-\nsite sign but differ by an order of magnitude.\nOur simplifying assumption of 6-fold rotational sym-\nmetry for HCF,\nEq. (2),\nconstrains the single-ion\nanisotropy to be isotropic in the ab plane. The true Fddd\nsymmetry, however, allows in-plane single-ion anisotropy\nas well as anisotropic J ij couplings along most paths\n[45]. Although not apparent in Fig. 2, the effects of mag-\nnetic anisotropy can be seen in the constant-energy maps\nof Fig. 4, which integrate over the middle of the three\nbands located between 3 and 4 meV. The data (left panel)\ndisplay two-fold symmetry around the reciprocal lattice\npoints, whereas the intensity calculated from a model\nwith isotropic exchange interactions (right panel) has six-\nfold symmetry.\nThe main features of the data are re-\nproduced reasonably well by our model with anisotropic\nexchange interactions (middle panel). The intensity map\nhas six-fold symmetry around the origin because of the\naveraging of the six two-fold symmetric domains.\nWe\nstress that anisotropy in the spectrum could also arise\nfrom higher-order terms in HCF, but the effects of single-\nand two-ion anisotropy are very difficult to separate.\nOur model has a large parameter set, and there may\nexist other sets which give similar agreement. However,\nby exploring the effects of varying the parameters we\n5\nhave found that the general features summarised above\nare robust. Moreover, there are good reasons to expect\nthat our overall findings will extend to Gd2PdSi3. First,\nthe superstructures of the Nd and Gd compounds are es-\nsentially the same (the only difference being a doubling\nalong the c axis in the latter). Second, first-principles\ncalculations that we have performed (see Fig. 
5), as well\nas experimental studies [51], indicate that the electronic\nstructure near the Fermi energy in R2PdSi3 does not vary\nsignificantly with R, implying that the magnetic interac-\ntions will also be relatively insensitive to R.\nConsequently, our comprehensive study of the mag-\nnetic spectrum of single-crystal Nd2PdSi3 has impor-\ntant implications for the formation of a SkX phase in\nGd2PdSi3.\nOur data reveal a critical dependence on\nthe Pd/Si superstructure, such that the pattern of mag-\nnetic interactions has orthorhombic, not hexagonal sym-\nmetry.\nOur results also raise questions about the role\nplayed by geometric frustration in the magnetism dis-\nplayed by this family of compounds. Although there are\nsome frustrated magnetic interactions, especially asso-\nciated with the antiferromagnetic J ss\n2\nnearest-neighbor\ncoupling along the c axis, we do not find evidence for\nstrong short-range frustration within the Nd layers. Our\nanalysis, moreover, has revealed the long-range nature of\nthe magnetic interactions, with non-negligible couplings\nextending up to at least the 26th nearest neighbor and a\ntendency to oscillate with distance. The evidence pre-\nsented here suggests that theories for the SkX phase\nin Gd2PdSi3 must go beyond simple models of short-\nrange frustrated interactions on a triangular lattice, and\npoint instead towards approaches based on longer-range\nRKKY exchange. Such considerations may also apply to\nother metallic centrosymmetric SkX materials.\nACKNOWLEDGMENTS\nThe research at Oxford was supported by UK Research\nand Innovation, grant No. EP/T027991/1, and by the\nOxford\u2013ShanghaiTech collaboration project. The work\nat Warwick was supported by UK Research and Inno-\nvation, grant Nos. EP/T005963/1 and EP/N032128/1.\nData from the neutron experiments are available from\nthe STFC ISIS Neutron and Muon Source, proposal\nNos. RB2010616 (Merlin [52]), RB2000180 (Merlin [53]),\nand RB2310234 (LET [54]), and from the Institut Laue-\nLangevin, proposal No. 5-41-1185 (D3 [55]). Other data\nare available from the authors upon reasonable request.\n\u2217viviane.antonio@stfc.ac.uk\n\u2020 andrew.boothroyd@physics.ox.ac.uk\n[1] S. M\u00a8uhlbauer,\nB. Binz,\nF. Jonietz,\nC. Pfleiderer,\nA. Rosch,\nA. Neubauer,\nR. Georgii, and P. B\u00a8oni,\nSkyrmion lattice in a chiral magnet, Science 323, 915\n(2009).\n[2] C. Pappas, E. Leli`evre-Berna, P. Falus, P. M. Bentley,\nE. Moskvin, S. Grigoriev, P. Fouquet, and B. Farago,\nChiral paramagnetic skyrmion-like phase in MnSi, Phys.\nRev. Lett. 102, 197202 (2009).\n[3] X. Z. Yu, Y. Onose, N. Kanazawa, J. H. Park, J. H. Han,\nY. Matsui, N. Nagaosa, and Y. Tokura, Real-space ob-\nservation of a two-dimensional skyrmion crystal, Nature\n465, 901 (2010).\n[4] A. Fert, V. Cros, and J. Sampaio, Skyrmions on the track,\nNat. Nanotechnol. 8, 152 (2013).\n[5] T. Lancaster, Skyrmions in magnetic materials, Con-\ntemp. Phys. 60, 246 (2019).\n[6] A. N. Bogdanov and C. Panagopoulos, Physical founda-\ntions and basic properties of magnetic skyrmions, Nat.\nRev. Phys. 2, 492 (2020).\n[7] L. Thomas, F. Lionti, R. Ballou, D. Gatteschi, R. Ses-\nsoli, and B. Barbara, Macroscopic quantum tunnelling of\nmagnetization in a single crystal of nanomagnets, Nature\n383, 145 (1996).\n[8] R. Wiesendanger, Nanoscale magnetic skyrmions in\nmetallic films and multilayers: a new twist for spintron-\nics, Nat. Rev. Mater. 1, 16044 (2016).\n[9] C. Moreau-Luchaire, C. Moutafis, N. Reyren, J. Sampaio,\nC. A. F. Vaz, N. Van Horne, K. 
Bouzehouane, K. Garcia,\nC. Deranlot, P. Warnicke, P. Wohlh\u00a8uter, J. M. George,\nM. Weigand, J. Raabe, V. Cros, and A. Fert, Additive\ninterfacial chiral interaction in multilayers for stabiliza-\ntion of small individual skyrmions at room temperature,\nNat. Nanotechnol. 11, 444 (2016).\n[10] C. Psaroudaki and C. Panagopoulos, Skyrmion qubits: A\nnew class of quantum logic elements based on nanoscale\nmagnetization, Phys. Rev. Lett. 127, 067201 (2021).\n[11] J. Xia, X. Zhang, X. Liu, Y. Zhou, and M. Ezawa, Univer-\nsal quantum computation based on nanoscale skyrmion\nhelicity qubits in frustrated magnets, Phys. Rev. Lett.\n130, 106701 (2023).\n[12] T. Okubo, S. Chung, and H. Kawamura, Multiple-q\nstates and the skyrmion lattice of the triangular-lattice\nheisenberg antiferromagnet under magnetic fields, Phys.\nRev. Lett. 108, 017206 (2012).\n[13] A. O. Leonov and M. Mostovoy, Multiply periodic states\nand isolated skyrmions in an anisotropic frustrated mag-\nnet, Nat. Commun. 6, 8275 (2015).\n[14] V. Lohani, C. Hickey, J. Masell, and A. Rosch, Quantum\nskyrmions in frustrated ferromagnets, Phys. Rev. X 9,\n041063 (2019).\n[15] Z. Wang, Y. Su, S.-Z. Lin, and C. D. Batista, Skyrmion\ncrystal from RKKY interaction mediated by 2D electron\ngas, Phys. Rev. Lett. 124, 207201 (2020).\n[16] Z. Wang, Y. Su, S.-Z. Lin, and C. D. Batista, Meron,\nskyrmion, and vortex crystals in centrosymmetric tetrag-\nonal magnets, Phys. Rev. B 103, 104408 (2021).\n[17] S. Hayami, R. Ozawa, and Y. Motome, Effective bilinear-\nbiquadratic model for noncoplanar ordering in itinerant\nmagnets, Phys. Rev. B 95, 224424 (2017).\n[18] T. Nomoto, T. Koretsune, and R. Arita, Formation mech-\nanism of the helical Q structure in Gd-based skyrmion\nmaterials, Phys. Rev. Lett. 125, 117204 (2020).\n[19] S.-Z. Lin and S. Hayami, Ginzburg\u2013Landau theory for\nskyrmions in inversion-symmetric magnets with compet-\ning interactions, Phys. Rev. B 93, 064430 (2016).\n6\n[20] S. Hayami and Y. Motome, Square skyrmion crystal in\ncentrosymmetric itinerant magnets, Phys. Rev. B 103,\n024439 (2021).\n[21] K. Mitsumoto and H. Kawamura, Replica symmetry\nbreaking in the RKKY skyrmion-crystal system, Phys.\nRev. B 104, 184432 (2021).\n[22] J. Bouaziz, E. Mendive-Tapia, S. Bl\u00a8ugel, and J. B.\nStaunton, Fermi-surface origin of skyrmion lattices in\ncentrosymmetric rare-earth intermetallics, Phys. Rev.\nLett. 128, 157206 (2022).\n[23] Y. Dong, Y. Arai, K. Kuroda, M. Ochi, N. Tanaka,\nY. Wan,\nM. D. Watson,\nT. K. Kim,\nC. Cacho,\nM. Hashimoto, D. Lu, Y. Aoki, T. D. Matsuda, and\nT. Kondo, Fermi surface nesting driving the RKKY\ninteraction in the centrosymmetric skyrmion magnet\nGd2PdSi3, Phys. Rev. Lett. 133, 016401 (2024).\n[24] T.\nKurumaji,\nT.\nNakajima,\nM.\nHirschberger,\nA. Kikkawa, Y. Yamasaki, H. Sagayama, H. Nakao,\nY. Taguchi, T. Arima, and Y. Tokura, Skyrmion lattice\nwith a giant topological hall effect in a frustrated\ntriangular-lattice magnet, Science 365, 914 (2019).\n[25] M.\nHirschberger,\nT.\nNakajima,\nS.\nGao,\nL.\nPeng,\nA. Kikkawa, T. Kurumaji, M. Kriener, Y. Yamasaki,\nH.\nSagayama,\nH.\nNakao,\nK.\nOhishi,\nK.\nKakurai,\nY. Taguchi,\nX. Yu,\nT.-h. Arima, and Y. Tokura,\nSkyrmion phase and competing magnetic orders on\na breathing kagom\u00b4e lattice, Nat. Commun. 10, 5831\n(2019).\n[26] N. D. Khanh, T. Nakajima, X. Yu, S. Gao, K. Shi-\nbata, M. Hirschberger, Y. Yamasaki, H. Sagayama,\nH. Nakao, L. Peng, K. Nakajima, R. Takagi, T.-h. Arima,\nY. Tokura, and S. 
Seki, Nanometric square skyrmion lat-\ntice in a centrosymmetric tetragonal magnet, Nat. Nan-\notechnol. 15, 444 (2020).\n[27] R. Takagi, N. Matsuyama, V. Ukleev, L. Yu, J. S.\nWhite, S. Francoual, J. R. L. Mardegan, S. Hayami,\nH. Saito, K. Kaneko, K. Ohishi, Y. \u00afOnuki, A. Taka-Hisa,\nY. Tokura, T. Nakajima, and S. Seki, Square and rhom-\nbic lattices of magnetic skyrmions in a centrosymmetric\nbinary compound, Nat. Commun. 13, 1472 (2022).\n[28] S. Gao,\nH. D. Rosales,\nF. A. G\u00b4omez Albarrac\u00b4\u0131n,\nV. Tsurkan, G. Kaur, T. Fennell, P. Steffens, M. Boehm,\nP. \u02c7Cerm\u00b4ak, A. Schneidewind, E. Ressouche, D. C. Cabra,\nC. R\u00a8uegg, and O. Zaharko, Fractional antiferromagnetic\nskyrmion lattice induced by anisotropic couplings, Na-\nture 586, 37 (2020).\n[29] R. Mallik, E. V. Sampathkumaran, M. Strecker, and\nG.\nWortmann,\nObservation\nof\na\nminimum\nin\nthe\ntemperature-dependent electrical resistance above the\nmagnetic-ordering temperature in Gd2PdSi3, Europhys.\nLett. 41, 315 (1998).\n[30] P. Kotsanidis, J. Yakinthos, and E. Gamari-Seale, Mag-\nnetic properties of the ternary rare earth silicides\nR2PdSi3 (R = Pr, Nd, Gd, Tb, Dy, Ho, Er, Tm and\nY), J. Magn. Magn. Mater. 87, 199 (1990).\n[31] R.\nMallik,\nE.\nSampathkumaran,\nand\nP.\nPaulose,\nLarge low temperature magnetoresistance and magnetic\nanomalies in Tb2PdSi3 and Dy2PdSi3, Solid State Com-\nmun. 106, 169 (1998).\n[32] S. R. Saha, H. Sugawara, T. D. Matsuda, Y. Aoki,\nH. Sato, and E. V. Sampathkumaran, Magnetic, ther-\nmal, and transport properties of single crystals of anti-\nferromagnetic kondo-lattice Ce2PdSi3, Phys. Rev. B 62,\n425 (2000).\n[33] A. Szytu la, M. Hofmann, B. Penc, M. \u00b4Slaski, S. Majum-\ndar, E. Sampathkumaran, and A. Zygmunt, Magnetic\nbehaviour of R2PdSi3 compounds with R = Ce, Nd, Tb\u2013\nEr, J. Magn. Magn. Mater. 202, 365 (1999).\n[34] M. Frontzek, A. Kreyssig, M. Doerr, A. Schneidewind,\nJ.-U. Hoffmann, and M. Loewenhaupt, Frustration in\nR2PdSi3 (R = Tb, Er) compounds: spin-glass or mag-\nnetic short range order? Neutron diffraction studies, J.\nPhys.: Condens. Matter 19, 145276 (2007).\n[35] M. Frontzek, A. Kreyssig, M. Doerr, M. Rotter, G. Behr,\nW. L\u00a8oser, I. Mazilu, and M. Loewenhaupt, Magneto-\ncrystalline anisotropy in R2PdSi3 (R = Tb, Dy, Ho, Er,\nTm) single crystals, J. Magn. Magn. Mater. 301, 398\n(2006).\n[36] D. X. Li, S. Nimori, Y. Shiokawa, Y. Haga, E. Yamamoto,\nand Y. Onuki, ac susceptibility and magnetic relaxation\nof R2PdSi3 (R = Nd, Tb, and Dy), Phys. Rev. B 68,\n012413 (2003).\n[37] S. Majumdar, E. V. Sampathkumaran, P. L. Paulose,\nH. Bitterlich, W. L\u00a8oser, and G. Behr, Anisotropic giant\nmagnetoresistance, magnetocaloric effect, and magnetic\nanomalies in single crystalline Tb2PdSi3, Phys. Rev. B\n62, 14207 (2000).\n[38] S. Majumdar, H. Bitterlich, G. Behr, W. L\u00a8oser, P. L.\nPaulose, and E. Sampathkumaran, Magnetic and trans-\nport behavior of single-crystalline Dy2PdSi3, Phys. Rev.\nB 64, 012418 (2001).\n[39] P. L. Paulose, E. V. Sampathkumaran, H. Bitterlich,\nG. Behr, and W. L\u00a8oser, Anisotropic spin-glass-like and\nquasi-one-dimensional magnetic behavior in the inter-\nmetallic compound Tb2PdSi3, Phys. Rev. B 67, 212401\n(2003).\n[40] S. R. Saha, H. Sugawara, T. D. Matsuda, H. Sato,\nR.\nMallik,\nand\nE.\nV.\nSampathkumaran,\nMagnetic\nanisotropy,\nfirst-order-like\nmetamagnetic\ntransitions,\nand large negative magnetoresistance in single-crystal\nGd2PdSi3, Phys. Rev. B 60, 12162 (1999).\n[41] M. Gomil\u02c7sek, T. J. 
Hicken, M. N. Wilson, K. J. A.\nFranke, B. M. Huddart, A. \u02c7Stefan\u02c7ci\u02c7c, S. J. R. Holt,\nG. Balakrishnan, D. A. Mayoh, M. T. Birch, S. H. Moody,\nH. Luetkens, Z. Guguchia, M. T. F. Telling, P. J. Baker,\nS. J. Clark, and T. Lancaster, Anisotropic skyrmion\nand multi-q spin dynamics in centrosymmetric gd2pdsi3,\nPhys. Rev. Lett. 134, 046702 (2025).\n[42] J. A. M. Paddison, B. K. Rai, A. F. May, S. Calder,\nM. B. Stone, M. D. Frontzek, and A. D. Christianson,\nMagnetic interactions of the centrosymmetric skyrmion\nmaterial Gd2PdSi3, Phys. Rev. Lett. 129, 137202 (2022).\n[43] J.\nJu,\nH.\nSaito,\nT.\nKurumaji,\nM.\nHirschberger,\nA. Kikkawa, Y. Taguchi, T. Arima, Y. Tokura, and\nT. Nakajima, Polarized neutron scattering study of\nthe centrosymmetric skyrmion host material Gd2PdSi3,\nPhys. Rev. B 107, 024405 (2023).\n[44] F. Tang, M. Frontzek, J. Dshemuchadse, T. Leisegang,\nM. Zschornak, R. Mietrach, J.-U. Hoffmann, W. L\u00a8oser,\nS. Gemming, D. C. Meyer, and M. Loewenhaupt, Crys-\ntallographic superstructure in R2PdSi3 compounds (R =\nheavy rare earth), Phys. Rev. B 84, 104105 (2011).\n[45] See Supplemental Material at xxx for details of the sam-\nple characterisation, x-ray and neutron diffraction data\nof the Pd/Si superstructure and magnetic structure, data\nanalysis, model, ab initio electronic structure calcula-\ntions, and for additional figures showing fits to the data.\n7\n[46] Y. Xu, W. L\u00a8oser, F. Tang, C. G. F. Blum, L. Liu, and\nB. B\u00a8uchner, Crystal growth of the intermetallic com-\npound Nd2PdSi3, Cryst. Res. Technol. 46, 135 (2011).\n[47] K. Mukherjee, T. Basu, K. K. Iyer, and E. V. Sam-\npathkumaran, 4f hybridization effect on the magnetism\nof Nd2PdSi3, Phys. Rev. B 84, 184415 (2011).\n[48] M. Smidman, C. Ritter, D. T. Adroja, S. Rayaprol,\nT. Basu, E. V. Sampathkumaran, and A. D. Hillier,\nMagnetic order in Nd2PdSi3 investigated using neutron\nscattering and muon spin relaxation, Phys. Rev. B 100,\n134423 (2019).\n[49] J. Jensen and A. R. Mackintosh, Rare Earth Magnetism\n(Clarendon Press, Oxford, 1991).\n[50] A. T. Boothroyd, Principles of Neutron Scattering from\nCondensed Matter (Oxford University Press, Oxford,\nUK, 2020).\n[51] D.\nS.\nInosov,\nD.\nV.\nEvtushinsky,\nA.\nKoitzsch,\nV. B. Zabolotnyy, S. V. Borisenko, A. A. Kordyuk,\nM. Frontzek, M. Loewenhaupt, W. L\u00a8oser, I. Mazilu,\nH. Bitterlich, G. Behr, J.-U. Hoffmann, R. Follath, and\nB. B\u00a8uchner, Electronic structure and nesting-driven en-\nhancement of the RKKY interaction at the magnetic or-\ndering propagation vector in Gd2PdSi3 and Tb2PdSi3,\nPhys. Rev. Lett. 102, 046401 (2009).\n[52] M. Smidman,\nG. Balakrishnan,\nD. Le,\nD. Mayoh,\nE.\nV.\nSampathkumaran,\nD.\nT.\nAdroja,\nZ.\nShan,\nand\nV.\nPe\u00b8canha-Antonio,\nSpin-waves\nin\nNd2PdSi3\nwith coexistent ferromagnetism and antiferromagnetism\n(2020), STFC ISIS Neutron and Muon Source, DOI:\n10.5286/ISIS.E.RB2010616-1.\n[53] M.\nSmidman,\nD.\nLe,\nD.\nMayoh,\nD.\nT.\nAdroja,\nG. Balakrishnan, and Z. Shan, Spin-waves in Nd2PdSi3\nwith coexistent ferromagnetism and antiferromagnetism\n(2020), STFC ISIS Neutron and Muon Source, DOI:\n10.5286/ISIS.E.RB2000180.\n[54] V. Pe\u00b8canha-Antonio, C. Balz, D. T. Adroja, Z. Shan,\nM.\nSmidman,\nand\nA.\nT.\nBoothroyd,\nMagnetic\nexcitons\nin\nthe\nfrustrated\ntopological\nNd2PdSi3\n(2023), STFC ISIS Neutron and Muon Source, DOI:\n10.5286/ISIS.E.RB2310234-1.\n[55] V. Pe\u00b8canha-Antonio, D. T. Adroja, N. A. Katcho, A. T.\nBoothroyd, B. Ouladdiaf, and M. 
Smidman, Magnetic\nstructure of the frustrated, triangular-lattice metallic\nmagnet Nd2PdSi3 (2023), institut Laue-Langevin (ILL)\nDOI: 10.5291/ILL-DATA.5-41-1185.\nSupplemental Material for \u201cLong-range magnetic interactions in Nd2PdSi3 and the\nformation of skyrmion phases in centrosymmetric metals\u201d\nViviane Pe\u00b8canha-Antonio,1, 2, \u2217Zhaoyang Shan,3 Michael Smidman,3 Juba Bouaziz,4 Bachir Ouladdiaf,5\nIurii Kibalin,5 Marie Helene Lemee,5 Christian Balz,2 Jakob Lass,6 Daniel A. Mayoh,7\nGeetha Balakrishnan,7 Julie B. Staunton,7 Davashibhai Adroja,2, 8 and Andrew T. Boothroyd1, \u2020\n1Department of Physics, University of Oxford, Clarendon Laboratory, Oxford OX1 3PU, United Kingdom\n2ISIS Neutron and Muon Source, STFC Rutherford Appleton Laboratory, Didcot OX11 0QX, United Kingdom\n3Center for Correlated Matter and School of Physics, Zhejiang University, Hangzhou 310058, China\n4Department of Physics, The University of Tokyo, Bunkyo-ku, Tokyo, 113-0033, Japan\n5Institut Laue-Langevin, 6 Rue Jules Horowitz, BP 156, 38042 Grenoble Cedx 9, France\n6PSI Center for Neutron and Muon Sciences, 5232 Villigen PSI, Switzerland\n7Department of Physics, University of Warwick, Coventry CV4 7AL, United Kingdom\n8Highly Correlated Matter Research Group, Physics Department,\nUniversity of Johannesburg, Auckland Park 2006, South Africa\n(Dated: April 15, 2025)\nI.\nEXPERIMENTAL METHODS\nA single crystal of Nd2PdSi3 was grown by the op-\ntical floating-zone method, see Ref. 1 for details. The\ngrown crystal was in the form of a rod of approximate\ndimensions 5 mm (diameter) \u00d7 50 mm (length). Parts of\nthe same rod were used for all the measurements in this\nwork. X-ray single-crystal diffraction was carried out at\nroom temperature on a SuperNova Agilent diffractometer\nequipped with a Mo source.\nMagnetic susceptibility measurements were performed\nwith a Quantum Design MPMS3 SQUID-VSM magne-\ntometer. The same piece of crystal was used for the neu-\ntron diffraction and magnetometry measurements. Fig. 1\nshows susceptibility data for a magnetic field of 100 Oe\napplied along the crystallographic c axis. The onset of\nFIG. 1. Magnetic susceptibility measured on a single crystal\nof Nd2PdSi3 with H \u2225c. The sharp increase in susceptibility\naround 14 K marks the onset of ferromagnetism.\n\u2217viviane.antonio@stfc.ac.uk\n\u2020 andrew.boothroyd@physics.ox.ac.uk\nspontaneous magnetisation in the sample takes place at\nTC \u224314 K. This temperature is about 1.7 K lower than\nthat reported for powder samples in Refs. 2 and 3.\nNeutron diffraction measurements were performed on\nthe four-circle diffractometer D10+ at the Institut Laue\u2013\nLangevin (ILL). A single crystal of dimensions 2 mm \u00d7\n1 mm \u00d7 2 mm was cut from the main rod and fixed on a\nstandard aluminium mount which was installed in a four-\ncircle cryostat. Neutrons of wavelengths \u03bb = 1.26 \u02daA and\n2.36 \u02daA were selected by Bragg diffraction from a Cu\nmonochromator and a pyrolytic graphite monochroma-\ntor, respectively. A graphite filter was used to suppress\nhigher order harmonic wavelengths in the incident beam.\nThe scattered neutrons were recorded on a 94 \u00d7 94 mm2\narea detector. In the paramagnetic phase, a total of 740\ninequivalent reflections were recorded using \u03bb = 1.26 \u02daA at\na temperature of 150 K. 
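For orientation, the quoted monochromator wavelengths can be converted to incident neutron energies with the standard non-relativistic relation E (meV) ≈ 81.81 / (λ/Å)². The short sketch below is an editorial illustration using that textbook constant; it is not a value or script taken from this work.

```python
# Convert neutron wavelength (angstrom) to kinetic energy (meV) via E = h^2 / (2 m_n lambda^2).
H2_OVER_2M = 81.81  # meV * angstrom^2, standard value of h^2 / (2 m_n)

def wavelength_to_energy_meV(lam: float) -> float:
    """Kinetic energy (meV) of a neutron with wavelength lam given in angstrom."""
    return H2_OVER_2M / lam**2

for lam in (1.26, 2.36):  # the two monochromator wavelengths quoted above
    print(f"lambda = {lam:.2f} A  ->  E ~ {wavelength_to_energy_meV(lam):.1f} meV")
```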
Using \u03bb = 2.36 \u02daA, 111 inequiv-\nalent reflections were measured at 20 K, just above the\nphase transition temperature, and 94 at 2 K.\nFor the inelastic neutron scattering (INS) experiments,\na sample of total mass around 5 g was fixed in an alu-\nminium plate with the hexagonal [001] axis vertical.\nTime-of-flight INS was performed on the MERLIN and\nLET chopper spectrometers [4, 5] at the ISIS Facility.\nOn MERLIN, the experiment was divided in two parts\nmaking use of two different sample environments. In the\nfirst part, the sample was loaded in a closed-cycle refrig-\nerator (CCR) and data were recorded at 7.7 K and 30 K.\nIn the second, the sample was loaded in a helium cryo-\nstat and data were collected at 1.8 K. For both parts, the\ncrystal was rotated through an angle of 125\u25e6in 1\u25e6steps\naround the vertical axis.\nThe instrument chopper fre-\nquency was set at 250 Hz and operated in repetition-rate\nmultiplication (RRM) mode, so that angular scans with\nincident energies Ei of 10, 20 and 53 meV were performed\nsimultaneously.\nOn LET [5], the single crystal was fixed in an alu-\nminium mount and loaded in a helium cryostat. During\ndata collection, at a temperature of 1.8 K, the sample\nwas rotated through an angle of 120\u25e6in 1\u25e6steps around\narXiv:2504.10075v1 [cond-mat.str-el] 14 Apr 2025\n2\nFIG. 2. X-ray diffraction in the (h0l), (0hl) and (hhl) recip-\nrocal lattice planes, measured at room temperature.\nthe vertical axis. RRM enabled the simultaneous mea-\nsurement of Ei = 6 and 15.6 meV, among other incident\nenergies with less flux. For the former energy transfer,\nthe FWHM of the energy resolution at the elastic line is\n0.16 meV, down to about 0.06 meV at 5 meV.\nInelastic neutron scattering data in magnetic field were\ncollected on the cold neutron multiplexing spectrometer\nCAMEA [6] at the Swiss Spallation Neutron Source at\nthe Paul Scherrer Institut (PSI). For all magnetic fields,\nmeasurements were performed with Ei = 7 and 8 meV\n(+0.13 meV for interlacing) and with the centre of the\ndetector tank at angles of 2\u03b8 = \u221244\u25e6and \u221248\u25e6. In each\ncase, the sample was rotated through an angle of 141\u25e6in\n1\u25e6steps. For 0, 4, and 7 T also the elastic line was mea-\nsured, i.e. Ei = 5 and 5.13 meV at 2\u03b8 = \u221251\u25e6and \u221256\u25e6.\nIn addition, at 0 and 7 T the spectrum was measured up\nto 11 meV with Ei = 10, 10.13, 12, 12.13, 14, 14.13 and\n2\u03b8 = \u221238\u25e6and \u221242\u25e6, covering 40\u25e6of sample rotation in\nsteps of 1\u25e6. All points were measured with a monitor of\n250 000 corresponding to \u223c1 minute at Ei = 5 meV and\n2 minutes at Ei = 14 meV, except for the 0 T dataset,\nwhere a 190 000 monitor was used. Data were analysed\nusing the MJOLNIR software package [7].\nII.\nCRYSTALLOGRAPHIC SUPERSTRUCTURE\nAND STATIC MAGNETISM\nThe superstructure of the heavy rare-earth R2PdSi3\nseries (R = Gd, Tb, Dy, Ho, Er, Tm) was thoroughly\nstudied in the work of Tang et al. [8]. In those systems,\nthe unit cell is doubled along the hexagonal a and b di-\nrections and octupled along c. Our single-crystal x-ray\ndiffraction measurements, represented in Fig. 2, demon-\nstrate that in Nd2PdSi3 the superstructure unit cell is\nenlarged by a factor 2a \u00d7 2a \u00d7 4c instead. Peaks are ob-\nserved in the (hhl) reciprocal lattice plane at positions\nwith h = (2n + 1)/2 and l = m/4, l \u0338= integer, where the\nindices h, k and l refer to the parent hexagonal lattice\nand n, m are integers. 
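To make the stated reflection condition explicit, the following sketch (an illustrative helper, not taken from the paper) classifies (h, h, l) positions according to the rule just quoted: h must be a half-odd-integer and l a multiple of 1/4 that is not an integer.

```python
from fractions import Fraction

def is_superstructure_peak(h: Fraction, l: Fraction) -> bool:
    """Reflection condition quoted above for the (hhl) plane of Nd2PdSi3:
    h = (2n + 1)/2 and l = m/4 with l not an integer (parent hexagonal indices)."""
    half_odd_h = (2 * h).denominator == 1 and (2 * h).numerator % 2 == 1
    quarter_l = (4 * l).denominator == 1 and l.denominator != 1
    return half_odd_h and quarter_l

# Illustrative positions only
for h, l in [(Fraction(1, 2), Fraction(1, 4)),   # allowed superstructure peak
             (Fraction(1, 2), Fraction(1)),      # l integer -> excluded
             (Fraction(1), Fraction(3, 4))]:     # h integer -> not of this type
    print(f"(h, h, l) = ({h}, {h}, {l}) -> superstructure: {is_superstructure_peak(h, l)}")
```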
The same conditions on h and l\napply for the (h0l) and (0hl) reciprocal lattice planes.\nThe superstructure proposed in Ref. 8 is formed by\nthe stacking of four layers in which the Pd atoms have\ndistinct positions. These layers are labeled A, B, C and\nD, and are shown in Fig. 3. In order to explain the ex-\ntinction rules observed in their diffraction data, one of\nthe assumptions made by the authors of Ref. 8 is that no\ntwo adjacent layers along the c axis are of the same type.\nGiven that the periodicity of the unit cell in Nd2PdSi3 is\nvertically increased by a factor of 4, the minimal unit cell\nwhich can be hypothesised in our case contains a vertical\nstacking of all the four layers shown in Fig. 3.\nFollowing symmetry considerations, it can be shown\nthat\nsix\ndifferent\ntwin\ndomains\ncontribute\nto\nthe\nmeasured\nstructure\nfactors.\nThese\ndomains\ncan\nbe\nfound\nby\napplying\nthe\nsymmetry\noperations\n{1, 3\u2212\n001, 3+\n001, \u00af1, \u00af3\u2212\n001, \u00af3+\n001 of the P6/mmm space group to\nthe stacking ABCD, which we refer to as Domain 1. The\nparticular layer stacking of each domain is given in Ta-\nble I. In Fig. 3, we reproduce the layers A, B, C and D\nin the orthorhombic supercell, and show the relation be-\ntween these lattice basis vectors with the hexagonal cell.\nTransformation matrices Tm relating the coordinates of\ndomains m = 2, 3, ..., 6 with domain 1 in the Fddd space\ngroup are given below. The notation used combines a\n3\u00d73 rotation matrix with a translation vector in the last\ncolumn.\nT2 =\n\uf8eb\n\uf8ed\n\u22121/2\n\u22121/2\n0\n1/2\n3/2\n\u22121/2\n0\n3/4\n0\n0\n1\n0\n\uf8f6\n\uf8f8.\n(1)\nT3 =\n\uf8eb\n\uf8ed\n\u22121/2\n1/2\n0\n\u22121/8\n\u22123/2\n\u22121/2\n0\n9/8\n0\n0\n1\n0\n\uf8f6\n\uf8f8.\n(2)\nT4 =\n\uf8eb\n\uf8ed\n\u22121\n0\n0\n1/4\n0\n\u22121\n0\n5/4\n0\n0\n\u22121\n0\n\uf8f6\n\uf8f8.\n(3)\nT5 =\n\uf8eb\n\uf8ed\n1/2\n1/2\n0\n\u22121/4\n\u22123/2\n1/2\n0\n1/2\n0\n0\n\u22121\n0\n\uf8f6\n\uf8f8.\n(4)\nT6 =\n\uf8eb\n\uf8ed\n1/2\n\u22121/2\n0\n3/8\n3/2\n1/2\n0\n1/8\n0\n0\n\u22121\n0\n\uf8f6\n\uf8f8.\n(5)\nA.\nStructure refinement\nThe x-ray data measured at room temperature, and\nthe neutron diffraction data measured in the paramag-\nnetic phase at 150 K, were refined in the Fddd group\ndescription using FullProf [9].\nThe model assumes an\nequal population of the six domains. Atomic positions of\nthe general site (32h) were not refined. In the recipro-\ncal space maps shown in Fig. 2, a weak diffuse scattering\n3\ndomain\nStacking\n1\nABCD\n2\nADBC\n3\nACDB\n4\nDCBA\n5\nCBDA\n6\nBDCA\nTABLE I. The layer stacking of the six domains obtained by\napplying the symmetry operations {1, 3\u2212\n001, 3+\n001, \u00af1, \u00af3\u2212\n001, \u00af3+\n001\nof the P6/mmm space group to the ABCD sequence of planes.\nAn alternative way of generating the six domains is consid-\nering all the possible ways of stacking vertically A, B, C and\nD, requiring that no adjacent layer is the same and neglecting\ncyclic plane permutations, which are necessarily equivalent.\nFIG. 3. Layers of ordered Pd and Si atoms in the superlattice\nab plane of Nd2PdSi3. The four layers, labeled A, B, C and\nD, are stacked along the c axis according to the sequences\nlisted in Table I to form the different superstructure domains.\nDashed and continuous black lines indicate the orthorhombic\nand hexagonal superstructure unit cells, respectively.\nThe\ncorresponding cell vectors are ao, bo and a, b. 
The unit cell\nof the parent P6/mmm space group is shown in red.\ncan be observed along the (00l) direction, evidencing that\nstacking faults occur. In order to (partially) account for\nthat in the refinement of the peak intensities, the aver-\nage occupation of Pd and Si sites was initially allowed to\nvary equally for all the domains, while the overall stoi-\nchiometry, i.e. three atoms of Si for each Pd, was kept\nconstrained. Isotropic displacement parameters were re-\nfined at a later stage, during which the occupancies were\nkept constant at a value which gave the best agreement\nwith the measured patterns.\nA summary of the x-ray diffraction refinement results\nfor domain 1 is presented in Table II. A quantitative com-\nparison between calculated and measured structure fac-\ntors is shown in Figs. 3(a) and 3(b) for x-rays and neu-\ntrons, respectively.\nB.\nMagnetic structure refinement\nIt was found in Ref. 3 that, in the magnetic long-range\nordered phase, magnetic scattering is detected in integer-\nand non-integer-indexed Bragg peaks, and the tempera-\nture dependence of these two sets is different.\nAs ex-\npected, a similar behaviour is observed in our single-\ncrystal neutron diffraction results, as may be seen in\nFIG. 4.\nRefinement results of the single crystal diffraction\nmeasured with (a) x-rays at room temperature, and (b) neu-\ntrons on D10+ at 150 K.\nFIG. 5. Temperature dependence of Bragg-peak intensities\nfor parent structure and superstructure reflections. For clar-\nity, the indexing follows the hexagonal P6/mmm space group\ndescription. Red lines are a guide to the eye.\nFig. 5.\nIn order to refine the magnetic structure at the low-\nest measured temperature we subtracted data collected\nat 20 K from those collected at 1.7 K. Our refinement\ndemonstrates that the measured intensities, shown in\nFig. 6, may be described by a magnetic unit cell con-\ntaining two magnetic moments of distinct magnitudes,\none \u201cshort\u201d and one \u201clong\u201d, both oriented ferromagnet-\nically along c. The temperature dependence of the su-\nperstructure peaks, shown in Fig. 5, occurs because the\nratio of the magnitudes of these two moments changes as\nthe temperature is lowered below TC. The integer-index\npeaks depend on the sum of the two moments, whereas\nthe non-integer peaks depend on the difference. Accord-\ningly, the ratio is largest at T \u22438 K where the intensity\nof the non-integer peaks is a maximum. At lower temper-\natures, the intensity of the non-integer peaks decreases\nas the moments become more similar in magnitude.\nThe refined magnetic structure using our model is rep-\nresented in Fig. 7 for each domain. We note that, from\nthe diffraction experiment alone, it is impossible to know\nif the magnetic moments of the four-fold Pd-coordinated\natoms are long and those of the two-fold Pd-coordinated\natoms are short, or vice-versa. 
The long and short or-\ndered moments at 1.7 K are found to be 2.3(3)\u00b5B and\n1.3(3)\u00b5B, respectively.\nThe magnetic structure proposed here differs from that\n4\nLattice parameters\nao = 14.2402 \u02daA\nbo = 8.2216 \u02daA\nco = 16.8595 \u02daA\nLattice angles\n\u03b1 = 90\u25e6\n\u03b2 = 90\u25e6\n\u03b3 = 90\u25e6\nAtom\nAtomic position\nOccupancy\nIsotropic displacement\n(Wyckoff site)\nparameters (\u02daA2)\nNd\n1/8,5/8,0 (16g)\n0.5\n0.52(3)\nNd\n1/8,5/8,1/2 (16g)\n0.5\n0.52(3)\nPd/Si\n7/24,1/8,1/8 (16e)\n0.60(1)/0.4(1)\n0.64(4)\nSi/Pd\n11/24,1/8,1/8 (16e)\n0.866(1)/0.134(1)\n0.64(4)\nSi/Pd\n13/24,7/8,1/8 (32h)\n0.867(1)/0.133(1)\n0.64(4)\nTABLE II. Summary of the structural model assumed for the superstructure of Nd2PdSi3 in the space group Fddd. The atomic\npositions in the other five domains can be obtained with the transformation matrices Tm defined in Eqs. (1)-(5) above.\nFIG. 6.\nMagnetic scattering measured on D10+.\nThe in-\nset shows the refinement of the weak reflections, associated\nmostly with the superstructure peaks.\nof Ref. 3. First, we do not find evidence for an incommen-\nsurate magnetic propagation vector k, observed in their\npowder sample. Second, in our model the two different\nvalues of the Nd ordered moment correlate directly with\nthe two different crystal-field environments caused by the\nPd/Si superstructure, which is not the case for the model\nof Ref. 3 as it assumes no superstructure. In fact, both\nmodels give the same diffraction intensities after domain\naveraging, but our model has a more physical basis with\nrespect to the full crystallographic structure determined\nhere.\nIII.\nCRYSTAL FIELD PARAMETERS\nThe Bk\nq parameters in our crystal-field model (see\nEq. (2) of the main text) for the long- and short-spin\nNd sites are listed in Table III. Figures 1(e) and (f) in\nthe main text illustrate the energy-level splitting of the\n4f single-ion states of Nd produced by these two sets of\nparameters.\n(a) Domain 1\n(b) Domain 2\n(c) Domain 3\n(d) Domain 4\n(e) Domain 5\n(f) Domain 6\nFIG. 7.\n(a)-(f) Magnetic structure of the six domains of\nNd2PdSi3. In our unit-cell origin choice, orange atoms have\nlong, while green have short magnetic moments.\nRefinement of the CF parameters was informed by INS\nmeasurements of the response of the magnetic modes to\nan applied magnetic field. Fig. 8 shows constant-Q cuts\nintegrated around (\u22121/2, 1/2, 0) and (\u22121, 1, 0) r.l.u. for\nseveral different magnetic fields up to 11 T applied paral-\nlel to c. A shift of the CF levels towards higher energies\noccurs with increased field, and the rate of increase is\ndifferent for each level. The largest energy shift observed\nis on the lowest mode, which results from the Zeeman\nsplitting of the ground-state doublet of the long-spin site.\nThe middle and bottom panels show the relatively weak\nfield dependence of the three modes between 3\u20134 meV,\nand the other levels measured above 5 meV, respectively.\nIn the latter energy range, the strongest observed field\ndependence is on the levels with energies above 6.8 meV,\n5\nParameter\nLong spin (meV)\nShort spin (meV)\nB2\n0\n15\n33\nB4\n0\n-23\n-33\nB6\n0\n28\n31\nB6\n6\n-1\n-12\nTABLE III. 
Fitted crystal field parameters for the two Nd\nsites in Nd2PdSi3.\nParameters are given in the Wybourne\nconvention.\nwhich appear displaced by about 1 meV for a 7 T mag-\nnetic field.\nThese data were included in our crystal-field analysis\nby the addition of a Zeeman splitting term\nHz = \u2212\u00b5BgJJi \u00b7 H,\n(6)\nwhere \u00b5B is the Bohr magneton, gJ = 8/11 is the Land\u00b4e\ng-factor and H is the magnetic field, to the Hamiltonian\nEq. (1) in the main text. Gaussians were fitted to the\nwhole dataset, including those shown in Fig. 8, and the\nmeasured splittings were used as fitting constraints to the\nCF model.\nIV.\nEXCHANGE COUPLING\nFig. 9 shows, for domain 1 [Fig. 7(a)], the position\nof each nth-nearest neighbor of a long-spin Nd. In this\nwork, we found that couplings up to the 26th nearest\nneighbours were found to affect, to a lesser or greater\nextent, the observed dispersions of the magnetic levels\nof the two inequivalent rare-earth.\nOpen white circles\nin Fig. 9 mark the position of inversion centres, which in\nthe orthorhombic structure exist in the Nd layers between\nsame-type sites.\nAlthough all the evidence available points towards an\northorhombic description for Nd2PdSi3 as being the most\naccurate one, the data may not always be sensitive to\nthe full symmetry of the Fddd space group. In order to\ndevelop a model for exchange couplings, we initially as-\nsumed the highest possible symmetry (that imposed by\nP6/mmm), and subsequently lowered it, differentiating\nbetween the ll, ls and ss paths, where l and s refer to\nNd sites with long and short moments. Since these as-\nsumptions have proven to be a minimal requirement of\nthe model, all the exchange matrices are labelled J ij\nn ,\nwhere i, j represent either l or s.\nExchange pathways up to the 5th nearest-neighbours\nare shown in Figs. 10 and 11. The different colors of the\nbonds each correspond to a different J ij\nn which would\nin principle be allowed in the Fddd space group.\nFor\nthe n = 1, 4 in-plane couplings shown in Fig. 10, for\nexample, all the J ll\nn and J ss\nn are symmetry-related, but\nseveral different J ls\nn interactions are allowed in one unit\ncell. When n = 3, Fig. 11, there are different J ij\nn for\neach of ij = ll, ss and ls.\nAfter careful investigation, we found that in most cases\nno improvement to the model could be achieved by allow-\ning all the J ij\nn for the same n to vary independently. The\nonly exceptions are for n = 5 and 26, for which only ll and\nss bonds exist, but not ls. As represented in Fig. 9, some\nof the directions connecting the central atom with n = 5\nor n = 26 neighbors contain inversion centres, while oth-\ners do not. To mark the inequivalence of these two types\nof paths, bonds containing inversion are labelled as J ii\nn\u2032,\nwhere n\u2032 = 5\u2032, 26\u2032 and i = l, s. This notation was also\nused in Fig. 10. All J ls\nn , J ll\nn and J ss\nn for n \u0338= 5, 26 were\nconsidered to be the same apart from a similarity trans-\nformation (see next paragraph). In Figs. 10 and 11 the\nJ ij\nn labels indicate the bonds which we assumed to be\nthe same.\nIf the coupling is of an isotropic Heisenberg type, which\nis our assumption when n = 2 and n \u22655, 5\u2032, an uni-\ntary transformation between any set of orthogonal co-\nordinates will not modify the exchange matrices.\nThe\nsame is not true when the exchange is anisotropic. 
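As a quick numerical check of this statement, the sketch below (plain NumPy, an editorial illustration) applies a 120° rotation about c, of the same form as the 3_001 operation written out just below, to an isotropic and to an anisotropic diagonal exchange matrix: the isotropic matrix is unchanged, the anisotropic one is not, and the trace is preserved in both cases. The anisotropic entries reuse the diagonal of the J^ls_1,a matrix listed further below purely as an example.

```python
import numpy as np

# Three-fold rotation about the c axis (same form as the 3_001 operation given below).
c, s = -0.5, np.sqrt(3) / 2
R = np.array([[ c,   s, 0.0],
              [-s,   c, 0.0],
              [0.0, 0.0, 1.0]])

J_iso = 10.0 * np.eye(3)             # isotropic Heisenberg coupling (arbitrary units)
J_aniso = np.diag([-7.0, 9.5, 7.5])  # anisotropic diagonal coupling (illustrative values)

for name, J in [("isotropic", J_iso), ("anisotropic", J_aniso)]:
    J_rot = R.T @ J @ R              # the same coupling expressed for the rotated bond
    print(f"{name:11s} unchanged under rotation: {np.allclose(J_rot, J)}, "
          f"trace {np.trace(J):.1f} -> {np.trace(J_rot):.1f}")
```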
In\nthis case, the coupling will depend on the basis and bond\ndirections, although the trace of the matrices is always\npreserved. Therefore, before defining the exchange cou-\npling matrices J ij\nn , we have to express the coordinate\nbasis in which they are described. In this work\u2019s conven-\ntion, the local orthonormal basis for a Nd atom located\nat\n\u00001/2, 1/2, 1/4\n\u0001\nin the 2\u00d72\u00d74 conventional unit cell of\ndomain 1 [Fig. 7(a)] is chosen to be \u02c6aex, \u02c6bex, \u02c6cex, where\n\uf8eb\n\uf8ec\n\uf8ed\n\u02c6aex\n\u02c6bex\n\u02c6cex\n\uf8f6\n\uf8f7\n\uf8f8=\n\uf8eb\n\uf8ec\n\uf8ed\n1\n0\n0\n1\n\u221a\n3\n2\n\u221a\n3\n0\n0\n0\n1\n\uf8f6\n\uf8f7\n\uf8f8\n\uf8eb\n\uf8ec\n\uf8ed\n\u02c6a\n\u02c6b\n\u02c6c\n\uf8f6\n\uf8f7\n\uf8f8,\n(7)\nand \u02c6a, \u02c6b, \u02c6c are unit vectors in the hexagonal basis, defined\nrelative to the orthorhombic lattice in Fig. 3 above. This\nreference Nd (long-spin, orange) and its bonds are the\nfirst to be represented in Figs. 10 and 11. For J 1 and\nJ 3, the coordinate transformations are related to bond\ndirections as follows:\n\u2022 J ij\nn,b = 3T\n001 J ij\nn,a 3001, where\n3001 =\n\uf8eb\n\uf8ec\n\uf8ed\n\u22121\n2\n\u221a\n3\n2\n0\n\u2212\n\u221a\n3\n2\n\u22121\n2\n0\n0\n0\n1\n\uf8f6\n\uf8f7\n\uf8f8\nin the (\u02c6aex, \u02c6bex, \u02c6cex) coordinate basis and\n\u2022 J ij\nn,ab = 2T\n100 J ij\nn,b 2100, where\n2100 =\n\uf8eb\n\uf8ec\n\uf8ed\n1\n0\n0\n0\n\u22121\n0\n0\n0\n\u22121\n\uf8f6\n\uf8f7\n\uf8f8\nSimilarly, for J 4:\n6\nFIG. 8.\nConstant-Q cuts performed at (a)-(b) (\u22121/2, 1/2, 0) and (c) (\u22121, 1, 0) r.l.u.\non data collected on CAMEA at a\ntemperature of \u223c2 K for several magnetic fields applied along c. The data were integrated over \u00b10.15 r.l.u. along \u27e8Q\u27e9and\ntwo perpendicular directions. Panel (a) includes the same cut from the LET data measured in zero field, for comparison. Solid\nlines are Gaussian fits which were used to identify the levels shift with field.\nFIG. 9. Diagrams showing up to the 26th nearest-neighbor exchange pathways for superstructure domain 1 of Nd2PdSi3. The\npaths start on a long-spin Nd site (unnumbered, smaller orange circle in the diagram on the left). White circles represent\ninversion centres in the ab-plane. The c coordinate of each of the planes increases from left to right as one moves to the next\nlayer.\n\u2022 J ij\nn,2ab = 3T\n001 J nm\ni,a2b 3001 and\n\u2022 J ij\nn,\u00afab = 2T\n100 J ij\nn,2ab 2100.\nNote that the three-fold rotation symmetry between\nbonds is not enforced by the Fddd space group, but re-\nlated to the assumptions made in this work (see above).\nFinally, the matrices from which the exchange cou-\nplings in the whole unit cell can be generated are stated\nbelow. 
The parameters are given in µeV.
• n = 1: J^ls_{1,a} = diag(−7, 9.5, 7.5), J^ll_{1,b} = diag(10, 10, 10) and J^ss_{1,b} = diag(5.2, 15.6, 10.4).
• n = 3: J^ll_{3,a} = diag(4.7, 2.9, 3.8), J^ss_{3,a} = diag(5.2, 1.3, 2.6) and J^ls_{3,a} = diag(−3.3, 3.3, 3.3).
• n = 4: J^ls_{4,a2b} = diag(4.8, 1.8, 3.6), J^ll_{4,a2b} = diag(−1.6, 2.4, 2.4) and J^ss_{4,a2b} = diag(−5.0, 4.0, 3.0).
FIG. 10. Diagrams showing the first-, fourth- and fifth-nearest-neighbor couplings between different Nd sites. Those correspond to first-, second- and third-nearest-neighbor bonds within the ab plane, respectively. In these figures, the c coordinate of each Nd increases from left to right. First central atom on the left side is located at (1/2, 1/2, 1/4) in the conventional unit cell of Fig. 3. Different colors show bonds which can be inequivalent in the Fddd space group, whereas labels refer to the simplifying assumptions made in this work.
All other coupling matrices are assumed isotropic. The trace of each exchange matrix, which was used to generate Fig. 4 in the main text, is given in Table IV.
The presence of inversion centres between Nd sites of the same type implies that DM coupling along these directions is forbidden. Antisymmetric exchange was, however, considered for all the other bonds where it is allowed, but the agreement between the model and the data did not improve.
V. ADDITIONAL DATA AND MODEL QUALITY CONSIDERATIONS
Data measured on LET with neutrons of incident energy Ei = 15.6 meV along several high-symmetry directions in reciprocal space are shown in the upper panels of Fig. 12. Similar data were collected on MERLIN with Ei = 10 meV, in addition to chopper repetitions of Ei = 20 meV and Ei = 50 meV (not shown). The higher incident energies were used to ascertain that no excitations were present above ∼9 meV.
The calculated spectra of the exchange-coupled, ground-state J multiplet of the Nd ion are shown along
FIG. 11. Diagrams showing the second- and third-nearest-neighbor couplings between different Nd sites. As in Fig. 10, different colors show bonds which can be inequivalent in the Fddd space group. For J_3, left and right hexagons show layers below and above the central rare-earth, respectively.
Coupling   ss (µeV)   ls (µeV)   ll (µeV)   r_n (Å)
J_1          30.0       10.0       31.2      4.11
J_2           6.0        6.0      −30.0      4.21
J_3          11.4        3.3        9.1      5.88
J_4           3.2       10.2        2.0      7.12
J_5′         −4.5        –         −6.3      8.22
J_5          −2.7        –         −4.5      8.22
J_6           3.6       −0.3       −1.8      8.27
J_7           –         −3         –         8.42
J_8           1.8       −1.2        1.2      9.23
J_9           2.4      <|0.3|     <|0.3|     9.37
J_10         −3.6       −1.3       −2.3     10.87
J_14         −3.6       −3.6       −1.8     12.33
J_19         −3.6        –         −1.8     14.23
J_21       <|0.3|       −0.9       −0.9     14.82
J_26′        −0.9        –         −0.9     16.44
J_26       <|0.3|        –         −0.4     16.44
TABLE IV. 
Trace of the exchange matrices in Nd2PdSi3. The\nminimum level of uncertainty in the parameters is \u00b10.3 \u00b5eV.\n8\nFIG. 12. Top panels show data collected on LET with neutrons of Ei = 15.6 meV displaying the coupled ground-J multiplet\nof the Nd ion in Nd2PdSi3. Dispersions are seen propagating along three orthogonal directions: two in plane (left and centre,\nrespectively) and one out-of-plane (right). Bottom panels show the coupled crystal-field and exchange model calculated using\nH defined in the main text.\nwith the data in Fig. 12. The energy of the levels up to\n\u223c5 meV is captured well by our reduced CF model, al-\nthough the agreement is less good for the levels at higher\nenergies. A better fit of these modes would necessarily\nrequire the use of an extended CF Hamiltonian, including\nterms allowed by symmetry but neglected in our calcula-\ntion, as pointed out in the main text. This simplification\nof HCF means that in-plane single-ion anisotropy, which\ncertainly exists in the real system \u2013 see discussion below\n\u2013 is not considered in the model.\nFig. 13 shows the limits and definitions of the Bril-\nlouin zone used in the figure labels of the reciprocal space\nhypervolume measured with neutrons of Ei = 6 meV.\nThese refer to Fig. 2 in the main text and Fig. 14 be-\nlow, which shows additional data measured away from\n\u0393 points. Extra constant-energy cuts performed in ad-\ndition to those shown in Fig. 4 of the main text can be\nseen in Fig. 15.\nFig. 16, which contains constant-Q cuts, illustrates the\nlevel of agreement for several Brillouin zone points. The\nmajority of the small details in the excitations are well\ndescribed by our model, including the single-ion ener-\ngies and dispersion of the modes. On the other hand,\nsome simulations in Fig. 14 make evident the reciprocal\nspace regions in which the calculation provides a less sat-\nisfactory description of the spectrum. One of the main\ndifferences between experiment and model is the inten-\nsity modulation caused by anisotropy in the energy level\nlocated between 3.6 and 3.8 meV, seen in the constant-\nenergy cuts of Fig. 15. Unlike what happened with the\nother levels appearing at similar energies, our attempts to\nfit the intensity of this mode considering only exchange\nFIG. 13. Hexagonal Brillouin zone used in this work, showing\nlabelled high-symmetry points which are inequivalent in the\nFddd space group. (a) shows in-plane and (b)\u2013(c) show out-\nof-plane directions.\nanisotropy failed.\nA less good agreement is also seen\nin the out-of-plane directions, for which the data vol-\nume covered in the experiment is significantly smaller.\nThe complete determination of exchange and single-ion\nanisotropies in the system is, in principle, possible, but\nwould add an extra layer of complexity in an already very\ncomplicated problem, and would require additional data.\n9\nFIG. 14. Data collected on LET with neutrons of Ei = 6 meV(top) along with calculated model (bottom) at a temperature of\n1.7 K. Corresponding Brillouin zone labels are described in Fig. 13.\nFIG. 15. Constant-energy slices through data (left) and models (middle and right) integrated over the energy intervals (a)\n2.7 to 3.0 meV and (b) 3.6 to 3.8 meV, analogous to Fig. 4 in the main text. 
Middle and right panels show simulations using\nanisotropic and isotropic exchange coupling, respectively.\nVI.\nELECTRONIC STRUCTURE\nCALCULATIONS\nThe electronic structure calculations for Nd2PdSi3\nand Gd2PdSi3 were performed using a crystallographic\nunit cell with an orthorhombic space group Fddd and\na 2a \u00d7 2a \u00d7 4c superstructure. In other words, we as-\nsume the superstructure found in Nd2PdSi3 applies for\nboth compounds, to facilitate comparison of their elec-\ntronic structures. Further information about the theo-\nretical approach can be found in Refs. 10 and 11. The\nlattice constants were taken from experiment and given\nin Table V, with five Wyckoff positions: one Pd, two\nrare-earth R, and two Si populated sites.\nThe primi-\ntive cell contains 24 atoms with 4 Pd, 12 Si, and 8 rare-\nearth atoms.\nThese rare-earth atoms are divided into\ntwo groups of four atoms each; R1 (R2) is referred to as\nlong (short) spin in the main text. The electronic struc-\n10\nFIG. 16. Constant-Q cuts found by integrating scattering in-\ntensity measured at the reciprocal-lattice point written in the\nfigure labels \u00b10.05 r.l.u. in the three dimensional reciprocal\nspace.\nture calculations are performed using multiple scattering\n(KKR) Density Functional Theory. The single-site po-\ntentials are generated using the HUTSEPOT DFT-KKR\ncode [12], which incorporates the local self-interaction\ncorrection on the 4f electron orbitals [13]. The 4f elec-\ntronic configuration is initially set to the Hund\u2019s Rule\nstate configuration, with a spin of S = 3/2 and orbital\nangular momentum L = 6 for Nd, and S = 7/2 and\nL = 0 for Gd.\nThe local density of states (LDOS) is obtained, includ-\ning relativistic effects, using the MARMOT electronic\nstructure code [14]. The DFT-KKR Green function is\nexpanded using spherical harmonics with an orbital an-\ngular momentum cutoff lmax = 3. The LDOS is obtained\nusing a mesh of energies with spacing of 0.005 Ry. and\nwith an adaptive Brillouin zone integration with relative\nerror over every division in wave-vector space of less then\n0.1%. The LDOS near the Fermi energy is depicted in\nFig. 5 in the main text and shows that R1 and R2 are in-\nequivalent for both compounds. This inequivalence stems\nfrom the different surrounding immediate environments\n(see main text, Fig. 5), with R1 surrounded by a hexago-\nnal Si lattice in one adjacent layer, and the other hexag-\nonal layer containing four Si atoms and two Pd atoms.\nIn contrast, R2 is surrounded by four Si atoms and two\nPd atoms from both adjacent layers, leading to a differ-\nent crystal field environment. For Nd2PdSi3, the spin-up\nchannel is split into two peaks. One of them is partially\noccupied, while the other is located \u223c0.5 eV above the\nFermi energy (see figure). For Gd2PdSi3, the 4f spin-up\nchannel is fully occupied, and the unoccupied 4f spin-\ndown channel forms a single peak located around 1 eV\nabove the Fermi energy. The corresponding spin and or-\nbital local moments obtained from the relativistic calcu-\nlations are given in Table V for Nd2PdSi3 and Gd2PdSi3,\nrespectively. The Nd compound has spin (orbital) mo-\nments close to 3 \u00b5B (6 \u00b5B) while the Gd-one has large\nGd spin moments (7 \u00b5B) as expected from Hund\u2019s Rules.\nThe negligible orbital moments on the Gd atoms origi-\nnate from the spin polarization of the valence electrons\n(trivalent 5d1 6s2).\n[1] D. Mayoh, A. \u02c7Stefan\u02c7ci\u02c7c, M. Lees, and G. Balakrishnan,\nJ. Cryst. 
Growth 642, 127774 (2024).\n[2] A. Szytu la, M. Hofmann, B. Penc, M. \u00b4Slaski, S. Majum-\ndar, E. Sampathkumaran, and A. Zygmunt, J. Magn.\nMagn. Mater. 202, 365 (1999).\n[3] M. Smidman, C. Ritter, D. T. Adroja, S. Rayaprol,\nT. Basu, E. V. Sampathkumaran, and A. D. Hillier, Phys.\nRev. B 100, 134423 (2019).\n[4] R. I. Bewley, T. Guidi, and S. M. Bennington, Notiziario\nNeutroni e Luce di Sincrotrone 1, 22 (2009).\n[5] R. Bewley, J. Taylor, and S. Bennington., Nucl. Instrum.\nMethods Phys. Res. A: Accel. Spectrom. Detect. Assoc.\nEquip. 637, 128 (2011).\n[6] J. Lass, H. Jacobsen, K. M. L. Krighaar, D. Graf,\nF. Groitl, F. Herzog, M. Yamada, C. K\u00a8agi, R. A. M\u00a8uller,\nR. B\u00a8urge, M. Schild, M. S. Lehmann, A. Bollhalder,\nP. Keller, M. Bartkowiak, U. Filges, U. Greuter, G. Thei-\ndel, H. M. R\u00f8nnow, C. Niedermayer, and D. G. Mazzone,\nRev. Sci. Instrum. 94, 023302 (2023).\n[7] J. Lass, H. Jacobsen, D. G. Mazzone, and K. Lefmann,\nSoftwareX 12, 100600 (2020).\n[8] F. Tang, M. Frontzek, J. Dshemuchadse, T. Leisegang,\nM. Zschornak, R. Mietrach, J.-U. Hoffmann, W. L\u00a8oser,\nS. Gemming, D. C. Meyer, and M. Loewenhaupt, Phys.\nRev. B 84, 104105 (2011).\n[9] J. Rodr\u00b4\u0131guez-Carvajal, Phys. B: Condens. Matter 192,\n55 (1993).\n[10] J. Bouaziz, G. Bihlmayer, C. E. Patrick, J. B. Staunton,\nand S. Bl\u00a8ugel, Phys. Rev. B 109, L201108 (2024).\n[11] J. Bouaziz, E. Mendive-Tapia, S. Bl\u00a8ugel, and J. B.\nStaunton, Phys. Rev. Lett. 128, 157206 (2022).\n[12] M. Hoffmann, A. Ernst, W. Hergert, V. N. Antonov,\nW. A. Adeagbo, R. M. Geilhufe, and H. Ben Hamed,\nPhys. Status Solidi B 257, 1900671 (2020).\n[13] M. L\u00a8uders, A. Ernst, M. D\u00a8ane, Z. Szotek, A. Svane,\nD. K\u00a8odderitzsch, W. Hergert, B. L. Gy\u00a8orffy, and W. M.\nTemmerman, Phys. Rev. B 71, 205109 (2005).\n[14] C. E. Patrick and J. B. Staunton, Electronic Structure\n4, 017001 (2022).\n11\nQuantity\nNd2PdSi3\nGd2PdSi3\nLattice Constants (a, b, c) (\u02daA)\n(14.2345, 8.2183, 16.8442)\n(14.0566, 8.1156, 16.3507)\nSpin Moment (\u00b5B)\n(Nd1: 3.2244, Nd2: 3.2389)\n(Gd1: 7.0723, Gd2: 7.0852)\nOrbital Moment (\u00b5B)\n(Nd1: -5.7582, Nd2: -5.7743) (Gd1: 0.0747, Gd2: 0.0763)\nTABLE V. Comparison of lattice constants and spin/orbital moments for Nd2PdSi3 and Gd2PdSi3." - }, - { - "domain": "Materials Science", - "chunk_type": "general", - "text": "1\nUnleashing Expert Opinion from Social Media for\nStock Prediction\nWanyun Zhou\u2217, Saizhuo Wang\u2020, Xiang Li\u2217, Yiyan Qi\u2021, Jian Guo\u2021, Xiaowen Chu\u2217, Fellow, IEEE\n\u2217The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China\n\u2020 The Hong Kong University of Science and Technology, Hong Kong SAR\n\u2021 IDEA Research, Shenzhen, China\nAbstract\u2014While stock prediction task traditionally relies on\nvolume-price and fundamental data to predict the return ratio\nor price movement trend, sentiment factors derived from social\nmedia platforms such as StockTwits offer a complementary and\nuseful source of real-time market information. However, we find\nthat most social media posts, along with the public sentiment they\nreflect, provide limited value for trading predictions due to their\nnoisy nature. To tackle this, we propose a novel dynamic expert\ntracing algorithm that filters out non-informative posts and iden-\ntifies both true and inverse experts whose consistent predictions\ncan serve as valuable trading signals. 
Our approach achieves sig-\nnificant improvements over existing expert identification methods\nin stock trend prediction. However, when using binary expert\npredictions to predict the return ratio, similar to all other expert\nidentification methods, our approach faces a common challenge\nof signal sparsity with expert signals cover only about 4% of all\nstock-day combinations in our dataset. To address this challenge,\nwe propose a dual graph attention neural network that effectively\npropagates expert signals across related stocks, enabling accurate\nprediction of return ratios and significantly increasing signal cov-\nerage. Empirical results show that our propagated expert-based\nsignals not only exhibit strong predictive power independently\nbut also work synergistically with traditional financial features.\nThese combined signals significantly outperform representative\nbaseline models in all quant-related metrics including predictive\naccuracy, return metrics, and correlation metrics, resulting in\nmore robust investment strategies. We hope this work inspires\nfurther research into leveraging social media data for enhancing\nquantitative investment strategies. The code can be seen in\nhttps://github.com/wanyunzh/DualGAT.\nIndex Terms\u2014Stock prediction, expert identification, social\nmedia analysis, graph neural networks\nI. INTRODUCTION\nQuantitative investment research has traditionally relied on\nvolume-price data and fundamental data for stock prediction,\nwith the goal of forecasting return ratios or price movement\ntrends [1]\u2013[6]. In recent years, there has been a growing\nfocus on utilizing sentiment features derived from social media\nas a complementary information source for stock prediction\n[7]\u2013[12]. Social media platforms dedicated to investment and\ntrading discussions (e.g., StockTwits, reddit) have become\nincreasingly influential in financial markets [13], [14]. These\nplatforms not only reflect investor behavior and market sen-\ntiment in real time but also demonstrate significant market-\nmoving potential. A representative example is the GameStop\nshort squeeze [15] that originated on social media and led\nCorresponding\nauthors:\nJian\nGuo,\nXiaowen\nChu.\nEmail:\nguojian@idea.edu.cn, xwchu@hkust-gz.edu.cn\nto unprecedented market volatility. Recognizing this potential,\nsocial media data has been actively incorporated into trading\nstrategies by professional trading firms (e.g., WorldQuant, the\ntop-tier hedge fund) to find predictive signals [16]. Among all\nthe investment-related social media platforms or communities,\nStockTwits [17] stands out as a specialized platform designed\nspecifically for investors to share insights on various stocks.\nOn StockTwits, each stock has its own dedicated discussion\nthread, where users can post messages related to that stock.\nUnlike general social media platforms where sentiment of the\nposts typically relies on Large Language Models with potential\ninaccuracies, StockTwits offers a unique advantage: users can\nexplicitly self-label their posts with \u201cBullish\u201d or \u201cBearish\u201d\ntags, providing clear and accurate sentiment indicators for\nspecific stocks. 
This self-labeling mechanism eliminates the\nuncertainty associated with algorithmic sentiment analysis and\noffers more reliable sentiment signals.\nAlthough StockTwits provides valuable self-labeled senti-\nment data and contains a vast amount of discussions, extracting\nuseful trading signals for either stock movement trend or\nreturn ratio prediction from the raw social media data remains\nchallenging due to its substantial noise. To demonstrate that\nmost social media posts are noisy and non-informative for\ntrading predictions, we collected all StockTwits posts from\n2017 to 2023 and examined the relationship between public\nsentiment and subsequent stock price movements. We applied\na native filter and aggregation method that identifies the\ndominant sentiment for each stock based on a threshold of\nposts and the intensity of the sentiment. Specifically, if the\nnumber of posts for a given stock exceeded 30 on a given\nday, and one sentiment (either bullish or bearish) dominated by\nmore than 85%, we used that sentiment to predict the stock\u2019s\ntrend for the following day. As shown by the solid blue line\nwith square markers in Figure 1, the accuracy of dominant\nsentiment for predicting the stock\u2019s movement trend (bullish\nfor rise and bearish for fall) on the next day (T+1) was 47.63%,\n3 days later (T+3) was 46.99%, and 7 days later (T+7) was\n47.01%. These results suggest that using aggregated sentiment\nfor stock prediction yields accuracy that is not even better than\nrandom guessing, demonstrating the limited predictive value of\nsocial media sentiment. Such underperformance indicates that\nthe majority of social media posts contain significant noise\nrather than actionable trading information.\nIn fact, social media data is inherently noisy when used as\ntrading signals [18]\u2013[22]. This is largely due to the open nature\narXiv:2504.10078v1 [cs.CE] 14 Apr 2025\n2\nof these platforms, where anyone can post opinions, many of\nwhich neither influence market movements nor have predictive\npower. Retail investors\u2019 discussions, for example, have limited\nimpact on stock movements, and many of their posts on\nthese platforms are speculative, driven by personal biases, or\nbased on rumors [23]. These non-professional posts often lack\ninvestment insights, making them unreliable as trading signals\n[24]. Additionally, insider traders can deliberately manipulate\nsocial media to influence investor sentiment and, in turn, affect\nmarket trends [25]. Moreover, the prevalence of emotional and\nimpulsive posts during market volatility further contributes to\nthis noise, as users tend to react to price movements rather than\nprovide predictive insights [26]. As a result, truly informative\nposts from expert users are rare, making it crucial to accurately\nidentify these experts amidst the noise [20].\nDespite efforts in existing research to identify expert signals\n[18]\u2013[20], [27], [28], current methods struggle to reliably\ndistinguish true experts from users who simply appear to be\nexperts due to random correlations with market movements.\nThese methods often overlook the fact that true experts should\nbe able to make precise predictions across different market\nregimes. 
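For concreteness, the filter-and-aggregate baseline described at the start of this section (more than 30 posts per stock-day, one sentiment above 85%) can be written down in a few lines. The column names and the toy data below are illustrative assumptions, not the released dataset schema; as the accuracy figures quoted above show, this aggregated signal performs no better than chance.

```python
import pandas as pd

MIN_POSTS, DOMINANCE = 30, 0.85  # thresholds of the naive baseline described above

def naive_daily_signal(posts: pd.DataFrame) -> pd.DataFrame:
    """posts columns (assumed): date, ticker, sentiment in {'Bullish', 'Bearish'}."""
    counts = (posts.groupby(["date", "ticker"])["sentiment"]
                   .value_counts().unstack(fill_value=0))
    total = counts.sum(axis=1)
    bull_share = counts.get("Bullish", 0) / total
    signal = pd.Series(pd.NA, index=counts.index, dtype="object")
    keep = total > MIN_POSTS
    signal[keep & (bull_share > DOMINANCE)] = "Bullish"
    signal[keep & (bull_share < 1 - DOMINANCE)] = "Bearish"
    return signal.dropna().rename("signal").reset_index()

# Toy example: 40 posts on one stock-day, 90% bullish -> emits a "Bullish" signal.
demo = pd.DataFrame({"date": ["2021-01-04"] * 40,
                     "ticker": ["GME"] * 40,
                     "sentiment": ["Bullish"] * 36 + ["Bearish"] * 4})
print(naive_daily_signal(demo))
```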
Their expert identification approaches tend to favor\nusers who post frequently and whose posts happen to be\ncorrect during specific market trends, leading to biased results.\nFurthermore, these methods do not address the widespread\npresence of bots and spam accounts in social media posts,\nfurther undermining their effectiveness in identifying genuine\nexpert signals.\nTo address these limitations in expert identification, we\npropose a novel dynamic expert tracing algorithm tailored to\nstock data. Our algorithm incorporates multiple key principles\nto ensure robust expert identification. First, it filters out noise\nfrom bots and spammers to address the data quality issue.\nSecond, it evaluates both long-term and dynamic prediction\naccuracy across various market regimes to identify experts.\nThird, it ensures that experts focus only on a manageable\nnumber of stocks by filtering out users with numerous posts on\ndifferent stocks in a single day. Furthermore, we identify both\ntrue experts and \u2018inverse experts\u2019\u2014users who consistently\nmake incorrect predictions. The identification of inverse ex-\nperts are valuable trading signals for two main reasons. First,\nresearch in behavioral economics [29], [30] shows that several\ncognitive biases can lead to systematic misjudgments in market\npredictions, a characteristic frequently observed in inverse\nexperts. This is particularly relevant since most traders in the\nmarket consistently lose money due to these cognitive biases,\neffectively making them inverse experts whose predictions can\nserve as reliable contrary indicators. Second, some inverse\nexperts may emerge from deliberate market manipulation.\nCertain institutions or large investors strategically manipulate\nstock prices through social media for profit. For example,\nwhen they want to sell, they use controlled media accounts\nto post bullish views to encourage retail investors to buy. This\nallows them to offload their shares at higher prices while they\nsell in large volumes, causing the price to drop (see [31] for\ncases). Conversely, when these institutions want to buy, they\npost bearish views to prompt retail investors into selling. This\nenables them to accumulate shares at lower prices before they\ndrive the price up. These manipulated media accounts, though\ndeceptive in intent, can act as reliable inverse experts that\nprovide valuable trading insights.\nFor true experts, we can align our trading actions with\ntheir sentiment predictions (taking long positions for bullish\nsignals and short positions for bearish ones), while for in-\nverse experts, we can strategically take opposite positions.\nBy applying this dual-strategy approach with our expert iden-\ntification method, we predict stock price movements across\ndifferent time horizons: the next day (T+1), 3 days later (T+3),\nand 7 days later (T+7). As shown in Figure 1, the experts\nidentified by our expert tracing system demonstrate notably\nhigher accuracy in predicting future stock movement trends,\nleading to significant improvements in expert identification\nover existing methods. While predicting the stock future trend\nis important, the primary objective of quantitative investing is\nnot just to identify trends but to maximize excess returns by\nselecting stocks with the highest potential for future profits.\nThus, our model aims to predict the stock return ratio for\nthe next trading day. 
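The identification step outlined above can be sketched roughly as follows. The thresholds, regime labels, and column names are editorial assumptions for illustration; the actual algorithm, including the bot filtering and the dynamic (rolling) accuracy evaluation, is more involved than this static version.

```python
import pandas as pd

MIN_CALLS_PER_REGIME = 20         # assumed minimum sample size per market regime
TRUE_ACC, INVERSE_ACC = 0.7, 0.3  # assumed accuracy cut-offs
MAX_TICKERS_PER_DAY = 5           # crude spam/bot filter on posting breadth

def classify_users(calls: pd.DataFrame) -> pd.Series:
    """calls columns (assumed): user, date, ticker, regime, correct (bool)."""
    # Drop users who blanket-post on many different stocks in a single day.
    breadth = (calls.groupby(["user", "date"])["ticker"].nunique()
                    .groupby(level="user").max())
    calls = calls[calls["user"].isin(breadth[breadth <= MAX_TICKERS_PER_DAY].index)]

    # Per-user, per-regime hit rate, requiring a minimum number of calls.
    stats = calls.groupby(["user", "regime"])["correct"].agg(["mean", "count"])
    stats = stats[stats["count"] >= MIN_CALLS_PER_REGIME]

    labels, n_regimes = {}, calls["regime"].nunique()
    for user, acc in stats.groupby(level="user")["mean"]:
        if len(acc) < n_regimes:             # must qualify in every regime
            continue
        if (acc >= TRUE_ACC).all():
            labels[user] = "true_expert"     # trade with their calls
        elif (acc <= INVERSE_ACC).all():
            labels[user] = "inverse_expert"  # trade against their calls
    return pd.Series(labels, name="expert_type", dtype="object")
```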
To achieve this, we transform the expert\npredictions, which serve as binary trend indicators (rise or\nfall), into continuous return ratios that can be treated as expert\nsignals. Once we have used these expert signals to predict the\nreturn ratio, we can evaluate the model\u2019s performance using\nvarious quant-related metrics that are crucial for investment\nstrategies. These include correlation metrics that reflect the\nrelative strength of the relationship between predicted and ac-\ntual returns (e.g., information coefficient and rank information\ncoefficient), return metrics that measure the profitability and\nrisk-adjusted returns (e.g., annualized return and sharpe ratio),\nand predictive accuracy. These metrics collectively provide a\ncomprehensive reference for evaluating investment strategies\nand establish a solid foundation for making informed and\nrobust investment decisions.\nThough expert signals can provide strong predictions, chal-\nlenges remain in their practical application. Like all methods\nthat rely on expert predictions for trading signals, our strict\nexpert tracing algorithm faces a common challenge of signal\nsparsity [19], [27], [28]. The resulting expert signals cover\nonly about 4% of all stock-day combinations in our dataset\n(i.e., most stocks have no expert predictions on most trading\ndays), making it impractical for some quantitative investment\nscenarios, such as cross-sectional stock selection. To mitigate\nthis sparsity issue, and considering the high dependency be-\ntween asset returns in financial markets [32]\u2013[34], we aim\nto leverage the powerful propagation capabilities of graph\nneural networks to enhance the flow of information [35],\n[36]. Therefore, we propose a message-passing mechanism\nbased on dual graph attention networks, termed DualGAT,\nto propagate expert signals across related stocks. By using\nvarious sources of relational information on stocks to construct\nmultiple graph structures, our approach effectively broadens\nthe coverage of expert signals throughout the dataset. Addi-\ntionally, when incorporating these expert signals with other tra-\nditional financial features (e.g., price-volume and fundamental\ndata), the combined signals consistently demonstrate enhanced\npredictive power in forecasting stock returns. This synergistic\neffect suggests that our expert signals capture distinct stock\n3\nFig. 1. Comparison of stock prediction accuracy (%) between different expert\nidentification methods and naive sentiment aggregation methods.\ncharacteristics that complement, rather than merely duplicate,\nthe information of traditional financial features. Such comple-\nmentarity is particularly valuable for quantitative strategies as\nit diversifies the sources of information, thereby enhancing the\nrobustness of our investment framework. The pipeline of our\nmethod can be seen in Fig 2. In summary, our contributions\nare as follows:\n\u2022 We establish a comprehensive and up-to-date social me-\ndia dataset, offering users easy access to several years\nof curated StockTwits data, eliminating the complexities\nof web scraping and making it more accessible for\nresearchers and practitioners in the field.\n\u2022 We address the issue of noisy social media data in stock\nprediction and propose a novel dynamic expert tracing al-\ngorithm that effectively identifies valuable trading signals\nby distinguishing both true experts and inverse experts\nfrom noise. 
Our approach substantially outperforms exist-\ning expert identification methods, with identified experts\nshowing significantly higher accuracy in predicting stock\nmovement trends.\n\u2022 We address the common signal sparsity challenge in\nexpert-based trading strategies through our DualGAT\nmodel, which greatly enhances signal coverage by prop-\nagating expert predictions across related stocks.\n\u2022 We demonstrate that our expert signals can work syn-\nergistically with traditional financial features, and these\ncombined signals significantly outperform representative\nbaseline models across all metrics, leading to more robust\nquantitative investment strategies.\nII. RELATED WORK\nA. Social Media-based Stock Prediction\nStock prediction using social media data has gained sig-\nnificant attention in recent years due to its potential to\ncapture market sentiment and crowd wisdom. Early research\nprimarily focused on sentiment analysis of social media posts\nto predict market movements. For instance, Bollen et al.\n[37] analyzed Twitter mood to predict daily stock market\nchanges, while Nguyen et al. [38] developed sentiment-based\nfeatures from StockTwits messages for market prediction.\nThese studies demonstrated that social media sentiment could\nserve as a valuable indicator of market trends. As deep\nlearning advanced, especially in natural language processing,\nresearchers began incorporating more sophisticated methods.\nDeepClue [39] developed a hierarchical neural network to\nprocess company-related tweets from the social media and\nvisualize their contribution to the prediction of stock move-\nments. Xu et al. [7] presented a market information encoder\nwith a novel deep generative model to jointly encode tweets\nand price signals for stock prediction while Sawhney et al.\n[8] proposed MAN-SF that uses bilinear transformations to\nlearn temporal interactions between tweet representations and\nhistorical prices. Zhang et al. [12] integrated social media\nsentiment as one of the model inputs in their dynamic graph\nattention network.\nHowever, as mentioned in the introduction, tweets from\nsocial media are inherently noisy. Moreover, our experiments\n(Figure 1) show that public sentiment has limited predic-\ntive power for stock movements, which is also supported\nby Oliveira et al. [40] and Coyne et al. [19]. While the\naforementioned studies have made progress in leveraging\nsocial media data for stock prediction, they did not effectively\nfilter the massive amount of tweets, potentially leading to\nmodels processing largely non-informative information. This\nhighlights the importance of identifying experts within social\nmedia platforms whose predictions can provide informative\nand valuable trading signals.\nB. Expert Identification in Social Media\nExpert identification in social media has emerged as a\ncritical research area, with various approaches developed\nto address this challenge. Early studies primarily employed\nauthority-oriented methods or topic-oriented approaches [41],\n[42]. Authority-oriented methods maninly focus on users\u2019\nreputation or social influence by constructing the user-to-\nuser graph. For instance, Bouguessa et al. [43] employed\nan in-degree method to identify experts while Zhu et al.\n[44] ranked user authority in an extended category graph.\nAlthough effective, these approaches may not always match\nnew questions with relevant expertise. 
In contrast, topic-\noriented methods leverage latent topic modeling and linguistic\nanalysis of users\u2019 posts to match experts based on question\ncontent. Guo et al. and Zhou et al. [45], [46] introduced topic-\nbased probabilistic models to question-answering activities and\nidentify experts. Weng et al. [47] proposed TwitterRank, which\nidentifies influential users within specific domains by perform-\ning a topic-sensitive random walk that models the transition\nprobability from one Twitter user to another. Additionally,\ntemporal dynamics have been recognized as crucial factors in\nexpertise identification, as demonstrated by Chi et al. [48], who\nshowed that dynamic LDA topic models outperform traditional\nstatic approaches in tracking evolving expertise indicators.\nIn the context of financial social media, expert identification\nprimarily focuses on users\u2019 prediction accuracy for securities.\nSeveral studies [18]\u2013[20], [27], [28] have attempted to identify\nfinancial experts based on their historical prediction accuracy\nover a specific time period or number of posts. However, these\n4\nExpert\nLucky\nScam\nRobot\nMob\nTime\nUser \nSpectrum\nSparse Expert Prediction\nPropagated Expert Prediction\nMarket Data & \nFundamental Features\nMulti-view Expert signal Propagation\nDownstream Model\nCorrect\nIncorrect\nKnowledge \nGraph\nSector & \nIndustry\n\u2026\u2026\nStatistical \nCorrelation\nIdentify & \nExtract\nInverse Expert\nuser_id, stock, tweet_time, sentiment, past_num, past_accuracy,\u2026\u2026\nIdentify & \nExtract\nDaily cross-sectional return predictions\nFig. 2. An overview of our proposed method. We first identify experts and inverse experts from social media based on their historical prediction performance\nacross different market regimes, as shown in the left part where the prediction patterns of different types of social media users (including experts/inverse\nexperts, bots, spammers, lucky users, and mob, see Sec.IV Expert User Identification for detailed analysis) are illustrated. Then we identify and extract these\nexpert signals which are sparse in nature. To address the sparsity issue, we propose a dual graph attention network (DualGAT) that takes both the sparse expert\nsignals and features obtained by the market and fundamental data as input. The DualGAT incorporates relational information among stocks from multiple\naspects to propagate expert signals across related stocks, significantly expanding the coverage of expert signals, and finally outputs the daily cross-sectional\nreturn predictions for each stock.\nmethods fail to filter out automated bots or spammers and\nsuffer from the problem of insufficient past data criteria when\nidentifying expert users (see Sec.IV Expert User Identifica-\ntion for a detailed analysis). These limitations are effectively\naddressed by our dynamic expert tracing algorithm.\nC. Graph-based Methods in Financial Applications\nDue to the phenomenon of stock momentum spillover,\nGraph Neural Networks (GNNs) and graph modeling have\ndemonstrated remarkable success in financial applications by\neffectively modeling complex relationships between market\nentities [4], [6], [49]\u2013[51]. Recent studies have leveraged\nGNN architectures to model both market relationships and\ntemporal dynamics. Chen et al. [52] proposed a collaborative\nmodel that employs LSTM and GCN. Feng et al. 
[2] intro-\nduced a temporal graph convolution layer to capture stock\nrelations in a time-sensitive manner, enabling the evolution\nof relational weights among connected edges over time. Hsu\net al. [3] developed FinGAT, which employs graph attention\nnetworks to identify profitable stock combinations by learning\nlatent interactions among stocks and sectors. Zhao et al. [49]\nproposed a bi-typed market knowledge graph approach with\ndual attention networks to capture momentum spillover signals\nbetween stocks and related entities such as executives through\nheterogeneous GNNs. Wang et al. [53] introduced HATR-I, a\nhierarchical adaptive temporal relational model that formulate\ndifferent views of domain adjacency graphs into a unified\nmultiplex network and use the multi-stage relational matching\nfor stock prediction. Liu et al. [54] proposed a novel graph\nmodeling method to explore the interactions among stock\nfactors. Ying et al. [55] captured temporal relationships using\nboth sequential features and attributes from stock documents\nthrough a time-aware relational attention network. Li et al.\n[56] transform the price series into a graph structure using\nchart similarity to forecast turning points in stock price.\nDifferent from these works that primarily focus on modeling\ninter-stock market relationships, our GNN framework not only\ncaptures the intricate dependencies among stocks but also\nserves as an effective mechanism for propagating sparse expert\nsignals across related stocks, thereby addressing the critical\nchallenge of signal sparsity in expert-based trading strategies.\nIII. PROBLEM FORMULATION\nIn this paper, we first aim to identify expert users and extract\ntheir trading signals from social media posts of the current\ntrading day. Based on these expert signals, combined with\nstock features from previous trading days, we then predict the\nstock return ratio for the next trading day. Specifically, given a\nset of stocks S, for each stock u \u2208S on trading day t, we col-\nlect both price-volume features (open,high,low,close,volume)\nand fundamental indicators to form a feature vector xu,t \u2208RF .\nWe focus on predicting the stock return ratio as it normalizes\nthe price variation between different stocks. The return ratio\nat day t + 1 is defined as:\nru,t+1 = (cu,t+1 \u2212cu,t)/cu,t\n(1)\nwhere cu,t is the closing price of stock u at day t. To predict\nthe return ratio {ru,t+1}u\u2208S for day t + 1, our model takes\nthree types of inputs:\n5\n1) Historical market data: For each stock u\n\u2208\nS,\nwe\nuse\nprice\nand\nfundamental\nfeatures\nXu,t\n=\n[xu,t\u2212L, . . . , xu,t\u22121] \u2208RF \u00d7L from day t \u2212L to t \u22121\nwith a rolling window of size L. Unlike most stock\nprediction papers that use data up to timestep t, we use\nprice features until t \u22121 to avoid look-ahead bias, as\nthe exact price for day t are only available after market\nclose.\n2) Expert signals: For each stock u \u2208S, we perform real-\ntime collection of StockTwits posts during day t. The\ncollection process ends 5 minutes before market close,\nfrom which we extract expert signal eu,t if available.\n3) Dynamic stock graph: To capture the relationships be-\ntween stocks, we construct a dynamic graph Gt =\n(Vt, Et) based on price data up to day t \u22121, where\nnodes Vt = S represent the complete set of stocks and\nedges Et capture their relationships (see Sec. V.C Dual\nGraph Attention Network for details).\nIV. 
EXPERT USER IDENTIFICATION\nDespite the presence of noise in social media data, there is\nstill potential to harness it by developing more sophisticated\nmodels. One promising avenue is through building an expert\ntracing system, as proposed by several studies that attempted\nto identify expert users within social media platforms. Among\nthem, the MFN model proposed by Wang et al. [20], the Smart-\nUser-Classification model by Coyne et al. [19], the Winners\nmodel by Liao et al. [28] and \u201cconsistently correct user iden-\ntification\u201d by Bouadjenek et al. [27] aimed to classify expert\nusers from social media posts that carry subjective sentiments.\nWhile the former three models rely on external sentiment\nanalysis models to label tweets, which might introduce errors,\nour dataset from StockTwits contains user-labeled sentiments\n(bullish or bearish), which eliminates the need for external\nsentiment inference and reduces potential errors. However,\nwhen applied to our comprehensive dataset spanning the past\nfive years, these methods all failed to identify experts/inverse\nexperts who could consistently make (in)accurate predictions\nduring the test period. Users identified as experts in the training\nphase did not perform reliably as experts in the test phase.\nSeveral factors contribute to the failure of these models,\nand they are key reasons why these approaches do not work\neffectively in our data. Additionally, we propose solutions to\naddress these issues in our model:\n1) Filtering spammers: Neither of these models applied\neffective filtering to remove noise generated by auto-\nmated bots or spammers. We discovered that social\nmedia platforms contain posts from bots or spammers,\ncharacterized by repetitive posts at fixed intervals (e.g.,\npublishing content around every 24 seconds), all carry-\ning sentiment labels. These spammers can post hundreds\nor even thousands of tweets with sentiment labels per\nday about the same stock. These excessive posts distort\nsentiment analysis and need to be partially filtered. To\naddress this issue, for each user-stock pair, we only\nretain the post closest to market closing time on any\ngiven day.\n2) Sufficient Past Data Criteria: Existing methods employ\ninsufficient criteria for expert identification, either using\na fixed number of N past posts (e.g., N=10) [28] or\nconsidering prediction accuracy only over a short period\nof K days (e.g., K=40 or 90) [20], [27]. However, we\nfound these methods unreliable. In our dataset, even after\nfiltering out spammers, users who achieved over 80%\naccuracy in their past 20 posts had an average accuracy\nof only 54.0%, and users who achieved 80% accuracy in\nthe past 40 days had an average accuracy of only 53.9%\nduring the testing phase. Further analysis revealed that\nmost users only have high prediction accuracy during\nspecific periods, after which their accuracy declines\nsignificantly, and vice versa. For example, users who\nposted for more than 20 days in 2022 with an accuracy\nabove 80% had an average accuracy about 50% in 2023,\nwhile users who had accuracy below 20% in 2022 saw\ntheir accuracy rise to nearly 50% in 2023. To better\nunderstand the shortcomings of existing approaches and\naddress these limitations in expert identification, we\npropose two key criteria:\n\u2022 Precise\nprediction\nacross\ndifferent\nmarket\nregimes: Some users consistently post only bullish\npredictions every day, while others post only bearish\npredictions. 
For those posting only bullish predic-\ntions, when the stock is in an upward trend, they\nmay appear to be experts due to their constant\nbullish posts, but during market corrections, these\nsame bullish posts lead to poor predictions, effec-\ntively turning them into inverse experts. Similarly,\nusers who consistently post bearish predictions may\nseem expert-like during market downturns but be-\ncome inverse experts during upward trends. These\nusers have little value for analysis as they do not\nbase their predictions on real market insight. To\naddress this issue, not only do we consider the accu-\nracy of the past N tweets to ensure the dynamism\nof the experts, but we also need to look at their\nlong-term predictions (two years) and see if they\nmaintain a high level of accuracy, as most stocks\nwill not remain in the same trend over an extended\nperiod of time.\n\u2022 Focused Expert Evaluation: True market experts\ntypically specialize in a manageable number of\nstocks to maintain deep analysis and insight. How-\never, some users post multiple bullish or bearish\nsentiments for a large number of stocks on the same\nday. If the market happens to move in their predicted\ndirection, they will appear highly accurate and be\nincorrectly identified as experts in previously men-\ntioned expert identification methods. Such broad,\nunfocused posting behavior contradicts the nature\nof genuine expertise, which requires concentrated\nattention and thorough analysis. To ensure we iden-\ntify truly focused experts, we require that a user\u2019s\npast 20 posts must span at least five different trading\n6\ndays 1. This helps filter out users who spread their\npredictions too thin across many stocks without\ndemonstrating sustained, focused analysis.\nFurthermore, previous models focused solely on identifying\nexperts who consistently made correct predictions, ignoring\nthe possibility of identifying inverse experts. We have added\ninverse experts to our approach, where we identify these\ninverse experts and take the opposite action based on their\nopinions.\nBy utilizing this approach, we developed a dynamic expert\ntracing system. For each trading day, if there is an expert\nwhose past predictions meet the aforementioned criteria, we\ntreat that expert\u2019s prediction for the day as a representative\ntrend signal, serving as the pseudo ground truth for whether\nthe stock will rise or fall.\nThe procedure is summarized in Algorithm\n1. For each\ntrading day d, we first filter the posts to retain only the latest\npost per user-stock pair. For users who posted on day d, we\nevaluate their prediction accuracy over two time horizons.\nLet Ti,recent and Fi,recent represent the number of correct\nand incorrect predictions in their most recent N posts, while\nTi,long and Fi,long denote these numbers over the past T\nyears. To ensure both recent performance and consistent long-\nterm accuracy, we require experts to meet two criteria: their\nmost recent N (N=20) posts must span at least K (K=5)\ndifferent days and achieve an accuracy above P2 (80%), while\nmaintaining a long-term accuracy above P1 (65%) over the\npast T (T=2) years. Similarly, inverse experts are identified\nwhen their recent and long-term accuracies fall below 1-P2\n(20%) and 1-P1 (35%) respectively.\nThe expert tracing system achieved an accuracy rate as\nhigh as 77.26%. Furthermore, the inference stage has a time\ncomplexity of O(len(Cd)) where len(Cd) is the number of\nposts on the given day d. 
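The decision rule at the core of the tracing system (spelled out in Algorithm 1 below) combines the stated parameters N=20, K=5, P1=65%, P2=80%, and T=2 years. A simplified sketch is given here, assuming each user's posts have already been deduplicated to the latest post per user-stock pair per day and checked against the next trading day's price move; it is an illustration, not the authors' code.

```python
from dataclasses import dataclass

# Stated thresholds: the most recent N=20 posts must span >= K=5 trading days,
# with recent accuracy >= P2=0.80 and long-term (T=2 years) accuracy >= P1=0.65
# for experts; <= 1-P2 and <= 1-P1, respectively, for inverse experts.
N, K, P1, P2 = 20, 5, 0.65, 0.80

@dataclass
class UserRecord:
    recent_correct: int       # correct calls among the most recent N posts
    recent_total: int         # number of posts considered (should reach N)
    recent_days_spanned: int  # distinct trading days covered by those N posts
    long_correct: int         # correct calls over the past T years
    long_total: int           # total checked calls over the past T years

def classify_user(u: UserRecord) -> str:
    """Return 'expert', 'inverse_expert', or 'neither' for one user."""
    if u.recent_total < N or u.recent_days_spanned < K or u.long_total == 0:
        return "neither"
    acc_recent = u.recent_correct / u.recent_total
    acc_long = u.long_correct / u.long_total
    if acc_recent >= P2 and acc_long >= P1:
        return "expert"
    if acc_recent <= 1 - P2 and acc_long <= 1 - P1:
        return "inverse_expert"
    return "neither"
```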
Since our Stocktwits database\ncontains the historical posting data and accuracy for each user,\nthe inference phase simply involves traversing all posts once\nto update the relevant information for each user. Therefore, the\ntime taken to complete this operation scales linearly with the\ntotal number of posts. The low complexity, combined with the\nhigh accuracy, demonstrates the effectiveness and efficiency of\nour method in enhancing stock prediction models by filtering\nout noise and focusing on both expert and inverse expert\ninsights.\nV. EXPERT OPINION PROPAGATION\nDespite integrating both expert and inverse expert signals,\nlike other expert identification approaches [19], [20], [27], our\nmodel still suffers from signal sparsity, producing meaningful\npredictions for only a few stocks on certain trading days.\nAlthough these predictions serve as a pseudo ground truth for\nstock trends, they account for only about 4% of our dataset,\nleaving most data points unlabeled. To address this limitation,\nwe propose a graph-based method that transforms these limited\nexpert predictions into practical signals and propagates them\nacross related stocks, thereby extending coverage to inform all\ndata points in the dataset.\n1we tested various parameters and found that five different days yielded\nthe best performance considering both the coverage and accuracy\nAlgorithm 1 Expert Tracing System for Date d\nRequire: current date d, minimum post requirement N, min-\nimum days span K, tweet set C with Cd \u2282C denoting\ntweets posted on day d, long-term accuracy threshold P1,\nshort-term accuracy threshold P2, long-term range T\nEnsure: expert and inverse expert extraction for date d\n1: Cd \u2190filter daily latest(Cd) {Keep only the latest post\nper user-stock pair on day d}\n2: Id \u2190all users of Cd\n3: for each i \u2208Id do\n4:\nCi,recent \u2190get recent N posts(C, i, N) {Get most\nrecent N posts for user i}\n5:\ndaysi,recent \u2190count unique days(Ci,recent)\n6:\nif daysi,recent < K then\n7:\ncontinue\n8:\nend if\n9:\nTi,recent \u21900; Fi,recent \u21900 {Recent N posts accuracy\ncounters}\n10:\nfor each c \u2208Ci,recent do\n11:\nCheck stock price change \u03c1 on the next trading day\n12:\nif (c \u2208bullish \u2227\u03c1 \u2208rise) \u2228(c \u2208bearish \u2227\u03c1 \u2208fall)\nthen\n13:\nTi,recent \u2190Ti,recent + 1\n14:\nelse\n15:\nFi,recent \u2190Fi,recent + 1\n16:\nend if\n17:\nend for\n18:\nAi,recent \u2190Ti,recent/(Ti,recent + Fi,recent)\n19:\nCi,long \u2190get posts(Cd\u2212T :d, i) {Get all posts in past\nT years}\n20:\nTi,long \u21900; Fi,long \u21900 {Long-term accuracy coun-\nters}\n21:\nfor each c \u2208Ci,long do\n22:\nCheck stock price change \u03c1 on the next trading day\n23:\nif (c \u2208bullish \u2227\u03c1 \u2208rise) \u2228(c \u2208bearish \u2227\u03c1 \u2208fall)\nthen\n24:\nTi,long \u2190Ti,long + 1\n25:\nelse\n26:\nFi,long \u2190Fi,long + 1\n27:\nend if\n28:\nend for\n29:\nAi,long \u2190Ti,long/(Ti,long + Fi,long)\n30:\nif Ai,recent \u2265P2 and Ai,long \u2265P1 then\n31:\nMark i as expert and take long (short) positions for\nbullish (bearish) sentiment\n32:\nelse if Ai,recent \u2264(1 \u2212P2) and Ai,long \u2264(1 \u2212P1)\nthen\n33:\nMark i as inverse expert and take short (long) posi-\ntions for bullish (bearish) sentiment\n34:\nend if\n35: end for\nA. 
Trend Signals Transformation

Though predicting whether a stock will rise or fall is important, the ultimate goal is not simply trend forecasting but maximizing excess returns through investment strategies. Feng et al. [2] have pointed out that traditional stock prediction methods that focus either on classification (predicting the direction) or regression (predicting the price) are suboptimal, because they fail to align with the goal of selecting stocks with the highest potential returns, which is the real objective for investors. Given that a more practical approach to stock prediction involves optimizing return ratios rather than focusing solely on trend direction, it becomes necessary to transform the pseudo ground truth from binary trend indicators (up or down) into continuous return ratios that can be treated as expert signals.

The transformation is as follows: for stocks predicted by experts to rise, we compute the average return ratio over the days within the past 30 days on which the stock showed an upward trend. Similarly, for a predicted downward trend, we use the average of the down days' return ratios within the same period. This 30-day window, which acts as a monthly indicator, is a common choice in financial modelling for stock prediction [57]–[59], as it helps to smooth out daily volatility while providing a sufficient sample size to capture recent market trends. This calculated average thus serves as a 'prior' for return ratios, which effectively functions as the expert signal. If a stock is not predicted by experts on any given day, then the 'prior' for that stock is set to 0.

B. Temporal Pre-training Model

Now that we have obtained the sparse prior for the return ratios, the challenge remains that for over 95% of the stock-day pairs we still lack any prior information, as they have no expert signal. To address this, we propose an initial step to train a temporal model, which will serve as a baseline for generating return ratio estimates. This step acts similarly to a pre-training process, where the temporal model captures patterns in stock movements without direct reliance on expert signals.

The model is trained using historical stock price data (open, high, low, close prices, and trading volume) and fundamental data. Unlike many studies that follow Feng et al.'s [2], [8], [53] method of normalizing the entire dataset across the whole time dimension (including test sets), which may lead to test set leakage, we normalize each feature daily across the stock cross-section.

1) MS-LSTM: Inspired by MS-RNN [60], we propose the pre-training model, termed MS-LSTM (Multi-Scale Long Short-Term Memory), which uses a multi-scale LSTM architecture. Each scale is handled by an independent LSTM network, allowing the model to capture temporal dependencies at varying resolutions. The trained model generates a foundational return ratio estimate for each stock daily, which acts as a baseline for stocks that lack expert signals.

Given an input sequence $X \in \mathbb{R}^{N \times L \times d}$, where $N$ is the number of stocks, $L$ is the sequence length, and $d$ is the feature dimension, MS-LSTM processes the data through the following steps:

1. For each scale $s_i$ in $S = [s_1, s_2, \ldots, s_n]$, the model extracts a subsequence by sampling the input at intervals of $s_i$:
$$\mathrm{Extract}(X, s_i) = [X_{t=0}, X_{t=s_i}, X_{t=2s_i}, \ldots, X_{t=L-s_i}] \in \mathbb{R}^{N \times L/s_i \times d} \quad (2)$$

2.
Each extracted subsequence is processed by its own\nLSTM network:\nH(i) = LSTMi(Extract(X, si))\n(3)\nwhere H(i) \u2208RN\u00d7L/si\u00d7h represents the sequence of hidden\nstates for scale si, and h is the hidden dimension.\n3. The final representation is obtained by first extracting the\nlast hidden state from each scale (H(i)\nt=L/si), averaging these\nstates across all scales, applying layer normalization, and then\ntransforming through an MLP to predict return ratios:\nhmid = LayerNorm\n \n1\nn\nn\nX\ni=1\nH(i)\nt=L/si\n!\n\u2208RN\u00d7h\n(4)\n\u02c6r = MLP(hmid) \u2208RN\u00d71\n(5)\nwhere \u02c6r represents the predicted return ratios for all stocks.\nThis multi-scale architecture allows the model to capture\nboth short-term and long-term dependencies in the input\nsequence and produces return ratio estimates for each stock.\n2) Loss Function: During the pre-training process, we use\nthe Information Coefficient (IC) loss function to train the\nnetwork. As our task focuses on cross-sectional stock selection\nrather than timing individual stock movements, IC is particu-\nlarly suitable because it measures the model\u2019s ability to rank\nstocks\u2019 relative performance, which is crucial for portfolio\nconstruction to outperform benchmark indices. Specifically, it\ncalculates the correlation between the model\u2019s predicted output\nand the observed return ratio across the training set. The IC\nloss is computed as:\nIC =\nPN\ni=1(\u02c6ri \u2212\u02c6r)(ri \u2212r)\nqPN\ni=1(\u02c6ri \u2212\u02c6r)2 \u00b7\nqPN\ni=1(ri \u2212r)2\n,\n(6)\nwhere \u02c6ri and ri are the predicted and actual return ratios for\nthe i-th sample, and \u02c6r and r are the mean predicted and actual\nreturn ratios, respectively. The IC loss function LIC is defined\nas:\nLIC = \u2212IC.\n(7)\nBy maximizing the IC during training, the model is optimized\nto capture the correlation between its predictions and the actual\nobserved returns.\nC. Dual Graph Attention Network\nFor each stock on each day, the generated return ratio\nwhich from the pre-trained model offers a relative measure\nof expected returns is combined with two key features:\n1. Pseudo label availability: A binary feature indicating\nwhether an expert prediction exists for that day (1 if an expert\nprediction exists, 0 otherwise).\n2. Expert signal: When an expert signal is available, the\npreviously calculated return ratio prior is used. If no expert\nprediction is made for a stock on a given day, its prior is set\nto 0.\n8\nCorrelation graph\nGAT layer\nTech.\nIndustry graph\n1-hop\nGAT layer\n1-hop\nDual Graph Attention\n\u00d7\n1\n3\n2\n4\n5\n6\n1\n2\n4\n3\n6\n5\n1\n3\n2\n4\n5\n6\n1\n3\n2\n4\n5\n6\n\u2026\n\u2026\n\u2026\nGAT layer\n1\n3\n2\n4\n5\n6\n1\n3\n2\n4\n5\n6\n2-hop\nDual Graph Attention\n2-hop\nGAT layer\nMLP\nReturn\n\u2026\n\u2026\n\u2026\n\u2026\n\u2026\n\u2026\nFig. 3. The workflow of DualGAT. On the far left, it illustrates the construction process of both the correlation and industry graphs, where the orange nodes\nrepresent stocks with initial expert signals. As seen in the figure, DualGAT facilitates the effective propagation of expert signals across all stocks.\nThese three features are combined to form a comprehensive\nfeature set for each stock-day pair. The pseudo label availabil-\nity helps the model understand when to rely more heavily\non the pre-trained output, while the return ratio prior, when\navailable, acts as a corrective signal to refine the baseline\nprediction. 
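The IC loss of Eqs. (6)-(7) is simply the negative Pearson correlation between predicted and realized return ratios over the cross-section. A minimal PyTorch version is sketched below; tensor names are illustrative.

```python
import torch

def ic_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Negative Information Coefficient (Pearson correlation) between predicted
    and realized return ratios across the cross-section of stocks."""
    pred_c = pred - pred.mean()
    target_c = target - target.mean()
    ic = (pred_c * target_c).sum() / (pred_c.norm() * target_c.norm() + eps)  # eps avoids 0/0
    return -ic  # maximizing IC is equivalent to minimizing its negative
```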
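To make the trend-signal transformation of Section V-A and the three-feature input described above concrete, a possible sketch is given below. It assumes a per-stock array of past daily return ratios and an expert trend label of 'rise', 'fall', or None; it is illustrative rather than the authors' implementation.

```python
from typing import Optional

import numpy as np

def expert_return_prior(past_returns: np.ndarray, expert_trend: Optional[str]) -> float:
    """Turn a binary expert trend into a continuous return-ratio prior.

    past_returns: daily return ratios of the stock over (at least) the last 30 trading days.
    expert_trend: 'rise', 'fall', or None when no expert prediction exists for the day.
    """
    if expert_trend is None:
        return 0.0  # no expert prediction -> prior set to 0
    window = past_returns[-30:]
    if expert_trend == "rise":
        up_days = window[window > 0]
        return float(up_days.mean()) if up_days.size else 0.0
    down_days = window[window < 0]
    return float(down_days.mean()) if down_days.size else 0.0

def dualgat_input(baseline_estimate: float, expert_trend: Optional[str],
                  past_returns: np.ndarray) -> np.ndarray:
    """Per stock-day input triple: [pre-trained return estimate, pseudo-label flag, expert prior]."""
    prior = expert_return_prior(past_returns, expert_trend)
    has_expert = 1.0 if expert_trend is not None else 0.0
    return np.array([baseline_estimate, has_expert, prior], dtype=np.float32)
```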
Utilizing this feature set, we propose using the graph attention network to propagate the sparse expert-derived signals and refine the initial estimates from the temporal pre-training model.

For the construction of the graph, we create two types of graphs to represent stock relationships: an industry graph and a dynamically updated correlation matrix graph. For the industry graph, each node represents a stock, and edges are established between nodes if the corresponding stocks belong to the same Global Industry Classification Standard (GICS) sector. The correlation matrix graph is more dynamic, reflecting changes in the stock relationships based on the previous 30 days of price movements. The process for constructing the dynamic correlation matrix graph is as follows:

1. Data Preparation: Collect the past 30 days of stock price data.

2. Correlation Calculation: We calculate the correlation coefficients for pairs of stocks and compute the correlation matrix based on the past 30-day window.

3. Graph Construction: Identify positively correlated stock pairs. If the correlation coefficient between two stocks is greater than the set threshold (θ1 = 0.77 as the general threshold and θ2 = 0.67 as the threshold for stocks with expert-derived pseudo labels), an edge is created between this stock pair in the graph. Notably, for stocks associated with expert-derived pseudo labels, we apply the lower correlation threshold, allowing these labeled stocks to influence their connected peers more strongly (a code sketch of this construction is given after the component list below). The leftmost diagram in Figure 3 illustrates the construction process of both the correlation and industry graphs. In the correlation graph example, an edge exists between nodes 2 and 5 because their correlation coefficient (0.68) exceeds θ2, with node 2 carrying an expert-derived pseudo label (i.e., an expert signal). In contrast, nodes 3 and 5, despite having a higher correlation coefficient than nodes 2 and 5, are not connected because their coefficient fails to reach θ1 and neither node carries an expert signal.

The graph is updated daily to reflect the latest market dynamics. This adaptive approach ensures that our model can respond to changes in stock relationships and leverage new information as it becomes available. By incorporating both industry and correlation relationships, we significantly expand the coverage of expert signals beyond directly connected stocks, as signals can propagate through multiple paths in these complementary graphs. To effectively integrate and leverage these complex relationships, we implement a dual-graph attention-based graph neural network. Our GNN architecture, termed DualGAT, processes signals from both the industry and correlation graphs concurrently, enabling it to adaptively learn which graph provides more relevant information for every given stock at each step. The whole framework of DualGAT can be seen in Figure 3. The model processes information from the two graphs through a two-hop mechanism with graph attention, followed by attentive feature fusion at each hop. The framework consists of the following key components:

1) Graph Attention Layers: For each graph (industry and correlation), we employ Graph Attention Networks (GAT) to learn node representations.
At each layer, a node $v$ aggregates information using attention mechanisms from its neighborhood $N(v)$, where $N(v)$ includes both the adjacent nodes and node $v$ itself (i.e., $v \in N(v)$). The attention coefficients $\alpha_{vu}$ between node $v$ and each node $u \in N(v)$ are computed as:
$$\alpha_{vu} = \frac{\exp\left(\mathrm{LeakyReLU}\left(a^{\top}[W h_v \,\|\, W h_u]\right)\right)}{\sum_{k \in N(v)} \exp\left(\mathrm{LeakyReLU}\left(a^{\top}[W h_v \,\|\, W h_k]\right)\right)} \quad (8)$$
where $h_v \in \mathbb{R}^{d_{in}}$ is the input feature vector for node $v$, $W \in \mathbb{R}^{d_{hidden} \times d_{in}}$ is a learnable weight matrix, and $a \in \mathbb{R}^{2 d_{hidden}}$ is the attention vector. The node features are then transformed:
$$h'_v = \sigma\left(\sum_{u \in N(v)} \alpha_{vu} W h_u\right) \quad (9)$$
where $h'_v \in \mathbb{R}^{d_{hidden}}$ is the output feature vector and $\sigma$ is a non-linear activation function (ReLU).

2) Dual-Graph Attentive Feature Fusion: After each graph attention layer, we perform attentive fusion of the features from both graphs. Let $H_{ind} \in \mathbb{R}^{N \times d_{hidden}}$ and $H_{cor} \in \mathbb{R}^{N \times d_{hidden}}$ be the node features from the industry and correlation graphs respectively, where $N$ represents the total number of stocks (i.e., the number of nodes in each graph). The fusion process involves:
$$\alpha_{ind} = \sum_{j=1}^{d_{hidden}} q_{ind,j} H_{ind,j} \in \mathbb{R}^{N}, \qquad \alpha_{cor} = \sum_{j=1}^{d_{hidden}} q_{cor,j} H_{cor,j} \in \mathbb{R}^{N} \quad (10)$$
where $q_{ind}, q_{cor} \in \mathbb{R}^{d_{hidden}}$ are learnable attention vectors. The attention weights are normalized using softmax:
$$[\beta_{ind}, \beta_{cor}] = \mathrm{softmax}([\alpha_{ind}, \alpha_{cor}]) \in \mathbb{R}^{2 \times N} \quad (11)$$
The fused features are then computed as:
$$H_{fused} = (\beta_{ind}^{\top} \otimes \mathbf{1}_{d_{hidden}}) \odot H_{ind} + (\beta_{cor}^{\top} \otimes \mathbf{1}_{d_{hidden}}) \odot H_{cor} \in \mathbb{R}^{N \times d_{hidden}} \quad (12)$$
where $\beta_{ind}^{\top}, \beta_{cor}^{\top} \in \mathbb{R}^{N \times 1}$ are the transposed attention weights, $\otimes$ denotes the Kronecker product with $\mathbf{1}_{d_{hidden}} \in \mathbb{R}^{d_{hidden}}$ (broadcasting to $\mathbb{R}^{N \times d_{hidden}}$), $\odot$ represents element-wise multiplication, and $H_{fused}$ represents the fused node features that will be used as input for the next graph attention layer or the final prediction.

3) Two-Hop Architecture: The model employs two consecutive hops of graph attention and feature fusion. The first hop transforms the input features (of dimension $d_{in}$ for each node) into hidden representations (of dimension $d_{hidden}$ for each node), followed by feature fusion. The second hop processes these fused features to produce final representations (of dimension $d_{out}$ for each node), which undergo another round of fusion to obtain $H^{final}_{fused} \in \mathbb{R}^{N \times d_{out}}$. Finally, a simple MLP layer transforms the fused features into scalar predictions:
$$y_v = W_{mlp} h^{final}_{fused,v} + b \quad (13)$$
where $h^{final}_{fused,v} \in \mathbb{R}^{d_{out}}$ represents the feature vector of node $v$ in $H^{final}_{fused}$, and $W_{mlp} \in \mathbb{R}^{1 \times d_{out}}$ and $b \in \mathbb{R}$ are the parameters of the final MLP layer.

Our dual-graph attention mechanism enables effective information propagation through both industry relationships and correlation-based connections. The model adaptively learns the relative importance of each graph structure for different nodes during the message-passing process, leading to optimal feature fusion at each hop. Leveraging this adaptive fusion mechanism, the combination of dual-graph attention and expert-derived priors allows the model to refine and correct initial predictions.
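A compact sketch of the daily correlation-graph construction with the stated thresholds θ1 = 0.77 and θ2 = 0.67 is shown below (the lower threshold applies whenever at least one stock in the pair carries an expert-derived pseudo label, as in the Figure 3 example). Array names and minor details are illustrative assumptions; the industry graph would analogously connect stocks sharing a GICS sector.

```python
import numpy as np

THETA_GENERAL = 0.77  # theta_1: threshold when neither stock has an expert label
THETA_EXPERT = 0.67   # theta_2: threshold when at least one stock carries an expert label

def correlation_graph(returns_30d: np.ndarray, has_expert: np.ndarray) -> np.ndarray:
    """Build a daily adjacency matrix from the past 30 days of returns.

    returns_30d: array of shape (num_stocks, 30) with daily return ratios.
    has_expert:  boolean array of shape (num_stocks,) marking expert-labeled stocks.
    """
    corr = np.corrcoef(returns_30d)            # pairwise Pearson correlations
    n = corr.shape[0]
    adj = np.zeros((n, n), dtype=np.float32)
    for i in range(n):
        for j in range(i + 1, n):
            threshold = THETA_EXPERT if (has_expert[i] or has_expert[j]) else THETA_GENERAL
            if corr[i, j] > threshold:         # only positively correlated pairs get an edge
                adj[i, j] = adj[j, i] = 1.0
    return adj
```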
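To illustrate the attentive fusion of Eqs. (10)-(12), a minimal PyTorch module is sketched below. It scores the per-node features from the industry and correlation branches with learnable query vectors, normalizes the two scores with a softmax, and mixes the branches accordingly; it is a simplified rendering for clarity, not the released DualGAT code.

```python
import torch
import torch.nn as nn

class DualGraphFusion(nn.Module):
    """Attentive fusion of node features from the industry and correlation graphs,
    following Eqs. (10)-(12): per-branch scores from learnable query vectors,
    softmax normalization per node, and a weighted mix of the two branches."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.q_ind = nn.Parameter(torch.randn(hidden_dim))
        self.q_cor = nn.Parameter(torch.randn(hidden_dim))

    def forward(self, h_ind: torch.Tensor, h_cor: torch.Tensor) -> torch.Tensor:
        # h_ind, h_cor: (N, hidden_dim) node features from the two GAT branches
        score_ind = h_ind @ self.q_ind                                   # (N,), Eq. (10)
        score_cor = h_cor @ self.q_cor                                   # (N,)
        beta = torch.softmax(torch.stack([score_ind, score_cor]), dim=0)  # (2, N), Eq. (11)
        # Broadcast the per-node weights over the feature dimension, Eq. (12)
        return beta[0].unsqueeze(1) * h_ind + beta[1].unsqueeze(1) * h_cor
```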
By incorporating these sparse yet\nvaluable expert signals, the synergy between the temporal\nmodel\u2019s output and the expert signals enhances the accuracy\nand robustness of information propagation across all stocks,\nultimately improving predictive performance.\nVI. EXPERIMENTS\nA. Experimental Setup\n1) Dataset: In our study, the expert signals come from\nStockTwits, a social media platform designed specifically\nfor investors and traders to share insights. Users can post\nmessages related to individual stocks, each of which has its\nown discussion thread for comments. These messages are often\ntagged with sentiments like \u201cbullish\u201d or \u201cbearish,\u201d reflecting\nthe user\u2019s outlook on the stock. StockTwits merges social\nmedia engagement with stock market discussions, making it a\nvaluable tool for capturing real-time investor sentiment.\nWe accessed this data through the StockTwits API2, specif-\nically using the endpoint to fetch publicly available data\nrelated to specific stocks. During collection, we implemented\nan automatic pagination mechanism in our data retrieval script.\nThis allowed us to capture both the most recent and complete\nhistorical data from the forum.\nTo optimize the efficiency of our data retrieval process, we\nemployed a distributed system approach using proxy services,\nenabling multi-threaded, multi-node data collection. This setup\nsignificantly accelerated our data gathering efforts; a single\nnode can collect one day\u2019s data from the forum in about 30\nminutes. By employing 30 nodes simultaneously, we signif-\nicantly accelerated the data collection process, retrieving all\ndaily posts within a minute. This substantial increase in speed\nensured that we could quickly gather the latest data, enabling\ntimely mining of expert signals to facilitate subsequent model\ntraining.\nThe collected data contains a vast amount of valuable\ninformation, with each tweet corresponding to a user\u2019s opinion\nabout a specific stock. For each tweet, the data includes\nthe content, the stock it refers to (including its associated\nindustry, sector, symbol, and the exchange where it is listed),\nthe creation time of the tweet, the user ID, the number of\nlikes it received, the sentiment expressed by the user (either\nbullish or bearish), and the retweet information. Our data\ncollection spanned from 2010 to the end of 2023, and we\nhave established a database to store all the collected data. This\nallows users to access various StockTwits data directly from\nthe database without the need to write complex scripts for data\nextraction. Stock price data was sourced from Polygon.io3, and\nfundamental data from EODHD4.\n2https://api.stocktwits.com/api/2/streams/symbol/stock.json\n3https://polygon.io\n4https://eodhd.com\n10\nTABLE I\nPERFORMANCE OF DIFFERENT PROPAGATION METHODS AGAINST THE DUALGAT MODEL ACROSS VARIOUS METRICS ON DIFFERENT DATASETS. 
ACC,\nIC, RIC, ICIR, AR ARE MEASURED IN PERCENTAGE (%).\nNASDAQ 100\nSP 500\nUpdated StockNet\nMethod\nACC\nIC\nRIC\nICIR\nAR\nSR\nACC\nIC\nRIC\nICIR\nAR\nSR\nACC\nIC\nRIC\nICIR\nAR\nSR\nCorrelation matrix propagation\n55.71\n5.33\n6.79\n19.53\n21.11\n1.16\n52.98\n6.86\n7.06\n34.47\n19.81\n1.72\n52.32\n6.08\n5.31\n24.61\n19.98\n1.81\nIndustry matrix propagation\n54.40\n4.76\n4.64\n19.02\n16.93\n0.75\n52.08\n4.86\n5.62\n31.83\n15.60\n1.21\n51.71\n5.39\n3.05\n28.61\n10.04\n0.88\nCorrelation+Industry matrix propagation\n56.62\n6.25\n6.42\n26.62\n21.18\n1.45\n52.91\n7.13\n6.30\n41.99\n21.78\n2.09\n52.82\n6.12\n3.32\n24.15\n17.81\n1.16\nOur DualGAT model\n57.83\n7.98\n9.69\n36.66\n32.91\n2.69\n55.02\n8.26\n8.29\n43.17\n29.60\n2.92\n54.81\n7.62\n7.78\n30.69\n23.46\n1.87\nFor the pre-training of our temporal model, we utilized the\ncollected stock prices and fundamental data spanning from\n2010 to 2018. To help enrich our expert signal collection\nand avoid isolated nodes in the industry graph, we augmented\nour dataset with 10 highly discussed stocks beyond the index\nconstituents. Our DualGAT model was trained and validated\nusing data from 2019 to 2021 and conducted extensive testing\nof our DualGAT model on three distinct datasets: the Nasdaq\n100 index, S&P 500 index, and the updated StockNet dataset\nby [61], using data from 2022 to 2023 as the test sets.\n2) Baseline: In this paper, we employ three different base-\nlines to evaluate the effectiveness of our model:\n1. Expert signal baseline: As illustrated in Figure 1, the\nexpert selection methods proposed by Wang et al. [20] in\nthe MFN model, the smart user classification model proposed\nby Coyne et al. [19], the user centric method proposed by\nBouadjenek et al. [27] and the Wining model proposed by\nLiao et al. [28] were selected as manually designed baseline\nmodels to assess the accuracy of the expert signals.\n2. Simple message propagation baseline: To demonstrate the\neffectiveness of the message-passing capability in our designed\nDualGAT, we employ the simplest form of message passing\nwithout graph neural network architectures, denoted as Wx,\nwhere W is the normalized adjacency matrix derived from\neither our industry or correlation matrix, and x represents the\nstock\u2019s return ratio prior mentioned before. Similar to Dual-\nGAT, this message-passing method also performs a two-hop\npropagation. This process helps to diffuse the return ratio prior\nfrom nodes (stocks) with expert signals (expert-derived pseudo\nlabels) to adjacent nodes, thereby leveraging the structural\nconnections of the graph to enhance the predictive ability. This\nbaseline is crucial for understanding the ability of the expert\nsignal independently as a trading signal. At the same time,\nit also serves as the baseline for our DualGAT model that\ncombines traditional financial features.\n3. Temporal or spatial-temporal model comparison base-\nline: To demonstrate the effectiveness of our expert-driven ap-\nproach combined with DualGAT, this baseline employs other\nmodels known for temporal and spatial-temporal predictions\nin time series forecasting or stock prediction. 
We use these\nmodels to train and compare performance on the same training,\nvalidation and test sets.\n3) Metrics: For the evaluation of our model\u2019s performance,\nwe select six key metrics: ACC (Accuracy), IC (Information\nCoefficient), RIC (Rank information coefficient), ICIR (In-\nformation Coefficient of Information Ratio), AR (Annualized\nReturn), and SR (Sharpe Ratio).\n\u2022 ACC (Accuracy): This metric reflects the percentage\nof correct predictions our model makes regarding the\nupward or downward trend of stock prices.\n\u2022 IC (Information Coefficient) and ICIR: IC measures\nthe correlation between the predicted values and the\nactual values. It quantifies the ability of the model to\nmake correct relative predictions rather than absolute\nstock price predictions. ICIR, which adjusts the IC to\naccount for signal volatility, provides a refined measure\nof predictive performance over time.\n\u2022 RIC (Rank Information Coefficient): RIC measures the\ncorrelation between the predicted rankings of stocks and\ntheir actual rankings. It quantifies the model\u2019s ability to\ncorrectly rank stocks relative to each other.\n\u2022 AR (Annualized Return): For returns, our strategy\ninvolves taking long positions in the top 10% of stocks\npredicted to have the highest return ratio, and short\npositions in the 10% predicted to decrease the most.\nThis method is designed to capitalize on the extremes\nof the market\u2019s movements, thereby maximizing potential\nreturns from market volatility.\n\u2022 SR (Sharpe Ratio): The Sharpe Ratio is calculated by\ndividing the average return earned in excess of the risk-\nfree rate by the standard deviation of return. It provides\nan adjusted measure of return that accounts for the risk\ntaken, offering insights into the risk-adjusted performance\nof our investment strategy.\nThese metrics collectively provide a comprehensive frame-\nwork for assessing the effectiveness and reliability of our\npredictive model in a dynamic financial environment.\nB. Main Results\nIn this study, we have systematically evaluated our expert\ntracing system along with the DualGAT model against vari-\nous models and baselines to determine its effectiveness and\nefficiency in stock price prediction.\nFor the mining of expert signals, Figure 1 illustrates the\nperformance of several expert mining methods across three\nprediction horizons: T+1, T+3, and T+7. The model labeled\n\u201cOurs\u201d significantly surpasses other models in accuracy at\neach time point. At T+1, \u201cOurs\u201d shows an accuracy of 77.26%,\nwhich is 18% and 21% higher than the second and third place\nmodels (User Centric [27] and SUC [19]). At T+3 and T+7,\nthe accuracy of \u201cOurs\u201d remains better than the other models.\nThese results demonstrate robust short-term predictive strength\nof our expert tracing system, Although there is a gradual\ndecrease in accuracy as the prediction horizon extends, our\nmodel still maintains a significant lead above the baseline\n(Random Guess) across all evaluated horizons.\nTo demonstrate the effectiveness of the message-passing\ncapability in our designed model, we compare our DualGAT\n11\nTABLE II\nPERFORMANCE COMPARISON OF VARIOUS TIME SERIES FORCASTING AND STOCK PREDICTION MODELS ON THE TEST SET. ACC, IC, RIC, ICIR, AR\nARE MEASURED IN PERCENTAGE (%). 
/ REPRESENTS OUT OF CUDA MEMORY.\nNASDAQ 100\nSP 500\nUpdated StockNet\nModel\nACC\nIC\nRIC\nICIR\nAR\nSR\nACC\nIC\nRIC\nICIR\nAR\nSR\nACC\nIC\nRIC\nICIR\nAR\nSR\nLSTM\n49.22\n0.84\n0.65\n4.52\n5.95\n0.17\n49.28\n3.72\n1.65\n16.73\n19.35\n1.62\n49.00\n0.57\n1.16\n2.37\n2.39\n-0.13\nMixer [62]\n49.81\n1.00\n1.27\n5.88\n7.38\n0.38\n50.72\n3.70\n2.02\n19.64\n19.76\n1.79\n51.06\n1.44\n1.64\n5.14\n5.86\n0.15\nPatchTST [63]\n49.58\n1.90\n1.04\n7.22\n12.38\n1.24\n50.36\n1.76\n1.89\n15.19\n9.49\n0.91\n49.85\n2.49\n2.51\n11.35\n17.36\n1.47\nCrossformer [64]\n49.34\n0.76\n0.56\n3.14\n9.36\n0.43\n49.28\n2.07\n0.38\n10.04\n7.11\n0.40\n50.95\n0.01\n-0.05\n0.04\n-2.11\n-0.52\nTimesNet [65]\n49.74\n0.92\n0.91\n5.13\n9.72\n0.70\n50.73\n2.06\n1.62\n11.53\n10.42\n0.87\n49.03\n0.55\n0.70\n2.19\n3.38\n-0.06\nMS-LSTM\n49.40\n0.89\n0.88\n4.81\n6.94\n0.54\n49.29\n2.84\n2.96\n18.45\n17.39\n1.29\n49.06\n0.85\n1.46\n3.52\n9.57\n0.74\nTimeMixer [66]\n50.32\n0.71\n0.91\n4.09\n7.68\n0.30\n50.37\n1.40\n1.28\n10.32\n10.78\n0.78\n50.33\n0.76\n1.14\n4.25\n-0.09\n-0.41\nLSTM-RGCN [67]\n49.46\n1.10\n1.07\n6.30\n17.73\n1.20\n49.30\n1.59\n0.95\n15.36\n14.56\n1.54\n49.02\n0.90\n1.42\n4.93\n5.24\n0.11\nSFM [68]\n49.24\n2.81\n2.51\n18.37\n2.41\n2.29\n49.30\n3.37\n3.47\n19.85\n21.17\n2.10\n49.01\n1.40\n1.51\n7.38\n6.71\n0.95\nHATR-I [53]\n49.38\n0.41\n0.68\n1.90\n4.43\n0.01\n/\n/\n/\n/\n/\n/\n49.08\n3.89\n2.67\n16.81\n21.34\n1.64\nTHGNN [69]\n50.75\n0.76\n0.63\n3.73\n9.78\n0.36\n50.79\n4.20\n4.25\n15.36\n21.83\n1.81\n49.59\n0.67\n0.72\n2.76\n4.97\n0.08\nSAMBA [51]\n50.68\n1.04\n0.84\n5.59\n13.13\n0.73\n50.58\n2.55\n1.50\n13.84\n20.09\n1.66\n50.30\n0.70\n0.59\n2.55\n11.43\n0.50\nOurs w/o expert signal\n57.66\n7.83\n9.49\n35.60\n32.55\n2.64\n54.94\n7.15\n7.80\n36.68\n26.93\n2.61\n53.82\n6.29\n6.71\n26.04\n20.41\n1.26\nOurs w/ expert signal\n57.83\n7.98\n9.69\n36.66\n32.91\n2.69\n55.02\n8.26\n8.29\n43.17\n29.56\n2.92\n54.81\n7.62\n7.78\n30.69\n28.46\n1.87\nwith the simple message passing method denoted as Wx\nmentioned above. For the propagation matrix W,we utilized\nthe correlation matrix (Wcor), the industry matrix (Wind),\nand a combination of both. The result in Table I shows that\nDualGAT outperforms all messgae passing methods on all\nmetrics, indicating DualGAT\u2019s architectural advancements.\nMoreover, we demonstrate that DualGAT effectively ad-\ndresses the sparsity of expert signals. Since the graph attention\nlayer in our neural network corresponds to one hop of propaga-\ntion, its signal propagation capability is comparable to simple\nmessage passing methods (similar to BFS, where edges in\nthe two graphs can spread expert signals). Through empirical\nanalysis using simple message passing methods, we show\nthat a two-hop propagation setting achieves sufficient node\ncoverage, reaching 89.1% of all nodes. DualGAT enhances\nthis propagation process by incorporating learnable attention\ncoefficients, which dynamically adjust the importance of dif-\nferent nodes. 
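For reference, the simple Wx baseline amounts to two multiplications with a normalized adjacency matrix. A minimal sketch is given below; the self-loops and row normalization are illustrative assumptions of this sketch rather than details confirmed by the paper.

```python
import numpy as np

def two_hop_propagation(adj: np.ndarray, prior: np.ndarray) -> np.ndarray:
    """Two-hop diffusion of the sparse expert-derived return-ratio priors.

    adj:   (N, N) adjacency matrix from the industry or correlation graph
           (self-loops are added here so each stock retains its own signal).
    prior: (N,) return-ratio priors, non-zero only for expert-labeled stocks.
    """
    a = adj + np.eye(adj.shape[0])            # add self-loops (sketch assumption)
    w = a / a.sum(axis=1, keepdims=True)      # row-normalize to obtain the propagation matrix W
    return w @ (w @ prior)                    # two hops: W (W x)
```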
Furthermore, attention mechanisms are applied\nbetween the industry and correlation graphs, enabling the\nmodel to adaptively weigh the contribution of each graph\nstructure during the propagation process, thereby improving\nthe prediction performance.\nWhen comparing our system with other temporal or spatial-\ntemporal models used in time series forcating or stock predic-\ntion using the same financial features (price-volume data and\nfundamental data) on the same test sets shown in Table II,\nwe find that our model not only leads but also significantly\noutperforms the traditional and more recent models in time\nseries forecasting and stock prediction across all metrics on all\ndatasets. Notably, even for stocks without expert signals, where\npredictions rely solely on propagated information through our\nDualGAT architecture, the model still significantly outper-\nforms the baseline models, demonstrating the effectiveness of\nour signal propagation mechanism. In addition, the results of\nthe pre-trained temporal model MS-LSTM can also be seen in\nTable II. Since both our model and the baseline model use the\nsame traditional financial features, the significant improvement\nof our model with the addition of expert signals demon-\nstrates that our expert signal captures valuable information\nbeyond what traditional financial features provide. Overall,\nthese results affirm the superiority of our approach, especially\nin utilizing expert-derived signals enhanced by graph-based\nlearning.\nC. Ablation Study\nIn this ablation study, we analyze the impact of different\ngraph structures and graph neural network architectures on the\nperformance of our model. For the graph structure, we tested\nthree configurations using the GAT architecture: employing\nonly the industry graph, only the correlation graph, and a\ncombination of both by our DualGAT. The results shown\nin Table III indicated that the combined graph setup outper-\nformed the individual setups, demonstrating the effectiveness\nof integrating diverse market relationships into the model.\nMeanwhile, for a single graph source, we find that combining\nthe information contained in the pre-trained model output\nwith the expert signal yields better results than the message\npassing using only expert signals, especially in terms of AR\nand SR. This further demonstrates that expert signals can work\nsynergistically with traditional financial features.\nWe also compared the performance of two graph neural\nnetwork architectures, GCN and GAT, using the combined\nindustry and correlation graphs. As shown in Table IV ,the\nGAT architecture showed superior performance over GCN.\nVII. DISCUSSION\nOur study demonstrates a significant advancement in uti-\nlizing social media data for stock prediction by effectively\naddressing two fundamental challenges: identifying valuable\nexpert signals from noisy social media and propagating these\nsparse signals across the stock network for better prediction.\nA critical aspect of our framework is its robustness in han-\ndling sparse expert signals across different sectors and stock\ntypes. Our dual-graph architecture effectively addresses the\nsparsity problem through different propagation mechanisms.\nWe have found that the industry graph enables each stock\u2019s\nsignal to reach an expected 20.65% of nodes in a single hop\nwhile the correlation graph complements this by propagating\nsignals to an additional 9.16% of nodes. 
This dual-graph\napproach ensures comprehensive and sufficient signal coverage\naround 90% after two-hop propagation. The accuracy of signal\npropagation is equally crucial as coverage. Our analysis reveals\nstrong trend alignment between connected stocks, supporting\nthe reliability of our propagation mechanism. In the correlation\n12\nTABLE III\nPERFORMANCE COMPARISON OF THE GAT MODEL USING DIFFERENT GRAPH STRUCTURES. ACC, IC, RIC, ICIR, AR ARE MEASURED IN PERCENTAGE\n(%).\nNASDAQ 100\nSP 500\nUpdated StockNet\nGraph\nACC\nIC\nRIC\nICIR\nAR\nSR\nACC\nIC\nRIC\nICIR\nAR\nSR\nACC\nIC\nRIC\nICIR\nAR\nSR\nCorrelation\n57.15\n9.09\n7.02\n24.51\n27.73\n1.66\n53.08\n6.43\n7.53\n30.46\n24.95\n2.04\n53.61\n5.93\n5.87\n20.76\n23.77\n1.56\nIndustry\n54.44\n4.42\n3.29\n16.93\n16.94\n0.76\n52.61\n5.53\n5.15\n41.14\n19.15\n1.56\n52.67\n3.91\n5.19\n19.91\n15.65\n0.89\nIndustry+Correlation\n57.83\n7.98\n9.69\n36.66\n32.91\n2.69\n55.02\n8.26\n8.29\n43.17\n29.56\n2.92\n54.81\n7.62\n7.78\n30.69\n28.46\n1.87\nTABLE IV\nPERFORMANCE COMPARISON BETWEEN GCN AND GAT. ACC, IC, RIC, ICIR, AR ARE MEASURED IN PERCENTAGE (%).\nNASDAQ 100\nSP 500\nUpdated StockNet\nModel type\nACC\nIC\nRIC\nICIR\nAR\nSR\nACC\nIC\nRIC\nICIR\nAR\nSR\nACC\nIC\nRIC\nICIR\nAR\nSR\nGCN\n56.99\n5.60\n7.09\n27.66\n28.12\n2.31\n54.83\n7.87\n5.43\n41.51\n23.52\n2.19\n54.06\n5.85\n4.56\n34.05\n20.27\n1.29\nGAT\n57.83\n7.98\n9.69\n36.66\n32.91\n2.69\n55.02\n8.26\n8.29\n43.17\n29.56\n2.92\n54.81\n7.62\n7.78\n30.69\n28.46\n1.87\ngraph, stock pairs meeting our correlation thresholds (0.67\nand 0.77) show trend alignment probabilities of 75.09% and\n80.04% respectively in the next day. Similarly, stocks con-\nnected in the industry graph demonstrate a 70.4% probability\nof aligned trends. Even accounting for potential error propa-\ngation, the accuracy of the propagated expert signals remains\nrobust, with even the simplest message passing approach could\noutperform baseline models in Table II.\nWhile our framework excels at daily predictions, it\u2019s im-\nportant to know its scope. Our focus is on daily prediction\nhorizons because price and social media information are most\nrelevant at this timescale. Factors that mainly influence long-\nterm predictions, such as policy changes and macroeconomic\nconditions, fall outside our scope. Nevertheless, we can still\nleverage social media to more accurately capture and utilize\nthe impact of these factors on predictions. For instance, when\nthere is disagreement among users about the stock trend\nprediction due to policy changes, the effects of these factors\nare often reflected in social media discussions, leading to a\nsurge in activity where experts may actively participate. Figure\n1 shows that experts often provide more accurate insights\nthan the general sentiment of the crowd. By identifying and\nleveraging these expert opinions, our framework captures the\nactionable impact of these events more effectively, ultimately\nimproving prediction accuracy. On the other hand, the primary\nfocus of our work is based on social media which could\nindirectly capture the effects of these external factors. Based\non this, our method can work synergistically with other factors\nto improve overall prediction performance.\nUltimately, our method emphasizes the use of expert opin-\nions based on self-labeled sentiments, avoiding the compu-\ntational complexity of directly processing large volumes of\ntweets using language models. 
Though large language models\n(LLMs) have become increasingly mature and widely used in\nrecent years, the massive volume of posts on StockTwits poses\nsignificant computational challenges. Processing over 200,000\ndaily tweets with LLMs would significantly slow down infer-\nence times, result in missing many critical information thus\nmake real-time predictions impractical for our task setting\nsince decisions must be made before the market close. As\nsuch, our approach does not compete with LLMs but rather\ncomplements them by focusing on meaningful social media\nopinions. For instance, one promising future direction involves\nusing our expert tracing system to quickly identify potential\nexperts in the large volume of daily tweets. This would enable\nus to prioritize their texts for LLM processing, allowing the\nLLM to extract additional information beyond sentiment. Such\nselective processing would enhance the synergy between the\ncapability of LLMs and expert signals.\nVIII. CONCLUSION\nOur research has highlighted the significant noise present in\npublic sentiment signals derived from social media platforms\nlike StockTwits, which complicates the forecasting of stock\nprices. To tackle this challenge, we developed an expert tracing\nsystem that effectively identifies and utilizes valuable expert\nsignals amidst this noise. While these signals are highly effec-\ntive, their sparsity limits their direct application. To overcome\nthe limitations posed by sparse signals, we constructed a dual\ngraph-based framework that employs graph attention networks.\nThis innovative approach allows for the effective propagation\nof sparse expert signals across the network, significantly\nenhancing the richness and accuracy of our predictions. Our\nmethod significantly outperforms various benchmark models\nin time-series forecasting and stock prediction.\nFor future work, there is substantial potential to further\nexploit the wealth of information embedded in social media\ncontent. As StockTwits dataset also contains the follower count\nfor each user, we could investigate the relationship between\nsocial media influence and stock price movements, as well\nas examine the predictive capabilities of influential users. An-\nother promising direction could involve processing the textual\ncontent of tweets by the experts or social media influencers\nthrough large language models to generate embeddings. These\nembeddings could then be integrated with sentiment features,\nproviding a richer set of features for our graph-based models.\nREFERENCES\n[1] E. Hoseinzade and S. Haratizadeh, \u201cCnnpred: Cnn-based stock market\nprediction using a diverse set of variables,\u201d Expert Systems with Appli-\ncations, vol. 129, pp. 273\u2013285, 2019.\n[2] F. Feng, X. He, X. Wang, C. Luo, Y. Liu, and T.-S. Chua, \u201cTemporal re-\nlational ranking for stock prediction,\u201d ACM Transactions on Information\nSystems (TOIS), vol. 37, no. 2, pp. 1\u201330, 2019.\n[3] Y.-L. Hsu, Y.-C. Tsai, and C.-T. Li, \u201cFingat: Financial graph attention\nnetworks for recommending top-k k profitable stocks,\u201d IEEE transac-\ntions on knowledge and data engineering, vol. 35, no. 1, pp. 469\u2013481,\n2021.\n13\n[4] R. Xing, R. Cheng, J. Huang, Q. Li, and J. Zhao, \u201cLearning to understand\nthe vague graph for stock prediction with momentum spillovers,\u201d IEEE\nTransactions on Knowledge and Data Engineering, 2023.\n[5] H. Wang, T. Wang, S. Li, J. Zheng, W. Chen, and W. 
Chen, \u201cAgree\nto disagree: Personalized temporal embedding and routing for stock\nforecast,\u201d IEEE Transactions on Knowledge and Data Engineering,\n2024.\n[6] Q. Gao, Z. Liu, L. Huang, K. Zhang, J. Wang, and G. Liu, \u201cRelational\nstock selection via probabilistic state space learning,\u201d IEEE Transactions\non Knowledge and Data Engineering, 2024.\n[7] Y. Xu and S. B. Cohen, \u201cStock movement prediction from tweets and\nhistorical prices,\u201d in Proceedings of the 56th Annual Meeting of the\nAssociation for Computational Linguistics (Volume 1: Long Papers),\n2018, pp. 1970\u20131979.\n[8] R. Sawhney, S. Agarwal, A. Wadhwa, and R. Shah, \u201cDeep attentive\nlearning for stock movement prediction from social media text and\ncompany correlations,\u201d in Proceedings of the 2020 Conference on\nEmpirical Methods in Natural Language Processing (EMNLP), 2020,\npp. 8415\u20138426.\n[9] Q. Li, Y. Chen, J. Wang, Y. Chen, and H. Chen, \u201cWeb media and stock\nmarkets: A survey and future directions from a big data perspective,\u201d\nIEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 2,\npp. 381\u2013399, 2017.\n[10] D. Luo, W. Liao, S. Li, X. Cheng, and R. Yan, \u201cCausality-guided\nmulti-memory interaction network for multivariate stock price movement\nprediction,\u201d in Proceedings of the 61st Annual Meeting of the Associa-\ntion for Computational Linguistics (Volume 1: Long Papers), 2023, pp.\n12 164\u201312 176.\n[11] Q. Zhang, Y. Zhang, F. Bao, Y. Liu, C. Zhang, and P. Liu, \u201cIncorpo-\nrating stock prices and text for stock movement prediction based on\ninformation fusion,\u201d Engineering Applications of Artificial Intelligence,\nvol. 127, p. 107377, 2024.\n[12] Q. Zhang, Y. Zhang, X. Yao, S. Li, C. Zhang, and P. Liu, \u201cA dynamic\nattributes-driven graph attention network modeling on behavioral finance\nfor stock prediction,\u201d ACM Transactions on Knowledge Discovery from\nData, vol. 18, no. 1, pp. 1\u201329, 2023.\n[13] F. Audrino, F. Sigrist, and D. Ballinari, \u201cThe impact of sentiment and\nattention measures on stock market volatility,\u201d International Journal of\nForecasting, vol. 36, no. 2, pp. 334\u2013357, 2020.\n[14] L. Wang, J. Li, L. Zhao, Z. Kou, X. Wang, X. Zhu, H. Wang, Y. Shen,\nand L. Chen, \u201cMethods for acquiring and incorporating knowledge into\nstock price prediction: A survey,\u201d arXiv preprint arXiv:2308.04947,\n2023.\n[15] \u201cGameStop\nshort\nsqueeze,\u201d\nOct.\n2024,\npage\nVersion\nID:\n1249188023. [Online]. Available: https://en.wikipedia.org/w/index.php?\ntitle=GameStop short squeeze&oldid=1249188023\n[16] WorldQuant. Social Media Dataset - WorldQuant Brain Platform.\n[Online].\nAvailable:\nhttps://platform.worldquantbrain.com/data/\ndata-sets?category=socialmedia&delay=1&instrumentType=EQUITY&\nlimit=20&offset=0®ion=USA&universe=TOP3000\n[17] \u201cStockTwits,\u201d https://stocktwits.com/.\n[18] R. Gupta and M. Chen, \u201cSentiment analysis for stock price prediction,\u201d\nin 2020 IEEE conference on multimedia information processing and\nretrieval (MIPR).\nIEEE, 2020, pp. 213\u2013218.\n[19] S. Coyne, P. Madiraju, and J. Coelho, \u201cForecasting stock prices using\nsocial media analysis,\u201d in 2017 IEEE 15th Intl Conf on Depend-\nable, Autonomic and Secure Computing, 15th Intl Conf on Perva-\nsive Intelligence and Computing, 3rd Intl Conf on Big Data Intelli-\ngence and Computing and Cyber Science and Technology Congress\n(DASC/PiCom/DataCom/CyberSciTech).\nIEEE, 2017, pp. 
1031\u20131038.\n[20] H. Wang, T. Wang, and Y. Li, \u201cIncorporating expert-based investment\nopinion signals in stock prediction: A deep learning framework,\u201d in\nProceedings of the AAAI Conference on Artificial Intelligence, vol. 34,\nno. 01, 2020, pp. 971\u2013978.\n[21] P. Xie, H. Chen, and Y. J. Hu, \u201cSignal or noise in social media\ndiscussions: the role of network cohesion in predicting the bitcoin\nmarket,\u201d Journal of Management Information Systems, vol. 37, no. 4,\npp. 933\u2013956, 2020.\n[22] M. Schnaubelt, T. G. Fischer, and C. Krauss, \u201cSeparating the signal from\nthe noise\u2013financial machine learning for twitter,\u201d Journal of Economic\nDynamics and Control, vol. 114, p. 103895, 2020.\n[23] H. Zhang, Y. Chen, W. Rong, J. Wang, and J. Tan, \u201cEffect of social\nmedia rumors on stock market volatility: A case of data mining in china,\u201d\nFrontiers in Physics, vol. 10, p. 987799, 2022.\n[24] L. Eliner and B. Kobilov, \u201cTo the moon or bust: Do retail investors\nprofit from social media induced trading,\u201d Working Paper]. https://www.\nbotirkobilov. com/research, Tech. Rep., 2023.\n[25] Y. Tan, W. Zhang, and X. Kong, \u201cMarket manipulation by rumor-\nmongers: Evidence from insiders\u2019 stock selling,\u201d China Journal of\nAccounting Research, vol. 16, no. 3, p. 100318, 2023.\n[26] Y. Ge, J. Qiu, Z. Liu, W. Gu, and L. Xu, \u201cBeyond negative and positive:\nExploring the effects of emotions in social media during the stock market\ncrash,\u201d Information processing & management, vol. 57, no. 4, p. 102218,\n2020.\n[27] M. R. Bouadjenek, S. Sanner, and G. Wu, \u201cA user-centric analysis of\nsocial media for stock market prediction,\u201d ACM Transactions on the\nWeb, vol. 17, no. 2, pp. 1\u201322, 2023.\n[28] W. Liao, S. Shah, and M. Makrehchi, \u201cWinning by following the\nwinners: Mining the behaviour of stock market experts in social media,\u201d\nin International Conference on Social Computing, Behavioral-Cultural\nModeling, and Prediction.\nSpringer, 2014, pp. 103\u2013110.\n[29] D. J. Hilton, \u201cThe psychology of financial decision-making: Applications\nto trading, dealing, and investment analysis,\u201d Journal of Psychology and\nFinancial Markets, vol. 2, pp. 37 \u2013 53, 2001.\n[30] M. Lekovi\u00b4c, \u201cCognitive biases as an integral part of behavioral finance,\u201d\nEconomic Themes, vol. 58, no. 1, pp. 75\u201396, 2020.\n[31] U.S. Securities and Exchange Commission, \u201cSEC Charges Eight Social\nMedia Influencers in 100 Million Stock Manipulation Scheme Promoted\non Discord and Twitter,\u201d https://www.sec.gov/newsroom/press-releases/\n2022-221, 2022.\n[32] T. Ane and C. Kharoubi, \u201cDependence structure and risk measure,\u201d The\njournal of business, vol. 76, no. 3, pp. 411\u2013438, 2003.\n[33] H. Zhu, S.-Y. Liu, P. Zhao, Y. Chen, and D. L. Lee, \u201cForecasting asset\ndependencies to reduce portfolio risk,\u201d in Proceedings of the AAAI\nConference on Artificial Intelligence, vol. 36, no. 4, 2022, pp. 4397\u2013\n4404.\n[34] C. Zhang, X. Pu, M. Cucuringu, and X. Dong, \u201cGraph neural networks\nfor forecasting multivariate realized volatility with spillover effects,\u201d\narXiv preprint arXiv:2308.01419, 2023.\n[35] Y. Zhang, Z. Zhou, Q. Yao, X. Chu, and B. Han, \u201cAdaprop: Learning\nadaptive propagation for graph neural network based knowledge graph\nreasoning,\u201d in Proceedings of the 29th ACM SIGKDD conference on\nknowledge discovery and data mining, 2023, pp. 3446\u20133457.\n[36] T. 
Xiao, Z. Chen, D. Wang, and S. Wang, \u201cLearning how to propagate\nmessages in graph neural networks,\u201d in Proceedings of the 27th ACM\nSIGKDD Conference on Knowledge Discovery & Data Mining, 2021,\npp. 1894\u20131903.\n[37] J. Bollen, H. Mao, and X. Zeng, \u201cTwitter mood predicts the stock\nmarket,\u201d Journal of computational science, vol. 2, no. 1, pp. 1\u20138, 2011.\n[38] T. H. Nguyen, K. Shirai, and J. Velcin, \u201cSentiment analysis on social me-\ndia for stock movement prediction,\u201d Expert Systems with Applications,\nvol. 42, no. 24, pp. 9603\u20139611, 2015.\n[39] L. Shi, Z. Teng, L. Wang, Y. Zhang, and A. Binder, \u201cDeepclue: visual\ninterpretation of text-based deep stock prediction,\u201d IEEE Transactions\non Knowledge and Data Engineering, vol. 31, no. 6, pp. 1094\u20131108,\n2018.\n[40] N. Oliveira, P. Cortez, and N. Areal, \u201cOn the predictability of stock\nmarket behavior using stocktwits sentiment and posting volume,\u201d in\nProgress in Artificial Intelligence: 16th Portuguese Conference on\nArtificial Intelligence, EPIA 2013, Angra do Hero\u00b4\u0131smo, Azores, Portugal,\nSeptember 9-12, 2013. Proceedings 16.\nSpringer, 2013, pp. 355\u2013365.\n[41] Z. Zhao, L. Zhang, X. He, and W. Ng, \u201cExpert finding for question\nanswering via graph regularized matrix completion,\u201d IEEE Transactions\non Knowledge and Data Engineering, vol. 27, no. 4, pp. 993\u20131004,\n2014.\n[42] Z. Zhao, Q. Yang, D. Cai, X. He, and Y. Zhuang, \u201cExpert finding\nfor community-based question answering via ranking metric network\nlearning.\u201d in Ijcai, vol. 16, 2016, pp. 3000\u20133006.\n[43] M. Bouguessa, B. Dumoulin, and S. Wang, \u201cIdentifying authoritative\nactors in question-answering forums: the case of yahoo! answers,\u201d in\nProceedings of the 14th ACM SIGKDD international conference on\nKnowledge discovery and data mining, 2008, pp. 866\u2013874.\n[44] H. Zhu, E. Chen, H. Xiong, H. Cao, and J. Tian, \u201cRanking user authority\nwith relevant knowledge categories for expert finding,\u201d World Wide Web,\nvol. 17, pp. 1081\u20131107, 2014.\n[45] J. Guo, S. Xu, S. Bao, and Y. Yu, \u201cTapping on the potential of q&a\ncommunity by recommending answer providers,\u201d in Proceedings of the\n17th ACM conference on Information and knowledge management, 2008,\npp. 921\u2013930.\n[46] G. Zhou, S. Lai, K. Liu, and J. Zhao, \u201cTopic-sensitive probabilistic\nmodel for expert finding in question answer communities,\u201d in Pro-\nceedings of the 21st ACM international conference on Information and\nknowledge management, 2012, pp. 1662\u20131666.\n14\n[47] J. Weng, E.-P. Lim, J. Jiang, and Q. He, \u201cTwitterrank: finding topic-\nsensitive influential twitterers,\u201d in Proceedings of the third ACM inter-\nnational conference on Web search and data mining, 2010, pp. 261\u2013270.\n[48] R. Chi, B. Wu, and L. Wang, \u201cExpert identification based on dynamic\nlda topic model,\u201d in 2018 IEEE Third International Conference on Data\nScience in Cyberspace (DSC).\nIEEE, 2018, pp. 881\u2013888.\n[49] Y. Zhao, H. Du, Y. Liu, S. Wei, X. Chen, F. Zhuang, Q. Li, and G. Kou,\n\u201cStock movement prediction based on bi-typed hybrid-relational market\nknowledge graph via dual attention networks,\u201d IEEE Transactions on\nKnowledge and Data Engineering, vol. 35, no. 8, pp. 8559\u20138571, 2022.\n[50] S. Xiang, D. Cheng, C. Shang, Y. Zhang, and Y. 
Liang, \u201cTemporal and\nheterogeneous graph neural network for financial time series prediction,\u201d\nin Proceedings of the 31st ACM international conference on information\n& knowledge management, 2022, pp. 3584\u20133593.\n[51] A. Mehrabian, E. Hoseinzade, M. Mazloum, and X. Chen, \u201cMamba\nmeets financial markets: A graph-mamba approach for stock price\nprediction,\u201d arXiv preprint arXiv:2410.03707, 2024.\n[52] Y. Chen, Z. Wei, and X. Huang, \u201cIncorporating corporation relationship\nvia graph convolutional neural networks for stock price prediction,\u201d in\nProceedings of the 27th ACM international conference on information\nand knowledge management, 2018, pp. 1655\u20131658.\n[53] H. Wang, T. Wang, S. Li, and S. Guan, \u201cHatr-i: Hierarchical adaptive\ntemporal relational interaction for stock trend prediction,\u201d IEEE Trans-\nactions on Knowledge and Data Engineering, vol. 35, no. 7, pp. 6988\u2013\n7002, 2022.\n[54] Y. Liu, S. Di, L. Chen, X. Zhou, and F. Lin, \u201cA universal and inter-\npretable method for enhancing stock price prediction,\u201d in Proceedings of\nthe 33rd ACM International Conference on Information and Knowledge\nManagement, 2024, pp. 1533\u20131543.\n[55] X. Ying, C. Xu, J. Gao, J. Wang, and Z. Li, \u201cTime-aware graph\nrelational attention network for stock recommendation,\u201d in Proceedings\nof the 29th ACM International Conference on Information & Knowledge\nManagement, 2020, pp. 2281\u20132284.\n[56] S. Li, Y. Liu, X. Chen, J. Wu, and K. Xu, \u201cForecasting turning points in\nstock price by integrating chart similarity and multipersistence,\u201d IEEE\nTransactions on Knowledge and Data Engineering, 2024.\n[57] F. Kamalov, \u201cForecasting significant stock price changes using neural\nnetworks,\u201d Neural Computing and Applications, vol. 32, no. 23, pp.\n17 655\u201317 667, 2020.\n[58] R. Corizzo and J. Rosen, \u201cStock market prediction with time series\ndata and news headlines: a stacking ensemble approach,\u201d Journal of\nIntelligent Information Systems, vol. 62, no. 1, pp. 27\u201356, 2024.\n[59] I. K. Nti, A. F. Adekoya, and B. A. Weyori, \u201cA systematic review\nof fundamental and technical analysis of stock market predictions,\u201d\nArtificial Intelligence Review, vol. 53, no. 4, pp. 3007\u20133057, 2020.\n[60] J. Chung, S. Ahn, and Y. Bengio, \u201cHierarchical multiscale recurrent neu-\nral networks,\u201d in International Conference on Learning Representations,\n2022.\n[61] K. J. Koa, Y. Ma, R. Ng, and T.-S. Chua, \u201cLearning to generate ex-\nplainable stock predictions using self-reflective large language models,\u201d\nin Proceedings of the ACM on Web Conference 2024, 2024, pp. 4304\u2013\n4315.\n[62] I. O. Tolstikhin, N. Houlsby, A. Kolesnikov, L. Beyer, X. Zhai, T. Un-\nterthiner, J. Yung, A. Steiner, D. Keysers, J. Uszkoreit et al., \u201cMlp-mixer:\nAn all-mlp architecture for vision,\u201d Advances in neural information\nprocessing systems, vol. 34, pp. 24 261\u201324 272, 2021.\n[63] Y. Nie, N. H. Nguyen, P. Sinthong, and J. Kalagnanam, \u201cA time series\nis worth 64 words: Long-term forecasting with transformers,\u201d arXiv\npreprint arXiv:2211.14730, 2022.\n[64] Y. Zhang and J. Yan, \u201cCrossformer: Transformer utilizing cross-\ndimension dependency for multivariate time series forecasting,\u201d in The\neleventh international conference on learning representations, 2023.\n[65] H. Wu, T. Hu, Y. Liu, H. Zhou, J. Wang, and M. 
Long, \u201cTimesnet:\nTemporal 2d-variation modeling for general time series analysis,\u201d in\nThe Eleventh International Conference on Learning Representations.\n[66] S. Wang, H. Wu, X. Shi, T. Hu, H. Luo, L. Ma, J. Y. Zhang,\nand J. ZHOU, \u201cTimemixer: Decomposable multiscale mixing for time\nseries forecasting,\u201d in The Twelfth International Conference on Learning\nRepresentations.\n[67] W. Li, R. Bao, K. Harimoto, D. Chen, J. Xu, and Q. Su, \u201cModeling\nthe stock relation with graph network for overnight stock movement\nprediction,\u201d in Proceedings of the twenty-ninth international conference\non international joint conferences on artificial intelligence, 2021, pp.\n4541\u20134547.\n[68] L. Zhang, C. Aggarwal, and G.-J. Qi, \u201cStock price prediction via\ndiscovering multi-frequency trading patterns,\u201d in Proceedings of the 23rd\nACM SIGKDD international conference on knowledge discovery and\ndata mining, 2017, pp. 2141\u20132149.\n[69] S. Xiang, D. Cheng, C. Shang, Y. Zhang, and Y. Liang, \u201cTemporal and\nheterogeneous graph neural network for financial time series prediction,\u201d\nin Proceedings of the 31st ACM international conference on information\n& knowledge management, 2022, pp. 3584\u20133593." - }, - { - "domain": "Materials Science", - "chunk_type": "general", - "text": "Abstract\u2014This study presents the first experimental \nexploration into cryogenic ferroelectric behavior in \nwurtzite ferroelectrics. A breakdown field (EBD) to coercive \nfield (EC) ratio of 1.8 is achieved even at 4 K, marking the \nlowest ferroelectric switching temperature reported for \nwurtzite ferroelectrics. Additionally, a significant evolution \nin fatigue behavior is captured, transitioning from hard \nbreakdown \nto \nferroelectricity \nloss \nat \ncryogenic \ntemperatures. These findings unlock the feasibility for \nwurtzite ferroelectrics to advance wide temperature \nnon-volatile memory. \nIndex Terms\u2014wurtzite ferroelectrics, AlScN, cryogenic \ntemperature, fatigue \nI. Introduction \nhe advancement of non-volatile memory (NVM) \ntechnologies capable of reliable operation across a broad \ntemperature spectrum, spanning from cryogenic temperatures \nas low as 4 K to ultra-high temperatures exceeding 1200 K, \nremains a formidable challenge [1]-[3]. This constraint \nsignificantly impedes progress in critical domains such as \naerospace systems, deep-space exploration, nuclear fusion, \nplasma technologies, and high-power laser systems [4], [5]. \nWurtzite ferroelectrics, distinguished by their exceptionally \nhigh curie temperatures of approximately 1373 K, emerge as \npromising candidates for enabling NVM devices operable \nacross these extreme thermal environments [6]. While their \nstability at elevated temperatures has been well-documented, \nshowing ferroelectricity up to 873 K, their behavior under \ncryogenic conditions remains uncharted, associated with a \nquestionable feasibility [3], [6], [7]. This knowledge gap stems \n \nManuscript received XXXX; revised XXXXX; accepted XXXXX. Date \nof publication XXXXXX; date of current version XXXXXXX. The authors \nacknowledge support from the National Science and Technology Major \nProject (No. 2022ZD0119002), the National Natural Science \nFoundation of China (Grant No. 92264101, 92464205, 62025402, \n62090033), the Major Program of Zhejiang Natural Science Foundation \n(Grant No. LD25F040004), and the Postdoctoral Fellowship Program of \nChina Postdoctoral Science Foundation (No. GZC20241309). 
The \nreview of this paper was arranged by Editor XXXXX. (Corresponding \nauthor: Jiuren Zhou, Siying Zheng, and Feng Zhu) \nR. Wang, J. Zhou, S. Zheng, W. Sun, H. Xu, B. Li, Y. Liu, Y. Hao, and \nG. Han are with School of Microelectronics, Xidian University, Xi\u2019an, \n710126, China, and also with Hangzhou Institute of Technology, Xidian \nUniversity, Hangzhou, 311200, China. (e-mail: zhoujiuren@163.com; \nsiying_zheng@163.com). \nF. Zhu, with TRACE EM Unit and Department of Materials Science \nand Engineering, City University of Hong Kong, Hong Kong, 000000, \nChina (e-mail: fengzhu@cityu.edu.hk). \nColor versions of one or more figures in this letter are available at \nxxxxxxxxxxxxxxxxxxxxxx. \nDigital Object Identifier xxxxxxxxxxxxxxxxxxxxx \nprimarily from material limitations at low temperatures, where \nthe coercive field (EC) approaches or exceeds the breakdown \nfield (EBD), thus usually leading to the loss of ferroelectricity \n[1]. To expand the low-temperature NVM applications of \npromising wurtzite materials, there is an urgent need to study \nthe cryogenic ferroelectric behavior. \nIn this study, we bridge this gap by experimentally \nfabricating well-crystalline wurtzite AlScN and clarifying its \nfeasibility for ferroelectric operating at cryogenic temperatures, \ndown to 4 K. Beyond merely evaluating their ferroelectric \npolarization switching at cryogenic levels, the study also \ndelves into critical reliability concerns in such a regime, \nparticularly hard breakdown and fatigue, to provide a \ncomprehensive understanding of their operational stability. \nII. Experiments \n \nFig. 1(a) outlines the fabrication process for wurtzite AlScN \ncapacitors. The native oxide on a heavily doped P-type Si (001) \nwafer was removed by dilute hydrofluoric acid (DHF). An \nAlScN film was then sputtered at 200 \u00b0C using a single \nAl0.8Sc0.2 target, with high N2 (160 sccm) and low Ar (32 sccm) \nflows. The vacuum conveying ensured the unoxidized AlScN \nfilm. Finally, the top electrode was defined with an area of \n5024 \u03bcm\u00b2. To prevent material failure at cryogenic \ntemperatures, N-rich and oxygen-free processing conditions \nwere used [7]. \nFig. 1(b) illustrates the device structure, comprising a \n163-nm-thick AlScN layer sandwiched between a Pt top \nelectrode and a heavily doped P-type silicon bottom electrode. \nHigh-resolution transmission electron microscopy (HRTEM) \nimages in Fig. 1(c) underscore the crystalline integrity of the \nCryogenic Ferroelectric Behavior of Wurtzite \nFerroelectrics \nRuiqing Wang, Jiuren Zhou, Member, IEEE, Siying Zheng, Feng Zhu, Wenxin Sun, Haiwen Xu, \nBochang Li, Yan Liu, Yue Hao, Senior Member, IEEE, and Genquan Han, Senior Member, IEEE \n20 nm\nR3\nAlScN deposition:\n\uf070N-rich ambient sputtering\n\uf070Vacuum conveying\nTop electrode definition\nNative oxide removal\nAlScN\n(~ 163 nm)\nPt (~ 50 nm)\nP+-Si\nR1\nR2\nR3\n20 nm\nR1\n(0002)\nc*\n(0002)\nc*\nR1.1\n5 nm\n~2.52 \u00c5\n20 nm\nR2\n(a)\n(b)\n(c)\n(0002)\nc*\nc*\nR1.1\n \nFig. 1. (a) Key process flow for fabricating AlScN capacitors. (b-c)\nHRTEM images, demonstrating the obtained superior crystalline at\nboth top interface and bulk regions. \nT\n \nwurtzite ferroelectric layer. At the top interface (Region 1, R1), \nthe samples exhibit intact crystalline with preserved direction \nand lattice structure, avoiding the typical oxidized interfacial \nlayers seen in similar systems [8]. 
Diffraction analyses from \nbulk regions (Regions 2 and 3, R2 & R3) reveal almost \nuniform crystalline throughout the film, even at grain \nboundaries where nitrogen vacancies are prone to accumulate. \nThese optimized fabrication measures effectively suppress \noxidation at the interface and nitrogen vacancy formation \nwithin the bulk, ensuring superior crystalline and enhanced \nbreakdown characteristics [9], [10]. \nIII. RESULTS AND DISCUSSION \nThe \ncryogenic \nferroelectric \npolarization \nswitching \ncharacteristics of wurtzite AlScN are detailed in Fig. 2. Fig. 2(a) \nillustrates the applied positive-up-negative-down (PUND) \npulse trains, with a fixed pulse width of 25 \u00b5s, while the pulse \namplitudes vary between 3.9 and 7.5 MV/cm, which \naccommodates the increased coercive field (EC) induced by the \ntemperature decrease, owing to the raised ferroelectric \npolarization switching barrier [11]. This measurement scheme \nthus ensured a consistent remnant polarization of our samples, \napproximately 100 \u00b5C/cm2, across the whole tested \ntemperature range, spanning from 400 to 4 K. \nFig. 2(b) presents the extracted polarization (P) versus (V) \ncurves through the dynamic PUND test, demonstrating \nsuccessful ferroelectric polarization switching at extreme-low \ntemperature of 4 K, albeit with a significantly large EC. The \nincrease in EC of wurtzite ferroelectrics leads to an inevitable \nrise in the required external electric field, which in turn leads to \nan increase in leakage current, resulting in a widened \npolarization gap at an electric field near 0 MV/cm [11], [12]. \nFurthermore, Fig. 2(c) shows the quasi-static C-V curves for \nthe same sample. The characteristic butterfly-shaped profile, \nalong with the pronounced tailing effect, further confirms \nsuccessful ferroelectric polarization switching at 4 K, as well \nas the obvious increases in EC and leakage current. \nAdditionally, wurtzite ferroelectrics also exhibit a decrease in \npermittivity as the temperature lowers, which can be attributed \nto the suppression of dipoles orientation [13]. \nFocusing on the cryogenic operational reliability of wurtzite \nferroelectrics, \ntime-zero-dependent \nbreakdown \n(TZDB) \nmeasurements were conducted. Fig. 3(a) presents the \ncorresponding I-V curves at various temperatures. The small \nhumps and abrupt jumps in the I-V curves mean the \nferroelectric polarization switching (EC points) and hard \nbreakdown (breakdown electric field points, EBD points) of \nAlScN capacitors, respectively. Although the EC of wurtzite \nferroelectrics \nincreases \ncontinuously \nwith \ndecreasing \ntemperature, it remains smaller than the EBD, which guarantees \nthe reliability of ferroelectric polarization switching in \ncryogenic temperatures. Notably, a convergence between EBD \nand EC is observed as the temperature decreases [14], [15]. \nTo investigate the underlying conduction mechanisms with \ntemperature transformation, Fig. 3(b) plots the leakage current \nof the samples as a function of 1000/T. At higher temperatures, \nthe leakage current exhibits a strong temperature dependence; \nhowever, this dependence weakens at lower temperatures, \nindicating a transition in the conduction mechanism from \nPoole-Frenkel (PF) hopping to Fowler-Nordheim (FN) \ntunneling [16]. 
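A simple way to make this qualitative observation quantitative is to track the local Arrhenius slope of the leakage current at a fixed field, d ln J/d(1/T): Poole-Frenkel hopping is thermally activated and yields a sizeable apparent activation energy, whereas Fowler-Nordheim tunneling is nearly temperature independent, so the knee appears where the extracted activation energy collapses toward zero. The sketch below runs on synthetic data and only illustrates that procedure; it is not the analysis behind Fig. 3(b), and the activation energy, prefactor, leakage floor, and 0.05 eV threshold are assumed values.

import numpy as np

k_B = 8.617e-5                        # Boltzmann constant in eV/K

# Synthetic J(T) at a fixed field: a thermally activated (PF-like) term
# plus a temperature-independent (FN-like) floor. All values are assumed.
T = np.array([400, 300, 250, 200, 150, 100, 77, 4], dtype=float)   # K
J = 5e2 * np.exp(-0.25 / (k_B * T)) + 3e-3                          # A/cm^2

inv_T = 1.0 / T
lnJ = np.log(J)

# Local Arrhenius slope between neighbouring temperatures and the
# corresponding apparent activation energy.
slope = np.diff(lnJ) / np.diff(inv_T)      # in K
E_a = -k_B * slope                          # in eV

for (t_hi, t_lo), ea in zip(zip(T[:-1], T[1:]), E_a):
    regime = "PF-like (activated)" if ea > 0.05 else "FN-like (athermal)"
    print(f"{t_hi:5.0f} K -> {t_lo:3.0f} K : E_a ~ {ea:6.3f} eV   {regime}")

With these assumed parameters the apparent activation energy collapses between roughly 200 and 150 K, which is how a knee temperature can be read off from data of the kind shown in Fig. 3(b).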
Especially, such a low knee point observed \naround 150-200 K in the current wurtzite ferroelectrics, is \nmuch smaller than the counterparts of fluorite ferroelectrics, \naround 400 K [16], underscoring the necessity of wurtzite \nferroelectrics for further enhancement in crystalline, towards \nthe advanced applications at even lower temperatures. \nShifting to the temperature-dependent characteristics of the \nt\nE\nEpulse: 3.9, 4.7, 5.2, 5.8, \n6.5, 6.6, 7, 7.5 MV/cm\ntrise, pulse, fall: 25 \u03bcs\nTtest: 400, 300, 250, 200, 150, 100, 77, 4 K\n(a)\n(b)\n(c)\n\u22128\n\u22124\n0\n4\n8\n\u2212150\n\u2212100\n\u221250\n0\n50\n100\nPolarization (\u03bcC/cm2)\nElectric Field (MV/cm)\nRaised Ileakage\ntpulse:\n25 \u03bcs\n\u22126 \u22124 \u22122 0\n2\n4\n6\n0.6\n0.7\n0.8\n0.9\n1.0\n1.1\nCapacitance (fF/\u03bcm2)\nElectric Field (MV/cm)\nEnlarged EC\nRaised Ileakage\nftest : 100 kHz\n \nFig. 2 (a) PUND pulse scheme used for testing temperatures (Ttest)\nranging from 400 to 4 K. (b) PUND and (c) C-V measurements. \n\u22128\n\u22124\n0\n4\n8\n10\u22129\n10\u22127\n10\u22125\n10\u22123\n10\u22121\n101\n103\nJ (A/cm2)\nElectric Field (MV/cm)\nFe switching\nEC shift\nTtest: 400, 300, 250, 200\n150, 100, 77, 4 K\n(a)\n(b)\n100\n101\n102\n10\u22123\n10\u22122\n10\u22121\n100\n101\nJ (A/cm2)\n1000/T (K-1)\n5 MV/cm\n4 MV/cm\n3 MV/cm\nPF\nFN-TUN\nTtest: 400, 300, 250, 200\n150, 100, 77, 4 K\n \nFig. 3 Breakdown properties of AlScN film, involving (a) TZDB test, (b)\nthe leakage current as a function of 1000/T at different electric fields. \n0\n100\n200\n300\n400\n5\n7\n10\n12\n15\nElectric Field (MV/cm)\nTemperature (K)\n2\u00d7EBD\n2\u00d7EC\nStable\n20 Samples\n~ 2.6X\nkBD:-16.42 kV\u00b7cm-1\u00b7K-1\nkC:-10.12 kV\u00b7cm-1\u00b7K-1\n~ 1.8X\n \nFig. 4. Temperature dependence of the EBD and EC for 20 samples, with \nfitting kBD and kC constants and EBD/EC of 1.8 at 4 K. \nTable I Temperature coefficient of EC in Ferroelectric Wurtzites \n \n[15] \nThis work \nTemperature \nrange \n300 ~673 K \n400 ~ 4 K (cryogenic) \nTemperature \ncoefficient of (EC) \n- 4.50 kV\u2022cm-1\u2022K-1 \n- 5.06 kV\u2022cm-1\u2022K-1 \n \nEBD and EC, Fig. 4 presents statistical data on EC and EBD as a \nfunction of temperature, which are extracted from quasi-static \nC-V and I-V curves. The device counts are 20. The EC exhibits \na monotonic increase as the temperature decreases from 400 to \n4 K, with the cryogenic temperature coefficient of EC reported \nfor the first time, as - 5.06 kV\u00b7cm-1\u00b7K-1 (half of kC in Fig. 4), as \nsummarized in the following Table I. In contrast, EBD displays \na distinct trend: it initially increases with decreasing \ntemperature before stabilizing, with a knee point around 200 K, \nwhich corresponds well to the conduction mechanism \ntransition illustrated in Fig. 3(b). Remarkably, even at 4 K, our \nsamples maintain a robust EBD/EC ratio of 1.8, enabling reliable \nferroelectric polarization switching in wurtzite ferroelectrics. \nFurthermore, the cryogenic fatigue characteristics of \nwurtzite ferroelectrics are examined in Fig. 5, with a stress \nfrequency of 100 kHz and varying pulse amplitudes applied \ndepending on the test temperatures. The competitive \nendurance characteristics, up to 2\u00d7106 @ 300 K, is obtained in \nour samples. More importantly, Fig. 
5 reveals a distinct \ntransition in fatigue behavior: breakdown dominates at high \ntemperatures, whereas ferroelectricity loss becomes prominent \nat cryogenic temperatures, which both strongly relate to \nnitrogen vacancy motion and ferroelectric domain pinning [17], \n[18], requiring further investigation into nitrogen vacancy \noptimization and the underlying mechanisms governing \nfatigue properties. \nFig. 6 benchmarks the extreme research temperatures for the \nreported wurtzite ferroelectrics so far, highlighting our \npioneering entry into the cryogenic temperature regime, with \nreliable ferroelectric polarization switching down to 4 K [6], \n[14], [15], [19]-[22]. This milestone underscores the \nsignificant potential of wurtzite ferroelectrics for the advance \nwide-temperature applications. \nIV. CONCLUSION \nFor the first time, the operational temperature of wurtzite \nferroelectrics has been extended to the cryogenic regime, \nremarkably down to 4 K. Coupled with comprehensive \nreliability investigations, this work paves the way for \nsignificant advancements in non-volatile memory technologies \ntailored for extreme wide-temperature environments. \n(a)\n(b)\n0\n100\n200\n300\n400\n102\n103\n104\n105\n106\n107\n108\n109\nCycles\nTemperature (K)\nHard Breakdown\nFe-Loss\nfStress @ 100 kHz\n10-1 100 101 102 103 104 105 106 107 108\n-200\n-100\n0\n100\n200\n300\nPr (\u03bcC/cm2)\nCycles\n 400 K \n300 K \n 250 K \n 200 K\n 150 K \n 100 K \n 77 K \n 4 K\nfStress @ 100 kHz\n \nFig. 5 (a) Endurance characteristics of the ferroelectric AlScN capacitor \nunder wide-ranging temperatures. (b) Cumulative analyses of \nendurance, showing a distinct transition with temperature change. \n2019\n2021\n2023\n2025\n4\n10\n100\n1000\nTemperature (K)\nYear\nWurtzite ferroelectrics\nThis work\n[19]\n[20]\n[14]\n[15]\n[21]\n[6]\nCryogenic \n(< 120 K)\n[22]\n \nFig. 6. Comparison of extreme temperature characteristics of wurtzite\nferroelectrics, emphasizing the record-low temperature of 4 K achieved \nin this work. \n \nREFERENCES \n[1] S. Alam, M. S. Hossain, S. R. Srinivasa, and A. Aziz, \u201cCryogenic \nmemory technologies,\u201d Nat. Electron., vol. 6, no. 3, pp. 185-198, \nMar. 2023, doi: 10.1038/s41928-023-00930-2. \n[2] H. L. Chiang, H. L. Chiang, T. C. Chen, J. F. Wang, S. \nMukhopadhyay, W. K. Lee, C. L. Chen, W. S. Khwa, B. \nPulicherla, P. J. Liao, K. W. Su, K. F. Yu, T. Wang, H. -S. P. \nWong, C. H. Diazl, and J. Cai, \u201cCold CMOS as a \npower-performance-reliability booster for advanced FinFETs,\u201d in \nProc. IEEE Symp. VLSI Technol., Honolulu, HI, USA, Jun. 2020, \npp. 1-2, doi: 10.1109/VLSITechnology18217.2020.9265065 \n[3] W. Sui, H. Wang, J. Lee, A. Qamar, M. Rais\u2010Zadeh, and P. X. \n\u2010L. Feng, \u201cAlScN\u2010on\u2010SiC thin film micromachined resonant \ntransducers operating in high\u2010temperature environment up to \n600 \u2103,\u201d Adv. Funct. Mater., vol. 32, no. 34, pp. 2202204, Aug. \n2022, doi: 10.1002/adfm.202202204. \n[4] M. Zhang, H. Qian, J. Xu, M. Ma, R. Shen, G. Lin, J. Gu, Y. Liu, \nC. Jin, J. Chen, and G. Han, \u201cEnhanced endurance and stability of \nFDSOI ferroelectric FETs at cryogenic temperatures for \nadvanced memory applications,\u201d IEEE Trans. Electron Devices, \nvol. \n71, \nno. \n11, \npp. \n6680-6685, \nNov. \n2024, \ndoi: \n10.1109/TED.2024.3456763. \n[5] J. Hur, Y. -C. Luo, Z. Wang, S. Lombardo, A. I. Khan, and S. 
Yu, \n\u201cCharacterizing ferroelectric properties of Hf0.5Zr0.5O2 from \ndeep-cryogenic temperature (4 K) to 400 K,\u201d IEEE J. Explor. \nSolidState Comput. Devices Circuits, vol. 7, pp. 168-174, 2021, \ndoi: 10.1109/JXCDC.2021.3130783. \n[6] D. K. Pradhan, D. C. Moore, G. Kim, Y. He, P. Musavigharavi, K. \nH. Kim, N. Sharma, Z. Han, X. Du, V. S. Puli, E. A. Stach, W. J. \nKennedy, N. R. Glavin, R. H. Olsson, and D. Jariwala, \u201cA \nscalable ferroelectric non-volatile memory operating at 600 \u00b0C,\u201d \nNat. Electron., vol. 7, no. 5, pp. 348-355, Apr. 2024, doi: \n10.1038/s41928-024-01148-6. \n[7] W. Sun, J. Zhou, N. Liu, S. Zheng, X. Li, B. Li, Y. Liu, Y. Hao, \nand G. Han, \u201cIntegration of ferroelectric Al0.8Sc0.2N on Si (001) \nsubstrate,\u201d IEEE Electron Device Lett., vol. 45, no. 4, pp. 574-577, \nApr. 2024, doi: 10.1109/LED.2024.3363724. \n[8] R. Wang, J. Zhou, D. Yao, S. Zheng, B. Li, X. Li, Y. Liu, Y. Hao, \nand G. Han \u201cUnraveling fatigue mechanisms in ferroelectric \nAlScN films: The role of oxygen infiltration,\u201d in IEEE Electron \nDevice Lett., early access, doi: 10.1109/LED.2024.3522947. \n[9] J. Kataoka, S. -L. Tsai, T. Hoshii, H. Wakabayashi, K. Tsutsui, \nand K. Kakushima, \u201cA possible origin of the large leakage current \nin ferroelectric Al1\u2212xScxN films,\u201d Jpn. J. Appl. Phys., vol. 60, no. \n3, pp. 030907, Feb. 2021, doi: 10.35848/1347-4065/abe644. \n[10] M. Li, K. Hu, H. Lin, V. Felmetsger, and Y. Zhu, \u201cOxidation of \nsputtered AlScN films exposed to the atmosphere,\u201d in 2022 IEEE \nInternational Ultrasonics Symposium (IUS), Venice, Italy: IEEE, \nOct. 2022, pp. 1-3, doi: 10.1109/IUS54386.2022.9957694. \n[11] W. Sun, J. Zhou, F. Jin, N. Liu, S. Zheng, B. Li, X. Li, Y. Liu, Y. \nHao, and G. Han, \u201cTemperature dependence in coercive field of \nferroelectric AlScN integrated on Si substrate\u201d, in 2024 IEEE \nInternational Conference on IC Design and Technology \n(ICICDT), Singapore, Singapore: IEEE, Sept. 2024, pp. 1-4, doi: \n10.1109/ICICDT63592.2024.10717847. \n[12] G. Giribaldi, M. Pirro, B. H. Soukup, M. Assylbekova, L. \nColombo, \nand \nM. \nRinaldi, \n\u201cCompensation \nof \ncontact \nnature-dependent asymmetry in the leakage current of \nferroelectric ScxAl1\u2212xN thin-film capacitors,\u201d in 2021 IEEE 34th \nInternational Conference on Micro Electro Mechanical \nSystems(MEMS), Gainesville, FL, USA, Jan. 2021, pp. 650-653, \ndoi: 10.1109/MEMS51782.2021.9375451. \n[13] N. Sahu, \u201cStudy of crystal structure and electrical properties on \nlead titanate and lead zirconate titanate based ceramic oxides,\u201d \nPh.D. dissertation, National Institute of Technology, Rourkela, \nOdisha, India, Oct. 2011. \n[14] W. Zhu, J. Hayden, F. He, J. -I. Yang, P. Tipsawat, M. D. Hossain, \nJ.P. Maria, and S. Trolier-McKinstry, \u201cStrongly temperature \ndependent ferroelectric switching in AlN, Al1-xScxN, and \nAl1-xBxN thin films,\u201d Appl. Phys. Lett., vol. 119, no. 6, pp. 62901, \nAug. 2021, doi: 10.1063/5.0057869. \n[15] D. Drury, K. Yazawa, A. Zakutayev, B. Hanrahan, and G. \nBrennecka, \n\u201cHigh-temperature \nferroelectric \nbehavior \nof \nAl0.7Sc0.3N,\u201d Micromachines, vol. 13, no. 6, pp. 887, May 2022, \ndoi: 10.3390/mi13060887. \n[16] Z. Xu, M. Houssa, S. De Gendt, and M. Heyns, \u201cPolarity effect on \nthe temperature dependence of leakage current through \nHfO2/SiO2 gate dielectric stacks,\u201d Appl. Phys. Lett., vol. 80, no. \n11, pp. 1975-1977, Mar. 2002, doi: 10.1063/1.1435411. \n[17] S. 
Jindal, S. K. Manhas, S. Balatti, A. Kumar, and M. Pakala, \n\u201cTemperature-dependent field cycling behavior of ferroelectric \nhafnium zirconium oxide (HZO) MFM capacitors,\u201d IEEE Trans. \nElectron Devices, vol. 69, no. 7, pp. 3990-3996, Jul. 2022, doi: \n10.1109/TED.2022.3172244. \n[18] S. -L. Tsai, T. Hoshii, H. Wakabayashi, K. Tsutsui, T. -K. Chung, \nE. Y. Chang, and K. Kakushima, \u201cField cycling behavior and \nbreakdown mechanism of ferroelectric Al0.78Sc0.22N films,\u201d Jpn. J. \nAppl. \nPhys., \nvol. \n61, \npp. \nSJ1005, \nAug. \n2022, \ndoi: \n10.35848/1347-4065/ac54f6. \n[19] S. Fichtner, N. Wolff, F. Lofink, L. Kienle, and B. Wagner, \n\u201cAlScN: A III-V semiconductor-based ferroelectric,\u201d J. Appl. \nPhys., vol. 125, no. 11, pp. 114103, Mar. 2019, doi: \n10.1063/1.5084945. \n[20] S. Yasuoka, T. Shimizu, A. Tateyama, M. Uehara, H. Yamada, M. \nAkiyama, Y. Hiranaga, Y. Cho, and H. Funakubo, \u201cEffects of \ndeposition conditions on the ferroelectric properties of (Al1 \u2212 \nxScx)N thin films,\u201d Journal of Applied Physics, vol. 128, no. 11, \npp. 114103, Sep. 2020, doi: 10.1063/5.0015281. \n[21] K. D. Kim, Y. B. Lee, S. H. Lee, I. S. Lee, S. K. Ryoo, S. Y. Byun, \nJ. H. Lee, and C. S. Hwang, \u201cImpact of operation voltage and NH3 \nannealing on the fatigue characteristics of ferroelectric AlScN \nthin films grown by sputtering,\u201d Nanoscale, vol. 15, no. 40, pp. \n16390-16402, Sept. 2023, doi: 10.1039/D3NR02572A. \n[22] L. Chen, Q. Wang, C. Liu, M. Li, W. Song, W. Wang, D. K. Loke, \nand Y. Zhu, \u201cLeakage mechanism and cycling behavior of \nferroelectric Al0.7Sc0.3N,\u201d Materials, vol. 17, no. 2, pp. 397, Jan. \n2024, doi: 10.3390/ma17020397." - }, - { - "domain": "Materials Science", - "chunk_type": "general", - "text": "MIPS is a Maxwell fluid with an extended and non-monotonic crossover\nJos\u00e9 Mart\u00edn-Roca and Chantal Valeriani\u2217\nDept. de Estructura de la Materia, F\u00edsica T\u00e9rmica y Electr\u00f3nica, Universidad Complutense de Madrid, Spain\nKristian Thijssen\nNiels Bohr Institute, University of Copenhagen, Denmark\nTyler Shendruk\u2020\nSchool of Physics and Astronomy, University of Edinburgh, UK\nAngelo Cacciuto\u2021\nDepartment of Chemistry, Columbia University, New York, USA\nUnderstanding the mechanical properties of active suspensions is crucial for their potential ap-\nplications in materials engineering.\nAmong the various phenomena in active matter that have\nno analogue in equilibrium systems, motility-induced phase separation (MIPS) in active colloidal\nsuspensions is one of the most extensively studied. However, the mechanical properties of this fun-\ndamental active state of matter remain poorly understood. This study investigates the rheology of\na suspension of active colloidal particles under constant and oscillatory shear. Systems consisting of\npseudo-hard active Brownian particles exhibiting co-existence of dense and dilute phases behave as a\nviscoelastic Maxwell fluid at low and high frequencies, displaying exclusively shear thinning across a\nwide range of densities and activities. 
Remarkably, the cross-over point between the storage and loss\nmoduli is non-monotonic, rising with activity before the MIPS transition but falling with activity\nafter the transition, revealing the subtleties of how active forces and intrinsically out-of-equilibrium\nphases affect the mechanical properties of these systems.\nActive materials, composed of self-propelled units\nthat consume energy to generate motion, display a vari-\nety of fascinating properties that deviate from those of\nsystems at equilibrium [1, 2]. This includes large cor-\nrelated motion [3, 4], anomalous transport [5\u20139], and\nemergent self-organization [8, 10\u201312]. Motility-Induced\nPhase Separation (MIPS), where a solution of motile\nparticles separates into co-existing dense and dilute re-\ngions in the absence of any attractive interactions, is\na textbook example of a behavior that is exclusive to\nactive systems [13]. The origin of the MIPS transition\nand how it depends on the details of the particle inter-\nactions has been the subject of intense scrutiny [14\u201319].\nThis is because active Brownian particles (ABP) repre-\nsent the simplest active model with a well-defined inter-\nplay between thermal, dispersion and active forces [20].\nThe intrinsically out-of-equilibrium nature of the self-\norganized structures in MIPS implies that traditional\napproaches and even definitions that are workhorses in\nthe context of equilibrium statistical mechanics must be\napplied with care. Striking examples are studies using\ndifferent (and conflicting) definitions of surface tension\nto characterize the statistical properties of the dense-\nfluid MIPS interface [21\u201330].\nDespite standing as possibly one of the most funda-\nmental active states of matter, the mechanical proper-\nties of the MIPS phase remain poorly understood, es-\npecially its response to external driving forces. Much\nof the previous work on externally driven ABPs has\nfocused on very dense or crystalline suspensions in\nwhich activity can fluidize an otherwise \"jammed\" sys-\ntem [31], and induce either shear thinning [32] or shear\nthickening [33] depending on the softness of the par-\nticles.\nIn contrast, the rheological properties of di-\nlute ABPs, where activity drives large-scale density\nvariations through MIPS [13], is largely unexplored.\nCrucially, while MIPS is often described as a liquid-\ngas phase separation due to its mathematical analogy\nwith equilibrium systems [34], a significant degree of\nlocal crystalline order is detectable within the dense\nphase [13]. The translational symmetry breaking sug-\ngests that the dense phase is more akin to a solid crystal\nthan a liquid droplet. This apparent paradox raises key\nquestions about the rheological nature of such systems,\nsince broken symmetries generally result in elasticity \u2014\nat least in passive materials. Traditionally, the rheolog-\nical properties of passive systems arise from a thermal\nrelaxation in response to external driving. However, ac-\ntive materials behave differently because of the intrin-\nsically out-of-equilibrium pathways by which emergent\nphenomena and novel behaviors can arise in response to\ndriving forces. 
Notable examples of active responses to\nexternal driving include super-fluid like behavior [35],\nthe emergence of higher-order defects [36], non-linear\nDarcy\u2019s law [37, 38] and negative drag [39].\nWhile the field of active matter holds promise for\ndeveloping exciting materials with unusual mechani-\ncal properties, understanding how these materials re-\nspond to external driving forces is key to comprehend-\ning, controlling and designing their mechanical behav-\nior. Studying the rheology of the MIPS phase is a first\nfundamental step in this direction.\nIn this work, we\narXiv:2504.10332v1 [cond-mat.soft] 14 Apr 2025\n2\nstudy the rheological behavior of a suspension of pseudo-\nhard ABPs across the MIPS transition using numerical\nsimulations. We consider both oscillatory and steady\nshear, and gain new insight into the unusual mechanical\nproperties of the MIPS state. We show that the rheo-\nlogical response is akin to a Maxwell fluid at low and\nhigh frequencies, but has an extended crossover regime\nat intermediate frequencies. Furthermore, we discover\nthat the crossover point is non-monotonic with activity\nsince the timescale switches from one associated with\nthe thermal component of the individual active particles\nto one associated with their active persistent motion.\nWe perform Brownian dynamics simulations of N\nABP of diameter d in two dimensions.\nThe equa-\ntions of motion for the position ri and orientation\nni = [cos(\u03b8i), sin(\u03b8i)] of particle i are given by\ndri\ndt = \u22121\n\u03b6 \u2207Ui + v0 ni +\np\n2DT\u03bei\n(1a)\nd\u03b8i\ndt =\np\n2DR \u03bei\n(1b)\nwhere \u03b6 is the translational friction coefficient, Ui =\nPN\nj\u0338=i Uij is the total pairwise potential for particle i,\nv0 is the self-propulsion speed of the particle, DT is the\ntranslation diffusion, DR = 3DT/d2 the rotation diffu-\nsion and \u03bei is the white Gaussian noise with \u27e8\u03bei\u27e9= 0;\n\n\u03bei(t) \u2297\u03bej(t\u2032)\n\u000b\n= 1\u03b4ij \u03b4(t \u2212t\u2032).\nThe self-propulsion\nspeed can be imagined as due to an active force of\nstrength Fa = \u03b6 v0. We resolve the equations of mo-\ntion using LAMMPS [40]. The excluded volume inter-\naction between the particles is enforced via the pairwise\npseudo-hard sphere potential [41]\nUij =\n\uf8f1\n\uf8f2\n\uf8f3\n\u03f5 50\n\u0000 50\n49\n\u000149\n\u0014\u0010\nd\nrij\n\u001150\n\u2212\n\u0010\nd\nrij\n\u001149\u0015\n+ \u03f5,\nrij < 50\n40 d\n0,\notherwise\n(2)\nwhere rij the interparticle distance and \u03f5 the charac-\nteristic energy of the system with Lennard-Jones units.\nThroughout this work, we used dimensionless units,\nwith \u03f5 = 1, d = 1 and the time unit \u03c4 = 1. Since at\nequilibrium the potential reproduces the second virial\ncoefficient of hard spheres when the temperature is\nkBT = 1.5\u03f5 [41], we set T = 1.5, which gives DT = 1.5\nand DR = 4.5. The rest side length of the simulation\nbox is Lx = Ly = 100, and the number of particles is\nobtained from the number density, \u03c1 = 0.5 unless oth-\nerwise stated.\nWhile the system undergoes strain deformations,\nLees-Edwards boundary conditions are applied [42, 43].\nWe use a strain that does not impart any torque on the\norientation of the particles (inclusion of such torque does\nnot change the reported behavior [32, 44, 45]). When a\nstrain \u03b3 with strain rate \u02d9\u03b3 is applied to the active sys-\ntem, the dynamics of individual particles are governed\nby two dimensionless numbers Pes and Pea [33]. The\nFIG. 
1.\nStorage moduli G\u2032 (closed circles) and loss moduli\nG\u2032\u2032 (open squares) with oscillation amplitude Ax/Ly = 5%\n(SAOS regime) for \u03c1 = 0.5 and different activities Pea = 0\n(blue), 42 (green) and 120 (red). Dashed and straight lines\nare visual guides to show the expected maxwell scaling of \u03c91\nand \u03c92.\nshear P\u00e9clet number\nPes = \u02d9\u03b3 d2\nDT\n(3)\nprovides information about the relevance of the imposed\nshear rate with respect to thermal fluctuations, while\nthe Active P\u00e9clet number\nPea = 3v0\nDRd\n(4)\nmeasures the ratio between the orientation decorrelation\ntime \u03c4R = 1/DR and self-propulsion time \u03c4a = d/v0.\nThe stress response to strain deformations is calcu-\nlated as\n\u03c3 = \u2212\n1\n2LxLy\nN\nX\ni=1\nX\ni>j\n\u2202U\n\u2202rij\nrij \u2297rij\nrij\n,\n(5)\nwhich gives the virial contribution due to interparticle\ninteractions (Eq. 2). \"Swimming\" stress, which is the\nself-propelled contribution [46\u201348], is not taken into ac-\ncount because the orientations are decoupled from the\nshear flow in these simulations, such that off-diagonal\nterms average to zero [32, 33, 44].\nIf the amplitude of the applied deformation is suffi-\nciently small, the system responds linearly with stress\nproportional to strain, \u03b3(t) = (Ax/Ly) sin(\u03c9 t), and\nstrain rate, \u02d9\u03b3(t) = (Ax\u03c9/Ly) cos(\u03c9 t), where \u03c9 is the\nfrequency of the oscillatory shear. In this regime, only\n3\nthe first-order natural modes of the system are excited\n[49, 50], corresponding to the Small Amplitude Oscilla-\ntory Shear (SAOS) regime and the resulting stress re-\nsponse is\n\u03c3xy(t) = G\u2032\u03b3(t) + G\u2032\u2032\u03c9 \u02d9\u03b3(t),\n(6)\nwhere G\u2032 and G\u2032\u2032 are the storage and loss moduli, re-\nspectively. To ensure that the response of the system\nremains linear with applied deformation, an amplitude\nsweep at fixed frequency is performed to identify the\nrange of amplitudes for which G\u2032 and G\u2032\u2032 are indepen-\ndent of the strain amplitude (SI [51]).\nWe find that\nAx/Ly = 5% ensures that the system operates within\nthe linear regime across a broad range of frequencies\nwhile providing a sufficiently high measurable response.\nFigure 1 shows the interplay between the elastic, G\u2032,\nand viscous, G\u2032\u2032, responses of the system for different\nvalues of Pea, and includes data below (blue), above\n(red) and at (green) the MIPS transition (see SI [51]).\nInterestingly, regardless of Pea, the system shows a\nliquid-like response (G\u2032\u2032 > G\u2032) at low frequencies (small\nPes). This is an expected result for the passive system\nwhich is characterized by a single low density fluid phase\n(blue data).\nHowever, the behavior above the MIPS\ntransition point (red data) indicates that the elastic re-\nsponse is still compatible with that of a liquid rather\nthan that of a solid, despite the high degree of crystalline\norder observed within the fully formed condensed phase\n(Fig. 2; inset).\nThis result is not expected for pas-\nsive materials and suggests that non-insignificant par-\nticle rearrangement occurs within the active condensed\nclusters under the action of low-frequency shear forces.\nAs the frequency increases from low Pes, both moduli\nincrease. For the systems in this regime, G\u2032\u2032 \u223c\u03c9 and\nG\u2032 \u223c\u03c92. 
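In practice the two moduli are obtained by projecting the measured stress onto the in-phase and out-of-phase components of the imposed strain, since in the SAOS regime the stress reduces to sigma_xy(t) = gamma0 [G' sin(wt) + G'' cos(wt)]. The NumPy sketch below performs this projection on a synthetic stress trace generated from a single-mode Maxwell model; it illustrates the analysis only, is not the post-processing used for Fig. 1, and the modulus G0 and relaxation time lam are arbitrary assumed values.

import numpy as np

# Imposed SAOS strain: gamma(t) = gamma0 * sin(w * t)
gamma0, w = 0.05, 2.0
n_cycles, pts_per_cycle = 50, 400
t = np.linspace(0.0, 2.0 * np.pi * n_cycles / w,
                n_cycles * pts_per_cycle, endpoint=False)

# Stand-in stress trace from a single-mode Maxwell fluid (assumed G0, lam),
# with a little noise to mimic simulation output.
G0, lam = 1.0, 0.5
Gp_true = G0 * (lam * w) ** 2 / (1.0 + (lam * w) ** 2)   # storage modulus
Gpp_true = G0 * (lam * w) / (1.0 + (lam * w) ** 2)       # loss modulus
sigma = gamma0 * (Gp_true * np.sin(w * t) + Gpp_true * np.cos(w * t))
sigma += 1e-4 * np.random.default_rng(1).standard_normal(t.size)

# Project the stress onto sin(wt) and cos(wt) over an integer number of periods.
Gp_est = 2.0 * np.mean(sigma * np.sin(w * t)) / gamma0
Gpp_est = 2.0 * np.mean(sigma * np.cos(w * t)) / gamma0
print(f"G'  = {Gp_est:.4f}  (target {Gp_true:.4f})")
print(f"G'' = {Gpp_est:.4f}  (target {Gpp_true:.4f})")

Repeating the projection over a sweep of frequencies yields the G'' ~ w and G' ~ w^2 scalings just described.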
Interestingly, these exponents are precisely\nwhat one would expect for a Maxwell fluid in the low\nfrequency limit [52].\nAt high frequencies (large Pes), the storage modulus\nG\u2032 approaches a plateau (Fig. 1). This is independent\nof Pea and occurs for both the passive and the active\nsystems. Simultaneously, the loss modulus G\u2032\u2032 is dras-\ntically reduced as the oscillation period becomes much\nshorter than the typical relaxation time, causing the\nsystem to behave as a purely elastic solid for small am-\nplitudes. In this regime, the deformation of the system\nand the displacement of the particles are synchronized,\nresulting in the loss modulus G\u2032\u2032 \u21920. This too is what\none expects for a Maxwell fluid.\nSince both moduli increase at low frequencies, with\nG\u2032 rising faster than G\u2032\u2032, the curves meet at a crossover\npoint, Pe\u2020\ns, which establishes the dominant relaxation\ntime of the material (\u223c1/Pe\u2020\ns), and the onset value\nabove which the system exhibits an elastic (solid-like)\nresponse against the external shear forces. For the pas-\nsive system (Pea = 0), the crossover occurs at a shear\nrate of Pe\u2020\ns \u224330 (Fig. 2). As activity is introduced,\nFIG. 2. Dimensionless crossover frequency Pe\u2020\ns for storage\nG\u2032 and loss G\u2032\u2032 moduli shown in Fig. 1 as a function of\ndimensionless activity Pea at density \u03c1 = 0.5 for oscillation\namplitude Ax/Ly = 5% (SAOS regime). Shaded area rep-\nresents the MIPS phase separation boundary of the state\ndiagram (see SI [51]). Insets: Snapshots of the system for\nthe colored point at different activities.\nFIG. 3. Effective viscosity \u03b7 as a function of the dimension-\nless frequency Pes with oscillation amplitude Ax/Ly = 5%\n(SAOS regime) for density \u03c1 = 0.5 and different activities:\nPea = 0 (blue), 42 (green) and 120 (red). All data points\ncomputed from Fig. 1 using Eq. 7. Inset: Zero-shear limit\nof effective viscosity \u03b70 = limPes\u21920 \u03b7 as a function of the\nactivity. Colored points follows same color code as in Fig. 2\nPe\u2020\ns increases with Pea until it reaches a maximum right\nbefore the MIPS transition point where the suspension\n4\nbegins to phase separate.\nAt the transition point, a\nsharp drop in Pe\u2020\ns occurs, which eventually saturates to\nPe\u2020\ns \u22435 at high activities once large clusters develop\nwithin the suspension.\nThe non-monotonic behavior of the crossover point\nstresses the non-equilibrium nature of the MIPS tran-\nsition.\nBefore the system phase separates, the active\nforces effectively increase the fluidity of the gas rather\nthan promoting rigidity as we approach the MIPS tran-\nsition. The linear growth of Pe\u2020\ns with the activity, points\nto the active timescale \u03c4a = d/v0 as the most relevant\ncharacteristic time for the response in this regime.\nIt is tempting to associate the role of activity in this\ngas-like regime with a factor that increases an effec-\ntive temperature of the system, but the nature of the\ntransition is more subtle. Indeed, as the system crosses\nthe MIPS transition, particles begin to aggregate into\nclusters. This is because their rotational diffusion time\n\u03c4r = 1/DR determines the rate at which particles can\nescape the surface of these clusters [53, 54]. 
Contrary\nto the previous case, in this regime activity takes the\nrole of an effective attraction that drives the formation\nof dense clusters and \u03c4r becomes the relevant timescale\nfor the system response, as indicated by the constant\nvalue of Pe\u2020\ns above the MIPS transition (Fig. 2).\nTogether, these considerations explain why MIPS re-\nsponds like a Maxwell fluid, with a single characteristic\ntime scale at low and high frequencies, but with a non-\nmonotonic crossover point which depends on activity:\nAt low activities (in the gas-like phase below MIPS), it\nis the active time scale \u03c4a that dominates the response\nand so Pe\u2020\ns grows linearly with Pea \u223cv0. At the MIPS\ntransition, the system begins to phase separate and the\ncrossover point drops sharply to a value Pe\u2020\ns \u22435 that\nis independent of activity because the system is dom-\ninated by the presence of large clusters and the relax-\nation time associated with the clusters is \u03c4r, which now\nsets the crossover point, independent of activity. The\npresence of the dense clusters and the purely thermal\nnature of \u03c4r, explain the drop in value of Pe\u2020\ns and its\nindependence of Pea. While the low and high frequency\nlimits are dominated by single timescales, the response\nat intermediate frequencies around the crossover Pe\u2020\ns is\nmore complex, exhibiting an extended regime compared\nto Maxwell fluids (Fig. 1). While the low and high-\nfrequency limits scale like a simple Maxwell-like fluid,\nthe crossover cannot be captured with a single timescale\nand so the moduli cannot be fit to a single Maxwell fluid\nacross all frequencies.\nOverall, the behavior of the G\u2032/G\u2032\u2032 as shown in Fig. 1\nand that of Pe\u2020\ns in Fig. 2 suggests that suspensions of\nABPs both below and above the MIPS transition have\na fairly simple response akin to a Maxwell fluid.\nA\nMaxwell model of a fluid is characterized by a constant\ncomplex viscosity at low frequencies and shear thinning\nat high frequencies.\nTo further explore this analogy,\nwe next extract the effective shear viscosity of the ABP\nsystem using\n\u03b7 =\ns\u0012G\u2032\n\u03c9\n\u00132\n+\n\u0012G\u2032\u2032\n\u03c9\n\u00132\n.\n(7)\nThis definition has been previously employed to uncover\nnon-trivial behavior in various active systems, ranging\nfrom increasing viscosity with activity to negative vis-\ncosities [55\u201357].\nPseudo-hard ABPs show that the zero-frequency limit\nof the viscosity does not increase above the passive val-\nues (Fig. 3; blue and green curves). A clear transition\nfrom Newtonian to shear-thinning behavior is observed\nas Pes increases, when a marked decay of the viscos-\nity is observed.\nThe zero-frequency limit of the vis-\ncosity is constant until the MIPS transition, at which\npoint it suddenly begins to rise (Fig. 3; inset).\nFor\nthis low activity limit, additional activity broadens the\nrange over which the suspension maintains Newtonian-\nlike behavior with a constant viscosity (Fig. 3; green).\nThe situation is different above the MIPS transition\n(Fig. 3; red), where the zero-frequency viscosity is sig-\nnificantly larger than that of the passive system, sug-\ngesting that MIPS clusters enhance viscous dissipation.\nFurthermore, the shear thinning onset above the MIPS\ntransition occurs at frequencies that are smaller than\nthose of the active suspensions below the MIPS transi-\ntion, effectively reducing the Newtonian range. 
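For reference, evaluating Eq. (7) for a single-mode Maxwell model makes the two limits invoked in this discussion explicit: a constant zero-frequency plateau eta0 = G0*lam and a 1/w decay once w*lam >> 1. The following sketch is purely illustrative, with an arbitrary assumed modulus and relaxation time.

import numpy as np

G0, lam = 1.0, 0.5                 # assumed Maxwell modulus and relaxation time
w = np.logspace(-2, 3, 11)         # dimensionless frequencies

# Single-mode Maxwell moduli.
Gp = G0 * (lam * w) ** 2 / (1.0 + (lam * w) ** 2)
Gpp = G0 * (lam * w) / (1.0 + (lam * w) ** 2)

# Effective viscosity as defined in Eq. (7).
eta = np.sqrt((Gp / w) ** 2 + (Gpp / w) ** 2)

for wi, ei in zip(w, eta):
    print(f"w = {wi:9.3f}   eta = {ei:8.4f}")

print("zero-frequency plateau G0*lam :", G0 * lam)
print("high-frequency check eta*w/G0 :", eta[-1] * w[-1] / G0)   # -> ~1, i.e. eta ~ G0/w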
At suf-\nficiently large shear (Pes), all curves exhibit a charac-\nteristic thinning behavior, with the viscosity decaying\nproportionally as \u03b7 \u223cPe\u22121\ns\n\u223c\u03c9\u22121. This decay arises\nat high frequencies because G\u2032 \u226bG\u2032\u2032 and the storage\nmodulus remains roughly constant, leading to the scal-\ning \u03b7 \u2248G\u2032/\u03c9 \u223c\u03c9\u22121. This result has been previously\nobserved in other active systems [32, 58].\nTo further investigate the shear-thinning behavior ob-\nserved in the high-frequency limit, we use non-linear\nconstant shear measurements.\nUnder constant shear,\nat high rates the shear response excites higher order\nmodes. For Pes beyond the cross-over point, the stress\nfollows a simple power law (Fig. 4a) which is charac-\nterized using the Herschel-Bulkley model [59]\n\u03c3xy = A + B \u02d9\u03b3n.\n(8)\nA Herschel-Bulkley exponent n smaller than 1 is indica-\ntive of shear-thinning behavior, while n > 1 is associated\nto shear thickening and n = 1 corresponds to a classical\nNewtonian fluid. Figure 4b shows the Herschel-Bulkley\nexponent n obtained by fitting our numerical data to\nthe Herschel-Bulkley model for different values of the\nactive P\u00e9clet number at a fixed density \u03c1 = 0.5. The\ninset shows n at a fixed active force, Pea = 120, as a\nfunction of the system density \u03c1 (see SI [51]). Crucially,\nthe power decreases monotonically with both parame-\nters and is systematically less than 1, indicating shear-\nthinning behavior at high frequencies for all activities\n5\nFIG. 4. (a) Average out-of-diagonal stress component \u27e8\u03c3xy\u27e9for simulations with constant shearing as a function of dimen-\nsionless shear rate Pes for density \u03c1 = 0.5 and dimensionless activities Pea=6 (blue), 42 (green), 66 (orange) and 120 (red).\n(b) Power law exponents for the curves shown in panel (a) as a function of dimensionless activity Pea for \u03c1 = 0.5. Inset:\nThe dependency of the exponent n on density \u03c1 for Pea = 120.\nand densities. This result suggests that increased ac-\ntivity in ABP suspensions enhances fluidization both\nbelow and above the MIPS transition. Consequently,\nthis leads to a decrease in the shear-thinning exponent\nas activity increases.\nPrevious studies on the rheology of active colloidal\nparticles under constant shear have reported shear-\nthickening behavior [32, 33]. However, these found shear\nthickening only for high shear rates and extreme den-\nsities [33]. The primary differences between our results\nand previous studies is that previous studies focus ex-\nclusively on soft overlapping particles at densities well\nabove that of closed-packed disks, while we consider the\neffect of shear forces on pseudo-hard ABP across the\nMIPS transition.\nWhile previous studies have considered the phase be-\nhavior of MIPS and proposed useful analogies to co-\nexistance in passive liquid-liquid phase separation [60],\nthe study presented here has considered the rheological\nresponse of MIPS for the first time. The results have\nrevealed a rheological analogy: MIPS responds like a\nMaxwell fluid at low and high frequencies, but has an ex-\ntended crossover regime at intermediate frequencies and\na non-monotonic crossover point that follows the active\ntimescale below the MIPS transition and the rotational\nrelaxation timescale above. 
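For completeness, the fit used to extract the exponents shown in Fig. 4b can be sketched as follows. The sketch assumes SciPy is available and that the constant-shear data are provided as arrays of shear rates and time-averaged stresses; the initial guess and bounds are illustrative choices rather than the exact settings used for the fits reported here.

import numpy as np
from scipy.optimize import curve_fit

def herschel_bulkley(gamma_dot, A, B, n):
    # Eq. 8: sigma_xy = A + B * gamma_dot^n
    return A + B * gamma_dot ** n

def fit_hb_exponent(gamma_dot, sigma_xy):
    # gamma_dot: dimensionless shear rates (Pe_s); sigma_xy: average stresses <sigma_xy>.
    p0 = (0.0, 1.0, 1.0)                                    # start from a Newtonian guess
    bounds = ([0.0, 0.0, 0.0], [np.inf, np.inf, 2.0])
    (A, B, n), _ = curve_fit(herschel_bulkley, gamma_dot, sigma_xy, p0=p0, bounds=bounds)
    return A, B, n                                          # n < 1 indicates shear thinning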
In addition to demonstrat-\ning how rheological measurements can be used by future\nstudies to reveal more about self-assembled structures in\nintrinsically out-of-equilibrium materials, these findings\nshow that even simple realizations of activity can re-\nsult in complex and exciting rheological responses since\nactive materials possess alternative routes to respond\nbeyond the thermodynamically accessible pathways of\npurely passive materials.\nACKNOWLEDGEMENTS\nThis research has received funding (T.N.S. and\nK.T.) from the European Research Council under the\nEuropean Union\u2019s Horizon 2020 research and inno-\nvation programme (Grant Agreement Nos.\n851196\nand 101029079).\nThis work was further supported\n(K.T.) by the Novo Nordisk Foundation grants no.\nNNF18SA0035142 and NNF23OC0085012. J.M.R. ac-\nknowledges financial support from the UCM predoc-\ntoral contract (call CT15/23).\nC.V. acknowledges\nfundings IHRC22/00002 and PID2022-140407NB-C21\nfrom MINECO. We acknowledge useful discussions with\nMarco Mazza and James Richards. For the purpose of\nopen access, the author has applied a Creative Com-\nmons Attribution (CC BY) licence to any Author Ac-\ncepted Manuscript version arising from this submission.\nThe authors thank the Nordita Institute, since the idea\nof the project was originated during the \"Current and\nFuture Themes in Soft and Biological Active Matter\"\nworkshop.\n\u2217cvaleriani@ucm.es\n\u2020 t.shendruk@ed.ac.uk\n6\n\u2021 ac2822@columbia.edu\n[1] S. Ramaswamy, Journal of Statistical Mechanics: The-\nory and Experiment 2017, 054002 (2017).\n[2] C. Bechinger, R. Di Leonardo, H. L\u00f6wen, C. Reichhardt,\nG. Volpe, and G. Volpe, Reviews of Modern Physics 88,\n045006 (2016).\n[3] P. Baconnier, O. Dauchot, V. D\u00e9mery, G. D\u00fcring,\nS. Henkes, C. Huepe,\nand A. Shee, Reviews of Mod-\nern Physics 97, 015007 (2025).\n[4] S. A. Mallory, C. Valeriani,\nand A. Cacciuto, Annual\nReview of Physical Chemistry 69, 59 (2018).\n[5] S. Williams, R. Jeanneret, I. Tuval, and M. Polin, Na-\nture Communications 13, 4776 (2022).\n[6] Z. Liao, M. Han, M. Fruchart, V. Vitelli, and S. Vaikun-\ntanathan, The Journal of Chemical Physics 151 (2019).\n[7] J. Mart\u00edn-Roca, C. M. Barriuso G, R. Mart\u00ednez Fern\u00e1n-\ndez, C. Betterelli Giuliano, R. Zhang, C. Valeriani, and\nL. G. Wilson, Proceedings of the National Academy of\nSciences 122, e2409510121 (2025).\n[8] O. Granek, Y. Kafri, and J. Tailleur, Physical Review\nLetters 129, 038001 (2022).\n[9] N. Koumakis, C. Maggi, and R. Di Leonardo, Soft Mat-\nter 10, 5695 (2014).\n[10] S. Ramaswamy, Annual Review of Condensed Matter\nPhysics 1, 323 (2010).\n[11] S. Rana, M. Samsuzzaman, and A. Saha, Soft Matter\n15, 8865 (2019).\n[12] Z. Peng and R. Kapral, Soft Matter 20, 1100 (2024).\n[13] M. E. Cates and J. Tailleur, Annual Review of Con-\ndensed Matter Physics 6, 219 (2015).\n[14] A. Torres-Carbajal and F. J. Sevilla, Physics of Fluids\n36 (2024).\n[15] J. Martin-Roca, R. Martinez, L. C. Alexander, A. L.\nDiez, D. G. Aarts, F. Alarcon, J. Ram\u00edrez, and C. Va-\nleriani, The Journal of Chemical Physics 154 (2021).\n[16] J. Stenhammar, D. Marenduzzo, R. J. Allen, and M. E.\nCates, Soft Matter 10, 1489 (2014).\n[17] T. Kolb and D. Klotsa, Soft Matter 16, 1967 (2020).\n[18] M. E. Cates and J. Tailleur, Europhysics Letters 101,\n20010 (2013).\n[19] A. P. Solon, J. Stenhammar, R. Wittkowski, M. Kardar,\nY. Kafri, M. E. Cates, and J. Tailleur, Physical Review\nLetters 114, 198301 (2015).\n[20] M. C. Marchetti, Y. Fily, S. 
Henkes, A. Patch,\nand\nD. Yllanes, Current Opinion in Colloid & Interface Sci-\nence 21, 34 (2016).\n[21] J. Bialke, J. Siebert, H. Lowen, and T. Speck, Physical\nReview Letters 115, 098301 (2018).\n[22] A.\nPatch,\nD.\nM.\nSussman,\nD.\nYllanes,\nand\nM. Marchetti, Soft Matter 14, 7435 (2018).\n[23] A. Omar, Z.-G. Wang, and J. Brady, Physical Review\nE 101, 012604 (2020).\n[24] J. T. Siebert, F. Dittrich, F. Schmid, K. Binder,\nT. Speck, and P. Virnau, Physical Review E 98, 030601\n(2018).\n[25] F. Ginot, I. Theurkauff, D. Levis, C. Ybert, L. Bocquet,\nL. Berthier, and C. Cottin-Bizonne, Physical Review X\n5, 011004 (2015).\n[26] Y. Fily, A. Baskaran,\nand M. F. Hagan, Soft Matter\n10, 5609 (2014).\n[27] S. C. Takatori, W. Yan, and J. F. Brady, Physical Re-\nview Letters 113, 1 (2014).\n[28] S. A. Mallory, A. K. Omar, and J. F. Brady, Physical\nReview E 104, 044612 (2021).\n[29] E. Chac\u00f3n, F. Alarc\u00f3n, J. Ram\u00edrez, P. Tarazona, and\nC. Valeriani, Soft Matter 18, 2646 (2022).\n[30] S. Hermann, D. de las Heras, and M. Schmidt, Physical\nReview Letters 123, 268002 (2019).\n[31] S. Ongenae, M. Cuvelier, J. Vangheel, H. Ramon, and\nB. Smeets, Frontiers in Physics 9, 649821 (2021).\n[32] R. Wiese, K. Kroy, and D. Levis, Phys. Rev. Lett. 131,\n178302 (2023).\n[33] A. G. Bayram, F. J. Schwarzendahl, H. L\u00f6wen,\nand\nL. Biancofiore, Soft Matter 19, 4571 (2023).\n[34] L. Hecht, I. Dong, and B. Liebchen, Nature Communi-\ncations 15, 3206 (2024).\n[35] S. Takatori and J. Brady, Physical Review Letters 118,\n018003 (2017).\n[36] D. P. Rivas, T. N. Shendruk, R. R. Henry, D. H. Reich,\nand R. L. Leheny, Soft Matter 16, 9331 (2020).\n[37] F. Mackay, J. Toner, A. Morozov, and D. Marenduzzo,\nPhysical Review Letters 124, 187801 (2020).\n[38] R. R. Keogh, T. Kozhukhov, K. Thijssen,\nand T. N.\nShendruk, Physical Review Letters 132, 188301 (2024).\n[39] G. Foffano, J. S. Lintuvuori, K. Stratford, M. E.\nCates,\nand D. Marenduzzo, Physical Review Letters\n109, 028103 (2012).\n[40] A. P. Thompson, H. M. Aktulga, R. Berger, D. S. Bolin-\ntineanu, W. M. Brown, P. S. Crozier, P. J. in \u2019t Veld,\nA. Kohlmeyer, S. G. Moore, T. D. Nguyen, R. Shan,\nM. J. Stevens, J. Tranchida, C. Trott, and S. J. Plimp-\nton, Computer Physics Communications 271, 108171\n(2022).\n[41] J. Jover, A. Haslam, A. Galindo, G. Jackson,\nand\nE. M\u00fcller, The Journal of Chemical Physics 137, 144505\n(2012).\n[42] A. Lees and S. Edwards, Journal of Physics C: Solid\nState Physics 5, 1921 (1972).\n[43] M. P. Allen and D. J. Tildesley, Computer simulation\nof liquids (Oxford university press, 2017).\n[44] R. Mandal and P. Sollich, Proceedings of the National\nAcademy of Sciences 118, e2101964118 (2021).\n[45] C. Villarroel and G. D\u00fcring, Soft Matter 17, 9944\n(2021).\n[46] R. G. Winkler, A. Wysocki,\nand G. Gompper, Soft\nMatter 11, 6680 (2015).\n[47] S. C. Takatori, W. Yan, and J. F. Brady, Physical Re-\nview Letters 113, 028103 (2014).\n[48] Y. Hatwalne, S. Ramaswamy, M. Rao, and R. A. Simha,\nPhysical Review Letters 92, 118101 (2004).\n[49] H. A. Barnes, J. F. Hutton, and K. Walters, An intro-\nduction to rheology, Vol. 3 (Elsevier, 1989).\n[50] U. Daalkhaijav, (2018).\n[51] See Supplemental Material at URL-will-be-inserted-by-\npublisher.\n[52] M. Grimm, S. Jeney, and T. Franosch, Soft Matter 7,\n2076 (2011).\n[53] A. K. Omar, Y. Wu, Z.-G. Wang, and J. F. Brady, ACS\nnano 13, 560 (2018).\n[54] M. C. Pedersen, S. Mukherjee, A. Doostmohammadi,\nC. Mondal, and K. 
Thijssen, Physical Review Letters\n133, 228301 (2024).\n[55] D. Saintillan, Experimental Mechanics 50, 1275 (2010).\n[56] S. Heidenreich, S. Hess, and S. H. Klapp, Physical Re-\nview E 83, 011907 (2011).\n[57] J. Jara, F. Alarc\u00f3n, A. K. Monnappa, J. I. Santos,\nV. Bianco, P. Nie, M. P. Ciamarra, \u00c1. Canales, L. Di-\nnis, I. L\u00f3pez-Montero, et al., Frontiers in microbiology\n7\n11, 588884 (2021).\n[58] M. Tennenbaum, Z. Liu, D. Hu,\nand A. Fernandez-\nNieves, Nature Materials 15, 54 (2016).\n[59] P. Saramito, Complex fluids (Springer, 2016).\n[60] A. K. Omar, H. Row, S. A. Mallory, and J. F. Brady,\nProceedings of the National Academy of Sciences 120,\ne2219900120 (2023)." - }, - { - "domain": "Materials Science", - "chunk_type": "general", - "text": "DUE: A Deep Learning Framework and Library for Modeling\nUnknown Equations \u2217\nJunfeng Chen\u2020, Kailiang Wu\u2021, AND Dongbin Xiu\u00a7\nAbstract.\nEquations, particularly differential equations, are fundamental for understanding\nnatural phenomena and predicting complex dynamics across various scientific and engineering disci-\nplines. However, the governing equations for many complex systems remain unknown due to intri-\ncate underlying mechanisms. Recent advancements in machine learning and data science offer a new\nparadigm for modeling unknown equations from measurement or simulation data. This paradigm\nshift, known as data-driven discovery or modeling, stands at the forefront of artificial intelligence\nfor science (AI4Science), with significant progress made in recent years. In this paper, we introduce\na systematic framework for data-driven modeling of unknown equations using deep learning. This\nversatile framework is capable of learning unknown ordinary differential equations (ODEs), partial\ndifferential equations (PDEs), differential-algebraic equations (DAEs), integro-differential equations\n(IDEs), stochastic differential equations (SDEs), reduced or partially observed systems, and non-\nautonomous differential equations.\nBased on this framework, we have developed Deep Unknown\nEquations (DUE), an open-source software package designed to facilitate the data-driven modeling\nof unknown equations using modern deep learning techniques. DUE serves as an educational tool for\nclassroom instruction, enabling students and newcomers to gain hands-on experience with differential\nequations, data-driven modeling, and contemporary deep learning approaches such as fully connected\nneural networks (FNN), residual neural networks (ResNet), generalized ResNet (gResNet), operator\nsemigroup networks (OSG-Net), and Transformers from large language models (LLMs). Addition-\nally, DUE is a versatile and accessible toolkit for researchers across various scientific and engineering\nfields. It is applicable not only for learning unknown equations from data but also for surrogate mod-\neling of known, yet complex, equations that are costly to solve using traditional numerical methods.\nWe provide detailed descriptions of DUE and demonstrate its capabilities through diverse examples,\nwhich serve as templates that can be easily adapted for other applications. The source code for DUE\nis available at https://github.com/AI4Equations/due.\nKey words. education software, differential equations, deep learning, neural networks\nAMS subject classifications. 68T07, 65-01, 65-04, 37M99, 65M99, 65P99\n1. Introduction. Equations, especially differential equations, form the founda-\ntion of our understanding of many fundamental laws. 
They help human unlock the\nmysteries of microscopic particles, decipher the motion of celestial bodies, predict\nclimate changes, and explore the origins of the universe. Differential equations have\nwidespread applications across disciplines such as physics, chemistry, biology, and epi-\ndemiology. Traditionally, these equations were derived from first principles. However,\nfor many complex systems, the governing equations remain elusive due to intricate\nunderlying mechanisms.\nRecent advancements in machine learning and data science are revolutionizing\nhow we model dynamics governed by unknown equations. This paradigm shift, known\nas data-driven discovery or modeling, stands at the forefront of artificial intelligence\nfor science (AI4Science). In the past few years, significant progress has been made in\nlearning or discovering unknown equations from data. Techniques such as symbolic\n\u2217J. Chen and K. Wu were partially supported by NSFC grants (No. 92370108 and No. 12171227)\nand Shenzhen Science and Technology Program (No. RCJC20221008092757098).\n\u2020Department of Mathematics and Shenzhen International Center for Mathematics, Southern Uni-\nversity of Science and Technology, Shenzhen 518055, China (chenjf2@sustech.edu.cn).\n\u2021Corresponding author.\nDepartment of Mathematics and Shenzhen International Center for\nMathematics, Southern University of Science and Technology, Shenzhen, Guangdong 518055, China\n(wukl@sustech.edu.cn).\n\u00a7Department\nof\nMathematics,\nThe\nOhio\nState\nUniversity,\nColumbus,\nOH\n43210,\nUSA\n(xiu.16@osu.edu).\n1\narXiv:2504.10373v1 [cs.LG] 14 Apr 2025\n2\nJ. CHEN, K. WU, D. XIU\nregression [3, 57], sparsity-promoting regression [7, 62, 59, 6, 54, 55, 43, 4], Gaussian\nprocesses [48], polynomial approximation [64, 63, 1], linear multistep methods [28, 19],\ngenetic algorithms [66, 67, 12], parameter identification [44], deep neural networks\n(DNNs) [47, 49, 39, 38, 58], and neural ordinary differential equations (ODEs) [11, 29]\nhave shown great promise. Successfully learning these equations enables their solution\nusing appropriate numerical schemes to predict the evolution behavior of complex\nsystems.\nA distinct approach is using data-driven methods to learn the dynamics or flow\nmaps of the underlying unknown equations [46, 65, 14].\nThis approach facilitates\nrecursive predictions of a system\u2019s evolution, thereby circumventing the need to solve\nthe learned equations. A classic example is dynamic mode decomposition (DMD) [56,\n60], which seeks the best-fit linear operator to advance state variables forward in time,\nserving as an approximation to the Koopman operator associated with the underlying\nsystem [5]. With the rapid development of deep learning [26], DNNs have shown great\npromise in data-driven modeling of unknown equations.\nCompared to traditional\nmethods, DNNs excel in managing high-dimensional problems, processing very large\ndatasets, and facilitating parallel computing. DNNs have proven highly effective in\nlearning the dynamics or flow maps of various types of equations, including ODEs\n[46], partial differential equations (PDEs) [65], differential-algebraic equations (DAEs)\n[15], integro-differential equations (IDEs) [14], and stochastic differential equations\n(SDEs) [13]. This flow map learning (FML) methodology has also been extended\nto partially observed systems with missing state variables [21] and non-autonomous\ndynamical systems [45]. 
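As a concrete reference point for this flow-map viewpoint, the following minimal sketch constructs the DMD-style best-fit linear operator that advances snapshots one step forward in time. It is a generic least-squares construction written for illustration, not the implementation of any particular DMD package.

import numpy as np

def best_fit_linear_operator(U):
    # U[:, k] holds the state u(t_k) sampled at a fixed time lag.
    # Return the matrix A minimizing || U[:, 1:] - A U[:, :-1] ||_F,
    # i.e. the best-fit linear approximation of the one-step flow map.
    X, Y = U[:, :-1], U[:, 1:]
    return Y @ np.linalg.pinv(X)

# One-step prediction with the learned linear flow map: u_next = A @ u_current.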
Recent progresses in scientific machine learning (SciML) have\nintroduced advanced deep learning techniques for approximating general operators\nmapping between two infinite-dimensional function spaces.\nNotable contributions\ninclude neural operators [35, 36, 31] and deep operator networks (DeepONet) [41, 70],\nwhich can also model PDEs.\nFig. 1: The overall structure of DUE.\nWhile deep learning garners growing interest among students and researchers\nacross various fields, newcomers often encounter challenges due to the complexity of\nnew concepts, algorithms, and coding requirements. To address this, we present Deep\nUnknown Equations (DUE), a framework and open-source Python library for deep\nlearning of unknown equations. DUE aims to simplify the learning process and fa-\nDUE\n3\ncilitate the adoption of advanced deep learning techniques, such as residual neural\nnetworks (ResNet) [24], generalized ResNet (gResNet) [15], operator semigroup net-\nwork (OSG-Net) [9], and Transformers [61]. It serves as both an educational tool for\nstudents and a powerful resource for researchers, enabling the learning and modeling\nof any time-dependent differential equations. One of DUE\u2019s standout features is its\nuser-friendly design, which allows users to start learning unknown equations with as\nfew as ten lines of code. This simplicity saves significant time on conceptualization\nand analysis, making advanced techniques more accessible. Moreover, DUE is not\nonly valuable for learning unknown equations but also for creating surrogate models\nof known yet complex equations that are computationally expensive to solve using\ntraditional numerical methods. As the field of deep learning continues to advance\nrapidly, we are committed to maintaining and updating DUE to ensure it remains a\nvaluable tool for those interested in the deep learning of unknown equations. While\nsimilar efforts, such as DeepXDE [42] and NeuralUQ [70], have been made to ease\nthe learning and adoption curve, they focus primarily on solving given differential\nequations or uncertainty quantification. In contrast, DUE uniquely targets the deep\nlearning of unknown equations. In summary, DUE is a comprehensive framework and\naccessible tool that empowers students and researchers to harness deep learning for\nmodeling unknown equations, opening new avenues in scientific discovery.\n2. Data-Driven Deep Learning of Unknown Equations. In this section,\nwe explore how deep learning can be applied to model unknown differential equations\nfrom measurement data. After establishing the basic setup in Section 2.1, we intro-\nduce the essential concepts and methods for modeling unknown ODEs. This includes\ndiscussions on data preprocessing, neural network architectures, and model training,\nwhich form the core components of the deep-learning-based modeling framework. We\nthen describe how this approach can be extended to partially observed systems. Fi-\nnally, we discuss learning unknown PDEs in both nodal and modal spaces.\n2.1. Setup and Preliminaries. To set the stage for our exploration, let us delve\ninto the setup for modeling unknown ODEs [46] and PDEs [65, 14]. The framework\nwe describe can be easily adapted to other types of equations, including DAEs [15],\nIDEs [14], and SDEs [13].\nLearning ODEs. Imagine we are trying to understand an autonomous system\nwhere the underlying equations are unknown ODEs:\n(2.1)\ndu\ndt = f(u(t)),\nu(t0) = u0,\nwhere f : Rn \u2192Rn is unknown. 
A classic example is the damped pendulum system:\n(2.2)\n\uf8f1\n\uf8f4\n\uf8f2\n\uf8f4\n\uf8f3\ndu1\ndt = u2,\ndu2\ndt = \u2212\u03b1u2 \u2212\u03b2sin(u1),\nwhere u1 is the angle, u2 is the angular velocity, \u03b1 is the damping coefficient, and \u03b2\nrepresents the effect of gravity. If these equations are known, then numerical methods\nlike the Runge\u2013Kutta can solve them, predicting how u1 and u2 evolve over time.\nBut what if these equations are unknown? If we can observe or measure the state\nvariables, can we build a data-driven model to predict their evolution?\nAssume we have measurement data of u collected from various trajectories. Let\n4\nJ. CHEN, K. WU, D. XIU\nt0 < t(i)\n1\n< \u00b7 \u00b7 \u00b7 < t(i)\nK be a sequence of time instances. We use\n(2.3)\nu(i)\nk\n= u(t(i)\nk ; u(i)\n0 , t0) + \u03f5(i)\nu,k,\nk = 1, . . . , Ki,\ni = 1, . . . , Itraj,\nto denote the state at time t(i)\nk\nalong the i-th trajectory originating from the initial\nstate u(i)\n0\nat t0, for a total of Itraj trajectories.\nIn real-world scenarios, the data\nmay contain measurement noise \u03f5(i)\nu,k, typically modeled as random variables. Our\nobjective is to create a data-driven model for the unknown ODEs that can predict\nthe evolution of u from any initial state u(t0) = u0.\nFig. 2: Left: Trajectory data collected from multiple initial states for learning ODEs.\nRight: Snapshot data for learning PDEs (only one trajectory is displayed for visual-\nization, while the real dataset may contain multiple trajectories).\nLearning PDEs.\nNow, consider the more complex scenario of an unknown\ntime-dependent PDE system:\n(2.4)\n\uf8f1\n\uf8f4\n\uf8f2\n\uf8f4\n\uf8f3\n\u2202tu = L(u),\n(x, t) \u2208\u2126\u00d7 R+,\nB(u) = 0,\n(x, t) \u2208\u2202\u2126\u00d7 R+,\nu(x, 0) = u0(x),\nx \u2208\u00af\u2126,\nwhere \u2126\u2286Rd is the physical domain, L is the unknown operator governing the PDE,\nB specifies the boundary conditions, and the solution u(x, t) belongs to an infinite-\ndimensional Hilbert space V. A fundamental example of PDEs is the one-dimensional\nBurgers\u2019 equation:\n\u2202tu = L(u)\nwith\nL(u) = \u2212\u2202x\n\u0012u2\n2\n\u0013\n+ \u03bd\u2202xxu,\nwhere the state of u(x, t) is governed by a convective term \u2202x(u2/2) and a diffusive\nterm \u03bd\u2202xxu, with \u03bd > 0 being the diffusion coefficient (or kinematic viscosity in the\ncontext of fluid mechanics). With given initial conditions, numerical methods can\npredict future solutions. But what if the underlying mechanism is unclear and the\nright-hand side of the PDE is unknown? Can we use measurable data of u(x, t) to\nuncover the dynamics?\nAssume the solution u(x, t) of the unknown system (2.4) is measurable, i.e., the\nsnapshot data of u are available at certain time instances as shown in Figure 2:\n(2.5)\nu(xs, t(i)\nk )\ns = 1, 2 . . . , n,\nk = 1, . . . , Ki,\ni = 1, . . . , Itraj.\nDUE\n5\nHere, {xs}n\ns=1 are the discrete spatial locations at which the solution data is measured.\nIn practice, solution data may be collected on varying sets of sampling locations,\nnecessitating interpolation or fitting methods to transform the data onto a consistent\nset {xs}n\ns=1. Our goal is to create a data-driven model for the unknown PDE that\ncan predict the temporal evolution of u given any initial state u(x, 0) = u0(x).\n2.2. Data Pairs. 
In DUE, we mainly focus on learning the integral form of the\nunderlying equations, which is equivalent to learning the flow maps {\u03a6\u2206}\u2206\u22650 that\ndescribe the time evolution of state variables. The flow map for a time step \u2206is\ndefined as\n(2.6)\n\u03a6\u2206(u0) := u(t0 + \u2206) = u0 +\nZ t0+\u2206\nt0\nf(u(s))ds = u0 +\nZ \u2206\n0\nf(\u03a6s(u0))ds,\nwhere t0 can be arbitrarily shifted for autonomous systems.\nThe flow maps fully\ncharacterize the system\u2019s time evolution. The data (2.3) may be collected at constant\nor varying time lags \u2206(i)\nk\n= t(i)\nk+1 \u2212t(i)\nk . Depending on this, we rearrange the data as\nfollows:\nRearranging Data with Fixed Time Lag \u2206. When data is collected at a\nconstant time lag \u2206, our goal is to learn a single flow map for this specific \u2206. We\nsegment the collected trajectories to form a dataset of input-output pairs:\n(2.7)\nn\nu(j)\nin , u(j)\nout\no\n,\nj = 1, 2, ..., J,\nwhere u(j)\nin and u(j)\nout are neighboring states such that u(j)\nout \u2248\u03a6\u2206(u(j)\nin ), accounting for\nsome measurement noise. Note that multiple data pairs can be extracted from a single\ntrajectory by segmenting it into smaller temporal intervals, leading to J \u2265Itraj.\nRearranging Data with Varying Time Lags. When the time lag \u2206varies,\neach \u2206represents a different flow map. Our objective becomes learning a family of\nflow maps {\u03a6\u2206}\u22061\u2264\u2206\u2264\u22062, where \u22061 and \u22062 are the minimum and maximum time\nlags in the dataset. We rearrange the data into:\n(2.8)\nn\nu(j)\nin , \u2206(j), u(j)\nout\no\n,\nj = 1, 2, ..., J,\nwith u(j)\nout \u2248\u03a6\u2206(j)(u(j)\nin ), considering some measurement noise.\n2.3. Deep Neural Networks. In this subsection, we introduce several effective\nDNN architectures for modeling unknown equations, including the basic feedforward\nneural networks (FNNs), residual neural network (ResNet) [24], generalized ResNet\n(gResNet) [15], and operator semigroup network (OSG-Net) [9].\nFNN. As a foundational architecture in deep learning, FNN with L hidden layers\ncan be mathematically represented as:\n(2.9)\nN\u03b8(uin) = WL+1 \u25e6(\u03c3L \u25e6WL) \u25e6\u00b7 \u00b7 \u00b7 \u25e6(\u03c31 \u25e6W1)(uin),\nwhere W\u2113\u2208Rn\u2113\u00d7n\u2113\u22121 is the weight matrix of the \u2113th hidden layer, \u03c3\u2113denotes the\nactivation function, \u25e6signifies composition, and \u03b8 denotes all trainable parameters.\nCommon activation functions include the hyperbolic tangent (Tanh), the rectified\nlinear unit (ReLU), and the Gaussian error linear unit (GELU). For flow map learning,\nwe set n0 = nL+1 = n, where n denotes the number of state variables (recalling that\nu \u2208Rn). The numbers of neurons in the hidden layers, n\u2113with \u2113= 1, 2, ..., L, are\nhyperparameters that typically require calibration based on the specific problems.\n6\nJ. CHEN, K. WU, D. XIU\nResNet. ResNet [24] is an advanced variant of FNN, particularly effective for\nlearning unknown equations [46]. Initially proposed for image processing [24], ResNet\nintroduces an identity mapping, enabling the network to learn the residue of the input-\noutput mapping more effectively. 
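Returning briefly to the rearrangement of Section 2.2, the sketch below shows how trajectories sampled at a fixed time lag can be segmented into the input-output pairs (2.7). The array layout and function name are illustrative and independent of the data loaders shipped with DUE.

import numpy as np

def make_pairs_fixed_lag(trajectories):
    # Each trajectory is an array of shape (K_i + 1, n) sampled at a fixed time lag Delta.
    # Neighboring states form the pairs (u_in, u_out) with u_out ~ Phi_Delta(u_in), cf. (2.7).
    u_in, u_out = [], []
    for traj in trajectories:
        u_in.append(traj[:-1])
        u_out.append(traj[1:])
    return np.concatenate(u_in, axis=0), np.concatenate(u_out, axis=0)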
As depicted in Figure 3, a ResNet can be described\nas\n(2.10)\nbuout = ResNet\u03b8(uin) := uin + N\u03b8(uin) = (In + N\u03b8) (uin),\nBy comparing (2.10) with (2.6), ResNet is particularly suitable for FML, as it enforces\nN\u03b8 to approximate the effective increment of the state variables:\n(2.11)\nN\u03b8(uin) \u2248\nZ \u2206\n0\nf(u(s))ds =\nZ \u2206\n0\nf(\u03a6s(uin))ds.\nuin\nPlain\nneural\nnetwork\n+\nuout\nuin\nPrior\nmodel\nPlain\nneural\nnetwork\n+\nuout\nuin\n\u2206\nPlain\nneural\nnetwork\n\u00d7\n+\nuout\nFig. 3: ResNet (left), gResNet (middle), and OSG-Net (right).\nThe symbol \u201c+\u201d\nindicates element-wise summation, while the symbol \u201c\u00d7\u201d rerpesents multiplication.\ngResNet. As shown in Figure 3, gResNet [15] generalizes the traditional ResNet\nconcept by defining the residue as the difference between the output data and the\npredictions made by a prior model:\n(2.12)\nuout = gResNet(uin) := A(uin) + N\u03b8(uin),\nwhere A is the prior model, and N\u03b8 acts as a correction for A. If an existing prior\nmodel is unavailable, A can be constructed from data, such as using a modified DMD\n[15] to construct a best-fit affine model:\nA(uin) := Auin + b,\nwhere A \u2208Rn\u00d7n and b \u2208Rn are determined by solving the following linear regression\nproblem:\n(2.13)\n(A, b) = arg min\n\u02dcA\u2208Rn\u00d7n\n\u02dcb\u2208Rn\n1\nJ\nJ\nX\nj=1\n\r\r\ru(j)\nout \u2212\u02dcAu(j)\nin \u2212\u02dcb\n\r\r\r\n2\n2 .\nTo solve problem (2.13), we first augment the input vector by appending a constant\nterm:\n\u02dcuin =\n\u0014uin\n1\n\u0015\n\u2208Rn+1,\nDUE\n7\nwhere the constant 1 accommodates the bias term b in the affine model. Next, we\nconstruct the following matrices using the dataset (2.7):\nY := [u(1)\nout, u(2)\nout, . . . , u(J)\nout] \u2208Rn\u00d7J,\nX := [\u02dcu(1)\nin , \u02dcu(2)\nin , . . . , \u02dcu(J)\nin ] \u2208R(n+1)\u00d7J.\nThe solution to the linear regression problem (2.13) can then be explicitly expressed\nas\n\u0002A\nb\u0003\n= YX\u22a4(XX\u22a4)\u22121.\nThis modification to DMD accommodates potential non-homogeneous terms in\nthe unknown equations, making the approximation more flexible.\nThe concept of\ngResNet encompasses the standard ResNet with A = In and b = 0.\nOSG-Net. To adeptly approximate a family of flow maps associated with varying\ntime step sizes, it is necessary to incorporate the time step size as an input to DNN.\nThe flow maps of autonomous systems form a one-parameter semigroup, satisfying\n\u03a60 = In,\n(2.14a)\n\u03a6\u22061+\u22062 = \u03a6\u22061 \u25e6\u03a6\u22062\n\u2200\u22061, \u22062 \u2208R+.\n(2.14b)\nThe semigroup property is crucial as it connects the system\u2019s evolutionary behaviors\nacross different time scales. Therefore, it is natural for data-driven models to ad-\nhere to this property. The OSG-Net, proposed in [9], is well-suited for this purpose.\nMathematically, an OSG-Net can be expressed as\n(2.15)\nbuout = OSG-Net\u03b8(uin, \u2206) := uin + \u2206N\u03b8(uin, \u2206).\nThe architecture of OSG-Net, illustrated in Figure 3, involves concatenating the state\nvariables uin with the time step size \u2206before inputting them into the network N\u03b8.\nUnlike ResNet, OSG-Net introduces an additional skip connection that scales the\noutput of N\u03b8 by \u2206. This design ensures that an OSG-Net inherently satisfies the\nfirst property (2.14a). 
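To make the two skip-connection patterns concrete, the following PyTorch sketch implements the ResNet update (2.10) and the OSG-Net update (2.15) on top of a small fully connected network (2.9). It is an illustrative re-implementation under assumed layer sizes, not the resnet or osg_net classes shipped with DUE.

import torch
import torch.nn as nn

class PlainFNN(nn.Module):
    # A small fully connected network N_theta, cf. (2.9).
    def __init__(self, dim_in, dim_out, width=10, depth=3):
        super().__init__()
        layers, d = [], dim_in
        for _ in range(depth):
            layers += [nn.Linear(d, width), nn.GELU()]
            d = width
        layers.append(nn.Linear(d, dim_out))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

class ResNetBlock(nn.Module):
    # u_out = u_in + N_theta(u_in), cf. (2.10).
    def __init__(self, n):
        super().__init__()
        self.N = PlainFNN(n, n)

    def forward(self, u):
        return u + self.N(u)

class OSGNetBlock(nn.Module):
    # u_out = u_in + Delta * N_theta(u_in, Delta), cf. (2.15); delta has shape (batch, 1).
    def __init__(self, n):
        super().__init__()
        self.N = PlainFNN(n + 1, n)

    def forward(self, u, delta):
        return u + delta * self.N(torch.cat([u, delta], dim=-1))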
As for the second property, we can design special loss functions\nto embed this prior knowledge into OSG-Net via training, which can enhance the\nmodel\u2019s long-term stability (see Section 3.2 for detailed discussions).\nBy comparing (2.15) with (2.6), it is clear that N\u03b8 serves as an approximation to\nthe time-averaged effective increment:\n(2.16)\nN\u03b8(uin, \u2206) \u22481\n\u2206\nZ \u2206\n0\nf(u(s))ds = 1\n\u2206\nZ \u2206\n0\nf(\u03a6s(uin))ds.\n2.4. Model Training and Prediction. Once the data pairs are rearranged\nand an appropriate DNN architecture is selected, model training is carried out by\nminimizing a suitable loss function. The commonly used mean squared error (MSE)\nquantifies the discrepancy between the predicted outputs and the actual values:\n(2.17)\nL(\u03b8) = 1\nJ\nJ\nX\nj=1\n\r\r\rbu(j)\nout(\u03b8) \u2212u(j)\nout\n\r\r\r\n2\n2 .\nIt is worth noting that training data extracted from the same trajectory are not inde-\npendent. To account for the structure of observational noise or the highly clustered\nnature of data from a single trajectory, a suitably weighted norm can be applied in\n8\nJ. CHEN, K. WU, D. XIU\nthe loss function (2.17). Some alternative loss functions will be discussed in Section 3\nto enhance the prediction accuracy and stability.\nIn practice, L(\u03b8) is minimized using stochastic gradient descent (SGD) [53] or its\nvariants, such as Adam [30]. SGD works by randomly splitting the training dataset\ninto mini-batches. At each iteration, the gradient of the loss function with respect\nto \u03b8 is computed for one mini-batch, and this gradient is used to update the param-\neters. This process repeats for multiple epochs until the loss function is sufficiently\nminimized. The procedure for training DNNs using SGD is outlined in Algorithm 2.1.\nAlgorithm 2.1 Model training using stochastic gradient descent (SGD)\nRequire: Number of epochs E, batch size B; training data {(u(j)\nin , u(j)\nout)}J\nj=1 (fixed\ntime lag) or {(u(j)\nin , \u2206(j), u(j)\nout)}J\nj=1 (varied time lags)\n1: Initialize the DNN parameters \u03b8 randomly\n2: for epoch = 1 to E do\n3:\nShuffle the training data\n4:\nfor batch = 1 to\n\u0004 J\nB\n\u0005\ndo\n5:\nSample a mini-batch \u039b of size B from the training data\n6:\nUpdate the DNN parameters:\n\u03b8 \u2190\u03b8 \u2212\u03b7\u2207\u03b8L(\u039b)(\u03b8),\n7:\nwhere the learning rate \u03b7 > 0 is often adapted during training, and\n\u2207\u03b8L(\u039b)(\u03b8) = 1\nB\nX\nj\u2208\u039b\n\u2207\u03b8\n\r\r\rbu(j)\nout(\u03b8) \u2212u(j)\nout\n\r\r\r\n2\n2 .\n8:\nend for\n9: end for\nOnce the DNN is successfully trained, it is recursively used to conduct predictions\nfrom any given initial state upre(t0) = u(t0). The trained DNN model, denoted as b\u03a6\u03b8\npredicts the solution evolution as follows:\n(2.18)\nupre(tk+1) = b\u03a6\u03b8(upre(tk)),\nk = 0, 1, . . .\nwith a fixed time step size tk+1 \u2212tk \u2261\u2206, or\n(2.19)\nupre(tk+1) = b\u03a6\u03b8(upre(tk), \u2206k),\nk = 0, 1, . . .\nwith varying time step sizes tk+1 \u2212tk = \u2206k.\n2.5. Learning Partially Observed Systems. In many real-world scenarios,\ncollecting data for all state variables u \u2208Rn is not always feasible. 
Instead, obser-\nvations can be restricted to a subset of the state variables w \u2208Rm, where m < n.\nThis limitation shifts the focus to learning the dynamics of w alone, resulting in\nnon-autonomous unknown governing equations due to the absence of other variables.\nSimilar to the fully observed case, the training data can be constructed from sampling\non multiple long trajectories or many short trajectories with M + 1 observations of\nw. If data from multiple long trajectories of w with a fixed time lag \u2206are available:\n(2.20)\nw(i)\nk\n= w(t(i)\nk ; w(i)\n0 , t0) + \u03f5(i)\nw,k,\nk = 1, . . . , Ki,\ni = 1, . . . , Itraj,\nDUE\n9\nthen we rearrange these trajectories into shorter bursts of M + 1 consecutive states:\n(2.21)\nn\nw(j)\n0 , w(j)\n1 , . . . , w(j)\nM+1\no\n,\nj = 1, 2, . . . , J.\nTo model the temporal evolution of w, a memory-based DNN architecture was intro-\nduced in [21]:\n(2.22)\nwk+1 = wk + N\u03b8(wk, wk\u22121, . . . , wk\u2212M),\nk \u2265M > 0,\nwhere T := M\u2206represents the memory length, which is problem-dependent and often\nrequires manual tuning. The state wk at time tk, along with the M preceding states,\nare concatenated as inputs for the neural network N\u03b8. The following loss function is\nthen minimized:\n(2.23)\nL(\u03b8) = 1\nJ\nJ\nX\nj=1\n\r\r\rw(j)\nM+1 \u2212\n\u0010\nw(j)\nM + N\u03b8(w(j)\nM , . . . , w(j)\n1 , w(j)\n0 )\n\u0011\r\r\r\n2\n2 .\nLearning a fully observed system is a special case with m = n and M = 0. Once the\nDNN model is successfully trained, it can be recursively used to predict the system\u2019s\nevolution from any initial states (w(t0), w(t1), . . . , w(tM)):\n(2.24)\n(\nwpre(tk) = w(tk),\nk = 0, 1, . . . , M,\nwpre(tk+1) = wpre(tk) + N\u03b8 (wpre(tk), wpre(tk\u22121), . . . , wpre(tk\u2212M)) ,\nk \u2265M,\nwhere tk+1 \u2212tk \u2261\u2206.\nThis approach has also been applied to systems with hidden parameters [22], as\nwell as PDE systems with snapshot data observed on a subset of the domain [16].\n2.6. Learning Unknown PDEs. The aforementioned framework can be seam-\nlessly extended to data-driven modeling of unknown PDEs. This can be effectively\nachieved in either nodal or modal space, as illustrated in Figure 4.\nexpansion\nFourier\nGeneralized\nmesh grids\nSample on\nu( \u00b7 , t)\nU(t)\nV(t)\n\u03a6\u2206\nV(t + \u2206)\nU(t + \u2206)\nu( \u00b7 , t + \u2206)\nPp\nj=1 v j(t)\u03c8 j(x)\nFig. 4: Learning PDEs in nodal space (top branch) and modal space (bottom branch).\n2.6.1. Learning in Nodal Space. Let u : \u2126\u00d7 R+ \u2192Rdu represent the state\nvariables of the underlying unknown d-dimensional PDE, and \u2126\u2282Rd, where d is the\nspatial dimension, and du is the length of the state vector u. As shown in the upper\n10\nJ. CHEN, K. WU, D. XIU\nbranch of Figure 4, assume we have measurement data of u at a set of nodal points\nX = {x1, . . . , xn} \u2282\u2126, collected from various trajectories:\n(2.25)\nU(i)\nk\n= U(t(i)\nk ; U(i)\n0 , t0) + \u03f5(i)\nU,k,\nk = 1, . . . , Ki,\ni = 1, \u00b7 \u00b7 \u00b7 , Itraj,\nwhere U(t) = (u(x1, t), . . . , u(xn, t))\u22a4\u2208Rn\u00d7du is a matrix. While ResNet and OSG-\nNet built upon FNNs can be used for learning PDEs [14, 9], they can be compu-\ntationally expensive when X contains a large number of nodal points. 
To address\nthis, we can replace FNNs with more suitable DNNs, such as the convolutional neural\nnetworks (CNNs) [33, 68], the Fourier Neural Operator (FNO) [36], and many other\nneural operators [34, 8, 10], including those built upon Transformers [61, 10] from\nlarge language models.\nTransformers. Transformers [61], particularly those based on the self-attention\nmechanism, are highly effective for capturing long-range dependencies in data. Math-\nematically, a generalized Transformer can be expressed as\nT\u03b8(Uin) = \u03c9L+1 \u25e6(\u03c3L \u25e6\u03b1L \u25e6\u03c9L) \u25e6\u00b7 \u00b7 \u00b7 \u25e6(\u03c31 \u25e6\u03b11 \u25e6\u03c91)(Uin, X),\nwhere each set of operations {\u03c3\u2113\u25e6\u03b1\u2113\u25e6\u03c9\u2113}L\n\u2113=1 represents the following transformation:\n(2.26)\nU\u2113= \u03c3\u2113\u25e6\u03b1\u2113\u25e6\u03c9\u2113(U\u2113\u22121) := \u03c3\u2113(A\u2113U\u2113\u22121W\u2113).\nHere, U\u2113\u2208Rn\u2113\u00d7d\u2113is a matrix, with \u2113= 1, 2, . . . , L, represents the output of the \u2113-th\nhidden layer. The initial input, U0 = [Uin, X] \u2208Rn\u00d7(du+d), is formed by concatenat-\ning the input function values and nodal point coordinates. In this setup: \u03c3\u2113is the\nactivation function; \u03c9\u2113represents a transformation via right multiplication by a weight\nmatrix W\u2113\u2208Rd\u2113\u22121\u00d7d\u2113; \u03b1\u2113represents a convolution via left multiplication by a kernel\nmatrix A\u2113\u2208Rn\u2113\u00d7n\u2113\u22121. Each hidden layer can thus be interpreted as transforming a\nvector-function with d\u2113\u22121 components sampled on a latent grid X\u2113\u22121 = {x\u2113,j}n\u2113\u22121\nj=1 , to a\nnew vector-function with d\u2113components sampled on a new latent grid X\u2113= {x\u2113,i}n\u2113\ni=1,\nwhere X0 = XL = X. The sizes of the hidden layers, specified by {d\u2113}L\n\u2113=1 and {n\u2113}L\u22121\n\u2113=1 ,\nare hyperparameters that typically require tuning based on the problem at hand. At\nthe output layer, we set dL+1 = du and nL = n to produce the predicted function\nvalues on the target grid X.\nTransformers can be enhanced with a multi-head attention mechanism, perform-\ning multiple convolutions in each hidden layer to provide a comprehensive view of the\ntarget operator. This is achieved by replacing A\u2113U\u2113\u22121W\u2113in (2.26) with the concatena-\ntion of different heads {Ah\n\u2113U\u2113\u22121W h\n\u2113}H\nh=1, where Ah\n\u2113\u2208Rn\u2113\u00d7n\u2113\u22121 and W h\n\u2113\u2208Rd\u2113\u22121\u00d7\nd\u2113\nH .\nThe general formulation in (2.26) encompasses many deep learning methods, dis-\ntinguished by the implementation of the convolution operator A\u2113.\n\u2022 In CNNs [32], A\u2113performs local weighted sums over spatially structured\ndata. The non-zero values of A\u2113, which constitute the trainable weights, are\nidentical but shifted accross the rows, as these weights are shared accross \u2126.\nThis convolution is usually combined with pooling or up-pooling layers [52],\nwhich downsample or upsample U\u2113from the grid X\u2113\u22121 to a coarser or finer\ngrid X\u2113.\n\u2022 In Transformers built upon the self-attention mechanism [61], A\u2113performs\nglobal convolution. 
Mathematically, A\u2113is implemented as\n(2.27)\nA\u2113= Softmax\n \n(U\u2113\u22121W Q\n\u2113)(U\u2113\u22121W K\n\u2113)\u22a4\np\nd\u2113\u22121\n!\n,\nDUE\n11\nwhere W Q\n\u2113, W K\n\u2113\n\u2208Rd\u2113\u22121\u00d7d\u2113are two trainable weight matrices, and Softmax\nnormalizes each row of a matrix into a discrete probability distribution. In\n[37], a cross-attention mechanism was proposed to enable the change of mesh.\nSpecifically, U\u2113\u22121W Q\n\u2113in (2.27) is replaced by X\u2113W X\n\u2113, with W X\n\u2113\n\u2208Rd\u00d7d\u2113being\na trainable weight matrix. This design allows cross-attention to output a new\nfunction sampled on any mesh X\u2113.\nPosition-induced Transformer (PiT). Here, we present a Transformer-based\nmethod, named PiT, built upon the position-attention mechanism proposed in [10].\nDistinguished from other Transformer-based networks [8, 23, 37] built upon the clas-\nsical self-attention [61], position-attention implements the convolution operator by\nconsidering the spatial interrelations between sampling points. Define the pariwise\ndistance matrix D\u2113\u2208Rn\u2113\u00d7n\u2113\u22121 between X\u2113and X\u2113\u22121 by D\u2113,ij = \u2225x\u2113,i \u2212x\u2113\u22121,j\u22252\n2.\nThen A\u2113is defined as A\u2113:= Softmax(\u2212\u03bb\u2113D\u2113), where \u03bb\u2113\u2208R+ is a trainable parame-\nter. Position-attention represents a global linear convolution with a stronger focus on\nneighboring regions, resonating with the concept of domain of dependence in PDEs\nand making PiT appealing for learning PDEs [10]. The parameter \u03bb\u2113is interpretable,\nas most attention at a point x\u2113,i \u2208X\u2113is directed towards those points x\u2113\u22121,j \u2208X\u2113\u22121\nwith the distance to x\u2113,i smaller than 1/\u221a\u03bb\u2113.\nIn practice, we construct a latent\nmesh Xltt by coarsening X while preserving essential geometric characteristics, and\nlet X\u2113= Xltt,\nn\u2113= nltt,\nfor \u2113= 1, 2, . . . , L \u22121, with nltt < n. This design re-\nduces the computational cost caused by a potential large number of sampling points\nin the dataset. Like many other neural operators [31, 2], PiT is mesh-invariant and\ndiscretization convergent. Once trained, PiT can generalize to new input meshes, de-\nlivering consistent and convergent predictions as the input mesh is refined. To learn\ntime-dependent unknown PDEs from data, we construct a (g)ResNet or OSG-Net\nwith PiT as the basic block. Once the model is successfully trained, we can recur-\nsively call the model to predict the evolutionary behaviors of u(x, t) given any initial\nconditions.\n2.6.2. Learning in Modal Space. An alternative strategy is to model un-\nknown PDEs in modal space [65] by combining traditional model reduction with deep\nlearning approaches. Initially, select a finite-dimensional function space with a suit-\nable basis to approximate each component of u(x, \u00b7):\nVp = span {\u03c81(x), ..., \u03c8p(x)} ,\nwhere p \u2264n, and the basis functions \u03a8(x) := (\u03c81(x), ..., \u03c8p(x))\u22a4are defined on the\nphysical domain \u2126. As shown in the lower branch of Figure 4, the solution of the\nunderlying PDE can then be approximated in Vp by a finite-term series:\nu(x, t) \u2248\np\nX\nj=1\nvj(t)\u03c8j(x),\nwith V := (v1, ..., vp)\u22a4\u2208Rp\u00d7du being the modal expansion coefficients. 
This intro-\nduces a bijective mapping:\n(2.28)\n\u03a0 : Rp\u00d7du \u2192[Vp]du,\n\u03a0V = V\u22a4\u03a8(x),\nwhich defines a unique correspondence between a function in [Vp]du and its modal\nexpansion coefficients.\n12\nJ. CHEN, K. WU, D. XIU\nNow, we project each data sample U(i)\nk\nin (2.25) into [Vp]du, yielding a coefficient\nmatrix V(i)\nk . This is achieved by solving the linear regression problem:\n(2.29)\nV(i)\nk\n= arg min\n\u02dcV\u2208Rp\u00d7du\n\r\r\r(U(i)\nk )\u22a4\u2212\u02dcV\n\u22a4\u03a8(X)\n\r\r\r\n2\n2 ,\nwhere \u03a8(X) = (\u03a8(x1), \u03a8(x2), . . . , \u03a8(xn)) is a p \u00d7 n matrix, representing the basis\nfunction values evaluated at the sampling grids X.\nThe solution to (2.29) can be\nexpressed as\nV(i)\nk\n=\n\u0000\u03a8(X)\u03a8(X)\u22a4\u0001\u22121 \u03a8(X)U(i)\nk .\n= V(t(i)\nk ; V(i)\n0 , t0) + \u03f5(i)\nV,k,\nk = 1, . . . , Ki,\ni = 1, \u00b7 \u00b7 \u00b7 , Itraj,\nwhere V(t(i)\nk ; V(i)\n0 , t0) denotes the modal coefficients of the underlying function, and\n\u03f5(i)\nV,k =\n\u0000\u03a8(X)\u03a8(X)\u22a4\u0001\u22121 \u03a8(X)\u03f5(i)\nU,k represents the noise inherited from the nodal value\nnoise. We can then treat V as the state variables and model the unknown governing\nODEs using deep learning approaches, offering a predictive model for the evolution\nof V. The behavior of U can be easily inferred through the bijective mapping (2.28).\nLearning unknown PDEs in the modal space provides great flexibility in choosing\ndifferent basis functions to represent the solution, including trigonometric functions,\nwavelet functions, Legendre polynomials, and piecewise polynomials. This approach is\nanalogous to traditional numerical methods, such as spectral Galerkin, finite element,\nand finite volume methods, commonly used for solving known PDEs.\n2.6.3. Remarks on Learning PDEs. In the modal learning approach, when\nan interpolation basis is used, the resulting modal coefficients directly correspond to\nfunction values. This allows both the modal and nodal learning approaches to be\nrepresented through the expansion shown in the bottom path of Figure 4, highlight-\ning a connection between the two methods. Although Transformers were originally\ndeveloped for nodal learning, they may also be adapted for modal learning, as the\nattention mechanism can be used to capture dependencies among different modes.\nOur data-driven models in DUE serve as approximate evolution operators for the\nunderlying unknown PDEs. They enable prediction of future solutions for any initial\nconditions without necessitating retraining.\nThis contrasts with physics-informed\nneural networks (PINNs) [50], which require fewer or no measurement data but solve\na given PDE for a specific initial condition, typically necessitating retraining for each\nnew initial condition.\nThe above deep learning frameworks are not only useful for modeling unknown\nPDEs but also for creating surrogate models of known, yet complex, PDEs that are\nexpensive to solve using traditional numerical methods.\n3. Enhancing Prediction Accuracy and Stability. In learning unknown\ntime-dependent differential equations, our goal is to predict the system\u2019s evolution\naccurately over extended periods.\nThis section introduces two loss functions and\na novel neural network architecture designed to enhance the long-term prediction\naccuracy and stability of the learned models.\n3.1. Multi-step Loss. 
Research by [14] shows that using a multi-step loss func-\ntion can significantly improve predictive models with fixed time step sizes. This ap-\nproach averages the loss over multiple future time steps.\nThe training dataset is\nDUE\n13\nstructured as follows:\n(3.1)\nn\nw(j)\n0 , w(j)\n1 , . . . , w(j)\nM+1, . . . , w(j)\nM+1+K\no\n,\nj = 1, 2, . . . , J,\nwhere K \u22650 represents the number of future time steps. During training, initial\nstates w(j)\n0 , w(j)\n1 , . . . , w(j)\nM are used, and the DNN model (2.22) is executed for K + 1\nsteps to produce predictions bw(j)\nM+1, . . . , bw(j)\nM+1+K.\nThe multi-step loss function is\ndefined as\n(3.2)\nL(\u03b8) =\n1\nJ(K + 1)\nJ\nX\nj=1\nK\nX\nk=0\n\r\r\rw(j)\nM+1+k \u2212bw(j)\nM+1+k(\u03b8)\n\r\r\r\n2\n2 .\nNote that the loss function in Equation (2.23) is a special case with K = 0.\n3.2. Semigroup-informed Loss. As mentioned in Section 2.3, an OSG-Net\ninherently satisfies the first constraint (2.14a). To embed the second property (2.14b)\ninto an OSG-Net, a global direct semigroup-informed (GDSG) loss function was in-\ntroduced in [9], which effectively guides an OSG-Net to adhere to (2.14b) through\ntraining. The GDSG method integrates a regularization term informed by the semi-\ngroup property (2.14b) to the data-driven loss function:\n(3.3)\nL(\u03b8) =\n1\n(1 + \u03bb)J\nJ\nX\nj=1\n\u0012\r\r\ru(j)\nout \u2212bu(j)\nout(\u03b8)\n\r\r\r\n2\n2 + \u03bbR(j)\nSG(\u03b8)\n\u0013\n,\nwhere \u03bb > 0 serves as a regularization factor, and R(j)\nSG(\u03b8) is defined as\n(3.4)\nR(j)\nSG(\u03b8) := 1\n2\n\u0012\r\r\r\u00afu(j)(\u03b8) \u2212\u02dcu(j)(\u03b8)\n\r\r\r\n2\n2 +\n\r\r\r\u00afu(j)(\u03b8) \u2212\u02d8u(j)(\u03b8)\n\r\r\r\n2\n2\n\u0013\n,\nwith \u00afu(j), \u02dcu(j), and \u02d8u(j) being network predictions of randomly generated initial con-\nditions eu(j)\n0\nand random forward time steps \u2206(j)\n0 , \u2206(j)\n1 :\n\u00afu(j) = OSG-Net\u03b8\n\u0010\neu(j)\n0 , \u2206(j)\n0\n+ \u2206(j)\n1\n\u0011\n,\nwhich is the predicted state after a single forward step of size \u2206(j)\n0\n+ \u2206(j)\n1 , and\n\u02dcu(j) = OSG-Net\u03b8\n\u0010\nOSG-Net\u03b8\n\u0010\neu(j)\n0 , \u2206(j)\n0\n\u0011\n, \u2206(j)\n1\n\u0011\n,\n\u02d8u(j) = OSG-Net\u03b8\n\u0010\nOSG-Net\u03b8\n\u0010\neu(j)\n0 , \u2206(j)\n1\n\u0011\n, \u2206(j)\n0\n\u0011\n,\nwhich are the predicted states after two sequential forward steps. According to the\nsemigroup property, \u00afu(j), \u02dcu(j), and \u02d8u(j) are predictions of the same true state and\nshould therefore be enforced to be equal. Hence, incorporating (3.4) into the loss\nfunction encourages OSG-Net\u03b8 to adhere to property (2.14b). Remarkably, comput-\ning the residue (3.4) does not require additional measurement data. Moreover, the\nGDSG method can be further improved by generating multiple pairs of random data\n{eu(j,q)\n0\n, \u2206(j,q)\n0\n, \u2206(j,q)\n1\n}Q\nq=1 and using the averaged residue over Q pairs; see Section 3.2\nof [9] for more details.\n14\nJ. CHEN, K. WU, D. XIU\n3.3. Dual-network Technique for Multiscale Dynamics. Modeling equa-\ntions with varying time step sizes necessitates capturing dynamics characterized by\ntemporal multiscale properties. A plain neural network may struggle with large time\nscale separations, leading to poor long-term prediction accuracy. 
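Before turning to the dual-network architecture, the semigroup residual (3.4) of the GDSG method can be sketched as follows, assuming a model with the OSG-Net signature and batches of randomly generated states and time steps; this is an illustrative PyTorch sketch, not DUE's GDSG implementation.

import torch

def gdsg_residual(osg_net, u0, d0, d1):
    # u0: random initial states, shape (batch, n); d0, d1: random time steps, shape (batch, 1).
    u_bar = osg_net(u0, d0 + d1)               # one step of size d0 + d1
    u_tilde = osg_net(osg_net(u0, d0), d1)     # two steps, d0 then d1
    u_breve = osg_net(osg_net(u0, d1), d0)     # two steps, d1 then d0
    # Eq. (3.4): the three predictions should coincide under the semigroup property.
    return 0.5 * (((u_bar - u_tilde) ** 2).sum(dim=-1)
                  + ((u_bar - u_breve) ** 2).sum(dim=-1)).mean()

# The total loss (3.3) combines the data misfit with lambda times this residual,
# normalized by 1 / (1 + lambda).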
In this paper, we\nintroduce a novel dual-network architecture, called the dual-OSG-Net, which we pro-\npose as a new approach that leverages the gating mechanism [27] to effectively learn\ndynamics across broader time scales.\n\u2206\nuin\nOSG-Net\u03b81\nOSG-Net\u03b82\nGating\u03b83\nh2\nh1\nw1\nw2\nw1h1 + w2h2\nuout\nFig. 5: Dual-OSG-Net for learning multiscale equations.\nAs illustrated in Figure 5, the dual-OSG-Net combines predictions from two in-\ndependent OSG-Nets using weighted averaging. The weights {(w1, w2)|w1 > 0, w2 >\n0, w1 + w2 = 1} are determined by another neural network, Gating\u03b83, with Softmax\nactivation at its output layer. This gating network Gating\u03b83 is trained simultaneously\nwith the two OSG-Nets (OSG-Net\u03b81 and OSG-Net\u03b82) and intelligently decides which\nOSG-Net weighs more. The gating mechanism adaptively assigns a weight to each\nOSG-Net based on the time step size, allowing each network to adaptively focus on a\nspecific scale. For small time steps, it prioritizes the OSG-Net optimized for fine-scale\ndynamics, while for larger steps, it emphasizes the network suited to coarse scales.\nThis adaptability enables the dual-OSG-Net to handle multi-scale problems more ef-\nfectively than a single, larger OSG-Net, which lacks this flexibility and must attempt\nto capture all scales simultaneously. In Section 5.4, we will demonstrate the superior\nperformance of the dual-OSG-Net compared to the standard single OSG-Net through\nnumerical comparisons.\n4. Overview and Usage of DUE. This section introduces the structure and\nusage of DUE, a comprehensive library designed for data-driven learning of unknown\nequations. As illustrated in Figure 1, DUE comprises three main modules:\n\u2022 datasets: This module handles data loading and essential preprocessing\ntasks such as slicing, regrouping, and normalization.\n\u2022 networks: It includes a variety of DNN architectures like FNN, ResNet,\ngResNet, OSG-Net, dual-OSG-Net, Transformers, and more.\n\u2022 models: This module is dedicated to training the deep learning-based mod-\nels, offering various learning strategies to enhance prediction accuracy and\nstability.\nThis structure allows users to quickly understand its usage and customize or add new\nfunctionalities as needed. Detailed usage and customization of DUE are explained in\nSections 4.1 and 4.2.\n4.1. Usage. With DUE, learning unknown equations is simplified to just a few\nlines of code. Below is a template script with detailed comments for modeling the\ndynamics of a damped pendulum (see Section 5.1 for detailed descriptions). 
For more\ncomplex tasks, slight modifications may be needed, such as alternating data loaders,\nchanging neural network architectures, and adapting training strategies.\nDUE\n15\nimport due\n# Load the configuration for the modules: datasets, networks, and models\nconf_data, conf_net, conf_train = due.utils.read_config(\"config.yaml\")\n# Load the (measurement) data, slice them into short bursts,\n# apply normalization, and store the minimum and maximum values of the state varaibles\ndata_loader = due.datasets.ode.ode_dataset(conf_data)\ntrainX, trainY, test_set, vmin, vmax = data_loader.load(\"train.mat\", \"test.mat\")\n# Construct a neural network\nmynet = due.networks.fcn.resnet(vmin, vmax, conf_net)\n# Define and train a model, save necessary information of the training history\nmodel = due.models.ODE(trainX, trainY, mynet, conf_train)\nmodel.train()\nmodel.save_hist()\n# Conduct long-term prediction for arbitrarily given initial conditions\npred = mynet.predict(test_set[...,:conf_data[\"memory\"]+1], steps=1000, device=\"cpu\")\n4.1.1. Configuration. To simplify the specification of hyperparameters such as\nmemory steps, multi-steps in the loss function, network depth and width, training\nepochs, batch size, and more, users can consolidate them in a single configuration file.\nThis file can be seamlessly processed using the due.utils.read config function.\nUpon processing, these hyperparameters are stored in three Python dictionaries: one\nfor data processing configuration, one for neural network architecture configuration,\nand one for model training configuration. All the modules in Figure 1 are designed\nto work with such dictionaries, relieving users from specifying each hyperparameter\nindividually when calling multiple modules. This streamlined approach facilitates the\nlaunch of new tasks and allows for easy calibration of hyperparameters. To automate\nhyperparameter optimization [20], users can implement automated grid search via an\nexternal script that iterates over the configuration file in a for-loop.\ndata:\nproblem_type: ode\nnbursts: 10\nmemory: 0\nmulti_steps: 10\nproblem_dim: 2\nnetwork:\ndepth: 3\nwidth: 10\nactivation: \"gelu\"\ntraining:\ndevice: \"cpu\"\nepochs: 500\nbatch_size: 5\noptimizer: \"adam\"\nlearning_rate: 0.001\nloss: \"mse\"\nsave_path: \"./model\"\n4.1.2. Data Preprocessing. DUE is equipped to handle both ODE and PDE\ndata with either fixed or varied time lags. To accommodate these diverse scenarios,\nwe have implemented four modules in the \u201cdatasets\u201d class:\n\u2022 ode dataset: For unknown ODEs and data with fixed time lag.\n\u2022 ode dataset osg: For unknown ODEs and data with varied time lags.\n\u2022 pde dataset: For unknown PDEs and data with fixed time lag.\n\u2022 pde dataset osg: For unknown PDEs and data with varied time lags.\nUsers only need to prepare the measurement data and employ one of these four mod-\nules. The data will be automatically rearranged, normalized, and partitioned into\ninput-output pairs, as indicated by (2.7), (2.8), (2.21), and (3.1).\n16\nJ. CHEN, K. WU, D. XIU\n4.1.3. Neural Networks. The networks module in DUE offers a wide array\nof DNN architectures for ODE and PDE learning. For modeling ODEs with fixed\nand varied time step sizes, we have implemented resnet, gresnet, and osg net\nbuilt upon FNNs, respectively. 
As for learning PDEs, we have implemented pit\u2014\nthe Position-induced Transformer [10]\u2014for handling data with fixed time lag, and\nosg fno\u2014an OSG-Net built upon the Fourier neural operator [36, 9]\u2014for cases with\nvaried time step sizes.\nAs described in Section 2.6, unknown PDEs can also be\nlearned in modal space. We provide the generalized fourier projection1d and\ngeneralized fourier projection2d functions for computing modal expansion co-\nefficients from snapshot data for one- and two-dimensional problems. All the DNN\narchitectures in DUE belong to the nn class, which can be further enriched by cus-\ntomized deep learning methods to suit specific needs.\n4.1.4. Model Training. The models module implements the training proce-\ndures for deep learning models. Four training routines are available:\n\u2022 ode: For learning unknown ODEs with fixed time step size.\n\u2022 ode osg: For modeling unknown ODEs with varied time step sizes.\n\u2022 pde: For learning unknown PDEs with fixed time step size\n\u2022 pde osg: For modeling unknown PDEs with varied time step sizes.\nWe have also integrated the GDSG method to embed the semigroup property into\nmodels with varied time step sizes. Users only need to specify the hyperparameters of\nthe semigroup loss as detailed in Section 3.2, and DUE handles the complex procedures\nof training with the GDSG method.\n4.2. Customization. We have adopted a modular architecture for DUE, en-\nsuring that its key modules, networks and models, can be separately customized.\nUsers have the flexibility to adapt the neural network architecture to suit their specific\nrequirements and implement new training methods to enhance models\u2019 prediction ac-\ncuracy and stability. In this section, we briefly show how to customize neural network\narchitectures and training methods.\n4.2.1. Neural Networks. As described in Section 4.1.3, DUE already provides\na range of neural network architectures that address various scenarios in ODE and\nPDE learning. Users interested in exploring more specialized or recent deep learning\nmethods can implement them by following the guidelines in Procedure 4.1.\nProcedure 4.1 Customization of the neural network NewNet.\nclass NewNet(nn):\n\"\"\"New network architectures belong to the nn class\"\"\"\ndef __init__(self):\n\"\"\" create the computational layers here\"\"\"\nself._layer1 = ...\nself._layer2 = ...\ndef forward(self, x):\n\"\"\"Return the output of NewNet\"\"\"\nx1 = self._layer1(x)\nx2 = self._layer2(x1)\nreturn x2 + x\n4.2.2. Model Training. In the current version of DUE, we have implemented\nthe multi-step loss function [14] for data-driven modeling with fixed time step size, and\nthe GDSG method [9] for cases with varied time step sizes. 
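For orientation, the sketch below shows, in generic PyTorch (this is not DUE's internal code), the idea behind a K-step multi-step loss: the learned one-step flow map is composed K times and the rollout is penalized against the next K recorded states, which discourages error accumulation over short horizons.

import torch

def multistep_loss(flow_map, u0, targets):
    # u0: (batch, dim) initial states; targets: (batch, K, dim) next K recorded states
    K = targets.shape[1]
    loss, u = 0.0, u0
    for k in range(K):
        u = flow_map(u)                                  # compose the one-step prediction
        loss = loss + torch.mean((u - targets[:, k]) ** 2)
    return loss / K

Custom variants of such losses can be added to DUE through the customization procedure described next.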
If users have developed\ncustom training methods, such as new loss functions, implementing them in DUE is\nstraightforward using the following procedure.\nDUE\n17\nProcedure 4.2 Customization of the training method for ODEs and PDEs.\nclass New_learning(ODE): # New_learning(PDE):\n\"\"\"\nNew ODE learning methods belong to the ODE class\nNew PDE learning methods belong to the PDE class\n\"\"\"\ndef NewLoss(self, true, pred):\n\"\"\"Creat the customized loss function here\"\"\"\nloss = ...\nreturn loss\ndef train(self):\n\"\"\"Construct the training loop for a number of epochs\"\"\"\nfor i in range(self.n_epochs):\nfor x, y in self.train_loader:\npred = self.mynet(x)\nloss = self.NewLoss(y, pred)\nself.optimizer.zero_grad()\nloss.backward()\nself.optimizer.step()\nBy leveraging DUE\u2019s modularity and flexibility, users can effectively address a\nwide range of data-driven modeling challenges in unknown ODE and PDE systems.\n5. Demonstration Examples. In this section, we present diverse examples\nto demonstrate the effectiveness of DUE for data-driven learning of unknown ODEs\nand PDEs. The examples include: (1) the damped pendulum system, (2) coupled\noscillators with real-world noisy data, (3) the chaotic Lorenz system, (4) the Robertson\nchemical reaction problem involving high stiffness and multi-scale dynamics, (5) the\none-dimensional viscous Burgers\u2019 equation, and (6) the vorticity evolution of the two-\ndimensional Navier\u2013Stokes equations, and (7) the two-dimensional flow past a circular\ncylinder. In these examples, the true governing equations are known, but they serve\nonly two purposes: generating synthetic data for training the DNNs and providing\nreference solutions for comparison during testing. During the data-driven learning\nprocess, the true equations are regarded as unknown.\nFor all examples, we use the GELU activation function [25] and train the models\nwith the Adam optimizer [30]. The learning rate is initialized at 0.001 and follows\na cosine annealing schedule [40]. Detailed training configurations and dataset infor-\nmation for all examples are presented in the dedicated subsections below. To ensure\nusers can quickly understand how DUE works and apply it to their tasks, we provide\ndetailed comments in the code for each numerical example. All this information is\navailable on the GitHub page of DUE.\n5.1. Damped Pendulum. The first example is the damped pendulum system\n[46, 18], described by the equations (2.2) with \u03b1 = 0.1 and \u03b2 = 9.80665. Synthetic\ndata is generated using the fourth-order Runge\u2013Kutta method to advance the true\nsystem forward in time. The dataset comprises N = 1, 000 trajectories for (u1, u2),\neach with a length of L = 1, 000 and a time lag of \u2206= 0.02.\nThe initial states\nof these trajectories are randomly sampled from the uniform distribution on \u2126=\n[\u2212\u03c0/2, \u03c0/2] \u00d7 [\u2212\u03c0, \u03c0].\n5.1.1. Fully Observed Case. In this case, we assume both state variables u1\nand u2 are observable. We set K = 10 for the multi-step loss and randomly sample\n10 bursts from each trajectory to construct the training dataset. The ResNet has 3\nhidden layers, each with 10 neurons, and is trained for 500 epochs with a batch size of\n5. Following training, the model\u2019s performance is evaluated on a new and unseen test\nset consisting of 100 trajectories with initial states uniformly sampled on \u2126. Figure 6\n18\nJ. CHEN, K. WU, D. 
XIU\ndisplays an example trajectory alongside the reference solution, as well as the average\n\u21132 error over time. The trained model demonstrates accurate predictions up to t = 20,\nequivalent to 1,000 forward steps.\n\u22120.8\n\u22120.4\n0\n0.4\n0.8\n\u22122.6\n\u22121.3\n0\n1.3\n2.6\nu1\nu2\nReference\nPrediction\n0\n1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n19\n20\n0\n0.002\n0.004\n0.006\n0.008\n0.01\n0.012\nt\nError u1\nError u2\nFig. 6: Fully observed damped pendulum system. Left: Comparison between the\npredicted and reference solutions. Right: Average \u21132 error computed on the test set.\n5.1.2. Partially Observed Case. In this scenario, we focus on modeling a re-\nduced system solely related to u1. Trajectories of u2 are excluded from the training\ndata, and we address this partially observed system by adopting M = 10 memory\nsteps in the model. Thanks to the optimized data processing module of DUE, users\ncan easily try different values of M by modifying the memory parameter in the con-\nfiguration file; see Section 4.1.1. Other configurations remain the same as in the fully\nobserved case. Figure 7 illustrates an example trajectory and the average \u21132 error on\nthe test set.\n0\n4\n8\n12\n16\n20\n\u22121\n\u22120.5\n0\n0.5\n1\nt\nReference\nPrediction\n0\n4\n8\n12\n16\n20\n0\n0.002\n0.004\n0.006\nt\nError u1\nFig. 7: Partially observed damped pendulum system. Left: Comparison between the\npredicted and reference solutions. Right: Average \u21132 error computed on the test set.\n5.1.3. Robustness to Noisy Data. In the third scenario, we introduce artifi-\ncial noise to the synthetic data used in Section 5.1.2 to assess the model\u2019s robustness\nto measurement errors. Specifically, the training data are modified as\nn\nu(j)\nin (1 + \u03f5(j)\nin ), u(j)\nout(1 + \u03f5(j)\nout)\noJ\nj=1 ,\nwhere the relative noise terms \u03f5(j)\nin and \u03f5(j)\nout are drawn from a uniform distribution\nover [\u2212\u03b7, \u03b7], with \u03b7 representing the noise level. We perform two experiments with\n\u03b7 set to 0.05 and 0.1, corresponding to noise levels of 5% and 10%, respectively. All\nother settings are kept the same as in Section 5.1.2. Figure 8 shows the predicted\nDUE\n19\ntrajectories generated by two different models trained on noisy data. While some\ndeviation from the exact dynamics is observed, the oscillating and damping patterns of\nthe solution remain well-captured. The model\u2019s performance can be further enhanced\nby increasing the amount of training data.\n0\n4\n8\n12\n16\n20\n-1\n-0.5\n0\n0.5\n1\nt\nReference\nPrediction\n0\n4\n8\n12\n16\n20\n-1\n-0.5\n0\n0.5\n1\nt\nReference\nPrediction\nFig. 8: Partially observed damped pendulum system with noisy data. Left: Noise\nlevel \u03b7 = 5%. Right: Noise level \u03b7 = 10%.\n5.2. Two Coupled Oscillators with Real-World Noisy Data. Next, we\nuse DUE to model the unknown ODEs of two coupled oscillators using real-world\ndata [57, 69]. This dataset consists of a single trajectory with 486 recorded states,\nof which the first 360 states are used for training and the remaining for testing. The\nstate variables of interest include the positions and momenta of the two oscillators,\nresulting in a state space in R4. Due to measurement noise, the experimental data\nmay not perfectly represent the full system. 
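Noise of this kind can also be emulated on synthetic data, as in Section 5.1.3, where every input and output state is scaled by 1 + ε with ε drawn uniformly from [−η, η]. A minimal NumPy sketch of that perturbation is given below; the array names are placeholders for the arrays returned by the data loader, and this preprocessing is not part of DUE itself.

import numpy as np

def add_relative_noise(u, eta, rng):
    # scale every entry by (1 + eps) with eps ~ Uniform(-eta, eta)
    eps = rng.uniform(-eta, eta, size=u.shape)
    return u * (1.0 + eps)

rng = np.random.default_rng(0)
trainX_noisy = add_relative_noise(trainX, eta=0.05, rng=rng)   # 5% noise level
trainY_noisy = add_relative_noise(trainY, eta=0.05, rng=rng)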
In this example, we examine the impact of memory terms in modeling partially observed systems by training two models with M = 0 and M = 10, respectively. Each model employs a ResNet with 3 hidden layers, each containing 10 neurons, and is trained for 500 epochs with a batch size of 1. The predicted phase plots are displayed in Figure 9. Despite the data scarcity and measurement noise, both models successfully capture the underlying dynamics. The advantage of using memory terms is evident from the improved accuracy with M = 10 compared to M = 0.

Fig. 9: Two coupled oscillators from real-world data. Models with different memory steps (M) predict the system's evolution. Top: M = 0. Bottom: M = 10. Left: phase plots of Mass 1. Middle: phase plots of Mass 2. Right: the ℓ∞ error (suggested in [69]) computed on the last 126 states of the experimental data.

5.3. Chaotic Lorenz System. Next, we demonstrate DUE's capability to model the chaotic Lorenz system [17]. The true equations are given by

(5.1)  du1/dt = σ(u2 − u1),  du2/dt = u1(ρ − u3) − u2,  du3/dt = u1u2 − βu3,

with σ = 10, ρ = 28, and β = 8/3. The synthetic dataset consists of N = 1,000 trajectories for (u1, u2, u3), each with a length of L = 10,000 and a time lag of ∆ = 0.01. Initial states are randomly sampled from the uniform distribution on Ω = [−π/2, π/2]³. We set K = 10 for the multi-step loss and randomly sample 5 bursts from each trajectory to construct the training dataset. In this example, we compare the performance of ResNet and gResNet. The baseline ResNet is built upon an FNN with 3 hidden layers, each with 10 neurons. The gResNet consists of an FNN with the same architecture and a pre-trained affine model, implemented as affine in DUE. Both models are trained for 500 epochs with a batch size of 5. After training, the models are evaluated on a new and unseen test set consisting of 100 trajectories with initial states uniformly sampled on Ω. Figure 10 displays an example of the predicted and reference trajectories, while Figure 11 shows the average ℓ2 error up to t = 10 on the test set. These results indicate that both ResNet and gResNet can capture the system's chaotic evolution, with gResNet achieving higher prediction accuracy.

Fig. 10: Lorenz equations. From left to right: reference solution, prediction by gResNet, prediction by ResNet.

Fig. 11: Lorenz equations. From left to right: ℓ2 error of u1, u2, and u3.

5.4. Robertson Chemical Reaction Equations with Multi-Scale Dynamics. This example explores the Robertson chemical reaction system, which describes the kinetics of three chemical species: A, B, and C.
Proposed by Robertson in 1966 [51], the system is governed by the following nonlinear ODEs:

(5.2)  du1/dt = −k1u1 + k2u2u3,  du2/dt = k1u1 − k2u2u3 − k3u2²,  du3/dt = k3u2²,

where (u1, u2, u3) represent the concentrations of (A, B, C), respectively. The reaction rates are k1 = 0.04, k2 = 10⁴, and k3 = 3 × 10⁷, making the system highly stiff. To capture dynamics across both small and large time scales, we use DUE's ode_osg module to approximate flow maps with varied time step sizes [9]. The synthetic dataset comprises 50,000 input-output pairs, with time lags sampled as 10^s, where s is drawn from the uniform distribution U[−4.5, 2.5]. Initial states are randomly sampled from the domain [0, 1] × [0, 5 × 10⁻⁵] × [0, 1], and the system is solved using the variable-step, variable-order ode15s solver in Matlab.

To address the challenge of multi-scale temporal dynamics, we employ a dual-OSG-Net with 3 hidden layers, each containing 60 neurons. The neural network model is trained using the GDSG method to embed the semigroup property, with the hyperparameters λ and Q both set to 1. Additionally, we train a second model using the vanilla OSG-Net [9] for benchmarking. Both models are trained for 10,000 epochs with a batch size of 500. After training, predictions are initiated from (u1, u2, u3) = (1, 0, 0) to forecast the multi-scale kinetics of the three chemical species until t = 100,000, a challenging long-term prediction task. The time step size starts from ∆1 = 5 × 10⁻⁵ and doubles after each step until it reaches ∆2 = 300. As shown in Figure 12, the dual-OSG-Net model accurately predicts the dynamics across all time scales between ∆1 and ∆2, demonstrating superior long-term accuracy compared to the vanilla OSG-Net model.

Fig. 12: Robertson chemical reaction equations. Left: OSG-Net prediction vs. reference solution. Right: dual-OSG-Net prediction vs. reference solution. Initial state: (1, 0, 0). The value of u2 is multiplied by 10⁴ for clearer visualization.

5.5. One-dimensional Viscous Burgers' Equation. This example demonstrates DUE's capabilities in learning PDEs by focusing on the viscous Burgers' equation with Dirichlet boundary conditions [65, 14]:

(5.3)  ∂tu + ∂x(u²/2) = (1/10) ∂xxu,  (x, t) ∈ (0, 2π) × R+,  u(0, t) = u(2π, t) = 0,  t ≥ 0.

The training data are generated by sampling the power series solutions of the true equation on a uniform grid with 128 nodal points.
Initial conditions are drawn from\na Fourier series with random coefficients: u(x, t = 0) = P10\nm=1 am sin(mx), where\nam \u223cU[\u22121/m, 1/m].\nWe generate N = 1, 000 trajectories of the solution with\ndifferent initial conditions, and record L = 40 snapshots on each trajectory with a\ntime lag \u2206= 0.05.\nIn this example,\nwe introduce how to learn PDEs in modal space us-\ning DUE\u2019s generalized fourier projection1d class.\nFirst,\ninitialize this\nclass\nby\nspecifying\na\ntruncation\nwave\nnumber\nfor\nthe\nmodal\nexpansion.\nThe\ntraining\ndata\nare\nprojected\ninto\nthe\nreduced\nmodal\nspace\nvia\nthe\ngeneralized fourier projection1d.forward function. This data transformation\nis followed by a standard ODE modeling procedure, resulting in a model that cap-\ntures the dynamics of the modal coefficients. During prediction, the future states of\nthe modal coefficients are used to recover solutions in the physical space using the\ngeneralized fourier projection1d.backward function. For this example, the\ntruncation wave number is set to 10. We adopt a ResNet with 3 hidden layers, each\ncontaining 60 neurons. The model is trained for 500 epochs with a batch size of 10.\nAfter training, we evaluate the model\u2019s performance on a new and unseen test set.\nFigure 13 displays predictions for two example trajectories up to t = 10, equivalent\nto 200 forward steps.\nFig. 13: One-dimensional viscous Burgers\u2019 equation. Predicted and reference solutions\nfor two example trajectories originating from different initial conditions. Black solid\nlines indicate reference solution contours, while red dotted lines and colored plots\nshow predictions.\n5.6. Two-dimensional Incompressible Navier\u2013Stokes Equations. This\nexample illustrates learning the incompressible Navier\u2013Stokes equations [36, 9]:\n(5.4)\n\uf8f1\n\uf8f4\n\uf8f2\n\uf8f4\n\uf8f3\n\u2202t\u03c9(x, t) + v(x, t) \u00b7 \u2207\u03c9(x, t) = \u03bd\u2206\u03c9(x, t) + f(x),\nx \u2208(0, 1)2, t > 0,\n\u2207\u00b7 v(x, t) = 0,\nx \u2208(0, 1)2, t > 0,\n\u03c9(x, t = 0) = \u03c90(x),\nx \u2208(0, 1)2,\nDUE\n23\nwhere v(x, t) is the velocity, \u03c9 = \u2207\u00d7 v is the vorticity, \u03bd = 10\u22123 denotes viscosity,\nand f(x) = 0.1(sin(2\u03c0(x1 + x2)) + cos(2\u03c0(x1 + x2))) represents a periodic external\nforce. Our goal is to learn the evolution operators of \u03c9 from data with varied time\nlags. We use the data from [9], which comprises N = 100 trajectories of solution\nsnapshots with a length of 50. Solutions are sampled on a 64 \u00d7 64 uniform grid, with\ntime lags randomly sampled from the uniform distribution on [0.5, 1.5]. For neural\nnetwork modeling, we construct an OSG-Net with the Fourier neural operator as the\nbasic block, implemented as osg fno in DUE. The two hyperparameters \u03bb and Q are\nboth set to 1 for the GDSG loss function. We train the model for 500 epochs with a\nbatch size of 20. Subsequently, the trained model is evaluated on 100 new and unseen\ntrajectories with a length of 100 and a time step size \u2206= 1. As shown in Figure 14,\nthe model trained with the GDSG method produces accurate predictions at t = 60\nand 100. Figure 15 displays the training loss and testing errors. We observe that the\npurely data-driven model, which is trained using only the plain fitting loss without\nembedding the semigroup property, is unstable.\n\u22122.1\n0\n2.1\n\u22122.1\n0\n2.1\n0\n0.014\n0.028\n\u22122.1\n0\n2.1\n\u22122.1\n0\n2.1\n0\n0.014\n0.028\nFig. 
14: Two-dimensional Navier\u2013Stokes equations. Vorticity at t = 60 (top row) and\n100 (bottom row) predicted by the model trained with the GDSG method.\n0\n100\n200\n300\n400\n500\n10\u22123\n10\u22122\n10\u22121\nEpochs\nOSG-FNO\nOSG-FNO+GDSG\n0\n20\n40\n60\n80\n100\n10\u22123\n10\u22122\n10\u22121\nt\nOSG-FNO\nOSG-FNO+GDSG\nFig. 15: Two-dimensional Navier\u2013Stokes equations. Left: training loss recorded after\nevery epoch. Right: average relative \u21132 error computed on the test set with 100 new\nand unseen trajectories with a length of 100 and time lag \u2206= 1.\n5.7. Two-dimensional Flow Past a Circular Cylinder. In this classic fluid\nmechanics example, we use DUE to learn the dynamics of fluid velocity v and pressure\n24\nJ. CHEN, K. WU, D. XIU\np around a circular cylinder, generating periodic oscillations at a low Reynolds num-\nber. Synthetic data is generated by numerically solving the incompressible Navier\u2013\nStokes equations:\n(5.5)\n(\n\u2202tv + v \u00b7 \u2207v = \u22121\n\u03c1\u2207p + \u03bd\u2206v,\nx \u2208\u2126, t > 0,\n\u2207\u00b7 v(x, t) = 0,\nx \u2208\u2126, t > 0,\nwith fluid density \u03c1 = 1 and viscosity \u03bd = 0.001.\nThe geometric configuration,\nboundary conditions, and computing mesh are depicted in Figure 16. The horizontal\nvelocity component at the inlet, denoted as v0, is sampled from the following Fourier\nseries with random coefficients:\n(5.6)\nv0(y, t) = 1 + 0.6\n5\nX\nm=1\nam sin\n\u00122m\u03c0\nH y\n\u0013\n,\nwhere am \u223cU[\u22121/m, 1/m], and H is the height of the rectangular domain.\n0\n1\n2\n-0.2\n-0.1\n0\n0.1\n0.2\n\ud835\udc65\n\ud835\udc66\n\ud835\udc3f= 0.8\n\ud835\udc3b= 0.4\nFig. 16: Two-dimensional flow past a circular cylinder at the origin in a rectangular\ndomain. The inlet is 0.2 units upstream of the cylinder\u2019s centroid. Domain size: 0.8\nwidth, 0.4 height. Inflow has zero vertical velocity. Lateral boundaries: v = (1, 0).\nOutflow: zero pressure. No-slip condition on the cylinder\u2019s surface.\nThe dataset consists of 1,000 trajectories with 11 snapshots each, having a time\nlag of 0.05.\nWe set K = 0 for the multi-step loss and rearrange each trajectory\ninto 10 input-output pairs to construct the training data set. For neural network\nmodeling with data sampled on an unstructured mesh, we employ a ResNet with the\nPosition-induced Transformer (PiT) as the basic block, implemented as pit in DUE.\nThe model is trained for 500 epochs with a batch size of 50. Subsequently, the trained\nmodel is evaluated on 100 new and unseen trajectories with 10 forward time steps\n(up to t = 0.5). As shown in Figure 17, the PiT model in DUE successfully captures\nthe dynamics of both the velocity and pressure. The relatively larger error in the\ndownstream region is due to a sparser distribution of sampling grid points, resulting\nin a lower resolution of the downstream flow. Consequently, this region contributes\nless to the loss function, leading the model to learn less about the flow patterns there.\nThe resolution in this region can be improved by locally increasing the number of\nsampling points.\n6. Conclusions and Prospects. Artificial intelligence is revolutionizing scien-\ntific research, offering profound insights and accelerating discoveries across various\nDUE\n25\n\u22120.3\n0.7\n1.7\n0\n0.012\n0.024\n\u22120.9\n0\n0.9\n0\n0.011\n0.022\n\u22120.9\n0.1\n1.1\n0\n0.006\n0.012\nFig. 17: Two-dimensional flow past a circular cylinder. From top to bottom: horizon-\ntal velocity; vertical velocity; pressure. 
Left: the referential (v, p) at t = 0.5; Middle:\nthe predicted (v, p) given by the PiT model; Right: the absolute errors between the\nreferences and the predictions.\nfields through advanced data analysis and predictive modeling. This paper has intro-\nduced a comprehensive framework for learning unknown equations using deep learn-\ning, featuring advanced neural network architectures such as ResNet, gResNet, OSG-\nNet, and Transformers. This adaptable framework is capable of learning unknown\nODEs, PDEs, DAEs, IDEs, and SDEs, as well as reduced or partially observed sys-\ntems with missing variables. Compared to DMD, which offers faster training times\nand performs well on linear systems, the deep learning framework requires more com-\nputational resources for training but excels at capturing nonlinear interactions and\nmodeling complex systems, providing greater flexibility and accuracy for tackling\nchallenging problems.\nWe have presented the novel dual-OSG-Net architecture to address the challenges\nposed by multi-scale stiff differential equations, enabling accurate learning of dynam-\nics across broad time scales. Additionally, we have introduced several techniques to\nenhance prediction accuracy and stability, including a multi-step loss function that\nconsiders model predictions several steps ahead during training, and a semigroup-\ninformed loss function that embeds the semigroup property into the models. These\ntechniques could serve as examples for students and newcomers, illustrating the fron-\ntier of embedding prior knowledge into deep learning for data-driven discovery and\ndeveloping structure-preserving AI for modeling unknown equations.\nTo support this framework, we developed Deep Unknown Equations (DUE), a\nuser-friendly, comprehensive software tool equipped with extensive functionalities for\nmodeling unknown equations through deep learning. DUE facilitates rapid scripting,\nallowing users to initiate new modeling tasks with just a few lines of code. It serves\nas both an educational toolbox for students and newcomers and a versatile Python\nlibrary for researchers dealing with differential equations. DUE is applicable not only\nfor learning unknown equations from data but also for surrogate modeling of known,\nyet complex, equations that are costly to solve using traditional numerical methods.\nThe extensive numerical examples presented in this paper demonstrate DUE\u2019s power\nin modeling unknown equations, and the source codes for these examples are available\nin our GitHub repository, providing templates that users can easily adapt for their\nresearch.\n26\nJ. CHEN, K. WU, D. XIU\nLooking ahead, DUE is envisioned as a long-term project with ongoing mainte-\nnance and regular updates to incorporate advanced techniques. We are committed\nto continuously optimizing DUE\u2019s performance and adding new functionalities as re-\nsearch in this field progresses. We also encourage contributions from users to expand\nDUE\u2019s capabilities and broaden its applicability across a wider range of scenarios.\nOne promising direction is to implement robust denoising procedures during data\npreprocessing, enabling DUE to achieve reliable results even with high levels of noise\nin the data. Additionally, reducing the amount of data required for effective deep\nlearning performance is valuable. 
While the current semigroup-informed learning ap-\nproach helps in this regard, incorporating additional physical constraints or leveraging\nprior models and knowledge could further guide the model toward accurate predic-\ntions with less data. Another effective strategy is active learning, which focuses on\nselecting the most informative data points for model training. By concentrating on\ncritical data, active learning can enhance model performance while reducing data re-\nquirements.\nLastly, transfer learning offers a powerful approach to minimize data\nneeds further by utilizing pre-trained models on related tasks. For instance, neural\noperators, with their discretization-invariant properties, can be pre-trained on coarser\ndata and adapted to finer resolutions with minimal or no retraining. Exploring ad-\nditional transfer learning techniques, such as those tailored to multi-frequency time\nseries data, is also a promising direction.\nAcknowledgment. The authors would like to express their sincere gratitude to\nthe anonymous reviewers for their insightful comments and constructive suggestions,\nwhich have enhanced the quality of this paper.\nREFERENCES\n[1] A. A. Ahmadi and B. E. Khadir, Learning dynamical systems with side information, SIAM\nRev., 65 (2023), pp. 183\u2013223.\n[2] K. Azizzadenesheli, N. Kovachki, Z. Li, M. Liu-Schiaffini, J. Kossaifi, and A. Anandku-\nmar, Neural operators for accelerating scientific simulations and design, Nat. Rev. Phy.,\n(2024), pp. 1\u20139.\n[3] J. Bongard and H. Lipson, Automated reverse engineering of nonlinear dynamical systems,\nProc. Natl. Acad. Sci., 104 (2007), pp. 9943\u20139948.\n[4] S. L. Brunton, B. W. Brunton, J. L. Proctor, E. Kaiser, and J. N. Kutz, Chaos as an\nintermittently forced linear system, Nat. Commun., 8 (2017), p. 19.\n[5] S. L. Brunton, M. Budi\u02c7si\u00b4c, E. Kaiser, and J. N. Kutz, Modern Koopman theory for dy-\nnamical systems, SIAM Rev., 64 (2022), pp. 229\u2013340.\n[6] S. L. Brunton, J. L. Proctor, and J. N. Kutz, Discovering governing equations from data\nby sparse identification of nonlinear dynamical systems, Proc. Natl. Acad. Sci., 113 (2016),\npp. 3932\u20133937.\n[7] E. J. Candes, J. K. Romberg, and T. Tao, Stable signal recovery from incomplete and\ninaccurate measurements, Comm. Pure Appl. Math., 59 (2006), pp. 1207\u20131223.\n[8] S. Cao, Choose a Transformer: Fourier or Galerkin, in NeurIPS, vol. 34, 2021, pp. 24924\u2013\n24940.\n[9] J. Chen and K. Wu, Deep-OSG: Deep learning of operators in semigroup, J. Comput. Phys.,\n493 (2023), p. 112498.\n[10] J. Chen and K. Wu, Positional knowledge is all you need: Position-induced Transformer\n(PiT) for operator learning, in ICML, PMLR, 2024.\n[11] R. T. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud, Neural ordinary differ-\nential equations, in NeurIPS, vol. 31, 2018.\n[12] Y. Chen, Y. Luo, Q. Liu, H. Xu, and D. Zhang, Symbolic genetic algorithm for discovering\nopen-form partial differential equations (SGA-PDE), Phys. Rev. Res., 4 (2022), p. 023174.\n[13] Y. Chen and D. Xiu, Learning stochastic dynamical system via flow map operator, J. Comput.\nPhys., (2024), p. 112984.\n[14] Z. Chen, V. Churchill, K. Wu, and D. Xiu, Deep neural network modeling of unknown\npartial differential equations in nodal space, J. Comput. Phys., 449 (2022), p. 110782.\nDUE\n27\n[15] Z. Chen and D. Xiu, On generalized residual network for deep learning of unknown dynamical\nsystems, J. Comput. Phys., 438 (2021), p. 110362.\n[16] V. Churchill, Y. Chen, Z. Xu, and D. 
Xiu, DNN modeling of partial differential equations\nwith incomplete data, J. Comput. Phys., 493 (2023), p. 112502.\n[17] V. Churchill and D. Xiu, Deep learning of chaotic systems from partially-observed data, J.\nMach. Learn. Model. Comput., 3 (2022).\n[18] V. Churchill and D. Xiu, Flow map learning for unknown dynamical systems: Overview,\nimplementation, and benchmarks, J. Mach. Learn. Model. Comput., 4 (2023).\n[19] Q. Du, Y. Gu, H. Yang, and C. Zhou, The discovery of dynamics via linear multistep methods\nand deep learning: error estimation, SIAM J. Numer. Anal., 60 (2022), pp. 2014\u20132045.\n[20] M. Feurer and F. Hutter, Hyperparameter optimization, Automated machine learning:\nMethods, systems, challenges, (2019), pp. 3\u201333.\n[21] X. Fu, L.-B. Chang, and D. Xiu, Learning reduced systems via deep neural networks with\nmemory, J. Mach. Learn. Model. Comput., 1 (2020).\n[22] X. Fu, W. Mao, L.-B. Chang, and D. Xiu, Modeling unknown dynamical systems with hidden\nparameters, J. Mach. Learn. Model. Comput., 3 (2022).\n[23] Z. Hao, Z. Wang, H. Su, C. Ying, Y. Dong, S. Liu, Z. Cheng, J. Song, and J. Zhu,\nGnot: A general neural operator Transformer for operator learning, in ICML, PMLR,\n2023, pp. 12556\u201312569.\n[24] K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, in CVPR,\n2016, pp. 770\u2013778.\n[25] D. Hendrycks and K. Gimpel, Gaussian error linear units (GELUs), arXiv:1606.08415,\n(2016).\n[26] C. F. Higham and D. J. Higham, Deep learning: An introduction for applied mathematicians,\nSIAM Rev., 61 (2019), pp. 860\u2013891.\n[27] S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Comput., 9 (1997),\npp. 1735\u20131780.\n[28] R. T. Keller and Q. Du, Discovery of dynamics using linear multistep methods, SIAM J.\nNumer. Anal., 59 (2021), pp. 429\u2013455.\n[29] S. Kim, W. Ji, S. Deng, Y. Ma, and C. Rackauckas, Stiff neural ordinary differential\nequations, Chaos, 31 (2021).\n[30] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, in ICLR, 2015.\n[31] N. B. Kovachki, Z. Li, B. Liu, K. Azizzadenesheli, K. Bhattacharya, A. M. Stuart, and\nA. Anandkumar, Neural operator: Learning maps between function spaces with applica-\ntions to PDEs., J. Mach. Learn. Res., 24 (2023), pp. 1\u201397.\n[32] Y. LeCun, Y. Bengio, et al., Convolutional networks for images, speech, and time series,\nThe Handbook of brain theory and neural networks, 3361 (1995), p. 1995.\n[33] S. Lee and D. You, Data-driven prediction of unsteady flow over a circular cylinder using\ndeep learning, J. Fluid Mech., 879 (2019), pp. 217\u2013254.\n[34] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and\nA. Anandkumar, Neural operator: Graph kernel network for partial differential equa-\ntions, arXiv:2003.03485, (2020).\n[35] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, A. Stuart, K. Bhattacharya, and\nA. Anandkumar, Multipole graph neural operator for parametric partial differential equa-\ntions, in NeurIPS, vol. 33, 2020, pp. 6755\u20136766.\n[36] Z. Li, N. B. Kovachki, K. Azizzadenesheli, B. liu, K. Bhattacharya, A. Stuart, and\nA. Anandkumar, Fourier neural operator for parametric partial differential equations, in\nICLR, 2021.\n[37] Z. Li, K. Meidani, and A. B. Farimani, Transformer for partial differential equations operator\nlearning, Trans. Mach. Learn. Res., (2022).\n[38] Z. Long, Y. Lu, and B. Dong, PDE-Net 2.0: Learning PDEs from data with a numeric-\nsymbolic hybrid deep network, J. Comput. Phys., 399 (2019), p. 
108925.\n[39] Z. Long, Y. Lu, X. Ma, and B. Dong, PDE-Net: Learning PDEs from data, in ICML, PMLR,\n2018, pp. 3208\u20133216.\n[40] I. Loshchilov and F. Hutter, SGDR: Stochastic gradient descent with warm restarts, in\nICLR, 2016.\n[41] L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis, Learning nonlinear operators via\nDeepONet based on the universal approximation theorem of operators, Nat. Mach. Intell.,\n3 (2021), pp. 218\u2013229.\n[42] L. Lu, X. Meng, Z. Mao, and G. E. Karniadakis, DeepXDE: A deep learning library for\nsolving differential equations, SIAM Rev., 63 (2021), pp. 208\u2013228.\n[43] N. M. Mangan, J. N. Kutz, S. L. Brunton, and J. L. Proctor, Model selection for dynam-\n28\nJ. CHEN, K. WU, D. XIU\nical systems via sparse regression and information criteria, Proc. R. Soc. A: Math. Phys.\nEng. Sci., 473 (2017), p. 20170009.\n[44] H. Miao, X. Xia, A. S. Perelson, and H. Wu, On identifiability of nonlinear ODE models\nand applications in viral dynamics, SIAM Rev., 53 (2011), pp. 3\u201339.\n[45] T. Qin, Z. Chen, J. D. Jakeman, and D. Xiu, Data-driven learning of nonautonomous sys-\ntems, SIAM J. Sci. Comput., 43 (2021), pp. A1607\u2013A1624.\n[46] T. Qin, K. Wu, and D. Xiu, Data driven governing equations approximation using deep neural\nnetworks, J. Comput. Phys., 395 (2019), pp. 620\u2013635.\n[47] M. Raissi, Deep hidden physics models: Deep learning of nonlinear partial differential equa-\ntions, J. Mach. Learn. Res., 19 (2018), pp. 932\u2013955.\n[48] M. Raissi, P. Perdikaris, and G. E. Karniadakis, Machine learning of linear differential\nequations using Gaussian processes, J. Comput. Phys., 348 (2017), pp. 683\u2013693.\n[49] M. Raissi, P. Perdikaris, and G. E. Karniadakis, Multistep neural networks for data-driven\ndiscovery of nonlinear dynamical systems, arXiv:1801.01236, (2018).\n[50] M. Raissi, P. Perdikaris, and G. E. Karniadakis, Physics-informed neural networks: A deep\nlearning framework for solving forward and inverse problems involving nonlinear partial\ndifferential equations, J. Comput. Phys., 378 (2019), pp. 686\u2013707.\n[51] H. Robertson, The solution of a set of reaction rate equations, Numer. Anal.: An Introd.,\n178182 (1966).\n[52] O. Ronneberger, P. Fischer, and T. Brox, U-net: Convolutional networks for biomedical\nimage segmentation, arXiv:1505.04597, (2015).\n[53] S. Ruder, An overview of gradient descent optimization algorithms, arXiv:1609.04747, (2016).\n[54] S. H. Rudy, S. L. Brunton, J. L. Proctor, and J. N. Kutz, Data-driven discovery of partial\ndifferential equations, Sci. Adv., 3 (2017), p. e1602614.\n[55] H. Schaeffer, Learning partial differential equations via data discovery and sparse optimiza-\ntion, Proc. R. Soc. A: Math. Phys. Eng. Sci., 473 (2017), p. 20160446.\n[56] P. J. Schmid, Dynamic mode decomposition of numerical and experimental data, J. Fluid\nMech., 656 (2010), pp. 5\u201328.\n[57] M. Schmidt and H. Lipson, Distilling free-form natural laws from experimental data, Sci.,\n324 (2009), pp. 81\u201385.\n[58] Y. Sun, L. Zhang, and H. Schaeffer, NeuPDE: Neural network based ordinary and partial\ndifferential equations for modeling time-dependent data, in Math. and Sci. Mach. Learn.,\nPMLR, 2020, pp. 352\u2013372.\n[59] G. Tran and R. Ward, Exact recovery of chaotic systems from highly corrupted data, Multi-\nscale Model. Simul., 15 (2017), pp. 1108\u20131129.\n[60] J. H. Tu, C. W. Rowley, D. M. Luchtenburg, S. L. Brunton, and J. N. Kutz, On dynamic\nmode decomposition: Theory and applications, J. Comput. 
Dyn., 1 (2014), pp. 391\u2013421.\n[61] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser,\nand I. Polosukhin, Attention is all you need, in NeurIPS, vol. 30, 2017.\n[62] W.-X. Wang, R. Yang, Y.-C. Lai, V. Kovanis, and C. Grebogi, Predicting catastrophes\nin nonlinear dynamical systems by compressive sensing, Phys. Rev. Lett., 106 (2011),\np. 154101.\n[63] K. Wu, T. Qin, and D. Xiu, Structure-preserving method for reconstructing unknown Hamil-\ntonian systems from trajectory data, SIAM J. Sci. Comput., 42 (2020), pp. A3704\u2013A3729.\n[64] K. Wu and D. Xiu, Numerical aspects for approximating governing equations using data, J.\nComput. Phys., 384 (2019), pp. 200\u2013221.\n[65] K. Wu and D. Xiu, Data-driven deep learning of partial differential equations in modal space,\nJ. Comput. Phys., 408 (2020), p. 109307.\n[66] H. Xu, H. Chang, and D. Zhang, DLGA-PDE: Discovery of pdes with incomplete candidate\nlibrary via combination of deep learning and genetic algorithm, J. Comput. Phys., 418\n(2020), p. 109584.\n[67] H. Xu and D. Zhang, Robust discovery of partial differential equations in complex situations,\nPhys. Rev. Res., 3 (2021), p. 033270.\n[68] J. Xu and K. Duraisamy, Multi-level convolutional autoencoder networks for parametric pre-\ndiction of spatio-temporal dynamics, Comput. Methods Appl. Mech. Eng., 372 (2020),\np. 113379.\n[69] A. Zhu, T. Bertalan, B. Zhu, Y. Tang, and I. G. Kevrekidis, Implementation and (inverse\nmodified) error analysis for implicitly templated ODE-Nets, SIAM Journal on Applied\nDynamical Systems, 23 (2024), pp. 2643\u20132669.\n[70] Z. Zou, X. Meng, A. F. Psaros, and G. E. Karniadakis, NeuraluUQ: A comprehensive\nlibrary for uncertainty quantification in neural differential equations and operators, SIAM\nRev., 66 (2024), pp. 161\u2013190." - }, - { - "domain": "Materials Science", - "chunk_type": "general", - "text": "Spin-Orbital Intertwined Topological Superconductivity in a Class of Correlated\nNoncentrosymmetric Materials\nLichuang Wang,1, 2, \u2217Ran Wang,1, 2, \u2217Xinliang Huang,1, 2 Xianxin Wu,3, \u2020 and Ning Hao1, 2, \u2021\n1Anhui Provincial Key Laboratory of Low-Energy Quantum Materials and Devices,\nHigh Magnetic Field Laboratory, HFIPS, Chinese Academy of Sciences, Hefei, Anhui 230031, China\n2Science Island Branch of Graduate School, University of Science and Technology of China, Hefei, Anhui 230026, China\n3CAS Key Laboratory of Theoretical Physics, Institute of Theoretical Physics,\nChinese Academy of Sciences, Beijing 100190, China\nIn this study, we propose an alternative route to achieving topological superconductivity (TSC).\nOur approach applies to a new class of correlated noncentrosymmetric materials that host two\nspin-split Fermi surfaces with identical spin textures due to a spin-orbital intertwined effect. Incor-\nporating multi-orbital repulsive Hubbard interactions, we calculate the superconducting pairings of\na minimal two-orbital effective model within a spin-fluctuation-mediated superconductivity frame-\nwork. We find that, depending on the effective Rashba spin-orbit coupling (RSOC) strength and\nfilling level, the Hubbard interaction can drive the leading pairing symmetry into the A1(S\u00b1), B1,\nB2 or B2(d\u00b1) irreducible representations (IRs) of the C4v point group. Notably, the A1(S\u00b1) pairing\ngives rise to a fully gapped TSC characterized by a Z2 invariant, while the B2(d\u00b1) pairing results\nin a nodal TSC. 
Our analysis reveals that the fully gapped TSC is predominated by spin-singlet\nregardless of the presence of the spin-triplet components. This distinguishes our model from non-\ncentrosymmetric materials with conventional Rashba-split band structures, where TSC typically\nemerges near the van Hove singularity and is primarily driven by p-wave or f-wave spin-triplet pair-\ning. These features enhances its experimental accessibility, and we discuss potential experimental\nsystems for its realization.\nIntroduction.\u2014 TSCs can host Majorana bound states,\nwhich are exotic quasi-particles obeying non-Abelian\nstatistics. This unique feature makes TSCs fundamen-\ntally fascinating and potentially valuable for applica-\ntions in quantum computing[1\u20139].\nOver the past two\ndecades, many efforts have been devoted to realizing\nTSCs in various material platforms[10\u201318]. The general\nstrategy involves utilizing superconducting proximity ef-\nfects in either momentum space or real space. The for-\nmer, also known as connate TSCs, is typically achieved\nin multiband superconducting systems with topological\nboundary states, such as Fe(Se,Te), PbTaSe2 etc[13\u201320].\nThe latter approach involves the artificial construction of\ntopological insulator (or semiconductor)-superconductor\nheterostructures[10\u201312].\nHowever, these material plat-\nforms pose experimental challenges, requiring high inter-\nface quality, a large induced superconducting gap, and\nprecise control of the chemical potential.\nUnlike proximity-effect-induced TSCs, intrinsic TSCs\nexhibit topological properties throughout their entire\nvolume, making them robust against perturbations at\nboundaries or interfaces[1]. This characteristic provides\na significant advantage, positioning intrinsic TSCs as\nstrong candidates for building quantum computing plat-\nforms. The pairing symmetry in intrinsic TSCs is typ-\nically p-wave spin-triplet, which generally arises only in\nunconventional superconducting systems driven by non-\nelectron-phonon coupling mechanisms, making such ma-\nterials exceedingly rare in nature.\nCurrent candidates\ninclude uranium- and cerium-based heavy fermion com-\npounds, such as UTe2, CePt3Si[21\u201325].\nConsequently,\nthe exploration of new material systems for realizing\nTSCs remains an important and active area of research.\nA comprehensive analysis of material systems exhibit-\ning TSC reveals that their Fermi surfaces typically form\nrobust spin textures, indicating explicit or potential\ntopological characteristics in their electronic structures.\nNotably, materials such as CePt3Si, Y2C3, A2Cr3As3\n(A=K, Rb, Cs)[24\u201329], which lack inversion symmetry,\ndisplay similar features due to RSOC. Comparable spin\ntexture characteristics have also been experimentally ob-\nserved in the Fermi surfaces of certain copper- and iron-\nbased superconducting systems[30, 31].\nThese insights\nsuggest that TSCs may emerge in correlated unconven-\ntional superconducting systems when RSOC is consid-\nered. However, current research on single-orbital Rashba-\nHubbard models indicates that the small contribution of\nthe spin-triplet component in these systems is typically\ninsufficient to realize TSCs, except under specific condi-\ntions, such as fillings close to the van Hove singularity[32\u2013\n35].\nTo overcome the difficulties in realizing intrinsic TSCs,\nwe propose an alternative approach by introducing a new\ndegree of freedom, such as orbital, layer, or valley. 
For\nclarity, we focus on orbital degrees of freedom and con-\nsider a two-orbital Rashba-Hubbard model defined on a\nsquare lattice. Using the random phase approximation\n(RPA), we solve the superconducting problem of this\nmodel within the framework of spin-fluctuation-mediated\nsuperconductivity. We find that as the model parameters\nvary, the leading superconducting pairing can fall into\nthe A1(S\u00b1), B1, B2 or B2(d\u00b1) of C4v point group. Inter-\nestingly, the A1(S\u00b1) pairing induces a sign-change gap\nfunction on the two Fermi surfaces with identical spin\narXiv:2504.10392v1 [cond-mat.supr-con] 14 Apr 2025\n2\ntextures.\nThis sign-change feature is primarily driven\nby the spin-singlet-pairing components regardless of the\npresence of the spin-triplet components. Meanwhile, the\nA1(S\u00b1) pairing leads to full-gap TSC states characterized\nby the topological invariant Z2. This TSC is extraordi-\nnary and different from the ones driven by predominated\np-wave or f-wave spin-triplet pairing. Additionally, the\nB2(d\u00b1) pairing results in a nodal TSC state with coex-\nistence of zero-energy flat-band and dispersive Majorana\nstates at the system\u2019s boundary.\nFIG. 1.\n(a) The two-orbital (dx2\u2212y2, dxy) unconventional\nRashba model defined on a square lattice with C4v symme-\ntry. The relations between the parameters here and those in\nEq. 2 are t1/2 = (tx2\u2212y2 \u00b1 txy)/2 and \u03bbR = (\u03bbR1 + \u03bbR2)/2.\n(b)-(c) The representative band structure and Fermi surfaces\nfor electron doping. The black arrows on the Fermi surfaces\ndenote the effective spin and the color bar labels the renor-\nmalized orbital texture \u27e8FS| \u03c4 z\u03c30 |FS\u27e9k/\u27e8FS| \u03c4 z\u03c30 |FS\u27e9max.\nThe parameters t2 = 0, \u03bbR = 0.8, \u03f5 = 1 and \u00b5 = \u22124.5. (d-e)\nThe calculated bare and RPA longitudinal spin susceptibili-\nties \u03c7zz\n0 (q) and \u03c7zz\nRP A(q). The transverse parts are weak[43].\nThe parameters U = 4.5,V = 0.94U and J = 0.03U. Other\nparameters are the same as those in (b) and (c).\nModel and method.\u2014 The two-orbital unconventional\nRashba-Hubbard model is given by\nH = H0 + Hint,\n(1)\nwith\nH0 =\nX\nk\n\u03be(k)\u2212\u03bbR(\u03c4 0 +\u03f5\u03c4 3)[\u03c3 \u00d7\u2207k\u03b3(k)]z +\u03bb\u03c4 2\u03c33 (2)\nand\nHint = U\nX\ni\u03b1\n\u02c6ni\u03b1\u2191\u02c6ni\u03b1\u2193+ V\nX\ni\u03b1<\u03b2\nX\n\u03c3\u03c3\u2032\n\u02c6ni\u03b1\u03c3\u02c6ni\u03b2\u03c3\u2032\n\u2212J\nX\ni,\u03b1<\u03b2\nSi\u03b1 \u00b7 Si\u03b2\n+ J\u2032 X\ni,\u03b1<\u03b2\nX\n\u03c3\nc\u2020\ni\u03b1\u03c3c\u2020\ni\u03b1\u00af\u03c3ci\u03b2\u00af\u03c3ci\u03b2\u03c3\n(3)\nHere, H0 can be defined on a lattice with the symmetry\nof C3v, C4v or C6v[36\u201338]. For clarity and simplicity, we\nconsider square lattice with C4v, as shown in Fig. 1 (a).\nThe two orbital are fixed to be dx2\u2212y2 and dxy. Then,\n\u03be(k) = 2(t1\u03c4 0 + t2\u03c4 3)\u03b3(k) \u2212\u00b5 with \u03b3(k) = cos kx + cos ky\nand \u00b5 denoting the chemical potential. \u03bbR reperents the\nstrength of effective RSOC, and a parameter \u03f5 is intro-\nduced to measure the anisotropy of two orbitals[39]. \u03bb\nis the strength of on-site SOC. To match the parameter\nsettings in Fig. 1 (a), t1/2 = (tx2\u2212y2 \u00b1 txy)/2 and \u03bbR =\n(\u03bbR1+\u03bbR2)/2. In the following, t1 is set to \u22121, and other\nparameters are in unit of |t1|. \u03bb is set to 3 throughout the\ncalculations. 
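As a concrete illustration, the bands and Fermi surfaces of Figs. 1(b)-(c) can be reproduced from Eq. (2) with a few lines of NumPy. The sketch below is an illustrative reconstruction, not the authors' code: it uses the parameter values quoted above and in the caption of Fig. 1, the basis ordering Ψk introduced just below, and signs read directly from Eq. (2), all of which should be checked against the original definitions.

import numpy as np

# Pauli matrices (reused as tau_i in orbital space and sigma_i in spin space)
s0 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def H0(kx, ky, t1=-1.0, t2=0.0, lam_R=0.8, eps=1.0, lam=3.0, mu=-4.5):
    # 4x4 Bloch Hamiltonian of Eq. (2) in the basis (dx2-y2 up, dx2-y2 dn, dxy up, dxy dn)
    gamma = np.cos(kx) + np.cos(ky)
    # xi(k) = 2(t1*tau0 + t2*tau3)*gamma(k) - mu, acting as the identity in spin space
    xi = np.kron(2.0 * gamma * (t1 * s0 + t2 * sz) - mu * s0, s0)
    # [sigma x grad_k gamma]_z = -sigma_x sin(ky) + sigma_y sin(kx)
    cross_z = -np.sin(ky) * sx + np.sin(kx) * sy
    rashba = -lam_R * np.kron(s0 + eps * sz, cross_z)
    onsite_soc = lam * np.kron(sy, sz)   # lambda * tau2 (x) sigma3
    return xi + rashba + onsite_soc

# Band structure along Gamma-M (kx = ky), cf. Fig. 1(b)
ks = np.linspace(0.0, np.pi, 200)
bands = np.array([np.linalg.eigvalsh(H0(k, k)) for k in ks])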
Pauli matrices (\u03c4 0, \u03c4) and (\u03c30, \u03c3) span in\norbital and spin spaces, respectively. Then, H0 spans in\nthe bais \u03a8k = (cx2\u2212y2,k,\u2191, cx2\u2212y2k,\u2193, cxy,k,\u2191, cxy,k,\u2193)T with\nc\u03b1k,\u03c3 being the electron annihilation operator. U, V , J\nand J\u2032 reperents the intra- and inter-orbital on-site re-\npulsive Hubbard interactions, Hund\u2019s coupling, and the\npairing hopping term, respectively.\n\u02c6ni\u03b1\u03c3 and Si\u03b1 de-\nnote the spin-\u03c3 electron density operator and spin oper-\nator for electrons in orbital \u03b1 on i site, respectively. For\nclarity, we adopt the conventional notations J = J\u2032 and\nV = U \u22122J in the following discussion.\nThe general bare susceptibility can be expressed as fol-\nlows,\n\u03c7l1\u03c31l2\u03c32\nl3\u03c33l4\u03c34(q, i\u03bdn)\n= \u2212T\nX\nk,i\u03c9n\nGl3\u03c33l1\u03c31(k, i\u03c9n)Gl2\u03c32l4\u03c34(k, i\u03c9n \u2212i\u03bdn),\n(4)\nwith the Matsubara Green function,\n\u02c6G(k, i\u03c9n) = [i\u03c9n \u2212H0(k)]\u22121.\n(5)\nHere, l1\u22124 are orbital index, and \u03c31\u22124 are spin index. i\u03c9n\nand i\u03bdn fermionic and bosonic Matsubara frequencies,\nrespectively. Using RPA, the dressed susceptibility can\nbe calculated by\n\u02c6\u03c7RP A(q, i\u03bdn) = [\u02c61 \u2212\u02c6\u03c7(q, i\u03bdn) \u02c6W]\u22121\u03c7(q, i\u03bdn).\n(6)\nHere,\n\u02c6W is the bare interaction matrix with non-zero\nelements W l1\u03c3l1\u00af\u03c3\nl1\u03c3l1\u00af\u03c3 = U, W l1\u03c3l1\u03c3\nl1\u00af\u03c3l1\u00af\u03c3 = \u2212U, W l2\u03c3l1\u00af\u03c3\nl2\u03c3l1\u00af\u03c3 = V ,\nW l1\u03c3l1\u03c3\nl2\u00af\u03c3l2\u00af\u03c3 = \u2212V , W l1\u03c3l1\u00af\u03c3\nl2\u03c3l2\u00af\u03c3 = J, W l1\u03c3l2\u03c3\nl1\u00af\u03c3l2\u00af\u03c3 = \u2212J, W l1\u03c3l2\u00af\u03c3\nl1\u03c3l2\u00af\u03c3 =\nJ\u2032, and W l1\u03c3l2\u03c3\nl2\u00af\u03c3l1\u00af\u03c3 = \u2212J\u2032.\nAccording\nto\nthe\ntheory\nof\nfluctuation-mediated\nsuperconductivity[40\u201342], the effective static supercon-\nducting pairing interactions can be calculated by consid-\nering bubble and ladder diagrams[43], and are expressed\nas:\nV l1\u03c31l2\u03c32\nl3\u03c33l4\u03c34 (k, k\u2032) = W l1\u03c31l2\u03c32\nl3\u03c33l4\u03c34 + Vl(k, k\u2032) \u2212Vb(k, k\u2032), (7)\nwith the ladder term\nVl(k, k\u2032) = [ \u02c6W \u02c6\u03c7RP A \u02c6W]l1\u03c31l2\u03c32\nl3\u03c33l4\u03c34(k + k\u2032),\n(8)\n3\nand the bubble term\nVb(k, k\u2032) = [ \u02c6W \u02c6\u03c7RP A \u02c6W]l1\u03c31l4\u03c34\nl3\u03c33l2\u03c32(k \u2212k\u2032).\n(9)\nThe possible superconducting pairing can be evaluated\nthrough solving the linear Eliashaberg equation as fol-\nlows,\n\u03bb\u2206l1\u03c31l4\u03c34(k)\n= T\nX\nk\u2032i\u03c9n\nX\nl2l3\u03c32\u03c33\nV l1\u03c31l2\u03c32\nl3\u03c33l4\u03c34 (k, k\u2032)Fl3\u03c33l2\u03c32(k\u2032, i\u03c9n),\n(10)\nwith\nFl3\u03c33l2\u03c32(k\u2032, i\u03c9n)\n= Gl3\u03c33l\u03c3(k\u2032, i\u03c9n)\u2206l\u03c3l\u2032\u03c3\u2032(k\u2032)Gl2\u03c32l\u2032\u03c3\u2032(\u2212k\u2032, \u2212i\u03c9n).\n(11)\nHere, \u2206l1\u03c31l4\u03c34(k) is the superconducting pairing func-\ntion and the leading superconducting instability can be\nidentified by the largest positive eigen-value of \u03bb.\nFermionology and spin susceptibility.\u2014 Previous stud-\nies on the conventional Rashba-Hubbard model have fo-\ncused on cases where the filling is near the van Hove\nsingularity.\nThis condition induces strong ferromag-\nnetic fluctuations, which, in turn, favor the 
emergence\nof leading p- or f-wave spin-triplet superconducting\nstates[32, 34, 35].\nIn this work, we relax this restric-\ntion and consider more general cases of electron doping\nbeyond the van Hove singularity. Note that the hole dop-\ning is the mirror of the electron doping here. A typical\nband structure and Fermi surfaces under electron doping\nare shown in the Figs. 1 (b) and (c). Two Fermi sur-\nfaces have the same spin textures and apparent orbital\ntextures, as shown in Fig. 1 (c).\nThe bare and RPA spin susceptibilities corresponding\nto the Fermi surfaces in Fig.\n1 (c) can be calculated\nwith Eqs. 4 and 6. The calculated results indicate that\nthe transverse spin susceptibility \u03c7+\u2212(q) is consistently\nmuch smaller than the longitudinal spin susceptibility\n\u03c7zz(q)[43]. Therefore, we focus on \u03c7zz(q) as shown in\nFigs. 1 (d) and (e), which primarily determines the char-\nacteristics of spin fluctuations in the system.\nIn both\nFigs. 1 (d) and (e), in addition to the four peaks near\nthe (\u03c0, \u03c0), eight new peaks emerge, as indicated by the\nwhite dashed arrows in these panels. These new peaks\ncorrespond to the nesting wave vectors of the two spin-\nsplit Fermi surfaces, which are consistent with the black\ndashed arrows shown in Fig. 1(a). This suggests that\nthese eight new peaks play a crucial role in determin-\ning the characteristics of the potential superconducting\nstate, distinguishing it from the single-orbital case. We\nalso calculated the spin susceptibility for other doping\ncases[43] and found significantly different characteristics\ncompared to the single orbital cases.\nPhase diagram and sign-change SC states.\u2014 Once su-\nperconductivity is induced by antiferromagnetic fluctua-\ntions, the pairing symmetry of the superconducting state\nTABLE I. Possible IRs of superconducting pairings under the\nconstraints of group C4v. Here, gk = sin kx sin ky.\nPairing form\n\u03d5g/u / dg/u\nIRs\ni\u03d5g\u03c32\u03c4 0/3(1)\n1, cos kx + cos ky, cos kx cos ky\nA1(A2)\ncos kx \u2212cos ky\nB1(B2)\nsin kx sin ky\nB2(B1)\ni\u03d5u\u03c32\u03c4 2\n{sin kx, sin ky}\nE\ni(du \u00b7 \u03c3)\u03c32\u03c4 0/3(1)\n(\u2212sin ky, sin kx, 0)\nA1(A2)\n(sin kx, sin ky, 0)\nA2(A1)\n(sin ky, sin kx, 0)\nB1(B2)\n(sin kx, \u2212sin ky, 0)\nB2(B1)\n[(0, 0, sin kx), (0, 0, sin ky)]\nE\ni(dg \u00b7 \u03c3)\u03c32\u03c4 2\n(0, 0, cos kx + cos ky)\nA1\n(0, 0, cos kx \u2212cos ky)\nB1\n(0, 0, gk)\nB2\n{(cos kx/y, 0, 0), (0, cos ky/x, 0)}\nE\n{(gk, 0, 0), (0, gk, 0)}\ncan be classified according to the IRs of the C4v group.\nTable I lists the possible pairings with their form factors\nup to the next-nearest neighbor and their corresponding\nIRs. By solving the linear gap equation in Eq. (10), the\nleading superconducting pairing can be determined.\nFIG. 2.\nThe phase diagrams of superconducting states as\nfunctions of some parameters. (a)-(b) \u03bbR-\u00b5 phase diagram.\nU = 4.5 and J = 0.03U in (a). U = 5.4 and J = 0.21U in (b).\n(c)-(d) U-J phase diagram. \u03bbR = 0.9 and \u00b5 = \u22125.3 in (c).\n\u03bbR = 0.8 and \u00b5 = \u22124.5 in (d). In (a)-(d), all other parameters\nare the same as those in Fig. 1 (b). In panel (a), the positions\nmarked by numbers 1 and 2 correspond to the star-marked\npositions labeled 1 and 2 in (c) and (d). 
The notation S\u00b1\nand d\u00b1 in A1 and B2 indicates that the superconducting gap\nfunctions on the two spin-split Fermi surfaces have opposite\nsigns along the radial direction.\nThe phase diagrams of the superconducting state with\nrespect to \u03bbR-\u00b5 and U-J are shown in Figs.\n2 and\nFigs. S4 and S6[43]. Over a wide-range parameter space,\nwe identify three possible leading pairing channels: A1,\n4\nB1, and B2. Notably, the superconducting states in the\nA1 and B2 channels exhibit highly unconventional sign-\nchanging gap structures A1(S\u00b1) and B2(d\u00b1), which indi-\ncate that the superconducting gap functions on the two\nspin-split Fermi surfaces have opposite signs along the ra-\ndial direction as shown in Fig. 3 (a) and Fig. S9 (a)[43].\nFIG. 3.\n(a)-(d) The superconducting gap functions pro-\njected onto the corresponding Fermi surfaces. The red and\nblue colors indicate the opposite signs of superconducting\ngap functions.\n(a) and (b) correspond to full and A1,1 =\ni\u03c32\u03c4 0\u03c30\u03d51k channels in A1(S\u00b1) IR, respectively.\n(c) The\nrenormalized gap function \u2206A1(kF , \u03b8)/max(\u2206A1(kF , \u03b8)) and\n\u2206A1,1(kF , \u03b8)/max(\u2206A1,1(kF , \u03b8)). (d) The weights of six sub-\nchannels in A1(S\u00b1) IR as function as t2. (e)-(f) The fitting\ngap function \u22061/2(k) in Eq. 13 along different high-symmetry\nlines \u0393 \u2212M and \u0393 \u2212X. Other parameters in (a)-(d) are the\nsame as those in Fig. 1 (c) and of star labeled 2 in Fig. 2 (a)\nand (d).\nTo elucidate the origin of this unconventional sign-\nchange superconducting state, we take the A1(S\u00b1) chan-\nnel as an example. The gap function of noncentrosym-\nmetric systems can be parameterized as follows[44],\n\u02c6\u2206k = [\u02c6\u03d5k + \u02c6dk \u00b7 \u03c3]i\u03c32.\n(12)\nIn the one-orbital Rashba model, the projection of the\ngap function onto the two spin-split Fermi surfaces can\nbe expressed in a simplified form, \u22061/2(k) = \u03d5k \u00b1 | \u02c6dk|\nwith \u03d5k as a scalar.\nThus, the aforementioned sign-\nchange gap structure only occur when the triplet com-\nponent is relatively strong and has helical p- or f-wave\nspin-triplet pairing, such as when the filling is near the\nvan Hove singularity[32, 34, 35]. Any doping away from\nthe van Hove singularity consistently results in predom-\ninant d-wave spin-singlet pairing[33]. However, for the\ntwo-orbital model considered here, additional possibili-\nties arise. We project the six sub-channels i\u03c32(\u03c4 0\u03c30\u03d51k,\n\u03c4 3\u03c30\u03d52k, \u03c4 0\u03c3 \u00b7 d3k, \u03c4 3\u03c3 \u00b7 d4k, \u03c4 1\u03c3 \u00b7 d5k, \u03c4 2\u03c33d6k) in\nthe spin-orbital representation, listed in Table I onto the\ntwo Fermi surfaces. Note that the first two sub-channels\nbelong to spin-singlet while the latter four correspond\nto spin-triplet. The projection of the first sub-channel\ni\u03c32\u03c4 0\u03c30\u03d51k is shown in Fig. 3 (b), while the others are\npresented in Fig. S7 [43]. We find that their supercon-\nducting gap functions exhibit the same projected distri-\nbution on the Fermi surfaces, ensuring consistency with\nthe overall superconducting gap projection, as illustrated\nin Figs. 3 (a), (b) and (c).\nTo further clarify the relative contributions of these\nsix sub-channels to the overall sign-changing gap, we\nplot their contribution weights as a function of t2 in\nFig. 3(d). 
We find that the first sub-channel iσ2τ0σ0φ1k consistently dominates as t2 increases. In other words, the intra-orbital spin-singlet pairing always dominates the gap structure, regardless of the presence of the spin-triplet pairings. Then, in the spin-orbital representation, the superconducting gap function in Eq. 12 can be approximated as Δ̂k ∼ iσ2τ0σ0φ1k. Considering terms up to the next-nearest neighbor, φ1k can be fitted as −0.018 + 0.046(cos kx + cos ky) + 0.076 cos kx cos ky. Consequently, in the band representation, the gap functions on the two Fermi surfaces can be expressed as
  Δ1/2(k) = √(2(1 + |B±k|²)) φ1k,   (13)
with B±k = i(√(λ² + hk²λR²) ± λR hk)/λ and hk = √(sin²kx + sin²ky). In Figs. 3(e) and (f), we plot Δ1/2(k) along the Γ−X and Γ−M directions, respectively. The sign-change feature of the gap function at the two Fermi momenta is clearly visible. Such a sign-changing gap structure is consistent with the Fermi surface nesting and χzz(q) in Figs. 1(c)-(e). A similar analysis can be performed for the B2(d±) channel, as shown in Figs. S8-S10[43]. The results indicate that its sign-change feature cannot come from the pure spin-singlet channel and depends on the mixing of spin-singlet and spin-triplet components[43], in contrast to the A1(S±) pairing.

Topological superconductivity— For noncentrosymmetric systems with time-reversal symmetry, the criteria for identifying TSC require that a pair of spin-split Fermi surfaces enclose an odd number of time-reversal-invariant momentum points and that the order parameter have opposite signs on the two Fermi surfaces[45]. Additionally, a fully gapped superconducting state is preferable. According to these criteria, the Fermi surface in Fig. 1(c) and the corresponding superconducting gap in Fig. 3(a) or (b) support TSC.

For the two-dimensional system with time-reversal symmetry considered here, the Chern number is always zero, and the topological nature of the superconducting state can be characterized by a Z2 topological invariant. Due to the lack of spatial inversion symmetry, the Z2 invariant can be determined by calculating the Wannier centers of the superconducting state's wave functions[43, 46] and verified through the computation of the edge spectrum[47, 48].

FIG. 4. (a) The Wilson loop calculation for the A1(S±) superconducting state. (b) The quasi-particle spectrum of the A1(S±) superconducting state. Note that the in-gap edge states are doubly degenerate with opposite chirality. Periodic and open boundary conditions are applied along the x and y directions, respectively. In both (a) and (b), the parameters are the same as those in Fig. 3(a).

Figs. 4(a) and (b) show the Wannier centers and the edge spectrum, respectively, for the superconducting state with the Fermi surface from Fig. 1(c) and the gap structure from Fig. 3(b). It is straightforward to conclude that this superconducting state has a topological invariant Z2 = 1. The superconducting state in the B2(d±) channel has nodes; however, it is also topologically nontrivial[43, 49] and exhibits Majorana flat bands and dispersive Majorana states on its edges.

Outlook and conclusion— The TSC state driven by predominant s-wave spin-singlet pairing in our study has significant implications. 
First, TSC states realized\nthrough p- or f-wave spin-triplet pairing are extremely\nrare in real material systems. In contrast, almost all re-\nported superconductors exhibit s-wave spin-singlet pair-\ning, including cuprate, iron-based, and nickelate high-\ntemperature superconductors[50\u201352]. Second, the split-\nting of Fermi surfaces accompanied by spin-texture char-\nacteristics due to broken inversion symmetry has been\nexperimentally reported in systems such as cuprate and\niron-based superconductors[30, 31].\nFor instance, spin\nsplitting induced by Rashba SOC could potentially be\ntuned via electric fields or other external controls. Our\nproposal offers a viable pathway to realize bulk topo-\nlogical superconductivity in high-temperature supercon-\nducting systems without relying on proximity effects. If\nachieved, this could greatly advance the exploration of\ntopological quantum computing based on topological su-\nperconductivity.\nIn summary, we investigated superconductivity in\na new class of correlated non-centrosymmetric sys-\ntems. Using the theory of antiferromagnetic-fluctuation-\nmediated superconductivity, we identified four possible\nsuperconducting states A1(S\u00b1), B1, B2 and B2(d\u00b1) un-\nder different model parameters. all of which are predomi-\nnantly driven by spin-singlet pairing with A1, B1 and B2\npairing symmetries. Notably, the superconducting state\nwith A1(S\u00b1) IR is predominantly driven by spin-singlet\npairing and belongs to a bulk topological superconduct-\ning state characterized by a Z2 = 1 topological invari-\nant. Our proposal is different from all present scenarios\nbased on proximity effects and strong p- or f-wave spin-\ntriplet pairing. We further highlighted the potential real-\nization of such topological superconducting states in cer-\ntain high-temperature superconductors, providing a new\npathway for exploring next-generation high-temperature\ntopological superconductors.\nThis work was financially supported by the Na-\ntional\nKey\nR&D\nProgram\nof\nChina\n(Grants\nNo.\n2022YFA1403200, No. 2024YFA1613200 and Grant No.\n2023YFA1407300), National Natural Science Foundation\nof China (Grants No.\n92265104, No.\n12022413 and\nNo. 12447103), the Basic Research Program of the Chi-\nnese Academy of Sciences Based on Major Scientific In-\nfrastructures (Grant No. JZHKYPT-2021-08), the CA-\nSHIPS Director\u2019s Fund (Grant No. BJPY2023A09), An-\nhui Provincial Major S&T Project(s202305a12020005),\nand the High Magnetic Field Laboratory of Anhui\nProvince under Contract No. AHHM-FX-2020-02.\n\u2217These authors make equal contributions.\n\u2020 xxwu@itp.ac.cn\n\u2021 haon@hmfl.ac.cn\n[1] D. A. Ivanov, Non-abelian statistics of half-quantum vor-\ntices in p-wave superconductors, Physical review letters\n86, 268 (2001).\n[2] A. Y. Kitaev, Fault-tolerant quantum computation by\nanyons, Annals of physics 303, 2 (2003).\n[3] A. Kitaev, Anyons in an exactly solved model and be-\nyond, Annals of Physics 321, 2 (2006).\n[4] C. Nayak, S. H. Simon, A. Stern, M. Freedman, and\nS. Das Sarma, Non-abelian anyons and topological quan-\ntum computation, Reviews of Modern Physics 80, 1083\n(2008).\n[5] J. Alicea, New directions in the pursuit of majorana\nfermions in solid state systems, Reports on progress in\nphysics 75, 076501 (2012).\n[6] S. R. Elliott and M. Franz, Colloquium:\nMajorana\nfermions in nuclear, particle, and solid-state physics, Re-\nviews of Modern Physics 87, 137 (2015).\n[7] M. Z. Hasan and C. L. 
Kane, Colloquium: Topological\ninsulators, Rev. Mod. Phys. 82, 3045 (2010).\n[8] X.-L. Qi and S.-C. Zhang, Topological insulators and su-\nperconductors, Rev. Mod. Phys. 83, 1057 (2011).\n[9] N. Read and D. Green, Paired states of fermions in\ntwo dimensions with breaking of parity and time-reversal\nsymmetries and the fractional quantum hall effect, Phys.\nRev. B 61, 10267 (2000).\n[10] L. Fu and C. L. Kane, Superconducting proximity effect\nand majorana fermions\u00a1?\nformat?\u00bf at the surface of a\ntopological insulator, Physical review letters 100, 096407\n(2008).\n[11] J. D. Sau, R. M. Lutchyn, S. Tewari, and S. Das Sarma,\nGeneric new platform for topological quantum computa-\ntion using semiconductor heterostructures, Physical re-\nview letters 104, 040502 (2010).\n[12] S. Nadj-Perge, I. K. Drozdov, J. Li, H. Chen, S. Jeon,\n6\nJ. Seo, A. H. MacDonald, B. A. Bernevig, and A. Yaz-\ndani, Observation of majorana fermions in ferromagnetic\natomic chains on a superconductor, Science 346, 602\n(2014).\n[13] N. Hao and J. Hu, Topological phases in the single-layer\nfese, Physical Review X 4, 031053 (2014).\n[14] Z. Wang, P. Zhang, G. Xu, L. Zeng, H. Miao, X. Xu,\nT. Qian, H. Weng, P. Richard, A. V. Fedorov, et al.,\nTopological nature of the fese 0.5 te 0.5 superconductor,\nPhysical Review B 92, 115119 (2015).\n[15] X. Wu, S. Qin, Y. Liang, H. Fan, and J. Hu, Topological\ncharacters in fe (te 1- x se x) thin films, Physical Review\nB 93, 115129 (2016).\n[16] N. Hao and J. Hu, Topological quantum states of matter\nin iron-based superconductors: from concept to material\nrealization, National Science Review 6, 213 (2019).\n[17] X. Wu, S. Qin, Y. Liang, C. Le, H. Fan, and J. Hu, Cafeas\n2: A staggered intercalation of quantum spin hall and\nhigh-temperature superconductivity, Physical Review B\n91, 081111 (2015).\n[18] G. Xu, B. Lian, P. Tang, X.-L. Qi, and S.-C. Zhang,\nTopological superconductivity on the surface of fe-based\nsuperconductors, Physical review letters 117, 047001\n(2016).\n[19] T.-R. Chang,\nP.-J. Chen,\nG. Bian,\nS.-M. Huang,\nH. Zheng, T. Neupert, R. Sankar, S.-Y. Xu, I. Belopolski,\nG. Chang, et al., Topological dirac surface states and su-\nperconducting pairing correlations in pbtase 2, Physical\nReview B 93, 245130 (2016).\n[20] S.-Y. Guan, P.-J. Chen, M.-W. Chu, R. Sankar, F. Chou,\nH.-T. Jeng, C.-S. Chang, and T.-M. Chuang, Supercon-\nducting topological surface states in the noncentrosym-\nmetric bulk superconductor pbtase2, Science advances 2,\ne1600894 (2016).\n[21] L. Jiao, S. Howard, S. Ran, Z. Wang, J. O. Rodriguez,\nM. Sigrist, Z. Wang, N. P. Butch, and V. Madhavan,\nChiral superconductivity in heavy-fermion metal ute2,\nNature 579, 523 (2020).\n[22] D. Aoki, A. Nakamura, F. Honda, D. Li, Y. Homma,\nY. Shimizu, Y. J. Sato, G. Knebel, J.-P. Brison, A. Pour-\nret, et al., Unconventional superconductivity in heavy\nfermion ute2, journal of the physical society of japan 88,\n043702 (2019).\n[23] D. Aoki, J.-P. Brison, J. Flouquet, K. Ishida, G. Knebel,\nY. Tokunaga, and Y. Yanase, Unconventional supercon-\nductivity in ute2, Journal of Physics: Condensed Matter\n34, 243002 (2022).\n[24] E. Bauer, G. Hilscher, H. Michor, C. Paul, E.-W. Scheidt,\nA. Gribanov, Y. Seropegin, H. No\u00a8el, M. Sigrist, and\nP. Rogl, Heavy fermion superconductivity and magnetic\norder in noncentrosymmetric c e p t 3 s i, Physical review\nletters 92, 027003 (2004).\n[25] M. Smidman, M. Salamon, H. Yuan, and D. 
Agter-\nberg,\nSuperconductivity\nand\nspin\u2013orbit\ncoupling\nin\nnon-centrosymmetric materials:\na review, Reports on\nProgress in Physics 80, 036501 (2017).\n[26] M. Krupka, A. Giorgi, N. Krikorian, and E. Szklarz, High\npressure synthesis and superconducting properties of yt-\ntrium sesquicarbide, Journal of the Less Common Metals\n17, 91 (1969).\n[27] J.-K. Bao, J.-Y. Liu, C.-W. Ma, Z.-H. Meng, Z.-T. Tang,\nY.-L. Sun, H.-F. Zhai, H. Jiang, H. Bai, C.-M. Feng,\net al., Superconductivity in quasi-one-dimensional k 2 cr\n3 as 3 with significant electron correlations, Physical Re-\nview X 5, 011013 (2015).\n[28] Z.-T. Tang, J.-K. Bao, Y. Liu, Y.-L. Sun, A. Ablimit,\nH.-F. Zhai, H. Jiang, C.-M. Feng, Z.-A. Xu, and G.-\nH. Cao, Unconventional superconductivity in quasi-one-\ndimensional rb 2 cr 3 as 3, Physical Review B 91, 020506\n(2015).\n[29] Z.-T. Tang, J.-K. Bao, Z. Wang, H. Bai, H. Jiang, Y. Liu,\nH.-F. Zhai, C.-M. Feng, Z.-A. Xu, and G.-H. Cao, Super-\nconductivity in quasi-one-dimensional cs 2 cr 3 as 3 with\nlarge interchain distance, Science China Materials 58, 16\n(2015).\n[30] K. Gotlieb, C.-Y. Lin, M. Serbyn, W. Zhang, C. L. Small-\nwood, C. Jozwiak, H. Eisaki, Z. Hussain, A. Vishwanath,\nand A. Lanzara, Revealing hidden spin-momentum lock-\ning in a high-temperature cuprate superconductor, Sci-\nence 362, 1271 (2018).\n[31] S. Borisenko, D. Evtushinsky, Z.-H. Liu, I. Morozov,\nR. Kappenberger, S. Wurmehl, B. B\u00a8uchner, A. Yaresko,\nT. Kim, M. Hoesch, et al., Direct observation of spin\u2013\norbit coupling in iron-based superconductors, Nature\nPhysics 12, 311 (2016).\n[32] A. Greco and A. P. Schnyder, Mechanism for uncon-\nventional superconductivity in the hole-doped rashba-\nhubbard model, Physical Review Letters 120, 177002\n(2018).\n[33] K. Nogaki and Y. Yanase, Strongly parity-mixed super-\nconductivity in the rashba-hubbard model, Physical Re-\nview B 102, 165114 (2020).\n[34] A. Greco, M. Bejas, and A. P. Schnyder, Ferromagnetic\nfluctuations in the rashba-hubbard model, Physical Re-\nview B 101, 174420 (2020).\n[35] P. M. Bonetti, D. Chakraborty, X. Wu, and A. P.\nSchnyder, Interaction-driven first-order and higher-order\ntopological superconductivity, Physical Review B 109,\nL180509 (2024).\n[36] X. Huang, Y. Xiao, R. Song, and N. Hao, Generic model\nwith unconventional rashba bands and giant spin galvanic\neffect, Physical Review B 109, 195419 (2024).\n[37] R. Wang, J. Li, X. Huang, L. Wang, R. Song, and\nN. Hao, Superconductivity in two-dimensional systems\nwith unconventional rashba bands, Physical Review B\n110, 134517 (2024).\n[38] R. Wang, S.-B. Zhang, and N. Hao, Finite-momentum\npairing state in unconventional rashba systems, Physical\nReview B 111, L100506 (2025).\n[39] Here, we use anisotropic intra-orbital rsoc \u03c4 3 term to\nreplace the inter-orbital rsoc \u03c4 1 term used in refs. [36\u2013\n38]. note that these two cases are equivalent and can\ntransform in to each other through an unitary operator\n(\u03c4 0 + i\u03c4 2)/\n\u221a\n2. here, for convenience to define the model\non square lattice, we adopt the former case, .\n[40] N. Berk and J. Schrieffer, Effect of ferromagnetic spin\ncorrelations on superconductivity, Physical Review Let-\nters 17, 433 (1966).\n[41] V. Emery, Theories of liquid helium three, Annals of\nPhysics 28, 1 (1964).\n[42] D. J. 
Scalapino, A common thread: The pairing inter-\naction for unconventional superconductors, Reviews of\nModern Physics 84, 1383 (2012).\n[43] Supplementary materials including the detials of rpa, the\neffect of interorbital interaction, and gap functions, .\n[44] L. P. Gor\u2019kov and E. I. Rashba, Superconducting 2d sys-\ntem with lifted spin degeneracy:\nmixed singlet-triplet\nstate, Physical Review Letters 87, 037004 (2001).\n7\n[45] X.-L. Qi, T. L. Hughes, and S.-C. Zhang, Topological in-\nvariants for the fermi surface of a time-reversal-invariant\nsuperconductor, Physical Review B\u2014Condensed Matter\nand Materials Physics 81, 134508 (2010).\n[46] R. Yu, X. L. Qi, A. Bernevig, Z. Fang, and X. Dai, Equiv-\nalent expression of z 2 topological invariant for band in-\nsulators using the non-abelian berry connection, Physical\nReview B\u2014Condensed Matter and Materials Physics 84,\n075119 (2011).\n[47] N. Hao, P. Zhang, Z. Wang, W. Zhang, and Y. Wang,\nTopological edge states and quantum hall effect in the\nhaldane model, Physical Review B\u2014Condensed Matter\nand Materials Physics 78, 075438 (2008).\n[48] N. Hao, P. Zhang, and Y. Wang, Topological phases\nand fractional excitations of the exciton condensate\nin a special class of bilayer systems, Physical Review\nB\u2014Condensed Matter and Materials Physics 84, 155447\n(2011).\n[49] M. Sato, Y. Tanaka, K. Yada, and T. Yokoyama, Topol-\nogy of andreev bound states with flat dispersion, Physical\nReview B\u2014Condensed Matter and Materials Physics 83,\n224511 (2011).\n[50] J. G. Bednorz and K. A. M\u00a8uller, Possible high t c super-\nconductivity in the ba- la- cu- o system, Zeitschrift f\u00a8ur\nPhysik B Condensed Matter 64, 189 (1986).\n[51] Y. Kamihara, T. Watanabe, M. Hirano, and H. Hosono,\nIron-based layered superconductor la [o1-x f x] feas (x=\n0.05- 0.12) with t c= 26 k, Journal of the American\nChemical Society 130, 3296 (2008).\n[52] H. Sun, M. Huo, X. Hu, J. Li, Z. Liu, Y. Han, L. Tang,\nZ. Mao, P. Yang, B. Wang, et al., Signatures of super-\nconductivity near 80 k in a nickelate under high pressure,\nNature 621, 493 (2023)." - }, - { - "domain": "Materials Science", - "chunk_type": "general", - "text": "1 \n \nAC Current-Driven Magnetization Switching and Nonlinear Hall \nRectification in a Magnetic Topological Insulator \n \nYuto Kiyonaga1\u2020, Masataka Mogi1\u2020*, Ryutaro Yoshimi2,3, Yukako Fujishiro2,4, Yuri Suzuki1, \nMax T. Birch2, Atsushi Tsukazaki1, Minoru Kawamura2, Masashi Kawasaki1,2 & Yoshinori \nTokura1,2,5 \n \n1Department of Applied Physics and Quantum-Phase Electronics Center (QPEC), University \nof Tokyo, Bunkyo-ku, Tokyo, Japan \n2RIKEN Center for Emergent Matter Science (CEMS), Wako, Saitama, Japan \n3Department of Advanced Materials Science, University of Tokyo, Kashiwa, Chiba, Japan \n4RIKEN Cluster for Pioneering Research (CPR), Wako, Saitama, Japan \n5Tokyo College, University of Tokyo, Bunkyo-ku, Tokyo, Japan \n \n\u2020These authors contributed equally: Yuto Kiyonaga, Masataka Mogi. \n*e-mail: mogi@ap.t.u-tokyo.ac.jp \n \n \n \n \n2 \n \nAbstract: Spin-orbit torque arising from the spin-orbit-coupled surface states of topological \ninsulators enables current-induced control of magnetization with high efficiency. Here, \nalternating-current (AC) driven magnetization reversal is demonstrated in a semi-magnetic \ntopological insulator (Cr,Bi,Sb)2Te3/(Bi,Sb)2Te3, facilitated by a low threshold current density \nof 1.5 \u00d7 10! Am\"#. 
Time-domain Hall voltage measurements using an oscilloscope reveal a \nstrongly nonlinear and nonreciprocal Hall response during the magnetization reversal process. \nFourier analysis of the time-varying Hall voltage identifies higher-harmonic signals and a \nrectified direct-current (DC) component, highlighting the complex interplay among the applied \ncurrent, external magnetic field, and magnetization dynamics. Furthermore, a hysteretic \nbehavior in the current-voltage characteristics gives rise to frequency mixing under dual-\nfrequency excitation. This effect, distinct from conventional polynomial-based nonlinearities, \nallows for selective extraction of specific frequency components. The results demonstrate that \nAC excitation can not only switch magnetization efficiently but also induce tunable nonlinear \nresponses, offering a new pathway for multifunctional spintronic devices with potential \napplications in energy-efficient memory, signal processing, and frequency conversion. \n \n \n \n \n3 \n \n1. Introduction \nThe interplay of electron spin and charge transport underlies a wide range of spintronic \nphenomena and applications, including current-induced magnetization reversal in ferromagnets \nvia spin-transfer and spin-orbit torques.[1,2,3] Beyond acting as the driving force for \nmagnetization dynamics, the dynamics of localized spins significantly influence the charge \ntransport properties, leading to nonreciprocal conduction,[4,5] spin-motive forces,[6,7] and \nemergent electric fields.[8,9,10] These effects open possibilities for advanced functionalities, such \nas diode operation, DC rectification, frequency conversion, and novel inductive elements.[10,11] \n \nNonlinear Hall effects arising from such spin-charge coupling also offer a means to detect \ncurrent-driven magnetization dynamics.[12,13] In particular, second harmonic Hall voltages \ngenerated by continuous AC current excitation are commonly employed to probe magnetization \noscillations.[13] However, driving magnetization dynamics typically requires large current \ndensities on order of 10$% to 10$$ Am\"#, often applied as short pulses to mitigate Joule \nheating,[1] posing multiple challenges originating from extrinsic thermal effects, including \nparasitic Nernst effects,[14] spin-Seebeck effects,[15] and enhanced magnon scattering.[4,5] These \nparasitic effects often mask the intrinsic nonlinear signals that stem directly from magnetization \ndynamics. To date, a nonlinear Hall effect directly associated with magnetization reversal \ndriven by spin-transfer or spin-orbit torques remains largely unexplored. \n \nTime-domain measurements of nonlinear Hall effect provide an effective method for \ninvestigating magnetization dynamics. Such measurements have been employed to study \nultrafast magnetization switching,[16,17] magnetic domain wall motion induced by pulse \ncurrent,[18] and inertial skyrmion dynamics under AC currents.[19] In our study, by \nsimultaneously monitoring the longitudinal resistance as a thermometer, thermal effects can be \nevaluated, helping to distinguish intrinsic nonlinear signals from parasitic heating effects. \nMoreover, real-time current-Hall voltage characteristics allow direct analysis of nonlinear \nbehavior and phase, making time-domain measurements well suited for exploring nonlinear \nHall effects in AC-driven magnetization dynamics. 
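As a side note, the first step of such a time-domain analysis is simply forming the real-time ratio R_yx(t) = V_y(t)/I_x(t) while masking the current zero crossings, where the ratio is an indeterminate 0/0. A minimal sketch with placeholder array names and an arbitrary masking threshold (not the authors' analysis code):

```python
import numpy as np

def r_yx_trace(i_x, v_y, mask_fraction=0.05):
    """Real-time Hall resistance R_yx(t) = V_y(t) / I_x(t) from sampled traces.

    Points where |I_x| falls below mask_fraction of its peak are set to NaN,
    since the ratio is indeterminate (0/0) near the zero crossings of the
    drive current.  Placeholder interface; the threshold is an assumption.
    """
    i_x = np.asarray(i_x, dtype=float)
    v_y = np.asarray(v_y, dtype=float)
    keep = np.abs(i_x) > mask_fraction * np.max(np.abs(i_x))
    return np.divide(v_y, i_x, out=np.full_like(v_y, np.nan), where=keep)
```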
\n \nTopological insulators (TIs) serve as a unique platform to explore these effects, as they host \ninsulating bulk states and conducting surface states with strong spin-orbit coupling, where \nelectron spin is locked perpendicularly to its momentum.[20] This property facilitates efficient \nspin-charge conversion via spin-orbit torques (SOTs).[21] Indeed, in heterostructures composed \n \n \n4 \n \nof TIs and various ferromagnets, current-driven magnetization switching has been \ndemonstrated at low current densities on the order of 10! to 10$% Am\"#.[13,22,23] Furthermore, \nwhen magnetism is introduced into the topological surface state, a magnetization gap is induced, \nleading to a large anomalous Hall conductivity due to Berry curvature.[24,25] Nonetheless, as in \nconventional ferromagnets, nonlinear transport arising from magnetization dynamics can still \nbe masked by substantial magnon scattering under low current excitation.[4,5] \n \nIn this study, we demonstrate AC current-induced magnetization dynamics and its associated \nnonlinear Hall effect in a semi-magnetic TI (Cr,Bi,Sb)\u2082Te\u2083/(Bi,Sb)\u2082Te\u2083 heterostructure thin film \n(see Methods for details). We directly observe magnetization reversal in response to AC current \ndrive by measuring the Hall voltage in time domain using an oscilloscope, The measured Hall \nvoltage exhibits strong current nonlinearity and higher-harmonic signals, as well as a rectified \nDC offset. A systematic analysis of the current-Hall voltage characteristics, combined with \nFourier transforms, reveals notable nonlinear and hysteretic responses in the Hall voltage, \nshedding light on the interplay between current, external magnetic field, and magnetization \ndynamics. We also identify an asymmetric frequency-mixing phenomenon originating from the \nmagnetization switching hysteresis, which contrasts with conventional polynomial-type \nnonlinearities. \n \n2. AC current-induced magnetization switching \nFor current-induced perpendicular magnetization switching in TIs, the damping-like SOT \n\ud835\udf0e\u20d7\u00d7 (\ud835\udf0e\u20d7\u00d7 \ud835\udc40--\u20d7) plays a central role,[3] where \ud835\udf0e is the spin polarization of conduction electron and \n\ud835\udc40 is the localized magnetization. In TIs, the spins are polarized perpendicular to the current \nwhen the chemical potential lies in the surface state.[24] Therefore, the magnetization with \nperpendicular magnetic anisotropy is tilted toward the current direction[5,13,23] (Figure 1a). The \nresulting magnetization can be detected by the anomalous Hall effect (AHE) arising from the \nmagnetization gap in the TI surface state (Figure 1b). \n \nWe first confirm current-induced magnetization switching by applying high-current pulses (up \nto \ud835\udc3c&'()* = \u00b1500 \u00b5A, or current density of \u00b15 \u00d7 10! Am\"#), and under an in-plane magnetic \nfield \ud835\udc35+ = 0.01 T, followed by measuring the Hall voltage at a low sensing current (\ud835\udc3c+ = 1 \u00b5A, \nor current density of 1 \u00d7 10, Am\"#) (Figure 1c). The Hall voltage \ud835\udc49- is given by \ud835\udc49- = \ud835\udc45-+\ud835\udc3c+ \u221d\n\ud835\udc40.\ud835\udc3c+, where \ud835\udc40. is the out-of-plane component contributing to the AHE. Hence, the sign of the \nHall resistance \ud835\udc45-+ directly reflects the direction of M. 
By varying the amplitude of the pulse, \n \n \n5 \n \nwe observe a clear sign reversal of \ud835\udc45-+ at a threshold current of about 150 \u00b5A (current density \nof 1.5 \u00d7 10! Am\"#) (Figure 1e). The switching polarity reverses if we flip either the current \ndirection or the in-plane field direction, consistent with SOT-driven perpendicular-\nmagnetization switching. \n \nNext, instead of applying current pulses, we record the Hall voltage in real time with an \noscilloscope (see Methods), while applying an AC current excitation \ud835\udc3c+ = \ud835\udc3c&*/0sin(2p\ud835\udc53\ud835\udc61) with \na peak current amplitude of \ud835\udc3c&*/0 = 300 \u00b5A at a frequency of \ud835\udc53= 101 Hz (Figure 1d). We \nevaluate \ud835\udc45-+ from the \ud835\udc45-+ = \ud835\udc49-/\ud835\udc3c+ and plot as a function of \ud835\udc3c+ in Figure 1f. When 0 \u00b5A < \ud835\udc3c+ <\n300 \u00b5A, the Hall resistance \ud835\udc45-+ gradually decreases and switches polarity above a threshold \ncurrent (defined where it crosses \ud835\udc45-+ = 0) of about 150 \u00b5A. Once the polarity is switched, it \nremains until it is switched again at around \ud835\udc3c+ = \u2212150 \u00b5A. This is consistent with the case of \npulse-current-induced magnetization reversal discussed above. Simultaneous monitoring of the \nlongitudinal resistance \ud835\udc45++ indicates that the temperature is kept below 30 K during the \nmeasurement, which is lower than \ud835\udc471 ~ 40 K and there is no temperature hysteresis (see \nSupplementary Note 3), confirming that heating is not a primary origin of the observed \nmagnetization reversal. \n \n3. Nonlinear Hall voltage and hysteretic behavior \nFigure 2a shows the time-dependent evolution of the Hall voltage \ud835\udc49- (t). Since the Hall \nresistance \ud835\udc45-+ becomes negative for \ud835\udc3c+ > 0 and positive for \ud835\udc3c+ < 0, as shown in Figure 2b, we \nsee that \ud835\udc49- takes predominantly negative values, reflecting the reversed magnetization state \nwhen \ud835\udc3c+ is above the threshold. Meanwhile, small positive spikes (black triangles in Figure 2a) \nappear, indicating the short period when the magnetization remains unreversed below the \nthreshold current. To clarify the role of the current amplitude, we compare waveforms of the \nHall voltage for several peak current values (Figure 2c). At a low peak current value (\ud835\udc3c&*/0 =\n14 \u00b5A) which is far below the threshold current, Hall voltage shows a nearly sinusoidal time \ndependence with alternating signs, originating from the out-of-plane component of \nmagnetization under \ud835\udc35+~0.01 T. With increasing \ud835\udc3c&*/0 (from 71 to 300 \u00b5A), the waveforms \nbecome increasingly distorted, revealing pronounced nonlinearity in the Hall response. Figure \n2d further highlights this nonlinearity by plotting the \ud835\udc3c+ \u2212\ud835\udc49- relationship for each \ud835\udc3c&*/0. At \n\ud835\udc3c&*/0 = 14 \u00b5A, a nearly linear relationship appears, while at higher currents (e.g., 300 \u00b5A), the \ncurves develop a butterfly-shaped hysteresis: the Hall voltages differ for increasing vs \n \n \n6 \n \ndecreasing current. Such a hysteresis results from the magnetization dynamics and phase delay \nwhen the applied AC current exceeds the threshold of magnetization switching, representing \ncontinuous magnetization reversal. 
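The threshold-plus-memory behavior described in this section can be reproduced with a very small toy model. The sketch below is not the authors' simulation code; the 2 kΩ Hall-resistance scale, the 150 µA threshold, and the 300 µA, 101 Hz drive are taken from values reported in this paper, while the two-state magnetization is an assumption:

```python
import numpy as np

# Minimal toy model of the hysteretic Hall response described in the text.
R0, I_TH = 2e3, 150e-6          # Hall resistance scale (Ohm), switching threshold (A)
I_PEAK, FREQ = 300e-6, 101.0    # drive amplitude (A) and frequency (Hz)

t = np.linspace(0.0, 2.0 / FREQ, 4000)       # two drive periods
i_x = I_PEAK * np.sin(2 * np.pi * FREQ * t)  # AC drive current

def hall_voltage(i_x, r0=R0, i_th=I_TH, m_z=1.0):
    """Return V_y(t) for a two-state magnetization that flips at +/- i_th.

    m_z = +1 (-1) gives R_yx = +r0 (-r0); a current above +i_th drives
    m_z -> -1 and a current below -i_th drives m_z -> +1, mimicking the
    switching polarity described for B_x = +0.01 T.
    """
    v_y = np.empty_like(i_x)
    for n, i in enumerate(i_x):
        if i > +i_th:
            m_z = -1.0
        elif i < -i_th:
            m_z = +1.0
        v_y[n] = r0 * m_z * i
    return v_y

v_y = hall_voltage(i_x)
# v_y is mostly negative, with small positive excursions while |I_x| < I_th
# (regions i and iii); plotting v_y against i_x gives a butterfly loop.
# Setting i_th = 0 removes the memory and leaves a rectified response with
# no hysteresis, qualitatively similar to the 2 T case discussed below.
```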
The nonlinear Hall response is also strongly affected by the in-plane magnetic field altering the magnetization orientation. To investigate this, we perform time-domain measurements of V_y at a larger field (B_x = 2 T). Figure 3a shows the contrasting Hall voltage waveforms at B_x = 0.01 T and 2 T (the former being the same data as in Figure 2a). Unlike at 0.01 T, at 2 T, which is strong enough to force the magnetization to lie in-plane, no positive spikes appear in regions i and iii of Figure 3a, and the signal remains consistently negative. Correspondingly, the I_x–V_y characteristic shows rectifying behavior without hysteresis (Figure 3b, bottom panel). A plausible interpretation is that, under the strong in-plane field, the magnetization never switches. Instead, the magnetization acquires a finite M_z component in the −z direction for I_x > 0 and in the +z direction for I_x < 0, resulting in a rectified Hall response.

We note that extrinsic thermoelectric effects[14,15,26] as well as asymmetric scattering from magnon emission/absorption[4,5] may contribute to the nonlinear Hall signals discussed above, complicating the quantitative separation of each effect. However, none of these effects can cause the hysteresis observed at B_x = 0.01 T. Therefore, the main contribution to the nonlinear signals originates from magnetization dynamics, particularly in the case of B_x = 0.01 T.

To systematically compare these nonlinear signals, we decompose the time-domain Hall voltage V_y(t) via Fourier transforms. In a simple power-series expansion of V_y in terms of the current I_x(t) = I_0 sin(ωt), V_y(t) can be expressed as:
  V_y(t) = V_y^(0) + V′_y^(1) sin(ωt) + V′_y^(2) cos(2ωt) + V′_y^(3) sin(3ωt) + V′_y^(4) cos(4ωt) + ⋯,
where V_y^(0) and V′_y^(n) (n = 1, 2, 3, 4, ⋯) are a constant and coefficients, respectively (see Supplementary Note 4 for details). Here, all the odd-harmonic components appear as sine functions, while all the even-harmonic components appear as cosine functions in this power series. More generally, a phase delay arising from hysteresis is described by adding odd-harmonic cosine functions and even-harmonic sine functions:
  V_y(t) = V_y^(0) + V′_y^(1) sin(ωt) + V″_y^(1) cos(ωt) + V′_y^(2) cos(2ωt) + V″_y^(2) sin(2ωt) + V′_y^(3) sin(3ωt) + V″_y^(3) cos(3ωt) + V′_y^(4) cos(4ωt) + V″_y^(4) sin(4ωt) + ⋯,
where V″_y^(n) (n = 1, 2, 3, 4, ⋯) are the coefficients of these phase-shifted components. Figure 3c shows the Fourier amplitudes at B_x = 0.01 T and 2 T. In this figure, gray, red, and blue bars represent the components V_y^(0), V′_y^(n), and V″_y^(n), respectively. 
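For concreteness, this decomposition can be carried out on a sampled trace by projecting onto sin(nωt) and cos(nωt) and regrouping by harmonic parity. The following is a sketch under the sign conventions just stated, with placeholder inputs (not the authors' analysis code):

```python
import numpy as np

def harmonic_amplitudes(t, v_y, freq, n_max=4):
    """Return V_y^(0) and the in-phase / out-of-phase harmonic amplitudes.

    Assumes t (in s) is uniformly sampled over an integer number of drive
    periods and v_y is the measured Hall-voltage trace.  Per the convention
    above, V' is the sine amplitude for odd n and the cosine amplitude for
    even n; V'' is the complementary, phase-shifted one.
    """
    t = np.asarray(t, dtype=float)
    v_y = np.asarray(v_y, dtype=float)
    w = 2.0 * np.pi * freq
    v0 = v_y.mean()                                       # rectified DC part V_y^(0)
    vp, vpp = {}, {}
    for n in range(1, n_max + 1):
        a_sin = 2.0 * np.mean(v_y * np.sin(n * w * t))    # coefficient of sin(n w t)
        a_cos = 2.0 * np.mean(v_y * np.cos(n * w * t))    # coefficient of cos(n w t)
        vp[n], vpp[n] = (a_sin, a_cos) if n % 2 else (a_cos, a_sin)
    return v0, vp, vpp
```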
Both cases exhibit strong \nsecond-harmonic \ud835\udc494\n-\n(#) appear, corresponding to the fact that \ud835\udc49- remains negative for both \ud835\udc3c+ >\n0 and \ud835\udc3c+ < 0. However, additional phase-shifted components represented as \ud835\udc4944\n-\n(7) appear only \nat \ud835\udc35+ = 0.01 T, reflecting the delayed, hysteretic response of magnetization reversal. In \ncontrast, at 2 T, the magnetization follows \ud835\udc3c+ more smoothly, eliminating large phase shifts. \n \n4. Asymmetric frequency mixing \nFinally, we demonstrate a frequency-mixing phenomenon[27] enabled by AC current-induced \nmagnetization switching. In general, a nonlinear system driven by two frequencies \ud835\udc53$ and \ud835\udc53# \ncan generate signals at their sum \ud835\udc53$ + \ud835\udc53# and difference |\ud835\udc53$ \u2212\ud835\udc53#|. If the nonlinearity is purely \npolynomial in current, the amplitudes of the \ud835\udc53$ + \ud835\udc53# and |\ud835\udc53$ \u2212\ud835\udc53#| components are expected to \nbe equal[27] (See Supplementary Note 5). However, in our semi-magnetic TI, the hysteresis of \nmagnetization reversal breaks this symmetry. In the experiment, we apply \ud835\udc3c+(\ud835\udc61) =\n\ud835\udc3c$sin (2\ud835\udf0b\ud835\udc53$\ud835\udc61) + \ud835\udc3c#sin (2\ud835\udf0b\ud835\udc53#\ud835\udc61) with \ud835\udc3c$ = \ud835\udc3c# = 150 \u00b5A, \ud835\udc53$ = 37 Hz, and \ud835\udc53# = 125 Hz. When \nonly a single frequency (\ud835\udc53$ = 37 Hz or \ud835\udc53# = 125 Hz) is applied (peak current 300 \u00b5A), the \nresponse is similar to our earlier single-frequency result (Figure 4a), confirming that the \nhysteretic behavior is governed by magnetization switching (Figure 1e,f), rather than the inertia \ndynamics of magnetic domains or capacitive/inductive components of the electrical circuits. \nWith both frequencies present (Figure 4b), we observe broadband wave mixing, ranging from \nDC component to 338 Hz in the Fourier spectrum of Vy (Figure 4c). Notably, at a low magnetic \nfield (0.01 T), the \ud835\udc53$ + \ud835\udc53# component is substantially larger than the |\ud835\udc53$ \u2212\ud835\udc53#| component. In \ncontrast, at a high field (2 T), both peaks exhibit similar amplitudes, indicating a more \nconventional polynomial-type nonlinearity. A numerical simulation (Figure 4d), assuming a \nsimplified \ud835\udc3c+ \u2212\ud835\udc49- characteristic with a well-defined threshold (insets of Figure 4d), reproduces \nthis asymmetry only when a finite threshold current \ud835\udc3c89 = 150 \u00b5A is considered (mimicking the \nlow-field case). If \ud835\udc3c89 = 0 \u00b5A (high-field case), the two sidebands remain equal. Hence, the \nhysteretic magnetization reversal leads to asymmetric frequency mixing. \n \n5. Conclusion \nOur findings demonstrate that AC current can induce continuous magnetization reversal in a \nsemi-magnetic TI heterostructure, at a remarkably low current density (1.5 \u00d7 10! Am\"#). We \n \n \n8 \n \nhave clarified the nonlinear Hall effect accompanying this process and shown that the hysteretic \nand phase-delayed Hall responses are governed by a threshold current for magnetization \nreversal. Furthermore, the strength of the magnetic field can control the presence or absence of \nthe hysteresis. 
The pronounced nonlinear Hall effect observed here holds promise for Hall \nrectification,[28,29] which has recently been studied for terahertz-to-DC conversion[30] or AC-to-\nDC conversion.[31] Additionally, when a current containing components of two different \nfrequencies is applied, we discover a distinctive frequency-mixing effect where the magnitudes \nof the sum-frequency and difference-frequency components differ due to the hysteresis for \nmagnetization reversal. Such frequency-mixing effects are commonly used for technologies \nsuch as photoacoustic imaging[32] and microwave generation[33]. The electrical asymmetric \nfrequency mixing effect induced by the nonlinearity of magnetization reversal process may thus \nbe leveraged in spintronic devices for selective extraction of desired frequency components. \nFurthermore, in the field of neural networks, elements that combine nonlinearity with short-\nterm memory are widely used for physical reservoir computing.[32-37] The nonlinear and \nhysteretic response discussed here also has the potential for nonlinear transformation of inputs \ninto higher-dimensional outputs, making our system promising as a reservoir. While all the \nresponses explored in this study are at 2.5 K, the use of ferromagnetic materials with high \ntransition temperatures in the magnetic layer exchange-coupled to the topological insulator \ncould lead to potential functionality even at room temperature. [40,41,42] Thus, our study paves \nthe way for harnessing hysteretic magnetization dynamics, with potential applications in spin-\norbit-based low-power switching elements, and advanced nonlinear electronics such as \nneuromorphic computing.[43] \n \n6. Experimental Section/Methods \nSample fabrication and electric transport measurement: We grew (Cr,Bi,Sb)2Te3/(Bi,Sb)2Te3 \nthin films on InP(111) substrates by molecular beam epitaxy under the base pressure on the \norder of 10\", Pa. The flux ratio \ud835\udc431:: \ud835\udc43;<: \ud835\udc43=>: \ud835\udc43?* = 1: 9: 16: 1000 was used to tune the Fermi \nenergy inside the bulk gap. The films are fabricated into 10 \u00b5m-wide Hall-bar devices. The \nstructures of them are illustrated in Supplementary Figure S1(a). In a typical sample, we \nmeasured the electrical transport properties using Physical Property Measurement System \n(PPMS), as shown in Supplementary Figure S1(b). The longitudinal and Hall resistance of a \ntypical sample is about 10 k\u03a9 and 2 k\u03a9, respectively. The Hall conductivity at the lowest \ntemperature \ud835\udc47= 2.5 K is about 0.3 \u00d7 \ud835\udc52#/\u210e , which indicates the Fermi level is fairly close to \n \n \n9 \n \nthe magnetization gap. From the temperature dependence of Hall and longitudinal resistivity, \nthe Curie temperature can be determined as \ud835\udc471 ~ 40 K. \n \nTime-domain measurement: We measured the Hall voltage as well as the longitudinal voltage \nin real time with an oscilloscope and a current source (Keithley 6221). Set up in a PPMS \nchamber is shown in Supplementary Figure S2. The oscilloscope monitors the Hall voltage \n\ud835\udc49-(\ud835\udc61) as \n\ud835\udc49-(\ud835\udc61) = \ud835\udc491A#(\ud835\udc61) \u2212\ud835\udc491A$(\ud835\udc61) \nand the voltage on the resistor \ud835\udc49B as \n\ud835\udc49B(\ud835\udc61) = \ud835\udc491A5(\ud835\udc61) \nwhere \ud835\udc491A$(\ud835\udc61), \ud835\udc491A#(\ud835\udc61), and \ud835\udc491A5(\ud835\udc61) are the measured voltages of the channels CH1, CH2, and \nCH3, respectively. 
We note that the current flowing through the circuit is obtained by \n\ud835\udc3c+(\ud835\udc61) =\nC!(D) [F]\n$%%% [H] =\nC\"#$(D) [F]\n$%%% [H] . \n \nAcknowledgements: We thank Tomoyuki Yokouchi and Lixuan Tai for stimulating \ndiscussions. This work was supported by JSPS KAKENHI Grant Nos. JP22H04958, \nJP23H05431 and JP24K16986, and JST PRESTO Grant No. JPMJPR23HA. \n \nAuthor contributions: M.M. and Y.T. conceived the study. Y.K. and M.M. grew the samples \nwith help of R.Y., K.S.T., A.T. and M. Kawas. Y.K. and M.M. performed measurements with \nhelp of Y.F., Y.S., M.T.B. and M.Kawam. All authors discussed the results. Y.K., M.M. and \nY.T. wrote the manuscript with inputs from all other authors. Y.T. supervised the project. \n \nReferences: \n[1] Liu, L. et al. Current-Induced Switching of Perpendicularly Magnetized Magnetic Layers \nUsing Spin Torque from the Spin Hall Effect. Phys. Rev. Lett. 109, 096602 (2012) \n \n[2] Ralph, D. C. & Stiles, M. D. Spin transfer torques. J. Magn. Magn. Mater. 320, 1190\u2013\n1216 (2008) \n \n[3] Manchon, A. et al. Current-induced spin-orbit torques in ferromagnetic and \nantiferromagnetic systems. Rev. Mod. Phys. 91, 035004 (2019) \n \n \n \n10 \n \n[4] Yasuda, K. et al. Large Unidirectional Magnetoresistance in a Magnetic Topological \nInsulator. Phys. Rev. Lett. 117, 127202 (2016) \n \n[5] Yasuda, K. et al. Current-Nonlinear Hall Effect and Spin-Orbit Torque Magnetization \nSwitching in a Magnetic Topological Insulator. Phys. Rev. Lett. 119, 137204 (2017) \n \n[6] Yang, S. A. et al. Universal Electromotive Force Induced by Domain Wall Motion. Phys. \nRev. Lett. 102, 067201 (2009) \n \n[7] Emori, S. et al. Current-driven dynamics of chiral ferromagnetic domain walls. Nat. \nMater. 12, 611\u2013616 (2013) \n \n[8] Nagaosa, N. & Tokura, Y. Emergent electromagnetism in solids. Phys. Scr. 2012, 014020 \n(2012) \n \n[9] Schulz, T. et al. Emergent electrodynamics of skyrmions in a chiral magnet. Nat. Phys. 8, \n301\u2013304 (2012) \n \n[10] Yokouchi, T. et al. Emergent electromagnetic induction in a helical-spin magnet. Nature \n586, 232\u2013236 (2020) \n \n[11] Yamane, Y., Fukami, S., & Ieda, J. Theory of Emergent Inductance with Spin-Orbit \nCoupling Effects. Phys. Rev. Lett. 128, 147201 (2022) \n \n[12] Sala, G. et al. Real-time Hall-effect detection of current-induced magnetization dynamics \nin ferrimagnets. Nat. Commun. 12, 656 (2021) \n \n[13] Fan, Y. et al. Magnetization switching through giant spin\u2013orbit torque in a magnetically \ndoped topological insulator heterostructure. Nat. Mater. 13, 699\u2013704 (2014) \n \n[14] Avci, C. O. et al. Interplay of spin-orbit torque and thermoelectric effects in \nferromagnet/normal-metal bilayers. Phys. Rev. B 90, 224427 (2014) \n \n[15] Uchida, K. et al. Observation of the spin Seebeck effect. Nature 455, 778\u2013781 (2008) \n \n \n11 \n \n \n[16] Baumgartner, M. et al. Spatially and time-resolved magnetization dynamics driven by \nspin-orbit torques. Nat. Nanotechnol. 12, 980\u2013986 (2017) \n \n[17] Grimaldi, E. et al. Single-shot dynamics of spin\u2013orbit torque and spin transfer torque \nswitching in three-terminal magnetic tunnel junctions. Nat. Nanotechnol. 15, 111\u2013117 (2020) \n \n[18] Yoshimura, Y. et al. Soliton-like magnetic domain wall motion induced by the interfacial \nDzyaloshinskii\u2013Moriya interaction. Nat. Phys. 12, 157\u2013161 (2016) \n \n[19] Birch, M. T. et al. Dynamic transition and Galilean relativity of current-driven \nskyrmions. 
Nature 633, 554\u2013559 (2024) \n \n[20] Hasan, M. Z. & Kane, C. L. Colloquium: Topological insulators. Rev. Mod. Phys. 82, \n3045 (2010) \n \n[21] Kondou, K. et al. Fermi-level-dependent charge-to-spin current conversion by Dirac \nsurface states of topological insulators. Nat. Phys. 12, 1027\u20131031 (2016) \n \n[22] Mellnik, A. R. et al. Spin-transfer torque generated by a topological insulator. Nature \n511, 449\u2013451 (2014) \n \n[23] Mogi, M. et al. Current-induced switching of proximity-induced ferromagnetic surface \nstates in a topological insulator. Nat. Commun. 12, 1404 (2021) \n \n[24] Tokura, Y., Yasuda, K., & Tsukazaki, A. Magnetic topological insulators. Nat. Rev. \nPhys. 1, 126-143 (2019) \n \n[25] Yu, R. et al. Quantized Anomalous Hall Effect in Magnetic Topological Insulators. \nScience 329, 5987 (2010) \n \n[26] Tai, L. et al. Giant Hall Switching by Surface-State-Mediated Spin-Orbit Torque in a \nHard Ferromagnetic Topological Insulator. Adv. Mater. 36, 2406772 (2024) \n \n \n \n12 \n \n[27] Min, L. et al. Colossal room-temperature non-reciprocal Hall effect. Nat. Mater. 23, \n1671\u20131677 (2024) \n \n[28] Isobe, H., Xu, S.-Y., & Fu, L. High-frequency rectification via chiral Bloch electrons. \nSci. Adv. 6, 13 (2020) \n \n[29] He, P. et al. Quantum frequency doubling in the topological insulator Bi2Se3. Nat. \nCommun. 12, 698 (2021) \n \n[30] Zhang, Y. & Fu, L. Terahertz detection based on nonlinear Hall effect without magnetic \nfield. Proc. Natl. Acad. Sci. USA 118, e2100736118 (2021) \n \n[31] Onishi, Y. & Fu, L. High-efficiency energy harvesting based on a nonlinear Hall \nrectifier. Phys. Rev. B 110, 075122 (2024) \n \n[32] Gusev, V. & Chigarev, N. Nonlinear frequency-mixing photoacoustic imaging of a \ncrack: Theory. J. Appl. Phys. 107, 124905 (2010) \n \n[33] Rice, A. et al. Terahertz optical rectification from <110> zinc-blende crystals. Appl. \nPhys. Lett. 64, 1324\u20131326 (1994) \n \n[34] Torrejon, J. et al. Neuromorphic computing with nanoscale spintronic \noscillators. Nature 547, 428\u2013431 (2017). \n \n[35] Moon, J. et al. Temporal data classification and forecasting using a memristor-based \nreservoir computing system. Nat. Electro. 2, 480\u2013487 (2019) \n \n[36] Zhong, Y. et al. Dynamic memristor-based reservoir computing for high-efficiency \ntemporal signal processing. Nat. Commun. 12, 408 (2021) \n \n[37] Zhong, Y. et al. A memristor-based analogue reservoir computing system for real-time \nand power-efficient signal processing. Nat. Electro. 5, 672\u2013681 (2022) \n \n \n \n13 \n \n[38] Yokouchi, T. et al. Pattern recognition with neuromorphic computing using magnetic \nfield\u2013induced dynamics of skyrmions. Sci. Adv. 8, 39 (2022) \n \n[39] Liang, X. et al. Physical reservoir computing with emerging electronics. Nat. Electro. 7, \n193\u2013206 (2024) \n \n[40] Wang, Y. et al. Room temperature magnetization switching in topological insulator-\nferromagnet heterostructures by spin-orbit torques. Nat. Commun. 8, 1364 (2017) \n \n[41] Wang, H. et al. Room temperature energy-efficient spin-orbit torque switching in two-\ndimensional van der Waals Fe3GeTe2 induced by topological insulators. Nat. Commun. 14, \n5173 (2023) \n \n[42] Choi, G. S. et al. Highly Efficient Room-Temperature Spin-Orbit-Torque Switching in a \nVan der Waals Heterostructure of Topological Insulator and Ferromagnet. Adv. Sci. 11, \n2400893 (2024) \n \n[43] Liu, Y. et al. Cryogenic in-memory computing using magnetic topological insulators, \nNat. Mater. 
DOI:10.1038/s41563-024-02088-4 (2025). \n \n \n \n \n14 \n \n \nFigure 1: Magnetization switching by pulse current and AC current. a, Schematic \nillustration of current-induced magnetization switching. The purple and light blue arrows \nrepresent in-plane magnetic field \ud835\udc35+ and the effective magnetic field of damping-like spin-orbit \ntorque \ud835\udc35IJ\"=K?, respectively. \ud835\udf0e denotes the spin-polarization of conduction electron. b, Spin-\npolarization of gapped Dirac surface states (blue arrows). c,d, Input waveform of pulse and \nprobe current measurement c and alternating current measurement d. e,f, The change in Hall \nresistance during applying current for pulse and probe e and for AC f. For the pulse case e, the \nHall resistance change is measured by the probe current of 1 \u00b5A under the magnetic field \ud835\udc35+ of \n0.01 T (blue) and \u22120.01 T (red) applied parallel to the current direction. The transverse axis \nshows the pulse current value which varies from \u2212500 \u00b5A to 500 \u00b5A and then from 500 \u00b5A to \n\u2212500 \u00b5A. For the AC case f, the Hall resistance is measured by current \ud835\udc3c+(\ud835\udc61) = \ud835\udc3c&*/0 sin(2\ud835\udf0b\ud835\udc53\ud835\udc61) \n(\ud835\udc3c&*/0 = 300 \u00b5A, \ud835\udc53= 101Hz) at each time \ud835\udc61 under Bx = 0.01 T. The black shaded area \nindicates that the measured value of \ud835\udc45-+ diverges near \ud835\udc3c+ = 0 because it results in an \nindeterminate form (0/0). The shown data are antisymmetrized with respect to the magnetic \nfield. The vertical dotted lines represent the threshold current where the Hall resistance changes \nits sign, and the blue arrows represent the direction of the flow of time. \n \n \n15 \n \n \nFigure 2: Time domain measurements of magnetization switching. a, Waveforms of input \nAC current (red, \ud835\udc53= 101 Hz) and output Hall voltage (green) measured under in-plane \nmagnetic field \ud835\udc35+ = 0.01 T and temperature \ud835\udc47= 2.5 K. Black triangles point to positive spikes \nof Hall voltage. b, Change in Hall resistance \ud835\udc45-+ (blue) calculated from current and Hall \nvoltage shown in a. Black shaded areas indicate that the value of \ud835\udc45-+ diverges around the time \nperiod around \ud835\udc3c+ = 0 because it results in an indeterminate form (0/0). c, Waveforms of the \nHall voltage (green) for AC current with amplitudes of 14 \u00b5A, 71 \u00b5A, 141 \u00b5A, and 300 \u00b5A. \nThe sinusoidal dotted curve (black) is a guide to the eye which indicates the phase of the AC \ncurrent. d, Hall voltage vs current in the case of applying AC current whose amplitudes are \n14 \u00b5A, 71 \u00b5A, 141 \u00b5A, and 300 \u00b5A. The red arrows represent the direction of the flow of time. \n \n \n \n \n16 \n \n \nFigure 3: Distinct nonlinear Hall responses under in-plane magnetic fields. a, Waveform \nof Hall voltage (green) and current (red) under magnetic field \ud835\udc35+ = 0.01 T (top) and \ud835\udc35+ =\n 2 T (bottom) and temperature \ud835\udc47= 2.5 K. The left illustrations indicate the magnetic field \n(purple arrows) and the magnetization when applying no current to the sample (dark blue \narrows) and the magnetization oscillation (light blue arrows) under the AC current excitation. 
\nThe vertical dotted lines divide one cycle into 4 regions (i: \ud835\udc51\ud835\udc3c+/\ud835\udc51\ud835\udc61> 0, \ud835\udc3c+ > 0, ii: \ud835\udc51\ud835\udc3c+/\ud835\udc51\ud835\udc61< 0, \n\ud835\udc3c+ > 0, iii: \ud835\udc51\ud835\udc3c+/\ud835\udc51\ud835\udc61< 0, \ud835\udc3c+ < 0, iv: \ud835\udc51\ud835\udc3c+/\ud835\udc51\ud835\udc61> 0, \ud835\udc3c+ < 0). b, Hall voltage vs current under \nmagnetic field Bx = 0.01 T (top) and Bx = 2 T (bottom). The symbols i, ii, iii, and iv on the upper \ngraph corresponds to the ones in a. c, Fourier transformation of the Hall voltage under magnetic \nfield \ud835\udc35+ = 0.01 T (top) and \ud835\udc35+ = 2 T (bottom). The blue, red, and gray bars denote the in-\nphase components \ud835\udc49\u2032-\n(7), the out-of-phase components \ud835\udc49\u2032\u2032-\n(7), and the constant component \ud835\udc49-\n(%), \nrespectively. \n \n \n \n \n17 \n \n \nFigure 4: Frequency-mixing accompanied by magnetization dynamics. a, Hall voltage vs \ncurrent for the AC current with the frequencies, \ud835\udc53$ = 37 Hz (red) and \ud835\udc53# = 125 Hz (blue), and \nthe amplitude of 300 \u00b5\ud835\udc34 under magnetic field \ud835\udc35+ = 0.01 T and temperature \ud835\udc47= 2.5 K. b, \nThe waveform of the input current including 2 frequency components \ud835\udc3c+(\ud835\udc61) =\n150 sin(2\ud835\udf0b\ud835\udc53$\ud835\udc61) + 150 sin(2\ud835\udf0b\ud835\udc53#\ud835\udc61) [\u00b5A] (red) and the Hall response \ud835\udc49- (green) under magnetic \nfield \ud835\udc35+ = 0.01 T and temperature \ud835\udc47= 2.5 K. The schematic on the right side shows the \nexperimental set-up. c, Absolute value of Fourier component c\ud835\udc63-(\ud835\udc53)c from the measured \nresponses under magnetic field \ud835\udc35+ = 0.01 T (top) and \ud835\udc35+ = 2 T (bottom). The inset shows \nHall voltage vs current in each case. In the all panels of c and d, the peaks of the sum-frequency \n\ud835\udc53$ + \ud835\udc53# and the difference-frequency |\ud835\udc53$ \u2212\ud835\udc53#| is explicitly indicated. In the top panel for the \nexperiment under 0.01 T, the peaks of other linear combinations (gray) are also indicated. d, \nAbsolute value of Fourier component c\ud835\udc63-(\ud835\udc53)c, which is defined in the same way as c, from the \nsimulated responses for \ud835\udc3c89 = 150 \u00b5A (top) and \ud835\udc3c89 = 0 \u00b5A (bottom), mimicking the \nexperimental cases for \ud835\udc35+ = 0.01 T and \ud835\udc35+ = 2 T, respectively. All the data in c and d are \nplotted as a function of \ud835\udc53 with 1 Hz intervals. \n \n \n \n \n18 \n \nSupplementary Note 1: Analysis of the raw data in the time-domain measurements \nUsing an oscilloscope, we measured Hall voltage in response to current in the time domain. To \nexclude the background from longitudinal voltage due to electrode misalignment, we subtracted \nthe measured data under the magnetic field in the \u2212\ud835\udc65 direction from the one under that in the \n+\ud835\udc65 direction (anti-symmetrization). Here we emphasize that when measuring data under the \nopposite field, the way of initialization is also inverted. For example, we measured \ud835\udc49-(\ud835\udc61) under \n\ud835\udc35+ = +0.01 T after initializing magnetization by setting to \ud835\udc35+ = +2 T, and then measured \n\ud835\udc49-(\ud835\udc61) under \ud835\udc35+ = \u22120.01 T after setting to \ud835\udc35+ = \u22122 T. 
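In code form, the subtraction described here is a one-line operation; a sketch follows (the factor 1/2 is a conventional normalization and an assumption, since the note itself only specifies the difference):

```python
import numpy as np

def antisymmetrize(v_y_plus, v_y_minus):
    """Field-anti-symmetrized Hall voltage from two traces taken with the
    in-plane field (and the corresponding initialization) along +x and -x.
    The field-symmetric background from electrode misalignment cancels in
    the difference."""
    return 0.5 * (np.asarray(v_y_plus, dtype=float)
                  - np.asarray(v_y_minus, dtype=float))
```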
When both the initializing field and the \nassisting field are inverted, the Hall voltage signal originating from the initial magnetization \nand magnetization dynamics is inverted, while the background from the longitudinal resistance \nremains unchanged. Therefore, this anti-symmetrization method is effective. \n \nHere, we show the data before and after anti-symmetrization for the cases of pulse (Fig. S3) \nand AC (Fig. S4). In the case of pulses, the probe current is small (10 \u00b5A), which suppresses \ntemperature rise, allowing behaviors such as the sign change in the Hall resistance associated \nwith magnetization reversal to be observed even before anti-symmetrization. On the other hand, \nin the case of AC, the background due to longitudinal resistance is significant relative to the \nHall voltage signal, the anti-symmetrization procedure is necessary for the magnetization \nreversal behavior to be observed. In both cases, however, the behavior is the same: when the \ncurrent flows in the +\ud835\udc65 direction, the magnetization points in the \u2212\ud835\udc67 direction, and when the \ncurrent flows in the \u2212\ud835\udc65 direction, the magnetization points in the +\ud835\udc67 direction.\n \n \n19 \n \n \nSupplementary Note 2: Frequency dependence of the magnetization reversal \nWe mainly use \ud835\udc3c+(\ud835\udc61) = \ud835\udc3c% sin(2\ud835\udf0b\ud835\udc53\ud835\udc61) with \ud835\udc3c% = 300 \u00b5A and \ud835\udc53=10 Hz as the input current in \nthe main text. Here we show the response waveforms \ud835\udc49-(\ud835\udc61) and \ud835\udc3c+ \u2212\ud835\udc49- curves and \ud835\udc3c+ \u2212\ud835\udc45-+ \ncurves for various frequencies \ud835\udc53= 11, 101, 1001, 10001 Hz in Fig. S5. Overall, the qualitative \nbehavior is basically independent of frequency for lower frequencies \ud835\udc53= 11, 101, 1001 Hz. At \nthe highest frequency \ud835\udc53= 10001 Hz, the current is attenuated by the parasitic capacitance \nparallel to the sample, which causes a reduction in the current amplitude and a trivial phase \nchange between current and Hall voltage. Therefore, at such a high frequency, it is not possible \nto correctly measure the Hall voltage as the response to the current in the time domain. \n \nThe frequency dependence of the responses presented here is attributed to the parasitic \ncapacitance because the Hall voltage when the current is \ud835\udc3c+ = 0 was not \ud835\udc49- = 0 at the higher \nfrequency. However, mechanisms other than parasitic capacitance through which the response \ndepends on frequency, such as the inertia of the magnetization dynamics to the current, remain \nelusive. To explore such intrinsic frequency dependence, improvements in the equipment are \nnecessary to enable high-frequency measurements without the influence of parasitic \ncapacitance. \n \n \n \n \n20 \n \nSupplementary Note 3: Estimation of temperature increase caused by Joule heating \nWe attribute the hysteretic behavior to the magnetization reversal caused by spin-orbit torque. \nHowever, there would be still another explanation that the temperature would show a hysteretic \nchange because of Joule heating and accordingly the Hall response would reflect it. To exclude \nthis possibility, we estimate the temperature variation during applying current using \nlongitudinal resistivity as a thermometer. We first measured the temperature dependence of the \nlongitudinal resistance \ud835\udc45++ under the out-of-plane magnetic field \ud835\udc35. = 1 T (Fig. 
S6a), which is \nlarge enough to fix the magnetization along z direction even under a large current excitation. \nThe sensing current is as low as \ud835\udc3c= 0.1 \u00b5A, enabling us to ignore Joule heating. We then \nmeasured Hall resistance under AC with the amplitude of 300 \u00b5A and the frequency of 101 Hz \nas shown in the middle panel of Fig. S6b. At each time, we can estimate the sample temperature \nfrom the value of \ud835\udc45++ as shown on the lower side. There is certainly no hysteresis in temperature \nand it does not exceed the critical temperature \ud835\udc47L = 40 K. This reinforces the idea that \nmagnetization reversal occurs due to spin-orbit torque and that the hysteresis in the Hall voltage \nis caused by magnetization reversal.\n \n \n21 \n \n \nSupplementary Note 4: Nonlinear Hall effect (polynomial-type) \nHere, we consider the Hall voltage \ud835\udc49-(\ud835\udc61) in response to the current \n\ud835\udc3c+(\ud835\udc61) = \ud835\udc3c% sin(\ud835\udf14\ud835\udc61) \nwhen \ud835\udc49- is written as a power series of \ud835\udc3c+; \n\ud835\udc49- = \ud835\udc45-+\ud835\udc3c+ + \ud835\udc45-++\ud835\udc3c+\n# + \ud835\udc45-+++\ud835\udc3c+\n5 + \ud835\udc45-++++\ud835\udc3c+\n6 + \u22ef \nwhere \ud835\udc45\n-++\u22ef+\nNOP\n% (\ud835\udc5b= 1, 2, 3, 4, \u22ef) is coefficient of each order of \ud835\udc3c+. Substituting the \ud835\udc3c+(\ud835\udc61) into \nthis, we obtain \n\ud835\udc49-(\ud835\udc61) = \ud835\udc45-+\ud835\udc3c% sin(\ud835\udf14\ud835\udc61) + \ud835\udc45-++(\ud835\udc3c% sin(\ud835\udf14\ud835\udc61))# + \ud835\udc45-+++(\ud835\udc3c% sin(\ud835\udf14\ud835\udc61))5 + \ud835\udc45-++++(\ud835\udc3c% sin(\ud835\udf14\ud835\udc61))6\n+ \u22ef \n= \ud835\udc45-+\ud835\udc3c% sin(\ud835\udf14\ud835\udc61) + \ud835\udc45-++\ud835\udc3c%\n# 1 \u2212cos(2\ud835\udf14\ud835\udc61)\n2\n+ \ud835\udc45-+++\ud835\udc3c%\n5 f3\n4 sin(\ud835\udf14\ud835\udc61) \u22121\n4 sin(3\ud835\udf14\ud835\udc61)g\n+ \ud835\udc45-++++\ud835\udc3c%\n6 f3\n8 \u22121\n2 cos(2\ud835\udf14\ud835\udc61) + 1\n8 cos(4\ud835\udf14\ud835\udc61)g + \u22ef \n= \ud835\udc49-\n(%) + \ud835\udc494-\n($)sin (\ud835\udf14\ud835\udc61) + \ud835\udc494-\n(#)cos (2\ud835\udf14\ud835\udc61) + \ud835\udc494-\n(5)sin (3\ud835\udf14\ud835\udc61) + \ud835\udc494-\n(6)cos (4\ud835\udf14\ud835\udc61) + \u22ef, \nwhere \n\ud835\udc49-\n(%) = \ud835\udc3c%\n#\ud835\udc45-++/2 + 3\ud835\udc3c%\n6\ud835\udc45-++++/8 + \u22ef, \n\ud835\udc494-\n($) = \ud835\udc3c%\ud835\udc45-+ + 3\ud835\udc3c%\n6\ud835\udc45-++++/4 + \u22ef,\ud835\udc494-\n(#) = \u2212\ud835\udc3c%\n#\ud835\udc45-++/2 \u2212\ud835\udc3c%\n6\ud835\udc45-++++/2 + \u22ef, \n \ud835\udc494\n-\n(5) = \u2212\ud835\udc3c%\n5\ud835\udc45-+++/4 + \u22ef, \ud835\udc494\n-\n(6) = \ud835\udc3c%\n6\ud835\udc45-++++/8 + \u22ef . \nIn this way, all the odd-harmonic components appear as sine functions, while all the even \nharmonic components appear as cosine functions in this power series, as mentioned in the main \ntext. \n \n \n \n \n22 \n \nSupplementary Note 5: Frequency-mixing effect with and without threshold \nIn the main text, we discuss the frequency-mixing effect using the current including 2 \nfrequencies \ud835\udc53$ = 37 Hz and \ud835\udc53# = 125 Hz. Here we describe how the mixed frequencies appear \nin the response. \n \nFirst, let the current be \ud835\udc3c+(\ud835\udc61) = \ud835\udc3c$sin(2\ud835\udf0b\ud835\udc53$\ud835\udc61) + \ud835\udc3c#sin(2\ud835\udf0b\ud835\udc53#\ud835\udc61). 
Supplementary Note 5: Frequency-mixing effect with and without threshold

In the main text, we discuss the frequency-mixing effect using a current containing the two frequencies f_1 = 37 Hz and f_2 = 125 Hz. Here we describe how the mixed frequencies appear in the response.

First, let the current be I_x(t) = I_1 sin(2πf_1 t) + I_2 sin(2πf_2 t). Then, we assume that the Hall voltage can be written as a power series of I_x:

V_y(t) = R_{yx} I_x(t) + R_{yxx} (I_x(t))^2 + R_{yxxx} (I_x(t))^3 + R_{yxxxx} (I_x(t))^4 + ...,

where R_{yx...x} (with n subscripts x, n = 1, 2, 3, 4, ...) is the coefficient of the n-th order of I_x(t). Substituting I_x(t) into it, we obtain many components whose frequencies are expressed as linear combinations of f_1 and f_2. For example, the second-order term is written as

R_{yxx} (I_x(t))^2 = R_{yxx} [I_1 sin(2πf_1 t) + I_2 sin(2πf_2 t)]^2
= R_{yxx} { I_1^2 [1 - cos(4πf_1 t)]/2 + I_2^2 [1 - cos(4πf_2 t)]/2 + I_1 I_2 (cos[2π(f_2 - f_1)t] - cos[2π(f_1 + f_2)t]) }
= R_{yxx} (I_1^2 + I_2^2)/2 - (R_{yxx} I_1^2/2) cos(4πf_1 t) - (R_{yxx} I_2^2/2) cos(4πf_2 t) + R_{yxx} I_1 I_2 cos[2π(f_2 - f_1)t] - R_{yxx} I_1 I_2 cos[2π(f_1 + f_2)t].

In this way, the sum- and difference-frequency components are derived. Other linear combinations of the form a f_1 + b f_2 (with a and b integers) are likewise derived from higher-order terms.

Here, we prove that the components with frequencies a f_1 + b f_2 and a f_1 - b f_2 have the same amplitude if V_y(t) is written in a purely polynomial form of the current, without a threshold behavior. First, when f_1 and f_2 are coprime, like 37 Hz and 125 Hz, the components at a f_1 + b f_2 and a f_1 - b f_2 come only from terms of the form

[I_1 sin(2πf_1 t)]^{a+2m} [I_2 sin(2πf_2 t)]^{b+2n},   m, n = 0, 1, 2, ....

Because the current can also be expressed as I_x(t) = I_1 sin(2πf_1 t) - I_2 sin[2π(-f_2)t], this term must equal

[I_1 sin(2πf_1 t)]^{a+2m} [-I_2 sin(-2πf_2 t)]^{b+2n} = (-1)^{b+2n} [I_1 sin(2πf_1 t)]^{a+2m} [I_2 sin(-2πf_2 t)]^{b+2n} = (-1)^b [I_1 sin(2πf_1 t)]^{a+2m} [I_2 sin(-2πf_2 t)]^{b+2n},   m, n = 0, 1, 2, ...,

since (-1)^{2n} = 1. Thus, the coefficient C_{a,b} of the cos[2π(a f_1 + b f_2)t] term (or sin[2π(a f_1 + b f_2)t] term) and the coefficient C_{a,-b} of the cos[2π(a f_1 - b f_2)t] term (or sin[2π(a f_1 - b f_2)t] term) are necessarily connected by

C_{a,b} = (-1)^b C_{a,-b}.

This shows that the amplitudes of the two components are equal, in accord with the observation in the case of B_x = 2 T shown in Fig. 4d of the main text.
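The equal-amplitude property of the a f_1 + b f_2 and a f_1 - b f_2 components can be checked numerically for a purely polynomial (threshold-free) response. In the sketch below the two-tone amplitudes and the polynomial coefficients are arbitrary illustrative values; the spectrum of the resulting V_y(t) shows matching peaks at each sum and difference frequency.

```python
import numpy as np

f1, f2 = 37.0, 125.0                    # the two tones used in the main text
I1 = I2 = 150e-6                        # illustrative amplitudes
R = [1.0e3, 2.0e6, 5.0e9, 1.0e12]       # illustrative R_yx, R_yxx, R_yxxx, R_yxxxx

fs, T = 8000.0, 1.0                     # 1 s at 8 kHz -> 1 Hz frequency bins
t = np.arange(0.0, T, 1.0 / fs)
Ix = I1 * np.sin(2 * np.pi * f1 * t) + I2 * np.sin(2 * np.pi * f2 * t)
Vy = sum(r * Ix**(n + 1) for n, r in enumerate(R))   # purely polynomial response

spec = np.abs(np.fft.rfft(Vy)) / len(t)
freqs = np.fft.rfftfreq(len(t), 1.0 / fs)
amp = lambda f: spec[np.argmin(np.abs(freqs - f))]

# Sum- and difference-frequency components come out with equal amplitude.
for a, b in [(1, 1), (2, 1), (1, 2)]:
    print((a, b), amp(a * f1 + b * f2), amp(abs(a * f1 - b * f2)))
```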
Figure S1: Basic transport properties of the sample. a Structure of the (Cr,Bi,Sb)2Te3/(Bi,Sb)2Te3 thin film grown on an InP(111) substrate (top) and the Hall-bar device fabricated from the film (bottom). The gray part is the CBST/BST thin film and the gold parts are Au electrodes. b Fundamental transport properties of the sample (Hall-bar device) measured with a low DC current I_{DC} = 0.1 µA using a PPMS. The upper-left, upper-right, and lower-left panels show the magnetic-field (B_z) dependence of the Hall resistivity ρ_{yx}, the Hall conductivity σ_{xy}, and the longitudinal resistivity ρ_{xx}, respectively. All of them are measured at temperatures of 2.5 K (red), 10 K (blue), and 20 K (green). The temperature dependence of ρ_{xx} (red) and ρ_{yx} (blue) under a low magnetic field B_z = 0.01 T is presented in the lower-right panel. From the rise of the anomalous Hall resistivity (ρ_{yx} at 0.01 T), the transition temperature is determined as T_C ≈ 40 K.

Figure S2: Time-domain measurement setup. a Circuit diagram of the measurement system. Current from the current source flows through the sample and a resistor connected in parallel. The oscilloscope measures the Hall voltage or longitudinal voltage of the sample while simultaneously monitoring the current flowing through the system from the voltage across the resistor. The sample is placed in a PPMS chamber. b Configuration of the current (red), the Hall voltage (green), the longitudinal voltage (blue), and the magnetic field (purple) in the sample. The shape of the sample is the same as in Fig. S1a.

Figure S3: Raw data and anti-symmetrized data of the time-domain measurements of magnetization reversal induced by pulses. a Raw data of the Hall voltage (green line in the middle section) in response to the pulse current (red, upper) for the peak current values I_{peak} = 20, 100, 300, 500 µA. The middle and lower sections show the responses under in-plane magnetic fields of +0.05 T and -0.05 T, respectively. The configurations of the magnetic field (purple arrow), the effective magnetic field of the damping-like SOT (light blue), the magnetization (blue), and the pulse current (red) are illustrated on the right side. b Anti-symmetrized responses obtained from the raw data (shown in a) for ±0.05 T. The dependence of the magnetization switching ratio on the peak value is also shown on the right side. The data in this figure were taken on a different sample from the one used in the other figures.

Figure S4: Raw data and anti-symmetrized data of the time-domain measurements of magnetization reversal induced by AC.
a Raw data of the Hall voltage (green lines in the middle and lower sections) in response to the AC current (red, upper) for the peak current values I_{peak} = 14, 71, 141, 300 µA. The middle and lower sections show the responses under in-plane magnetic fields of +0.01 T and -0.01 T, respectively. b Anti-symmetrized data obtained from the raw data (shown in a) for ±0.01 T. As I_{peak} increases, the response V_y(t) gradually becomes nonlinear, as discussed in the main text.

Figure S5: Frequency dependence of AC-induced magnetization reversal. a The Hall voltage response (green) and input current (red) with an amplitude of 300 µA at frequencies of 11, 101, 1001, and 10001 Hz. As the frequency increases beyond 1001 Hz, the amplitude of the measured current decreases, possibly due to parasitic capacitance within the electric circuit. b The Hall voltage versus current characteristics for each frequency. For the higher frequencies of 1001 Hz and 10001 Hz, V_y ≠ 0 when I_x = 0 because of the trivial phase rotation due to the parasitic capacitance. c The change in the Hall resistance, calculated from the Hall voltage and current, as a function of the current. For all frequencies, R_{yx} diverges around I_x = 0 because of V_y/I_x = 0/0.

Figure S6: Estimation of the temperature increase caused by Joule heating. a The lower part shows the temperature dependence of R_{xx} measured with a probe current as low as I = 0.1 µA under the out-of-plane magnetic field B_z = 1 T (the configuration is shown in the upper part). b Time variation of the current I_x (red), the resistance R_{xx} (purple), and the temperature T (black), which is estimated from the resistance. T is estimated to reach at most about 30 K. R_{xx}, and thus T, diverge at I_x = 0 (the gray areas). c The dependence of T on I_x obtained from the I_x(t) and T(t) shown in b. T increases as the absolute value of I_x increases. T diverges around I_x = 0."
- },
- {
- "domain": "Physics",
- "chunk_type": "general",
- "text": "Proteinoid spikes: from protocognitive to universal approximating agents
Saksham Sharma (Cambridge Centre for Physical Biology, CB3 0WA, Cambridge, UK; Unconventional Computing Laboratory, UWE Bristol, UK)
Adnan Mahmud (Department of Chemical Engineering, Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; Zuse Institute Berlin, Takustraße 7, 14195, Germany)
Giuseppe Tarabella (Institute of Materials for Electronic and Magnetism, National Research Council (IMEM-CNR), Parma, Italy)
Panagiotis Mougoyannis and Andrew Adamatzky (Unconventional Computing Laboratory, UWE Bristol, UK)
(Dated: April 15, 2025)

Proteinoids, as soft-matter fluidic systems, are computational substrates that have recently been proposed for their analog computing capabilities. Such systems exhibit oscillatory electrical activity because of cationic and anionic exchange inside and outside such gels. It has also been recently shown that this (analog) electrical activity, when sampled at fixed time intervals, can be used to reveal their underlying information-theoretic, computational code.
This code, for instance, can be expressed in the (digital) language of Boolean gates and QR codes. Though this might seem to be good evidence that proteinoid substrates have computing abilities when subjected to an analog-to-digital transition, the leap from their underlying computational code to computing abilities is not yet well explained. How can the electrical activity inside proteinoids, whilst of chemical origin, enable them to perform computational tasks in the first place? In addition, proteinoids are also hypothesised to be the chemical manifestation of the primordial soup, i.e., potential entities with proto-cognitive abilities. In this work, we show that the proteinoid substrate, owing to its chemical makeup and proto-cognitive abilities, can be interpreted as a universal approximator, thanks to a novel equivalence between the electrical activity exhibited by the substrate and a deep Rectified Linear Unit (deep ReLU) network. We exemplify this equivalence by constructing a prediction algorithm which acts as a binary classification model and extracts 16-dimensional vector data from the proteinoid spikes, in order to perform predictions with 70.41% accuracy. At its core, this model has a unique transformation modality, inspired by number-theoretic sieve theory, which is a combination of two functions: a spiral sampling function F1 and a significant digit extraction function F2. The complexity of the transformed data is measured using eight distinct metrics and, effectively, a single meta-metric. We conclude by drawing an equivalence between the deep ReLU network and the Kolmogorov-Arnold representation theorem, whose origin can be traced back to Hilbert's thirteenth problem.

Proteinoids, made from poly(amino) acids, exhibit oscillatory analog electrical activity because of the cationic and anionic exchanges inside their gelatinous structure. This analog information can be interpreted in a digital format by conversion to Boolean gates and QR codes. Despite this possibility, it is not yet clear what can make such systems capable of performing universal computation similar to how a biological neuron does. Though proteinoids are relatively young in the literature, there are many "fluidic" soft-matter systems that have been shown to have such neuron-like computing capabilities. In the early 2000s, Maass et al. developed the Liquid State Machine, consisting of a cortical microcolumn that connects the neurons randomly and is capable of universal real-time computation [1]. This model was subsequently adopted "literally" by considering water waves in a reservoir and training the system to solve the XOR problem and perform speech recognition [2].

There are plenty of reservoir computing architectures in the literature, built out of fluidic systems upon implementation of neural algorithms. A few such fluidic computing systems include systems for signal analysis and data classification operations [3], computation in the nonlinear regime [4–6], and systems incorporating external force fields such as acoustic fields [7, 8]. One crucial step in implementing such fluidic reservoir computing algorithms is that the response of the physical system to an input impulse or signal is characterised and used for training a given neural network architecture.
Plenty of such architectures exist in the literature, such as Spiking Neural Networks (SNNs) [9], Extreme Learning Machines (ELMs) [6], Echo State Networks (ESNs) [10], Recurrent Neural Networks (RNNs) [11], or, more broadly, neuromorphic computing (NMC) architectures [12], [13], [14]. While each of these networks has its own dictionary and methodology required to implement it, a generic fluidic system (a proteinoid gel in our case) needs a fundamental "theory of computing (ToC)" framework that grants it the inherent capability of universal computing, upon which a suitable NMC methodology can be implemented.

FIG. 1. (a) Voltage-versus-time signal of the oscillatory electrical activity inside proteinoids measured using voltage-sensitive dyes (top) and microscopic images of the proteinoid microspheres (bottom). (b) Voltage-sensitive dyes (VSDs) convert the recordings into spike-versus-time data (Dataset 1 is plotted). (c) Functions F1 and F2 are used to transform Dataset 1 into a multi-nodal graph, further detailed in Section III.

FIG. 2. (a) The steps involved in the analysis shown in the current work, presented in block fashion: five distinct "proteinoid spiking trains" data sets; transformation functions F1 (spiral sampling) and F2 (significant digit extraction); evaluation of eight distinct complexity metrics and a single meta-metric; binary classification model (16-dimensional feature space); confusion matrix, receiver operating characteristic curve, and classification metrics. (b) Data from Dataset 1 is transformed from spike-versus-time to a multi-nodal graph format, with the values of the complexity metrics shown below.

In the current work, we show that proteinoids can be interpreted as universal approximating machines, thanks to a fundamental equivalence between the electrical activity recorded by us in the experiments and a deep Rectified Linear Unit (deep ReLU) network. We use voltage-sensitive dyes to record the electrical activity in proteinoid samples. Five distinct datasets of the spiking trains extracted from the proteinoids are transformed using a novel modality, which is the combination of two functions: F1 (spiral sampling) and F2 (significant digit extraction). After this transformation, the complexity of the transformed data is calculated using eight graph-based discrete metrics, which are then unified into a single meta-metric. Finally, the prediction algorithm for the proteinoid spikes is constructed using a feedforward neural network architecture, which captures three classes of features (temporal, statistical, and spectral) comprising a 16-dimensional vector space. The algorithm is, effectively, a classification model which works with 70.41% accuracy (owing to a restricted data set from the VSD measurements). We conclude by drawing an equivalence between the proteinoid spiking classification model and the Kolmogorov-Arnold representation theorem.

FIG. 3. Raw spike-train datasets: the figure shows "spikes or no spikes" (0s or 1s) in the proteinoid samples prepared in Section I, plotted against time (s) and recorded using voltage-sensitive dyes (VSDs).
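The classification stage can be pictured with a short, entirely illustrative sketch. The excerpt does not specify the sixteen features or the network architecture, so the feature definitions, layer sizes, and the synthetic stand-in data below are hypothetical placeholders; the point is only the shape of the pipeline: a 16-dimensional feature vector (temporal, statistical, and spectral parts) fed to a small feedforward binary classifier.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

def spike_features(spikes: np.ndarray, times: np.ndarray) -> np.ndarray:
    """Hypothetical 16-dimensional feature vector for one spike train:
    4 temporal + 4 statistical + 8 spectral placeholder features."""
    isi = np.diff(times[spikes > 0]) if spikes.sum() > 1 else np.zeros(1)
    temporal = [spikes.sum(), spikes.mean(), isi.mean(), isi.std()]
    statistical = [spikes.var(), np.median(isi), isi.min(), isi.max()]
    spectral = np.abs(np.fft.rfft(spikes))[:8]           # first 8 FFT bins
    return np.concatenate([temporal, statistical, spectral])

# Synthetic stand-in data: 200 random binary spike trains with made-up labels.
rng = np.random.default_rng(0)
times = np.linspace(0.0, 20.0, 400)
trains = rng.integers(0, 2, size=(200, 400)).astype(float)
X = np.stack([spike_features(s, times) for s in trains])
y = rng.integers(0, 2, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(32, 16),
                                  max_iter=2000, random_state=0))
clf.fit(X_tr, y_tr)
print(accuracy_score(y_te, clf.predict(X_te)))
print(confusion_matrix(y_te, clf.predict(X_te)))
```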
Broadly speaking, our analysis also allows us to motivate future directions for this research towards more universal computing paradigms, such as proteinoid transformers or (advanced) proteinoid physical generative AI models. These can inspire experimentalists to come up with physical embodiments of proteinoid GPTs - the collection of which could serve as a universal-computing multiverse in its own sense, amenable to further relatively topical investigations, such as the 'c-word' of such physical systems.

I. PROTOCOL FOR MEASURING ELECTRICAL ACTIVITY INSIDE PROTEINOIDS

The proteinoids are prepared using the protocol we have used before [15]. The experimental setup comprises the following: amino acids in powdered form; 3-hole round-bottom flasks; a tri-block heater; N2 gas; a dialysis cellophane membrane; a magnetic stirrer; and a water bath. 1.5 grams each of L-Aspartic acid, L-Histidine, L-Phenylalanine, L-Glutamic acid, and L-Lysine is mixed and heated in the (3-hole) round-bottom flask at 290 °C. The temperature is increased step by step in 10° increments. After fumes start to appear, the exhaust in the fume hood (with N2 gas as the inlet) is started to release the fumes. The powder goes through a colour transformation from white to green, followed by a morphological transition resulting in the formation of a sequence of simmering microspheres. After the heating is ceased, the remaining material solidifies and is subsequently extracted and allowed to cool for thirty minutes. The collected residue is then placed within a Slide-A-Lyzer mini dialysis apparatus with a 10,000 molecular-weight cut-off, employing water as the dialysate. Dialysis is carried out continuously over five days until the residue comprises microscopic ensembles of microspheres. The residue derived from the dialysis membrane is heated in a vacuum oven for half an hour, facilitating the evaporation of the dialysate (water) from the sample. Subsequently, the sample is scrutinised using a transmission electron microscope. Voltage-sensitive (aminonaphthylethenylpyridinium) dyes are used for recording the electrical activity. During the recording process, the data logger (ADC-24, Pico Technology, UK) operated at its maximum capacity (600 data points per second). This rate of data collection allowed for a comprehensive understanding of the electrical activity exhibited by the proteinoids. Furthermore, the data logger stored and saved the average value obtained from these measurements. This approach provided a concise representation of the recorded electrical activity, facilitating subsequent analysis and interpretation of the experimental results.

II. CHEMICAL PRIMORDIAL SOUP TO A UNIVERSAL APPROXIMATING NETWORK: A MODIFIED KOLMOGOROV-ARNOLD REPRESENTATION

A. Initial Data

The initial dataset comprises five distinct proteinoid spike trains, each representing the firing pattern of individual spikes against time. These spike trains are characterised by a series of temporal points, where each point signifies the timestamp of a firing event. The data is structured as follows:

1. Time series data: Each spike train is represented as a time series.
The x-axis denotes time (in seconds) and the y-axis is binary (0 or 1), indicating the presence or absence of a spike.

2. Temporal resolution: The timing of spikes is recorded with precision up to the order of microseconds (µs).

3. Variable duration: Each spike train may have a different total duration, reflecting the variability in spiking activity periods.

4. Sparse nature: Consistent with typical spiking firing patterns, the spike trains exhibit a sparse structure, with relatively few spikes distributed over the recorded time period.

B. Transformation Process

To extract deeper insights and reveal potential hidden structures within the spike train data, we apply the novel transformation process described below. The transformation converts the (sequential) time series data into a multi-nodal graph structure.

Figure 3 displays the original spike train data for all five datasets. Each row represents a separate dataset, with time (in seconds) on the x-axis and spiking events represented by the vertical black lines on the y-axis. Dataset 1 exhibits a notably dense spike pattern, while Datasets 2-5 show varying degrees of sparsity and rhythmicity in their spiking patterns. Temporal clustering of spikes is evident in some datasets, particularly in Datasets 3-5.

The transformation process involves two key functions, F1 and F2, which operate as follows:

1. Function F1: Spiral Sampling

F1 := x(t) = 10 + (10 - t/2) cos(t·π),   t ∈ {0, 2, 3, ..., 19, 20}   (1)

F1 performs a non-uniform sampling, selecting points from the given spike train in an inward-spiraling fashion. As t ranges from 0 to 20, x(t) generates a series of values that oscillate between 0.5 and 20, with an inward trend. For example, x(13) gives 6.5, and the point in the dataset just below 6.5 is picked. The key aspects of F1 are:

1. Non-linear sampling: Unlike uniform sampling, F1 provides a non-linear distribution of sampling points, potentially revealing patterns at different temporal scales.
2. Bounded output: The output of the function is bounded between 0.5 and 20, ensuring that the sampling remains within a predefined range regardless of the input spike train's duration.
3. Decreasing periodicity: The cosine term introduces an oscillatory behavior with decreasing amplitude, mimicking a spiral pattern when visualised.

2. Function F2: Significant Digit Extraction

F2 := ⌊10^n · (x_i - ⌊x_i⌋)⌋,   where n is the smallest integer such that 10^n (x_i - ⌊x_i⌋) ≥ 1 and 10^{n-1} (x_i - ⌊x_i⌋) < 1   (2)

where x_i represents a time point from the original spike train. This function extracts the first significant digit after the decimal point for each selected time point. The key aspects of F2 are:

1. Scale invariance: By focusing on the first significant digit, F2 captures a scale-invariant property of the time points, potentially revealing patterns that are independent of the absolute time scale.
2. Digit-based clustering: Points with the same first significant digit are grouped together, creating a natural clustering mechanism based on this mathematical property.
The reference point would constitute the hypothetical data point which ends with ".000".

III. TRANSFORMATION PROCEDURE & EFFECTS

Figure 4 illustrates the outcome of applying the transformation process to Datasets 1-5, which involves converting the linear spike trains into complex, three-dimensional graph structures. Table I outlines the six algorithmic steps of the transformation; a short code sketch of F1, F2, and these steps is given after the table.

The transformation acts on the representation of the spike train data and causes the following changes:

1. Dimensionality Increase: The original 1-dimensional time series is transformed into a 3-dimensional structure (time, F1 index, and connection layer).
2. Topology Change: The linear structure of the time series is converted into a complex graph with multiple interconnected nodes.
3. Multi-scale Representation: The combination of F1 spiral sampling and F2 digit extraction creates a multi-scale representation, with the aim of revealing both large-scale temporal patterns and fine-grained similarities between the spike timings.
4. Pattern Emergence: The graph structure may reveal clusters or patterns of spike timings that were not readily apparent in the original linear representation.
5. Information Condensation: While some temporal information is lost in the transformation, the resulting structure condenses information about similarities and patterns in spike timing across different time scales.

TABLE I. Data Transformation Algorithm
Input: Original spike train dataset S, functions F1 and F2
Output: Graph G representing the analysed spike train
Algorithm:
1. Initialise empty sets:
   P ← ∅ (set of primary nodes)
   D ← ∅ (set of secondary nodes)
   E ← ∅ (set of edges)
2. For each t ∈ {0, 2, ..., 20}:
   a) Compute x(t) ← F1(t)
   b) Find p = max{s ∈ S : s ≤ x(t)} (closest point not exceeding x(t))
   c) Add to primary nodes: P ← P ∪ {p}
3. For each p ∈ P:
   Compute d_p ← F2(p) (extract the first significant digit after the decimal)
4. For each s ∈ S:
   a) Compute d_s ← F2(s)
   b) For each p ∈ P:
      If d_s = d_p:
         Add to secondary nodes: D ← D ∪ {s}
         Add edge: E ← E ∪ {(p, s)}
5. Construct the graph: G ← (P ∪ D, E)
6. Return G
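As a rough illustration of Eqs. (1)-(2) and Table I, the sketch below implements F1, F2, and the graph-construction steps in plain Python. The closed form used for F1 follows the worked example x(13) = 6.5, the F2 loop follows the "first significant digit after the decimal point" description, and the toy spike-time list is made up; none of this is the authors' reference implementation.

```python
import math

def F1(t: float) -> float:
    """Spiral sampling of Eq. (1); reproduces x(13) = 6.5 and the 0.5-20 bounds."""
    return 10.0 + (10.0 - t / 2.0) * math.cos(t * math.pi)

def F2(x: float) -> int:
    """First significant digit of the fractional part of x, as in Eq. (2)."""
    frac = x - math.floor(x)
    if frac == 0.0:
        return 0                       # hypothetical reference point ending in ".000"
    n = 1
    while 10**n * frac < 1.0:
        n += 1
    return math.floor(10**n * frac)

def transform(S):
    """Table I: build primary nodes P, secondary nodes D, and edges E."""
    P, D, E = set(), set(), set()
    for t in [0] + list(range(2, 21)):                 # t in {0, 2, ..., 20}
        candidates = [s for s in S if s <= F1(t)]
        if candidates:
            P.add(max(candidates))                     # closest point <= x(t)
    for s in S:
        for p in P:
            if F2(s) == F2(p):                         # same first significant digit
                D.add(s)
                E.add((p, s))
    return P, D, E

# Toy spike-time list (seconds); real datasets come from the VSD recordings.
S = [0.123, 1.337, 2.402, 5.318, 6.372, 9.131, 12.305, 18.347]
P, D, E = transform(S)
print(sorted(P), len(D), len(E))
```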
IV. COMPLEXITY METRICS AND META-METRIC EVALUATION

A. Discrete Complexity Metrics

To quantify the complexity of the transformed spike train data, as noted in Tables II and III, we evaluate a set of graph-based metrics. These metrics capture various aspects of the multi-nodal graph structure, providing insights into the intricacy and patterns within the neuronal firing data. We then combine these metrics into a meta-metric to obtain an overall measure of complexity.

B. Meta-Metric for Overall Complexity

To combine these discrete metrics into a single measure of overall complexity, we define the meta-metric as a metric that provides a nuanced ranking of complexity, avoids artificial bounds, and reflects the continuous nature of complexity. It is calculated as follows:

1. Normalisation: Each metric is normalised using a z-score to ensure comparability across different scales:

z_i = (x_i - µ_i) / σ_i   (3)

where x_i is the original metric value, µ_i is the mean, and σ_i is the standard deviation of metric i across all datasets.

2. Weighted Sum: A weighted sum of the normalised metrics is computed:

S = Σ_{i=1}^{8} w_i z_i   (4)

where w_i are the weights assigned to each metric [16], reflecting their relative importance in determining overall complexity.

3. Sigmoid Transformation: The weighted sum is transformed using a sigmoid function:

Meta-metric = 1 / (1 + e^{-S})   (5)

The sigmoid transformation ensures that the meta-metric asymptotically approaches 0 for absolutely simple systems and 1 for absolutely complex systems, without ever exactly reaching these values. This reflects the idea that complexity exists on a continuous spectrum, and it allows for meaningful comparisons between datasets while leaving room for potentially more or less complex datasets in future analyses.
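A compact sketch of Eqs. (3)-(5) is given below. The metric values for the five datasets and the uniform weights are placeholders (the paper takes its weights from Ref. [16]); the function simply z-scores each metric across datasets, forms the weighted sum, and applies the sigmoid.

```python
import numpy as np

def meta_metric(metrics: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Eqs. (3)-(5): z-score each metric across datasets (columns),
    take a weighted sum per dataset (rows), then squash with a sigmoid."""
    z = (metrics - metrics.mean(axis=0)) / metrics.std(axis=0)   # Eq. (3)
    s = z @ weights                                              # Eq. (4)
    return 1.0 / (1.0 + np.exp(-s))                              # Eq. (5)

# Rows = datasets 1-5, columns = the eight metrics (N, E, k_bar, C, rho, H, CC, R).
# The numbers below and the uniform weights are placeholders, not measured values.
metrics = np.array([
    [120, 340, 5.7, 0.42, 0.05, 2.1, 3, 0.31],
    [ 64, 210, 6.6, 0.51, 0.10, 2.4, 2, 0.27],
    [ 48, 130, 5.4, 0.38, 0.11, 1.9, 4, 0.35],
    [ 71, 180, 5.1, 0.44, 0.07, 2.2, 3, 0.30],
    [ 55, 150, 5.5, 0.40, 0.09, 2.0, 2, 0.33],
], dtype=float)
weights = np.full(8, 1.0 / 8.0)
print(meta_metric(metrics, weights))   # one value in (0, 1) per dataset
```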
FIG. 4. Multi-nodal graph transformation of the spike-train datasets: subplots (i-v) each represent a transformed dataset, where the x-axis represents time in seconds, the y-axis represents the F1 index, and the z-axis (vertical) represents the connection layer. Red nodes indicate the primary nodes selected by the F1 function, and the grey edges connect the primary nodes to their corresponding secondary nodes.

FIG. 5. Comparison of complexity metrics across datasets (in linear order: 1, 2, 4, 5, and 3). The y-axis data correspond to: (i) number of nodes (N), (ii) number of edges (E), (iii) average degree (k̄), (iv) clustering coefficient C, (v) graph density ρ, (vi) degree entropy H, (vii) number of connected components CC, (viii) average resistance R.

C. Analysis of Complexity Metrics

To analyse the complexity of the transformed spike train data across our five datasets, we compute eight individual metrics and a meta-metric for each dataset. Table III and Figure 5 provide a comprehensive comparison of these metrics across datasets 1-5.

Complexity is not solely determined by the graph size. Dataset 1 ranks fourth despite being the largest, partly because of its local connectivity and because the clustering coefficient is a strong contributor to overall complexity. Dataset 2 attains the highest complexity because it contains a fine balance of the complexity metrics. In addition, connected components play a nuanced role in the measurement of the complexity. Additionally, degree

TABLE II. Discrete Complexity Metrics (metric: description; equation)
Number of Nodes (N): Counts the total number of nodes in the graph; N = |V|, where V is the set of vertices in the graph.
Number of Edges (E): Counts the total number of connections in the graph; E = |E|, where E is the set of edges in the graph.
Average Degree (k̄): Measures the average number of connections per node; k̄ = 2E/N.
Clustering Coefficient (C): Quantifies the degree to which nodes in the graph tend to cluster together; C = (1/N) Σ_{i=1}^{N} C_i, where C_i is the local clustering coefficient of node i.
Graph Density (ρ): Measures how close the graph is to being complete; ρ = 2E / (N(N-1)).
Degree Entropy (H): Quantifies the complexity of the degree distribution; H = -Σ_k P(k) log P(k), where P(k) is the probability of a node having degree k.
Number of Connected Components (CC): Counts the number of disconnected subgraphs within the overall graph structure; N/A.
Average Resistance (R): Measures the overall connectivity structure within components; R = (1/|CC|) Σ_{c∈CC} (1/\binom{|V_c|}{2}) Σ_i