Learning Neural Operators on Riemannian Manifolds

In Artificial Intelligence (AI) and computational science, learning the mappings between functions (called operators) defined on complex computational domains is a common theoretical challenge. Recently, Neural Operator emerged as a promising framework with a discretisation-independent model structure to break the fixed-dimension limitation of classical deep learning models. However, existing operator learning methods mainly focus on regular computational domains, and many components of these methods rely on Euclidean structural data. In real-life applications, many operator learning problems are related to complex computational domains such as complex surfaces and solids, which are non-Euclidean and widely referred to as Riemannian manifolds. Here, we report a new concept, Neural Operator on Riemannian Manifolds (NORM), which generalises Neural Operator from being limited to Euclidean spaces to being applicable to Riemannian manifolds, and can learn the mapping between functions defined on any real-life complex geometries, while preserving the discretisation-independent model structure. NORM shifts the function-to-function mapping to finite-dimensional mapping in the Laplacian eigenfunctions' subspace of geometry, and holds universal approximation property in learning operators on Riemannian manifolds even with only one fundamental block. The theoretical and experimental analysis prove that NORM is a significant step forward in operator learning and has the potential to solve complex problems in many fields of applications sharing the same nature and theoretical principle.


Introduction
Many scientific discoveries and engineering research activities involve the exploration of the intrinsic connection and relationship between functions [1,2].In mathematics, the mapping between two functions is called the operator [3].Establishing the operator defined on complex computational domains has been a theoretical challenge [4].One ubiquitous example of operator is the solution operator of Partial Differential Equations (PDEs) [5], which provide the foundational descriptions of many nature laws.Solving PDEs under different parameters, initial and boundary conditions can be regarded as finding the solution operators [6,7].A more practical example is that, for nuclear fusion, establishing the operator that links the input controlling coil voltage to the plasma distribution in the complex tokamak vessel could enable rapid and accurate forecasting of plasma field evolution, thus pointing to a promising direction towards sustainable fusion [8,2].There are also requirements for establishing operators in a wide range of other complex field prediction scenarios, such as predicting the blood flow dynamics of the human body for the purpose of cardiovascular disease diagnosis and treatment [9,10], and predicting the pressure field of an aircraft for fuselage structure optimisation [11,12].Physical experiments and numerical simulations are commonly used methods for finding the mapping between two functions (i.e.operators) [13,1].Due to the complex process of the underlying operators, especially when involving complex computational domains like tokamak vessels, human organs or aircraft structures, the high computational and experimental costs of these methods are prohibitive for real-world situations [14,15].
Artificial Intelligence (AI) techniques recently emerged as a promising paradigm shift for learning operators directly from data [1].Classical deep learning methods, such as Convolutional Neural Networks (CNNs) and deconvolution techniques, can learn the mapping between discretised picturelike uniform grid data to approximate the operator [16,17].Graph Neural Networks (GNNs) can represent the computational domain as a graph and then learn the properties of the nodes through message passing [18,19].However, since the network structure and the parameterisation of CNNs and GNNs heavily depend on the discretisation-resolution of the computational domain [20], the high-dimensional discretisation of the computational domains will bring significant computational burdens to model training, and lead to slow convergence or even divergence when learning general nonlinear operators [21].Recently, Neural Operators (NOs), such as DeepONet [22] and Fourier Neural Operator (FNO) [23], were proposed as a new deep learning approach that could directly learn mappings between functions on continuous domains with a discretisation-independent model structure (i.e., the parameterisation of the model is independent of the discretisation of the computational domain) [24].Despite the significant success of NOs, they mainly focus on learning the mapping between functions defined on regular computational domains (data in the form of a picture-like uniform grid), and many components of these methods rely on Euclidean structural data, for example, Fast Fourier Transform in FNO [23], image convolution layer in U-shaped Neural Operator (UNO) [25], and Wavelet transform in Wavelet Neural Operator (WNO) [26].However, real-life applications are more complex and many are in irregular computational domains.Existing NOs often have to convert irregular data to the form as regular uniform grid by coordinate transformation [27,28] or grid interpolation [20,29].However, coordinate transformation techniques are normally limited to converting simple 2-dimensional (2D) irregular computational domains due to the poor intrinsic representation [27,20], whilst grid interpolation often leads to high-dimensional discretisation and thus brings significant computational burdens to model training, especially for 3dimensional (3D) computational domains [17].Therefore, existing NOs have limitations in solving operator learning problems of real-life applications with irregular computational domains, including complex surfaces and solids, which are non-Euclidean structural data, and widely referred to as Riemannian manifolds.
This research proposed a deep learning framework with a new concept called Neural Operator on Riemannian Manifolds (NORM), as shown in Fig. 1a.NORM could break the limitations of existing NOs and extend the applicability from Euclidean spaces to Riemannian manifolds.NORM can learn the mapping between functions defined on any Riemannian manifolds, including 2D and 3D computational domains, while maintaining a model structure independent of discretisation.Compared with learning operators directly in the Euclidean space, the fundamental blocks of NORM shift the function-to-function mapping to the finite-dimensional mapping in the Laplacian eigenfunctions' subspace of geometry (Fig. 1c).Because Laplacian eigenfunctions have been proven to be the optimal basis for approximating functions on Riemannian manifolds [30], NORM can learn the global geometric information effectively and accurately without increasing the complexity of parameterisation.Besides, we have proved that NORM could hold the universal approximation property even with only one fundamental block.The effectiveness of the proposed framework was demonstrated through several different tasks in science and engineering, including learning solution operators for classical PDEs, composite workpiece deformation prediction and blood flow dynamics prediction.

Problem definition
Learning operators on Riemannian manifolds refers to learning a mapping between two functions defined on Riemannian manifolds, as shown in Fig. 1a.Denote G : A(X ; R) → U(Y; R) a continuous operator, namely the underlying mapping between the input and the output functions.The input a ∈ A (X ; R) is a function a(x) : X → R, x ∈ X , the output u ∈ U (Y; R) is a function u(y) : Y → R, y ∈ Y. X and Y are Riemannian manifolds.Assuming that both A and U are L 2 spaces, then, the problem of learning operator on Riemannian manifolds is to learn a parameterised operator G θ to approximate G, i.e.G θ ≈ G, θ ∈ R p .
Since the input function a and the output function u are both defined on Riemannian manifolds, the obvious solution is to transfer them into a new representation that can be processed with existing Euclidean learning models.Ideally, the solution should be feasible and consistent for any functions defined on Riemannian manifolds.At the same time, the new representation should be low-dimensional while maintaining the information of the original functions.Therefore, we first propose a simple approximation block with an encoder-approximator-decoder structure to transfer the mapping between functions on Riemannian manifolds to a finite-dimensional mapping on Euclidean space.
The approximation block for learning operators on Riemannian manifolds can be defined as a mapping denotes the encoder that maps the function on manifold X to Euclidean space, R : R approximator, a learning model for Euclidean data, D : R d Y → U(Y; R) is an inverted mapping to recover the prediction function on manifold Y. Similar encoder-approximator-decoder structures were also applied in learning mappings between functions defined on Euclidean space [31,32].To learn operators on Riemannian manifolds, the primary challenge lies in how to design the encoder and decoder mapping to process functions on manifolds without increasing the model complexity.These two mappings would not only influence the feature extraction capability of the learning model, but also determine whether the model holds universal approximation property.

Constructing mappings using Laplacian
The discretisation-independent target of the neural operator reminds us of the mesh-free spectral method in PDE solving [33].Intuitively, the spectrum of manifolds could naturally describe the intrinsic information of operators on manifolds.The ideal choice of the spectrum for operator learning is the eigenfunctions of the Laplacian, which is a set of orthonormal basis [34], and has been proven to be the optimal basis for approximating functions defined on Riemannian manifolds [30,35].Therefore, the encoder E and the decoder D could be constructed as the spectral decomposition and the spectral reconstruction on the corresponding Laplacian eigenfunctions.
The Laplacian occurs in a wide range of differential equations describing science and engineering problems, such as the heat transfer function, Poisson's equation, diffusion equation, and wave equation [35].For the Euclidean space R d and a twice-differentiable function f , the Laplacian ∆f is a second-order differential operator defined as the divergence ∇• of the gradient ∇f , that is ∆f = ∇ • ∇f = ∇ 2 f .The eigenvalue problem for the Laplacian can be defined as ∆ϕ i = λ i ϕ i , where the λ i (λ 1 ≤ λ 2 ≤ . ..) and ϕ i (x) that satisfying this equation are defined as the eigenvalues and the corresponding eigenfunctions.Actually, the Fourier basis e 2πikx is also the eigenfunction of the Laplacian with the eigenvalue λ = −(2πk) 2 [36].
Since the divergence operator ∇• and gradient operator ∇f can also be defined on manifolds with Riemannian metric g, the Laplacian ∆f can be naturally extended to the Riemannian manifold, which is also called the Laplace-Beltrami operator (LBO) [37].Therefore, we could obtain the Laplacian spectrum of manifolds in a similar way as in Euclidean space, as shown in Fig. 1d.
For Riemannian manifold M, the LBO eigenfunctions ϕ i (x) is a set of orthonormal bases for the Hilbert space L 2 (M).It can be proved that a finite number of leading LBO eigenfunctions can approximate functions on manifolds with any accuracy [30].Therefore, for the approximation block N = D • R • E, the encoder E can be defined as the spectral decomposition on the LBO eigenfunctions ϕ X ,i of the input manifold X : And the decoder can be defined as the spectral reconstruction on the LBO eigenfunctions ϕ Y,i of the output manifold Y: With the defined encoder X and decoder D, an approximation block N = D • R • E could potentially learn the mappings between functions on manifolds with a simple parameterised Euclidean learning model R. Since LBO can be defined on any Riemannian manifold, the block N can naturally deal with any complex geometric domain, which breaks the limitations that existing neural operators relying on Euclidean structured data.Meanwhile, the approximation block holds the discretisation-independent property, because R is parameterised on Euclidean spaces with size only related to the truncated eigenfunctions of the input and output manifolds.
Although Laplacian is defined mathematically on smooth domains, practical numerical computation typically requires discrete approximations of domains, such as meshes or point clouds.The LBOs of common geometric meshes have been strictly defined in the differential geometry field [38], including triangular mesh, quadrilateral mesh, and tetrahedral mesh.In Supplementary Materials S2.1, an example of discretised LBO for triangular mesh is provided.

Framework of Neural Operator on Riemannian Manifold
The approximation block N = D • R • E can transfer the mapping between functions on Riemannian manifolds to a finite-dimensional Euclidean space learning problem.However, one basic block only approximates the target operator by a linear subspace, which is inefficient in extracting non-linear low-dimensional structures of data.Here, we propose a new deep learning framework, the Neural Operator on Riemannian Manifolds (NORM), that consists of multiple layers and in which the approximation block N constitutes one layer of the model, like the convolution layer in traditional CNN.
We start from a common situation, assuming the input and output functions are defined on the same manifold M, i.e.X = Y = M.The structure of NORM can be represented as the form shown in Fig. 1b, consisting of two feature mapping layers P, Q and n L hidden layers.The shallow network P lifts the input function a to get v 0 = P(a), where P : L 2 (M; R) → L 2 (M; R dv ), so as to expand the dimension of features to increase the representation ability, similar to the convolution channel expansion in CNN.Multiple hidden layers, defined as the Laplace layer, or L-layer (Fig. 1c), would update the input function iteratively, such as v l−1 → v l = L l (v l−1 ) in the L-layer l.After that, the final shallow network Q would project the high-dimensional features to the output dimension, namely u = Q (v n L ), where Q : L 2 (M; R dv ) → L 2 (M; R).The iterative structure can be represented as: The iteration of the hidden layers is given as follows: where the linear transformations W l ∈ R dv×dv and the bias b l ∈ R dv are defined as pointwise mapping.σ is the non-linear activation function like in the traditional neural network.Note that, the LBO eigenfunctions required in the approximation block can be pre-computated before training the model, as shown in Fig. 1d.The detailed implementation of the discredited version of the approximation block N (v) is provided in Supplementary Materials S1.1.
The above definition introduces the NORM structure where the input and output are defined on the same manifold.Nevertheless, the structure can be easily generalised to the settings where the input and output are defined on different manifolds and several different structures are introduced in Supplementary Materials S1.2.
Note that the parameterisation of NORM is independent of the discretisation of the input and output functions, because all operations are defined directly in the function spaces on manifolds rather than the Euclidean coordinate spaces.P, Q are learnable neural networks between finite-dimensional Euclidean spaces and have the same point-wise parameterisation for all x ∈ M. Therefore, NORM can learn the mappings between functions on any Riemannian manifolds while maintaining the discretisation-independent property.

Universal Approximation of NORM
Many recent studies have investigated the universal approximation properties of neural operators between functions on Euclidean spaces [39,24].This section will show the advantage of the proposed method that even one approximation block N of NORM holds the universal approximation ability in learning operators between functions defined on Riemannian manifolds.
Let N = D • R • E be a neural operator for the continuous mapping A(X ; R) → U(Y; R), and R : R d X → R d Y represents a neural network that has universal approximation property.The encoder is defined as: E : A → R d X and E(a) := (⟨a, ϕ X ,1 ⟩ , . . ., ⟨a, ϕ X ,d X ⟩), ∀a ∈ A. The decoder is defined as D : R d Y → U, and X and Y are Riemannian manifolds.A and U are L 2 spaces.ϕ X ,i and ϕ Y,i are LBO eigenfunctions of manifolds X and Y, respectively.It should be noted that N = D • R • E is a basic block of the NORM, and also can be treated as the simplified version of NORM.Therefore, the universal approximation property of N could guarantee the universal approximation property of the more complex NORM framework.The universal approximation theorem of neural operators on Riemannian manifolds is as follows: Theorem.Universal approximation theorem for neural operators on Riemannian manifolds.Let G : A(X ; R) → U(Y; R) be a Lipschitz continuous operator, K ∈ A is compact set.Then for any ϵ > 0, there exists a neural operator N = D • R • E, such that: Proof of Theorem.It is challenging to directly prove the approximation error from A → U. Therefore, we establish a low-dimensional projection subspace of A and U spanned by the corresponding LBO eigenfunctions.It can be first proved that N holds universal approximation property in learning operators between the projection subspaces.Since LBO eigenfunctions is a group of basis in L 2 space, the projection error can be proven to be ϵ−approximation.Therefore, the final approximation error of N can be obtained by combining the approximation error on the subspace, the encoding error on the input, and the decoding error on the output.The detailed proof can be found in Supplementary Materials S3.

Results
The proposed NORM was verified using three toy cases and two practical engineering cases with 2D or 3D complex geometric domains.The three toy cases of learning PDEs solution operators involved different problem settings and input/output structures: (1) The Darcy problem case aims to learn the mapping from the parameter function (the diffusion coefficient field) to the solution function (the pressure function field), where both functions are defined on the same 2D manifold; (2) The pipe turbulence case is a classical dynamics systems prediction setting, namely, predicting the future state field based on the current state field (velocity field in the pipe), and (3) The heat The values A(B) represent the mean and standard deviation of five repeated runs, respectively.
transfer case tries to learn the mapping from the boundary condition (temperature function on 2D manifold) to the temperature field of the part (temperature function on 3D manifold).The two engineering cases are composite workpiece deformation prediction and blood flow dynamics prediction: (4) The composite case aims to learn the mapping from the temperature field to the final deformation field of a 3D composite workpiece, where the deformation mechanism involves complex physicochemical processes other than only PDEs, and (5) For the blood flow dynamics case, the inputs are multiple time series functions, and the output is the spatiotemporal velocity field of the aortic (3D manifold).
We compared the NORM with several popular neural operators, including DeepONet [22], POD-DeepONet [20], FNO [23], and also one classical Graph Neural Networks (GNN), named GraphSAGE [40].For the 2D cases, the irregular geometric domains were interpolated to a regular domain for the implementation of FNO.For the 3D cases, we did not compare with FNO because of the prohibitive complexity of 3D spatial interpolation.Since the message-passing mechanism in graph learning methods typically focuses on problems with the same input and output graphs, we did not compare GNN for the heat transfer case and blood flow dynamics case.The details about data generation and baseline model configurations are described in the Supplementary Materials S4 and S5.The quantitative comparison results of all methods are presented in Table .1.We considered two error metrics: E L 2 is the mean relative L 2 error of all test samples, and Mean Maximum Error (MME) refers to the mean value of all test samples in terms of the maximum error in the whole computational domain.

Darcy problem (Case 1)
Darcy flow equation is a classical law for describing the flow of a fluid through a porous medium.This problem is also widely used for various neural operator verification [24].Darcy's law can be mathematically described by the following equation: where a is the diffusion coefficient field, u is the pressure field and f is the source term to be specified.As shown in Fig. 2a, the computational domain is a 2D geometric shape represented by a triangle mesh with 2290 nodes.The geometric domain has an irregular boundary with a thin rectangle notch inside, which can increase the complexity of the learning problem.The operator learning target in the Darcy flow problem is the mapping from the diffusion coefficient field a(x) to the pressure field u(x): The labelled data for training the neural operator model is the pair of a(x) and u(x).1200 sets of input data a(x) are randomly generated first.Then the corresponding u(x) is solved by Matlab's SOLVEPDE toolbox.1000 of them are used as the training dataset, and the rest 200 groups are defined as the test dataset.
Fig. 2a reports the comparative prediction results for one representative in the test dataset.It can be observed that the output field and the NORM predicted result show excellent agreement.The prediction results and errors of comparison methods are provided in the Extended Fig. 1, in which ∆ mean refers to the average absolute error over all nodes in the geometric domain, and ∆ max means the maximum absolute error on all nodes.Due to inaccurate grid interpolation, FNO has the most significant error, especially in the boundary region.DeepONet and POD-DeepONet show significant errors on the right side of the rectangle.The quantitative results on the test dataset are listed in Table .1. NORM can achieve the lowest error compared with all other baseline methods.

Pipe turbulence (Case 2)
Turbulence is a vital flow state of the fluid, which reflects the instability of the fluid system [41].Here, we considered turbulent flows in a complex pipe, of which the underlying governing law is the 2D Navier-Stokes equation for a viscous incompressible fluid: where v is the velocity, p is the pressure, and the fluid chosen is water.The geometric design of the irregular pipe is shown in Fig. 2b, where the left and right ends are inlet and outlet, respectively.For a given inlet velocity, we performed the transient simulation to predict the velocity distribution in the pipe.The velocity field data are represented by a triangular mesh with 2673 nodes.Details about data generation and simulation settings can be found in the Supplementary Materials S4.1.2.
The operator learning problem of this case is defined as the mapping from the velocity field v(x, t 1 ) to the velocity field v(x, t 2 ), where t 2 = t 1 + 0.1s: The considered baseline methods are the same as Darcy problem.Fig. 2b shows the predictive performance of NORM on one representative in the test dataset, which gives consistent prediction compared with the ground truth.The prediction results and errors of baseline models are provided in the Extended Fig. 2. FNO achieves minor errors in smooth areas but large errors in sharp areas because of the grid interpolation, leading to small ∆ mean but large ∆ max .POD-DeepONet, like NORM, has a uniform distribution of errors, while the error value is slightly larger than NORM.DeepONet has the most significant prediction error compared to other methods in this task.The quantitative statistical results can be seen in Table 1.

Heat transfer (Case 3)
Heat transfer describes the transfer of energy as a result of a temperature difference, which widely occurs in nature and engineering technology [42].The heat equation can be written in the following form (assuming no mass transfer or radiation).
where T is the temperature as a function of time and space.ρ, C, and K are the density, specific heat capacity, and thermal conductivity of the medium, respectively.And Q is the internal heat source.
The heat transfer case was designed on a three-dimensional solid part, as shown in Fig. 2c.The learning problem is defined as the mapping from the 2D boundary condition T bc (x) to the 3D temperature field T t=3s (y) of the solid part after 3s of heat transfer.
As shown in Fig. 2c, the input geometric domain is represented by a triangular mesh with 186 nodes, and the output geometric domain is represented by a tetrahedral mesh with 7199 nodes.The labelled dataset was generated by the commercial simulation software Comsol.The training dataset consists of 100 labelled samples, and another 100 groups are defined as the test dataset.More details are given in the Supplementary Materials S4. 1.3.In this case, the input and output functions are defined on different manifolds, thus the different L-layers of NORM have to utilise different LBO eigenfunctions.The model structure of NORM is given in Fig. S1b.The beginning L-layers of NORM employ the LBO eigenfunctions of the input manifold for both the encoder and decoder.One middle L-layer of NORM utilises the LBO eigenfunctions of the input manifold for the encoder while taking the LBO eigenfunctions of the output manifold for the decoder.The ending L-layers employ LBO eigenfunctions of the output manifold for both the encoder and decoder.FNO is not implemented for this case due to the prohibitive computational complexity of 3D spatial interpolation.The prediction results of different methods for one typical test data are shown in the Extended Fig. 3. DeepONet has a large prediction error where the temperature gradient is large.POD-DeepONet has different errors on different temperature regions of the left end face, while the error of NORM is smaller and only appears in a few small areas.Moreover, the statistical results for all methods on the test dataset are shown in Table 1, where NORM shows the smallest relative L 2 error.

Composite workpiece deformation prediction (Case 4)
This case study investigated the effectiveness of the proposed NORM on a complex 3D irregular geometry, specifically in predicting the curing deformation of a Carbon Fiber Reinforced Polymer (CFRP) composite part.CFRP composites are known for their lightweight and high-strength properties, thus becoming preferred materials for weight reduction and performance enhancement in modern aerospace industries [15].The large size and high accuracy requirements of aerospace CFRP composite parts impose increased demands on deformation control during the curing process [43].Regulating the curing temperature distribution of a part is an effective means of controlling curing deformation.Therefore, constructing the predictive model of the temperature-to-deformation field on the geometry can provide essential support for further curing process optimisation [44].As shown in Fig. 3a, the CFRP composite workpiece used for the case study is the air-intake structural part of a jet.This workpiece is a complex closed revolving structure formed by multiple curved surfaces, which would deform significantly after high-temperature curing.The learning problem of this case is defined as the mapping from the temperature field a(x, y, z) to the deformation field u(x, y, z) on the given composite part.
Fig. 3b shows the prediction result of NORM and the prediction error of baseline methods of one test sample.It can be found that the error map of NORM is almost 'green' for the whole part, which means that predicted deformation field is very close to the reference value.Table .1 shows that NORM outperforms all baseline methods in both E L 2 and MME.Fig. 3c shows the distribution of prediction error over all nodes of all test samples.It can be seen that the prediction errors of all nodes for all methods show Gaussian distributions with mean values approximating zero.The estimated standard deviations of different methods are marked in each figure.By comparison, NORM can achieve a lower prediction error uniformly and comprehensively for most nodes.
Composite manufacturing is a risk-sensitive problem, so it is not sufficient to consider only the relative L 2 error and average statistical results.According to the deformation prediction evaluation criteria provided by the engineers of the collaborating company, the maximum prediction error of the deformation field predicted by the data-driven model should be less than 0.2mm.Fig. 3d reports the maximum prediction errors of all test cases.NORM is not only far outperforming the comparative methods but also has all test samples with a maximum prediction error of less than 0.2mm.

Blood flow dynamics prediction (Case 5)
Blood flow dynamics is the science of studying the characteristics and regularities of blood movement and its constituents in the organism, which is closely related to human health [45].To explore the potential of NORM for aortic hemodynamic modelling (Fig. 4a), we consider a similar scenario as described in reference [46] where the inputs pv| [0,T ]×6 are time-varying pressure and velocity at the inlet/outlets, and the output v| M×[0,T ]×3 is the velocity field of blood flow consisting of velocity components in three directions [47], as shown in Fig. 4b.The spatial domain is represented by a tetrahedral mesh with 1656 nodes, and the temporal domain is discrete with 121 temporal nodes.It is worth pointing out that the challenges of this case lie in two aspects: 1) time-space complexity, i.e. the output function defined on the complex geometric domain is time-varying; 2) unbalanced node values, i.e. the velocity of most nodes is close to zero due to the no-slip boundary condition.
Since the Fourier basis is also a group of the LBO eigenfunction, NORM can naturally deal with the temporal dimension of input and output functions using the Fourier basis, as discussed in Supplementary Materials S1.2.Hence NORM adopted the structure of Fig. S1c.Statistics results of the NORM and two benchmarks (DeepONet and POD-DeepONet) are presented in Table 1.It is evident that NORM yields the smallest MME and relative L 2 error with minor variation.It stands to reason that at nodes v → 0, even a slight prediction bias would lead to a significant relative L 2 error, but the proposed NORM achieves an impressive relative L 2 error 4.822%, compared with 89.26% of DeepONet and 37.42% of POD-DeepONet, which demonstrates the remarkable approximation capability of NORM.Fig. 4c compares the visualisation of the velocity streamlines (snapshots at a representative time) against baseline methods.We observe that NORM achieves an excellent agreement with the corresponding ground truth, while POD-DeepONet and DeepONet only learn the general trend of velocity distribution but lose the predictive accuracy of the node value.Especially, DeepONet fails to capture the local details of streamlines at inlets and outlets.Additional comparison visualisations of other moments can be found in Supplementary Materials S6.
Furthermore, Fig. 4d provides the perspective to investigate the predictive accuracy of the node velocity evolution over time, which projects the three-dimensional vector onto the xy-plane.NORM agrees well with ground truth regarding phase and amplitude, while POD-DeepONet shows a smaller overall amplitude, and DeepONet loses accuracy in both aspects.Finally, the comparison between ground truth and predictions for the magnitude of the velocity vector at 5000 spatiotemporal nodes randomly sampled from all test samples is plotted in Fig. 4e.Compared to NORM (R 2 = 0.998), despite a quasi-linear relationship maintained by POD-DeepONet (R 2 = 0.859), a prediction bias amplifies as the velocity increases.We conjecture that it is the approximation bias introduced by using the linear superposition method to fit complex nonlinear problems.As for DeepONet (R 2 = 0.567), since its training mode is point-wise and the loss function used for training is the relative L 2 error, the updating of the model parameters is mainly driven by the nodes v → 0.Then, the model outputs tend to be zero, resulting in a trade-off with the optimisation of other nodes.Therefore, the overall prediction of nodes in DeepONet appears more dispersed and does not show a linear relationship.

Analysis
The encoder and the decoder of NORM are constructed by the spectral decomposition and the spectral reconstruction on the corresponding LBO eigenfunctions.This prompts a natural question: Could there be a more suitable basis than LBO eigenfunctions?From a model reduction point of view, the Proper Orthogonal Decomposition (POD) could also provide the projection basis to construct the encoder and the decoder.Consequently, NORM could be naturally extended to POD-NORM, wherein the POD modes of the training dataset replace the LBO eigenfunctions.Note that, the input data and the output data have different POD modes, so the structure of POD-NORM is similar to NORM with different input and output manifolds (Fig. S1b in the Supplementary Materials).Therefore, NORM and POD-NORM were compared to demonstrate the advantages of LBO eigenfunctions.We focus on the performance comparison of the Darcy problem and the composite case, because the input fields of these two tasks are more complex, which brings more challenge to the representability of the spectrum.The data results reported in Fig. 5 are the average based on five repeated runs.We first compared the performance of NORM, POD-NORM, and POD-DeepONet across various numbers of modes from 16 to 896.Fig. 5a shows the error tendency of the different methods with different mode numbers.Each case contains the results with two different sizes of the training dataset, {1000, 1500} for the Darcy case and {200, 800} for the composite case.For the Darcy case, the prediction errors of all three methods decrease rapidly as the number of modes increases, eventually converging to a stable performance level.Notably, POD-NORM and POD-DeepONet have similar performance, and NORM shows smaller errors under all number of modes.These findings indicate that the LBO eigenfunctions possess a more robust representation capability compared to POD modes.In Fig. 5a, we can observe that, in the composite case, increasing the number of POD modes does not appear to reduce the prediction error of POD-DeepONet and POD-NORM significantly.In contrast, NORM continues to show a clear decreasing trend in error while maintaining its leading performance.
To further explain the performance difference between the two modes in the composite case, we conducted a comparative analysis of the spectral decomposition of both the input temperature field and the output deformation field using LBO and POD modes.As shown in Fig. 5b, the top 100 POD decomposition coefficients of the deformation field decreases rapidly from magnitudes of 10 2 to 10 −3 , and the decomposition coefficients of the temperature field drop from 10 1 to 10 −4 suddenly.That indicates that the feature representation after the encoder E contains coefficients spanning a wide range, from 10 −4 to 10 2 , which could bring challenges for the learning process.Besides, since the high-order POD coefficients of the deformation field are extremely small, any errors in these coefficients could lead to significant sensitivity in the reconstructed results generated by the decoder D. By comparison, the LBO decomposition coefficients fluctuate in a relatively smaller range.This observation provides a potential explanation for why NORM consistently outperforms POD-NORM in most scenarios.
Another key distinction between these two modes lies in their underlying principles.POD modes are learnt from data, making their accuracy and generalisability heavily dependent on the size of training data.In contrast, LBO eigenfunctions are entirely independent of the training data.Therefore, we further compare the error tendency for different operator learning methods with respect to the training dataset size.For the Darcy problem, the training dataset sizes vary from 400 to 2000, and the test dataset is an additional 200 groups labelled data.For the composite case, the training dataset sizes are set from 100 to 1000, and another 100 groups are defined as the test dataset.The number of modes is consistently set to 128 for POD-DeepONet, POD-NORM, and NORM.The results are presented in Fig. 5c.Notably, we observe that NORM exhibits a more rapid convergence rate as the training dataset increases, outperforming the other methods.In particular, for the Darcy problem, a NORM with 1200 samples can achieve a relative L2 error of less than 1.00%, while DeepONet, POD-DeepONet, and POD-NORM with 2000 samples are 1.04%, 1.24% and 1.16% respectively.To sum up, integrating LBO eigenfunctions enables NORM with superior performance bound and enhances the convergence capability.

Discussion
In this research, we propose a deep learning framework with a new concept called the Neural Operator on Riemannian Manifolds (NORM), to learn mappings between functions defined on complex geometries, which are common challenges in science discovery and engineering applications.Unlike existing neural operator methods (such as FNO, UNO, WNO) that rely on regular geometric domains of Euclidean structure, NORM is able to learn mappings between input and output functions defined on any Riemannian manifolds via LBO subspace approximation.Furthermore, the optimality of LBO eigenfunctions allows NORM to capture the global feature of complex geometries with only a limited number of modes, rather than directly learning the operator in the high-dimensional coordinate space.The ability of LBO eigenfunctions to approximate functions on Riemannian manifolds also guarantees the universal approximation property of NORM.
NORM generalises the neural operator from Euclidean spaces to Riemannian manifolds, which has a wide range of potential applications, including PDEs solving, aerodynamics optimisation and other complex modelling scenarios.The case studies in parametric PDEs solving problems and engineering applications demonstrated that NORM can learn operators accurately and outperform the baseline methods.The discretisation-independent ability enables NORM with greater performance advantages compared with the coordinate spaces based model (such as DeepONet [22]) when learning more complex operators (Blood flow dynamics case).The architecture of NORM draws inspiration from the iterative kernel integration structure employed in FNO [23].Notably, since the Fourier basis is also a group of the LBO eigenfunction, NORM can be treated as a generalisation of FNO from the Euclidean space to Riemannian manifolds.In addition, NORM can deal with different input/output manifolds, including Euclidean space or complex geometries, and thus has broader application potential compared with GNN or FNO, which requires the input and output to be the same domains.
Although NORM shows promising performance in learning operators on complex geometries, the integration of LBO eigenfunctions also restricts the geometries to be Riemannian manifolds, which means that NORM could not deal with non-Riemannian geometries or even non-manifold geometries.For non-Riemannian geometries such as 3D point clouds, one feasible solution could be manually constructing the Riemannian metric g from the point cloud and then calculating LBO eigenfunctions like described in reference [48].Recent researchers have also started to develop Laplacian for non-manifold triangle meshes, which could be a potential solution for operator learning in non-manifold geometries [49].
Our method offers a new perspective for learning operators and solving PDEs on manifolds.Furthermore, the Laplacian-based approximation block in our method has strong extension potential to other neural operator structures or even physics-informed machine learning methods.For instance, the approximation block N could replace the branch net of DeepONet, making the new framework discretisation independent in both input and output functions.When solving PDEs with known equations, integrating the approximation block N into the physics-informed neural network could reduce the parameterisation complexity in coordinate spaces.In addition, the advantages of LBO eigenfunctions could be further discovered for more operator learning settings.the XPLORER PRIZE.Extended Figure 3 3.00

Supplementary Materials for Learning Neural Operators on Riemannian Manifolds
Gengxiang Chen a,1 , Xu Liu

S1.1. Discretised version of N (v)
We first start from a common situation, assuming that the input and output functions are defined on the same manifold M, i.e.X = Y = M. Suppose the manifold M is discretised into n x nodes.For the approximation block N (v l ) = D • R • l ) in the NORM hidden layer, the input function v l (x) can be represented discretely as V l ∈ R nx×dv .The LBO eigenfunctions ϕ M,i is then discretised into a vector form ϕ M,i ∈ R nx×1 .Suppose we consider d m modes for the spectral decomposition and reconstruction, then all d m LBO eigenfunctions form a matrix Φ ∈ R nx×dm .The complex geometric information of the domain has been embedded in the LBO eigenfunctions.
First, the encoder of the discretised input matrix V l can be expressed as: where Φ † ∈ R dm×nx refers to the pseudo inverse of the LBO eigenfunctions matrix Φ, defined as: Denote R as a simple linear mapping R ∈ R dm×dv×dv , the mappings on the encoded information can then be represented as: where the tensor operation is defined as: The decoder process is simply the linear transformation with the LBO eigenfunctions matrix.Then the discretised version of N (v) can be given as:

S1.2. Model structure with different input and output manifolds
NORM can deal with different input and output manifolds by defining different L-layers.Different manifolds mean that the input and output have different LBO eigenfunctions, which will influence the encoder E and decoder D of the approximation block N in each L-layer.Fig. S1a shows the common situation where the input and output functions are both defined on the same manifolds M, which means all E and D in all L-layers take the same LBO eigenfunctions (marked as M → M for illustration).
As shown in Fig. S1b, when the input and output functions are defined on different manifolds, there will be three different types of L-layers.The beginning L-layers (X → X ) employ LBO eigenfunctions of manifold X for both the encoder E and decoder D. The middle L-layer should utilise the LBO eigenfunctions of manifold X for the encoder E, and the LBO eigenfunctions of manifold Y for the decoder D. Therefore, the output of the middle L-layer (X → Y) will have the same domain discretisation dimension of Y.After that, the feature can be passed to the ending L-layers (Y → Y) with the decoder and encoder defined with LBO eigenfunctions of manifold Y.
Since the Fourier basis can also be regarded as a group of the LBO eigenfunction, NORM can be treated as a generalisation of FNO from the Euclidean space to Riemannian manifolds.Therefore, NORM can also deal with temporal functions as input or output.Fig. S1c shows the problem with input temporal function and output spatial function.the beginning multiple L-layers can define the decoder and encoder with the Fourier basis to process the input temporal function (F → F).The middle L-layer can use the Fourier basis for the encoder E, and take the LBO eigenfunctions of manifold Y for the decoder D (marked as F → Y).

S2.1. Discretised LBO of triangular mesh
The Laplace-Beltrami Operators (LBOs) of different geometric meshes are strictly defined in the differential geometry field [1], including triangular mesh, quadrilateral mesh, tetrahedral mesh, etc.Take the example of the triangular mesh, as shown in Fig. S2 and Eq. 6, the discrete Laplacian of a scalar function f on a vertex i defined by the cotangent function of the adjacent nodes, where M(i) is the vertex i on the geometric mesh.The Laplacian of triangular mesh is also called the cotangent Laplace operator, which can be derived in many different ways, including finite analysis, finite volume method, or discrete exterior calculus [2].
After defining the LBO of complex geometries, the LBO eigenfunctions can be obtained by solving the eigenfunctions ∆ϕ = λϕ with Galerkin method, power iteration, or other numerical methods [3,4].

S2.2. Projection error of LBO eigenfunctions
Theorem (Projection error of LBO eigenfunctions).[5] Let M be a given Riemannian manifold, an induced LBO ∆, with associated spectral basis ϕ i , where , the projection error is: and Proof of Theorem S2.2.[5] : Define the projection residual function as it is easy to verify that ⟨r n , ϕ i ⟩ = 0, ∀i, 1 ≤ i ≤ n.The projection error can be given as: The gradient bound of residual function is: Since the eigenvalues are ascending, λ 1 ≤ λ 2 , . .., we have: So that: Then, we build the connection between ∥∇r n ∥ 2 L 2 and ∥∇f ∥ 2 L 2 .We have: Then, it follows that: By assumption, f ∈ L 2 .Therefore, ∥∇f ∥ 2 L 2 is bounded.Theorem 2.6 in Ref [6] shows that lim n→∞ λ n = ∞, so it can be verified that λ n+1 → ∞ forces ∥r n ∥ L 2 → 0. [7] Given a Riemannian manifold M, the induced LBO ∆, and its spectral basis ϕ i , where ∆ϕ i = λ i ϕ i , and a real scalar value 0 ≤ α < 1.For any f ∈ L 2 (M, R), there is no orthonormal basis of functions {ψ i } ∞ i=1 , and an integer n such that

S3.1. Non-Euclidean universal approximation condition
Recently, Kratsios et al. [8] investigated which modifications to the input and output of a neural network could deal with non-Euclidean while preserving the universal approximation capability.Based on their research, the Non-Euclidean Universal Approximation Condition can be summarised as follows: Theorem (Non-Euclidean Universal Approximation Condition [8]).Let ϕ : X → R m and ρ : R → Y, where X and Y are topological spaces.Equip C(X , Y) with the bounded compact topology, C (R m , R n ) with the topology of uniform convergence on compacts, let F be a subset of Here, F ρ,ϕ is dense in C(X , Y) means that given any ϵ > 0 and g C ∈ C(X , Y), there exists For the defined neural operator on Riemannian manifolds N = D • R • E, suppose the approximator R is a neural network that holds universal approximation property.Then, based on Theorem S3.1, the key step of establishing an N with universal approximation property is to construct a continuous injective map E from functions on manifolds to the Euclidean space and a continuous surjective map D from the Euclidean space to functions on manifolds.

S3.2. Proof of universal approximation of NORM
Let N = D • R • E be a neural operator for C(A, U), where R represent a neural network that has universal approximation property in C R d X , R d Y .The encoder is defined as: The decoder is defined as D : R d Y → U, and X and Y are Riemannian manifolds.A and U are L 2 spaces.ϕ X ,i and ϕ Y,i are LBO eigenfunctions of manifolds X and Y, respectively.The universal approximation theorem of neural operators on Riemannian manifolds is as follows: Theorem (Universal approximation theorem for the neural operator on Riemannian manifolds).Let G : A(X ; R) → U(Y; R) be a Lipschitz continuous operator, K ∈ A is compact set.Then for any > 0, there exists a neural operator on N = D • R • E, such that: Proof of Theorem S3.2.: Since N is defined with a finite number of LBO eigenfunctions, the encoder E is not injective, and the decoder D is not surjective, so we cannot derive the universal approximation property directly based on Theorem S3.1.
The following proof consists of three steps: first, the universal approximation error on the projection subspace, then the decoding error on Y, and the of the encoding error on X .
Step 1: Approximation error on the projection subspace For the input function space A, let V X ,d X ⊂ A be the d X -dimensional projection space of the LBO eigenfunctions of the manifold X , namely Therefore, the orthogonal projection of the input function a can be represented as Π V X ,d X a: Similarly, for the output function space Then the orthogonal projection of the output function u on V Y,d Y can be defined as: Let N † = D † • R • E † be a neural operator on the projection space C(V X ,d X , V Y,d Y ), where R represent a neural network that has universal approximation property in C R d X , R d Y .The encoder can be defined as the following mapping: And the decoder can be given as: Then D † and E † follow the assumption in Theorem S3.1, that E † is a continuous injective map and D † is a continuous surjective map.Based on Theorem S3.1, N † is an universal approximator in C(V X ,d X , V Y,d Y ).Suppose K is compact set in A, then for any ϵ > 0, there exists a N † and a Note that, we have E , ∀a ∈ A. Hence, we have: Step 2: Decoding error on the output Theorem S2.2 shows that the projection error of LBO eigencfunctions convergence to 0 when with a sufficient number of basis.Therefore, for any ϵ > 0, there exists a number d Y ∈ N, such that: Step 3: Encoding error on the input Here we assume G is Lipschitz continuous, that is, there exists a constant M > 0 that: Since Π V X ,d X a can approximate a at any accuracy, then for any ϵ > 0, there exists d X ∈ N, such that: Step 4: Combining the errors from steps 1 to 3 Therefore, triangle inequality implies that: This concludes the proof.
According to the proof procedure above, one fundamental characteristic of NORM that supports its universal approximation property is the ability of LBO eigenfunctions to approximate continuous functions of Riemannian manifolds with arbitrary accuracy.Therefore, this proof procedure can be generalised to other potential extensions of NORM that utilise different orthogonal basis functions rather than LBO, as long as they can also approximate functions on Riemannian manifolds.

S4.1. Learning PDEs solution operators S4.1.1. Darcy problem (Case 1)
Darcy flow equation is a classical law for describing the flow of a fluid through a porous medium.This problem is also widely used for various neural operator verification.We focus on the darcy equation on 2D irregular geometric domain, which can be described by the following equation: where a is the diffusion coefficient field, u is the pressure field and f is the source term to be specified.The learning target in the Darcy flow problem is the mapping from the diffusion coefficient field a(x) to the pressure field u(x): For this case, the source term is set to 1, i.e. f = 1.The input diffusion coefficient field a(x) is generated by the Gaussian random field with a piecewise function, namely a(x) = t(µ), where µ is a distribution defined by µ = N 0, (−∆ + 25I) −2 [9].After sampling from this distribution, the diffusion coefficient field a(x) can be generated from the following piecewise function: We design an irregular geometric domain with a thin rectangle notch inside, which can increase the complexity of the learning problem.As shown in Fig. S3a, the geometric domain of the Darcy case is divided by the triangle mesh with 2290 nodes, where the outside boundary condition follows u = u ∂D (x) and the three boundaries of the inside rectangle follows u = 0.The boundary condition u ∂D (x) is shown in Fig. S3b, and Fig. S3c-d show the input field and output field of one labelled data.In this case, 1200 labelled data are randomly generated, 1000 of them are used as the training data, and the rest 200 groups are defined as the test data.

S4.1.2. Pipe turbulence (Case 2)
Flow in a pipe is very common in physiological systems, here we consider turbulent flows in a complex pipe, where the governing equation is the 2-d Navier-Stokes equation for a viscous, incompressible fluid: where v is the velocity, p is the pressure, and the fluid chosen is water.We use the k − ε model of Reynolds Average Navier-Stockes (RANS) models in the Comsol Multiphysics to conduct simulations.The geometry of the pipe is shown in Fig. S4a.The average normal velocity v = [1,5] is imposed at the inlet, zero pressure condition is imposed at the outlet, and the no-slip boundary condition is imposed at the pipe surface.For given inlet velocity, we perform a 1 s transient simulation to predict the velocity distribution in the pipe.The learning problem of this case is defined as the mapping from the velocity field of t ∈ [0.1s, 0.9s] to the velocity field of t + 0.1s, as shown in Fig. S4b-c.The input and output mesh both comprise 2673 nodes.Finally, we generate 400 trajectories, including 80 sets of transient simulations of different inlet velocities and 5 input-output pairs for each simulation.300 of them are used as training data, and the rest of the 100 groups are defined as test data.

D*HRPHWU\DQGPHVK E,QSXWILHOG F2XWSXWILHOG
Figure S4: The geometric domain and input-output pair of the pipe turbulence case.

S4.1.3. Heat transfer (Case 3)
Heat transfer describes the transfer of energy as a result of a temperature difference, which widely exists in nature and engineering technology fields.A solid heat transfer case for a threedimensional complex part is constructed to verify the ability of the method to handle the prediction problem with complex geometric domains.The heat equation can be represented in the following form (assuming no mass transfer or radiation).
where T is temperature as a function of time and space.ρ, C, and K are the density, specific heat capacity, and thermal conductivity of the medium, respectively.And Q is the internal heat source.For this case, the part material is copper with a residual resistivity ratio of 30, and Q is set to 0. The three-dimensional design model of the solid part is shown in Fig. S5.
The boundary conditions are imposed on the left and right sides.Specifically, on the left side (z = 0) is the low-temperature zone with the boundary condition T L (x, y, z = 0).On the right side (z = 50) is the high-temperature zone with the boundary condition T H (x, y, z = 50).The heat transfer problem is solved by the commercial simulation software Comsol, and the two boundary conditions are set as follows.
where θ is the center angle corresponding to the mesh node, T 1 , T 2 , T 3 , T 4 , T 5 , and T 6 are six temperature parameters which are randomly sampled from 290K − 350K.
The learning problem of this case is defined as the mapping from the low-temperature boundary condition T L (x, y, z = 0) to the solid part's 3-dimensional temperature field T t=3 (x, y, z) after 3s of heat transfer.The mesh for the input and output field is shown in Fig. S6a and Fig. S6c.The input domain is discretised by the triangular mesh with 186 nodes, and the output domain is discretised by the tetrahedral mesh including 7199 nodes.Furthermore, the input temperature field and output temperature field are shown in Fig. S6b and Fig. S6d.Note that the space domains for input and output are different in this case.The training data set consists of 100 labelled data, and another 100 samples are defined as test data.

S4.2. Composite workpiece deformation prediction (Case 4) S4.2.1. Background
Carbon Fiber Reinforced Polymer (CFRP) composite materials, which are lightweight and highstrength, are preferred materials for weight reduction and performance enhancement in modern aerospace industries [10].CFRP parts used in aerospace have large size and complex shapes, therefore imposing higher requirements on deformation control during the manufacturing process.As one of the key processes of composites manufacturing, curing refers to using high temperatures to stimulate the chemical reactions and physical changes of the resin, thereby forming CFRP parts with load-bearing properties.Non-uniform residual stresses generated during the curing process can cause curing deformations such as spring-back, warpage, and bending-twisting combination, which not only risks the CFRP parts being scrapped but also becomes an important reason for damages and failures during subsequent assemblies [11].
Regulating the curing temperature distribution of a part is an effective means of controlling curing deformation.However, optimising the curing temperature field usually requires a large number of iterations based on the prediction results of the curing deformation field.Therefore, establishing a fast prediction model from the curing temperature field to the deformation field is of great significance for optimising and designing the temperature field of CFRP parts [12].Numerical simulation methods, such as the finite element method, have become the most widely used curing process modelling methods.However, high-fidelity curing deformation simulation requires accurate modelling of complex physicochemical processes and fine meshing of the part calculation domain, resulting in highly expensive and time-consuming calculations.Therefore, the computational efficiency of the traditional numerical modelling methods is insufficient to meet the requirements for the temperature field optimisation of the CFRP parts.Establishing a data-driven temperature-to-deformation prediction model can provide essential support for further curing process optimising.
The CFRP workpiece used for verification is the air-intake structural part of a jet.As shown in Fig. S7, this workpiece is a complex closed revolving structure formed by multiple curved surfaces, which would deform significantly after high-temperature curing.The curing process is zoned selfresistance electric heating, where the internal and external surfaces of the workpiece are divided into multiple areas according to the radius of curvature for independent temperature control.The theoretical support of this case can be found in the authors' previous work [13].

S4.2.2. Data generation
The internal and external surfaces of the composite part are divided into 20 separate curing zones, with the temperature of each zone generated randomly between the 370K ∼ 400K.The temperature fields and the deformation fields of the composite part were simulated by considering heat transfer, curing reactions, viscoelastic mechanics and other processes.As shown in Fig. S8a, the part geometry is represented by a tetragonal mesh constructed in the commercial simulation software Comsol Multiphysics, comprising a total of 8232 nodes.A total of 500 data pairs of temperature-to-deformation fields were simulated, 400 of them are defined as training data and the rest 100 as test data.The examples of the input temperature field and output deformation field are shown in Fig. S8b and Fig. S8c.

S4.3. Blood flow dynamics prediction (Case 5) S4.3.1. Background
Blood flow dynamics is the science of studying the characteristics and regularities of the movement of blood and its constituents in the organism [14].The driving force of hemodynamic research consists of the following three aspects: 1) hemodynamic research can assist researchers in studying the laws of blood flow in the vascular system of a healthy human body [15]; 2) hemodynamic research can help analyse the causes and effects of cardiovascular diseases [16,17]; and 3) hemodynamic research can promote the development and optimisation of diagnosis and treatment techniques for vascular diseases from a therapeutic point of view [14].In recent years, the development of measurement techniques has made it possible to reconstruct patient-specific vascular structures by CT imaging [18] and 4D MRI [19].Furthermore, computational fluid dynamics (CFD) modelling has been used to simulate blood flow by numerically solving the Navier-Stokes equations, showing promising potential in clinical practice [20,21].On the one hand, CFD allows the non-invasive acquisition of haemodynamic parameters that in vitro measurements cannot measure.On the other hand, CFD can provide visualisation of the flow field results to investigate the effect of specific structures on haemodynamics.Despite the excellent predictive performance of CFD modelling, its high computational cost and the long processing time have prevented it from clinical practice in time-sensitive areas such as preoperative planning and serial monitoring [22].To address the above limitations, we aim to explore the possibilities of data-driven neural operator models for the surrogate modelling of haemodynamic CFD.
This case focuses on the hemodynamics of the human thoracic aorta, the largest human artery responsible for transporting oxygen and nutrient-rich blood to various organs.We consider a similar monitoring scenario in the paper [23], where the inputs are the time-varying blood flow metrics monitored in real-time such as flow rate, blood flow, pressure, etc., at the inlet and outlet [24], and the outputs are the velocity field of the aorta.The field outputs can provide more information reflecting the state and evolution of the patient's disease than the individual metrics.

S4.3.2. Data generation
The aorta contains 1 inlet, i.e. the ascendens aortae, and 5 outlets, i.e. the descendens aortae, the left/right subclavian arteries, and the left/right common caroti arteries, as shown in Fig. S9.The velocity boundary condition is imposed at the inlet, and pressure boundary conditions are imposed at the outlets, which describes the time-varying characteristic of velocity and pressure during one cardiac cycle (1.2s).To simulate the gradual increase of velocity and pressure to a peak during the systolic phase and a fall back from the peak during the diastolic phase, we approximated the changes of velocity and pressure using a simplified Gaussian function, as shown in Fig. S10.Then a set of boundary conditions can be determined by setting the mean, bandwidth, and peak values.
Blood is assumed to be a homogeneous Newtonian fluid with a density of 1060 kg/m 3 and a viscosity of 0.0035 N • s/m 2 , and the flow of blood in the aorta is laminar flow.The vessel wall was assumed to be rigid, and no-slip conditions were considered.A total of 500 velocity/pressure curves are generated as boundary conditions, which are treated as inputs.Velocity fields are simulated based on Comsol Multiphysics software, which serves as outputs.The simulation results are derived using tetrahedral mesh (1656 spatial nodes), with a sampling interval of 0.01s in the time dimension,

S5. Experimental setting
We compare the performance of the proposed NORM on the experimental cases with the existing representative operator learning methods FNO, DeepONet, POD-DeepONet, and a popular GNN architecture GraphSAGE.A brief introduction to each method is given below to help the reader understand the architecture and parameter setting of each model.
NORM can learn the mapping between functions on Riemannian manifolds.The proposed NORM consists of multiple encoder-approximator-decoder blocks, in which the encoder and the decoder are constructed as the spectral decomposition and the spectral reconstruction on the eigenfunctions of the Laplace-Beltrami operator (LBO).The LBO eigenfunctions of the geometric domain and its pseudo-inverse can be pre-calculated before training.Note that Darcy problem, Pipe turbulence and Composites cases adopt the structure of Fig. S1a.Heat transfer case adopts the structure of Fig. S1b.Blood flow case follows the Fig. S1c, where the encoder and decoder of several ending L-layers employ both LBO eigenfunctions of Y and Fourier basis to process the output spatiotemporal functions.
FNO [25] parameterises the integral operators in the Fourier domain, and then the highdimensional operator mapping can be transferred to the low-dimensional discretisation-invariant parameterisation of the few frequency modes.Since the original FNO cannot deal with the irregular geometric domain, the mesh interpolation solution from the paper [26] is adopted to construct a regular mesh for the FNO.And the final prediction error is calculated on the original irregular grid by the second interpolation from the regular grid to the irregular grid.Considering the prohibitive computational burden of the spatial mesh interpolation of 3D parts, FNO is only implemented in two cases: Darcy flow and pipe turbulence.And the interpolation resolution ratios are 101*101 and 32*128 for the Darcy flow and pipe turbulence cases, respectively.
DeepONet [27] is a neural operator framework based on the universal approximation theorem.The branch net of DeepONet encodes the input function, and another trunk net encodes the grid coordinates to be queried for the output function.The combination of the two networks enables the function output that can provide the prediction result of any point in the domain.
POD-DeepONet [26] is the latest variant of DeepONet, in which Proper Orthogonal Decomposition (POD) is performed on the training data to compute the bases for output data.The POD bases are used as the trunk net (The POD basis can be precomputed before training, no training required.),and the branch network can directly learn the weights of POD bases.
GraphSAGE [28] is a popular GNN architecture that uses SAGE convolutions, which is an inductive learning framework that can utilize the attribute information of the vertex to effectively generate the unknown vertex embedding.
Due to the different learning problems of each case, it is difficult to adopt an exact same set for each methods.The detailed architecture and parameters setting of each method for different cases are summarised in Table S1, in which d m and d t mean the number of LBO/POD basis and Fourier basis, respectively.d v denotes the channel number after the mapping P.

FunctionFigure 1 :
Figure 1: The illustration of Neural Operator on Riemannian Manifolds (NORM).a, Operators defined on Riemannian manifolds, where the input function and output function can be defined on the same or different Riemannian manifolds.The example for this illustration is the operator learning problem of the composite curing case, where the input temperature function and the output deformation function are both defined on the same manifold, the composite part.b, The framework of NORM, consists of two feature mapping layers (P and Q) and multiple L-layers.c, The structure of L-layer, consists of the encoder-approximator-decoder block, the linear transformation, and the non-linear activation function.d, Laplace-Beltrami Operator (LBO) eigenfunctions for the geometric domain (the composite part).

2 𝑡 2 Figure 2 :
Figure 2: Illustration of three toy case studies.a, Darcy problem (Case 1): the operator learning problem is the mapping from the diffusion coefficient field to the pressure field.b, Pipe turbulence (Case 2): the operator learning problem is the mapping from the current velocity field to the future velocity field.c, Heat transfer (Case 3): the operator learning problem is the mapping from the 2D boundary condition to the 3D temperature field of the part.

Figure 3 :
Figure 3: Composite workpiece deformation prediction case (Case 4).a, Illustration of the air-intake workpiece and the composite curing.b, The input and output of the operator learning problem, the predicted deformation of NORM, and the prediction error of comparison methods.c, The distribution of deformation prediction error over all nodes of all test samples.d, The maximum prediction errors of all test cases for the three methods.

Figure 4 :
Figure4: Blood flow dynamics prediction case (Case 5).a, Illustration of the human thoracic aorta, the largest human artery.b, Illustration of the operator mapping G.The inputs are the velocity at the inlet and the pressure at the outlets.The output is the velocity field of the blood flow.c, Visualisation of the velocity streamlines (snapshots at a representative time) against baseline methods.d, Comparison of node velocity evolution prediction over time.We project the 3D vector onto the xy-plane.e, Comparison between ground truth and predictions for the magnitude of the velocity vector.We randomly sample 5000 spatiotemporal nodes from all test samples.

Figure 5 :
Figure 5: Analysis results for different methods in the Darcy case and the composite case.a, Comparison of POD-DeepONet, POD-NORM, and NORM for various numbers of modes in different sizes of the training dataset.b, The coefficient analysis of the spectral decomposition of both the input temperature field and the output deformation field for the composite case using LBO and POD modes.c, Comparison of DeepONet, POD-DeepONet, POD-NORM, and NORM for different size of training data while the number of LBO/POD modes is 128.

Extended Figure 1 :Extended Figure 2 :
Experimental results of the Darcy problem.a, The mesh for the irregular geometric domain.b, c, The input and output fields for a representative sample.d-g, The prediction results of different methods.h-k, The prediction errors of different methods.Experimental results of the pipe turbulence.a, The mesh for the complex pipe shape.b, c, The input and output fields for a representative sample.d-g, The prediction results of different methods.h-k, The prediction errors of different methods.

Extended Figure 3 :
Experimental results of the heat transfer.a, The mesh for the input geometric domain.b, c, The input and output fields for a representative sample.d-f, The prediction results of different methods.g-i, The prediction errors of different methods.

Figure S1 :
Figure S1: Model structure with different input and output manifolds

Figure S2 :
Figure S2: Cotangent Laplace operator of the triangular mesh.

Figure S3 :
Figure S3: The mesh, boundary conditions, input field and output field of the Darcy flow case 1.

Figure S5 :
Figure S5: The 3-dimensional model for the heat transfer case.

Figure S6 :
Figure S6: The mesh and field for input and output data of the heat transfer case.

Figure S7 :
Figure S7: The CFRP part for case study.

Figure S8 :
Figure S8: The mesh and field for input and output data of the composite case.

FFigure S9 :
Figure S9: The diagram of inlet and outlets of the aorta.

Figure S10 :
Figure S10: The velocity and pressure curves.

Table 1 :
Performance comparison for the five case studies.