Merge pull request #21 from aai-institute/feature/observation

Feature/observation
aai-institute · Jan 23, 2024 · 19a28f2
2 parents e7826d8 + fe010db
commit 19a28f2

Showing 11 changed files with 198 additions and 292 deletions.
diff --git a/docs/operators/index.md b/docs/operators/index.md
@@ -7,16 +7,23 @@ alias:
 
 # Introduction
 
+Function operators are ubiquitous in mathematics and physics: They are used to
+describe dynamics of physical systems, such as the Navier-Stokes equations in
+fluid dynamics. As solutions of these systems are functions, it is natural to
+transfer the concept of function mapping into machine learning.
+
 ## Operators
 
 In mathematics, _operators_ are function mappings – they map functions to functions.
 
-Let $u: \mathbb{R}^d \to \mathbb{R}^c$ be a function that maps a
-$d$-dimensional input to $c$ *channels*. Then, an **operator**
+Let $u: X \subset \mathbb{R}^d \to \mathbb{R}^c$ be a function that maps a
+$d$-dimensional input to $c$ output *channels*.
+
+An **operator**
 $$
 G: u \to v
 $$
-maps $u$ to a function $v: \mathbb{R}^{d'} \to \mathbb{R}^{c'}$.
+maps $u$ to a function $v: Y \subset \mathbb{R}^{p} \to \mathbb{R}^{q}$.
 
 !!! example annotate
     The operator $G: u \to \partial_x u$ maps functions $u$ to their
@@ -27,44 +34,55 @@ maps $u$ to a function $v: \mathbb{R}^{d'} \to \mathbb{R}^{c'}$.
 Learning operators is the task of learning the mapping $G$ from data.
 In the context of neural networks, we want to learn a neural network $G_\theta$
 with parameters $\theta$ that, given a set of input-output pairs $(u_k, v_k)$,
-maps $u_k$ to $v_k$.
-
-As neural networks take vectors as input, we need to vectorize the input
-function $u$ somehow. There are two possibilities:
-
-1. We represent the function $u$ within a finite-dimensional function space
-  (e.g. the space of polynomials) and map the coefficients, or
-2. We map evaluations of the function at a finite set of evaluation points.
-
-In **Continuity**, we use the second, more general approach of mapping function
-evaluations, and use this also for the representation of the output function $v$.
-
-In the input domain, we evaluate the function $u$ at a set of points $x_i$ and
-collect a set of *sensors* $(x_i, u(x_i))$ in an *observation*
+maps $u_k$ to $v_k$. We refer to such a neural network as a **neural operator**.
+
+In **Continuity**, we use the general approach of mapping function
+evaluations to represent both input and output functions $u$ and $v$.
+
+!!! note annotate
+    As neural networks take vectors as input, we need to vectorize the
+    functions $u$ and $v$ in some sense. We could represent the functions within
+    finite-dimensional function spaces (e.g., the space of $n$-th order
+    polynomials) and map the coefficients. However, a more general approach is
+    to map evaluations of the functions at a finite set of evaluation points.
+    This was proposed in the original DeepONet paper and is also used in other
+    neural operator architectures.
+
+Let $x_i \in X,\ 1 \leq i \leq n,$ be a finite set of *collocation points*
+(or *sensor positions*) in the domain $X$ of $u$.
+We represent the function $u$ by its evaluations at these collocation
+points and write $\mathbf{x} = (x_i)_i$ and $\mathbf{u} = (u(x_i))_i$.
+This finite-dimensional representation is fed into the neural operator.
+
+The mapped function $v = G(u)$ is likewise represented by function
+evaluations only. Let $y_j \in Y,\ 1 \leq j \leq m,$ be a set of
+*evaluation points* (or *query points*) in the domain $Y$ of $v$ and
+$\mathbf{y} = (y_j)_j$.
+Then, the output values $\mathbf{v} = (v(y_j))_j$ are approximated by the neural
+operator
 $$
-\mathcal{O} = \\{ (x_i, u(x_i)) \mid i = 1, \dots N \\}.
+v(\mathbf{y}) = G(u)(\mathbf{y})
+\approx G_\theta(\mathbf{x}, \mathbf{u}, \mathbf{y}) = \mathbf{v}.
 $$
 
-The mapped function can then be evaluated at query points $\mathbf{y}$ to obtain the output
-$$
-v(\mathbf{y}) = G(u)(\mathbf{y}) \approx G_\theta(\mathbf{x}, \mathbf{u}; \mathbf{y}) = \mathbf{v}
-$$
-where $\mathbf{x} = (x_i)_i$ and $\mathbf{y} = (y_j)_j$ are the evaluation points
-of the input and output domain, respectively, and $\mathbf{u} = (u_i)_i$ is the
-vector of function evaluations at $\mathbf{x}$.
-The output $\mathbf{v} = (v_j)_j$ is the vector of function evaluations at $\mathbf{y}$.
-
-
-In Python, this call can be written like
+In Python, we write the operator call as
 ```
 v = operator(x, u, y)
 ```
+with tensors `x`, `u`, `y`, `v` of shape `[b, n, d]`, `[b, n, c]`, `[b, m, p]`,
+and `[b, m, q]`, respectively, and a batch size `b`.
+This signature covers the most general case for implementing operators, as
+neural operators differ in the way they handle input and output values.
+
+For convenience, the call can be wrapped to mimic the mathematical syntax.
+For instance, for a fixed set of collocation points `x`, we could define
+```
+G = lambda u: lambda y: operator(x, u, y)
+v = G(u)(y)
+```
 
-## Applications to PDEs
+Operators extend the concept of neural networks to function mappings, which
+enables discretization-invariant and mesh-free mappings of data with
+applications to physics-informed training, super-resolution, and more.
 
-Operators are ubiquitous in mathematics and physics. They are used to describe
-the dynamics of physical systems, such as the Navier-Stokes equations in fluid
-dynamics. As solutions of PDEs are functions, it is natural to use the concept
-of neural operators to learn solution operators of PDEs. One possibility to do
-this is using an inductive bias, or _physics-informed_ training.
-See our examples in [[operators]] for more details.
+See our examples in [[operators]] for more details and further reading.
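
To make the `operator(x, u, y)` call convention from the documentation above concrete, here is a minimal sketch of the derivative operator $G: u \to \partial_x u$ acting on function evaluations. It is not part of this PR or of Continuity: `derivative_operator` is a hypothetical name, the operator is hand-written rather than learned, and for brevity it is unbatched with $d = c = p = q = 1$ and plain Python lists instead of tensors.

```python
from bisect import bisect_right


def derivative_operator(x, u, y):
    """Approximate v = du/dx at query points y from samples (x, u).

    Hypothetical illustration only (not a Continuity API).
    x: sorted collocation points, list of n >= 2 floats
    u: function values u(x_i), list of n floats
    y: query points, list of m floats
    """
    v = []
    for y_j in y:
        # Locate the grid interval [x_i, x_{i+1}] containing y_j,
        # clamped so that points outside the grid use the nearest interval.
        i = min(max(bisect_right(x, y_j) - 1, 0), len(x) - 2)
        # Slope of the piecewise-linear interpolant on that interval.
        v.append((u[i + 1] - u[i]) / (x[i + 1] - x[i]))
    return v


# Collocation points on [0, 1] and samples of u(x) = x^2.
n = 101
x = [i / (n - 1) for i in range(n)]
u = [x_i**2 for x_i in x]

# Query points may differ from the collocation points.
y = [0.25, 0.5, 0.75]

v = derivative_operator(x, u, y)  # approximately 2 * y_j for each query point

# Wrapper for fixed x: it takes u first and y second, so that
# G(u)(y) reads like the mathematical notation G(u)(y).
G = lambda u: lambda y: derivative_operator(x, u, y)
assert G(u)(y) == v
```

The point of the sketch is only the interface: the input function enters through `(x, u)`, the output function is queried at `y`, and the number of query points `m` is independent of the number of collocation points `n`.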
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -93,7 +93,6 @@ theme:
     - content.code.annotate
     - content.code.copy
     - navigation.footer
-    - navigation.instant
     - navigation.path
     - navigation.top
     - navigation.tracking

diff --git a/notebooks/selfsupervised.ipynb b/notebooks/selfsupervised.ipynb
diff --git a/src/continuity/data/__init__.py b/src/continuity/data/__init__.py
@@ -1,14 +1,12 @@
 """
-In Continuity, data is given by *observations*. Every observation is a set of
-function evaluations, so-called *sensors*. Every data set is a set of
-observations, evaluation coordinates and labels.
+This defines DataSets in Continuity.
+Every data set is a list of (x, u, y, v) tuples.
 """
 
 import math
 import torch
 from torch import Tensor
-from numpy import ndarray
-from typing import List, Tuple
+from typing import Tuple
 
 
 def get_device() -> torch.device:
@@ -36,79 +34,6 @@ def tensor(x):
     return torch.tensor(x, device=device, dtype=torch.float32)
 
 
-class Sensor:
-    """
-    A sensor is a function evaluation.
-
-    Args:
-        x: spatial coordinate of shape (coordinate_dim)
-        u: function value of shape (num_channels)
-    """
-
-    def __init__(self, x: ndarray, u: ndarray):
-        self.x = x
-        self.u = u
-
-        self.coordinate_dim = x.shape[0]
-        self.num_channels = u.shape[0]
-
-    def __str__(self) -> str:
-        return f"Sensor(x={self.x}, u={self.u})"
-
-
-class Observation:
-    """
-    An observation is a set of sensors.
-
-    Args:
-        sensors: List of sensors. Used to derive 'num_sensors', 'coordinate_dim' and 'num_channels'.
-    """
-
-    def __init__(self, sensors: List[Sensor]):
-        self.sensors = sensors
-
-        self.num_sensors = len(sensors)
-        assert self.num_sensors > 0
-
-        self.coordinate_dim = self.sensors[0].coordinate_dim
-        self.num_channels = self.sensors[0].num_channels
-
-        # Check consistency across sensors
-        for sensor in self.sensors:
-            assert (
-                sensor.coordinate_dim == self.coordinate_dim
-            ), "Inconsistent coordinate dimension."
-            assert (
-                sensor.num_channels == self.num_channels
-            ), "Inconsistent number of channels."
-
-    def __str__(self) -> str:
-        s = "Observation(sensors=\n"
-        for sensor in self.sensors:
-            s += f"  {sensor}, \n"
-        s += ")"
-        return s
-
-    def to_tensors(self) -> Tuple[torch.Tensor, torch.Tensor]:
-        """Convert observation to tensors.
-
-        Returns:
-            Two tensors: The first tensor contains sensor positions of shape (num_sensors, coordinate_dim), the second tensor contains the sensor values of shape (num_sensors, num_channels).
-        """
-        x = torch.zeros((self.num_sensors, self.coordinate_dim))
-        u = torch.zeros((self.num_sensors, self.num_channels))
-
-        for i, sensor in enumerate(self.sensors):
-            x[i] = tensor(sensor.x)
-            u[i] = tensor(sensor.u)
-
-        # Move to device
-        x.to(device)
-        u.to(device)
-
-        return x, u
-
-
 class DataSet:
     """Data set base class.
 
@@ -192,3 +117,62 @@ def to(self, device: torch.device):
         self.u = self.u.to(device)
         self.y = self.y.to(device)
         self.v = self.v.to(device)
+
+
+class SelfSupervisedDataSet(DataSet):
+    """
+    A `SelfSupervisedDataSet` is a data set that exports batches of observations
+    and labels for self-supervised learning.
+    Every data point is created by taking one sensor as label.
+
+    Every batch consists of tuples `(x, u, y, v)`, where `x` contains the sensor
+    positions, `u` the sensor values, and `y = x_i` and `v = u_i` are
+    the label's coordinate and its value for all `i`.
+
+    Args:
+        x: Sensor positions of shape (num_observations, num_sensors, coordinate_dim)
+        u: Sensor values of shape (num_observations, num_sensors, num_channels)
+        batch_size: Batch size.
+        shuffle: Shuffle dataset.
+    """
+
+    def __init__(
+        self,
+        x: Tensor,
+        u: Tensor,
+        batch_size: int,
+        shuffle: bool = True,
+    ):
+        self.num_observations = u.shape[0]
+        self.num_sensors = u.shape[1]
+        self.coordinate_dim = x.shape[-1]
+        self.num_channels = u.shape[-1]
+
+        # Check consistency across observations
+        for i in range(self.num_observations):
+            assert (
+                x[i].shape[-1] == self.coordinate_dim
+            ), "Inconsistent coordinate dimension."
+            assert (
+                u[i].shape[-1] == self.num_channels
+            ), "Inconsistent number of channels."
+
+        xs, us, ys, vs = [], [], [], []
+
+        for i in range(self.num_observations):
+            # Add one data point for every sensor
+            for j in range(self.num_sensors):
+                y = x[i][j].unsqueeze(0)
+                v = u[i][j].unsqueeze(0)
+
+                xs.append(x[i])
+                us.append(u[i])
+                ys.append(y)
+                vs.append(v)
+
+        xs = torch.stack(xs)
+        us = torch.stack(us)
+        ys = torch.stack(ys)
+        vs = torch.stack(vs)
+
+        super().__init__(xs, us, ys, vs, batch_size, shuffle)
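
The expansion performed in `SelfSupervisedDataSet.__init__` can be sketched without tensors to show the resulting layout. The helper below is a hypothetical stand-in (the name `self_supervised_tuples` and the use of nested lists are assumptions for illustration, not code from this PR); it mirrors the double loop above, where wrapping a sensor in a one-element list plays the role of `unsqueeze(0)`.

```python
def self_supervised_tuples(x, u):
    """Expand observations into (x, u, y, v) tuples, one per sensor.

    Hypothetical illustration of the loop in SelfSupervisedDataSet.
    x: nested list [num_observations][num_sensors][coordinate_dim]
    u: nested list [num_observations][num_sensors][num_channels]
    """
    xs, us, ys, vs = [], [], [], []
    for x_obs, u_obs in zip(x, u):
        # Add one data point for every sensor of this observation.
        for x_j, u_j in zip(x_obs, u_obs):
            xs.append(x_obs)  # full set of sensor positions
            us.append(u_obs)  # full set of sensor values
            ys.append([x_j])  # single query point (like unsqueeze(0))
            vs.append([u_j])  # its value, used as the label
    return xs, us, ys, vs


# Two observations with three sensors each, one coordinate and one channel.
x = [[[0.0], [0.5], [1.0]], [[0.0], [0.5], [1.0]]]
u = [[[0.0], [0.25], [1.0]], [[1.0], [1.5], [2.0]]]

xs, us, ys, vs = self_supervised_tuples(x, u)

# num_observations * num_sensors data points, each with one query point.
assert len(xs) == 6
assert len(ys[0]) == 1
```

With tensors, the same layout corresponds to `xs` and `us` of shape `(num_observations * num_sensors, num_sensors, ...)` and `ys` and `vs` of shape `(num_observations * num_sensors, 1, ...)`, so the data set grows linearly with the number of sensors per observation.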