Library design ============== The world in which we live is arguably three-dimensional. If you count time, it's four-dimensional. If you look at the work machine leaning scientists do, they work with data that exists in dozens of dimensions, if not thousands. This presents a fundamental problem for programmers, who build software on computers that are equipped with one-dimensional memory: as far as we have come since the days of `Alan Turing `_, the memories in our computers are still just tapes. The problem thus arises how we map our higher-dimensional data onto our single-dimensional data, and -- perhaps even more importantly -- how we do that efficiently. Principles ---------- In order to be as efficient and maintainable as possible, covfie is based upon a set of core principes which guide its design. Understanding these principles is important for understanding the rest of the library. Compile-time knowledge Modern day compilers are incredible tools: they can issue powerful warnings and prevent us from shooting ourselves in the foot in a variety of different ways. However, the compiler cannot protect us as run-time. In covfie, we try to determine as much as possible about the use case of the vector field at compile time. covfie also allows absolutely no run-time polymorphism. Compositionality In the `immortal words of Brian Beckman `_, composition is *the* way to control complexity in software. Wherever we see complex software, it is our duty as programmers to break it down into smaller, composable pieces, and to find useful abstractions. covfie takes a compositional approach to vector fields which allows us to be as flexible as possible with as little code as possible. Extensibility In the real world, library users almost always have wishes which library designers did not think of. In some libraries, this leads to insurmountable obstacles. We want covfie to be helpful in these cases: the library is designed with extensibility in mind, such that end users can easily use their own layouts, storage types, and interpolation algorithms if they so please. Generality covfie was originally designed for high-energy physics to model magnetic fields in particle accelerator experiments, but we do not want covfie to be limited to just this purpose. Our library is designed to be generally applicable, so you can use if regardless of whether you're working with magnetic fields or ocean currents. .. warning:: What follows is a somewhat needlessly abstract and mathematical description of the problems with representing vector fields and how we solve those problems. Reading this section is not necessary to understand and use the library. Vector fields ------------- Finding a rigorous definition of what a vector field is is surprisingly difficult. In the most general sense, we might take two vector spaces :math:`V` and :math:`W` and define a vector field as a mapping: .. math:: V \to W One interesting question is whether vector fields are necessarily endomorphic, such that the vector space on the left side of the arrow is the same as the vector space on the right. While vector fields of this structure are well-behaved we do not assume that this is generally the case. Thus, a vector field may map from one vector space to another. Vector fields in a general sense are also a little too complex to model: some rather unusual structures can be described a vector fields, and as such we will restrict ourselves to so-called *coordinate spaces*, where the vector spaces on either side of the arrow are tuples over some algebraic field. Thus, if we have two algebraic fields :math:`F` and :math:`G` as well as two whole numbers :math:`n` and :math:`m`, we can describe our vector field as follows: .. math:: F^n \to G^m Finally, we will make one additional modification to this model: the structures over which we define our coordinates need not be fields, they are only required to be commutative semirings, such that we can model the natural numbers :math:`\mathbb{N}_0` and the Booleans :math:`\mathbb{B}`, which are crucial for modeling computer memory as we will see later. Composition ----------- .. _composition: As we have discussed previously, our slightly expanded definition of what a vector field is is given by two commutative semirings :math:`R` and :math:`S` as well as two numbers :math:`n` and :math:`m` such that our vector field is given as follows: .. math:: R^n \to S^m Now, if we consider what our computer memory (we'll write :math:`M`) looks like, it allows us to request a set of bits (of arbitrary length) at a certain location in its one-dimensional tape-like memory. Thus, we could describe it as a mapping of the following form: .. math:: M : \mathbb{N} \to \mathbb{B}^p And therein lies the problem of designing a library for vector fields: the mapping that we want to represent and the mapping that we have in our memory are so different that it is not immediately obvious how to turn one into the other. However, it turns out that the compositional nature of our library will give us a way to model this in an elegant way. The key insight here is that the functional arrow :math:`\to` that forms the basis of our definition of a vector field is a profunctor. In other words, it is a bifunctor that is covariant in its second argument (thus, :math:`A \to` is a covariant functor) and contravariant in its first argument (meaning :math:`\to B` is a contravariant functor). Therefore, we can apply additional mapping functions to both the return value of the field (in a covariant fashion) as well as the argument value (in a contravariant fashion). We will define our vector fields in terms of *layers*, each of which contravariantly mutates the input of the mapping and covariantly mutates the output. These layers, described by the quaternary type constructor :math:`L` and a functor :math:`F`, are pairs of two functions: .. math:: L : \mathrm{Functor}~F \Rightarrow (A' \to F(A), F(B) \to B') In code, this can be described using general algebraic data types, such as in this Haskell snippet: .. code-block:: haskell data Layer i j k l where L :: Functor f => (a' -> f a) -> (f b -> b') -> L a' (f a) (f b) b' It may be worth noting that both the contravariant and covariant parts of a layer can be deconstructed further using :math:`F`-(co)algebras. Indeed, the contravariant part can be constructed from a function :math:`f` of type :math:`A' \to A`, and either an :math:`F`-coalgebra carried by :math:`A` or an :math:`F`-coalgebra carried by :math:`A'`. In such a construction, :math:`f` acts as a homomorphism between the two :math:`F`-coalgebras, and all the usual rules for such a homomorphism apply. The covariant part of a layer can be constructed using the exact dual of this idea, employing :math:`F`-algebras instead of coalgebras. To understand how we can use such a layer to transform an existing mapping (such as our memory :math:`M`), we follow the principle described earlier of applying the first function contravariantly and applying the second one covariantly, such that applying a layer to an existing function has the following type: .. math:: $_L : (A' \to F(A), F(B) \to B') \to (A \to B) \to A' \to B' And the following corresponding implementation: .. math:: \begin{align} (f, g)~$_L~h = g \circ F(h) \circ f \end{align} As a practical example, let's consider the case where we have our computer's memory, :math:`M : \mathbb{N} \to \mathbb{B}^p`, and we have a layer that can convert this to a vector field from three-dimensional reals to three-dimensional reals, which would have the following type (the identity functor :math:`\mathrm{Id}` is not relevant for the time being): .. math:: l : (\mathbb{R}^3 \to \mathrm{Id}(\mathbb{N}), \mathrm{Id}(\mathbb{B}^p) \to \mathbb{R}^3) To layer this over :math:`M` -- giving :math:`l~$_L~M` -- we would receive an object of type :math:`\mathbb{R}^3` as our input. This would be passed into the contravariant part of our layer (which has type :math:`\mathbb{R}^3 \to \mathrm{Id}(\mathbb{N})`) to produce an object of type :math:`\mathrm{Id}(\mathbb{N})`. We then lift the mapping :math:`M` into the identity functor to create :math:`\mathrm{Id}(M)`, converting the old mapping of type :math:`\mathbb{N} \to \mathbb{B}^p` into a new mapping of type :math:`\mathrm{Id}(\mathbb{N}) \to \mathrm{Id}(\mathbb{B}^p)`. Passing the previously produced value of type :math:`\mathrm{Id}(\mathbb{N})` into this mapping gives an object of type :math:`\mathrm{Id}(\mathbb{B}^p)`. Finally, we can apply the covariant part of our layer, which has type :math:`\mathrm{Id}(\mathbb{B}^p) \to \mathbb{R}^3` to that value to produce the desired vector type :math:`\mathbb{R}^3`! In code, the application of layers to functions is given as follows: .. code-block:: haskell (|$|) :: Functor f => Layer a (f b) (f c) d -> (b -> c) -> (a -> d) (|$|) (Layer f g) h = g . (fmap h) . f The magic of such layers is that they can be constructed through the composition of simpler layers. Composition of layers is an operation with the following type: .. math:: :nowrap: \begin{align} \circ_L :~&(A'' \to F(A'), F(B') \to B'')\\ \to~&(A' \to G(A), G(B) \to B')\\ \to~&(A'' \to (F \circ G)(A), (F \circ G)(B) \to B'') \end{align} Such that: .. math:: (f_1, g_1) \circ_L (f_2, g_2) = (F(f_2) \circ f_1, g_1 \circ F(g_2)) This operation is conveniently broken down into a contravariant part and a covariant part. The contravariant part mutates the input, on which the "outer" layer, on the left of the composition operator, operates first. Thus, we first apply the contravariant part of the outer layer (with type :math:`A'' \to F(A')`) to the input, which creates a value in the :math:`F` functor, and we therefore need to lift the contravariant side of the *inner* layer (with type :math:`A' \to G(A)`) into the :math:`F` functor to be able to apply it to the value created by the outer layer. The covariant part of the composition is created in a very similar vein, but in reverse. That process is left as an exercise to the reader. The equivalent code looks like this: .. code-block:: haskell (|.|) :: (Functor f, Functor g) => Layer a1 (f a2) (f b2) b1 -> Layer a2 (g a3) (g b3) b2 -> Layer a1 ((Compose f g) a3) ((Compose f g) b3) b1 (|.|) (Layer f1 g1) (Layer f2 g2) = Layer contra co where contra = (Compose . (fmap f2) . f1) co = (g1 . (fmap g2) . getCompose) The identity element of these layers under composition is given as a layer where the contravariant component lifts a value into the identity functor, and a covariant component which extracts the value from it. Thus, we might define the identity layer as follows: .. math:: \mathrm{id}_L = (\lambda x . \mathrm{Id}(x), \lambda \mathrm{Id}(x) . x) Or, once again, in code: .. code-block:: haskell idLayer = (Identity, runIdentity) Abstract example ---------------- Let's consider the case where we want to model a magnetic field, which is a mapping from three-dimensional real coordinates to three-dimensional vectors. Thus, the type of such a magnetic field is a mapping :math:`\mathbb{R}^3 \to \mathbb{R}^3`. From the previous section we know that we can achieve this by finding a layer :math:`l` of the following type: .. math:: l : \mathrm{Functor}~F\Rightarrow (\mathbb{R}^3 \to F(\mathbb{N}), F(\mathbb{B}^p) \to \mathbb{R}^3) The key insight is that constructing the pair of functions :math:`l` is best achieved compositionally; if we wrote these functions in one go, we would be unable to re-use them for anything else. Thus, we will consider how to construct them compositionally, from a set of simpler layers. The first layer we will consider is the data type. Remember that the basic definition of our memory is a mapping from an address to a series of bits. The first layer of our composition shall thus be a layer that interprets those bits as the real numbers we desire. The type of this layer is as follows: .. math:: l_1 : (\mathbb{N} \to \mathrm{Id}(\mathbb{N}), \mathrm{Id}(\mathbb{B^p}) \to \mathbb{R}^3) Assuming we want to model our real numbers as 32-bit IEEE 754 floating point numbers, we will need to consider that each output vector, consisting of three of such numbers, will consist of a total of 96 bits. Thus, to move forward one vector in memory, we need to skip over 96 bits. In other words, to access the :math:`n`-th vector, we need to look up the memory starting at the :math:`(96n)`-th bit. Then, we need to interpret those 96 bits as three integers. In lieu of convenient mathematical notation, we will define our first layer as follows: .. math:: l_1 = (\lambda x.\mathrm{Id}(96x), \lambda \mathrm{Id}(x).\mathtt{reinterpret\_cast}(x)) This gives us a mapping from one-dimensional coordinates to three-dimensional vectors: a step in the right direction, but not what we want quite yet. We will need to compose some kind of layer that can take three-dimensional coordinates and interpret them as one-dimensional coordinates. In this layer, we don't need to modify the output at all. Thus, we arrive at the following type: .. math:: l_2 : (\mathbb{N}^3 \to \mathrm{Id}(\mathbb{N}), \mathrm{Id}(\mathbb{R}^3) \to \mathbb{R}^3) We'll assume that the size of our field in each direction is known as :math:`N`. Adhering to a `column-major storage order `_, we can define our next layer as follows: .. math:: l_2 = (\lambda (c_1, c_2, c_3).\mathrm{Id}\left(\sum_{k=1}^3\left(\prod_{l = k + 1}^3 N_l\right)c_k\right), \lambda\mathrm{Id}(x).x) Next, we will want to prevent our vector field from going out of bounds. In particular, we will want to control what happens in that case. Let's assume that we want any accesses that go outsize our :math:`N_1 \times N_2 \times N_3` mapping to retun the zero-vector instead of causing some sort of error. This is sometimes known as *clamping*. Such a layer presents an interesting challenge because it is the first layer that does not employ the identity functor. Rather, the type of this layer is the following: .. math:: l_3 : \left(\mathbb{N}^3 \to \mathrm{Maybe}(\mathbb{N} ^3), \mathrm{Maybe}(\mathbb{R}^3) \to \mathbb{R}^3\right) Intuitively, if the requested coordinate lies inside of our mapping, the contravariant side of this layer will produce an extant value, and the underlying layer will be called to operate on it as normal. However, if the coordinate lies outside of the requested mapping, a non-extant value is produced, which is threaded through any underlying layers (effectively performing no computation) and the non-extant value will be returned to the covariant side of the layer, which can then return the default value! The layer is given as follows: .. math:: :nowrap: \begin{align} l_3 = \bigg(&\lambda (c_1, c_2, c_3). \begin{cases} \mathrm{Just}((c_1, c_2, c_3)) & \mathrm{if}~c_1 < N_1 \wedge c_2 < N_2 \wedge c_3 < N_3\\ \mathrm{Nothing} & \mathrm{otherwise} \end{cases},\\ &\lambda x . \begin{cases} (r_1, r_2, r_3) & \mathrm{if}~x = \mathrm{Just}((r_1, r_2, r_3))\\ (0, 0, 0) & \mathrm{otherwise} \end{cases}\bigg) \end{align} Next, let's get rid of the constraint that we can only access our vector field at integer coordinates. To do so, we will need to construct some sort of interpolation method. Using a nearest-neighbour method this is rather simple. The type of such a layer would be: .. math:: l'_4 : (\mathbb{R}^3 \to \mathrm{Id}(\mathbb{N}^3), \mathrm{Id}(\mathbb{R}^3) \to \mathbb{R}^3) And it's implementation might look something like this, where we simply round the coordinates on the contravariant side and leave the covariant side unchanged: .. math:: l'_4 = (\lambda (c_1, c_2, c_3).\mathrm{Id}((\lfloor c_1 \rceil, \lfloor c_2 \rceil, \lfloor c_3 \rceil)), \lambda \mathrm{Id}(x) . x) But this is boring. A more interesting example that showcases the true power of our approach is linear interpolation. This requires us to access the underlying function eight times for every access, which we can model elegantly using our functor-based approach. Our new interpolation method has the following layer type, noting that the creation of 8-tuples is our functor: .. math:: l_4 : (\mathbb{R}^3 \to (\mathbb{N}^3)^8, (\mathbb{R}^3)^8 \to \mathbb{R}^3) In this case, we will omit the full equation for the layer, but intuitively the contravariant side takes one real-valued coordinate and produces the eight integer-valid coordinates which are closest to it. On the covariant side, we find eight resulting vectors of which we take the weighted average to find our final vector. We'll now add one final layer. As it stands, the coordinates we feed our vector field have no geometric meaning. They are -- for the lack of a better term -- coordinates in our computer's memory, rather than in space. Thankfully, this problem is easily solved: we will add a layer that transforms geometrically meaningful coordinates into the ones our current field accepts through an affine transformation. The type of this layer is simple: .. math:: l_5 : (\mathbb{R}^3 \to \mathrm{Id}(\mathbb{R}^3), \mathrm{Id}(\mathbb{R}^3) \to \mathbb{R}^3) The definition is not much more complex, if we assume that we are in possession of some affine transformation matrix :math:`A`: .. math:: l_5 = (\lambda x . \mathrm{Id}(Ax), \lambda \mathrm{Id}(x).x) Finally, we are ready to construct our magnetic field :math:`B`. First, let's compose all of the layers into one layer :math:`l_B`. This is achieved very simply through the operators we defined earlier: .. math:: l_B = l_5 \circ_L l_4 \circ_L l_3 \circ_L l_2 \circ_L l_1 Which we can then apply to the machine's memory to obtain our magnetic field: .. math:: B = l_B~$_L~M Note that this notion of composition is associative, such that the following are all equivalent: .. math:: :nowrap: \begin{align} B &= (l_5 \circ_L l_4 \circ_L l_3 \circ_L l_2 \circ_L l_1)~$_L~M\\ B &= (l_5 \circ_L l_4) \circ_L (l_3 \circ_L l_2 \circ_L l_1)~$_L~M\\ B &= l_5 \circ_L (l_4 \circ_L l_3 \circ_L l_2) \circ_L l_1~$_L~M\\ B &= (((l_5 \circ_L l_4) \circ_L l_3) \circ_L l_2) \circ_L l_1~$_L~M\\ B &= l_5 \circ_L (l_4 \circ_L (l_3 \circ_L (l_2 \circ_L l_1)))~$_L~M \end{align} And so forth. In addition, composing one layer with another and then applying that layer to a mapping is equivalent to applying the innermost layer to the mapping and then applying the outermost layer, such that: .. math:: (l_3 \circ_L l_2 \circ_L l_1)~$_L~M = (l_3 \circ_L l_2)~$_L~(l_1 $_L M) = l_3~$_L~(l_2~$_L~(l_1~$_L~M)) The final type of our magnetic field is as follows: .. math:: :nowrap: \begin{align} l_B : (&\mathbb{R}^3 \to (\mathrm{Id}\circ -^8 \circ \mathrm{Id} \circ \mathrm{Maybe} \circ \mathrm{Id} \circ \mathrm{Id})(\mathbb{N}),\\ &(\mathrm{Id}\circ -^8 \circ \mathrm{Id} \circ \mathrm{Maybe} \circ \mathrm{Id} \circ \mathrm{Id})(\mathbb{B}^p) \to \mathbb{R}^3) \end{align} And while that seems like quite a mouth full, it is worth noting that the composition of functors :math:`\mathrm{Id}\circ -^8 \circ \mathrm{Id} \circ \mathrm{Maybe} \circ \mathrm{Id} \circ \mathrm{Id}` is itself a well-behaved functor, meaning we can plug it into our definition of :math:`$_L` without worrying about it: composition has handled this complexity for us.