notes on lift monad in topos of trees model

8cd81d9f · Eric Giovannini · eee09a71 · 8cd81d9f · 8cd81d9f
Commit 8cd81d9f authored 2 years ago by Eric Giovannini
--- a/notes/lift-monad-topos-of-trees.pdf
+++ b/notes/lift-monad-topos-of-trees.pdf
--- a/notes/lift-monad-topos-of-trees.tex
+++ b/notes/lift-monad-topos-of-trees.tex
+\documentclass{article}
+\usepackage{amssymb}
+\usepackage{amsmath}
+\usepackage{amsthm}
+\usepackage{tikz-cd}
+\usepackage{mathpartir}
+\usepackage{stmaryrd}
+\usepackage{parskip}
+\usepackage{tikz-cd}
+\newtheorem{theorem}{Theorem}[section]
+\newtheorem{nonnum-theorem}{Theorem}[section]
+\newtheorem{corollary}{Corollary}[section]
+\newtheorem{definition}{Definition}[section]
+\newtheorem{lemma}{Lemma}[section]
+\newcommand{\later}{\vartriangleright\hspace{-2px}}
+\newcommand{\nxt}{\texttt{next}}
+\newcommand{\calS}{\mathcal{S}}
+\newcommand{\gfix}{\texttt{gfix}}
+\newcommand{\calU}{\mathcal{U}}
+\newcommand{\laterhat}{\widehat{\later}}
+\newcommand{\El}{\mathsf{El}}
+\newcommand{\lift}{L_\mho}
+\newcommand{\lifthat}{\widehat{L_\mho}}
+\newcommand{\To}{\Rightarrow}
+\newcommand{\calC}{\mathcal{C}}
+\newcommand{\Set}{\texttt{Set}}
+\newcommand{\Yo}{\mathsf{Yo}}
+\newcommand{\Hom}{\mathsf{Hom}}
+\newcommand{\inl}{\texttt{inl}}
+\newcommand{\inr}{\texttt{inr}}
+\begin{document}
+\title{The Lift Monad in the Topos of Trees}
+\author{Eric Giovannini}
+\maketitle
+\section{Background}
+Recall that the topos of trees $\calS$ is the presheaf category $\Set^{\omega^o}$. 
+An object $X$ is a family $\{X_i\}$ of sets indexed by natural numbers, along with
+restriction maps $r^X_i \colon X_{i+1} \to X_i$.
+A morphism from $\{X_i\}$ to $\{Y_i\}$ is a family of functions $f_i \colon X_i \to Y_i$
+that commute with the restriction maps in the obvious way, that is,
+$f_i \circ r^X_i = r^Y_i \circ f_{i+1}$.
+There is an operator $\later$ (``later''), defined on an object $X$ by
+$(\later X)_0 = 1$ and $(\later X)_{i+1} = X_i$. The restriction maps are given by
+$r^\later_0 = !$, where $!$ is the unique map into $1$, and $r^\later_{i+1} = r^X_i$.
+For each $X$, there is a morphism $\nxt^X \colon X \to \later X$ defined pointwise by
+$\nxt^X_0 = !$, and $\nxt^X_{i+1} = r^X_i$.
+% Note that $\calS$ is a topos, so it is cartesian closed and has a subobject classifier $\Omega$.
+We also have a universe $\calU$ of types, with encodings for operations such as sum types
+(written as $\widehat{+}$). There is also an operator $\laterhat \colon \later \calU \to \calU$ such that
+$\El(\laterhat(\nxt A)) =\,\, \later \El(A)$, where $\El$ is the type corresponding to the code $A$.
+\section{Guarded Fixpoints}
+Given a morphism $f \colon \later X \to X$, we want to compute its fixed point $\gfix f : X$.
+We define $\gfix \colon (\later X \To X) \to X$ as a morphism in $\calS$, where $(\later X \To X)$
+is the exponential object, the object of functions from $\later X$ to $X$.
+We first describe $(\later X \To X)$. Since $\calS$ is a presheaf category, we can use the
+Yoneda Lemma to describe the exponential objects, as reviewed below.
+In general, let $X$ and $Y$ be object in the functor category $\Set^{\calC^o}$, i.e., functors
+$X$ and $Y$ from $\calC^o$ to $\Set$.
+We have by the Yoneda Lemma that for each object $a$ of $\calC$, the set
+$(X \To Y)(a)$ is isomorphic to the set of natural transformations $\Yo(a) \to (X \To Y)$,
+i.e., the set of morphisms $\Yo(a) \to (X \To Y)$ in the functor category.
+But by the universal property of the exponential, such a morphism is the same
+as a morphism (in this case, a natural transformation) $\Yo(a) \times X \to Y$.
+So, the set $(X \To Y)(a)$ is isomorphic to the set of natural transformations
+$\Yo(a) \times X \to Y$, that is, $(\Hom(-,a) \times X) \to Y$.
+% Naturality condition
+% Action of the exponential functor on morphisms
+With the above, we can compute the definition of $(\later X \To X)$ in $\calS$.
+Since $\calC = \omega$ is a preorder, the above result specializes to
+\[
+    (\later X \To X)_i = \{ f^i \colon \forall j \le i. (\later X)_j \to X_j \}.
+\]
+(The superscript is included to remind us of which set each function belongs to.)
+The naturality condition says that, for $j' \le j$ the following diagram commutes:
+% https://q.uiver.app/?q=WzAsNCxbMCwwLCIoXFxsYXRlciBYKV9qIl0sWzIsMCwiWF9qIl0sWzIsMiwiWF97aid9Il0sWzAsMiwiKFxcbGF0ZXIgWClfe2onfSJdLFswLDMsInJfe2ogXFxsZSBqJ31ee1xcbGF0ZXIgWH0iLDJdLFsxLDIsInJfe2ogXFxsZSBqJ31ee1h9Il0sWzMsMiwiZl5pIFxcLCBqJyIsMl0sWzAsMSwiZl5pXFwsIGoiXV0=
+\[\begin{tikzcd}
+	{(\later X)_j} && {X_j} \\
+	\\
+	{(\later X)_{j'}} && {X_{j'}}
+	\arrow["{r_{j \le j'}^{\later X}}"', from=1-1, to=3-1]
+	\arrow["{r_{j \le j'}^{X}}", from=1-3, to=3-3]
+	\arrow["{f^i \, j'}"', from=3-1, to=3-3]
+	\arrow["{f^i\, j}", from=1-1, to=1-3]
+\end{tikzcd}\]
+%From now on, we will leave the argument $j$ implicit when applying
+%the functions $f^i$ defined above.
+Additionally, the restriction maps 
+$r_i^{(\later X \To X)} \colon (\later X \To X)_{i+1} \to (\later X \To X)_{i}$
+are defined in the obvious way.
+Namely, if $f : \forall j \le i+1 . (\later X)_j \to X_j$ then in particular
+$f : \forall j \le i . (\later X)_j \to X_j$, so we define
+\[
+    r_i^{(\later X \To X)} (f) := f.
+\]
+Note that this is just a special case of the action of the functor $X \To Y$
+discussed above on morphisms.
+\vspace{4ex}
+Now we can proceed to define $\gfix$. We define $\gfix_i \colon (\later X \To X)_i \to X_i$
+by induction on $i$. We let
+\[
+    \gfix_0 (f) = f\, 0\, 1,
+\]
+since here $f\, 0 \colon (\later X)_0 \to X_0$ and we have $(\later X)_0 = 1$.
+For $i \ge 1$, we define
+\[
+    \gfix_i (f) = f\, i\, (\gfix_{i-1} (f\, (i-1))),
+\]
+which has type $X_i$ as follows: since $f\, (i-1) \colon ((\later X)_{i-1} \to X_{i-1})$,
+then $\gfix_{i-1}(f\, (i-1))$ has type $X_{i-1}$, which is equal to $(\later X)_i$,
+and so it is a valid input to $f\, i$.
+This completes the definition of $\gfix_i$, and hence of $\gfix$.
+We need to verify that $\gfix f = f (\nxt (\gfix f))$, where $f \colon \later X \To X$.
+This follows from the definition of $\gfix$ above.
+% Details?
+\section{The Lift Monad}
+For an object $X$, the lift monad $\lift X$ is defined as follows:
+\begin{align*}
+    &\lift X := \\
+    &\quad\quad    \eta \colon X \to \lift X \\
+    &\quad\quad    \mho \colon \lift X \\
+    &\quad\quad    \theta \colon \later (\lift X) \to \lift X
+\end{align*}
+In the setting of guarded domain theory, the above translates to
+the following guarded fixpoint:
+\[
+    \lifthat X = \gfix(\lambda Y \colon \later \calU . 
+      X \,\, \widehat{+} \,\, 1 \,\, \widehat{+} \,\, \widehat{\later} Y)
+\]
+With this definition, we can take $\lift X = \El(\lifthat X)$,
+and we can show that $\lift X = X + 1 + \later (\lift X)$.
+This follows from the fact that $\gfix f = f (\nxt (\gfix f))$ and
+the rule $\El(\laterhat(\nxt A)) =\,\, \later \El(A)$.
+% TODO be explicit about how this follows from the
+% description of gfix we computed above
+We can compute $(\lift X)_i$ for each $i$, as follows. Intuitively, we can think of
+having a potentially-erroring computation that returns values of type $X$,
+and $(\lift X)_i$ represents our view of the behavior of the computation
+provided we can watch it for $i+1$ steps.
+We have
+\[
+    (\lift X)_0 = X_0 + 1 + (\later \lift X)_0 = X_0 + 1 + 1.
+\]
+This represents the behavior of the computation after one step.
+The first $1$ represents the case where an error occurred, while
+the second $1$ represents the case where the computation took more
+than one step, and hence we have run out of time to observe it.
+For $i \ge 1$, we have
+\begin{align*}
+    (\lift X)_i &= X_i + 1 + (\later \lift X)_i \\
+                &= X_i + 1 + (\lift X)_{i-1}.
+\end{align*}
+In other words, when observing a computation for $i+1$ steps,
+either (1) we get a value in the first step, or (2) we
+error in the first step, or (3) the answer is not ready yet
+and we must wait, in which case we will have $i$ steps left to watch
+the computation.
+For concreteness, when $i = 1$, we have
+\[
+    (\lift X)_1 = X_1 + 1 + (X_0 + 1 + 1).
+\]
+Here, the first $1$ represents an error occurring in the first step of the
+computation, while the second $1$ represents an error in the second step,
+and the third $1$ represents having run out of steps to observe the computation.
+\vspace{4ex}
+We now discuss each of the constructors of $\lift X$.
+First consider the constructor $\eta^X \colon X \To \lift X$.
+This is defined component-wise as $\eta^X_i = \lambda x . \inl (\inl x)$
+(we assume sums are left-associative).
+The constructor $\mho^X \colon 1 \To \lift X$ is defined by
+$\mho^X_i = \lambda () . \inl (\inr \, ())$.
+Lastly, the constructor $\theta^X \colon (\later (\lift X) \To \lift X)$
+is defined by $\theta^X_i = \lambda \tilde{l} . \inr \, \tilde{l}$.
+The restriction maps $r^{\lift X}_i \colon (\lift X)_{i+1} \to (\lift X)_i$
+are defined by ``truncating'' the observation of the computation from $i+2$
+steps to $i+1$ steps, as follows.
+If the computation has the form $\eta^X_{i+1}(x)$, then we return $\eta^X_{i}(r^X_i(x))$.
+If the computation has the form $\mho^X_{i+1}$, then we return $\mho^X_i$.
+Lastly, if the computation has the form $\theta^X_{i+1} (\tilde{l})$, then
+we return $\theta^X_0 ()$ if $i = 0$ and 
+$(\theta^X_i (r^{\lift X}_{i-1} (\tilde{l})))$ if $i \ge 1$.
+% Is this correct?
+\subsection{The Diverging Computation}
+Consider the ``non-terminating'' computation
+$\gfix (\theta) \colon \lift X$. This can be described pointwise
+as follows:
+\[ (\gfix (\theta))_0 = \inr \, () \colon (\lift X)_0, \]
+and for $i \ge 1$,
+\[ (\gfix (\theta))_i = \inr ((\gfix (\theta))_{i-1}), \]
+where the latter type-checks because $(\gfix (\theta))_{i-1} \colon (\lift X)_{i-1}$.
+Unfolding the above, we have
+\[
+    (\gfix (\theta))_i = \inr (\inr ( \dots ( \inr \, () ) \dots )),
+\]
+where there are $i+1$ nested occurrences of $\inr$.
+So, no matter how many steps we observe this computation for,
+it will never return a result within that number of steps.
+It will instead always say that it needs more time.
+\section{Weak Bisimilarity}
+We now define a weak bisimilarity relation ``$\approx$'' on $\lift X$.
+This relation is parameterized by an equivalence relation
+$\sim$ on $X$. Intuitively, two computations $l$ and $l'$
+if they are equivalent ``up to $\theta$''. Specifically,
+their underlying results are related by $\sim$ and they only
+differ in the number of steps taken to deliver their results.
+In the below, we define $\delta \colon \lift X \to \lift X$ to be
+$\theta \circ \nxt$.
+We define $\approx$ via guarded fixpoint as follows:
+\begin{align*}
+    &\approx \,\, = \gfix (\lambda \,\, rec \,\, l \,\, l'. \\
+    &\quad\quad \text{case } l, l' \text{ of} \\
+    &\quad\quad\quad | \quad \mho , \mho     \Rightarrow Unit \\
+    &\quad\quad\quad | \quad \eta (x) , \eta (y) \Rightarrow x \sim y \\
+    &\quad\quad\quad | \quad \theta (\tilde{x}) , \theta (\tilde{y}) \Rightarrow 
+      \laterhat (\lambda t . (rec \,\, t) (\tilde{x} \, t) (\tilde{y} \, t)) \\
+    &\quad\quad\quad | \quad \theta (\tilde{x}) , \mho    \Rightarrow 
+      \Sigma_{n : \mathbb{N}} . \theta (\tilde{x}) = \delta^n (\mho) \\
+    &\quad\quad\quad | \quad \theta (\tilde{x}) , \eta(y)  \Rightarrow 
+      \Sigma_{n : \mathbb{N}} . \Sigma_{x : X} . 
+        \theta (\tilde{x}) = \delta^n(\eta (x)) \times (x \sim y)\\
+    &\quad\quad\quad | \quad \mho , \theta (\tilde{y}) \Rightarrow
+      \Sigma_{n : \mathbb{N}} . \theta (\tilde{y}) = \delta^n (\mho) \\
+    &\quad\quad\quad | \quad \eta (x) , \theta (\tilde{y}) \Rightarrow
+      \Sigma_{n : \mathbb{N}} . \Sigma_{y : X} . 
+        \theta (\tilde{y}) = \delta^n(\eta (y)) \times (x \sim y)\\
+    &\quad\quad\quad | \quad \text{otherwise} \Rightarrow \bot
+    )
+\end{align*}
+Note that in the case where both variables are $\theta$'s, we have
+used tick abstraction so that we can apply the guarded recursive
+function $rec$.
+\vspace{4ex}
+We now consider the claim that for all $l : \lift X$,
+\[
+    l \approx (\gfix (\theta)).
+\]
+We claim that this does not hold. Specifically, we claim that
+\[
+    \mho \not\approx (\gfix (\theta)).
+\]
+In order for these to be related, according to the defintion of the relation,
+(and recalling that $(\gfix (\theta)) = \theta (\nxt (\gfix (\theta)))$),
+we would need that $(\gfix (\theta)) = \delta^n (\mho)$ for some $n$.
+Suppose this were the case for some natural number $n_0$.
+Consider an index $i > n_0 \in \omega$. The left-hand side above is equal to
+\[ \inr (\inr ( \dots ( \inr \, () ) \dots )), \]
+(where there are $i+1$ occurrences of $\inr$), while the right-hand side will have
+as its innermost term an $\inl (\inr \, ())$, representing the error case $\mho$.
+Thus, the two terms cannot be equal for any $n$. It follows that 
+\[
+    \mho \not\approx (\gfix (\theta)).
+\]
+\end{document}
\ No newline at end of file