diff --git a/paper/gtt.tex b/paper/gtt.tex
index f0280c8a4550943c5ad5f4d480c762750a1926be..280d0bd6981745c535e482917f1cf81a2477323d 100644
--- a/paper/gtt.tex
+++ b/paper/gtt.tex
@@ -456,8 +456,8 @@ which express the uniqueness or universality of certain constructions.
 The $\eta$ law of the untyped $\lambda$-calculus, which
 states that any $\lambda$-term $M \equiv \lambda x. M x$, is
 restricted in a typed language to only hold for terms of function type $M
-: A \to B$---$\lambda$ is the unique/universal way of making an element
-of the function type.  
+: A \to B$ ($\lambda$ is the unique/universal way of making an element
+of the function type).  
 %
 This famously ``fails'' to hold in call-by-value languages in the
 presence of effects: if $M$ is a program that prints \texttt{"hello"}
@@ -466,7 +466,7 @@ $\lambda x. M x$ will only print when given an argument. But this can be
 accomodated with one further modification: the $\eta$ law is valid in
 simple call-by-value languages\footnote{This does not hold in languages
   with some intensional feature of functions such as reference
-  equality. We discuss the applicability of our main results more generally in section \ref{sec:related}} (e.g. SML) if we have a ``value
+  equality. We discuss the applicability of our main results more generally in Section \ref{sec:related}.} (e.g. SML) if we have a ``value
 restriction'' $V \equiv \lambda x. V x$.
 %
 This illustrates that $\eta$/extensionality rules must be stated for
@@ -482,7 +482,7 @@ statement on $x$: $M \equiv \kw{if} x (M[\texttt{true}/x])
 If we have an \texttt{if} form that is strongly typed (i.e., errors on
 non-booleans) then this tells us that it is \emph{safe} to run an if
 statement on any input of boolean type (in CBN, by contrast an if
-statement forces a thunk and so is not safe).
+statement forces a thunk and so is not necessarily safe).
 %
 In addition, even if our \texttt{if} statement does some kind of
 coercion, this tells us that the term $M$ only cares about whether $x$
@@ -696,9 +696,9 @@ determine a cast semantics, and helps clarify the trade-off between
 eager and lazy semantics of function casts.
 
 \textbf{Technical Overview of GTT.}  The gradual type theory developed
-in this paper unifies our previous work~\citep{newahmed18} on
+in this paper unifies our previous work on
 operational (logical relations) reasoning for gradual typing in a
-call-by-value setting (which did not consider a proof theory), and on an
+call-by-value setting~\citep{newahmed18} (which did not consider a proof theory), and on an
 axiomatic proof theory for gradual typing~\citep{newlicata2018-fscd} in
 a call-by-name setting (which considered only function and product
 types, and denotational but not operational models).
@@ -1619,7 +1619,7 @@ above: it can be taken to be a stack $\bullet : \u B' \vdash \dncast{\u
   B}{\u B'}{\bullet} : \u B$, because a downcasted computation
 evaluates the computation it is ``wrapping'' exactly once.  One
 intuitive justification for this point of view, which we make precise
-in section \ref{sec:contract}, is to think of the dynamic computation type $\dync$ as a
+in Section \ref{sec:contract}, is to think of the dynamic computation type $\dync$ as a
 recursive \emph{product} of all possible behaviors that a computation
 might have, and the downcast as a recursive type unrolling and product
 projection, which is a stack.  From this point of view, an \emph{upcast}
@@ -9408,7 +9408,7 @@ theorem applies.
 %
 The theorem applies to any \emph{model} of gradual type theory, such
 as the models we have constructed using call-by-push-value given in
-sections \ref{sec:contract}, \ref{sec:complex}, \ref{sec:operational}.
+Sections \ref{sec:contract}, \ref{sec:complex}, \ref{sec:operational}.
 %
 We conjecture that simple call-by-value and call-by-name gradual
 languages are also models of GTT, by extending the translation of