diff --git a/notes/cast-induction.txt b/notes/cast-induction.txt
new file mode 100644
index 0000000000000000000000000000000000000000..bb040e382d51abbf01ecc76bc818399faeac0dae
--- /dev/null
+++ b/notes/cast-induction.txt
@@ -0,0 +1,242 @@
+It's probably too late/too much to do this in this paper, but here's an
+idea for how to make the proofs in GTT easier / a suggestion for what an
+"Abadi-Plotkin logic" for dynamism might include.
+
+First, suppose we have type variables in the program logic, where a
+judgement
+X type |- T(X) type
+implies monotonicity of T, i.e. T(A) <= T(B) if A <= B.
+(This is admissible in GTT, but not enforced for an "unknown" type
+constructor.)
+
+Second, suppose we have terms with free type variables, where
+X type |- E : T(X)
+implies that E satisfies a "fundamental theorem"
+E(A) <= E(B) : T(A) <= T(B)
+
+Next, suppose we have a value->value function type A => B (maybe this
+only needs to exist in the program logic, not in the programming
+language, as a way to talk about a term with a free value variable).
+Having congruence and beta for this implies that every
+complex-value-with-a-variable is monotone, because f <= g : A => B <= A'
+=> B' and a <= a' : A <= A' implies f(a) <= g(a') : B <= B', and we have
+beta for the applications as equidynamism.
+
+Then I think the following "path induction" principles are valid for
+proving dynamism by "replacing upcasts with the identity whenever that
+type checks".
These are basically just packaging up uses of monotonicity,
+transitivity, and the universal property of the upcast, which says in
+this notation
+
+id_A <= <B <-< A> : A => A <= A => B
+<B <-< A> <= id_B : A => B <= B => B
+
+You have to be careful about (a) whether it's the source or target of
+the function that's "free", and (b) whether the term with the cast is on
+the left or right of the term dynamism judgement, which gives:
+
+(1) right upcast induction
+
+X type, f : X => B |- E(X,f) : T(X)
+T(B) <= C [might not hold]
+E(B, id_B) <= F : T(B) <= C
+——————————————————
+E(A, <B <-< A>) <= F : T(A) <= C
+
+[ use monotonicity to get E(A, <B <-< A>) <= E(B, id_B) : T(A) <= T(B)
+and then transitivity with the premise ]
+
+(2) left upcast induction
+
+Y type, f : A => Y |- E(Y,f) : T(Y)
+C <= T(A) [might not hold]
+F <= E(A, id_A) : C <= T(A)
+——————————————————
+F <= E(B, <B <-< A>) : C <= T(B)
+
+Here are some examples using them:
+
+(a) Suppose we have a List A type and a complex-value map (everything is
+positive) : (A => B) x List A => List B
+We want to show that map <B <-< A> is an upcast, i.e.
+
+(a1) id_<List A> <= map <B <-< A> : L(A) => L(A) <= L(A) => L(B)
+Note that Y type, f : A => Y |- map <Y <-< A> : L(A) => L(Y)
+and L(A) => L(A) <= L(A) => L(A).
+Therefore, by left upcast induction, it suffices to show
+id_<List A> <= map id_A, which is true by beta/eta.
+
+(a2) map <B <-< A> <= id_<List B> : L(A) => L(B) <= L(B) => L(B)
+Note that X type, f : X => B |- map <B <-< X> : L(X) => L(B),
+and L(B) => L(B) <= L(B) => L(B).
+Therefore, by right upcast induction, it suffices to show
+map id_B <= id_<List B>, which is true by eta.
+
+(b) We want to show that u := \z.split z as (x,y) in (<A' <-< A> x, <B'
+<-< B> y) is an upcast A x B <= A' x B'.
+
+(b1) To show:
+id_<A x B> <= u : A x B => A x B <= A x B => A' x B'
+First, note that
+Y type, f : A => Y |- \z.split z as (x,y) in (f x, <B' <-< B> y) : A x B
+=> Y x B'
+and
+A x B => A x B <= A x B => A x B'.
+Therefore, by left upcast induction (with T(Y) = A x B => Y x B'), it
+suffices to show
+id_<A x B> <= \z.split z as (x,y) in (id_A x, <B' <-< B> y) : A x B => A
+x B'
+and by beta
+id_<A x B> <= \z.split z as (x,y) in (x, <B' <-< B> y) : A x B => A x B'
+Second,
+Y type, f : B => Y |- \z.split z as (x,y) in (x, f y) : A x B => A x Y
+and
+A x B => A x B <= A x B => A x B
+Therefore, by left upcast induction (with T(Y) = A x B => A x Y), it
+suffices to show
+id_<A x B> <= \z.split z as (x,y) in (x, id_B y)
+which is true by beta/eta.
+
+(b2) To show:
+u <= id_<A' x B'> : A x B => A' x B' <= A' x B' => A' x B'
+Since
+X type, f : X => A' |- \z.split z as (x,y) in (f x, <B' <-< B> y) : X x
+B => A' x B'
+and
+A' x B => A' x B' <= A' x B' => A' x B'
+it suffices to show
+\z.split z as (x,y) in (id_A' x, <B' <-< B> y) <= id_<A' x B'>
+And since
+X type, f : X => B' |- \z.split z as (x,y) in (id_A' x, f y) : A' x X =>
+A' x B'
+and
+A' x B' => A' x B' <= A' x B' => A' x B'
+it suffices to show
+\z.split z as (x,y) in (id_A' x, id_B' y) <= id_<A' x B'>
+which again holds by beta/eta.
+
+----------------------------------------------------------------------
+
+For (1) and (2), I was thinking that A => A' and B -o B' could exist
+only at the program logic level, not as part of the programming
+language. I.e. without changing the GTT programming language from
+what's currently there, add a separate layer where we have a judgement
+like
+
+Psi | G | D |- E : T
+
+where
+
+Psi ::= Psi,X type | Psi,f : A1 => A2 | Psi,s : B1 -o B2
+
+which act like meta-language quantifiers that we would currently state
+as "for any complex value from A1 to A2" or "for any stack from B1 to
+B2".
+
+And then a judgement
+
+f : A1 => A2 | G | D |- E : T
+
+means
+(1) for any complex value G, x : A1 |- V : A2, we have G | D |- E[x.V/f]
+: T, and
+(2) for related complex values G <= G', x <= x' : A1 <= A1' |- V <= V' :
+A2 <= A2',
+G <= G' | D <= D' |- E[x.V/f] <= E'[x'.V'/f'] : T <= T'
+
+Similarly, s : B1 -o B2 | G | D |- E : T means
+(1) for any stack G | B1 |- S : B2,
+G | D |- E[S/s] : T
+(2) for related stacks G <= G' | B1 <= B1' |- S <= S' : B2 <= B2',
+G <= G' | D <= D' |- E[S/s] <= E'[S'/s'] : T <= T'
+
+I.e. in the program logic the first-order quantifiers range over both
+the objects of the category and the maps.
+
+----------------------------------------------------------------------
+
+OK, I think that if we have type dynamism assumptions, then the
+following variation works:
+
+(1) right type dynamism induction
+
+X type, X <= B |- E(X) : T(X)
+T(B) <= C [might not hold]
+E(B) <= F : T(B) <= C
+——————————————————
+A type, A <= B |- E(A) <= F : T(A) <= C
+
+i.e. if you have a term that makes sense for any type below B, then if
+it has an upper bound at B, it has an upper bound everywhere
+
+(2) left type dynamism induction
+
+Y type, A <= Y |- E(Y) : T(Y)
+C <= T(A) [might not hold]
+F <= E(A) : C <= T(A)
+——————————————————
+B type, A <= B |- F <= E(B) : C <= T(B)
+
+(3) axioms <A <-< A> = id_A : A => A and <B <<- B> = id_B : B -o B
+
+(if type dynamism were not thin, E would depend on the derivation, and
+the identity would be substituted in the premise).
+
+To see why this is right, assume A <= B. For the downcasts, the
+ordering is still
+
+id_A : A -o A <= <A <<- B> : B -o A <= <B <<- B> : B -o B
+
+just like for upcasts
+
+id_A : A => A <= <B <-< A> : A => B <= <B <-< B> : B => B
+
+i.e. the identity on the source of the type dynamism is less than the
+cast, which is less than the identity on the target; the only
+difference is the type in the middle.
+
+So, for example, if you have
+X type, X <= B |- <B <-< X> : X => B
+and a particular A for which A <= B, then it makes sense that
+<B <-< A> <= <B <-< B> = id_B : A => B <= B => B
+but also that (if they're computation types instead)
+X type, X <= B |- <X <<- B> : B -o X, so
+<A <<- B> <= <B <<- B> = id_B : B -o A <= B -o B
+
+So here's how one of the Kleisli upcasts goes:
+
+Suppose B1 <= B1' and B2 <= B2'.
+To show:
+u := \x : U (B1 x B2). thunk {force(<U B1' <-< U B1> thunk(fst(force
+x))), force(<U B2' <-< U B2> thunk(snd(force x))) }
+is an upcast <U (B1' x B2') <-< U (B1 x B2)>
+
+(a) B1 <= B1', B2 <= B2' |- id_U (B1 x B2) <= u : U (B1 x B2) => U (B1 x
+B2) <= U (B1 x B2) => U (B1' x B2')
+
+Note that
+Y type, B1 <= Y |- \x : U (B1 x B2). thunk {force(<U Y <-< U B1>
+thunk(fst(force x))), force(<U B2' <-< U B2> thunk(snd(force x))) } :
+U (B1 x B2) => U (Y x B2')
+so we can contract B1' to B1, and it suffices to show
+id <= \x : U (B1 x B2). thunk {force(<U B1 <-< U B1> thunk(fst(force
+x))), ... }
+: U (B1 x B2) => U (B1 x B2) <= U (B1 x B2) => U (B1 x B2')
+Then (secretly using the fact that U(id_B1) = id_U B1 : U B1 <= U B1),
+this is equal to
+\x : U (B1 x B2). thunk {force(thunk(fst(force x))), ... }
+by reducing an identity cast.
+Doing the same thing to contract B2' to B2 and beta-reducing the
+identity, it suffices to show
+id <= \x : U (B1 x B2). thunk {force(thunk(fst(force x))),
+force(thunk(snd(force x))) }
+which is true by beta-reducing the force-thunk redex that killing the
+cast exposed,
+eta-reducing the (fst force x, snd force x) pair,
+and eta-reducing the thunk(force x).
+
+(b) a dual thing should work for
+
+B1 <= B1', B2 <= B2' |- u <= id_U (B1' x B2')
+
+Because we're showing an upper bound, use right type dynamism induction
+to contract B1 to B1' and B2 to B2', and then beta/eta does it.
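As a sanity check on example (a), here is a small executable analogy in Python; the names (`list_map`, `up_int_float`) and the modeling of an upcast as a plain embedding function are illustrative assumptions, not part of GTT. The beta/eta base cases of the two inductions correspond to `map id = id`, and the monotonicity/transitivity packaging corresponds to functoriality of `map`.

```python
def identity(x):
    return x

def list_map(f):
    # Functorial action of List on value->value functions:
    # the list upcast for A <= B is list_map of the upcast for A <= B.
    return lambda xs: [f(x) for x in xs]

def up_int_float(n):
    # The upcast <float <-< int>, modeled as an embedding function.
    return float(n)

def compose(g, f):
    return lambda x: g(f(x))

samples = [[], [1, 2, 3], [0, -5]]

# eta: map id = id -- the base case that both upcast inductions reduce to.
assert all(list_map(identity)(xs) == xs for xs in samples)

# Functoriality: map (g . f) = map g . map f -- the shape of the
# monotonicity + transitivity argument packaged up by rules (1) and (2).
for xs in samples:
    assert list_map(compose(up_int_float, identity))(xs) == \
           list_map(up_int_float)(list_map(identity)(xs))
```

This only checks the equational (beta/eta) content, of course; the ordering content of dynamism has no direct analogue in plain Python functions.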
diff --git a/paper/gtt.tex b/paper/gtt.tex
index 18b1bc4872640bcfe93371c8f960b06f374cb5c1..c628d55683845153698873cc9156821cd9e8be4a
--- a/paper/gtt.tex
+++ b/paper/gtt.tex
@@ -225,6 +225,8 @@
 \newcommand{\bnfadd}{\mathrel{\bf +::=}}
 \newcommand{\bnfsub}{\mathrel{\bf -::=}}
 
+\newcommand\dynvctx[0]{\mathsf{dyn\mathord{-}vctx}}
+\newcommand\dyncctx[0]{\mathsf{dyn\mathord{-}cctx}}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 
 \begin{document}
@@ -783,16 +785,16 @@ TODO: do we actually know what would go wrong in that case?
 \inferrule{A \ltdyn A' \and \u B \ltdyn \u B'}
 {A \to \u B \ltdyn A' \to \u B'}
 \\
-\inferrule{ }{\cdot \, \mathsf{ctx}}
+\inferrule{ }{\cdot \, \dynvctx}
 \and
-\inferrule{\Phi \, \mathsf{ctx} \and
+\inferrule{\Phi \, \dynvctx \and
 A \ltdyn A'}
-  {\Phi, x \ltdyn x' : A \ltdyn A' \, \mathsf{ctx}}
+  {\Phi, x \ltdyn x' : A \ltdyn A' \, \dynvctx}
 \and
-\inferrule{ }{\cdot \, \mathsf{cctx}}
+\inferrule{ }{\cdot \, \dyncctx}
 \and
 \inferrule{\u B \ltdyn \u B'}
-  {(\bullet \ltdyn \bullet : \u B \ltdyn \u B') \, \mathsf{cctx}}
+  {(\bullet \ltdyn \bullet : \u B \ltdyn \u B') \, \dyncctx}
 \end{mathpar}
 \caption{GTT Type and Context Dynamism}
 \end{figure}
@@ -1039,19 +1041,13 @@ TODO: do we actually know what would go wrong in that case?
 \begin{small}
 \begin{mathpar}
-  \inferrule{ }{\Gamma \ltdyn \Gamma \vdash V \ltdyn V : A \ltdyn A}
+  \inferrule{ }{\Gamma \ltdyn \Gamma \mid \Delta \ltdyn \Delta \vdash E \ltdyn E : T \ltdyn T}
 
-  \inferrule{\Gamma \ltdyn \Gamma' \vdash V \ltdyn V' : A \ltdyn A' \\\\
-  \Gamma' \ltdyn \Gamma'' \vdash V' \ltdyn V'' : A' \ltdyn A''
+  \inferrule{\Gamma \ltdyn \Gamma' \mid \Delta \ltdyn \Delta' \vdash E \ltdyn E' : T \ltdyn T' \\\\
+  \Gamma' \ltdyn \Gamma'' \mid \Delta' \ltdyn \Delta'' \vdash E' \ltdyn E'' : T' \ltdyn T''
   }
-  {\Gamma \ltdyn \Gamma'' \vdash V \ltdyn V'' : A \ltdyn A''}
+  {\Gamma \ltdyn \Gamma'' \mid \Delta \ltdyn \Delta'' \vdash E \ltdyn E'' : T \ltdyn T''}
 \\
-  \inferrule{ }{\Gamma \ltdyn \Gamma \mid \Delta \ltdyn \Delta \vdash M \ltdyn M : \u B \ltdyn \u B}
-
-  \inferrule{\Gamma \ltdyn \Gamma' \mid \Delta \ltdyn \Delta' \vdash M \ltdyn M' : \u B \ltdyn \u B' \\\\
-  \Gamma' \ltdyn \Gamma'' \mid \Delta' \ltdyn \Delta'' \vdash M' \ltdyn M'' : \u B' \ltdyn \u B''
-  }
-  {\Gamma \ltdyn \Gamma'' \mid \Delta \ltdyn \Delta'' \vdash M \ltdyn M'' : \u B \ltdyn \u B''}
 \inferrule
 { }