We thank the reviewers for their very helpful feedback and will revise
the paper accordingly. Especially, we will try to make the
introduction more generally accessible.

1. **Reviewer C** says "Moreover, the paper doesn't produce any particular
   equalities that one might want to reason with while programming."

We disagree: we show soundness of an entire class of equalities that are useful
for program reasoning by programmers and compilers alike: things like lifting an
if statement (or more generally any pattern match) out of a lambda are very
common in refactoring and optimization.

In terms of "actionable" theorems, we show that upcasts form complex
values and downcasts form complex stacks, which also justifies 
refactorings and optimizations. For instance, upcasts, as complex
values, can be freely (de)-duplicated, discarded if not used, and lifted
out of loops as if they were values.  Downcasts, as complex stacks, are
strict in their input, which aids strictness analysis, as in optimizers
for lazy languages.

2. **Reviewer C** asks (Question1) about the meaning of L514 and whether or not
   our syntactic theory of term dynamism and equivalence is not only sound
   but *complete* for contextual equivalence.

Our sentence is a bit awkwardly worded, but we mean that if two terms
are equidynamic, then in operational models they should be
contextually equivalent. We do include beta-eta in the definition of
term dynamism in figure 4, where they are labeled as axioms of term
dynamism, but this is not enough to prove completeness for an
effectful language like cbpv: for instance, both sequential and
non-deterministic languages model beta-eta.

The syntax is not intended to be complete for any particular language,
in order that it apply to as many as possible. This is what allows us
to interpret the model with incompatible definitions of the dynamic
type and extend it with new axioms based on that model like those
given in figure 5.

3. **Reviewer C** asks (Question2) "Do these theorems hold when you distinguish 
   between different errors via, e.g., blame labels?"

It seems likely, but it's not yet clear how to formalize these properties
with multiple blame labels.
For instance, the dynamic gradual guarantee allows for a more
precisely typed program to error with a blame label different from the
less precise program.
So if our ordering models the dynamic gradual guarantee, this means
that blame(l) would be a least element of our ordering for every label
l so any two blame errors would be equivalent. 

On the other hand the uniqueness theorems seem that they should hold
when all of the casts involved have the *same* label; we may explore
this in future work, building on the unpublished tech report:
https://newtraell.cs.uchicago.edu/research/publications/techreports/TR-2004-02

4. **Reviewer D** asks to clarify lines 116-117.

Yes, we mean that the typed equivalences should hold in the statically
typed portions of the code; but the casts ensure that the
equivalences hold in statically typed portions of the code even if they
interact with (more) dynamically typed code. We expect that it will be
rare for gradually typed programs to be completely typed (e.g. this is
very rare in Typed Racket).

5. **Reviewer D** asks about languages where dynamism doesn't go all the
   way up to ? if we can still reduce everything to upcasts and
   downcasts.

The use of ? in the factorization of a cast A => B can be replaced
with any type D that is more dynamic than both A and B. We don't know
of any systems where a cast A => B is allowed but they have no common
dynamism-supertype.

6. **Reviewer C** says that we "restrict to non-effectful programs" and
   finds it surprising because "blame is an effect".

We _do_ treat dynamic type errors as an effect, and as Reviewer D notes, this is
one of the main reasons we use CBPV: to have an axiomatic theory that is valid
in the presence of arbitrary effects.
The comment in the intro where we say "restrict to non-effectful code" concerns
the eta rule (explaining the eta rule in traditional call-by-value is restricted
to values or non-effectful code). 


7. **Reviewer A** asks "(l35) eta is mostly used in one direction for
    optimization, it's the other direction that fails for these cases right?"

Yes, in the eager cast semantics, the eta law for functions should be
an ordering `V <= \x -> V x` but it's debatable whether this means that
the right should be allowed to be optimized to the left. This would make
the semantics of the program dependent on the precise behavior of the
optimizer, so it would be very hard to change the optimizer without
either making code error that didn't before or vice-versa (because
whether or not an optimization is triggered can be very sensitive to
small changes).

8. **Reviewer C** comments on some missing related work.

We apologize for the oversight and will add appropriate discussion; for
Henglein, a significant difference is that the equational theory there
is based on a rewriting system with cast rules for each pair of types,
whereas in our approach those rules are a theorem following from the
specification of casts.

We will also add a discussion of Degen and Thiemann's work, which is
related: their completeness property inherently violates CBN eta
principles, so is incompatible with "lazy" checking.