* Reviewer A (Accept)
  - (l35) eta is mostly used in one direction for optimization, it's
    the other direction that fails for these cases right?
    - Ans: Yes, in the "eager" semantics, the eta law for functions
      should be an ordering `V <= \x -> V x` but it's debateable
      whether this means that the right should be allowed to be
      optimized to the left. This would make the semantics of the
      program dependent on the precise behvior of the optimizer, so it
      would be very hard to change the optimizer without either making
      code error that didn't before or vice-versa.
  - 781: is the unforcedness of these choices connected to similar
    unforcedness... Could you be more specific about what you mean? It
    sounds intriguing but we don't understand the question. Is the
    paper being referred to Dimoulas et al 2011?
  - 1143: why isn't the trans relation at some number larger than i:
    this is technical but standard: the short explanation is that `i`
    here is only a bound on exploring the behavior of V1 (i.e. the
    left-hand side), so it doesn't change when we go from V2 to V3.
  - 1200: we 
* Reviewer B (Weak Reject)
* Reviewer C (Weak Accept)
** Things I will respond to
   - "cast semantics that munge proxies disrupt \eta"
     - A: we think it is oversimplifying our discussion of eta to say
       that the observation that eager semantics has already been
       shown. The two examples given involving reference equality are
       situations where the eta principle is not present, not where it
       is broken by the cast semantics, as we discussed. If you can
       point us to work where the this has been noted before, we would
       be happy to see it.
   - "But beyond these criticisms of the paper's scholarly situation,
     there's more: the core result---that $\eta$ isn't compatible with
     some of the trickier forms of gradual checking---is hardly a new
     insight, even if it hasn't been stated exactly that way
     before. There are several papers already remarking on the issues
     presented by, e.g., function proxies (e.g., Siek's work on
     Reticulated Python is motivated by Python's extensive use of
     reference equality and other $\eta$-breaking operations; Tanter's
     work on Confined Gradual Typing has a similar motivation)."

   - "The paper claims that "the linchpin to the design of a gradually
     typed language is the semantics of _runtime type casts_"
     (L34). This unsupported claim goes directly counter to the
     intuition of Garcia et al.'s AGT, which says that the linchpin of
     the design is the Galois connection giving meaning to imprecision
     in types. Further, the paper goes on to suggest that grad..."

     Is this a fing joke? Our point is that gradual typing is that
     static reasoning principles are ensured by the dynamic type
     casts. AGT produces these casts by its Galois Connection but the
     casts are still central to the design of the system. The latest
     work you cited changes the input language in order to produce
     different runtime type checking.

   - We think you are misinterpreting what the comment about L124 is
     saying: we are saying that the "full" eta law `M = \x -> M x` for
     functions in CBV can be true if the program's context is pure or
     if `M` is pure, so it is used sometimes to reason about programs
     in a restricted context.

     Our development does not at all restrict to non-effectful
     programs: the CBPV axiomatic theory is compatible with a broad
     class of effects: state, continuations, I/O and that is the main
     reason that we use it.
     In our *specific* model, we have two effects: divergence and an
     uncatchable error, but the soundness of GTT for proving
     contextual approximation would still hold if we were to use a
     language with more effects.

   - "Moreover, the paper doesn't produce any particular equalities
     that one might want to reason with while programming."

     We show an entire class of equalities that are useful for program
     reasoning by programmers and compilers alike: things like lifting
     an if statement (or more generally any pattern match) out of a
     lambda are very common in refactoring and optimization.

   - "Degen and Thiemann's work": we thank the reviewer for pointing
     out this related work. Our perspective on their result is that
     their definition of "completeness" is inherently call-by-value in
     that it requires breaking eta equalities that a lazy language is
     intended to provide.
** Questions for Response
   - What do you mean that you have a syntactic judgment for
     contextual equivalence? (L514) Are there contextual equivalent
     terms which aren't equidynamic? It's not obvious from the
     presentation of the definitions that you're including $\beta\eta$
     in $\sqsubseteq$... but you are, right? In which case
     equidynamism should be not only sound but _also complete_ for
     contextual equivalence (L758).

     The sentence is a bit awkwardly worded, but we mean that if two
     terms are equidynamic, then in operational models they should be
     contextually equivalent.
     We include beta-eta in the definition of term dynamism in figure
     4, where they are labeled as axioms of term dynamism.

     It does not make sense to ask if our syntax is complete for
     contextual equivalence without fixing a particular operational
     model. The syntax is not intended to be complete for any
     particular model, in order that it apply to as many models as
     possible. This is what allows us to interpret the model with
     incompatible definitions of the dynamic type and extend it with
     new axioms based on that model like those given in figure 5.

   - Do these theorems hold when you distinguish between different
     errors via, e.g., blame labels?

     For many of our theorems the distinction is small. The
     dynamic gradual guarantee allows for a term with more precise
     types to error with a different blame label, so any system we
     made that axiomatized the ordering in the gradual guarantee would
     necessarily have that any two errors would be equi-dynamic
     regardless of their blame label, so it doesn't change much to
     just consider them equal in our axioms.

     On the other hand, it seems sensible that our cast uniqueness
     theorems should hold in the presence of multiple blame labels if
     formalized correctly. For instance in the product cast uniquenes
     theorem:
     
        <{A1' x A2'} <-< {A1 x A2}>y
        = split y to (y1,y2). (<A1' <-< A1>y1, <A2' <-< A2>y2)
	
     the correct blame-sensitive statement would be that these hold
     when both sides all use the same blame label. It is not clear to
     us how to formalize this exactly, but this was attempted in some
     unpublished work by Findler-Blume-Felleisen where the ordering
     was parameterized by which error was considered the "current
     blame label":
     https://newtraell.cs.uchicago.edu/research/publications/techreports/TR-2004-02

* Reviewer D (Accept)
** Other Qs
   - For "less gradual" languages in which dynamism doesn't go up to
     ?...upcasting all the way to ? makes me a little worried.
     - The use of ? in the factorization of a cast A => B can be
       replaced with any type D that is more dynamic than both A and
       B, so ? is the most convenient since if its present it always
       works. We would be curious to know of any systems where a cast
       A => B was allowed but they have no common dynamism-supertype.
** Questions for Response
   - 
   - The typed equivalences should hold "within the static portions of
     the program" as you say, but we also want to emphasize that the
     static equivalences hold even in the presence of interaction with
     (more) dynamically typed code, which is exactly what the casts
     ensure.

4. Reviewer C says that we are overly specific in our view of gradual
   typing: that we only consider languages that have eta laws and the
   dynamic gradual guarantee.

This seems like a narrow view of programming language semantics. Often
in practice languages will not satisfy idealized properties because of
certain known features. This does not mean that these principles
cannot ever be used. For instance in OCaml an eta reduction is not
necessarily an equivalence because of reference equality of functions,
but in a specific portion of a program where reference equality of
functions is not being used programmers will refactor in ways that are
only justified by eta principles. What we have shown is that anything
besides the cast semantics *breaks the eta principle in a new way*. So
this means that in order