diff --git a/paper/gtt.tex b/paper/gtt.tex index a2f11a7afab5624832446a3bc518ff85074c9fa6..2dd4f8a685509e5aa812e9888b5c04360f98af14 100644 --- a/paper/gtt.tex +++ b/paper/gtt.tex @@ -410,9 +410,9 @@ However, the dichotomy between gradual and optional typing is not as firm as one might like. % There have been many different proposed semantics of run-time type -checking: ``transient'' cast semantics~\cite{vitousekswordssiek:2017} +checking: ``transient'' cast semantics~\citep{vitousekswordssiek:2017} only checks the head connective of a type (number, function, list, -\ldots), ``eager'' cast semantics~\cite{herman2010spaceefficient} checks +\ldots), ``eager'' cast semantics~\citep{herman2010spaceefficient} checks run-time type information on closures, whereas ``lazy'' cast semantics~\citep{findler-felleisen02} will always delay a type-check on a function until it is called (and there are other possibilities, see @@ -485,8 +485,7 @@ common. % The second property we take for granted is that the language satisfies the \emph{dynamic gradual guarantee}~\cite{refined} (``graduality'')---a -now standard correctness theorem of gradual typing that most newly -proposed gradual languages seek to satisfy--- which constraints how +strong correctness theorem of gradual typing--- which constraints how changing type annotations changes behavior. Graduality says that if we change the types in a program to be ``more precise''---e.g., by changing from the dynamic type to a more precise type such as integers or @@ -736,8 +735,8 @@ types and downcasts with lazy/computation types, and that the modalities relating values and computations induce the downcasts for eager/value types and upcasts for lazy/computation types. Moreover, this analysis articulates an important behavioral property of casts that was proved -operationally for call-by-value in \cite{newahmed18} but missed for -call-by-name in \cite{newlicata2018-fscd}: upcasts for eager types and +operationally for call-by-value in \citet{newahmed18} but missed for +call-by-name in \citet{newlicata2018-fscd}: upcasts for eager types and downcasts for lazy types are both ``pure'' in a suitable sense, which enables more refactorings and program optimizations. In particular, we show that these casts can be taken to be (and are essentially forced to @@ -788,7 +787,7 @@ Then for the extended CBPV and define a step-indexed biorthogonal logical relation that interprets the ordering relation on terms as contextual error approximation, which underlies the definition of graduality as -presented in \cite{newahmed18}. +presented in \citet{newahmed18}. % Combining these theorems gives an implementation of the term language of GTT in which $\beta, \eta$ are observational equivalences @@ -853,7 +852,7 @@ The main contributions of the paper are as follows. \begin{shortonly} \textbf{Extended version:} An extended version of the paper, which includes the omitted cases of definitions, lemmas, and proofs is -available as \citet{newlicataahmed19:extended}. +available in \citet{newlicataahmed19:extended}. \end{shortonly} % While gradual typing has been researched quite extensively by proving @@ -1451,7 +1450,7 @@ types will arise from type dynamism and casts. The \emph{type dynamism} relation of gradual type theory is written $A \ltdyn A'$ and read as ``$A$ is less dynamic than $A'$''; intuitively, this means that $A'$ supports more behaviors than $A$. -\cite{newahmed18,newlicata2018-fscd} analyze this as the existence of an \emph{upcast} +\citet{newahmed18,newlicata2018-fscd} analyze this as the existence of an \emph{upcast} from $A$ to $A'$ and a downcast from $A'$ to $A$ which form an embedding-projection pair (\emph{ep pair}) for term error approximation (an ordering where runtime errors are minimal): the upcast followed by the @@ -1550,7 +1549,7 @@ considered. However, we can arrive at a first proposal by considering how previous work would be embedded into CBPV. % In the previous work on both CBV and -CBN~\cite{newahmed18,newlicata2018-fscd} every type dynamism judgement +CBN~\citep{newahmed18,newlicata2018-fscd} every type dynamism judgement $A \ltdyn A'$ induces both an upcast from $A$ to $A'$ and a downcast from $A'$ to $A$. % @@ -1858,7 +1857,7 @@ typically because they are identical), which is a syntactic judgement for contex Finally, we assert some term dynamism axioms that describe the behavior of programs. The cast universal properties at the top of -Figure~\ref{fig:gtt-term-dyn-axioms}, following~\cite{newlicata2018-fscd}, say that +Figure~\ref{fig:gtt-term-dyn-axioms}, following~\citet{newlicata2018-fscd}, say that the defining property of an upcast from $A$ to $A'$ is that it is the least dynamic term of type $A'$ that is more dynamic that $x$, a ``least upper bound''. That is, $\upcast{A}{A'}{x}$ is a term of type $A'$ that @@ -4594,7 +4593,7 @@ introduction forms for $\dynv$ are exactly the introduction forms for all of the value types (unit, pairing,$\texttt{inl}$, $\texttt{inr}$, $\texttt{force}$), while elimination forms are all of the elimination forms for computation types ($\pi$, $\pi'$, application and binding); such ``bityped'' languages -are related to \cite{girard01locussolum,zeilberger09thesis}. +are related to \citet{girard01locussolum,zeilberger09thesis}. \begin{shortonly} In the extended version, we give an extension of GTT axiomatizing this implementation of the dynamic types. @@ -4700,7 +4699,7 @@ lazy product type $\with$. Normally, these are not included in this way, but rather sums are encoded by making each case use a fresh constructor (using nominal techniques like opaque structs in Racket) and then making the sum the -union of the constructors, as argued in \cite{siekth16recursiveunion}. +union of the constructors, as argued in \citet{siekth16recursiveunion}. % We leave modeling this nominal structure to future work, but in minimalist languages, such as simple dialects of Scheme and Lisp, sum @@ -5384,7 +5383,7 @@ translate a heterogeneous term dynamism to a homogeneous inequality \end{theorem} \begin{longonly} - Relative to previous work on graduality (\cite{newahmed18}), + Relative to previous work on graduality \citep{newahmed18}, the distinction between complex value upcasts and complex stack downcasts guides the formulation of the theorem; e.g. using upcasts in the left-hand theorem would require more thunks/forces. @@ -6741,7 +6740,7 @@ strictness). \end{longonly} % \ This translation clarifies the behavioral meaning of complex values and -stacks, following \cite{munchmaccagnoni14nonassociative, +stacks, following \citet{munchmaccagnoni14nonassociative, fuhrmann1999direct}, and therefore of upcasts and downcasts. \begin{longonly} This is related to completeness of focusing: it moves inversion rules @@ -7068,7 +7067,7 @@ $\kw{split}$ are not evaluation contexts in the operational semantics). \end{longonly} -Levy~\cite{levy03cbpvbook} translates \cbpvstar\/ to \cbpv, but not does not prove +\citet{levy03cbpvbook} translates \cbpvstar\/ to \cbpv, but not does not prove the inequality preservation that we require here, so we give an alternative translation for which this property is easy to @@ -7112,7 +7111,7 @@ the same type. The \emph{de-complexification} procedure is defined as follows. % We note that this translation is not the one presented in -\cite{levy03cbpvbook}, but rather a more inefficient version that, in CPS +\citet{levy03cbpvbook}, but rather a more inefficient version that, in CPS terminology, introduces many administrative redices. % Since we are only proving results up to observational equivalence @@ -7839,7 +7838,7 @@ in figure \end{figure} \fi -This is morally the same as in \cite{levy03cbpvbook}, but we present +This is morally the same as in \citet{levy03cbpvbook}, but we present stacks in a manner similar to Hieb-Felleisen style evaluation contexts\iflong(rather than as an explicit stack machine with stack frames)\fi. % @@ -7987,7 +7986,7 @@ least preorders so we write $\apreorder$ rather than $\sim$. \noindent Three important relations arise as liftings: Equality of results lifts to observational equivalence ($\ctxize=$). The preorder generated by $\err \ltdyn R$ (i.e. the other three results are unrelated maximal elements) lifts to the notion of -\emph{error approximation} used in \cite{newahmed18} to prove the +\emph{error approximation} used in \citet{newahmed18} to prove the graduality property ($\ctxize\ltdyn$). The preorder generated by $\diverge \preceq R$ lifts to the standard notion of \emph{divergence approximation} ($\ctxize\preceq$). @@ -7998,7 +7997,7 @@ The most famous use of lifting is for observational equivalence, which is the lifting of equality of results ($\ctxize=$), and we will show that $\equidyn$ proofs in GTT imply observational equivalences. % -However, as shown in \cite{newahmed18}, the graduality property is +However, as shown in \citet{newahmed18}, the graduality property is defined in terms of an observational \emph{approximation} relation $\ltdyn$ that places $\err$ as the least element, and every other element as a maximal element. @@ -8157,7 +8156,7 @@ approximation, i.e., $M \ctxize= N$ if and only if $M \ctxize\preceq N$ and $N \ctxize\preceq M$, and since $\preceq$ has $\diverge$ as a least element, we can use a step-indexed relation to prove it. % -As shown in \cite{newahmed18}, a similar trick works for error +As shown in \citet{newahmed18}, a similar trick works for error approximation, but since $\ltdyn$ is \emph{not} an equivalence relation, we decompose it rather into two \emph{different} orderings: error approximation up to divergence on the left $\errordivergeleft$ and @@ -8251,8 +8250,8 @@ So using Lemmas~\ref{lem:ctx-commutes-conjunction}, \ref{lem:ctx-commutes-dual} \end{proof} As a corollary, the decomposition of contextual equivalence into diverge -approximation in \cite{ahmed06:lr} and the decomposition of dynamism in -\cite{newahmed18} are really the same trick: +approximation in \citet{ahmed06:lr} and the decomposition of dynamism in +\citet{newahmed18} are really the same trick: \begin{corollary}[Contextual Decomposition] ~~~ \label{cor:contextual-decomposition} \begin{enumerate} \item $\ctxize= \mathbin{\Leftrightarrow} \ctxize{\preceq} \wedge @@ -9373,7 +9372,7 @@ implementation theorems (e.g. $\upcast{\mathtt{list}(A)}{\mathtt{list}(A')} \equidyn \mathtt{map}\upcast{A}{A'}$). Additionally, since more efficient cast implementations such as optimized cast calculi (the lazy variant in -\cite{herman2010spaceefficient}) and threesome +\citet{herman2010spaceefficient}) and threesome casts~\cite{siekwadler10zigzag}, are equivalent to the lazy contract semantics, they should also be models of GTT, and if so we could use GTT to reason about program transformations and optimizations in them. @@ -9449,7 +9448,7 @@ x. \Omega$ will not. % This is a crucial feature of Haskell and is a major source of differences between implementations of lazy contracts, as noted in -\cite{Degen2012TheIO}. +\citet{Degen2012TheIO}. % We can understand this difference by using a different translation into call-by-push-value: what Levy calls the ``lazy paradigm'', as @@ -9464,7 +9463,7 @@ With this embedding and the uniqueness theorem, GTT produces a definition for lazy casts, and the definition matches the work of \citet{XuPJC09} when restricting to non-dependent contracts. -Some recent work \cite{greenmanfelleisen:2018} gives a spectrum of +\citet{greenmanfelleisen:2018} gives a spectrum of differing syntactic type soundness theorems for different semantics of gradual typing. % @@ -9664,7 +9663,7 @@ interpretation. % It may be of interest to develop a more general notion of model of GTT for which we can prove soundness and completeness theorems, as in -\cite{newlicata2018-fscd}. +\citet{newlicata2018-fscd}. % A model would be a strong adjunction between double categories where one of the double categories has all ``companions'' and the other has