diff options
Diffstat (limited to 'doc/refman/Cases.tex')
-rw-r--r-- | doc/refman/Cases.tex | 750 |
1 files changed, 750 insertions, 0 deletions
diff --git a/doc/refman/Cases.tex b/doc/refman/Cases.tex new file mode 100644 index 00000000..6f58269f --- /dev/null +++ b/doc/refman/Cases.tex @@ -0,0 +1,750 @@ +\achapter{Extended pattern-matching} +%BEGIN LATEX +\defaultheaders +%END LATEX +\aauthor{Cristina Cornes and Hugo Herbelin} + +\label{Mult-match-full} +\ttindex{Cases} +\index{ML-like patterns} + +This section describes the full form of pattern-matching in {\Coq} terms. + +\asection{Patterns}\label{implementation} The full syntax of {\tt +match} is presented in Figures~\ref{term-syntax} +and~\ref{term-syntax-aux}. Identifiers in patterns are either +constructor names or variables. Any identifier that is not the +constructor of an inductive or coinductive type is considered to be a +variable. A variable name cannot occur more than once in a given +pattern. It is recommended to start variable names by a lowercase +letter. + +If a pattern has the form $(c~\vec{x})$ where $c$ is a constructor +symbol and $\vec{x}$ is a linear vector of (distinct) variables, it is +called {\em simple}: it is the kind of pattern recognized by the basic +version of {\tt match}. On the opposite, if it is a variable $x$ or +has the form $(c~\vec{p})$ with $p$ not only made of variables, the +pattern is called {\em nested}. + +A variable pattern matches any value, and the identifier is bound to +that value. The pattern ``\texttt{\_}'' (called ``don't care'' or +``wildcard'' symbol) also matches any value, but does not bind +anything. It may occur an arbitrary number of times in a +pattern. Alias patterns written \texttt{(}{\sl pattern} \texttt{as} +{\sl identifier}\texttt{)} are also accepted. This pattern matches the +same values as {\sl pattern} does and {\sl identifier} is bound to the +matched value. +A pattern of the form {\pattern}{\tt |}{\pattern} is called +disjunctive. A list of patterns separated with commas is also +considered as a pattern and is called {\em multiple pattern}. However +multiple patterns can only occur at the root of pattern-matching +equations. Disjunctions of {\em multiple pattern} are allowed though. + +Since extended {\tt match} expressions are compiled into the primitive +ones, the expressiveness of the theory remains the same. Once the +stage of parsing has finished only simple patterns remain. Re-nesting +of pattern is performed at printing time. An easy way to see the +result of the expansion is to toggle off the nesting performed at +printing (use here {\tt Set Printing Matching}), then by printing the term +with \texttt{Print} if the term is a constant, or using the command +\texttt{Check}. + +The extended \texttt{match} still accepts an optional {\em elimination +predicate} given after the keyword \texttt{return}. Given a pattern +matching expression, if all the right-hand-sides of \texttt{=>} ({\em +rhs} in short) have the same type, then this type can be sometimes +synthesized, and so we can omit the \texttt{return} part. Otherwise +the predicate after \texttt{return} has to be provided, like for the basic +\texttt{match}. + +Let us illustrate through examples the different aspects of extended +pattern matching. Consider for example the function that computes the +maximum of two natural numbers. We can write it in primitive syntax +by: + +\begin{coq_example} +Fixpoint max (n m:nat) {struct m} : nat := + match n with + | O => m + | S n' => match m with + | O => S n' + | S m' => S (max n' m') + end + end. +\end{coq_example} + +\paragraph{Multiple patterns} + +Using multiple patterns in the definition of {\tt max} allows to write: + +\begin{coq_example} +Reset max. +Fixpoint max (n m:nat) {struct m} : nat := + match n, m with + | O, _ => m + | S n', O => S n' + | S n', S m' => S (max n' m') + end. +\end{coq_example} + +which will be compiled into the previous form. + +The pattern-matching compilation strategy examines patterns from left +to right. A \texttt{match} expression is generated {\bf only} when +there is at least one constructor in the column of patterns. E.g. the +following example does not build a \texttt{match} expression. + +\begin{coq_example} +Check (fun x:nat => match x return nat with + | y => y + end). +\end{coq_example} + +\paragraph{Aliasing subpatterns} + +We can also use ``\texttt{as} {\ident}'' to associate a name to a +sub-pattern: + +\begin{coq_example} +Reset max. +Fixpoint max (n m:nat) {struct n} : nat := + match n, m with + | O, _ => m + | S n' as p, O => p + | S n', S m' => S (max n' m') + end. +\end{coq_example} + +\paragraph{Nested patterns} + +Here is now an example of nested patterns: + +\begin{coq_example} +Fixpoint even (n:nat) : bool := + match n with + | O => true + | S O => false + | S (S n') => even n' + end. +\end{coq_example} + +This is compiled into: + +\begin{coq_example} +Print even. +\end{coq_example} + +In the previous examples patterns do not conflict with, but +sometimes it is comfortable to write patterns that admit a non +trivial superposition. Consider +the boolean function \texttt{lef} that given two natural numbers +yields \texttt{true} if the first one is less or equal than the second +one and \texttt{false} otherwise. We can write it as follows: + +\begin{coq_example} +Fixpoint lef (n m:nat) {struct m} : bool := + match n, m with + | O, x => true + | x, O => false + | S n, S m => lef n m + end. +\end{coq_example} + +Note that the first and the second multiple pattern superpose because +the couple of values \texttt{O O} matches both. Thus, what is the result +of the function on those values? To eliminate ambiguity we use the +{\em textual priority rule}: we consider patterns ordered from top to +bottom, then a value is matched by the pattern at the $ith$ row if and +only if it is not matched by some pattern of a previous row. Thus in the +example, +\texttt{O O} is matched by the first pattern, and so \texttt{(lef O O)} +yields \texttt{true}. + +Another way to write this function is: + +\begin{coq_example} +Reset lef. +Fixpoint lef (n m:nat) {struct m} : bool := + match n, m with + | O, x => true + | S n, S m => lef n m + | _, _ => false + end. +\end{coq_example} + +Here the last pattern superposes with the first two. Because +of the priority rule, the last pattern +will be used only for values that do not match neither the first nor +the second one. + +Terms with useless patterns are not accepted by the +system. Here is an example: +% Test failure +\begin{coq_eval} +Set Printing Depth 50. + (********** The following is not correct and should produce **********) + (**************** Error: This clause is redundant ********************) +\end{coq_eval} +\begin{coq_example} +Check (fun x:nat => + match x with + | O => true + | S _ => false + | x => true + end). +\end{coq_example} + +\paragraph{Disjunctive patterns} + +Multiple patterns that share the same right-hand-side can be +factorized using the notation \nelist{\multpattern}{\tt |}. For instance, +{\tt max} can be rewritten as follows: + +\begin{coq_eval} +Reset max. +\end{coq_eval} +\begin{coq_example} +Fixpoint max (n m:nat) {struct m} : nat := + match n, m with + | S n', S m' => S (max n' m') + | 0, p | p, 0 => p + end. +\end{coq_example} + +Similarly, factorization of (non necessary multiple) patterns +that share the same variables is possible by using the notation +\nelist{\pattern}{\tt |}. Here is an example: + +\begin{coq_example} +Definition filter_2_4 (n:nat) : nat := + match n with + | 2 as m | 4 as m => m + | _ => 0 + end. +\end{coq_example} + +Here is another example using disjunctive subpatterns. + +\begin{coq_example} +Definition filter_some_square_corners (p:nat*nat) : nat*nat := + match p with + | ((2 as m | 4 as m), (3 as n | 5 as n)) => (m,n) + | _ => (0,0) + end. +\end{coq_example} + +\asection{About patterns of parametric types} +When matching objects of a parametric type, constructors in patterns +{\em do not expect} the parameter arguments. Their value is deduced +during expansion. +Consider for example the type of polymorphic lists: + +\begin{coq_example} +Inductive List (A:Set) : Set := + | nil : List A + | cons : A -> List A -> List A. +\end{coq_example} + +We can check the function {\em tail}: + +\begin{coq_example} +Check + (fun l:List nat => + match l with + | nil => nil nat + | cons _ l' => l' + end). +\end{coq_example} + + +When we use parameters in patterns there is an error message: +% Test failure +\begin{coq_eval} +Set Printing Depth 50. +(********** The following is not correct and should produce **********) +(******** Error: The constructor cons expects 2 arguments ************) +\end{coq_eval} +\begin{coq_example} +Check + (fun l:List nat => + match l with + | nil A => nil nat + | cons A _ l' => l' + end). +\end{coq_example} + + + +\asection{Matching objects of dependent types} +The previous examples illustrate pattern matching on objects of +non-dependent types, but we can also +use the expansion strategy to destructure objects of dependent type. +Consider the type \texttt{listn} of lists of a certain length: +\label{listn} + +\begin{coq_example} +Inductive listn : nat -> Set := + | niln : listn 0 + | consn : forall n:nat, nat -> listn n -> listn (S n). +\end{coq_example} + +\asubsection{Understanding dependencies in patterns} +We can define the function \texttt{length} over \texttt{listn} by: + +\begin{coq_example} +Definition length (n:nat) (l:listn n) := n. +\end{coq_example} + +Just for illustrating pattern matching, +we can define it by case analysis: + +\begin{coq_example} +Reset length. +Definition length (n:nat) (l:listn n) := + match l with + | niln => 0 + | consn n _ _ => S n + end. +\end{coq_example} + +We can understand the meaning of this definition using the +same notions of usual pattern matching. + +% +% Constraining of dependencies is not longer valid in V7 +% +\iffalse +Now suppose we split the second pattern of \texttt{length} into two +cases so to give an +alternative definition using nested patterns: +\begin{coq_example} +Definition length1 (n:nat) (l:listn n) := + match l with + | niln => 0 + | consn n _ niln => S n + | consn n _ (consn _ _ _) => S n + end. +\end{coq_example} + +It is obvious that \texttt{length1} is another version of +\texttt{length}. We can also give the following definition: +\begin{coq_example} +Definition length2 (n:nat) (l:listn n) := + match l with + | niln => 0 + | consn n _ niln => 1 + | consn n _ (consn m _ _) => S (S m) + end. +\end{coq_example} + +If we forget that \texttt{listn} is a dependent type and we read these +definitions using the usual semantics of pattern matching, we can conclude +that \texttt{length1} +and \texttt{length2} are different functions. +In fact, they are equivalent +because the pattern \texttt{niln} implies that \texttt{n} can only match +the value $0$ and analogously the pattern \texttt{consn} determines that \texttt{n} can +only match values of the form $(S~v)$ where $v$ is the value matched by +\texttt{m}. + +The converse is also true. If +we destructure the length value with the pattern \texttt{O} then the list +value should be $niln$. +Thus, the following term \texttt{length3} corresponds to the function +\texttt{length} but this time defined by case analysis on the dependencies instead of on the list: + +\begin{coq_example} +Definition length3 (n:nat) (l:listn n) := + match l with + | niln => 0 + | consn O _ _ => 1 + | consn (S n) _ _ => S (S n) + end. +\end{coq_example} + +When we have nested patterns of dependent types, the semantics of +pattern matching becomes a little more difficult because +the set of values that are matched by a sub-pattern may be conditioned by the +values matched by another sub-pattern. Dependent nested patterns are +somehow constrained patterns. +In the examples, the expansion of +\texttt{length1} and \texttt{length2} yields exactly the same term + but the +expansion of \texttt{length3} is completely different. \texttt{length1} and +\texttt{length2} are expanded into two nested case analysis on +\texttt{listn} while \texttt{length3} is expanded into a case analysis on +\texttt{listn} containing a case analysis on natural numbers inside. + + +In practice the user can think about the patterns as independent and +it is the expansion algorithm that cares to relate them. \\ +\fi +% +% +% + +\asubsection{When the elimination predicate must be provided} +The examples given so far do not need an explicit elimination predicate + because all the rhs have the same type and the +strategy succeeds to synthesize it. +Unfortunately when dealing with dependent patterns it often happens +that we need to write cases where the type of the rhs are +different instances of the elimination predicate. +The function \texttt{concat} for \texttt{listn} +is an example where the branches have different type +and we need to provide the elimination predicate: + +\begin{coq_example} +Fixpoint concat (n:nat) (l:listn n) (m:nat) (l':listn m) {struct l} : + listn (n + m) := + match l in listn n return listn (n + m) with + | niln => l' + | consn n' a y => consn (n' + m) a (concat n' y m l') + end. +\end{coq_example} +The elimination predicate is {\tt fun (n:nat) (l:listn n) => listn~(n+m)}. +In general if $m$ has type $(I~q_1\ldots q_r~t_1\ldots t_s)$ where +$q_1\ldots q_r$ are parameters, the elimination predicate should be of +the form~: +{\tt fun $y_1$\ldots $y_s$ $x$:($I$~$q_1$\ldots $q_r$~$y_1$\ldots + $y_s$) => Q}. + +In the concrete syntax, it should be written~: +\[ \kw{match}~m~\kw{as}~x~\kw{in}~(I~\_\ldots \_~y_1\ldots y_s)~\kw{return}~Q~\kw{with}~\ldots~\kw{end}\] + +The variables which appear in the \kw{in} and \kw{as} clause are new +and bounded in the property $Q$ in the \kw{return} clause. The +parameters of the inductive definitions should not be mentioned and +are replaced by \kw{\_}. + +Recall that a list of patterns is also a pattern. So, when +we destructure several terms at the same time and the branches have +different type we need to provide +the elimination predicate for this multiple pattern. +It is done using the same scheme, each term may be associated to an +\kw{as} and \kw{in} clause in order to introduce a dependent product. + +For example, an equivalent definition for \texttt{concat} (even though the matching on the second term is trivial) would have +been: + +\begin{coq_example} +Reset concat. +Fixpoint concat (n:nat) (l:listn n) (m:nat) (l':listn m) {struct l} : + listn (n + m) := + match l in listn n, l' return listn (n + m) with + | niln, x => x + | consn n' a y, x => consn (n' + m) a (concat n' y m x) + end. +\end{coq_example} + +% Notice that this time, the predicate \texttt{[n,\_:nat](listn (plus n +% m))} is binary because we +% destructure both \texttt{l} and \texttt{l'} whose types have arity one. +% In general, if we destructure the terms $e_1\ldots e_n$ +% the predicate will be of arity $m$ where $m$ is the sum of the +% number of dependencies of the type of $e_1, e_2,\ldots e_n$ +% (the $\lambda$-abstractions +% should correspond from left to right to each dependent argument of the +% type of $e_1\ldots e_n$). +When the arity of the predicate (i.e. number of abstractions) is not +correct Coq raises an error message. For example: + +% Test failure +\begin{coq_eval} +Reset concat. +Set Printing Depth 50. +(********** The following is not correct and should produce ***********) +(** Error: the term l' has type listn m while it is expected to have **) +(** type listn (?31 + ?32) **) +\end{coq_eval} +\begin{coq_example} +Fixpoint concat + (n:nat) (l:listn n) (m:nat) + (l':listn m) {struct l} : listn (n + m) := + match l, l' with + | niln, x => x + | consn n' a y, x => consn (n' + m) a (concat n' y m x) + end. +\end{coq_example} + +\asection{Using pattern matching to write proofs} +In all the previous examples the elimination predicate does not depend +on the object(s) matched. But it may depend and the typical case +is when we write a proof by induction or a function that yields an +object of dependent type. An example of proof using \texttt{match} in +given in Section~\ref{refine-example}. + +For example, we can write +the function \texttt{buildlist} that given a natural number +$n$ builds a list of length $n$ containing zeros as follows: + +\begin{coq_example} +Fixpoint buildlist (n:nat) : listn n := + match n return listn n with + | O => niln + | S n => consn n 0 (buildlist n) + end. +\end{coq_example} + +We can also use multiple patterns. +Consider the following definition of the predicate less-equal +\texttt{Le}: + +\begin{coq_example} +Inductive LE : nat -> nat -> Prop := + | LEO : forall n:nat, LE 0 n + | LES : forall n m:nat, LE n m -> LE (S n) (S m). +\end{coq_example} + +We can use multiple patterns to write the proof of the lemma + \texttt{forall (n m:nat), (LE n m)}\verb=\/=\texttt{(LE m n)}: + +\begin{coq_example} +Fixpoint dec (n m:nat) {struct n} : LE n m \/ LE m n := + match n, m return LE n m \/ LE m n with + | O, x => or_introl (LE x 0) (LEO x) + | x, O => or_intror (LE x 0) (LEO x) + | S n as n', S m as m' => + match dec n m with + | or_introl h => or_introl (LE m' n') (LES n m h) + | or_intror h => or_intror (LE n' m') (LES m n h) + end + end. +\end{coq_example} +In the example of \texttt{dec}, +the first \texttt{match} is dependent while +the second is not. + +% In general, consider the terms $e_1\ldots e_n$, +% where the type of $e_i$ is an instance of a family type +% $\lb (\vec{d_i}:\vec{D_i}) \mto T_i$ ($1\leq i +% \leq n$). Then, in expression \texttt{match} $e_1,\ldots, +% e_n$ \texttt{of} \ldots \texttt{end}, the +% elimination predicate ${\cal P}$ should be of the form: +% $[\vec{d_1}:\vec{D_1}][x_1:T_1]\ldots [\vec{d_n}:\vec{D_n}][x_n:T_n]Q.$ + +The user can also use \texttt{match} in combination with the tactic +\texttt{refine} (see Section~\ref{refine}) to build incomplete proofs +beginning with a \texttt{match} construction. + +\asection{Pattern-matching on inductive objects involving local +definitions} + +If local definitions occur in the type of a constructor, then there +are two ways to match on this constructor. Either the local +definitions are skipped and matching is done only on the true arguments +of the constructors, or the bindings for local definitions can also +be caught in the matching. + +Example. + +\begin{coq_eval} +Reset Initial. +Require Import Arith. +\end{coq_eval} + +\begin{coq_example*} +Inductive list : nat -> Set := + | nil : list 0 + | cons : forall n:nat, let m := (2 * n) in list m -> list (S (S m)). +\end{coq_example*} + +In the next example, the local definition is not caught. + +\begin{coq_example} +Fixpoint length n (l:list n) {struct l} : nat := + match l with + | nil => 0 + | cons n l0 => S (length (2 * n) l0) + end. +\end{coq_example} + +But in this example, it is. + +\begin{coq_example} +Fixpoint length' n (l:list n) {struct l} : nat := + match l with + | nil => 0 + | cons _ m l0 => S (length' m l0) + end. +\end{coq_example} + +\Rem for a given matching clause, either none of the local +definitions or all of them can be caught. + +\asection{Pattern-matching and coercions} + +If a mismatch occurs between the expected type of a pattern and its +actual type, a coercion made from constructors is sought. If such a +coercion can be found, it is automatically inserted around the +pattern. + +Example: + +\begin{coq_example} +Inductive I : Set := + | C1 : nat -> I + | C2 : I -> I. +Coercion C1 : nat >-> I. +Check (fun x => match x with + | C2 O => 0 + | _ => 0 + end). +\end{coq_example} + + +\asection{When does the expansion strategy fail ?}\label{limitations} +The strategy works very like in ML languages when treating +patterns of non-dependent type. +But there are new cases of failure that are due to the presence of +dependencies. + +The error messages of the current implementation may be sometimes +confusing. When the tactic fails because patterns are somehow +incorrect then error messages refer to the initial expression. But the +strategy may succeed to build an expression whose sub-expressions are +well typed when the whole expression is not. In this situation the +message makes reference to the expanded expression. We encourage +users, when they have patterns with the same outer constructor in +different equations, to name the variable patterns in the same +positions with the same name. +E.g. to write {\small\texttt{(cons n O x) => e1}} +and {\small\texttt{(cons n \_ x) => e2}} instead of +{\small\texttt{(cons n O x) => e1}} and +{\small\texttt{(cons n' \_ x') => e2}}. +This helps to maintain certain name correspondence between the +generated expression and the original. + +Here is a summary of the error messages corresponding to each situation: + +\begin{ErrMsgs} +\item \sverb{The constructor } {\sl + ident} \sverb{ expects } {\sl num} \sverb{ arguments} + + \sverb{The variable } {\sl ident} \sverb{ is bound several times + in pattern } {\sl term} + + \sverb{Found a constructor of inductive type } {\term} + \sverb{ while a constructor of } {\term} \sverb{ is expected} + + Patterns are incorrect (because constructors are not applied to + the correct number of the arguments, because they are not linear or + they are wrongly typed). + +\item \errindex{Non exhaustive pattern-matching} + +The pattern matching is not exhaustive. + +\item \sverb{The elimination predicate } {\sl term} \sverb{ should be + of arity } {\sl num} \sverb{ (for non dependent case) or } {\sl + num} \sverb{ (for dependent case)} + +The elimination predicate provided to \texttt{match} has not the + expected arity. + + +%\item the whole expression is wrongly typed + +% CADUC ? +% , or the synthesis of +% implicit arguments fails (for example to find the elimination +% predicate or to resolve implicit arguments in the rhs). + +% There are {\em nested patterns of dependent type}, the elimination +% predicate corresponds to non-dependent case and has the form +% $[x_1:T_1]...[x_n:T_n]T$ and {\bf some} $x_i$ occurs {\bf free} in +% $T$. Then, the strategy may fail to find out a correct elimination +% predicate during some step of compilation. In this situation we +% recommend the user to rewrite the nested dependent patterns into +% several \texttt{match} with {\em simple patterns}. + +\item {\tt Unable to infer a match predicate\\ + Either there is a type incompatiblity or the problem involves\\ + dependencies} + + There is a type mismatch between the different branches. + The user should provide an elimination predicate. + +% Obsolete ? +% \item because of nested patterns, it may happen that even though all +% the rhs have the same type, the strategy needs dependent elimination +% and so an elimination predicate must be provided. The system warns +% about this situation, trying to compile anyway with the +% non-dependent strategy. The risen message is: + +% \begin{itemize} +% \item {\tt Warning: This pattern matching may need dependent +% elimination to be compiled. I will try, but if fails try again +% giving dependent elimination predicate.} +% \end{itemize} + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +% % LA PROPAGATION DES CONTRAINTES ARRIERE N'EST PAS FAITE DANS LA V7 +% TODO +% \item there are {\em nested patterns of dependent type} and the +% strategy builds a term that is well typed but recursive calls in fix +% point are reported as illegal: +% \begin{itemize} +% \item {\tt Error: Recursive call applied to an illegal term ...} +% \end{itemize} + +% This is because the strategy generates a term that is correct w.r.t. +% the initial term but which does not pass the guard condition. In +% this situation we recommend the user to transform the nested dependent +% patterns into {\em several \texttt{match} of simple patterns}. Let us +% explain this with an example. Consider the following definition of a +% function that yields the last element of a list and \texttt{O} if it is +% empty: + +% \begin{coq_example} +% Fixpoint last [n:nat; l:(listn n)] : nat := +% match l of +% (consn _ a niln) => a +% | (consn m _ x) => (last m x) | niln => O +% end. +% \end{coq_example} + +% It fails because of the priority between patterns, we know that this +% definition is equivalent to the following more explicit one (which +% fails too): + +% \begin{coq_example*} +% Fixpoint last [n:nat; l:(listn n)] : nat := +% match l of +% (consn _ a niln) => a +% | (consn n _ (consn m b x)) => (last n (consn m b x)) +% | niln => O +% end. +% \end{coq_example*} + +% Note that the recursive call {\tt (last n (consn m b x))} is not +% guarded. When treating with patterns of dependent types the strategy +% interprets the first definition of \texttt{last} as the second +% one\footnote{In languages of the ML family the first definition would +% be translated into a term where the variable \texttt{x} is shared in +% the expression. When patterns are of non-dependent types, Coq +% compiles as in ML languages using sharing. When patterns are of +% dependent types the compilation reconstructs the term as in the +% second definition of \texttt{last} so to ensure the result of +% expansion is well typed.}. Thus it generates a term where the +% recursive call is rejected by the guard condition. + +% You can get rid of this problem by writing the definition with +% \emph{simple patterns}: + +% \begin{coq_example} +% Fixpoint last [n:nat; l:(listn n)] : nat := +% <[_:nat]nat>match l of +% (consn m a x) => Cases x of niln => a | _ => (last m x) end +% | niln => O +% end. +% \end{coq_example} + +\end{ErrMsgs} + + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "Reference-Manual" +%%% End: |