diff options
Diffstat (limited to 'doc/refman/Extraction.tex')
-rw-r--r-- | doc/refman/Extraction.tex | 551 |
1 files changed, 0 insertions, 551 deletions
diff --git a/doc/refman/Extraction.tex b/doc/refman/Extraction.tex deleted file mode 100644 index af5d4049..00000000 --- a/doc/refman/Extraction.tex +++ /dev/null @@ -1,551 +0,0 @@ -\achapter{Extraction of programs in Objective Caml and Haskell} -\label{Extraction} -\aauthor{Jean-Christophe Filliātre and Pierre Letouzey} -\index{Extraction} - -We present here the \Coq\ extraction commands, used to build certified -and relatively efficient functional programs, extracting them from -either \Coq\ functions or \Coq\ proofs of specifications. The -functional languages available as output are currently \ocaml{}, -\textsc{Haskell} and \textsc{Scheme}. In the following, ``ML'' will -be used (abusively) to refer to any of the three. - -\paragraph{Differences with old versions.} -The current extraction mechanism is new for version 7.0 of {\Coq}. -In particular, the \FW\ toplevel used as an intermediate step between -\Coq\ and ML has been withdrawn. It is also not possible -any more to import ML objects in this \FW\ toplevel. -The current mechanism also differs from -the one in previous versions of \Coq: there is no more -an explicit toplevel for the language (formerly called \textsc{Fml}). - -\asection{Generating ML code} -\comindex{Extraction} -\comindex{Recursive Extraction} -\comindex{Extraction Module} -\comindex{Recursive Extraction Module} - -The next two commands are meant to be used for rapid preview of -extraction. They both display extracted term(s) inside \Coq. - -\begin{description} -\item {\tt Extraction \qualid.} ~\par - Extracts one constant or module in the \Coq\ toplevel. - -\item {\tt Recursive Extraction \qualid$_1$ \dots\ \qualid$_n$.} ~\par - Recursive extraction of all the globals (or modules) \qualid$_1$ \dots\ - \qualid$_n$ and all their dependencies in the \Coq\ toplevel. -\end{description} - -%% TODO error messages - -All the following commands produce real ML files. User can choose to produce -one monolithic file or one file per \Coq\ library. - -\begin{description} -\item {\tt Extraction "{\em file}"} - \qualid$_1$ \dots\ \qualid$_n$. ~\par - Recursive extraction of all the globals (or modules) \qualid$_1$ \dots\ - \qualid$_n$ and all their dependencies in one monolithic file {\em file}. - Global and local identifiers are renamed according to the chosen ML - language to fulfill its syntactic conventions, keeping original - names as much as possible. - -\item {\tt Extraction Library} \ident. ~\par - Extraction of the whole \Coq\ library {\tt\ident.v} to an ML module - {\tt\ident.ml}. In case of name clash, identifiers are here renamed - using prefixes \verb!coq_! or \verb!Coq_! to ensure a - session-independent renaming. - -\item {\tt Recursive Extraction Library} \ident. ~\par - Extraction of the \Coq\ library {\tt\ident.v} and all other modules - {\tt\ident.v} depends on. -\end{description} - -The list of globals \qualid$_i$ does not need to be -exhaustive: it is automatically completed into a complete and minimal -environment. - -\asection{Extraction options} - -\asubsection{Setting the target language} -\comindex{Extraction Language} - -The ability to fix target language is the first and more important -of the extraction options. Default is Ocaml. -\begin{description} -\item {\tt Extraction Language Ocaml}. -\item {\tt Extraction Language Haskell}. -\item {\tt Extraction Language Scheme}. -\end{description} - -\asubsection{Inlining and optimizations} - -Since Objective Caml is a strict language, the extracted -code has to be optimized in order to be efficient (for instance, when -using induction principles we do not want to compute all the recursive -calls but only the needed ones). So the extraction mechanism provides -an automatic optimization routine that will be -called each time the user want to generate Ocaml programs. Essentially, -it performs constants inlining and reductions. Therefore some -constants may not appear in resulting monolithic Ocaml program. -In the case of modular extraction, even if some inlining is done, the -inlined constant are nevertheless printed, to ensure -session-independent programs. - -Concerning Haskell, such optimizations are less useful because of -lazyness. We still make some optimizations, for example in order to -produce more readable code. - -All these optimizations are controled by the following \Coq\ options: - -\begin{description} - -\item \comindex{Set Extraction Optimize} -{\tt Set Extraction Optimize.} - -\item \comindex{Unset Extraction Optimize} -{\tt Unset Extraction Optimize.} - -Default is Set. This control all optimizations made on the ML terms -(mostly reduction of dummy beta/iota redexes, but also simplifications on -Cases, etc). Put this option to Unset if you want a ML term as close as -possible to the Coq term. - -\item \comindex{Set Extraction AutoInline} -{\tt Set Extraction AutoInline.} - -\item \comindex{Unset Extraction AutoInline} -{\tt Unset Extraction AutoInline.} - -Default is Set, so by default, the extraction mechanism feels free to -inline the bodies of some defined constants, according to some heuristics -like size of bodies, useness of some arguments, etc. Those heuristics are -not always perfect, you may want to disable this feature, do it by Unset. - -\item \comindex{Extraction Inline} -{\tt Extraction Inline} \qualid$_1$ \dots\ \qualid$_n$. - -\item \comindex{Extraction NoInline} -{\tt Extraction NoInline} \qualid$_1$ \dots\ \qualid$_n$. - -In addition to the automatic inline feature, you can now tell precisely to -inline some more constants by the {\tt Extraction Inline} command. Conversely, -you can forbid the automatic inlining of some specific constants by -the {\tt Extraction NoInline} command. -Those two commands enable a precise control of what is inlined and what is not. - -\item \comindex{Print Extraction Inline} -{\tt Print Extraction Inline}. - -Prints the current state of the table recording the custom inlinings -declared by the two previous commands. - -\item \comindex{Reset Extraction Inline} -{\tt Reset Extraction Inline}. - -Puts the table recording the custom inlinings back to empty. - -\end{description} - - -\paragraph{Inlining and printing of a constant declaration.} - -A user can explicitly ask for a constant to be extracted by two means: -\begin{itemize} -\item by mentioning it on the extraction command line -\item by extracting the whole \Coq\ module of this constant. -\end{itemize} -In both cases, the declaration of this constant will be present in the -produced file. -But this same constant may or may not be inlined in the following -terms, depending on the automatic/custom inlining mechanism. - - -For the constants non-explicitly required but needed for dependency -reasons, there are two cases: -\begin{itemize} -\item If an inlining decision is taken, whether automatically or not, -all occurrences of this constant are replaced by its extracted body, and -this constant is not declared in the generated file. -\item If no inlining decision is taken, the constant is normally - declared in the produced file. -\end{itemize} - -\asubsection{Extra elimination of useless arguments} - -\begin{description} -\item \comindex{Extraction Implicit} - {\tt Extraction Implicit} \qualid\ [ \ident$_1$ \dots\ \ident$_n$ ]. - -This experimental command allows to declare some arguments of -\qualid\ as implicit, i.e. useless in extracted code and hence to -be removed by extraction. Here \qualid\ can be any function or -inductive constructor, and \ident$_i$ are the names of the concerned -arguments. In fact, an argument can also be referred by a number -indicating its position, starting from 1. When an actual extraction -takes place, an error is raised if the {\tt Extraction Implicit} -declarations cannot be honored, that is if any of the implicited -variables still occurs in the final code. This declaration of useless -arguments is independent but complementary to the main elimination -principles of extraction (logical parts and types). -\end{description} - -\asubsection{Realizing axioms}\label{extraction:axioms} - -Extraction will fail if it encounters an informative -axiom not realized (see Section~\ref{extraction:axioms}). -A warning will be issued if it encounters an logical axiom, to remind -user that inconsistent logical axioms may lead to incorrect or -non-terminating extracted terms. - -It is possible to assume some axioms while developing a proof. Since -these axioms can be any kind of proposition or object or type, they may -perfectly well have some computational content. But a program must be -a closed term, and of course the system cannot guess the program which -realizes an axiom. Therefore, it is possible to tell the system -what ML term corresponds to a given axiom. - -\comindex{Extract Constant} -\begin{description} -\item{\tt Extract Constant \qualid\ => \str.} ~\par - Give an ML extraction for the given constant. - The \str\ may be an identifier or a quoted string. -\item{\tt Extract Inlined Constant \qualid\ => \str.} ~\par - Same as the previous one, except that the given ML terms will - be inlined everywhere instead of being declared via a let. -\end{description} - -Note that the {\tt Extract Inlined Constant} command is sugar -for an {\tt Extract Constant} followed by a {\tt Extraction Inline}. -Hence a {\tt Reset Extraction Inline} will have an effect on the -realized and inlined axiom. - -Of course, it is the responsibility of the user to ensure that the ML -terms given to realize the axioms do have the expected types. In -fact, the strings containing realizing code are just copied in the -extracted files. The extraction recognizes whether the realized axiom -should become a ML type constant or a ML object declaration. - -\Example -\begin{coq_example} -Axiom X:Set. -Axiom x:X. -Extract Constant X => "int". -Extract Constant x => "0". -\end{coq_example} - -Notice that in the case of type scheme axiom (i.e. whose type is an -arity, that is a sequence of product finished by a sort), then some type -variables has to be given. The syntax is then: - -\begin{description} -\item{\tt Extract Constant \qualid\ \str$_1$ \ldots \str$_n$ => \str.} ~\par -\end{description} - -The number of type variables is checked by the system. - -\Example -\begin{coq_example} -Axiom Y : Set -> Set -> Set. -Extract Constant Y "'a" "'b" => " 'a*'b ". -\end{coq_example} - -Realizing an axiom via {\tt Extract Constant} is only useful in the -case of an informative axiom (of sort Type or Set). A logical axiom -have no computational content and hence will not appears in extracted -terms. But a warning is nonetheless issued if extraction encounters a -logical axiom. This warning reminds user that inconsistent logical -axioms may lead to incorrect or non-terminating extracted terms. - -If an informative axiom has not been realized before an extraction, a -warning is also issued and the definition of the axiom is filled with -an exception labeled {\tt AXIOM TO BE REALIZED}. The user must then -search these exceptions inside the extracted file and replace them by -real code. - -\comindex{Extract Inductive} - -The system also provides a mechanism to specify ML terms for inductive -types and constructors. For instance, the user may want to use the ML -native boolean type instead of \Coq\ one. The syntax is the following: - -\begin{description} -\item{\tt Extract Inductive \qualid\ => \str\ [ \str\ \dots \str\ ]\ -{\it optstring}.} ~\par - Give an ML extraction for the given inductive type. You must specify - extractions for the type itself (first \str) and all its - constructors (between square brackets). If given, the final optional - string should contain a function emulating pattern-matching over this - inductive type. If this optional string is not given, the ML - extraction must be an ML inductive datatype, and the native - pattern-matching of the language will be used. -\end{description} - -For an inductive type with $k$ constructor, the function used to -emulate the match should expect $(k+1)$ arguments, first the $k$ -branches in functional form, and then the inductive element to -destruct. For instance, the match branch \verb$| S n => foo$ gives the -functional form \verb$(fun n -> foo)$. Note that a constructor with no -argument is considered to have one unit argument, in order to block -early evaluation of the branch: \verb$| O => bar$ leads to the functional -form \verb$(fun () -> bar)$. For instance, when extracting {\tt nat} -into {\tt int}, the code to provide has type: -{\tt (unit->'a)->(int->'a)->int->'a}. - -As for {\tt Extract Inductive}, this command should be used with care: -\begin{itemize} -\item The ML code provided by the user is currently \emph{not} checked at all by - extraction, even for syntax errors. - -\item Extracting an inductive type to a pre-existing ML inductive type -is quite sound. But extracting to a general type (by providing an -ad-hoc pattern-matching) will often \emph{not} be fully rigorously -correct. For instance, when extracting {\tt nat} to Ocaml's {\tt -int}, it is theoretically possible to build {\tt nat} values that are -larger than Ocaml's {\tt max\_int}. It is the user's responsability to -be sure that no overflow or other bad events occur in practice. - -\item Translating an inductive type to an ML type does \emph{not} -magically improve the asymptotic complexity of functions, even if the -ML type is an efficient representation. For instance, when extracting -{\tt nat} to Ocaml's {\tt int}, the function {\tt mult} stays -quadratic. It might be interesting to associate this translation with -some specific {\tt Extract Constant} when primitive counterparts exist. -\end{itemize} - -\Example -Typical examples are the following: -\begin{coq_example} -Extract Inductive unit => "unit" [ "()" ]. -Extract Inductive bool => "bool" [ "true" "false" ]. -Extract Inductive sumbool => "bool" [ "true" "false" ]. -\end{coq_example} - -If an inductive constructor or type has arity 2 and the corresponding -string is enclosed by parenthesis, then the rest of the string is used -as infix constructor or type. -\begin{coq_example} -Extract Inductive list => "list" [ "[]" "(::)" ]. -Extract Inductive prod => "(*)" [ "(,)" ]. -\end{coq_example} - -As an example of translation to a non-inductive datatype, let's turn -{\tt nat} into Ocaml's {\tt int} (see caveat above): -\begin{coq_example} -Extract Inductive nat => int [ "0" "succ" ] - "(fun fO fS n => if n=0 then fO () else fS (n-1))". -\end{coq_example} - -\asubsection{Avoiding conflicts with existing filenames} - -\comindex{Extraction Blacklist} - -When using {\tt Extraction Library}, the names of the extracted files -directly depends from the names of the \Coq\ files. It may happen that -these filenames are in conflict with already existing files, -either in the standard library of the target language or in other -code that is meant to be linked with the extracted code. -For instance the module {\tt List} exists both in \Coq\ and in Ocaml. -It is possible to instruct the extraction not to use particular filenames. - -\begin{description} -\item{\tt Extraction Blacklist \ident \ldots \ident.} ~\par - Instruct the extraction to avoid using these names as filenames - for extracted code. -\item{\tt Print Extraction Blacklist.} ~\par - Show the current list of filenames the extraction should avoid. -\item{\tt Reset Extraction Blacklist.} ~\par - Allow the extraction to use any filename. -\end{description} - -For Ocaml, a typical use of these commands is -{\tt Extraction Blacklist String List}. - -\asection{Differences between \Coq\ and ML type systems} - - -Due to differences between \Coq\ and ML type systems, -some extracted programs are not directly typable in ML. -We now solve this problem (at least in Ocaml) by adding -when needed some unsafe casting {\tt Obj.magic}, which give -a generic type {\tt 'a} to any term. - -For example, here are two kinds of problem that can occur: - -\begin{itemize} - \item If some part of the program is {\em very} polymorphic, there - may be no ML type for it. In that case the extraction to ML works - all right but the generated code may be refused by the ML - type-checker. A very well known example is the {\em distr-pair} - function: -\begin{verbatim} -Definition dp := - fun (A B:Set)(x:A)(y:B)(f:forall C:Set, C->C) => (f A x, f B y). -\end{verbatim} - -In Ocaml, for instance, the direct extracted term would be: - -\begin{verbatim} -let dp x y f = Pair((f () x),(f () y)) -\end{verbatim} - -and would have type: -\begin{verbatim} -dp : 'a -> 'a -> (unit -> 'a -> 'b) -> ('b,'b) prod -\end{verbatim} - -which is not its original type, but a restriction. - -We now produce the following correct version: -\begin{verbatim} -let dp x y f = Pair ((Obj.magic f () x), (Obj.magic f () y)) -\end{verbatim} - - \item Some definitions of \Coq\ may have no counterpart in ML. This - happens when there is a quantification over types inside the type - of a constructor; for example: -\begin{verbatim} -Inductive anything : Set := dummy : forall A:Set, A -> anything. -\end{verbatim} - -which corresponds to the definition of an ML dynamic type. -In Ocaml, we must cast any argument of the constructor dummy. - -\end{itemize} - -Even with those unsafe castings, you should never get error like -``segmentation fault''. In fact even if your program may seem -ill-typed to the Ocaml type-checker, it can't go wrong: it comes -from a Coq well-typed terms, so for example inductives will always -have the correct number of arguments, etc. - -More details about the correctness of the extracted programs can be -found in \cite{Let02}. - -We have to say, though, that in most ``realistic'' programs, these -problems do not occur. For example all the programs of Coq library are -accepted by Caml type-checker without any {\tt Obj.magic} (see examples below). - - - -\asection{Some examples} - -We present here two examples of extractions, taken from the -\Coq\ Standard Library. We choose \ocaml\ as target language, -but all can be done in the other dialects with slight modifications. -We then indicate where to find other examples and tests of Extraction. - -\asubsection{A detailed example: Euclidean division} - -The file {\tt Euclid} contains the proof of Euclidean division -(theorem {\tt eucl\_dev}). The natural numbers defined in the example -files are unary integers defined by two constructors $O$ and $S$: -\begin{coq_example*} -Inductive nat : Set := - | O : nat - | S : nat -> nat. -\end{coq_example*} - -This module contains a theorem {\tt eucl\_dev}, whose type is: -\begin{verbatim} -forall b:nat, b > 0 -> forall a:nat, diveucl a b -\end{verbatim} -where {\tt diveucl} is a type for the pair of the quotient and the -modulo, plus some logical assertions that disappear during extraction. -We can now extract this program to \ocaml: - -\begin{coq_eval} -Reset Initial. -\end{coq_eval} -\begin{coq_example} -Require Import Euclid Wf_nat. -Extraction Inline gt_wf_rec lt_wf_rec induction_ltof2. -Recursive Extraction eucl_dev. -\end{coq_example} - -The inlining of {\tt gt\_wf\_rec} and others is not -mandatory. It only enhances readability of extracted code. -You can then copy-paste the output to a file {\tt euclid.ml} or let -\Coq\ do it for you with the following command: - -\begin{coq_example} -Extraction "euclid" eucl_dev. -\end{coq_example} - -Let us play the resulting program: - -\begin{verbatim} -# #use "euclid.ml";; -type nat = O | S of nat -type sumbool = Left | Right -val minus : nat -> nat -> nat = <fun> -val le_lt_dec : nat -> nat -> sumbool = <fun> -val le_gt_dec : nat -> nat -> sumbool = <fun> -type diveucl = Divex of nat * nat -val eucl_dev : nat -> nat -> diveucl = <fun> -# eucl_dev (S (S O)) (S (S (S (S (S O)))));; -- : diveucl = Divex (S (S O), S O) -\end{verbatim} -It is easier to test on \ocaml\ integers: -\begin{verbatim} -# let rec nat_of_int = function 0 -> O | n -> S (nat_of_int (n-1));; -val i2n : int -> nat = <fun> -# let rec int_of_nat = function O -> 0 | S p -> 1+(int_of_nat p);; -val n2i : nat -> int = <fun> -# let div a b = - let Divex (q,r) = eucl_dev (nat_of_int b) (nat_of_int a) - in (int_of_nat q, int_of_nat r);; -val div : int -> int -> int * int = <fun> -# div 173 15;; -- : int * int = (11, 8) -\end{verbatim} - -Note that these {\tt nat\_of\_int} and {\tt int\_of\_nat} are now -available via a mere {\tt Require Import ExtrOcamlIntConv} and then -adding these functions to the list of functions to extract. This file -{\tt ExtrOcamlIntConv.v} and some others in {\tt plugins/extraction/} -are meant to help building concrete program via extraction. - -\asubsection{Extraction's horror museum} - -Some pathological examples of extraction are grouped in the file -{\tt test-suite/success/extraction.v} of the sources of \Coq. - -\asubsection{Users' Contributions} - - Several of the \Coq\ Users' Contributions use extraction to produce - certified programs. In particular the following ones have an automatic - extraction test (just run {\tt make} in those directories): - - \begin{itemize} - \item Bordeaux/Additions - \item Bordeaux/EXCEPTIONS - \item Bordeaux/SearchTrees - \item Dyade/BDDS - \item Lannion - \item Lyon/CIRCUITS - \item Lyon/FIRING-SQUAD - \item Marseille/CIRCUITS - \item Muenchen/Higman - \item Nancy/FOUnify - \item Rocq/ARITH/Chinese - \item Rocq/COC - \item Rocq/GRAPHS - \item Rocq/HIGMAN - \item Sophia-Antipolis/Stalmarck - \item Suresnes/BDD - \end{itemize} - - Lannion, Rocq/HIGMAN and Lyon/CIRCUITS are a bit particular. They are - examples of developments where {\tt Obj.magic} are needed. - This is probably due to an heavy use of impredicativity. - After compilation those two examples run nonetheless, - thanks to the correction of the extraction~\cite{Let02}. - -% $Id: Extraction.tex 13153 2010-06-15 16:09:43Z letouzey $ - -%%% Local Variables: -%%% mode: latex -%%% TeX-master: "Reference-Manual" -%%% End: |