doc/Extraction.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230

\achapter{Execution of extracted programs in Objective Caml and Haskell}
\label{Extraction}
\aauthor{Jean-Christophe Filliâtre and Pierre Letouzey}
\index{Extraction}

\begin{flushleft}
  \em The status of extraction is experimental \\
  Haskell extraction is not yet implemented
\end{flushleft}

It is possible to use \Coq\ to build certified and relatively
efficient programs, extracting them from the proofs of their
specifications. The extracted objects can be 
obtained at the \Coq\ toplevel with the command {\tt Extraction}
(see \ref{ExtractionTerm}).

We present here a \Coq\ module, {\tt Extraction}, which translates the
extracted terms to ML dialects, namely Objective Caml and Haskell. 
In the following, we will not refer to a particular dialect
when possible and ``ML'' will be used to refer to any of the target
dialects.

\Rem
The current extraction mechanism is new from version 7.0 of {\Coq}.
In particular, the \FW\ toplevel used as intermediate step between 
\Coq\ and ML has been abandoned.  It is also not possible 
any more to import ML objects in this \FW\ toplevel.
The current mechanism also differs from
the one in previous versions of \Coq: there is no more
an explicit toplevel for the language (formerly called {\sf Fml}). 


%\section{Extraction facilities}
%
%(* TO DO *)
%
\begin{coq_eval}
  Require Extraction.
\end{coq_eval}

\medskip
In the first part of this document we describe the commands of the
{\tt Extraction} module, and we give some examples in the second part.

\asection{Generating ML code}
\comindex{Extraction}
\comindex{Recursive Extraction}
\comindex{Extraction Module}

The \Coq\ commands to generate ML code are:
\begin{description}
\item {\tt Extraction \term.} ~\par
  Extracts one term in the \Coq\ toplevel.
  Extracted terms are displayed with an \ocaml\ syntax, where globals
  are printed as in the \Coq\ toplevel (thus without any renaming).

\item {\tt Recursive Extraction  \qualid$_1$ \dots\ \qualid$_n$.} ~\par
  Recursive extraction of all the globals \qualid$_1$ \dots\
  \qualid$_n$ and all their dependencies in the \Coq\ toplevel.

\item {\tt Extraction "{\em file}"} {\em options} 
      \qualid$_1$ \dots\ \qualid$_n$. ~\par
  Recursive extraction of all the globals \qualid$_1$ \dots\
  \qualid$_n$ and all their dependencies in file {\em file}.
  Global and local identifiers are renamed accordingly to the target
  language to fullfill its syntactic conventions, keeping original
  names as much as possible.

\item  {\tt Extraction Module} {\em options} \ident. ~\par
  Extraction of the \Coq\ module \ident\ to an ML module {\tt\ident.ml}.
  Identifiers are here renamed using prefixes \verb!coq_! or
  \verb!Coq_! to ensure a session-independent renaming.
\end{description}
The list of globals \qualid$_i$ does not need to be
exhaustive: it is automatically completed into a complete and minimal
environment. Extraction will fail if it encounters an axiom.

\paragraph{Optimizations.} 
Since Objective Caml is a strict language, the extracted
code has to be optimized in order to be efficient (for instance, when
using induction principles we do not want to compute all the recursive
calls but only the needed ones). So an optimization routine will be
called each time the user want to generate Caml programs. Essentially,
it performs constants expansions and reductions.  Therefore some
constants will not appear in the resulting Caml program (A warning is
printed for each such constant). To avoid this, just put the constant
name in the previous list \qualid$_1$ \dots\ \qualid$_n$ and it will not
be expanded. Moreover, two options allow the user to control the
expansion strategy:
\begin{description}
  \item\texttt{[ noopt ]} ~\par
    specifies not to do any optimization.
  \item\texttt{[ expand \qualid$_1$ \dots\ \qualid$_n$ ]} ~\par
    forces the expansion of the constants \qualid$_1$ \dots\ \qualid$_n$
    (when it is possible).
\end{description}
Note that no optimization is performed for a modular extraction.


\asection{Realizing axioms}
\comindex{Link}

It is possible to assume some axioms while developing a proof. Since
these axioms can be any kind of proposition or object type, they may
perfectly well have some computational content. But a program must be
a closed term, and of course the system cannot guess the program which
realizes an axiom.  Therefore, it is possible to tell the system
what ML term corresponds to a given axiom. Of course, it is the
responsability of the user to ensure that the ML terms given to
realize the axioms do have the expected types.

\comindex{Extract Constant}
\comindex{Extract Inductive}
The system actually provides a more general mechanism to specify ML
terms even for defined constants, inductive types and constructors.
For instance, the user may want to use the ML native boolean type
instead of \Coq\ one.
The syntax is the following:
\begin{description}
\item{\tt Extract Constant \qualid\ => \str.} ~\par
  Give an ML extraction for the given constant.
  The \str\ may be an identifier or a quoted string.
\item{\tt Extract Inlined Constant \qualid\ => \str.} ~\par
  Same as the previous one, except that the given ML terms will
  be inlined everywhere instead of being declared via a let.
\item{\tt Extract Inductive \qualid\ => \str\ [ \str\ \dots \str\ ].} ~\par
  Give an ML extraction for the given inductive type. You must specify
  extractions for the type itself (first \str) and all its
  constructors (between square brackets). The ML extraction must be an
  ML recursive datatype.
\end{description}

\begin{Remarks}
%\item 
%  The given ML terms are always inlined; thus, it may be
%  useful to introduce definitions in a separate ML module and then to
%  use the corresponding qualified names.
\item
  The extraction of a module depending on axioms from another module
  will not fail. It is the responsability of the ``extractor'' of that
  other module to realize the given axioms.
\end{Remarks}

\Example
Typical examples are the following:
\begin{coq_example}
Extract Inductive unit => unit [ "()" ].
Extract Inductive bool => bool [ true false ].
Extract Inductive sumbool => bool [ true false ].
\end{coq_example}


% \asubsection{Differences between \Coq\ and ML type systems}

% \subsubsection{ML types that are not \FW\ types}

% Some ML recursive types have no counterpart in the type system of
% \Coq, like types using the record construction, or non positive types
% like
% \begin{verbatim}
% # type T = C of T->T;;
% \end{verbatim}
% In that case, you cannot import those types as inductive types, and
% the only way to do is to import them as abstract types (with {\tt ML
% Import}) together with the corresponding building and de-structuring
% functions (still with {\tt ML Import Constant}).


% \subsubsection{Programs that are not ML-typable}

% On the contrary, some extracted programs in \FW\ are not typable in
% ML. There are in fact two cases which can be problematic:
% \begin{itemize}
%   \item If some part of the program is {\em very} polymorphic, there
%     may be no ML type for it. In that case the extraction to ML works
%     all right but the generated code may be refused by the ML
%     type-checker. A very well known example is the {\em distr-pair}
%     function:
% $$\mbox{\tt
% Definition dp := [A,B:Set][x:A][y:B][f:(C:Set)C->C](f A x,f B y).
% }$$
% In Caml Light, for instance, the extracted term is 
% \verb!let dp x y f = pair((f x),(f y))!  and has type
% $$\mbox{\tt
% dp : 'a -> 'a -> ('a -> 'b) -> ('b,'b) prod
% }$$
% which is not its original type, but a restriction.

%   \item Some definitions of \FW\ may have no counterpart in ML. This
%     happens when there is a quantification over types inside the type
%     of a constructor; for example:
% $$\mbox{\tt
% Inductive anything : Set := dummy : (A:Set)A->anything.
% }$$
% which corresponds to the definition of ML dynamics.
% \end{itemize}

% The first case is not too problematic: it is still possible to run the
% programs by switching off the type-checker during compilation. Unless
% you misused the semantical attachment facilities you should never get
% any message like ``segmentation fault'' for which the extracted code
% would be to blame. To switch off the Caml type-checker, use the
% function {\tt obj\_\_magic} which gives the type {\tt 'a} to any
% object; but this implies changing a little the extracted code by hand.

% The second case is fatal. If some inductive type cannot be translated
% to ML, one has to change the proof (or possibly to ``cheat'' by
% some low-level manipulations we would not describe here). 

% We have to say, though, that in most ``realistic'' programs, these
% problems do not occur. For example all the programs of the library are
% accepted by Caml type-checker except {\tt Higman.v}\footnote{Should
%   you obtain a not ML-typable program out of a self developed example,
%   we would be interested in seeing it; so please mail us the example at
%   {\em coq@pauillac.inria.fr}}.


% \asection{Some examples}
% TODO


\section{Bugs}

Surely there are still bugs in the {\tt Extraction} module.  You can
send your bug reports directly to the authors
(\textsf{Pierre.Letouzey$@$lri.fr} and
\textsf{Jean-Christophe.Filliatre$@$lri.fr}) or to the \Coq\ mailing
list (at \textsf{coq$@$pauillac.inria.fr}).

% $Id$