\label{Helm}
\index{XML exportation}
\index{Proof rendering}

This section describes the exportation of {\Coq} theories to XML that
has been contributed by Claudio Sacerdoti Coen.  Currently, the main
applications are the rendering and searching tool
developed within the HELM\footnote{Hypertextual Electronic Library of
Mathematics} and MoWGLI\footnote{Mathematics on the Web, Get it by
Logic and Interfaces} projects mainly at the University of Bologna and
partly at INRIA-Sophia Antipolis.

\subsection{Practical use of the XML exportation tool}

The basic way to export the logical content of a file into XML format
is to use {\tt coqc} with option {\tt -xml}. 
When the {\tt -xml} flag is set, every definition or declaration is
immediately exported to XML once concluded.
The system environment variable {\tt COQ\_XML\_LIBRARY\_ROOT} must be
previously set to a directory in which the logical structure of the
exported objects is reflected.

  For {\tt Makefile} files generated by \verb+coq_makefile+ (see section
  \ref{Makefile}), it is sufficient to compile the files using
 \begin{quotation}
   \verb+make COQ_XML=-xml+
 \end{quotation}
 (or, equivalently, setting the environment variable \verb+COQ_XML+)
 
 To export a development to XML, the suggested procedure is then:
 
 \begin{enumerate}
  \item add to your own contribution a valid \verb+Make+ file and use
      \verb+coq_makefile+ to generate the \verb+Makefile+ from the \verb+Make+
      file.
 
      \Warning Since logical names are used to structure the XML
      hierarchy, always add to the \verb+Make+ file at least one \verb+"-R"+
      option to map physical file names to logical module paths.
  \item set the \verb+COQ_XML_LIBRARY_ROOT+ environment variable to
  the directory where the XML file hierarchy must be physically
  rooted.
  \item compile your contribution with \verb+"make COQ_XML=-xml"+
 \end{enumerate}
 
\Rem In case the system variable {\tt COQ\_XML\_LIBRARY\_ROOT} is not set,
the output is done on the standard output. Also, the files are
compressed using {\tt gzip} after creation. This is to save disk space
since the XML format is very verbose.

\subsection{Reflection of the logical structure into the file system}

For each {\Coq} logical object, several independent files associated
to this object are created.  The structure of the long name of the
object is reflected in the directory structure of the file system.
E.g. an object of long name {\tt
{\ident$_1$}.{\ldots}.{\ident$_n$}.{\ident}} is exported to files in the
subdirectory {{\ident$_1$}/{\ldots}/{\ident$_n$}} of the directory 
bound to the environment variable {\tt COQ\_XML\_LIBRARY\_ROOT}.

\subsection{What is exported?}

The XML exportation tool exports the logical content of {\Coq}
theories. This covers global definitions (including lemmas, theorems,
...), global assumptions (parameters and axioms), local assumptions or
definitions, and inductive definitions.

Vernacular files are exported to {\tt .theory.xml} files. 
%Variables,
%definitions, theorems, axioms and proofs are exported to individual
%files whose suffixes range from {\tt .var.xml}, {\tt .con.xml}, {\tt
%.con.body.xml}, {\tt .con.types.xml} to {\tt .con.proof_tree.xml}.
Comments are pre-processed with {\sf coqdoc} (see section
\ref{coqdoc}). Especially, they have to be enclosed within {\tt (**}
and {\tt *)} to be exported.

For each inductive definition of name
{\ident$_1$}.{\ldots}.{\ident$_n$}.{\ident}, a file named {\tt
{\ident}.ind.xml} is created in the subdirectory {\tt
{\ident$_1$}/{\ldots}/{\ident$_n$}} of the xml library root
directory. It contains the arities and constructors of the type. For mutual inductive definitions, the file is named after the
name of the first inductive type of the block.

For each global definition of base name {\tt
{\ident$_1$}.{\ldots}.{\ident$_n$}.{\ident}}, files named
{\tt {\ident}.con.body.xml} and {\tt {\ident}.con.xml} are created in the
subdirectory {\tt {\ident$_1$}/{\ldots}/{\ident$_n$}}. They
respectively contain the body and the type of the definition.

For each global assumption of base name {\tt
{\ident$_1$}.{\ident$_2$}.{\ldots}.{\ident$_n$}.{\ident}}, a file
named {\tt {\ident}.con.xml} is created in the subdirectory {\tt
{\ident$_1$}/{\ldots}/{\ident$_n$}}.  It contains the type of the
global assumption.

For each local assumption or definition of base name {\ident} located
in sections {\ident$'_1$}, {\ldots}, {\ident$'_p$} of the module {\tt
{\ident$_1$}.{\ident$_2$}.{\ldots}.{\ident$_n$}.{\ident}}, a file
named {\tt {\ident}.var.xml} is created in the subdirectory {\tt
{\ident$_1$}/{\ldots}/{\ident$_n$}/{\ident$'_1$}/\ldots/{\ident$'_p$}}.
It contains its type and, if a definition, its body.

In order to do proof-rendering (for example in natural language), some
redundant typing information is required, i.e. the type of at least
some of the subterms of the bodies and types of the CIC objects. These
types are called inner types and are exported to files of suffix {\tt
.types.xml} by the exportation tool.


% Deactivated
%% \subsection{Proof trees}

%% For each definition or theorem that has been built with tactics, an
%% extra file of suffix {\tt proof\_tree.xml} is created. It contains the
%% proof scripts and is used for rendering the proof.

\subsection{Inner types}
\label{inner-types}

The type of a subterm of a construction is called an {\em inner type}
if it respects the following conditions.
 
\begin{enumerate}
  \item Its sort is \verb+Prop+\footnote{or {\tt CProp} which is the
    "sort"-like definition used in C-CoRN (see
    \url{http://vacuumcleaner.cs.kun.nl/c-corn}) to type
    computationally relevant predicative propositions.}.
 \item It is not a type cast nor an atomic term (variable, constructor or constant).
 \item If it's root is an abstraction, then the root's parent node is
    not an abstraction, i.e. only the type of the outer abstraction of
    a block of nested abstractions is printed.
\end{enumerate}
                                                                               
The rationale for the 3$^{rd}$ condition is that the type of the inner
abstractions could be easily computed starting from the type of the
outer ones; moreover, the types of the inner abstractions requires a
lot of disk/memory space: removing the 3$^{rd}$ condition leads to XML
file that are two times as big as the ones exported applying the 3$^{rd}$
condition.

\subsection{Interactive exportation commands}

There are also commands to be used interactively in {\tt coqtop}.

\subsubsection{\tt Print XML {\qualid}}
\comindex{Print XML}

If the variable {\tt COQ\_XML\_LIBRARY\_ROOT} is set, this command creates
files containing the logical content in XML format of {\qualid}. If
the variable is not set, the result is displayed on the standard
output.

\begin{Variants}
\item {\tt Print XML File {\str} {\qualid}}\\
This writes the logical content of {\qualid} in XML format to files
whose prefix is {\str}.
\end{Variants}

\subsubsection{{\tt Show XML Proof}}
\comindex{Show XML Proof}

If the variable {\tt COQ\_XML\_LIBRARY\_ROOT} is set, this command creates
files containing the current proof in progress in XML format. It
writes also an XML file made of inner types.  If the variable is not
set, the result is displayed on the standard output.

\begin{Variants}
\item {\tt Show XML File {\str} Proof}\\ This writes the
logical content of {\qualid} in XML format to files whose prefix is
{\str}. 
\end{Variants}

\subsection{Applications: rendering, searching and publishing}

The HELM team at the University of Bologna has developed tools
exploiting the XML exportation of {\Coq} libraries. This covers
rendering, searching and publishing tools.

All these tools require a running http server and, if possible, a
MathML compliant browser.  The procedure to install the suite of tools
ultimately allowing rendering and searching can be found on the HELM
web site \url{http://helm.cs.unibo.it/library.html}.

It may be easier though to upload your developments on the HELM http
server and to re-use the infrastructure running on it. This requires
publishing your development. To this aim, follow the instructions on
\url{http://mowgli.cs.unibo.it}.

Notice that the HELM server already hosts a copy of the standard
library of {\Coq} and of the {\Coq} user contributions.

\subsection{Technical informations}

\subsubsection{CIC with Explicit Named Substitutions}
                                                                               
The exported files are XML encoding of the lambda-terms used by the
\Coq\ system. The implementative details of the \Coq\ system are hidden as much
as possible, so that the XML DTD is a straightforward encoding of the
Calculus of (Co)Inductive Constructions.
                                                                                
Nevertheless, there is a feature of the \Coq\ system that can not be
hidden in a completely satisfactory way: discharging (see Sect.\ref{Section}).
In \Coq\ it is possible
to open a section, declare variables and use them in the rest of the section
as if they were axiom declarations. Once the section is closed, every definition and theorem in the section is discharged by abstracting it over the section
variables. Variable declarations as well as section declarations are entirely
dropped. Since we are interested in an XML encoding of definitions and
theorems as close as possible to those directly provided the user, we
do not want to export discharged forms. Exporting non-discharged theorem
and definitions together with theorems that rely on the discharged forms
obliges the tools that work on the XML encoding to implement discharging to
achieve logical consistency. Moreover, the rendering of the files can be
misleading, since hyperlinks can be shown between occurrences of the discharge
form of a definition and the non-discharged definition, that are different
objects.
                                                                                
To overcome the previous limitations, Claudio Sacerdoti Coen developed in his
PhD. thesis an extension of CIC, called Calculus of (Co)Inductive Constructions
with Explicit Named Substitutions, that is a slight extension of CIC where
discharging is not necessary. The DTD of the exported XML files describes
constants, inductive types and variables of the Calculus of (Co)Inductive
Constructions with Explicit Named Substitutions. The conversion to the new
calculus is performed during the exportation phase.

The following example shows a very small \Coq\ development together with its
version in CIC with Explicit Named Substitutions.

\begin{verbatim}
# CIC version: #
Section S.
 Variable A : Prop.

 Definition impl := A -> A.

 Theorem t : impl.           (* uses the undischarged form of impl *)
  Proof.
   exact (fun (a:A) => a).
  Qed.

End S.

Theorem t' : (impl False).   (* uses the discharged form of impl *)
 Proof.
  exact (t False).           (* uses the discharged form of t *)
 Qed.
\end{verbatim}
 
\begin{verbatim}
# Corresponding CIC with Explicit Named Substitutions version: #
Section S.
 Variable A : Prop.
 
 Definition impl(A) := A -> A. (* theorems and definitions are
                                  explicitly abstracted over the
                                  variables. The name is sufficient to
                                  completely describe the abstraction *)
 
 Theorem t(A) : impl.          (* impl where A is not instantiated *)
  Proof.
   exact (fun (a:A) => a).
  Qed.
 
End S.
 
Theorem t'() : impl{False/A}. (* impl where A is instantiated with False
                                 Notice that t' does not depend on A     *)
 Proof.
  exact t{False/A}.           (* t where A is instantiated with False *)
 Qed.
\end{verbatim}
 
Further details on the typing and reduction rules of the calculus can be
found in Claudio Sacerdoti Coen PhD. dissertation, where the consistency
of the calculus is also proved.
 
\subsubsection{The CIC with Explicit Named Substitutions XML DTD}
 
A copy of the DTD can be found in the file ``\verb+cic.dtd+'' in the
\verb+contrib/xml+ source directory of \Coq.
The following is a very brief overview of the elements described in the DTD.
 
\begin{description}
 \item[]\texttt{<ConstantType>}
   is the root element of the files that correspond to constant types.
 \item[]\texttt{<ConstantBody>}
   is the root element of the files that correspond to constant bodies.
   It is used only for closed definitions and theorems (i.e. when no
   metavariable occurs in the body or type of the constant)
 \item[]\texttt{<CurrentProof>}
   is the root element of the file that correspond to the body of a constant
   that depends on metavariables (e.g. unfinished proofs)
 \item[]\texttt{<Variable>}
   is the root element of the files that correspond to variables
 \item[]\texttt{<InductiveTypes>}
   is the root element of the files that correspond to blocks
   of mutually defined inductive definitions
\end{description}
 
The elements
 \verb+<LAMBDA>+, \verb+<CAST>+, \verb+<PROD>+, \verb+<REL>+, \verb+<SORT>+,
 \verb+<APPLY>+, \verb+<VAR>+, \verb+<META>+, \verb+<IMPLICIT>+, \verb+<CONST>+, \verb+<LETIN>+, \verb+<MUTIND>+, \verb+<MUTCONSTRUCT>+, \verb+<MUTCASE>+,
 \verb+<FIX>+ and \verb+<COFIX>+ are used to encode the constructors of CIC.
 The \verb+sort+ or \verb+type+ attribute of the element, if present, is
 respectively the sort or the type of the term, that is a sort because of the
 typing rules of CIC.
 
The element \verb+<instantiate>+ correspond to the application of an explicit
named substitution to its first argument, that is a reference to a definition
or declaration in the environment.
 
All the other elements are just syntactic sugar.


%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "Reference-Manual"
%%% End: