[TITLE]

~ Abstract
This is the Dafny reference manual which describes the Dafny programming language and how to use the Dafny verification system. Parts of this manual are more tutorial in nature in order to help the user understand how to do proofs with Dafny.
~

[TOC]

# Introduction

Dafny [@Leino:Dafny:LPAR16] is a programming language with built-in specification constructs. The Dafny static program verifier can be used to verify the functional correctness of programs.

The Dafny programming language is designed to support the static verification of programs. It is imperative, sequential, supports generic classes, methods and functions, dynamic allocation, inductive and co-inductive datatypes, and specification constructs. The specifications include pre- and postconditions, frame specifications (read and write sets), and termination metrics. To further support specifications, the language also offers updatable ghost variables, recursive functions, and types like sets and sequences. Specifications and ghost constructs are used only during verification; the compiler omits them from the executable code.

The Dafny verifier is run as part of the compiler. As such, a programmer interacts with it much in the same way as with the static type checker: when the tool produces errors, the programmer responds by changing the program's type declarations, specifications, and statements.

The easiest way to try out [Dafny is in your web browser at rise4fun](http://rise4fun.com/Dafny)[@Rise4fun:dafny]. Once you get a bit more serious, you may prefer to [download](http://dafny.codeplex.com/) it to run it on your machine. Although Dafny can be run from the command line (on Windows or other platforms), the preferred way to run it is in Microsoft Visual Studio 2012 (or newer) or using emacs, where the Dafny verifier runs in the background while the programmer is editing the program.

The Dafny verifier is powered by [Boogie](http://research.microsoft.com/boogie) [@Boogie:Architecture;@Leino:Boogie2-RefMan;@LeinoRuemmer:Boogie2] and [Z3](https://github.com/z3prover)[@deMouraBjorner:Z3:overview].

From verified programs, the Dafny compiler produces code (`.dll` or `.exe`) for the .NET platform via intermediate C# files. However, the facilities for interfacing with other .NET code are minimal.

This is the reference manual for the Dafny verification system. It is based on the following references: [@Leino:Dafny:LPAR16;@MSR:dafny:main; @MSR:dafny:source;@MSR:dafny:quickref; @LEINO:Dafny:Calc; @LEINO:Dafny:Coinduction; and the tutorials at @Rise4fun:dafny]

The main part of the reference manual is in top-down order except for an initial section that deals with the lowest-level constructs.

[Co-induction Simply]: http://research.microsoft.com/en-us/um/people/leino/papers/krml230.pdf "Co-induction Simply: Automatic Co-inductive Proofs in a Program Verifier"

## Dafny Example

To give a flavor of Dafny, here is the solution to a competition problem.

```
// VSComp 2010, problem 3, find a 0 in a linked list and return how many
// nodes were skipped until the first 0 (or end-of-list) was found.
// Rustan Leino, 18 August 2010.
//
// The difficulty in this problem lies in specifying what the return
// value 'r' denotes and in proving that the program terminates.  Both of
// these are addressed by declaring a ghost field 'List' in each
// linked-list node, abstractly representing the linked-list elements
// from the node to the end of the linked list.  The specification can
// now talk about that sequence of elements and can use 'r' as an index
// into the sequence, and termination can be proved from the fact that
// all sequences in Dafny are finite.
//
// We only want to deal with linked lists whose 'List' field is properly
// filled in (which can only happen in an acyclic list, for example).
// To that end, the standard idiom in Dafny is to declare a predicate
// 'Valid()' that is true of an object when the data structure
// representing the object's abstract value is properly formed.  The
// definition of 'Valid()' is what one intuitively would think of as the
// ''object invariant'', and it is mentioned explicitly in method pre-
// and postconditions.  As part of this standard idiom, one also declares
// a ghost variable 'Repr' that is maintained as the set of objects that
// make up the representation of the aggregate object--in this case, the
// Node itself and all its successors.

class Node {
  ghost var List: seq<int>
  ghost var Repr: set<Node>
  var head: int
  var next: Node

  predicate Valid()
    reads this, Repr
  {
    this in Repr &&
    1 <= |List| && List[0] == head &&
    (next == null ==> |List| == 1) &&
    (next != null ==>
      next in Repr && next.Repr <= Repr && this !in next.Repr &&
      next.Valid() && next.List == List[1..])
  }

  static method Cons(x: int, tail: Node) returns (n: Node)
    requires tail == null || tail.Valid()
    ensures n != null && n.Valid()
    ensures if tail == null then n.List == [x]
                            else n.List == [x] + tail.List
  {
    n := new Node;
    n.head, n.next := x, tail;
    if (tail == null) {
      n.List := [x];
      n.Repr := {n};
    } else {
      n.List := [x] + tail.List;
      n.Repr := {n} + tail.Repr;
    }
  }
}

method Search(ll: Node) returns (r: int)
  requires ll == null || ll.Valid()
  ensures ll == null ==> r == 0
  ensures ll != null ==>
            0 <= r && r <= |ll.List| &&
            (r < |ll.List| ==> ll.List[r] == 0 && 0 !in ll.List[..r]) &&
            (r == |ll.List| ==> 0 !in ll.List)
{
  if (ll == null) {
    r := 0;
  } else {
    var jj,i := ll,0;
    while (jj != null && jj.head != 0)
      invariant jj != null ==> jj.Valid() && i + |jj.List| == |ll.List| &&
                               ll.List[i..]
                                 == jj.List
      invariant jj == null ==> i == |ll.List|
      invariant 0 !in ll.List[..i]
      decreases |ll.List| - i
    {
      jj := jj.next;
      i := i + 1;
    }
    r := i;
  }
}

method Main()
{
  var list: Node := null;
  list := list.Cons(0, list);
  list := list.Cons(5, list);
  list := list.Cons(0, list);
  list := list.Cons(8, list);
  var r := Search(list);
  print "Search returns ", r, "\n";
  assert r == 1;
}
```

# Lexical and Low Level Grammar

Dafny uses the Coco/R lexer and parser generator for its lexer and parser [@Linz:Coco]. The Dafny input file to Coco/R is the `Dafny.atg` file in the source tree. A Coco/R input file consists of code written in the target language (e.g. C#) intermixed with these special sections:

0. The Characters section which defines classes of characters that are used in defining the lexer (Section [#sec-character-classes]).
1. The Tokens section which defines the lexical tokens (Section [#sec-tokens]).
2. The Productions section which defines the grammar. The grammar productions are distributed in the later parts of this document in the parts where those constructs are explained.

The grammar presented in this document was derived from the `Dafny.atg` file but has been simplified by removing details that, though needed by the parser, are not needed to understand the grammar. In particular, the following transformations have been performed.

* The semantic actions, enclosed by "(." and ".)", were removed.
* There are some elements in the grammar used for error recovery ("SYNC"). These were removed.
* There are some elements in the grammar for resolving conflicts ("IF(b)"). These have been removed.
* Some comments related to Coco/R parsing details have been removed.
* A Coco/R grammar is an attributed grammar where the attributes enable the productions to have input and output parameters. These attributes were removed except that boolean input parameters that affect the parsing are kept.
  * In our representation we represent these in a definition by giving the names of the parameters following the non-terminal name. For example `entity1(allowsX)`.
  * In the case of uses of the parameter, the common case is that the parameter is just passed to a lower-level non-terminal. In that case we just give the name, e.g. `entity2(allowsX)`.
  * If we want to give an explicit value to a parameter, we specify it in a keyword notation like this: `entity2(allowsX: true)`.
  * In some cases the value to be passed depends on the grammatical context. In such cases we give a description of the conditions under which the parameter is true, enclosed in parentheses. For example: `FunctionSignatureOrEllipsis_(allowGhostKeyword: ("method" present))` means that the `allowGhostKeyword` parameter is true if the "method" keyword was given in the associated ``FunctionDecl``.
  * Where a parameter affects the parsing of a non-terminal we will explain the effect of the parameter.

The names of character sets and tokens start with a lower-case letter but the names of grammar non-terminals start with an upper-case letter.

The grammar uses Extended BNF notation. See the [Coco/R reference manual](http://www.ssw.uni-linz.ac.at/Research/Projects/Coco/Doc/UserManual.pdf) for details. But in summary:

* Identifiers starting with a lower-case letter denote terminal symbols.
* Identifiers starting with an upper-case letter denote nonterminal symbols.
* Strings denote themselves.
* `=` separates the sides of a production, e.g. `A = a b c`
* In the Coco grammars "." terminates a production, but for readability in this document a production starts with the defined identifier in the left margin and may be continued on subsequent lines if they are indented.
* `|` separates alternatives, e.g. `a b | c | d e` means `a b` or `c` or `d e`
* `(` `)` groups alternatives, e.g. `(a | b) c` means `a c` or `b c`
* `[ ]` option, e.g. `[a] b` means `a b` or `b`
* `{ }` iteration (0 or more times), e.g. `{a} b` means `b` or `a b` or `a a b` or ...
* We allow `|` inside `[ ]` and `{ }`. So `[a | b]` is short for `[(a | b)]` and `{a | b}` is short for `{(a | b)}`.
* The first production defines the name of the grammar, in this case `Dafny`.

In addition to the Coco rules, for the sake of readability we have adopted these additional conventions.

* We allow `-` to be used. `a - b` means it matches if it matches `a` but not `b`.
* To aid in explaining the grammar we have added some additional productions that are not present in the original grammar. We name these with a trailing underscore. If you inline these where they are referenced, the result should let you reconstruct the original grammar.

**For the convenience of the reader, any references to character sets, tokens, or grammar non-terminals in this document are hyper-links that will link to the definition of the entity.**

## Character Classes

This section defines character classes used later in the token definitions. In this section backslash is used to start an escape sequence, so for example '\n' denotes the single linefeed character.

````
letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
````
At present, a letter is an ASCII upper or lowercase letter. Other Unicode letters are not supported.

````
digit = "0123456789"
````
A digit is just one of the base-10 digits.

````
posDigit = "123456789"
````
A ``posDigit`` is a digit, excluding 0.

````
hexdigit = "0123456789ABCDEFabcdef"
````
A ``hexdigit`` character is a digit or one of the letters from 'A' to 'F' in either case.

````
special = "'_?"
````
The _special_ characters are the characters in addition to alphanumeric characters that are allowed to appear in a Dafny identifier.
These are

* `"'"` because mathematicians like to put primes on identifiers and some ML programmers like to start names of type parameters with a "'".
* "_" because computer scientists expect to be able to have underscores in identifiers.
* "?" because it is useful to have "?" at the end of names of predicates, e.g. "Cons?".

````
cr = '\r'
````
A carriage return character.

````
lf = '\n'
````
A line feed character.

````
tab = '\t'
````
A tab character.

````
space = ' '
````
A space character.

````
nondigitIdChar = letter + special
````
The characters that can be used in an identifier minus the digits.

````
idchar = nondigitIdChar + digit
````
The characters that can be used in an identifier.

````
nonidchar = ANY - idchar
````
Any character except those that can be used in an identifier.

````
charChar = ANY - '\'' - '\\' - cr - lf
````
Characters that can appear in a character constant.

````
stringChar = ANY - '"' - '\\' - cr - lf
````
Characters that can appear in a string constant.

````
verbatimStringChar = ANY - '"'
````
Characters that can appear in a verbatim string.

### Comments

Comments are in two forms.

* They may go from "/*" to "*/" and be nested.
* They may go from "//" to the end of the line.

## Tokens

As with most languages, Dafny syntax is defined in two levels. First the stream of input characters is broken up into _tokens_. Then these tokens are parsed using the Dafny grammar. The Dafny tokens are defined in this section.
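
As a small, hedged illustration of how the character classes and the two comment forms described above show up in source text, the following sketch uses the special characters in identifiers alongside both kinds of comment. The names `Demo`, `x'`, and `is_empty?` are invented for this example and do not come from the Dafny sources.

```dafny
/* A block comment /* may contain a nested block comment */ and
   continues until its own closing delimiter. */

// A line comment runs to the end of the line.
method Demo() {
  var x' := 1;            // "'" is a special character allowed in identifiers
  var is_empty? := true;  // "_" and "?" are allowed as well
  assert x' == 1 && is_empty?;
}
```

Each identifier starts with a ``nondigitIdChar`` and continues with ``idchar`` characters, which is why the prime and the question mark can appear at the end of a name.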
### Reserved Words

The following reserved words appear in the Dafny grammar and may not be used as identifiers of user-defined entities:

````
reservedword =
    "abstract" | "array" | "as" | "assert" | "assume" | "bool" | "break" |
    "calc" | "case" | "char" | "class" | "codatatype" | "colemma" |
    "constructor" | "copredicate" | "datatype" | "decreases" |
    "default" | "else" | "ensures" | "exists" | "extends" | "false" |
    "forall" | "free" | "fresh" | "function" | "ghost" | "if" | "imap" | "import" |
    "in" | "include" | "inductive" | "int" | "invariant" | "iset" | "iterator" | "label" |
    "lemma" | "map" | "match" | "method" | "modifies" | "modify" |
    "module" | "multiset" | "nat" | "new" | "newtype" | "null" | "object" |
    "old" | "opened" | "predicate" | "print" | "protected" |
    "reads" | "real" | "refines" | "requires" | "return" | "returns" | "seq" |
    "set" | "static" | "string" | "then" | "this" | "trait" | "true" | "type" |
    "var" | "where" | "while" | "yield" | "yields" | arrayToken

arrayToken = "array" [ posDigit { digit }]
````

An ``arrayToken`` is a reserved word that denotes an array type of given rank. `array` is an array type of rank 1 (aka a vector). `array2` is the type of two-dimensional arrays, etc.

TODO: Is "_" a reserved word?

### Identifiers

````
ident = nondigitIdChar { idchar } - arrayToken - charToken - reservedword
````
In general Dafny identifiers are sequences of ``idchar`` characters where the first character is a ``nondigitIdChar``. However, tokens that fit this pattern are not identifiers if they look like an array type token, a character literal, or a reserved word.

### Digits

````
digits = digit {['_'] digit}
````
A sequence of decimal digits, possibly interspersed with underscores for readability. Example: `1_234_567`.

````
hexdigits = "0x" hexdigit {['_'] hexdigit}
````
A hexadecimal constant, possibly interspersed with underscores for readability. Example: `0xffff_ffff`.

````
decimaldigits = digit {['_'] digit} '.'
                digit {['_'] digit}
````
A decimal fraction constant, possibly interspersed with underscores for readability. Example: `123_456.789_123`.

### Escaped Character

In this section the "\\" characters are literal.

````
escapedChar =
    ( "\\\'" | "\\"" | "\\\\" | "\\0" | "\\n" | "\\r" | "\\t"
      | "\\u" hexdigit hexdigit hexdigit hexdigit
    )
````

In Dafny character or string literals, escaped characters may be used to specify the presence of the delimiting quote, a backslash, null, new line, carriage return, tab, or the Unicode character with the given hexadecimal representation.

### Character Constant Token

````
charToken = "'" ( charChar | escapedChar ) "'"
````

A character constant is enclosed by "'" and includes either a character from the ``charChar`` set, or an escaped character. Note that although Unicode letters are not allowed in Dafny identifiers, Dafny does support Unicode in its character and string constants and in its data. A character constant has type `char`.

### String Constant Token

````
stringToken =
    '"' { stringChar | escapedChar }  '"'
  | '@' '"' { verbatimStringChar | '"' '"' } '"'
````

A string constant is either a normal string constant or a verbatim string constant. A normal string constant is enclosed by '"' and can contain characters from the ``stringChar`` set and escapes.

A verbatim string constant is enclosed between '@"' and '"' and can consist of any characters (including newline characters) except that two successive double quotes give a way to escape one quote character inside the string.

## Low Level Grammar Productions

### Identifier Variations

````
Ident = ident
````
The ``Ident`` non-terminal is just an ``ident`` token and represents an ordinary identifier.

````
DotSuffix =
  ( ident | digits | "requires" | "reads" )
````
When using the _dot_ notation to denote a component of a compound entity, the token following the "." can be an identifier, a natural number, or one of the keywords `requires` or `reads`.

* Digits can be used to name fields of classes and destructors of datatypes. For example, the built-in tuple datatypes have destructors named 0, 1, 2, etc. Note that as a field or destructor name, internal underscores matter, so 10 is different from 1_0.
* `m.requires` is used to denote the precondition for method m.
* `m.reads` is used to denote the things that method m may read.

````
NoUSIdent = ident - "_" { idchar }
````
A ``NoUSIdent`` is an identifier except that identifiers with a **leading** underscore are not allowed. The names of user-defined entities are required to be ``NoUSIdent``s. We introduce more mnemonic names for these below (e.g. ``ClassName``).

````
WildIdent = NoUSIdent | "_"
````
An identifier, disallowing leading underscores, except for the "wildcard" identifier "_". When "_" appears it is replaced by a unique generated identifier distinct from user identifiers.

### NoUSIdent Synonyms

In the productions for the declaration of user-defined entities, the name of the user-defined entity is required to be an identifier that does not start with an underscore, i.e., a ``NoUSIdent``. To make the productions more mnemonic, we introduce the following synonyms for ``NoUSIdent``.

````
ModuleName = NoUSIdent
ClassName = NoUSIdent
TraitName = NoUSIdent
DatatypeName = NoUSIdent
DatatypeMemberName = NoUSIdent
NewtypeName = NoUSIdent
NumericTypeName = NoUSIdent
SynonymTypeName = NoUSIdent
IteratorName = NoUSIdent
TypeVariableName = NoUSIdent
MethodName = NoUSIdent
FunctionName = NoUSIdent
PredicateName = NoUSIdent
CopredicateName = NoUSIdent
LabelName = NoUSIdent
AttributeName = NoUSIdent
FieldIdent = NoUSIdent
````
A ``FieldIdent`` is one of the ways to identify a field. The other is using digits.

### Qualified Names

A qualified name starts with the name of the top-level entity and is then followed by zero or more ``DotSuffix``s, each of which denotes a component.
Examples:

* `Module.MyType1`
* `MyTuple.1`
* `MyMethod.requires`

The grammar does not actually have a production for qualified names except in the special case of a qualified name that is known to be a module name, i.e. a ``QualifiedModuleName``.

### Identifier-Type Combinations

In this section, we describe some nonterminals that combine an identifier and a type.

````
IdentType = WildIdent ":" Type
````
In Dafny, a variable or field is typically declared by giving its name followed by a ``colon`` and its type. An ``IdentType`` is such a construct.

````
GIdentType(allowGhostKeyword) = [ "ghost" ] IdentType
````
A ``GIdentType`` is a typed entity declaration optionally preceded by "ghost". The _ghost_ qualifier means the entity is only used during verification but not in the generated code. Ghost variables are useful for abstractly representing internal state in specifications. If `allowGhostKeyword` is false then "ghost" is not allowed.

````
LocalIdentTypeOptional = WildIdent [ ":" Type ]
````
A ``LocalIdentTypeOptional`` is used when declaring local variables. In such a case a value may be specified for the variable, in which case the type may be omitted because it can be inferred from the initial value. The initial value may also be omitted.

````
IdentTypeOptional = WildIdent [ ":" Type ]
````
An ``IdentTypeOptional`` is typically used in a context where the type of the identifier may be inferred from the context. Examples are in pattern matching or quantifiers.

````
TypeIdentOptional = [ "ghost" ] [ ( NoUSIdent | digits ) ":" ] Type
````
``TypeIdentOptional``s are used in ``FormalsOptionalIds``. This represents situations where a type is given but there may not be an identifier.

````
FormalsOptionalIds = "(" [TypeIdentOptional { "," TypeIdentOptional } ] ")"
````
A ``FormalsOptionalIds`` is a formal parameter list in which the types are required but the names of the parameters are optional. This is used in algebraic datatype definitions.
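
The identifier-type combinations above can be sketched in a short program. This is only an illustration under assumed names (`Shape`, `Circle`, `Rect`, `Demo`, and the local variables are all hypothetical): the datatype constructors use a ``FormalsOptionalIds`` list, once with only a type and once with named parameters, and the method body declares locals with and without an explicit type.

```dafny
// Hypothetical datatype: Circle gives only a type, Rect names its parameters.
datatype Shape = Circle(real) | Rect(width: real, height: real)

method Demo() {
  var n: nat := 3;           // local with an explicit type
  var m := 4;                // type omitted; inferred from the initial value
  ghost var total := n + m;  // ghost local, used only during verification
  var r := Rect(2.0, 3.0);
  assert r.width == 2.0;     // a named parameter can be used as a destructor
  assert total == 7;
}
```

Naming a constructor parameter is what makes a destructor such as `r.width` available; the unnamed parameter of `Circle` can only be reached by pattern matching.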
### Numeric Literals

````
Nat = ( digits | hexdigits )
````
A ``Nat`` represents a natural number expressed in either decimal or hexadecimal.

````
Dec = (decimaldigits )
````
A ``Dec`` represents a decimal fraction literal.

# Programs

````
Dafny = { IncludeDirective_ } { TopDecl } EOF
````
At the top level, a Dafny program (stored as files with extension `.dfy`) is a set of declarations. The declarations introduce (module-level) methods and functions, as well as types (classes, traits, inductive and co-inductive datatypes, newtypes, type synonyms, opaque types, and iterators) and modules, where the order of introduction is irrelevant. A class also contains a set of declarations, introducing fields, methods, and functions.

When asked to compile a program, Dafny looks for the existence of a Main() method. If a legal Main() method is found, the compiler will emit a `.EXE`; otherwise, it will emit a `.DLL`.

(If there is more than one Main(), Dafny will try to emit an `.EXE`, but this may cause the C# compiler to complain. One could imagine improving this functionality so that Dafny will produce a polite error message in this case.)

In order to be a legal Main() method, the following must be true:

* The method takes no parameters
* The method is not a ghost method
* The method has no requires clause
* The method has no modifies clause
* If the method is an instance (that is, non-static) method in a class, then the enclosing class must not declare any constructor

Note, however, that the following are allowed:

* The method is allowed to be an instance method as long as the enclosing class does not declare any constructor. In this case, the runtime system will allocate an object of the enclosing class and will invoke Main() on it.
* The method is allowed to have `ensures` clauses
* The method is allowed to have `decreases` clauses, including a `decreases *`.
  (If Main() has a `decreases *`, then its execution may go on forever, but in the absence of a `decreases *` on Main(), Dafny will have verified that the entire execution will eventually terminate.)

An invocation of Dafny may specify a number of source files. Each Dafny file follows the grammar of the ``Dafny`` non-terminal.

It consists of a sequence of optional _include_ directives followed by top-level declarations followed by the end of the file.

## Include Directives

````
IncludeDirective_ = "include" stringToken
````

Include directives have the form ``"include" stringToken`` where the string token is either a normal string token or a verbatim string token. The ``stringToken`` is interpreted as the name of a file that will be included in the Dafny source. These included files also obey the ``Dafny`` grammar. Dafny parses and processes the transitive closure of the original source files and all the included files, but will not invoke the verifier on the included files unless they have been listed explicitly on the command line.

## Top Level Declarations

````
TopDecl = { { DeclModifier }
  ( SubModuleDecl
  | ClassDecl
  | DatatypeDecl
  | NewtypeDecl
  | SynonymTypeDecl
  | IteratorDecl
  | TraitDecl
  | ClassMemberDecl(moduleLevelDecl: true)
  )
  }
````
Top-level declarations may appear either at the top level of a Dafny file, or within a ``SubModuleDecl``. A top-level declaration is one of the following types of declarations, which are described later.

The ``ClassDecl``, ``DatatypeDecl``, ``NewtypeDecl``, ``SynonymTypeDecl``, ``IteratorDecl``, and ``TraitDecl`` declarations are type declarations and are described in Section [#sec-types]. Ordinarily ``ClassMemberDecl``s appear in class declarations but they can also appear at the top level. In that case they are included as part of an implicit top-level class and are implicitly `static` (but cannot be declared as static). In addition, a ``ClassMemberDecl`` that appears at the top level cannot be a ``FieldDecl``.
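
To make the rules above concrete, here is a minimal sketch of a complete `.dfy` file, written in the `function method` syntax of the Dafny version this manual describes. The function name `Double` is hypothetical. Both declarations are ``ClassMemberDecl``s at the top level, so they are implicitly static, and `Main()` satisfies the conditions listed earlier for a legal entry point.

```dafny
// Hypothetical file contents, for illustration only.

// A module-level function: implicitly static, and compiled
// because it is declared as a 'function method'.
function method Double(n: int): int { 2 * n }

// No parameters, not ghost, no requires or modifies clause:
// this qualifies as a legal Main() method.
method Main() {
  var d := Double(21);
  assert d == 42;   // checked statically by the verifier
  print d, "\n";
}
```

Declaring a field such as `var count: int` next to these declarations would be rejected, since a top-level ``ClassMemberDecl`` cannot be a ``FieldDecl``.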
## Declaration Modifiers

````
DeclModifier =
  ( "abstract" | "ghost" | "static" | "protected"
  | "extern" [ stringToken]
  )
````

Top-level declarations may be preceded by zero or more declaration modifiers. Not all of these are allowed in all contexts.

The "abstract" modifier may only be used for module declarations. An abstract module can leave some entities underspecified. Abstract modules are not compiled to C#.

The ghost modifier is used to mark entities as being used for specification only, not for compilation to code.

The static modifier is used for class members that are associated with the class as a whole rather than with an instance of the class.

The protected modifier is used to control the visibility of the body of functions.

The extern modifier is used to alter the CompileName of entities. The CompileName is the name for the entity when translating to Boogie or C#.

The following table shows the modifiers that are available for each kind of declaration. In the table we use already-ghost to denote that the item is not allowed to have the ghost modifier because it is already implicitly ghost.

+--------------------------+---------------------------------------+
| Declaration              | allowed modifiers                     |
+--------------------------+---------------------------------------+
| module                   | abstract                              |
| class                    | extern                                |
| trait                    | -                                     |
| datatype or codatatype   | -                                     |
| field                    | ghost                                 |
| newtype                  | -                                     |
| synonym types            | -                                     |
| iterators                | -                                     |
| method                   | ghost static extern                   |
| lemma, colemma, comethod | already-ghost static protected        |
| inductive lemma          | already-ghost static                  |
| constructor              | -                                     |
| function (non-method)    | already-ghost static protected        |
| function method          | already-ghost static protected extern |
| predicate (non-method)   | already-ghost static protected        |
| predicate method         | already-ghost static protected extern |
| inductive predicate      | already-ghost static protected        |
| copredicate              | already-ghost static protected        |
+--------------------------+---------------------------------------+

# Modules

````
SubModuleDecl = ( ModuleDefinition_ | ModuleImport_ )
````

Structuring a program by breaking it into parts is an important part of creating large programs. In Dafny, this is accomplished via _modules_. Modules provide a way to group together related types, classes, methods, functions, and other modules, as well as to control the scope of declarations. Modules may import each other for code reuse, and it is possible to abstract over modules to separate an implementation from an interface.

## Declaring New Modules

````
ModuleDefinition_ = "module" { Attribute } ModuleName
        [ [ "exclusively" ] "refines" QualifiedModuleName ]
        "{" { TopDecl } "}"
QualifiedModuleName = Ident { "." Ident }
````
A ``QualifiedModuleName`` is a qualified name that is known to refer to a module.

A new module is declared with the `module` keyword, followed by the name of the new module, and a pair of curly braces ({}) enclosing the body of the module:

```
module Mod {
  ...
}
```

A module body can consist of anything that you could put at the top level.
This includes classes, datatypes, types, methods, functions, etc.

```
module Mod {
  class C {
    var f: int
    method m()
  }
  datatype Option = A(int) | B(int)
  type T
  method m()
  function f(): int
}
```

You can also put a module inside another, in a nested fashion:

```
module Mod {
  module Helpers {
    class C {
      method doIt()
      var f: int
    }
  }
}
```

Then you can refer to the members of the `Helpers` module within the `Mod` module by prefixing them with "Helpers.". For example:

```
module Mod {
  module Helpers { ... }
  method m() {
    var x := new Helpers.C;
    x.doIt();
    x.f := 4;
  }
}
```

Methods and functions defined at the module level are available like classes, with just the module name prefixing them. They are also available in the methods and functions of the classes in the same module.

```
module Mod {
  module Helpers {
    function method addOne(n: nat): nat {
      n + 1
    }
  }
  method m() {
    var x := 5;
    x := Helpers.addOne(x); // x is now 6
  }
}
```

## Importing Modules

````
ModuleImport_ = "import" ["opened" ] ModuleName
    [ "=" QualifiedModuleName
    | "as" QualifiedModuleName ["default" QualifiedModuleName ]
    ]
    [ ";" ]
````

Declaring new submodules is useful, but sometimes you want to refer to things from an existing module, such as a library. In this case, you can _import_ one module into another. This is done via the `import` keyword, and there are a few different forms, each of which has a different meaning. The simplest kind is the concrete import, and has the form `import A = B`. This declaration creates a reference to the module `B` (which must already exist), and binds it to the new name `A`. Note this new name, i.e. `A`, is only bound in the module containing the import declaration; it does not create a global alias. For example, if `Helpers` was defined outside of `Mod`, then we could import it:

```
module Helpers {
  ...
}
module Mod {
  import A = Helpers
  method m() {
    assert A.addOne(5) == 6;
  }
}
```

Note that inside `m()`, we have to use `A` instead of `Helpers`, as we bound it to a different name. The name `Helpers` is not available inside `m()`, as only names that have been bound inside `Mod` are available. In order to use the members from another module, it either has to be declared there with `module` or imported with `import`.

We don't have to give `Helpers` a new name, though, if we don't want to. We can write `import Helpers = Helpers` if we want to, and Dafny even provides the shorthand `import Helpers` for this behavior. You can't bind two modules with the same name at the same time, so sometimes you have to use the `=` version to ensure the names do not clash.

The ``QualifiedModuleName`` in the ``ModuleImport_`` starts with a sibling module of the importing module, or with a submodule of the importing module. There is no way to refer to the parent module, only sibling modules (and their submodules).

## Opening Modules

Sometimes, prefixing the members of the module you imported with the name is tedious and ugly, even if you select a short name when importing it. In this case, you can import the module as `opened`, which causes all of its members to be available without adding the module name. The `opened` keyword must immediately follow `import`, if it is present. For example, we could write the previous example as:

```
module Mod {
  import opened Helpers
  method m() {
    assert addOne(5) == 6;
  }
}
```

When opening modules, the newly bound members will have low priority, so they will be hidden by local definitions. This means if you define a local function called `addOne`, the function from `Helpers` will no longer be available under that name. When modules are opened, the original name binding is still present, however, so you can always use the name that was bound to get to anything that is hidden.

```
module Mod {
  import opened Helpers
  function addOne(n: nat): nat {
    n - 1
  }
  method m() {
    assert addOne(5) == 6; // this is now false,
                           // as this is the function just defined
    assert Helpers.addOne(5) == 6; // this is still true
  }
}
```

If you open two modules that both declare members with the same name, then neither member can be referred to without a module prefix, as it would be ambiguous which one was meant. Just opening the two modules is not an error, however, as long as you don't attempt to use members with common names. The `opened` keyword can be used with any kind of `import` declaration, including the module abstraction form.

## Module Abstraction

Sometimes, using a specific implementation is unnecessary; instead, all that is needed is a module that implements some interface. In that case, you can use an _abstract_ module import. In Dafny, this is written `import A as B`. This means bind the name `A` as before, but instead of getting the exact module `B`, you get any module that _adheres_ to `B`. Typically, the module `B` may have abstract type definitions, classes with bodyless methods, or otherwise be unsuitable to use directly. Because of the way refinement is defined, any refinement of `B` can be used safely. For example, if we start with:

```
module Interface {
  function method addSome(n: nat): nat
    ensures addSome(n) > n
}
module Mod {
  import A as Interface
  method m() {
    assert 6 <= A.addSome(5);
  }
}
```

then we can be more precise if we know that `addSome` actually adds exactly one. The following module has this behavior. Further, the postcondition is stronger, so this is actually a refinement of the Interface module.

```
module Implementation {
  function method addSome(n: nat): nat
    ensures addSome(n) == n + 1
  {
    n + 1
  }
}
```

We can then substitute `Implementation` for `A` in a new module, by declaring a refinement of `Mod` which defines `A` to be `Implementation`.

```
module Mod2 refines Mod {
  import A = Implementation
  ...
}
```
You can also give an implementation directly, without introducing a refinement, by giving a default to the abstract import:

```
module Interface {
  function method addSome(n: nat): nat
    ensures addSome(n) > n
}
module Mod {
  import A as Interface default Implementation
  method m() {
    assert 6 <= A.addSome(5);
  }
}
module Implementation {
  function method addSome(n: nat): nat
    ensures addSome(n) == n + 1
  {
    n + 1
  }
}
module Mod2 refines Mod {
  import A as Interface default Implementation
  ...
}
```
Regardless of whether there is a default, the only thing known about `A` in this example is that it has a function `addSome` that returns a strictly bigger result, so even with the default we still can't prove that `A.addSome(5) == 6`, only that `6 <= A.addSome(5)`.

When you refine an abstract import into a concrete one, or give a default, Dafny checks that the concrete module is a refinement of the abstract one. This means that the methods must have compatible signatures, all the classes and datatypes with their constructors and fields in the abstract one must be present in the concrete one, the specifications must be compatible, etc.

## Module Ordering and Dependencies

Dafny isn't particular about which order the modules appear in, but they must follow some rules to be well formed. As a rule of thumb, there should be a way to order the modules in a program such that each only refers to things defined **before** it in the source text. That doesn't mean the modules have to be given in that order. Dafny will figure out that order for you, assuming you haven't made any circular references. For example, this is pretty clearly meaningless:

```
import A = B
import B = A
```
You can have import statements at the toplevel, and you can import modules defined at the same level:

```
import A = B
method m() {
  A.whatever();
}
module B { ... }
```
In this case, everything is well defined because we can put `B` first, followed by the `A` import, and then finally `m()`.
If there is no ordering, then Dafny will give an error, complaining about a cyclic dependency. Note that when rearranging modules and imports, they have to be kept in the same containing module, which disallows some pathological module structures. Also, the imports and submodules are always considered to be first, even at the toplevel. This means that the following is not well formed: ``` method doIt() { } module M { method m() { doIt(); } } ``` because the module `M` must come before any other kind of members, such as methods. To define global functions like this, you can put them in a module (called `Globals`, say) and open it into any module that needs its functionality. Finally, if you import via a path, such as `import A = B.C`, then this creates a dependency of `A` on `B`, as we need to know what `B` is (is it abstract or concrete, or a refinement?). ## Name Resolution When Dafny sees something like `A.B.C`, how does it know what each part refers to? The process Dafny uses to determine what identifier sequences like this refer to is name resolution. Though the rules may seem complex, usually they do what you would expect. Dafny first looks up the initial identifier. Depending on what the first identifier refers to, the rest of the identifier is looked up in the appropriate context. In terms of the grammar, sequences like the above are represented as a ``NameSegment`` followed by 0 or more ``Suffix``es. A ``Suffix`` is more general and the form shown above would be for when the ``Suffix`` is an ``AugmentedDotSuffix_``. The resolution is different depending on whether it is in an expression context or a type context. ### Expression Context Name Resolution The leading ``NameSegment`` is resolved using the first following rule that succeeds. 0. Local variables, parameters and bound variables. These are things like `x`, `y`, and `i` in `var x;, ... returns (y: int)`, and `forall i :: ....` The declaration chosen is the match from the innermost matching scope. 1. 
If in a class, try to match a member of the class. If the member that is found is not static an implicit `this` is inserted. This works for fields, functions, and methods of the current class (if in a static context, then only static methods and functions are allowed). You can refer to fields of the current class either as `this.f` or `f`, assuming of course that `f` hasn't been hidden by one of the above. You can always prefix `this` if needed, which cannot be hidden. (Note, a field whose name is a string of digits must always have some prefix.)
2. If there is no ``Suffix``, then look for a datatype constructor, if unambiguous. Any datatype that doesn't need qualification (so the datatype name itself doesn't need a prefix) and also has a uniquely named constructor can have that constructor referred to just by its name. So if `datatype List = Cons(List) | Nil` is the only datatype that declares `Cons` and `Nil` constructors, then you can write `Cons(Cons(Nil))`. If the constructor name is not unique, then you need to prefix it with the name of the datatype (for example `List.Cons(List.Nil)`). This is done per constructor, not per datatype.
3. Look for a member of the enclosing module.
4. Module-level (static) functions and methods

TODO: Not sure about the following paragraph.

Opened modules are treated at each level, after the declarations in the current module. Opened modules only affect steps 2, 3 and 5. If an ambiguous name is found, an error is generated, rather than continuing down the list. After the first identifier, the rules are basically the same, except in the new context. For example, if the first identifier is a module, then the next identifier looks into that module. An opened module only applies within the module it is opened into. When looking up into another module, only things explicitly declared in that module are considered.

To resolve expression `E.id`:

First resolve expression E and any type arguments.

* If `E` resolved to a module `M`:
  0.
If `E.id` is not followed by any further suffixes, look for unambiguous datatype constructor.
  1. Member of module M: a sub-module (including submodules of imports), class, datatype, etc.
  2. Static function or method.
* If `E` denotes a type:
  3. Look up id as a member of that type
* If `E` denotes an expression:
  4. Let T be the type of E. Look up id in T.

### Type Context Name Resolution

In a type context the priority of ``NameSegment`` resolution is:

1. Type parameters.
2. Member of enclosing module (type name or the name of a module).

To resolve expression `E.id`:

* If `E` resolved to a module `M`:
  0. Member of module M: a sub-module (including submodules of imports), class, datatype, etc.
* If `E` denotes a type:
  1. If `allowDanglingDotName`: Return the type of `E` and the given `E.id`, letting the caller try to make sense of the final dot-name.

TODO: I don't understand this sentence. What is `allowDanglingDotName`?

# Specifications

Specifications describe logical properties of Dafny methods, functions, lambdas, iterators and loops. They specify preconditions, postconditions, invariants, what memory locations may be read or modified, and termination information by means of _specification clauses_. For each kind of specification zero or more specification clauses (of the type accepted for that type of specification) may be given, in any order.

We document specifications at these levels:

- At the lowest level are the various kinds of specification clauses, e.g. a ``RequiresClause_``.
- Next are the specifications for entities that need them, e.g. a ``MethodSpec``.
- At the top level are the entity declarations that include the specifications, e.g. ``MethodDecl``.

This section documents the first two of these in a bottom-up manner. We first document the clauses and then the specifications that use them.
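As a rough illustration of how specification clauses attach to a declaration, here is a minimal sketch. The names `Decrement` and `SumTo` are hypothetical and chosen only for this example; the clauses themselves are described in the sections that follow.

```dafny
// Hypothetical example: the names Decrement and SumTo are illustrative only.
method Decrement(x: int) returns (y: int)
  requires x > 0        // precondition: must hold at every call site
  ensures 0 <= y < x    // postcondition: what callers may assume afterwards
{
  y := x - 1;
}

function SumTo(n: nat): nat
  decreases n           // termination metric for the recursive call
{
  if n == 0 then 0 else n + SumTo(n - 1)
}
```

The method carries a method specification (here a **requires** and an **ensures** clause), while the function carries a function specification (here only a **decreases** clause).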
## Specification Clauses

### Requires Clause

````
RequiresClause_ =
    "requires" Expression(allowLemma: false, allowLambda: false)
````

The **requires** clauses specify preconditions for methods, functions, lambda expressions and iterators. Dafny checks that the preconditions are met at all call sites. The callee may then assume the preconditions hold on entry.

If no **requires** clause is specified it is taken to be `true`.

If more than one **requires** clause is given, then the precondition is the conjunction of all of the expressions from all of the **requires** clauses.

### Ensures Clause

````
EnsuresClause_ =
    "ensures" { Attribute } Expression(allowLemma: false, allowLambda: false)
ForAllEnsuresClause_ =
    "ensures" Expression(allowLemma: false, allowLambda: true)
FunctionEnsuresClause_ =
    "ensures" Expression(allowLemma: false, allowLambda: false)
````

An **ensures** clause specifies the postcondition for a method, function or iterator.

If no **ensures** clause is specified it is taken to be `true`.

If more than one **ensures** clause is given, then the postcondition is the conjunction of all of the expressions from all of the **ensures** clauses.

TODO: In the present sources ``FunctionEnsuresClause_`` differs from ``EnsuresClause_`` only in that it is not allowed to specify ``Attribute``s. This seems like a bug and will likely be fixed in a future version.

### Decreases Clause

````
DecreasesClause_(allowWildcard, allowLambda) =
    "decreases" { Attribute } DecreasesList(allowWildcard, allowLambda)
FunctionDecreasesClause_(allowWildcard, allowLambda) =
    "decreases" DecreasesList(allowWildcard, allowLambda)
````

````
DecreasesList(allowWildcard, allowLambda) =
    PossiblyWildExpression(allowLambda)
    { "," PossiblyWildExpression(allowLambda) }
````

If `allowWildcard` is false but one of the ``PossiblyWildExpression``s is a wild-card, an error is reported.

TODO: A ``FunctionDecreasesClause_`` is not allowed to specify ``Attribute``s. This will be fixed in a future version.
**Decreases** clauses are used to prove termination in the presence of recursion. If more than one **decreases** clause is given it is as if a single **decreases** clause had been given with the collected list of arguments. That is,

```
decreases A, B
decreases C, D
```
is equivalent to

```
decreases A, B, C, D
```
If any of the expressions in the **decreases** clause are wild (i.e. "*") then proof of termination will be skipped.

Termination metrics in Dafny, which are declared by **decreases** clauses, are lexicographic tuples of expressions. At each recursive (or mutually recursive) call to a function or method, Dafny checks that the effective **decreases** clause of the callee is strictly smaller than the effective **decreases** clause of the caller.

What does "strictly smaller" mean? Dafny provides a built-in well-founded order for every type and, in some cases, between types. For example, the Boolean "false" is strictly smaller than "true", the integer 78 is strictly smaller than 102, the set `{2,5}` is strictly smaller than the set `{2,3,5}`, and for "s" of type `seq<Color>` where `Color` is some inductive datatype, the color `s[0]` is strictly less than `s` (provided `s` is nonempty).

What does "effective decreases clause" mean? Dafny always appends a "top" element to the lexicographic tuple given by the user. This top element cannot be syntactically denoted in a Dafny program and it never occurs as a run-time value either. Rather, it is a fictitious value, which here we will denote \top, such that each value that can ever occur in a Dafny program is strictly less than \top. Dafny sometimes also prepends expressions to the lexicographic tuple given by the user. The effective decreases clause is any such prefix, followed by the user-provided decreases clause, followed by \top.
We said "user-provided decreases clause", but if the user completely omits a "decreases" clause, then Dafny will usually make a guess at one, in which case the effective decreases clause is any prefix followed by the guess followed by \top. (If you're using the Dafny IDE in Visual Studio, you can hover the mouse over the name of a recursive function or method, or the "while" keyword for a loop, to see the "decreases" clause that Dafny guessed, if any.)

Here is a simple but interesting example: the Fibonacci function.

```
function Fib(n: nat) : nat
{
  if n < 2 then n else Fib(n-2) + Fib(n-1)
}
```
In this example, if you hover your mouse over the function name you will see that Dafny has supplied a **decreases** `n` clause.

Let's take a look at the kind of example where a mysterious-looking decreases clause like "Rank, 0" is useful.

Consider two mutually recursive methods, `A` and `B`:

```
method A(x: nat)
{
  B(x);
}

method B(x: nat)
{
  if x != 0 { A(x-1); }
}
```
To prove termination of `A` and `B`, Dafny needs to have effective decreases clauses for A and B such that:

* the measure for the callee `B(x)` is strictly smaller than the measure for the caller `A(x)`, and
* the measure for the callee `A(x-1)` is strictly smaller than the measure for the caller `B(x)`.

Satisfying the second of these conditions is easy, but what about the first? Note, for example, that declaring both `A` and `B` with "decreases x" does not work, because that won't prove a strict decrease for the call from `A(x)` to `B(x)`.

Here's one possibility (for brevity, we will omit the method bodies):

```
method A(x: nat)
  decreases x, 1

method B(x: nat)
  decreases x, 0
```
For the call from `A(x)` to `B(x)`, the lexicographic tuple `"x, 0"` is strictly smaller than `"x, 1"`, and for the call from `B(x)` to `A(x-1)`, the lexicographic tuple `"x-1, 1"` is strictly smaller than `"x, 0"`.
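For reference, here is a sketch of the two methods with their bodies restored and the **decreases** clauses attached. It only combines the bodies and clauses already shown in this section; nothing new is introduced.

```dafny
// A and B from the text, now with both bodies and decreases clauses.
method A(x: nat)
  decreases x, 1
{
  B(x);        // callee tuple (x, 0) is below caller tuple (x, 1)
}

method B(x: nat)
  decreases x, 0
{
  if x != 0 {
    A(x - 1);  // callee tuple (x-1, 1) is below caller tuple (x, 0)
  }
}
```

At each call the verifier compares the callee's tuple, instantiated with the actual arguments, against the caller's tuple, as the comments indicate.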
Two things to note: First, the choice of "0" and "1" as the second components of these lexicographic tuples is rather arbitrary. It could just as well have been "false" and "true", respectively, or the sets `{2,5}` and `{2,3,5}`. Second, the keyword **decreases** often gives rise to an intuitive English reading of the declaration. For example, you might say that the recursive calls in the definition of the familiar Fibonacci function `Fib(n)` "decreases n". But when the lexicographic tuple contains constants, the English reading of the declaration becomes mysterious and may give rise to questions like "how can you decrease the constant 0?". The keyword is just that---a keyword. It says "here comes a list of expressions that make up the lexicographic tuple we want to use for the termination measure". What is important is that one effective decreases clause is compared against another one, and it certainly makes sense to compare something to a constant (and to compare one constant to another). We can simplify things a little bit by remembering that Dafny appends \top to the user-supplied decreases clause. For the A-and-B example, this lets us drop the constant from the **decreases** clause of A: ``` method A(x: nat) decreases x method B(x: nat) decreases x, 0 ``` The effective decreases clause of `A` is `"x, \top"` and the effective decreases clause of `B` is `"x, 0, \top"`. These tuples still satisfy the two conditions `(x, 0, \top) < (x, \top)` and `(x-1, \top) < (x, 0, \top)`. And as before, the constant "0" is arbitrary; anything less than \top (which is any Dafny expression) would work. Let's take a look at one more example that better illustrates the utility of `\top`. 
Consider again two mutually recursive methods, call them `Outer` and `Inner`, representing the recursive counterparts of what iteratively might be two nested loops:

```
method Outer(x: nat)
{
  // set y to an arbitrary non-negative integer
  var y :| 0 <= y;
  Inner(x, y);
}

method Inner(x: nat, y: nat)
{
  if y != 0 {
    Inner(x, y-1);
  } else if x != 0 {
    Outer(x-1);
  }
}
```
The body of `Outer` uses an assign-such-that statement to represent some computation that takes place before `Inner` is called. It sets "y" to some arbitrary non-negative value. In a more concrete example, `Inner` would do some work for each "y" and then continue as `Outer` on the next smaller "x".

Using a **decreases** clause `"x, y"` for `Inner` seems natural, but if we don't have any bound on the size of the `"y"` computed by `Outer`, there is no expression we can write in the **decreases** clause of `Outer` that is sure to lead to a strictly smaller value for `"y"` when `Inner` is called. `\top` to the rescue. If we arrange for the effective decreases clause of `Outer` to be `"x, \top"` and the effective decreases clause for `Inner` to be `"x, y, \top"`, then we can show the strict decreases as required. Since `\top` is implicitly appended, the two decreases clauses declared in the program text can be:

```
method Outer(x: nat)
  decreases x

method Inner(x: nat, y: nat)
  decreases x, y
```
Moreover, remember that if a function or method has no user-declared **decreases** clause, Dafny will make a guess. The guess is (usually) the list of arguments of the function/method, in the order given. This is exactly the decreases clauses needed here. Thus, Dafny successfully verifies the program without any explicit decreases clauses:

```
method Outer(x: nat)
{
  var y :| 0 <= y;
  Inner(x, y);
}

method Inner(x: nat, y: nat)
{
  if y != 0 {
    Inner(x, y-1);
  } else if x != 0 {
    Outer(x-1);
  }
}
```
The ingredients are simple, but the end result may seem like magic.
For many users, however, there may be no magic at all -- the end result may be so natural that the user never even has to think about the need to prove termination in the first place.

### Framing

````
FrameExpression(allowLemma, allowLambda) =
    ( Expression(allowLemma, allowLambda) [ FrameField ]
    | FrameField )
````

````
FrameField = "`" Ident
````

````
PossiblyWildFrameExpression(allowLemma) =
    ( "*" | FrameExpression(allowLemma, allowLambda: false) )
````

Frame expressions are used to denote the set of memory locations that a Dafny program element may read or write. A frame expression is a set expression. The form `{}` (that is, the empty set) says that no memory locations may be modified, which is also the default if no **modifies** clause is given explicitly.

Note that framing only applies to the heap, or memory accessed through references. Local variables are not stored on the heap, so they cannot be mentioned (well, they are not in scope in the declaration) in reads annotations. Note also that types like sets, sequences, and multisets are value types, and are treated like integers or local variables. Arrays and objects are reference types, and they are stored on the heap (though as always there is a subtle distinction between the reference itself and the value it points to.)

The ``FrameField`` construct is used to specify a field of a class object. The identifier following the back-quote is the name of the field being referenced. If the `FrameField` is preceded by an expression the expression must be a reference to an object having that field. If the `FrameField` is not preceded by an expression then the frame expression is referring to that field of the current object. This form is only used from a method of a class.

The use of ``FrameField`` is discouraged as in practice it has not been shown to either be more concise or to perform better.
Also, there's (unfortunately) no form of it for array elements---one could imagine ``` modifies a`[j] ``` Also, ``FrameField`` is not taken into consideration for lambda expressions. ### Reads Clause ```` FunctionReadsClause_ = "reads" PossiblyWildFrameExpression (allowLemma: false) { "," PossiblyWildFrameExpression(allowLemma: false) } LambdaReadsClause_ = "reads" PossiblyWildFrameExpression(allowLemma: true) IteratorReadsClause_ = "reads" { Attribute } FrameExpression(allowLemma: false, allowLambda: false) { "," FrameExpression(allowLemma: false, allowLambda: false) } PossiblyWildExpression(allowLambda) = ( "*" | Expression(allowLemma: false, allowLambda) ) ```` Functions are not allowed to have side effects but may be restricted in what they can read. The _reading frame_ of a function (or predicate) is all the memory locations that the function is allowed to read. The reason we might limit what a function can read is so that when we write to memory, we can be sure that functions that did not read that part of memory have the same value they did before. For example, we might have two arrays, one of which we know is sorted. If we did not put a reads annotation on the sorted predicate, then when we modify the unsorted array, we cannot determine whether the other array stopped being sorted. While we might be able to give invariants to preserve it in this case, it gets even more complex when manipulating data structures. In this case, framing is essential to making the verification process feasible. It is not just the body of a function that is subject to **reads** checks, but also its precondition and the **reads** clause itself. A reads clause can list a wildcard ("*"), which allows the enclosing function to read anything. In many cases, and in particular in all cases where the function is defined recursively, this makes it next to impossible to make any use of the function. Nevertheless, as an experimental feature, the language allows it (and it is sound). 
Note that a "*" makes the rest of the frame expression irrelevant. A **reads** clause specifies the set of memory locations that a function, lambda, or iterator may read. If more than one **reads** clause is given in a specification the effective read set is the union of the sets specified. If there are no **reads** clauses the effective read set is empty. If `"*"` is given in a **reads** clause it means any memory may be read. TODO: It would be nice if the different forms of read clauses could be combined. In a future version the single form of read clause will allow a list and attributes. ### Modifies Clause ```` ModifiesClause_ = "modifies" { Attribute } FrameExpression(allowLemma: false, allowLambda: false) { "," FrameExpression(allowLemma: false, allowLambda: false) } ```` Frames also affect methods. As you might have guessed, methods are not required to list the things they read. Methods are allowed to read whatever memory they like, but they are required to list which parts of memory they modify, with a modifies annotation. They are almost identical to their reads cousins, except they say what can be changed, rather than what the value of the function depends on. In combination with reads, modification restrictions allow Dafny to prove properties of code that would otherwise be very difficult or impossible. Reads and modifies are one of the tools that allow Dafny to work on one method at a time, because they restrict what would otherwise be arbitrary modifications of memory to something that Dafny can reason about. Note that fields of newly allocated objects can always be modified. It is also possible to frame what can be modified by a block statement by means of the block form of the [modify statement](#sec-modify-statement) (Section [#sec-modify-statement]). A **modifies** clause specifies the set of memory locations that a method, iterator or loop body may modify. 
If more than one **modifies** clause is given in a specification, the effective modifies set is the union of the sets specified. If no **modifies** clause is given the effective modifies set is empty. A loop can also have a **modifies** clause. If none is given, the loop gets to modify anything the enclosing context is allowed to modify. ### Invariant Clause ```` InvariantClause_ = "invariant" { Attribute } Expression(allowLemma: false, allowLambda: true) ```` An **invariant** clause is used to specify an invariant for a loop. If more than one **invariant** clause is given for a loop the effective invariant is the conjunction of the conditions specified. The invariant must hold on entry to the loop. And assuming it is valid on entry, Dafny must be able to prove that it then holds at the end of the loop. ## Method Specification ```` MethodSpec = { ModifiesClause_ | RequiresClause_ | EnsuresClause_ | DecreasesClause_(allowWildcard: true, allowLambda: false) } ```` A method specification is zero or more **modifies**, **requires**, **ensures** or **decreases** clauses, in any order. A method does not have **reads** clauses because methods are allowed to read any memory. ## Function Specification ```` FunctionSpec = { RequiresClause_ | FunctionReadsClause_ | FunctionEnsuresClause_ | FunctionDecreasesClause_(allowWildcard: false, allowLambda: false) } ```` A function specification is zero or more **reads**, **requires**, **ensures** or **decreases** clauses, in any order. A function specification does not have **modifies** clauses because functions are not allowed to modify any memory. ## Lambda Specification ```` LambdaSpec_ = { LambdaReadsClause_ | RequiresClause_ } ```` A lambda specification is zero or more **reads** or **requires** clauses. Lambda specifications do not have **ensures** clauses because the body is never opaque. Lambda specifications do not have **decreases** clauses because they do not have names and thus cannot be recursive. 
A lambda specification does not have **modifies** clauses because lambdas are not allowed to modify any memory. ## Iterator Specification ```` IteratorSpec = { IteratorReadsClause_ | ModifiesClause_ | [ "yield" ] RequiresClause_ | [ "yield" ] EnsuresClause_ | DecreasesClause_(allowWildcard: false, allowLambda: false) } ```` An iterator specification applies both to the iterator's constructor method and to its `MoveNext` method. The **reads** and **modifies** clauses apply to both of them. For the **requires** and **ensures** clauses, if `yield` is not present they apply to the constructor, but if `yield` is present they apply to the `MoveNext` method. TODO: What is the meaning of a **decreases** clause on an iterator? Does it apply to `MoveNext`? Make sure our description of iterators explains these. TODO: What is the relationship between the post condition and the `Valid()` predicate? ## Loop Specification ```` LoopSpec = { InvariantClause_ | DecreasesClause_(allowWildcard: true, allowLambda: true) | ModifiesClause_ } ```` A loop specification provides the information Dafny needs to prove properties of a loop. The ``InvariantClause_`` clause is effectively a precondition and it along with the negation of the loop test condition provides the postcondition. The ``DecreasesClause_`` clause is used to prove termination. # Types ```` Type = DomainType [ "->" Type ] ```` A Dafny type is a domain type (i.e. a type that can be the domain of a function type) optionally followed by an arrow and a range type. ```` DomainType = ( BoolType_ | CharType_ | NatType_ | IntType_ | RealType_ | ObjectType_ | FiniteSetType_ | InfiniteSetType_ | MultisetType_ | SequenceType_ | StringType_ | FiniteMapType_ | InfiniteMapType_ | ArrayType_ | TupleType_ | NamedType_ ) ```` The domain types comprise the builtin scalar types, the builtin collection types, tuple types (including as a special case a parenthesized type) and reference types. 
Dafny types may be categorized as either value types or reference types. ## Value Types The value types are those whose values do not lie in the program heap. These are: * The basic scalar types: `bool`, `char`, `nat`, `int`, `real` * The built-in collection types: `set`, `multiset`, `seq`, `string`, `map`, `imap` * Tuple Types * Inductive and co-inductive types Data items having value types are passed by value. Since they are not considered to occupy _memory_, framing expressions do not reference them. ## Reference Types Dafny offers a host of _reference types_. These represent _references_ to objects allocated dynamically in the program heap. To access the members of an object, a reference to (that is, a _pointer_ to or _object identity_ of) the object is _dereferenced_. The reference types are class types, traits and array types. The special value `null` is part of every reference type.[^fn-nullable] [^fn-nullable]: This will change in a future version of Dafny that will support both nullable and (by default) non-null reference types. ## Named Types ```` NamedType_ = NameSegmentForTypeName { "." NameSegmentForTypeName } ```` A ``NamedType_`` is used to specify a user-defined type by name (possibly module-qualified). Named types are introduced by class, trait, inductive, co-inductive, synonym and opaque type declarations. They are also used to refer to type variables. ```` NameSegmentForTypeName = Ident [ GenericInstantiation ] ```` A ``NameSegmentForTypeName`` is a type name optionally followed by a ``GenericInstantiation`` which supplies type parameters to a generic type, if needed. It is a special case of a ``NameSegment`` (See Section [#sec-name-segment]) that does not allow a ``HashCall``. The following sections describe each of these kinds of types in more detail. # Basic types Dafny offers these basic types: `bool` for booleans, `char` for characters, `int` and `nat` for integers, and `real` for reals. 
## Booleans

````
BoolType_ = "bool"
````

There are two boolean values and each has a corresponding literal in the language: `false` and `true`.

In addition to equality (`==`) and disequality (`!=`), which are defined on all types, type `bool` supports the following operations:

+--------------------+------------------------------------+
| operator           | description                        |
+--------------------+------------------------------------+
| `<==>`             | equivalence (if and only if)       |
+--------------------+------------------------------------+
| `==>`              | implication (implies)              |
| `<==`              | reverse implication (follows from) |
+--------------------+------------------------------------+
| `&&`               | conjunction (and)                  |
| [\|\|]{.monospace} | disjunction (or)                   |
+--------------------+------------------------------------+
| `!`                | negation (not)                     |
+--------------------+------------------------------------+

Negation is unary; the others are binary. The table shows the operators in groups of increasing binding power, with equality binding stronger than conjunction and disjunction, and weaker than negation. Within each group, different operators do not associate, so parentheses need to be used. For example,

```
A && B || C    // error
```
would be ambiguous and instead has to be written as either

```
(A && B) || C
```
or

```
A && (B || C)
```
depending on the intended meaning.

### Equivalence Operator

The expressions `A <==> B` and `A == B` give the same value, but note that `<==>` is _associative_ whereas `==` is _chaining_. So,

```
A <==> B <==> C
```
is the same as

```
A <==> (B <==> C)
```
and

```
(A <==> B) <==> C
```
whereas

```
A == B == C
```
is simply a shorthand for

```
A == B && B == C
```

### Conjunction and Disjunction

Conjunction is associative and so is disjunction. These operators are _short circuiting (from left to right)_, meaning that their second argument is evaluated only if the evaluation of the first operand does not determine the value of the expression.
Logically speaking, the expression `A && B` is defined when `A` is defined and either `A` evaluates to `false` or `B` is defined. When `A && B` is defined, its meaning is the same as the ordinary, symmetric mathematical conjunction ∧. The same holds for `||` and ∨. ### Implication and Reverse Implication Implication is _right associative_ and is short-circuiting from left to right. Reverse implication `B <== A` is exactly the same as `A ==> B`, but gives the ability to write the operands in the opposite order. Consequently, reverse implication is _left associative_ and is short-circuiting from _right to left_. To illustrate the associativity rules, each of the following four lines expresses the same property, for any `A`, `B`, and `C` of type `bool`: ``` A ==> B ==> C A ==> (B ==> C) // parentheses redundant, since ==> is right associative C <== B <== A (C <== B) <== A // parentheses redundant, since <== is left associative ``` To illustrate the short-circuiting rules, note that the expression `a.Length` is defined for an array `a` only if `a` is not `null` (see Section [#sec-reference-types]), which means the following two expressions are well-formed: ``` a != null ==> 0 <= a.Length 0 <= a.Length <== a != null ``` The contrapositive of these two expressions would be: ``` a.Length < 0 ==> a == null // not well-formed a == null <== a.Length < 0 // not well-formed ``` but these expressions are not well-formed, since well-formedness requires the left (and right, respectively) operand, `a.Length < 0`, to be well-formed by itself. Implication `A ==> B` is equivalent to the disjunction `!A || B`, but is sometimes (especially in specifications) clearer to read. Since, `||` is short-circuiting from left to right, note that ``` a == null || 0 <= a.Length ``` is well-formed, whereas ``` 0 <= a.Length || a == null // not well-formed ``` is not. In addition, booleans support _logical quantifiers_ (forall and exists), described in section [#sec-quantifier-expression]. 
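The relationships between the boolean operators described in this section can be checked by the verifier itself. The following is a minimal sketch; the lemma name `BoolOperatorFacts` is hypothetical and introduced only for illustration.

```dafny
// Hypothetical lemma restating the operator facts from this section.
lemma BoolOperatorFacts(A: bool, B: bool, C: bool)
  // implication is the same as the disjunction !A || B
  ensures (A ==> B) == (!A || B)
  // reverse implication is implication with the operands swapped
  ensures (B <== A) == (A ==> B)
  // ==> is right associative
  ensures (A ==> B ==> C) == (A ==> (B ==> C))
  // <== is left associative
  ensures (C <== B <== A) == ((C <== B) <== A)
  // equivalence and equality on booleans give the same value
  ensures (A <==> B) == (A == B)
{
  // The empty body is intended to suffice: these are propositional facts.
}
```

Each postcondition is parenthesized explicitly so that the comparison with `==` does not interact with the binding power of the operator under test.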
## Numeric types ```` IntType_ = "int" RealType_ = "real" ```` Dafny supports _numeric types_ of two kinds, _integer-based_, which includes the basic type `int` of all integers, and _real-based_, which includes the basic type `real` of all real numbers. User-defined numeric types based on `int` and `real`, called _newtypes_, are described in Section [#sec-newtypes]. Also, the _subset type_ `nat`, representing the non-negative subrange of `int`, is described in Section [#sec-subset-types]. The language includes a literal for each non-negative integer, like `0`, `13`, and `1985`. Integers can also be written in hexadecimal using the prefix "`0x`", as in `0x0`, `0xD`, and `0x7c1` (always with a lower case `x`, but the hexadecimal digits themselves are case insensitive). Leading zeros are allowed. To form negative integers, use the unary minus operator. There are also literals for some of the non-negative reals. These are written as a decimal point with a nonempty sequence of decimal digits on both sides. For example, `1.0`, `1609.344`, and `0.5772156649`. For integers (in both decimal and hexadecimal form) and reals, any two digits in a literal may be separated by an underscore in order to improve human readability of the literals. 
For example:
```
1_000_000        // easier to read than 1000000
0_12_345_6789    // strange but legal formatting of 123456789
0x8000_0000      // same as 0x80000000 -- hex digits are often placed in groups of 4
0.000_000_000_1  // same as 0.0000000001 -- 1 $[Ångström]{.comment-color}$
```

In addition to equality and disequality, numeric types support the following relational operations:

+-----------------+------------------------------------+
| operator        | description                        |
+-----------------+------------------------------------+
| [<]{.monospace} | less than                          |
| `<=`            | at most                            |
| `>=`            | at least                           |
| `>`             | greater than                       |
+-----------------+------------------------------------+

Like equality and disequality, these operators are chaining, as long as they are chained in the "same direction". That is,
```
A <= B < C == D <= E
```
is simply a shorthand for
```
A <= B && B < C && C == D && D <= E
```
whereas
```
A < B > C
```
is not allowed.

There are also operators on each numeric type:

+---------------+------------------------------------+
| operator      | description                        |
+---------------+------------------------------------+
| `+`           | addition (plus)                    |
| `-`           | subtraction (minus)                |
+---------------+------------------------------------+
| `*`           | multiplication (times)             |
| `/`           | division (divided by)              |
| `%`           | modulus (mod)                      |
+---------------+------------------------------------+
| `-`           | negation (unary minus)             |
+---------------+------------------------------------+

The binary operators are left associative, and they associate with each other in the two groups. The groups are listed in order of increasing binding power, with equality binding less strongly than any of these operators. Modulus is supported only for integer-based numeric types. Integer division and modulus are the _Euclidean division and modulus_. This means that modulus always returns a non-negative number, regardless of the signs of the two operands.
More precisely, for any integer `a` and non-zero integer `b`, ``` a == a / b * b + a % b 0 <= a % b < B ``` where `B` denotes the absolute value of `b`. Real-based numeric types have a member `Trunc` that returns the _floor_ of the real value, that is, the largest integer not exceeding the real value. For example, the following properties hold, for any `r` and `r'` of type `real`: ``` 3.14.Trunc == 3 (-2.5).Trunc == -3 -2.5.Trunc == -2 real(r.Trunc) <= r r <= r' ==> r.Trunc <= r'.Trunc ``` Note in the third line that member access (like `.Trunc`) binds stronger than unary minus. The fourth line uses the conversion function `real` from `int` to `real`, as described in Section [#sec-numeric-conversion-operations]. ## Characters ```` CharType_ = "char" ```` Dafny supports a type `char` of _characters_. Character literals are enclosed in single quotes, as in `'D'`. Their form is described by the ``charToken`` nonterminal in the grammar. To write a single quote as a character literal, it is necessary to use an _escape sequence_. Escape sequences can also be used to write other characters. The supported escape sequences are as follows: +--------------------+------------------------------------------------------------+ | escape sequence | meaning | +--------------------+------------------------------------------------------------+ | `\'` | the character `'` | | [\\\"]{.monospace} | the character [\"]{.monospace} | | `\\` | the character `\` | | `\0` | the null character, same as `\u0000` | | `\n` | line feed | | `\r` | carriage return | | `\t` | horizontal tab | | `\u$_xxxx_$` | universal character whose hexadecimal code is `$_xxxx_$` | +--------------------+------------------------------------------------------------+ The escape sequence for a double quote is redundant, because [\'\"\']{.monospace} and [\'\\\"\']{.monospace} denote the same character---both forms are provided in order to support the same escape sequences as for string literals (Section [#sec-strings]). 
In the form `\u$_xxxx_$`, the `u` is always lower case, but the four hexadecimal digits are case insensitive.

Character values are ordered and can be compared using the standard relational operators:

+-----------------+------------------------------------+
| operator        | description                        |
+-----------------+------------------------------------+
| [<]{.monospace} | less than                          |
| `<=`            | at most                            |
| `>=`            | at least                           |
| `>`             | greater than                       |
+-----------------+------------------------------------+

Sequences of characters represent _strings_, as described in Section [#sec-strings].

The only other operations on characters are obtaining a character by indexing into a string, and the implicit conversion to string when used as a parameter of a `print` statement.

TODO: Are there any conversions between `char` values and numeric values?

# Type parameters

````
GenericParameters = "<" TypeVariableName [ "(" "==" ")" ]
      { "," TypeVariableName [ "(" "==" ")" ] } ">"
````

Many of the types (as well as functions and methods) in Dafny can be parameterized by types. These _type parameters_ are typically declared inside angle brackets and can stand for any type.

It is sometimes necessary to restrict these type parameters so that they can only be instantiated by certain families of types. As such, Dafny distinguishes types that support the equality operation not only in ghost contexts but also in compiled contexts. To indicate that a type parameter is restricted to such _equality supporting_ types, the name of the type parameter takes the suffix "`(==)`".[^fn-type-mode] For example,
```
method Compare<T(==)>(a: T, b: T) returns (eq: bool)
{
  if a == b { eq := true; } else { eq := false; }
}
```
is a method whose type parameter is restricted to equality-supporting types. Again, note that _all_ types support equality in _ghost_ contexts; the difference is only for non-ghost (that is, compiled) code.
Co-inductive datatypes, function types, as well as inductive datatypes with ghost parameters are examples of types that are not equality supporting.

[^fn-type-mode]: Being equality-supporting is just one of many _modes_ that one can imagine types in a rich type system to have. For example, other modes could include having a total order, being zero-initializable, and possibly being uninhabited. If Dafny were to support more modes in the future, the "`()`"-suffix syntax may be extended. For now, the suffix can only indicate the equality-supporting mode.

Dafny has some inference support that makes certain signatures less cluttered (described in a different part of the Dafny language reference). In some cases, this support will infer that a type parameter must be restricted to equality-supporting types, in which case Dafny adds the "`(==)`" automatically.

TODO: Need to describe type inference somewhere.

# Generic Instantiation

````
GenericInstantiation = "<" Type { "," Type } ">"
````

When a generic entity is used, actual types must be specified for each generic parameter. This is done using a ``GenericInstantiation``. If the `GenericInstantiation` is omitted, type inference will try to fill these in.

# Collection types

Dafny offers several built-in collection types.

## Sets

````
FiniteSetType_ = "set" [ GenericInstantiation ]
InfiniteSetType_ = "iset" [ GenericInstantiation ]
````

For any type `T`, each value of type `set<T>` is a finite set of `T` values.

TODO: Set membership is determined by equality in the type `T`, so `set<T>` can be used in a non-ghost context only if `T` is equality supporting.

For any type `T`, each value of type `iset<T>` is a potentially infinite set of `T` values.

A set can be formed using a _set display_ expression, which is a possibly empty, unordered, duplicate-insensitive list of expressions enclosed in curly braces. To illustrate,
```
{}
{2, 7, 5, 3}
{4+2, 1+5, a*b}
```
are three examples of set displays.
There is also a _set comprehension_ expression (with a binder, like in logical quantifications), described in section [#sec-set-comprehension-expressions]. In addition to equality and disequality, set types support the following relational operations: +-----------------+------------------------------------+ | operator | description | +-----------------+------------------------------------+ | [<]{.monospace} | proper subset | | `<=` | subset | | `>=` | superset | | `>` | proper superset | +-----------------+------------------------------------+ Like the arithmetic relational operators, these operators are chaining. Sets support the following binary operators, listed in order of increasing binding power: +---------------+------------------------------------+ | operator | description | +---------------+------------------------------------+ | `!!` | disjointness | +---------------+------------------------------------+ | `+` | set union | | `-` | set difference | +---------------+------------------------------------+ | `*` | set intersection | +---------------+------------------------------------+ The associativity rules of `+`, `-`, and `*` are like those of the arithmetic operators with the same names. The expression `A !! B`, whose binding power is the same as equality (but which neither associates nor chains with equality), says that sets `A` and `B` have no elements in common, that is, it is equivalent to ``` A * B == {} ``` However, the disjointness operator is chaining, so `A !! B !! C !! 
D` means:
```
A * B == {} && (A + B) * C == {} && (A + B + C) * D == {}
```

In addition, for any set `s` of type `set<T>` or `iset<T>` and any expression `e` of type `T`, sets support the following operations:

+---------------------+------------------------------------+
| expression          | description                        |
+---------------------+------------------------------------+
| [\|s\|]{.monospace} | set cardinality                    |
| `e in s`            | set membership                     |
| `e !in s`           | set non-membership                 |
+---------------------+------------------------------------+

The expression `e !in s` is a syntactic shorthand for `!(e in s)`.

## Multisets

````
MultisetType_ = "multiset" [ GenericInstantiation ]
````

A _multiset_ is similar to a set, but keeps track of the multiplicity of each element, not just its presence or absence. For any type `T`, each value of type `multiset<T>` is a map from `T` values to natural numbers denoting each element's multiplicity. Multisets in Dafny are finite, that is, they contain a finite number of each of a finite set of elements. Stated differently, a multiset maps only a finite number of elements to non-zero (finite) multiplicities.

Like sets, multiset membership is determined by equality in the type `T`, so `multiset<T>` can be used in a non-ghost context only if `T` is equality supporting.

A multiset can be formed using a _multiset display_ expression, which is a possibly empty, unordered list of expressions enclosed in curly braces after the keyword `multiset`. To illustrate,
```
multiset{}
multiset{0, 1, 1, 2, 3, 5}
multiset{4+2, 1+5, a*b}
```
are three examples of multiset displays. There is no multiset comprehension expression.
In addition to equality and disequality, multiset types support the following relational operations: +-----------------+------------------------------------+ | operator | description | +-----------------+------------------------------------+ | [<]{.monospace} | proper multiset subset | | `<=` | multiset subset | | `>=` | multiset superset | | `>` | proper multiset superset | +-----------------+------------------------------------+ Like the arithmetic relational operators, these operators are chaining. Multisets support the following binary operators, listed in order of increasing binding power: +---------------+------------------------------------+ | operator | description | +---------------+------------------------------------+ | `!!` | multiset disjointness | +---------------+------------------------------------+ | `+` | multiset union | | `-` | multiset difference | +---------------+------------------------------------+ | `*` | multiset intersection | +---------------+------------------------------------+ The associativity rules of `+`, `-`, and `*` are like those of the arithmetic operators with the same names. The `+` operator adds the multiplicity of corresponding elements, the `-` operator subtracts them (but 0 is the minimum multiplicity), and the `*` has multiplicity that is the minimum of the multiplicity of the operands. The expression `A !! B` says that multisets `A` and `B` have no elements in common, that is, it is equivalent to ``` A * B == multiset{} ``` Like the analogous set operator, `!!` is chaining. 
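The multiplicity rules for `+`, `-`, and `*` are easiest to see on concrete values. The following is a small illustrative program, not taken from the original manual; the multisets `a` and `b` are arbitrary choices, and the selection `s[e]` used to read a multiplicity is the operation listed in the table of multiset expressions in this section. The values noted in the comments follow directly from the rules stated above.

```dafny
method Main()
{
  var a := multiset{1, 1, 2};   // 1 occurs twice, 2 occurs once
  var b := multiset{1, 3};      // 1 and 3 each occur once

  print (a + b)[1], "\n";  // union adds multiplicities: 2 + 1 = 3
  print (a - b)[1], "\n";  // difference subtracts them: 2 - 1 = 1
  print (b - a)[1], "\n";  // but 0 is the minimum multiplicity: 1 - 2 is clipped to 0
  print (a * b)[1], "\n";  // intersection takes the minimum: min(2, 1) = 1
  print (a * b)[2], "\n";  // 2 does not occur in b: min(1, 0) = 0
  print a !! b, "\n";      // false, since both multisets contain 1
}
```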
In addition, for any multiset `s` of type `multiset<T>`, expression `e` of type `T`, and non-negative integer-based numeric `n`, multisets support the following operations:

+---------------------+------------------------------------------+
| expression          | description                              |
+---------------------+------------------------------------------+
| [\|s\|]{.monospace} | multiset cardinality                     |
| `e in s`            | multiset membership                      |
| `e !in s`           | multiset non-membership                  |
| `s[e]`              | multiplicity of `e` in `s`               |
| `s[e := n]`         | multiset update (change of multiplicity) |
+---------------------+------------------------------------------+

The expression `e in s` returns `true` if and only if `s[e] != 0`. The expression `e !in s` is a syntactic shorthand for `!(e in s)`. The expression `s[e := n]` denotes a multiset like `s`, but where the multiplicity of element `e` is `n`. Note that the multiset update `s[e := 0]` results in a multiset like `s` but without any occurrences of `e` (whether or not `s` has occurrences of `e` in the first place). As another example, note that `s - multiset{e}` is equivalent to:
```
if e in s then s[e := s[e] - 1] else s
```

## Sequences

````
SequenceType_ = "seq" [ GenericInstantiation ]
````

For any type `T`, a value of type `seq<T>` denotes a _sequence_ of `T` elements, that is, a mapping from a finite downward-closed set of natural numbers (called _indices_) to `T` values. (Thinking of it as a map, a sequence is therefore something of a dual of a multiset.)

### Sequence Displays

A sequence can be formed using a _sequence display_ expression, which is a possibly empty, ordered list of expressions enclosed in square brackets. To illustrate,
```
[]
[3, 1, 4, 1, 5, 9, 3]
[4+2, 1+5, a*b]
```
are three examples of sequence displays. There is no sequence comprehension expression.
### Sequence Relational Operators

In addition to equality and disequality, sequence types support the following relational operations:

+-----------------+------------------------------------+
| operator        | description                        |
+-----------------+------------------------------------+
| [<]{.monospace} | proper prefix                      |
| `<=`            | prefix                             |
+-----------------+------------------------------------+

Like the arithmetic relational operators, these operators are chaining. Note the absence of `>` and `>=`.

### Sequence Concatenation

Sequences support the following binary operator:

+---------------+------------------------------------+
| operator      | description                        |
+---------------+------------------------------------+
| `+`           | concatenation                      |
+---------------+------------------------------------+

Operator `+` is associative, like the arithmetic operator with the same name.

### Other Sequence Expressions

In addition, for any sequence `s` of type `seq<T>`, expression `e` of type `T`, integer-based numeric `i` satisfying `0 <= i < |s|`, and integer-based numerics `lo` and `hi` satisfying `0 <= lo <= hi <= |s|`, sequences support the following operations:

+---------------------+----------------------------------------+
| expression          | description                            |
+---------------------+----------------------------------------+
| [\|s\|]{.monospace} | sequence length                        |
| `s[i]`              | sequence selection                     |
| `s[i := e]`         | sequence update                        |
| `e in s`            | sequence membership                    |
| `e !in s`           | sequence non-membership                |
| `s[lo..hi]`         | subsequence                            |
| `s[lo..]`           | drop                                   |
| `s[..hi]`           | take                                   |
| `s[$_slices_$]`     | slice                                  |
| `multiset(s)`       | sequence conversion to a `multiset<T>` |
+---------------------+----------------------------------------+

Expression `s[i := e]` returns a sequence like `s`, except that the element at index `i` is `e`. The expression `e in s` says there exists an index `i` such that `s[i] == e`. It is allowed in non-ghost contexts only if the element type `T` is equality supporting.
The expression `e !in s` is a syntactic shorthand for `!(e in s)`. Expression `s[lo..hi]` yields a sequence formed by taking the first `hi` elements and then dropping the first `lo` elements. The resulting sequence thus has length `hi - lo`. Note that `s[0..|s|]` equals `s`. If the upper bound is omitted, it defaults to `|s|`, so `s[lo..]` yields the sequence formed by dropping the first `lo` elements of `s`. If the lower bound is omitted, it defaults to `0`, so `s[..hi]` yields the sequence formed by taking the first `hi` elements of `s`. In the sequence slice operation, `$_slices_$` is a nonempty list of length designators separated and optionally terminated by a colon, and there is at least one colon. Each length designator is a non-negative integer-based numeric, whose sum is no greater than `|s|`. If there are _k_ colons, the operation produces _k + 1_ consecutive subsequences from `s`, each of the length indicated by the corresponding length designator, and returns these as a sequence of sequences.[^fn-slice-into-tuple] If `$_slices_$` is terminated by a colon, then the length of the last slice extends until the end of `s`, that is, its length is `|s|` minus the sum of the given length designators. For example, the following equalities hold, for any sequence `s` of length at least `10`: ``` var t := [3.14, 2.7, 1.41, 1985.44, 100.0, 37.2][1:0:3]; assert |t| == 3 && t[0] == [3.14] && t[1] == []; assert t[2] == [2.7, 1.41, 1985.44]; var u := [true, false, false, true][1:1:]; assert |u| == 3 && u[0][0] && !u[1][0] && u[2] == [false, true]; assert s[10:][0] == s[..10]; assert s[10:][1] == s[10..]; ``` [^fn-slice-into-tuple]: Now that Dafny supports built-in tuples, the plan is to change the sequence slice operation to return not a sequence of subsequences, but a tuple of subsequences. The operation `multiset(s)` yields the multiset of elements of sequence `s`. It is allowed in non-ghost contexts only if the element type `T` is equality supporting. 
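To complement the slice example above, here is a minimal sketch of the remaining sequence operations on one concrete sequence. It is an illustration added to this section rather than part of the original manual; the method name `SequenceDemo` and the chosen values are arbitrary. Each assertion is intended to be discharged by the verifier, and the comments relate it to the description above.

```dafny
method SequenceDemo()
{
  var s := [3, 1, 4, 1, 5, 9];
  assert |s| == 6;                         // sequence length
  assert s[2] == 4;                        // selection, indices start at 0
  assert s[2 := 7] == [3, 1, 7, 1, 5, 9];  // update yields a new sequence; s itself is unchanged
  assert s[1..4] == [1, 4, 1];             // take the first 4, then drop the first 1: length 4 - 1
  assert s[..2] == [3, 1];                 // take: lower bound defaults to 0
  assert s[4..] == [5, 9];                 // drop: upper bound defaults to |s|
  assert s[..2] + s[2..] == s;             // taking and dropping at the same index splits s
  assert s[4] == 5;                        // exhibits the index that witnesses the next assertion
  assert 5 in s;                           // membership: some index i has s[i] == 5
}
```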
### Strings

````
StringType_ = "string"
````

A special case of a sequence type is `seq<char>`, for which Dafny provides a synonym: `string`. Strings are like other sequences, but provide additional syntax for sequence display expressions, namely _string literals_. There are two forms of the syntax for string literals: the _standard form_ and the _verbatim form_.

String literals of the standard form are enclosed in double quotes, as in `"Dafny"`. To include a double quote in such a string literal, it is necessary to use an escape sequence. Escape sequences can also be used to include other characters. The supported escape sequences are the same as those for character literals, see Section [#sec-characters]. For example, the Dafny expression `"say \"yes\""` represents the string `'say "yes"'`. The escape sequence for a single quote is redundant, because [\"\'\"]{.monospace} and [\"\\\'\"]{.monospace} denote the same string---both forms are provided in order to support the same escape sequences as for character literals.

String literals of the verbatim form are bracketed by [@\"]{.monospace} and [\"]{.monospace}, as in `@"Dafny"`. To include a double quote in such a string literal, it is necessary to use the escape sequence [\"\"]{.monospace}, that is, to write the character twice. In the verbatim form, there are no other escape sequences. Even characters like newline can be written inside the string literal (hence spanning more than one line in the program text).

For example, the following three expressions denote the same string:
```
"C:\\tmp.txt"
@"C:\tmp.txt"
['C', ':', '\\', 't', 'm', 'p', '.', 't', 'x', 't']
```

Since strings are sequences, the relational operators [<]{.monospace} and `<=` are defined on them. Note, however, that these operators still denote proper prefix and prefix, respectively, not some kind of alphabetic comparison as might be desirable, for example, when sorting strings.
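The following short program is a sketch added for illustration (it is not from the original manual, and the string values are arbitrary). It exercises the points made above: a string is a sequence of characters, the two literal forms denote the same value, and the relational operators test for prefixes rather than alphabetical order.

```dafny
method Main()
{
  var s := "Dafny";
  print |s|, "\n";                 // 5: the length of the underlying character sequence
  print s[0] == 'D', "\n";         // true: indexing a string yields a char
  print "Daf" <= s, "\n";          // true: "Daf" is a prefix of "Dafny"
  print "apple" < "banana", "\n";  // false: "apple" is not a proper prefix of "banana"
  print "C:\\tmp.txt" == @"C:\tmp.txt", "\n";  // true: standard and verbatim forms of one string
  print |"say \"yes\""|, "\n";     // 9: each escaped double quote is a single character
}
```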
## Finite and Infinite Maps

````
FiniteMapType_ = "map" [ GenericInstantiation ]
InfiniteMapType_ = "imap" [ GenericInstantiation ]
````

For any types `T` and `U`, a value of type `map<T,U>` denotes a _(finite) map_ from `T` to `U`. In other words, it is a look-up table indexed by `T`. The _domain_ of the map is a finite set of `T` values that have associated `U` values. Since the keys in the domain are compared using equality in the type `T`, type `map<T,U>` can be used in a non-ghost context only if `T` is equality supporting.

Similarly, for any types `T` and `U`, a value of type `imap<T,U>` denotes a _(possibly) infinite map_. In most regards, `imap<T,U>` is like `map<T,U>`, but a map of type `imap<T,U>` is allowed to have an infinite domain.

A map can be formed using a _map display_ expression (see ``MapDisplayExpr``), which is a possibly empty, ordered list of _maplets_, each maplet having the form `t := u` where `t` is an expression of type `T` and `u` is an expression of type `U`, enclosed in square brackets after the keyword `map`. To illustrate,
```
map[]
map[20 := true, 3 := false, 20 := false]
map[a+b := c+d]
```
are three examples of map displays. By using the keyword `imap` instead of `map`, the map produced will be of type `imap<T,U>` instead of `map<T,U>`. Note that an infinite map (`imap`) is allowed to have a finite domain, whereas a finite map (`map`) is not allowed to have an infinite domain. If the same key occurs more than once, only the last occurrence appears in the resulting map.[^fn-map-display] There is also a _map comprehension expression_, explained in section [#sec-map-comprehension-expression].

[^fn-map-display]: This is likely to change in the future to disallow multiple occurrences of the same key.
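The effect of a repeated key in a map display can be made concrete with a small sketch. This example is an addition for illustration only; the method name `MapDisplayDemo` is hypothetical, and it relies on the duplicate-key behavior described above, which the footnote warns may change. It uses domain membership (`in`, `!in`) and map selection (`m[d]`), the standard operations on maps.

```dafny
method MapDisplayDemo()
{
  var m := map[20 := true, 3 := false, 20 := false];
  assert 20 in m && 3 in m;  // the domain consists of the keys 3 and 20
  assert m[20] == false;     // only the last maplet for key 20 appears in the map
  assert m[3] == false;
  assert 7 !in m;            // no maplet mentions 7, so it is outside the domain
}
```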
For any map `fm` of type `map<T,U>`, any map `m` of type `map<T,U>` or `imap<T,U>`, any expression `t` of type `T`, any expression `u` of type `U`, and any `d` in the domain of `m` (that is, satisfying `d in m`), maps support the following operations:

+----------------------+------------------------------------+
| expression           | description                        |
+----------------------+------------------------------------+
| [\|fm\|]{.monospace} | map cardinality                    |
| `m[d]`               | map selection                      |
| `m[t := u]`          | map update                         |
| `t in m`             | map domain membership              |
| `t !in m`            | map domain non-membership          |
+----------------------+------------------------------------+

`|fm|` denotes the number of mappings in `fm`, that is, the cardinality of the domain of `fm`. Note that the cardinality operator is not supported for infinite maps. Expression `m[d]` returns the `U` value that `m` associates with `d`. Expression `m[t := u]` is a map like `m`, except that the element at key `t` is `u`. The expression `t in m` says `t` is in the domain of `m` and `t !in m` is a syntactic shorthand for `!(t in m)`.[^fn-map-membership]

[^fn-map-membership]: This is likely to change in the future as follows: The `in` and `!in` operations will no longer be supported on maps. Instead, for any map `m`, `m.Domain` will return its domain as a set and `m.Range` will return, also as a set, the image of `m` under its domain.

Here is a small example, where a map `cache` of type `map<int,real>` is used to cache computed values of Joule-Thomson coefficients for some fixed gas at a given temperature:
```
if K in cache {  // check if temperature is in domain of cache
  coeff := cache[K];  // read result in cache
} else {
  coeff := ComputeJouleThomsonCoefficient(K);  // do expensive computation
  cache := cache[K := coeff];  // update the cache
}
```

# Types that stand for other types

````
SynonymTypeDecl =
  ( SynonymTypeDefinition_ | OpaqueTypeDefinition_ ) [ ";" ]
````

It is sometimes useful to know a type by several names or to treat a type abstractly.
Synonym and opaque types serve this purpose.

## Type synonyms

````
SynonymTypeDefinition_ =
  "type" { Attribute } SynonymTypeName [ GenericParameters ] "=" Type
````

A _type synonym_ declaration:
```
type Y<T> = G
```
declares `Y<T>` to be a synonym for the type `G`. Here, `T` is a nonempty list of type parameters (each of which is optionally designated with the suffix "`(==)`"), which can be used as free type variables in `G`. If the synonym has no type parameters, the "`<T>`" is dropped. In all cases, a type synonym is just a synonym. That is, there is never a difference, other than possibly in error messages produced, between `Y<T>` and `G`.

For example, the names of the following type synonyms may improve the readability of a program:
```
type Replacements<T> = map<T,T>
type Vertex = int
```

As already described in Section [#sec-strings], `string` is a built-in type synonym for `seq<char>`, as if it had been declared as follows:
```
type string = seq<char>
```

## Opaque types

````
OpaqueTypeDefinition_ = "type" { Attribute } SynonymTypeName
  [ "(" "==" ")" ] [ GenericParameters ]
````

A special case of a type synonym is one that is underspecified. Such a type is declared simply by:
```
type Y<T>
```
It is known as an _opaque type_. Its definition can be revealed in a refining module. To indicate that `Y` designates an equality-supporting type, "`(==)`" can be written immediately following the name "`Y`".

For example, the declarations
```
type T
function F(t: T): T
```
can be used to model an uninterpreted function `F` on some arbitrary type `T`. As another example,
```
type Monad<T>
```
can be used abstractly to represent an arbitrary parameterized monad.

# Well-founded Functions and Extreme Predicates

This section is a tutorial on well-founded functions and extreme predicates. We place it here in preparation for Section [#sec-class-types] where function and predicate definitions are described.

Recursive functions are a core part of computer science and mathematics.
Roughly speaking, when the definition of such a function spells out a terminating computation from given arguments, we may refer to it as a _well-founded function_. For example, the common factorial and Fibonacci functions are well-founded functions. There are also other ways to define functions. An important case regards the definition of a boolean function as an extreme solution (that is, a least or greatest solution) to some equation. For computer scientists with interests in logic or programming languages, these _extreme predicates_ are important because they describe the judgments that can be justified by a given set of inference rules (see, e.g., [@CamilleriMelham:InductiveRelations; @Winskel:FormalSemantics; @LeroyGrall:CoinductiveBigStep; @Pierce:SoftwareFoundations; @NipkowKlein:ConcreteSemantics]). To benefit from machine-assisted reasoning, it is necessary not just to understand extreme predicates but also to have techniques for proving theorems about them. A foundation for this reasoning was developed by Paulin-Mohring [@PaulinMohring:InductiveCoq] and is the basis of the constructive logic supported by Coq [@Coq:book] as well as other proof assistants [@BoveDybjerNorell:BriefAgda; @SwamyEtAl:Fstar2011]. Essentially, the idea is to represent the knowledge that an extreme predicate holds by the proof term by which this knowledge was derived. For a predicate defined as the least solution, such proof terms are values of an inductive datatype (that is, finite proof trees), and for the greatest solution, a coinductive datatype (that is, possibly infinite proof trees). This means that one can use induction and coinduction when reasoning about these proof trees. Therefore, these extreme predicates are known as, respectively, _inductive predicates_ and _coinductive predicates_ (or, _co-predicates_ for short). Support for extreme predicates is also available in the proof assistants Isabelle [@Paulson:CADE1994] and HOL [@Harrison:InductiveDefs]. 
Dafny supports both well-founded functions and extreme predicates. This section is a tutorial that describes the difference in general terms, and then describes novel syntactic support in Dafny for defining and proving lemmas with extreme predicates. Although Dafny's verifier has at its core a first-order SMT solver, Dafny's logical encoding makes it possible to reason about fixpoints in an automated way.

The encoding for coinductive predicates in Dafny was described previously [@LeinoMoskal:Coinduction] and is here described in Section [#sec-co-inductive-datatypes].

## Function Definitions

To define a function $f \colon X \to Y$ in terms of itself, one can write an equation like

~ Equation {#eq-general}
  f \Equal \F(f)
~

where $\mathcal{F}$ is a non-recursive function of type $(X \to Y) \to X \to Y$. Because it takes a function as an argument, $\mathcal{F}$ is referred to as a _functor_ (or _functional_, but not to be confused with the category-theory notion of a functor). Throughout, I will assume that $\F(f)$ by itself is well defined, for example that it does not divide by zero. I will also assume that $f$ occurs only in fully applied calls in $\F(f)$; eta expansion can be applied to ensure this. If $f$ is a boolean function, that is, if $Y$ is the type of booleans, then I call $f$ a _predicate_.

For example, the common Fibonacci function over the natural numbers can be defined by the equation

~ Equation
  \fib \Equal \lambda n \bullet\; \ite{n < 2}{n}{\fib(n-2) + \fib(n-1)}
~

With the understanding that the argument $n$ is universally quantified, we can write this equation equivalently as

~ Equation {#eq-fib}
  \fib(n) \Equal \ite{n < 2}{n}{\fib(n-2) + \fib(n-1)}
~

The fact that the function being defined occurs on both sides of the equation causes concern that we might not be defining the function properly, leading to a logical inconsistency. In general, there could be many solutions to an equation like [#eq-general] or there could be none.
Let's consider two ways to make sure we're defining the function uniquely. ### Well-founded Functions A standard way to ensure that equation [#eq-general] has a unique solution in $f$ is to make sure the recursion is well-founded, which roughly means that the recursion terminates. This is done by introducing any well-founded relation $\Less$ on the domain of $f$ and making sure that the argument to each recursive call goes down in this ordering. More precisely, if we formulate [#eq-general] as ~ Equation f(x) \Equal \F'(f) ~ then we want to check $E \Less x$ for each call $f(E)$ in $\F'(f)$. When a function definition satisfies this _decrement condition_, then the function is said to be _well-founded_. For example, to check the decrement condition for $\fib$ in [#eq-fib], we can pick $\Less$ to be the arithmetic less-than relation on natural numbers and check the following, for any $n$: ~ Equation 2 \leq n \;\;\Imp\;\; n-2 \Less n \;\And\; n-1 \Less n ~ Note that we are entitled to using the antecedent $2 \leq n$, because that is the condition under which the else branch in [#eq-fib] is evaluated. A well-founded function is often thought of as "terminating" in the sense that the recursive _depth_ in evaluating $f$ on any given argument is finite. That is, there are no infinite descending chains of recursive calls. However, the evaluation of $f$ on a given argument may fail to terminate, because its _width_ may be infinite. For example, let $P$ be some predicate defined on the ordinals and let $\PDownward$ be a predicate on the ordinals defined by the following equation: ~ Equation \PDownward(o) \Equal P(o) \And \forall p \bullet\; p \Less o \Imp \PDownward(p) ~ With $\Less$ as the usual ordering on ordinals, this equation satisfies the decrement condition, but evaluating $\PDownward(\omega)$ would require evaluating $\PDownward(n)$ for every natural number $n$. 
However, what we are concerned about here is to avoid mathematical inconsistencies, and that is indeed a consequence of the decrement condition.

#### Example with Well-founded Functions {#sec-fib-example}

So that we can later see how inductive proofs are done in Dafny, let's prove that for any $n$, $\fib(n)$ is even iff $n$ is a multiple of $3$. We split our task into two cases. If $n < 2$, then the property follows directly from the definition of $\fib$. Otherwise, note that exactly one of the three numbers $n-2$, $n-1$, and $n$ is a multiple of 3. If $n$ is the multiple of 3, then by invoking the induction hypothesis on $n-2$ and $n-1$, we obtain that $\fib(n-2) + \fib(n-1)$ is the sum of two odd numbers, which is even. If $n-2$ or $n-1$ is a multiple of 3, then by invoking the induction hypothesis on $n-2$ and $n-1$, we obtain that $\fib(n-2) + \fib(n-1)$ is the sum of an even number and an odd number, which is odd. In this proof, we invoked the induction hypothesis on $n-2$ and on $n-1$. This is allowed, because both are smaller than $n$, and hence the invocations go down in the well-founded ordering on natural numbers.

### Extreme Solutions

We don't need to exclude the possibility of equation [#eq-general] having multiple solutions---instead, we can just be clear about which one of them we want. Let's explore this, after a smidgen of lattice theory.

For any complete lattice $(Y,\leq)$ and any set $X$, we can by _pointwise extension_ define a complete lattice $(X \to Y, \FBelow)$, where for any $f,g \colon X \to Y$,

~ Equation
f \FBelow g \Equiv \forall x \bullet\; f(x) \leq g(x)
~

In particular, if $Y$ is the set of booleans ordered by implication ($\false \leq \true$), then the set of predicates over any domain $X$ forms a complete lattice. Tarski's Theorem [@Tarski:theorem] tells us that any monotonic function over a complete lattice has a least and a greatest fixpoint.
In particular, this means that $\F$ has a least fixpoint and a greatest fixpoint, provided $\F$ is monotonic. Speaking about the _set of solutions_ in $f$ to [#eq-general] is the same as speaking about the _set of fixpoints_ of functor $\F$. In particular, the least and greatest solutions to [#eq-general] are the same as the least and greatest fixpoints of $\F$. In casual speak, it happens that we say "fixpoint of [#eq-general]", or more grotesquely, "fixpoint of $f$" when we really mean "fixpoint of $\F$". In conclusion of our little excursion into lattice theory, we have that, under the proviso of $\F$ being monotonic, the set of solutions in $f$ to [#eq-general] is nonempty, and among these solutions, there is in the $\FBelow$ ordering a least solution (that is, a function that returns $\false$ more often than any other) and a greatest solution (that is, a function that returns $\true$ more often than any other). When discussing extreme solutions, I will now restrict my attention to boolean functions (that is, with $Y$ being the type of booleans). Functor $\F$ is monotonic if the calls to $f$ in $\F'(f)$ are in _positive positions_ (that is, under an even number of negations). Indeed, from now on, I will restrict my attention to such monotonic functors $\F$. Let me introduce a running example. Consider the following equation, where $x$ ranges over the integers: ~ Equation {#eq-EvenNat} g(x) \Equal (x = 0 \Or g(x-2)) ~ This equation has four solutions in $g$. With $w$ ranging over the integers, they are: ~ Equation \begin{array}{r@{}l} g(x) \Equiv{}& x \in \{w \;|\; 0 \leq w \And w\textrm{ even}\} \\ g(x) \Equiv{}& x \in \{w \;|\; w\textrm{ even}\} \\ g(x) \Equiv{}& x \in \{w \;|\; (0 \leq w \And w\textrm{ even}) \Or w\textrm{ odd}\} \\ g(x) \Equiv{}& x \in \{w \;|\; \true\} \end{array} ~ The first of these is the least solution and the last is the greatest solution. 
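
As a quick sanity check (a worked sketch added for illustration, not part of the original argument), one can confirm by a case split on $x$ that the third set listed above really is a solution of [#eq-EvenNat]. With $g(x) \Equiv (0 \leq x \And x\textrm{ even}) \Or x\textrm{ odd}$, both sides of the equation agree in every case:

~ Equation
\begin{array}{lll}
\textrm{case} & g(x) & x = 0 \Or g(x-2) \\
x \textrm{ odd} & \true & \true \quad (x-2 \textrm{ is odd}) \\
x = 0 & \true & \true \quad (\textrm{first disjunct}) \\
x \textrm{ even},\; 2 \leq x & \true & \true \quad (x-2 \textrm{ is even and } 0 \leq x-2) \\
x \textrm{ even},\; x < 0 & \false & \false \quad (x \neq 0 \textrm{ and } x-2 \textrm{ is even and negative})
\end{array}
~

The same style of reasoning suggests why there are exactly four solutions. For $x \neq 0$ the equation forces $g(x) = g(x-2)$, so any solution is constant on the odd integers and constant on the negative even integers. The disjunct $x = 0$ forces $g(0)$, and hence $g$ on all nonnegative even integers, to be $\true$. The two remaining independent boolean choices yield the four solutions.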
In the literature, the definition of an extreme predicate is often given as a set of _inference rules_. To designate the least solution, a single line separating the antecedent (on top) from conclusion (on bottom) is used: ~ Equation {#g-ind-rule} \frac{}{g(0)} \qquad\qquad \frac{g(x-2)}{g(x)} ~ Through repeated applications of such rules, one can show that the predicate holds for a particular value. For example, the _derivation_, or _proof tree_, to the left in Figure [#fig-proof-trees] shows that $g(6)$ holds. (In this simple example, the derivation is a rather degenerate proof "tree".) The use of these inference rules gives rise to a least solution, because proof trees are accepted only if they are _finite_. ~ Begin Figure { #fig-proof-trees caption="Left: a finite proof tree that uses the rules of [#g-ind-rule] to establish $g(6)$. Right: an infinite proof tree that uses the rules of [#g-coind-rule] to establish $g(1)$." } ~ Begin Columns ~~ Column { vertical-align=bottom } ~ Math \dfrac{ \dfrac{ \dfrac{ \dfrac{}{g(0)\xstrut} }{g(2)\xstrut} }{g(4)\xstrut} }{g(6)\xupstrut} ~ ~~ ~~ Column { width=5em } ~~ ~~ Column { vertical-align=bottom } ~ Math \Dfrac{ \Dfrac{ \Dfrac{ \Dfrac{ {}_{\vdots } }{{g(-5)}} }{{g(-3)}} }{{g(-1)}} }{g(1)} ~ ~~ ~ End Columns ~ End Figure When inference rules are to designate the greatest solution, a double line is used: ~ Equation {#g-coind-rule} \Dfrac{}{g(0)} \qquad\qquad \Dfrac{g(x-2)}{g(x)} ~ In this case, proof trees are allowed to be infinite. For example, the (partial depiction of the) infinite proof tree on the right in Figure [#fig-proof-trees] shows that $g(1)$ holds. Note that derivations may not be unique. For example, in the case of the greatest solution for $g$, there are two proof trees that establish $g(0)$: one is the finite proof tree that uses the left-hand rule of [#g-coind-rule] once, the other is the infinite proof tree that keeps on using the right-hand rule of [#g-coind-rule]. 
### Working with Extreme Predicates In general, one cannot evaluate whether or not an extreme predicate holds for some input, because doing so may take an infinite number of steps. For example, following the recursive calls in the definition [#eq-EvenNat] to try to evaluate $g(7)$ would never terminate. However, there are useful ways to establish that an extreme predicate holds and there are ways to make use of one once it has been established. For any $\F$ as in [#eq-general], I define two infinite series of well-founded functions, $\iter{f}_k$ and $\Iter{f}_k$ where $k$ ranges over the natural numbers: ~ Equation {#eq-least-approx} \iter{f}_k(x) \Equal \left\{ \begin{array}{ll} \false & \textrm{if } k = 0 \\ \F(\iter{f}_{k-1})(x) & \textrm{if } k > 0 \end{array} \right. ~ ~ Equation {#eq-greatest-approx} \Iter{f}_k(x) \Equal \left\{ \begin{array}{ll} \true & \textrm{if } k = 0 \\ \F(\Iter{f}_{k-1})(x) & \textrm{if } k > 0 \end{array} \right. ~ These functions are called the _iterates_ of $f$, and I will also refer to them as the _prefix predicates_ of $f$ (or the _prefix predicate_ of $f$, if we think of $k$ as being a parameter). Alternatively, we can define $\iter{f}_k$ and $\Iter{f}_k$ without mentioning $x$: Let $\bot$ denote the function that always returns $\false$, let $\top$ denote the function that always returns $\true$, and let a superscript on $\F$ denote exponentiation (for example, $\F^0(f) = f$ and $\F^2(f) = \F(\F(f))$). Then, [#eq-least-approx] and [#eq-greatest-approx] can be stated equivalently as $\iter{f}_k = \F^k(\bot)$ and $\Iter{f}_k = \F^k(\top)$. For any solution $f$ to equation [#eq-general], we have, for any $k$ and $\ell$ such that $k \leq \ell$: ~ Equation {#eq-prefix-postfix} \iter{f}_k \quad\FBelow\quad \iter{f}_\ell \quad\FBelow\quad f \quad\FBelow\quad \Iter{f}_\ell \quad\FBelow\quad \Iter{f}_k ~ In other words, every $\iter{f}_k$ is a _pre-fixpoint_ of $f$ and every $\Iter{f}_k$ is a _post-fixpoint_ of $f$. 
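
To make the iterates concrete, here is a worked sketch for the functor of the running example [#eq-EvenNat]. The closed forms for general $k$ are derived here for illustration; they follow by a routine induction on $k$ from [#eq-least-approx] and [#eq-greatest-approx].

~ Equation
\begin{array}{r@{}l}
\iter{g}_0(x) \Equiv{}& \false \\
\iter{g}_1(x) \Equiv{}& x = 0 \\
\iter{g}_2(x) \Equiv{}& x = 0 \Or x = 2 \\
\iter{g}_k(x) \Equiv{}& x\textrm{ even} \And 0 \leq x < 2k \\
\Iter{g}_k(x) \Equiv{}& \true
\end{array}
~

The last row holds because $\Iter{g}_0(x)$ is $\true$ and each further step computes $x = 0 \Or \true$. The rows are consistent with [#eq-prefix-postfix]: the sets described by $\iter{g}_k$ grow with $k$ and are contained in every solution of [#eq-EvenNat], while every $\Iter{g}_k$ contains every solution.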
Next, I define two functions, $f\least$ and $f\greatest$, in terms of the prefix predicates: ~ Equation {#eq-least-is-exists} f\least(x) \Equal \exists k \bullet\; \iter{f}_k(x) ~ ~ Equation {#eq-greatest-is-forall} f\greatest(x) \Equal \forall k \bullet\; \Iter{f}_k(x) ~ By [#eq-prefix-postfix], we also have that $f\least$ is a pre-fixpoint of $\F$ and $f\greatest$ is a post-fixpoint of $\F$. The marvelous thing is that, if $\F$ is _continuous_, then $f\least$ and $f\greatest$ are the least and greatest fixpoints of $\F$. These equations let us do proofs by induction when dealing with extreme predicates. I will explain in Section [#sec-friendliness] how to check for continuity. Let's consider two examples, both involving function $g$ in [#eq-EvenNat]. As it turns out, $g$'s defining functor is continuous, and therefore I will write $g\least$ and $g\greatest$ to denote the least and greatest solutions for $g$ in [#eq-EvenNat]. #### Example with Least Solution {#sec-example-least-solution} The main technique for establishing that $g\least(x)$ holds for some $x$, that is, proving something of the form $Q \Imp g\least(x)$, is to construct a proof tree like the one for $g(6)$ in Figure [#fig-proof-trees]. For a proof in this direction, since we're just applying the defining equation, the fact that we're using a least solution for $g$ never plays a role (as long as we limit ourselves to finite derivations). The technique for going in the other direction, proving something _from_ an established $g\least$ property, that is, showing something of the form $g\least(x) \Imp R$, typically uses induction on the structure of the proof tree. When the antecedent of our proof obligation includes a predicate term $g\least(x)$, it is sound to imagine that we have been given a proof tree for $g\least(x)$. Such a proof tree would be a data structure---to be more precise, a term in an _inductive datatype_. 
For this reason, least solutions like $g\least$ have been given the name _inductive predicate_.

Let's prove $g\least(x) \Imp 0 \leq x \And x \textrm{ even}$. We split our task into two cases, corresponding to which of the two proof rules in [#g-ind-rule] was the last one applied to establish $g\least(x)$. If it was the left-hand rule, then $x=0$, which makes it easy to establish the conclusion of our proof goal. If it was the right-hand rule, then we unfold the proof tree one level and obtain $g\least(x-2)$. Since the proof tree for $g\least(x-2)$ is smaller than where we started, we invoke the _induction hypothesis_ and obtain $0 \leq (x-2) \And (x-2) \textrm{ even}$, from which it is easy to establish the conclusion of our proof goal.

Here's how we do the proof formally using [#eq-least-is-exists]. We massage the general form of our proof goal:

|~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
|   | $f\least(x) \Imp R$                                                        |
| = | { [#eq-least-is-exists] }                                                 |
|   | $(\exists k \bullet\; \iter{f}_k(x)) \Imp R$                              |
| = | { distribute $\Imp$ over $\exists$ to the left }                          |
|   | $\forall k \bullet\; (\iter{f}_k(x) \Imp R)$                              |

The last line can be proved by induction over $k$. So, in our case, we prove $\iter{g}_k(x) \Imp 0 \leq x \And x \textrm{ even}$ for every $k$. If $k=0$, then $\iter{g}_k(x)$ is $\false$, so our goal holds trivially. If $k > 0$, then $\iter{g}_k(x) = (x = 0 \Or \iter{g}_{k-1}(x-2))$. Our goal holds easily for the first disjunct ($x=0$). For the other disjunct, we apply the induction hypothesis (on the smaller $k-1$ and with $x-2$) and obtain $0 \leq (x-2) \And (x-2) \textrm{ even}$, from which our proof goal follows.

#### Example with Greatest Solution {#sec-example-greatest-solution}

We can think of a given predicate $g\greatest(x)$ as being represented by a proof tree---in this case a term in a _coinductive datatype_, since the proof may be infinite.
For this reason, greatest solutions like $g\greatest$ have been given the name _coinductive predicate_, or _co-predicate_ for short. The main technique for proving something from a given proof tree, that is, to prove something of the form $g\greatest(x) \Imp R$, is to destruct the proof. Since this is just unfolding the defining equation, the fact that we're using a greatest solution for $g$ never plays a role (as long as we limit ourselves to a finite number of unfoldings). To go in the other direction, to establish a predicate defined as a greatest solution, like $Q \Imp g\greatest(x)$, we may need an infinite number of steps. For this purpose, we can use induction's dual, _coinduction_. Were it not for one little detail, coinduction is as simple as continuations in programming: the next part of the proof obligation is delegated to the _coinduction hypothesis_. The little detail is making sure that it is the "next" part we're passing on for the continuation, not the same part. This detail is called _productivity_ and corresponds to the requirement in induction of making sure we're going down a well-founded relation when applying the induction hypothesis. There are many sources with more information, see for example the classic account by Jacobs and Rutten [@JacobsRutten:IntroductionCoalgebra] or a new attempt by Kozen and Silva that aims to emphasize the simplicity, not the mystery, of coinduction [@KozenSilva:Coinduction]. Let's prove $\true \Imp g\greatest(x)$. The intuitive coinductive proof goes like this: According to the right-hand rule of [#g-coind-rule], $g\greatest(x)$ follows if we establish $g\greatest(x-2)$, and that's easy to do by invoking the coinduction hypothesis. The "little detail", productivity, is satisfied in this proof because we applied a rule in [#g-coind-rule] before invoking the coinduction hypothesis. 
For anyone who may have felt that the intuitive proof felt too easy, here is a formal proof using [#eq-greatest-is-forall], which relies only on induction. We massage the general form of our proof goal: |~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~| | | $Q \Imp f\greatest(x)$ | | = | { [#eq-greatest-is-forall] } | | | $Q \Imp \forall k \bullet\; \Iter{f}_k(x)$ | | = | { distribute $\Imp$ over $\forall$ to the right } | | | $\forall k \bullet\; Q \Imp \Iter{f}_k(x)$ | The last line can be proved by induction over $k$. So, in our case, we prove $\true \Imp \Iter{g}_k(x)$ for every $k$. If $k=0$, then $\Iter{g}_k(x)$ is $\true$, so our goal holds trivially. If $k > 0$, then $\Iter{g}_k(x) = (x = 0 \Or \Iter{g}_{k-1}(x-2))$. We establish the second disjunct by applying the induction hypothesis (on the smaller $k-1$ and with $x-2$). ### Other Techniques Although in this paper I consider only well-founded functions and extreme predicates, it is worth mentioning that there are additional ways of making sure that the set of solutions to [#eq-general] is nonempty. For example, if all calls to $f$ in $\F'(f)$ are _tail-recursive calls_, then (under the assumption that $Y$ is nonempty) the set of solutions is nonempty. To see this, consider an attempted evaluation of $f(x)$ that fails to determine a definite result value because of an infinite chain of calls that applies $f$ to each value of some subset $X'$ of $X$. Then, apparently, the value of $f$ for any one of the values in $X'$ is not determined by the equation, but picking any particular result values for these makes for a consistent definition. This was pointed out by Manolios and Moore [@ManoliosMoore:PartialFunctions]. Functions can be underspecified in this way in the proof assistants ACL2 [@ACL2:book] and HOL [@Krauss:PhD]. 
## Functions in Dafny In this section, I explain with examples the support in Dafny[^fn-on-da-web] for well-founded functions, extreme predicates, and proofs regarding these. [^fn-on-da-web]: Dafny is open source at [dafny.codeplex.com](http://dafny.codeplex.com) and can also be used online at [rise4fun.com/dafny](http://rise4fun.com/dafny). ### Well-founded Functions in Dafny Declarations of well-founded functions are unsurprising. For example, the Fibonacci function is declared as follows: ``` function fib(n: nat): nat { if n < 2 then n else fib(n-2) + fib(n-1) } ``` Dafny verifies that the body (given as an expression in curly braces) is well defined. This includes decrement checks for recursive (and mutually recursive) calls. Dafny predefines a well-founded relation on each type and extends it to lexicographic tuples of any (fixed) length. For example, the well-founded relation $x \Less y$ for integers is $x < y \And 0 \leq y$, the one for reals is $x \leq y - 1.0 \And 0.0 \leq y$ (this is the same ordering as for integers, if you read the integer relation as $x \leq y - 1 \And 0 \leq y$), the one for inductive datatypes is structural inclusion, and the one for coinductive datatypes is $\false$. Using a `decreases` clause, the programmer can specify the term in this predefined order. When a function definition omits a `decreases` clause, Dafny makes a simple guess. This guess (which can be inspected by hovering over the function name in the Dafny IDE) is very often correct, so users are rarely bothered to provide explicit `decreases` clauses. If a function returns `bool`, one can drop the result type `: bool` and change the keyword `function` to `predicate`. ### Proofs in Dafny Dafny has `lemma` declarations. These are really just special cases of methods: they can have pre- and postcondition specifications and their body is a code block. 
Here is the lemma we stated and proved in Section [#sec-fib-example]:

```
lemma FibProperty(n: nat)
  ensures fib(n) % 2 == 0 <==> n % 3 == 0
{
  if n < 2 { } else {
    FibProperty(n-2); FibProperty(n-1);
  }
}
```

The postcondition of this lemma (keyword `ensures`) gives the proof goal. As in any program-correctness logic (e.g., [@Hoare:AxiomaticBasis]), the postcondition must be established on every control path through the lemma's body. For `FibProperty`, I give the proof by an `if` statement, hence introducing a case split. The then branch is empty, because Dafny can prove the postcondition automatically in this case. The else branch performs two recursive calls to the lemma. These are the invocations of the induction hypothesis and they follow the usual program-correctness rules, namely: the precondition must hold at the call site, the call must terminate, and then the caller gets to assume the postcondition upon return. The "proof glue" needed to complete the proof is done automatically by Dafny.

Dafny features an aggregate statement that makes it possible to perform (possibly infinitely) many calls at once. For example, the induction hypothesis can be called at once on all values `n'` smaller than `n`:

```
forall n' | 0 <= n' < n {
  FibProperty(n');
}
```

For our purposes, this corresponds to _strong induction_. More generally, the `forall` statement has the form

```
forall k | P(k)
  ensures Q(k)
{ Statements; }
```

Logically, this statement corresponds to _universal introduction_: the body proves that `Q(k)` holds for an arbitrary `k` such that `P(k)`, and the conclusion of the `forall` statement is then $\forall k \bullet\; P(k) \Imp Q(k)$. When the body of the `forall` statement is a single call (or `calc` statement), the `ensures` clause is inferred and can be omitted, like in our `FibProperty` example.

Lemma `FibProperty` is simple enough that its whole body can be replaced by the one `forall` statement above.
In fact, Dafny goes one step further: it automatically inserts such a `forall` statement at the beginning of every lemma [@Leino:induction]. Thus, `FibProperty` can be declared and proved simply by:

``` {.para-end}
lemma FibProperty(n: nat)
  ensures fib(n) % 2 == 0 <==> n % 3 == 0
{ }
```

Going in the other direction from universal introduction is existential elimination, also known as Skolemization. Dafny has a statement for this, too: for any variable `x` and boolean expression `Q`, the _assign such that_ statement `x :| Q;` says to assign to `x` a value such that `Q` will hold. A proof obligation when using this statement is to show that there exists an `x` such that `Q` holds. For example, if the fact $\exists k \bullet\; 100 \leq \fib(k) < 200$ is known, then the statement `k :| 100 <= fib(k) < 200;` will assign to `k` some value (chosen arbitrarily) for which `fib(k)` falls in the given range.

### Extreme Predicates in Dafny {#sec-friendliness}

In a previous subsection, I explained that a `predicate` declaration introduces a well-founded predicate. The declarations for introducing extreme predicates are `inductive predicate` and `copredicate`. Here are the definitions of the least and greatest solutions of $g$ from above; let's call them `g` and `G`:

```
inductive predicate g(x: int) { x == 0 || g(x-2) }
copredicate G(x: int) { x == 0 || G(x-2) }
```

When Dafny receives either of these definitions, it automatically declares the corresponding prefix predicates. Instead of the names $\iter{g}_k$ and $\Iter{g}_k$ that I used above, Dafny names the prefix predicates `g#[k]` and `G#[k]`, respectively, that is, the name of the extreme predicate appended with `#`, and the subscript is given as an argument in square brackets. The definition of the prefix predicate derives from the body of the extreme predicate and follows the form in [#eq-least-approx] and [#eq-greatest-approx].
Using a faux-syntax for illustrative purposes, here are the prefix predicates that Dafny defines automatically from the extreme predicates `g` and `G`: ``` predicate g#[_k: nat](x: int) { _k != 0 && (x == 0 || g#[_k-1](x-2)) } predicate G#[_k: nat](x: int) { _k != 0 ==> (x == 0 || G#[_k-1](x-2)) } ``` The Dafny verifier is aware of the connection between extreme predicates and their prefix predicates, [#eq-least-is-exists] and [#eq-greatest-is-forall]. Remember that to be well defined, the defining functor of an extreme predicate must be monotonic, and for [#eq-least-is-exists] and [#eq-greatest-is-forall] to hold, the functor must be continuous. Dafny enforces the former of these by checking that recursive calls of extreme predicates are in positive positions. The continuity requirement comes down to checking that they are also in _continuous positions_: that recursive calls to inductive predicates are not inside unbounded universal quantifiers and that recursive calls to co-predicates are not inside unbounded existential quantifiers [@Milner:CCS; @LeinoMoskal:Coinduction]. ### Proofs about Extreme Predicates From what I have presented so far, we can do the formal proofs from Sections [#sec-example-least-solution] and [#sec-example-greatest-solution]. Here is the former: ``` lemma EvenNat(x: int) requires g(x) ensures 0 <= x && x % 2 == 0 { var k: nat :| g#[k](x); EvenNatAux(k, x); } lemma EvenNatAux(k: nat, x: int) requires g#[k](x) ensures 0 <= x && x % 2 == 0 { if x == 0 { } else { EvenNatAux(k-1, x-2); } } ``` Lemma `EvenNat` states the property we wish to prove. From its precondition (keyword `requires`) and [#eq-least-is-exists], we know there is some `k` that will make the condition in the assign-such-that statement true. Such a value is then assigned to `k` and passed to the auxiliary lemma, which promises to establish the proof goal. 
Given the condition `g#[k](x)`, the definition of `g#` lets us conclude `k != 0` as well as the disjunction `x == 0 || g#[k-1](x-2)`. The then branch considers the case of the first disjunct, from which the proof goal follows automatically. The else branch can then assume `g#[k-1](x-2)` and calls the induction hypothesis with those parameters. The proof glue that shows the proof goal for `x` to follow from the proof goal with `x-2` is done automatically.

Because Dafny automatically inserts the statement

```
forall k', x' | 0 <= k' < k && g#[k'](x') {
  EvenNatAux(k', x');
}
```

at the beginning of the body of `EvenNatAux`, the body can be left empty and Dafny completes the proof automatically.

Here is the Dafny program that gives the proof from Section [#sec-example-greatest-solution]:

``` {.para-end}
lemma Always(x: int)
  ensures G(x)
{ forall k: nat { AlwaysAux(k, x); } }
lemma AlwaysAux(k: nat, x: int)
  ensures G#[k](x)
{ }
```

While each of these proofs involves only basic proof rules, the setup feels a bit clumsy, even with the empty body of the auxiliary lemmas. Moreover, the proofs do not reflect the intuitive proofs I described in Sections [#sec-example-least-solution] and [#sec-example-greatest-solution]. These shortcomings are addressed in the next subsection.

### Nicer Proofs of Extreme Predicates

The proofs we just saw follow standard forms: use Skolemization to convert the inductive predicate into a prefix predicate for some `k` and then do the proof inductively over `k`; respectively, by induction over `k`, prove the prefix predicate for every `k`, then use universal introduction to convert to the coinductive predicate. With the declarations `inductive lemma` and `colemma`, Dafny offers to set up the proofs in these standard forms. What is gained is not just fewer characters in the program text, but also a possible intuitive reading of the proofs.
(Okay, to be fair, the reading is intuitive for simpler proofs; complicated proofs may or may not be intuitive.)

Somewhat analogous to the creation of prefix predicates from extreme predicates, Dafny automatically creates a _prefix lemma_ `L#` from each "extreme lemma" `L`. The pre- and postconditions of a prefix lemma are copied from those of the extreme lemma, except for the following replacements: For an inductive lemma, Dafny looks in the precondition to find calls (in positive, continuous positions) to inductive predicates `P(x)` and replaces these with `P#[_k](x)`. For a co-lemma, Dafny looks in the postcondition to find calls (in positive, continuous positions) to co-predicates `P(x)` (including equality among coinductive datatypes, which is a built-in co-predicate) and replaces these with `P#[_k](x)`. In each case, these predicates `P` are the lemma's _focal predicates_.

The body of the extreme lemma is moved to the prefix lemma, but with each recursive call `L(x)` replaced by `L#[_k-1](x)` and each occurrence of a call to a focal predicate `P(x)` replaced by `P#[_k-1](x)`. The bodies of the extreme lemmas are then replaced as shown in the previous subsection. By construction, this new body correctly leads to the extreme lemma's postcondition.

Let us see what effect these rewrites have on how one can write proofs. Here are the proofs of our running example:

```
inductive lemma EvenNat(x: int)
  requires g(x)
  ensures 0 <= x && x % 2 == 0
{ if x == 0 { } else { EvenNat(x-2); } }
colemma Always(x: int)
  ensures G(x)
{ Always(x-2); }
```

Both of these proofs follow the intuitive proofs given in Sections [#sec-example-least-solution] and [#sec-example-greatest-solution]. Note that in these simple examples, the user is never bothered with either prefix predicates or prefix lemmas---the proofs just look like "what you'd expect".
Since Dafny automatically inserts calls to the induction hypothesis at the beginning of each lemma, the bodies of the given extreme lemmas `EvenNat` and `Always` can be empty and Dafny still completes the proofs. Folks, it doesn't get any simpler than that!

# Class Types

````
ClassDecl = "class" { Attribute } ClassName [ GenericParameters ]
  ["extends" Type {"," Type} ]
  "{" { { DeclModifier } ClassMemberDecl(moduleLevelDecl: false) } "}"
````

````
ClassMemberDecl(moduleLevelDecl) =
  ( FieldDecl | FunctionDecl |
    MethodDecl(isGhost: ("ghost" was present),
               allowConstructor: !moduleLevelDecl)
  )
````

The ``ClassMemberDecl`` parameter `moduleLevelDecl` will be true if the member declaration is at the top level or directly within a module declaration. It will be false for ``ClassMemberDecl``s that are part of a class or trait declaration. If `moduleLevelDecl` is true, ``FieldDecl``s are not allowed.

A _class_ `C` is a reference type declared as follows:

```
class C<T> extends J1, ..., Jn
{
  $_members_$
}
```

where the list of type parameters `T` is optional and so is "`extends J1, ..., Jn`", which says that the class extends traits `J1` ... `Jn`. The members of a class are _fields_, _functions_, and _methods_. These are accessed or invoked by dereferencing a reference to a `C` instance.

A function or method is invoked on an _instance_ of `C`, unless the function or method is declared `static`. A function or method that is not `static` is called an _instance_ function or method.

An instance function or method takes an implicit _receiver_ parameter, namely, the instance used to access the member. In the specification and body of an instance function or method, the receiver parameter can be referred to explicitly by the keyword `this`. However, in such places, members of `this` can also be mentioned without any qualification.
To illustrate, the qualified `this.f` and the unqualified `f` refer to the same field of the same object in the following example:

```
class C {
  var f: int
  method Example() returns (b: bool)
  {
    b := f == this.f;
  }
}
```

so the method body always assigns `true` to the out-parameter `b`. There is no semantic difference between qualified and unqualified accesses to the same receiver and member.

A `C` instance is created using `new`, for example:

```
c := new C;
```

Note that `new` simply allocates a `C` object and returns a reference to it; the initial values of its fields are arbitrary values of their respective types. Therefore, it is common to invoke a method, known as an _initialization method_, immediately after creation, for example:

```
c := new C;
c.InitFromList(xs, 3);
```

When an initialization method has no out-parameters and modifies no more than `this`, then the two statements above can be combined into one:

```
c := new C.InitFromList(xs, 3);
```

Note that a class can contain several initialization methods, that these methods can be invoked at any time, not just as part of a `new`, and that `new` does not require that an initialization method be invoked at creation.

A class can declare special initializing methods called _constructor methods_. See Section [#sec-method-declarations].

## Field Declarations

````
FieldDecl = "var" { Attribute } FIdentType { "," FIdentType }
````

An ``FIdentType`` is used to declare a field. The field name is either an identifier (that is not allowed to start with a leading underscore) or some digits. Digits are used if you want to number your fields, e.g. "0", "1", etc.

````
FIdentType = ( FieldIdent | digits ) ":" Type
````

A field `x` of some type `T` is declared as:

```
var x: T
```

A field declaration declares one or more fields of the enclosing class. Each field is a named part of the state of an object of that class. A field declaration is similar to but distinct from a variable declaration statement.
Unlike for local variables and bound variables, the type is required and will not be inferred.

Unlike method and function declarations, a field declaration cannot be given at the top level. Fields can be declared in either a class or a trait. A class that inherits from multiple traits will have all the fields declared in any of its parent traits.

Fields that are declared as `ghost` can only be used in specifications, not in code that will be compiled into executable code.

Fields may not be declared static.

`protected` is not allowed for fields.

## Method Declarations

````
MethodDecl(isGhost, allowConstructor) =
  MethodKeyword { Attribute } [ MethodName ]
  ( MethodSignature(isGhost) | SignatureEllipsis_ )
  MethodSpec [ BlockStmt ]
````

The `isGhost` parameter is true iff the `ghost` keyword preceded the method declaration.

If the `allowConstructor` parameter is false then the ``MethodDecl`` must not be a `constructor` declaration.

````
MethodKeyword = ("method" | "lemma" | "colemma"
                | "inductive" "lemma" | "constructor" )
````

The method keyword is used to specify special kinds of methods as explained below.

````
MethodSignature(isGhost) =
  [ GenericParameters ]
  Formals(allowGhost: !isGhost)
  [ "returns" Formals(allowGhost: !isGhost) ]
````

A method signature specifies the method generic parameters, input parameters and return parameters. The formal parameters are not allowed to have `ghost` specified if `ghost` was already specified for the method.

````
SignatureEllipsis_ = "..."
````

A ``SignatureEllipsis_`` is used when a method or function is being redeclared in a module that refines another module. In that case the signature is copied from the module that is being refined. This works because Dafny does not support method or function overloading, so the name of the class method uniquely identifies it without the signature.
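
As a rough sketch of how a signature ellipsis might look in practice, the following shows a refining module that supplies a body for a method while reusing the inherited signature. The module names `Spec` and `Impl` and the method `Bump` are hypothetical, chosen only for illustration, and the exact behavior may depend on the Dafny version in use.

```
// Hypothetical names throughout; an illustrative sketch only.
abstract module Spec {
  // Body-less method declaration inside an abstract module.
  method Bump(x: int) returns (y: int)
    ensures x < y
}

module Impl refines Spec {
  // "..." stands for the signature copied from Spec.Bump, so the
  // in-parameter x and the out-parameter y are in scope in the body.
  method Bump...
  {
    y := x + 1;
  }
}
```

Because the signature is not repeated, a later change to the parameters of `Spec.Bump` would not need to be mirrored textually in `Impl`.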
````
Formals(allowGhostKeyword) =
  "(" [ GIdentType(allowGhostKeyword)
        { "," GIdentType(allowGhostKeyword) } ] ")"
````
The ``Formals`` specifies the names and types of the method input or
output parameters.

See section [#sec-method-specification] for a description of ``MethodSpec``.

A method declaration adheres to the ``MethodDecl`` grammar above.
Here is an example of a method declaration.

```
method {:att1}{:att2} M<T1, T2>(a: A, b: B, c: C) returns (x: X, y: Y, z: Z)
  requires Pre
  modifies Frame
  ensures Post
  decreases Rank
{
  Body
}
```

where `:att1` and `:att2` are attributes of the method,
`T1` and `T2` are type parameters of the method (if generic),
`a, b, c` are the method's in-parameters, `x, y, z` are the
method's out-parameters, `Pre` is a boolean expression denoting the
method's precondition, `Frame` denotes a set of objects whose fields may
be updated by the method, `Post` is a boolean expression denoting the
method's postcondition, `Rank` is the method's variant function, and
`Body` is a statement that implements the method. `Frame` can be a list
of expressions, each of which is a set of objects or a single object, the
latter standing for the singleton set consisting of that one object. The
method's frame is the union of these sets, plus the set of objects
allocated by the method body. For example, if `c` and `d` are parameters
of a class type `C`, then

```
modifies {c, d}

modifies {c} + {d}

modifies c, {d}

modifies c, d
```

all mean the same thing.

A method can be declared as ghost by preceding the declaration with the
keyword `ghost`. By default, a method has an implicit receiver parameter,
`this`. This parameter can be removed by preceding the method declaration
with the keyword `static`. A static method `M` in a class `C` can be invoked by
`C.M(…)`.

In a class, a method can be declared to be a constructor method by
replacing the keyword `method` with the keyword `constructor`.
A constructor can only be called at the time an object is allocated (see
object-creation examples below), and for a class that contains one or
more constructors, object creation must be done in conjunction with a
call to a constructor.

An ordinary method is declared with the `method` keyword.
Section [#sec-constructors] explains methods that instead use the
`constructor` keyword. Section [#sec-lemmas] discusses methods that are
declared with the `lemma` keyword. Methods declared with the `inductive`
`lemma` keywords are discussed later in the context of inductive
predicates (see [#sec-inductive-datatypes]). Methods declared with the
`colemma` keyword are discussed later in the context of co-inductive
types, in section [#sec-colemmas].

A method without a body is _abstract_. A method is allowed to be
abstract under the following circumstances:

* It contains an `{:axiom}` attribute
* It contains an `{:imported}` attribute
* It contains a `{:decl}` attribute
* It is a declaration in an abstract module.

Note that when there is no body, Dafny assumes that the *ensures*
clauses are true without proof.

### Constructors
To write structured object-oriented programs, one often relies on
objects being constructed only in certain ways. For this purpose, Dafny
provides _constructor (method)s_, which are a restricted form of
initialization methods. A constructor is declared with the keyword
`constructor` instead of `method`. When a class contains a
constructor, every call to `new` for that class must be accompanied
by a call to one of the constructors. Moreover, a constructor
cannot be called at other times, only during object creation. Other
than these restrictions, there is no semantic difference between using
ordinary initialization methods and using constructors.

The Dafny design allows the constructors to be named, which promotes
using names like `InitFromList` above. Still, many classes have just
one constructor or have a typical constructor.
Therefore, Dafny allows
one _anonymous constructor_, that is, a constructor whose name is
essentially "". For example:
```
class Item {
  constructor (x: int, y: int)
  // ...
}
```
When invoking this constructor, the "`.`" is dropped, as in:
```
m := new Item(45, 29);
```
Note that an anonymous constructor is just one way to name a
constructor; there can be other constructors as well.

### Lemmas
Sometimes there are steps of logic required to prove a program correct,
but they are too complex for Dafny to discover and use on its own. When
this happens, we can often give Dafny assistance by providing a lemma.
This is done by declaring a method with the `lemma` keyword.
Lemmas are implicitly ghost methods and the `ghost` keyword cannot
be applied to them.

For an example, see the `FibProperty` lemma in
Section [#sec-proofs-in-dafny].

See [the Dafny Lemmas tutorial](http://rise4fun.com/Dafny/tutorial/Lemmas)
for more examples and hints for using lemmas.

## Function Declarations

````
FunctionDecl =
  ( "function" [ "method" ] { Attribute }
    FunctionName
    FunctionSignatureOrEllipsis_(allowGhostKeyword: ("method" present))
  | "predicate" [ "method" ] { Attribute }
    PredicateName
    PredicateSignatureOrEllipsis_(allowGhostKeyword: ("method" present))
  | "inductive" "predicate" { Attribute }
    PredicateName
    PredicateSignatureOrEllipsis_(allowGhostKeyword: false)
  | "copredicate" { Attribute }
    CopredicateName
    PredicateSignatureOrEllipsis_(allowGhostKeyword: false)
  )
  FunctionSpec [ FunctionBody ]

FunctionSignatureOrEllipsis_(allowGhostKeyword) =
    FunctionSignature_(allowGhostKeyword) | SignatureEllipsis_
FunctionSignature_(allowGhostKeyword) =
    [ GenericParameters ] Formals(allowGhostKeyword) ":" Type

PredicateSignatureOrEllipsis_(allowGhostKeyword) =
    PredicateSignature_(allowGhostKeyword) | SignatureEllipsis_
PredicateSignature_(allowGhostKeyword) =
    [ GenericParameters ] Formals(allowGhostKeyword)

FunctionBody = "{" Expression(allowLemma: true, allowLambda: true) "}"
````
In the above productions,
`allowGhostKeyword` is true if the optional "method" keyword was specified.
This allows some of the formal parameters of a function method to be
specified as ghost.

See section [#sec-function-specification] for a description of ``FunctionSpec``.

A Dafny function is a pure mathematical function. It is allowed to
read memory that was specified in its `reads` expression but is not
allowed to have any side effects.

Here is an example function declaration:
```
function {:att1}{:att2} F<T1, T2>(a: A, b: B, c: C): T
  requires Pre
  reads Frame
  ensures Post
  decreases Rank
{
  Body
}
```

where `:att1` and `:att2` are attributes of the function, if any, `T1`
and `T2` are type parameters of the function (if generic), `a, b, c` are
the function's parameters, `T` is the type of the function's result,
`Pre` is a boolean expression denoting the function's precondition,
`Frame` denotes a set of objects whose fields the function body may
depend on, `Post` is a boolean expression denoting the function's
postcondition, `Rank` is the function's variant function, and `Body` is
an expression that defines the function return value. The precondition
allows a function to be partial, that is, the precondition says when the
function is defined (and Dafny will verify that every use of the function
meets the precondition). The postcondition is usually not needed, since
the body of the function gives the full definition. However, the
postcondition can be a convenient place to declare properties of the
function that may require an inductive proof to establish. For example:

````
function Factorial(n: int): int
  requires 0 <= n
  ensures 1 <= Factorial(n)
{
  if n == 0 then 1 else Factorial(n-1) * n
}
````

says that the result of `Factorial` is always positive, which Dafny
verifies inductively from the function body. To refer to the function's
result in the postcondition, use the function itself, as shown in the
example.

By default, a function is *ghost*, and cannot be called from non-ghost
code.
To make it non-ghost, replace the keyword `function` with the two
keywords "`function method`".

By default, a function has an implicit receiver parameter, `this`. This
parameter can be removed by preceding the function declaration with the
keyword `static`. A static function `F` in a class `C` can be invoked
by `C.F(…)`. This can give a convenient way to declare a number of helper
functions in a separate class.

As for methods, a ``SignatureEllipsis_`` is used when declaring
a function in a module refinement. For example, if module `M0` declares
function `F`, a module `M1` can be declared to refine `M0` and
`M1` can then refine `F`. The refinement function, `M1.F`, can have
a ``SignatureEllipsis_``, which means to copy the signature from
`M0.F`. A refinement function can furnish a body for a function
(if `M0.F` does not provide one). It can also add **ensures**
clauses. And if `F` is a predicate, it can add conjuncts to
a previously given body.

### Function Transparency
A function is said to be _transparent_ in a location if the
contents of the body of the function are visible at that point.
A function is said to be _opaque_ at a location if it is not
transparent. However, the ``FunctionSpec`` of a function
is always available.

A function is usually transparent up to some unrolling level (up to
1, or maybe 2 or 3). If its arguments are all literals it is
transparent all the way.

But the transparency of a function is affected by the following:

* whether the function was declared to be protected, and
* whether the function was given the `{:opaque}` attribute (as explained
in Section [#sec-opaque]).

The following table summarizes where the function is transparent.
The module referenced in the table is the module in which the
function is defined.

+------------+--------------+-------------+-------------+
| Protected? | `{:opaque}`? | Transparent | Transparent |
|            |              | Inside      | Outside     |
|            |              | Module      | Module      |
+:----------:+:------------:+:-----------:+:-----------:+
| N          | N            | Y           | Y           |
| Y          | N            | Y           | N           |
| N          | Y            | N           | N           |
+------------+--------------+-------------+-------------+

When `{:opaque}` is specified for function `g`, `g` is opaque;
however, the lemma `reveal_g` is available to give the semantics
of `g` whether in the defining module or outside.

It is currently not allowed to have both `protected` and
`{:opaque}` specified for a function.

### Predicates
A function that returns a `bool` result is called a _predicate_. As an
alternative syntax, a predicate can be declared by replacing the `function`
keyword with the `predicate` keyword and omitting a declaration of the
return type.

### Inductive Predicates and Lemmas
See section [#sec-friendliness] for descriptions
of inductive predicates and lemmas.

# Trait Types
````
TraitDecl = "trait" { Attribute } TraitName [ GenericParameters ]
  "{" { { DeclModifier } ClassMemberDecl(moduleLevelDecl: false) } "}"
````

A _trait_ is an "abstract superclass", or call it an "interface" or
"mixin". Traits are new to Dafny and are likely to evolve for a
while.

The declaration of a trait is much like that of a class:
```
trait J
{
  $_members_$
}
```
where `$_members_$` can include fields, functions, and methods, but
no constructor methods. The functions and methods are allowed to be
declared `static`.

A reference type `C` that extends a trait `J` is assignable to `J`, but
not the other way around. The members of `J` are available as members
of `C`. A member in `J` is not allowed to be redeclared in `C`,
except if the member is a non-`static` function or method without a
body in `J`. By doing so, type `C` can supply a stronger
specification and a body for the member.

`new` is not allowed to be used with traits. Therefore, there is no
object whose allocated type is a trait.
But there can of course be objects
of a class `C` that implements a trait `J`, and a reference to such a
`C` object can be used as a value of type `J`.

As an example, the following trait represents movable geometric shapes:
```
trait Shape
{
  function method Width(): real
    reads this
  method Move(dx: real, dy: real)
    modifies this
  method MoveH(dx: real)
    modifies this
  {
    Move(dx, 0.0);
  }
}
```
Members `Width` and `Move` are _abstract_ (that is, bodyless) and can
be implemented differently by different classes that extend the trait.
The implementation of method `MoveH` is given in the trait and thus
gets used by all classes that extend `Shape`. Here are two classes
that each extend `Shape`:
```
class UnitSquare extends Shape
{
  var x: real, y: real
  function method Width(): real {  // note the empty reads clause
    1.0
  }
  method Move(dx: real, dy: real)
    modifies this
  {
    x, y := x + dx, y + dy;
  }
}
class LowerRightTriangle extends Shape
{
  var xNW: real, yNW: real, xSE: real, ySE: real
  function method Width(): real
    reads this
  {
    xSE - xNW
  }
  method Move(dx: real, dy: real)
    modifies this
  {
    xNW, yNW, xSE, ySE := xNW + dx, yNW + dy, xSE + dx, ySE + dy;
  }
}
```
Note that the classes can declare additional members, that they supply
implementations for the abstract members of the trait,
that they repeat the member signatures, and that they are responsible
for providing their own member specifications that both strengthen the
corresponding specification in the trait and are satisfied by the
provided body.
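
The `Shape` members above carry only frame specifications, so the point about strengthening a trait's specification may be easier to see with a postcondition. The following is a minimal sketch under assumptions: the names `Counter`, `StepCounter`, `Bump`, and `Client` are hypothetical and do not appear elsewhere in this manual, and the syntax follows the dialect this manual describes.

```dafny
// Hypothetical example: Counter, StepCounter, Bump, and Client are illustrative names.
trait Counter {
  var count: int
  // Abstract (body-less) method: the trait promises only that count grows.
  method Bump()
    modifies this
    ensures old(count) < count
}

class StepCounter extends Counter {
  // Repeats the signature, strengthens the postcondition, and supplies a body.
  method Bump()
    modifies this
    ensures count == old(count) + 1
  {
    count := count + 1;
  }
}

method Client()
{
  var s := new StepCounter;
  var c: Counter := s;       // a StepCounter reference is assignable to Counter
  var before := c.count;
  c.Bump();                  // through c, only Counter's contract is visible
  assert before < c.count;
}
```

The class postcondition `count == old(count) + 1` implies the trait's `old(count) < count`, which is the sense in which the class specification is stronger. A caller holding only a `Counter` reference can rely on the weaker contract and nothing more.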
Finally, here is some code that creates two class instances and uses
them together as shapes:
```
var myShapes: seq<Shape>;
var A := new UnitSquare;
myShapes := [A];
var tri := new LowerRightTriangle;
// myShapes contains two Shape values, of different classes
myShapes := myShapes + [tri];
// move shape 1 to the right by the width of shape 0
myShapes[1].MoveH(myShapes[0].Width());
```

# Array Types
````
ArrayType_ = arrayToken [ GenericInstantiation ]
````

Dafny supports mutable fixed-length _array types_ of any positive
dimension. Array types are reference types.

## One-dimensional arrays

A one-dimensional array of `n` `T` elements is created as follows:
```
a := new T[n];
```
The initial values of the array elements are arbitrary values of type
`T`.
The length of an array is retrieved using the immutable `Length`
member. For example, the array allocated above satisfies:
```
a.Length == n
```

For any integer-based numeric `i` in the range `0 <= i < a.Length`,
the _array selection_ expression `a[i]` retrieves element `i` (that
is, the element preceded by `i` elements in the array). The
element stored at `i` can be changed to a value `t` using the array
update statement:
```
a[i] := t;
```

Caveat: The type of the array created by `new T[n]` is
`array<T>`. A mistake that is simple to make and that can lead to
befuddlement is to write `array<T>` instead of `T` after `new`.
For example, consider the following:
```
var a := new array<T>;
var b := new array<T>[n];
var c := new array<T>(n);  // resolution error
var d := new array(n);  // resolution error
```
The first statement allocates an array of type `array<T>`, but of
unknown length. The second allocates an array of type
`array<array<T>>` of length `n`, that is, an array that holds `n`
values of type `array<T>`. The third statement allocates an
array of type `array<T>` and then attempts to invoke an anonymous
constructor on this array, passing argument `n`.
Since `array<T>` has no constructors, let
alone an anonymous constructor, this statement gives rise to an
error. If the type-parameter list is omitted for a type that expects
type parameters, Dafny will attempt to fill these in, so as long as
the `array` type parameter can be inferred, it is okay to leave off
the "`<T>`" in the fourth statement above. However, as with the
third statement, `array` has no anonymous constructor, so an error
message is generated.

One-dimensional arrays support operations that convert a stretch of
consecutive elements into a sequence. For any array `a` of type
`array<T>`, integer-based numerics `lo` and `hi` satisfying
`0 <= lo <= hi <= a.Length`, the following operations each yield a
`seq<T>`:

+---------------------+------------------------------------+
| expression          | description                        |
+---------------------+------------------------------------+
| `a[lo..hi]`         | subarray conversion to sequence    |
| `a[lo..]`           | drop                               |
| `a[..hi]`           | take                               |
| `a[..]`             | array conversion to sequence       |
+---------------------+------------------------------------+

The expression `a[lo..hi]` takes the first `hi` elements of the array,
then drops the first `lo` elements thereof and returns what remains as
a sequence. The resulting sequence thus has length `hi - lo`.
The other operations are special instances of the first. If `lo` is
omitted, it defaults to `0` and if `hi` is omitted, it defaults to
`a.Length`.
In the last operation, both `lo` and `hi` have been omitted, thus
`a[..]` returns the sequence consisting of all the array elements of
`a`.

The subarray operations are especially useful in specifications. For
example, the loop invariant of a binary search algorithm that uses
variables `lo` and `hi` to delimit the subarray where the search `key`
may still be found can be expressed as follows:
```
key !in a[..lo] && key !in a[hi..]
```
Another use is to say that a certain range of array elements has not
been changed since the beginning of a method:
```
a[lo..hi] == old(a[lo..hi])
```
or since the beginning of a loop:
```
ghost var prevElements := a[..];
while // ...
  invariant a[lo..hi] == prevElements[lo..hi]
{
  // ...
}
```
Note that the type of `prevElements` in this example is `seq<T>`, if
`a` has type `array<T>`.

A final example of the subarray operation lies in expressing that an
array's elements are a permutation of the array's elements at the
beginning of a method, as would be done in most sorting algorithms.
Here, the subarray operation is combined with the sequence-to-multiset
conversion:
```
multiset(a[..]) == multiset(old(a[..]))
```

## Multi-dimensional arrays

An array of 2 or more dimensions is mostly like a one-dimensional
array, except that `new` takes more length arguments (one for each
dimension), and the array selection expression and the array update
statement take more indices. For example:
```
matrix := new T[m, n];
matrix[i, j], matrix[x, y] := matrix[x, y], matrix[i, j];
```
creates a 2-dimensional array whose dimensions have lengths `m` and
`n`, respectively, and then swaps the elements at `i,j` and `x,y`.
The type of `matrix` is `array2<T>`, and similarly for
higher-dimensional arrays (`array3<T>`, `array4<T>`, etc.). Note,
however, that there is no type `array0<T>`, and what could have been
`array1<T>` is actually named just `array<T>`.

The `new` operation above requires `m` and `n` to be non-negative
integer-based numerics. These lengths can be retrieved using the
immutable fields `Length0` and `Length1`. For example, the following
holds of the array created above:
```
matrix.Length0 == m && matrix.Length1 == n
```
Higher-dimensional arrays are similar (`Length0`, `Length1`,
`Length2`, ...). The array selection expression and array update
statement require that the indices are in bounds.
For example, the swap
statement above is well-formed only if:
```
0 <= i < matrix.Length0 && 0 <= j < matrix.Length1 &&
0 <= x < matrix.Length0 && 0 <= y < matrix.Length1
```

In contrast to one-dimensional arrays, there is no operation to
convert stretches of elements from a multi-dimensional array to a
sequence.

# Type object
````
ObjectType_ = "object"
````

There is a built-in trait `object` that is like a supertype of all
reference types.[^fn-object-trait] Every class automatically extends
object and so does every user-defined trait. The purpose of type `object`
is to enable a uniform treatment of _dynamic frames_. In particular, it
is useful to keep a ghost field (typically named `Repr` for
"representation") of type `set