FoCaLize















Reference Manual










1.0.0







January 2009













Authors










Thérèse Hardin, François Pessaux, Pierre Weis, Damien Doligez

About FoCaLize





FoCaLize is the result of the collective work of several researchers, listed below, who designed, defined, compiled, studied, extended, used and debugged the preceding versions. They were helped by many students who worked with them during summer internships. They would like to thank all these students and, more generally, all the people who contributed to FoCaLize.









FoCaLize contributors






Philippe Ayrault (SPI-LIP6), William Bartlett (CPR-CEDRIC), Julien Blond (SPI-LIP6), Sylvain Boulmé (SPI-LIP6), Matthieu Carlier (CPR-CEDRIC), Damien Doligez (GALLIUM-INRIA), David Delahaye (CPR-CEDRIC), Catherine Dubois (CPR-CEDRIC), Jean-Frédéric Etienne (CPR-CEDRIC), Stéphane Fechter (SPI-LIP6), Mathieu Jaume (SPI-LIP6), Lionel Habib (SPI-LIP6), Thérèse Hardin (SPI-LIP6), Charles Morisset (SPI-LIP6), Ivan Noyer (SPI-LIP6), François Pessaux (SPI-LIP6), Virgile Prevosto (SPI-LIP6), Renaud Rioboo (CPR-CEDRIC), Lien Tran (SPI-LIP6), Véronique Viguié Donzeau-Gouge (CPR-CNAM), Pierre Weis (ESTIME-INRIA)




and their institutions






SPI (Semantics, Proofs and Implementations) is a team of LIP6 (Laboratoire d'Informatique de Paris 6) of UPMC (Pierre and Marie Curie University)1.





CPR (Conception et Programmation Raisonnées) is a team of CEDRIC (Centre d'Etudes et de Recherches du CNAM) of CNAM (Conservatoire National des Arts et Métiers)2 and ENSIIE (Ecole Nationale d'Informatique pour l'Industrie et l'Entreprise)3.



ESTIME and GALLIUM are teams of INRIA Rocquencourt4.








Thanks






The Foc project was first partially supported by LIP6 (Projet Foc, LIP6 1997), then by the Ministry of Research (Action Modulogic). The Focal research team was then partially supported by the French ANR project SSURF ANR-06-SETI-016 (Safety and Security UndeR Focal). The project also benefited from strong collaborations with the EDEMOI ANR project and with the BERTIN and SAFERIVER companies.

The FoCaLize language and compiler development effort started around 2005. The architecture conception and code rewriting started from scratch in 2006, leading to the first focalizec compiler and FoCaLize system distribution in January 2009.

This manual documents the completely revised system with the new syntax and its semantics extensions.


1
UPMC-LIP6, 104 avenue du Président Kennedy, Paris 75016, France, Firstname.Lastname@lip6.fr
2
CNAM-CEDRIC, 292 rue Saint Martin, 75003, Paris, France, Firstname.Lastname@cnam.fr
3
ENSIIE-CEDRIC, 1 Square de la Résistance, 91025 Evry Cedex, France, Lastname@ensiie.fr
4
INRIA, Bat 8. Domaine de Voluceau, Rocquencourt, BP 105, F-78153 Le Chesnay, France, Firstname.Lastname@inria.fr

Introduction

Motivations

The Foc project was launched in 1998 by T. Hardin and R. Rioboo [HardinRiobooTSI04] 5 with the objective of helping all stages of development of critical software within the safety and security domains. The methods used in these domains are evolving, ad-hoc and empirical approaches being replaced by more formal ones. For example, for high levels of safety, formal models of the requirement/specification phase are more and more considered, as they allow mechanized proofs, testing or static analysis of the required properties. In the same way, high-level assurance in system security asks for the use of true formal methods along the software development process and is often required at the specification level. Thus the project aimed to elaborate an Integrated Development Environment (IDE) able to provide high-level and justified confidence to its users, while remaining easy to use by well-trained engineers.

To ease developing high-integrity systems with numerous software components, an Integrated Development Environment (IDE) should provide tools to formally express specifications, to describe design and coding, and to ensure that specification requirements are met by the corresponding code. This is not enough. First, standards for critical systems ask for pertinent documentation which has to be maintained along all the revisions during the system life cycle. Second, the conformance evaluation process of software is by nature a sceptical analysis. Thus, any proof of code correctness must be easily redone on request and traceability must be eased. Third, design and coding are difficult tasks. Research in software engineering has demonstrated the help provided by object-oriented features such as inheritance and late binding, while early research on programming languages pointed out the importance of abstraction mechanisms such as modularity to help maintain invariants. Many other points should also be considered when designing an IDE for safe and/or secure systems, to ensure conformance with high Evaluation Assurance or Safety Integrity Levels (EAL 5-7 or SIL 3-4) and to ease the evaluation process according to various standards (e.g. IEC 61508, CC, ...): handling of non-functional contents of specifications, handling of dysfunctional behaviours and vulnerabilities from the very beginning of development, fault avoidance, fault detection by validation testing, vulnerability and safety analysis.

Initial application testbed

When the Foc project was launched by Hardin and Rioboo, only one specific domain was considered: Computer Algebra. Algorithms used in this domain can be rather intricate and difficult to test, and it is not rare that computer algebra systems issue a bad result, due to semantic flaws, compiler anomalies, etc. Thus the idea was to design a language allowing one to specify the mathematics underlying these algorithms and to go step by step to different kinds of implementations according to the specificities of the problem under consideration6. The first step was to design the semantics of such a language, trying to meet several requirements: easing the expression of mathematical statements, a clear distinction between a mathematical structure (semi-ring, polynomial, ...) and its different implementations, easing the development (modularity, inheritance, parametrisation, abstraction, ...), runtime efficiency, and confidence in the whole development (mechanised proofs, ...). After an initial phase of conceptual design, the Foc semantics was submitted to a double test. On one hand, this semantics was specified in Coq and in a categorical model of type theories by S. Boulmé (see his thesis [BoulmePhD00]), a point which enlightened the borders of this approach regarding the logical background. On the other hand, before designing the syntax, it was necessary to study the development style in such a language. R. Rioboo [ThRRCalculemus, HardinRiobooTSI04] used the OCaml language to try different solutions, which are recorded in [HardinRiobooTSI04].

Initial Focal design

Then the time came to design the syntax of the language and the compiler. To overcome inconsistency risks, an original dependency analysis was incorporated into the compiler (V. Prevosto's thesis [PrevostoPhD03, TPHOL2002, PrevostoJAR02]), and the correctness of the compiler (mostly written by V. Prevosto) against Focal's semantics was proved (by hand) [TLCA2005], a point which brings satisfactory confidence in the language's correctness. Then Rioboo began the development of a large computer algebra library, which offers full specifications and implementations of the usual algebraic structures up to multivariate polynomial rings with complex algorithms, first as a way to extensively test the language and the (quite satisfactory) efficiency of the produced code, and then to provide a standard library of mathematical backgrounds. And D. Doligez [ZenonBDD] started the development of Zenon, an automatic prover based on the tableau method, which takes a Focal statement and tries to build a proof of it and, when it succeeds, issues a Coq term. More recently, M. Carlier and C. Dubois [CarlierDuboisLNCS2008] began the development of a test tool for Focal.

Focal has already been used to develop large examples such as the standard library and the computer algebra library. The library dedicated to the algebra of access control models, developed by M. Jaume and C. Morisset [jias06, fcsarspa06, MorissetPhd], is another large example, which borrows implementations of orderings, lattices and boolean algebras from the computer algebra library. Focal was also very successfully used to formalize airport security regulations, a work by D. Delahaye, J.-F. Etienne, C. Dubois and V. Donzeau-Gouge [EDEMOI-All, EDEMOI-Model, EDEMOI-Proof]. This last work led to the development of a translator [Focal-UML] from Focal to UML for documentation purposes.

The FoCaLize system

The FoCaLize development effort started in 2006: it was clearly a continuation of the Foc and Focal efforts. The new system was rewritten from scratch. A new language and syntax were designed and carefully implemented with ease of use, expressivity and programmer friendliness in mind. The addition of powerful data structure definitions, together with the corresponding pattern-matching facility, leads to new expressive power.

The Zenon automatic theorem prover was also integrated in the compiler and natively interfaced within the FoCaLize language. New developments for recursive function support are on the way (in particular for termination proofs).

A formal specification can be built by declaring names of functions and values and by introducing properties. Then, design and implementation can be done incrementally by adding definitions of functions and proving that the implementation meets the specification or design requirements. Thus, developing in FoCaLize is a kind of refinement process from formal model to design and code, completely done within FoCaLize. Considering the global development within the same environment brings conciseness and helps documentation and reviewing. A FoCaLize development is organised as a hierarchy that may have several roots. The upper levels of the hierarchy are built during the specification stage while the lower ones correspond to implementation, and each node of the hierarchy corresponds to a progress toward a complete implementation.

The FoCaLize system provides means for the developers to formally express their specifications and to go step by step (in an incremental approach) to design and implementation while proving that such an implementation meets its specification or design requirements. The FoCaLize language offers high level mechanisms such as inheritance, late binding, redefinition, parametrization, etc. Confidence in proofs submitted by developers or automatically done relies on formal proof verification. FoCaLize also provides some automation of documentation production and management.

We would like to mention several works about safety and/or security concerns within FoCaLize and specially the definition of a safety life cycle by P. Ayrault, T. Hardin and F. Pessaux [TTSS08] and the study of some traps within formal methods by E. Jaeger and T. Hardin[traps].

The FoCaLize system in short

FoCaLize can be seen as an IDE still in development, which gives a positive solution to the three requirements identified above:
  1. pertinent documentation is maintained within the system being written, and its extraction is an automatic part of the compilation process,
  2. proofs are written using a high level proof language, so that proofs are easier to write and their verification is automatic and reliable,
  3. the framework provides powerful abstraction mechanisms to facilitate design and development; however, these mechanisms are carefully ruled: the compiler performs numerous validity checks to ensure that no further development can inadvertently break the invariants or invalidate the proofs; indeed, the compiler ensures that if a theorem was based on assumptions that are now violated by the new development, then the theorem is out of reach of the programmer.

5
They were members of the SPI (Semantics, Proofs, Implementations) team of the LIP6 (Lab. Informatique de Paris 6) at Université Pierre et Marie Curie (UPMC), Paris
6
For example Computer Algebra Libraries use several different representations of polynomials according to the treatment to be done
Chapter 1 Overview

Before entering the precise description of FoCaLize, we give an informal presentation of nearly all of its features, to help further reading of the reference manual. Every construction or feature of FoCaLize is fully described in the following chapters.

1.1 The Basic Brick

The primitive entity of a FoCaLize development is the species. It can be viewed as a record grouping ``things'' related to the same concept. As in most modular design systems (i.e. object-oriented, algebraic abstract types), the idea is to group a data structure with the operations that process it. Since FoCaLize does not only address data types and operations, among these ``things'' we also find the declarations (specifications) of these operations, the properties (which may represent requirements) and their proofs.

We now describe each of these ``things'', called methods, and make these notions concrete on an example that we will incrementally extend. We want to model some simple algebraic structures. Let us start with the description of a ``setoid'', representing the data structure of ``things'' belonging to a set, which can be submitted to an equality test and exhibited (i.e. one can get a witness of existence of one of these ``things'').
species Setoid =
  signature ( = ) : Self -> Self -> bool ;
  signature element : Self ;

  property refl : all x in Self, x = x ;
  property symm : all x y in Self, x = y -> y = x ;
  property trans: all x y z in Self, x=y and y=z -> x=z ;
  let different (x, y) = basics#not_b (x = y) ;

end ;;

In this species, the representation is not explicitly given (no keyword representation), since we do not need to set it to be able to express the functions and properties our ``setoid'' requires. However, we can refer to it via Self, and it is in fact a type variable. In the same way, we only specify a signature for the equality (operator =). We then introduce the three properties that an equality (an equivalence relation) must satisfy.

We complete the example with the definition of the function different, which uses the name = (here basics#not_b stands for the function not_b, the boolean negation, coming from the FoCaLize source file basics.fcl). It is already possible to prove that different is irreflexive, under the hypothesis that = is an equivalence relation (i.e. that each implementation of = given later will satisfy these properties).
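Such a proof can even be written right now, while = is still only declared. The following sketch is ours (the theorem name and exact hint list are illustrative, not taken from the standard library):

  theorem different_is_irreflexive : all x in Self, ~ different (x, x)
    proof = by definition of different property refl ;

Zenon is given the definition of different and the property refl as hints, and searches for the proof by itself.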





It is possible to use methods that are only declared before they get a real definition, thanks to the late-binding feature provided by FoCaLize. In the same spirit, redefining a method is allowed in FoCaLize, and it is always the last version which is kept as the effective definition inside the species.

1.2 Type of Species, Interfaces and Collections
The type of a species is obtained by removing definitions and proofs. Thus, it is a kind of record type, made of all the method types of the species. If the representation is still a type variable, say a, then the species type is prefixed with an existential binder ∃ a. This binder is eliminated as soon as the representation is instantiated (defined), and it must be eliminated to obtain runnable code.





The interface of a species is obtained by abstracting the representation type in the species type and this abstraction is permanent.





Beware! No special construction is given to denote interfaces in the concrete syntax: they are simply denoted by the name of their underlying species. Do not confuse a species and its interface.





The species type remains totally implicit in the concrete syntax, being just used as a step to build the species interface. It is used during inheritance resolution.

Interfaces can be ordered by inclusion, which provides a very simple notion of subtyping. This point will be commented on further.

A species is said to be complete if all declarations have received definitions and all properties have received proofs.

When complete, a species can be submitted to an abstraction process of its representation to create a collection. The interface of the collection is then just the interface of the complete species underlying it. A collection can hence be seen as an abstract data type, only usable through the methods of its interface, but having the guarantee that all methods/theorems are defined/proved.
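To fix ideas, the two steps (completing a species, then abstracting it into a collection) can be sketched as follows. All the names below are ours, we assume an integer equality function int_eq available in basics, and the remaining proofs are elided:

  species Setoid_of_int inherits Setoid =
    representation = int ;
    let element = 1 ;
    let ( = ) (x, y) = basics#int_eq (x, y) ;
    proof of refl = by definition of ( = ) ;
    ...
  end ;;
  collection Int_setoid implements Setoid_of_int ;;

Once created, the representation of Int_setoid is hidden: client code may call Int_setoid!element or Int_setoid!( = ), but can no longer exploit the fact that the carrier is int.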

1.3 Combining Bricks by Inheritance

A FoCaLize development is organised as a hierarchy which may have several roots. Usually the upper levels of the hierarchy are built during the specification stage while the lower ones correspond to implementations. Each node of the hierarchy, i.e. each species, is a step toward a complete implementation. In the previous example, forgetting different, we typically presented a kind of ``specification'' species, since it expressed only signatures of functions to be implemented later and properties to be proved later.

We can now create a new species, possibly more complex, by inheritance from a previously defined one. We say ``possibly more complex'' because it can add new operations and properties, but it can also simply bring real definitions to signatures and proofs to properties, adding no new method.

Hence, in FoCaLize, inheritance serves two kinds of evolution. In the first case, the evolution aims at making a species with more operations while keeping those of its parents (or redefining some of them). In the second case, the species only tends to get closer to a ``runnable'' implementation, providing explicit definitions for methods that were previously only declared.

Continuing our example, we want to extend our model to represent ``things'' with a multiplication and a neutral element for this operation.
species Monoid inherits Setoid =
  signature ( * ) : Self -> Self -> Self ;
  signature one : Self ;
  let element = one * one ;
end ;;

We see here that we added new methods and also gave a definition to element, saying it is the application of the method * to one twice, both of them being only declared so far. Here we used inheritance in both of the presented ways: making a more complex entity by adding methods, and getting closer to the implementation by explicitly defining element.

Multiple inheritance is available in FoCaLize. For the sake of simplicity, the above example uses single inheritance. When a method is inherited from several parents, the order of the parents in the inherits clause determines which version is kept.

The type of a species built using inheritance is defined like that of any other species: the method types retained inside it are those of the methods present in the species once inheritance is resolved.

A strong constraint in inheritance is that the type of inherited and/or redefined methods must not change. This is required to ensure the consistency of the FoCaLize model, hence of the developed software. More precisely, if the representation is given by a type expression containing some type variables, then it can be further defined by instantiation of these variables. In the same way, two signatures have compatible types if they have a common unifier, thus, roughly speaking, if they are compatible ML-like types. For example, if the representation was not yet defined, thus still being a type variable, it can be defined as int. And if a species S inherits a method called m from both S1 and S2, there is no type clash if S1!m and S2!m can be unified; the method S!m then has the most general unifier of these two types as its own type.
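This unification rule can be sketched as follows (the species names are ours, for illustration only):

  species S1 = signature m : Self -> Self ; end ;;
  species S2 = signature m : Self -> Self ; end ;;
  species S inherits S1, S2 = ... end ;;

Here S1!m and S2!m trivially unify, so S!m gets the type Self -> Self. Had S1 declared m : Self -> bool while S2 declared m : Self -> int, the two types would not unify and the compiler would reject the inheritance.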

1.4 Combining Bricks by Parametrisation

Until now, we have only been able to enrich species. However, we sometimes need to use a species not to take over its methods, but rather as an ``ingredient'' to build a new structure. For instance, a pair of setoids is a new structure, using the previous species as the ``ingredient'' to create the structure of the pair. Indeed, the structure of a pair is independent of the structure of each component it is made of. A pair can be seen as parametrised by its two components. Following this idea, FoCaLize allows two flavours of parametrisation.

1.4.1 Parametrisation by Collection Parameters

We first introduce collection parameters. They are collections that the hosting species may use, through their methods, to define its own ones.

A collection parameter is given a name C and an interface I. The name C serves to call the methods of C which figure in I. C can be instantiated by an effective parameter CE of interface IE. CE is a collection and its interface IE must contain I. Moreover, the collection and late-binding mechanisms ensure that all methods appearing in I are indeed implemented (defined for functions, proved for properties) in CE. Thus, no runtime error due to the linkage of libraries can occur, and any property stated in I can be safely used as a hypothesis.

Calling a species' method is done via the ``bang'' notation: !meth or
Self!meth for a method of the current species (and in this case even simpler: meth, since the FoCaLize compiler will resolve scoping issues). To call a collection parameter's method, the same notation is used: A!element stands for the method element of the collection parameter A.

To go on with our example, a pair of setoids has two components, hence a species for pairs of setoids will have two collection parameters. It is itself a setoid, a fact which is simply recorded via the inheritance mechanism: inherits Setoid gives Setoid_product all the methods of Setoid.
species Setoid_product (A is Setoid, B is Setoid) inherits Setoid =
  representation = (A * B) ;

  let ( = ) (x, y) =
     and_b
       (A!( = ) (first (x), first (y)),
        B!( = ) (scnd (x), scnd (y))) ;
  let create (x, y) in Self = basics#crp (x, y) ;
  let element = Self!create (A!element, B!element) ;

  proof of refl = by definition of ( = ) ;
end ;;

We express the representation of the product of two setoids as the Cartesian product of the representations of the two parameters. In A * B, * is the FoCaLize type constructor of pairs, A denotes the representation of the first collection parameter, and B that of the second collection parameter.

Next, we add a definition for = of Setoid_product, relying on the methods = of A (A!( = )) and of B (which are not yet defined). Similarly, we introduce a definition for element by building a pair, using the function create (which calls the predefined function basics#crp) and the methods element of A and B respectively. And we can prove that = of Setoid_product is indeed reflexive, under the hypotheses made on A!( = ) and B!( = ). The part of FoCaLize used to write proofs will be presented shortly, in section ??.

This way, the species Setoid_product builds its methods relying on those of its collection parameters. Note the two different uses of Setoid in our species Setoid_product, which inherits from Setoid and is parametrised by Setoid.





Why collection parameters rather than simply species parameters? There are two reasons. First, effective parameters must provide definitions/proofs for all the methods of the required interface: this is the contract. Thus, effective parameters must be complete species. Second, we do not want parametrisation to introduce dependencies on the parameters' representation definitions. For example, it is impossible to express ``if A!representation is int and B!representation is bool then A*B is a list of boolean values''. This would dramatically restrict the possibilities to instantiate parameters, since assumptions on the representation, possibly used in the parametrised species to write its own methods, could prevent collections having the right set of methods but a different representation from being used as effective parameters. Such a behaviour would make parametrisation too weak to be usable. We chose to always hide the representation of a collection parameter from the parametrised hosting species. Hence the introduction of the notion of collection, obtained by abstracting the representation of a complete species.

1.4.2 Parametrisation by Entity Parameters

Let us imagine we want to make a species working on natural numbers modulo a certain value. In the expression 5 modulo 2 is 1, both 5 and 2 are natural numbers. To be sure that the species will consistently work with the same modulo, this value must be embedded in the species. However, the species itself does not rely on a particular value of the modulo. Hence this value is clearly a parameter of the species, but a parameter whose value interests us, not only its representation and the methods acting on it. We call such parameters entity parameters; their introduction rests upon the introduction of a collection parameter, and they denote a value having the type of the representation of this collection parameter.

Let us first have a species representing natural numbers:
species IntModel =
  signature one : Self ;
  signature modulo : Self -> Self -> Self ;
end ;;
Note that IntModel can be later implemented in various ways, using Peano's integers, machine integers, arbitrary-precision arithmetic ...

We now build our species ``working modulo ...'', embedding the value of this modulo:
species Modulo_work (Naturals is IntModel, n in Naturals) =
  let job1 (x in Naturals) in ... =
    ... Naturals!modulo (x, n) ... ;
  let job2 (x in Naturals, ...) in ... =
    ... ... Naturals!modulo (x, n) ... ... ;
end ;;
Using the entity parameter n, we ensure that the species Modulo_work works for any value of the modulo, but always uses the same value n of the modulo everywhere inside the species.
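Entity parametrisation can be sketched more concretely as follows (all names here are ours, and we assume that IntModel also declares a signature plus):

  species Modulo_succ (Naturals is IntModel, n in Naturals) =
    let succ_mod (x in Naturals) in Naturals =
      Naturals!modulo (Naturals!plus (x, Naturals!one), n) ;
  end ;;

Whatever collection instantiates Naturals and whatever value instantiates n, succ_mod always reduces its result modulo the same n.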

1.5 The Final Brick

As briefly introduced in ??, a species needs to be fully defined to lead to executable code for its functions and checkable proofs for its theorems. When a species is fully defined, it can be turned into a collection. Hence, a collection represents the final stage of the inheritance tree of a species and leads to an effective data representation with executable functions processing it.

For instance, provided that the previous species IntModel has been turned into a fully-defined species MachineNativeInt through inheritance steps, with a method from_string allowing one to create the natural-number representation of a string, we could get a related collection by:
   collection MachineNativeIntColl implements MachineNativeInt ;;

Next, to get a collection implementing arithmetic modulo 8, we could extract from the species Modulo_work the following collection:
   collection Modulo_8_work implements Modulo_work
     (MachineNativeIntColl, MachineNativeIntColl!from_string (``8'')) ;;

As seen in this example, a species can be applied to effective parameters by giving their values with the usual syntax of parameter passing.

As said before, to ensure modularity and abstraction, the representation of a collection is hidden. This means that any software component dealing with a collection will only be able to manipulate it through the operations (methods) provided by its interface. This point is especially important since it prevents other software components from possibly breaking the invariants required by the internals of the collection.

1.6 Properties, Theorems and Proofs

FoCaLize does not only aim at writing programs: it intends to encompass both the executable model (i.e. the program) and the properties this model must satisfy. For this reason, ``special'' methods deal with logic instead of the purely behavioural aspects of the system: theorems, properties and proofs.

Stating a property implies that a proof that it holds will finally be given. For theorems, the proof is directly embedded in the theorem. Such proofs must be done by the developer and are finally sent to the formal proof assistant Coq, which automatically checks that the demonstration of the property is consistent. Writing a proof can be done in several ways.

It can be written in ``FoCaLize's proof language'', a hierarchical proof language that allows the developer to give hints and directions for a proof. This language is sent to an external theorem prover, Zenon [Zenon, zenon0.4.1], developed by D. Doligez. This prover is a first-order theorem prover based on the tableau method, incorporating implementation novelties such as sharing. From these hints, Zenon attempts to automatically generate the proof and exhibit a Coq term suitable for verification by Coq. The basic hints given by the developer to Zenon are: ``prove by definition of a method'' (i.e. looking inside its body) and ``prove by property'' (i.e. using the logical body of a theorem or property). Around this hints mechanism, the language allows the developer to build the proof by stating assumptions (that must obviously be demonstrated next) that can be used to prove lemmas or parts of the whole property. We show below an example of such a demonstration.
  theorem order_inf_is_infimum : all x y i in Self,
    !order_inf(i, x) -> !order_inf(i, y) ->
      !order_inf(i, !inf(x, y))
    proof:
      <1>1 assume x in Self, assume y in Self,
           assume i in Self, assume H1: !order_inf(i, x),
           assume H2: !order_inf(i, y),
           prove !order_inf(i, !inf(x, y))
        <2>1 prove !equal(i, !inf(!inf(i, x), y))
          by hypothesis H1, H2
             property inf_left_substitution_rule,
               equal_symmetric, equal_transitive
             definition of order_inf
        <2>9 qed
          by step <2>1
             property inf_is_associative, equal_transitive
             definition of order_inf
      <1>2 conclude
    ;

The important point is that Zenon works for the developer: it searches for the proof itself; the developer does not have to elaborate it formally ``from scratch''.

Like any automatic theorem prover, Zenon may fail to find a demonstration. In this case, FoCaLize allows the developer to write verbatim Coq proofs. The proof is then no longer automated, but this leaves the developer the full expressive power of Coq.

Finally, the assumed keyword is the ultimate proof backdoor: it states that the proof is not given and that the property must be admitted. Obviously, a really safe development should not make use of such ``proofs'', since they bypass the formal verification of the software's model. However, such a functionality remains needed, since some ``well-known'' properties can never be proved for a computer. For instance, ∀ x ∈ ℕ, x + 1 > x does not hold on a computer with native integers. However, in a mathematical framework this property holds and is needed to carry out other proofs. Thus the developer may either prove that all manipulated values remain in an interval where this property holds, or admit this property, or add code to detect overflow ... On another side, a development may be linked with external code, trusted or not, whose properties cannot be proved inside the FoCaLize part since it does not belong to it. Expressing properties of the FoCaLize part may require expressing properties of the imported code; these cannot be formally proved and must then be ``assumed''.
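Such an admitted property could be recorded as follows (the property name and the methods lt and succ are illustrative assumptions, not part of any standard species):

  property succ_increases : all x in Self, !lt (x, !succ (x)) ;
  proof of succ_increases = assumed ; (* may fail on native integers because of overflow *)

The statement can then be used as a hypothesis in further proofs, at the price of bypassing formal verification for it.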

1.7 Around the Language

In the previous sections, we presented FoCaLize through its programming model and, briefly, its syntax. We especially investigated the various entities making up a FoCaLize program. We now address what a FoCaLize program becomes once compiled. We recall that FoCaLize supports the redefinition of functions, which makes it possible, for example, to specialise code to a specific representation (for example, there exists a generic implementation of integer addition modulo n, but it can be redefined in arithmetic modulo 2 if boolean values are used to represent the two values). It is also a very convenient tool for software maintenance.

1.7.1 Consistency of the Software

All along the development cycle of a FoCaLize program, the compiler keeps track of the dependencies between species, their methods and the proofs, to ensure that the modification of one of them is detected in all those depending on it.

FoCaLize considers two types of dependencies: on declarations (types and statements) and on definitions (bodies and proofs). The redefinition of a function may invalidate the proofs that use properties of the body of the redefined function. All the proofs which truly depend on the definition are then erased by the compiler and must be done again in the context updated with the new definition. Thus the main difficulty is to choose the best level in the hierarchy at which to do a proof. In [PrevostoJaume2003], Prevosto and Jaume propose a coding style that minimises the number of proofs to be redone in case of a redefinition, by a certain kind of modularisation of the proofs.

1.7.2 Code Generation

FoCaLize currently compiles programs toward two languages, OCaml to get an executable piece of software, and Coq to have a formal model of the program, with theorems and proofs.

In OCaml code generation, all the logical aspects are discarded since they do not lead to executable code.

Conversely, in Coq, all the methods are compiled, i.e. ``computational'' methods as well as logical methods with their proofs. This allows Coq to check the entire consistency of the system developed in FoCaLize.

1.7.3 Tests
FoCaLize incorporates a tool named FocalTest [CarlierDuboisLNCS2008] for Integration/Validation testing. It makes it possible to automatically confront a property of the specification with an implementation. It automatically generates test cases, executes them and produces a test report as an XML document. The property under test is used to generate the test cases; it also serves as an oracle. When a test case fails, it means a counterexample of the property has been found: the implementation does not match the property; it can also indicate an error in the specification.

The tool FocalTest automatically produces the test environment and the drivers to conduct the tests. We benefit from the inheritance mechanism to isolate the testing harness from the components written by the programmer.

The testable properties are required to be broken down into a precondition and a conclusion, both executable. FocalTest proposes pure random test case generation: it generates test cases until the precondition is satisfied; the verdict of the test case is given by executing the conclusion. This can be an expensive process for some kinds of preconditions. To overcome this drawback, a constraint-based generation is under development: it directly produces test cases for which the precondition is satisfied.
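The pure random generation loop described above can be sketched as follows in OCaml. This is only an illustration of the principle: the names (test_property, gen, etc.) are hypothetical and do not reflect FocalTest's actual API.

```ocaml
(* Sketch of pure random test generation: draw random inputs until
   one satisfies the precondition, then use the conclusion as the
   oracle. All names here are illustrative, not FocalTest's API. *)
let rec test_property ~precondition ~conclusion ~gen ~tries =
  if tries = 0 then None  (* gave up: no input satisfied the precondition *)
  else
    let x = gen () in
    if precondition x then Some (x, conclusion x)  (* verdict of the oracle *)
    else test_property ~precondition ~conclusion ~gen ~tries:(tries - 1)

let () =
  Random.init 42;
  (* Example property: if x is even then x + x is divisible by 4. *)
  match
    test_property
      ~precondition:(fun x -> x mod 2 = 0)
      ~conclusion:(fun x -> (x + x) mod 4 = 0)
      ~gen:(fun () -> Random.int 100)
      ~tries:1000
  with
  | Some (_, true) -> print_endline "test case passed"
  | Some (_, false) -> print_endline "counterexample found"
  | None -> print_endline "precondition never satisfied"
```

The cost problem mentioned above is visible here: if the precondition is rarely satisfied by random draws, most generated values are discarded, which is what the constraint-based generation aims to avoid.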

1.7.4 Documentation

The tool called FoCaLizeDoc [MaarekCalculemus03] automatically generates documentation; thus the documentation of a component is always consistent with its implementation.

This tool uses its own XML format that contains information coming not only from structured comments (that are parsed and kept in the program's abstract syntax tree) and FoCaLize concrete syntax but also from type inference and dependency analysis. From this XML representation, and thanks to some XSLT stylesheets, it is possible to generate HTML or LaTeX files. Although this documentation is not the complete safety case, it can helpfully contribute to its elaboration. In the same way, it is possible to produce UML models [Focal-UML] as a means to provide graphical documentation for FoCaLize specifications. The use of graphical notations appears quite useful when interacting with end-users, as these tend to be more intuitive and easier to grasp than their formal (or textual) counterparts. This transformation is based on a formal schema and captures every aspect of the FoCaLize language, so that it has been possible to prove the soundness of this transformation (semantic preservation).

FoCaLize's architecture is designed to easily plug in third-party analyses that can use the internal structures elaborated by the compiler from the source code. This allows, for example, building dedicated documentation tools for custom purposes, simply exploiting the information stored in the FoCaLize program's abstract syntax tree, or extra information possibly added by other processes and analyses.

Chapter 2 Installing and Compiling

2.1 Required software
To be able to develop with the FoCaLize environment, a few third-party tools are required. All of them can be freely downloaded from their respective websites.
2.2 Optional software
The FoCaLize compiler can generate dependency graphs from compiled source code. It generates them in a format suitable to be processed and displayed by the dotty tool of the ``Graphviz'' package. If you plan to examine these graphs, you also need to install this software from http://www.graphviz.org/.

2.3 Operating systems
FoCaLize was fully developed under Linux using free software. Hence, any Unix-based operating system should support FoCaLize. The currently tested Unix systems are: Fedora, Debian, Suse, BSD.

Windows users can run FoCaLize via the Unix-like environment Cygwin, which provides both user and developer tools. This software is freely distributed and available at http://www.cygwin.com/.


From the official Cygwin web site:
``Cygwin is a Linux-like environment for Windows. It consists of two parts: A DLL (cygwin1.dll) which acts as a Linux API emulation layer providing substantial Linux API functionality. A collection of tools which provide Linux look and feel.
The Cygwin DLL currently works with all recent, commercially released x86 32 bit and 64 bit versions of Windows, with the exception of Windows CE.
Cygwin is not a way to run native linux apps on Windows. You have to rebuild your application from source if you want it to run on Windows.

Cygwin is not a way to magically make native Windows apps aware of UNIX ® functionality, like signals, ptys, etc. Again, you need to build your apps from source if you want to take advantage of Cygwin functionality.
''


Under Cygwin, the required packages are the same as those listed in ?? and ??. As stated in the Cygwin citation above, you need to get the source packages of this software and compile them yourself, following the information provided in these packages.

The installation of FoCaLize itself is the same for all operating systems and is described in the following section (??).

2.4 Installation
FoCaLize is currently distributed as a tarball containing the whole source code of the development environment. You must first extract the archive (a directory will be created) by:
tar xvzf focalize-x.x.x.tgz
Next, go in the sources directory:
cd focalize-x.x.x/
You now must configure the build process by:
./configure
The configuration script then asks for directories where to install the FoCaLize components. You may just press enter to keep the default installation directories.
latour:~/src/focalize$ ./configure ~/pkg
Where to install FoCaLize binaries ?
Default is /usr/local/bin.
Just press enter to use default location.

Where to install FoCaLize libraries ?
Default is /usr/local/lib/focalize.
Just press enter to use default location.
After the configuration ends, just build the system:
make all
And finally, get root privileges to install the FoCaLize system:
su
make install

2.5 Compilation process and outputs

We call a compilation unit a file containing source code for toplevel definitions, species and collections. Visibility rules, described in section ??, are defined according to the status of compilation units. From a compilation unit, the compiler issues several files, described in the following.

2.5.1 Outputs
A FoCaLize development contains both ``computational code'' (i.e. code performing operations that lead to an effect, a result) and logical properties.


When compiled, two outputs are generated: the OCaml code and the ``pre-''Coq code. In addition, several other outputs can be generated for documentation or debugging purposes. See section ?? for details.

2.5.2 Compiling a source
Compiling a FoCaLize program involves several steps that are automatically handled by the focalizec command. Using the command line options, it is possible to tune the code generation steps as described in ??.
  1. FoCaLize source compilation. This step takes the FoCaLize source code and generates the OCaml and/or ``pre-''Coq code. You can disable the code generation for one of these languages (see page ??), or both; in this case, no code is produced: you only get the FoCaLize object code, without any other output, and the process ends at this point. If you disable one of the target languages, you won't get any generated file for it, hence there is no need to address its related compilation process described below.

    Assuming you generate code for both OCaml and Coq, you will get two generated files: source.ml (the OCaml code) and source.zv (the ``pre-''Coq code).

  2. OCaml code compilation. This step takes the generated OCaml code (a regular OCaml source file) and compiles it. This is done like any regular OCaml compilation; the only difference is that the search path containing the FoCaLize installation path and your own extra FoCaLize source file directories is automatically passed to the OCaml compiler. Hence this step acts like the manual invocation:
    ocamlc -c -I /usr/local/lib/focalize -I mylibs
       -I myotherlibs source.ml
        
    This produces the OCaml object file source.cmo. Note that you can also ask to compile the OCaml code in native mode; in this case the ocamlopt version of the OCaml compiler is selected (see the OCaml reference manual for more information) and the object files are .cmx files instead of .cmo ones.

  3. ``Pre-''Coq code compilation. This step takes the generated .zv file and attempts to produce a real Coq .v source file by replacing the proofs written in the FoCaLize Proof Language by effective Coq proofs found by the Zenon theorem prover. Note that if Zenon fails to find a proof, a hole will remain in the final Coq .v file. Such a hole appears as the text ``TO_BE_DONE_MANUALLY.'' in place of the effective proof. In this case, Coq will obviously fail to compile the file, so the user must do the proof by hand or modify the original FoCaLize source file to get a working proof. This step acts like the manual invocation:
    zvtov -new source.zv
    For more about the Zenon options, consult section ??.

  4. Coq code compilation. This step takes the generated .v code and compiles it with Coq. This is done like any regular Coq compilation; the only difference is that the search path containing the FoCaLize installation path and your own extra FoCaLize source file directories is automatically passed to the Coq compiler.
    coqc -I /usr/local/lib/focalize -I mylibs
      -I myotherlibs source.v
        
    Once this step is done, you have the Coq object files and you are sure that Coq validated your program's model, properties and proofs. The final ``assessor'' of the tool-chain has accepted your program.

Once all separate files are compiled, to get an executable from the OCaml object files you must link them together, providing the same search path as above and the .cmo files corresponding to all the OCaml files generated from all your FoCaLize .foc files. You also need to add the .cmo files corresponding to the modules of the standard library you use (currently this must be done by the user; future versions will automate this process).
ocamlc -I mylibs -I myotherlibs
  install_dir/ml_builtins.cmo install_dir/basics.cmo
  install_dir/sets.cmo ...
  mylibs/src1.cmo mylibs/src2.cmo ...
  myotherlibs/src3.cmo ...
  source1.cmo source2.cmo ...
  -o exec_name
    
Chapter 3 The core language

3.1 Lexical conventions

3.1.1 Blanks
The following characters are considered as blanks: space, newline, horizontal tabulation, carriage return, line feed and form feed. Blanks are ignored, but they separate adjacent identifiers, literals and keywords that would otherwise be confused as a single identifier, literal or keyword.

3.1.2 Comments
Comments (possibly spanning several lines) are introduced by the two characters (*, with no intervening blanks, and terminated by the two characters *), with no intervening blanks. Comments are treated as blanks. Comments can occur inside string or character literals (provided the * character is escaped) and can be nested. They are discarded during the compilation process. Example:
species S =
  ...
  let m (x in Self) = (* Another discarded comment. *)
  ...
end ;;
(* Another discarded comment at end of file. *)

Comments spanning a single line start with the two characters -- and end with the end-of-line character. Example:
species S =
  let m (x in Self) = -- Another uni-line comment.
  ...
end ;;

3.1.3 Annotations
Annotations are introduced by the three characters (**, with no intervening blanks, and terminated by the two characters *), with no intervening blanks. Annotations cannot occur inside string or character literals and cannot be nested. They must precede the construct they document. In particular, a source file cannot end with an annotation.

Unlike comments, annotations are kept during the compilation process and recorded in the compilation information (``.fo'' files). Annotations can be processed later by external tools that analyze them to produce new FoCaLize source code accordingly. For instance, the FoCaLize development environment provides the FoCaLizeDoc automatic documentation tool that uses annotations to automatically generate documentation. Several annotations can be put in sequence for the same construct. We call such a sequence an annotations block. Using embedded tags in annotations allows third-party tools to easily find out the annotations that are meaningful to them, and safely ignore others. For more information, consult ??. Example:
(** Documentation for species S. *)
species S =
  ...
  let m (x in Self) =
    (** {@TEST} Annotation for the test generator. *)
    (** {@MY_TAG_MAINTAIN} Annotation for maintainers. *)
    ... ;
end ;;

3.1.4 Identifiers



FoCaLize features a rich class of identifiers with sophisticated lexical rules that provide fine distinction between the kind of notion a given identifier can designate.

3.1.4.1 Introduction

Sorting words to find out which kind of meaning they may have is a very common conceptual categorization of names that we use when we write or read ordinary English texts. We routinely distinguish between:
·
a word only made of lowercase characters, that is supposed to be an ordinary noun, such as "table", "ball", or a verb as in "is", or an adjective as in "green",
·
a word starting with an uppercase letter, that is supposed to be a name, maybe a family or christian name, as in "Kennedy" or "David", or a location name as in "London".
We use this distinctive look of words as a useful hint to help understand phrases. For instance, we accept the phrase "my ball is green" as meaningful, whereas "my Paris is green" is considered nonsense. This is simply because "ball" is a regular noun and "Paris" is a name. The word "ball" has the right lexical classification in the phrase, but "Paris" has not. It is also clear that you can replace "ball" by another ordinary noun and get something meaningful: "my table is green"; the same nonsense arises if you replace "Paris" by another name: "my Kennedy is green".

Natural languages are far more complicated than computer languages, but FoCaLize uses the same kind of tricks: the ``look'' of words helps a lot to understand what the words are designating and how they can be used.

3.1.4.2 Conceptual properties of names

FoCaLize distinguishes 4 concepts for each name:
·
the fixity assigns the place where an identifier must be written,
·
the precedence decides the order of operations when identifiers are combined together,
·
the categorisation fixes which concept the identifier designates,
·
the nature of a name can be either symbolic or alphanumeric.
These concepts are compositional, i.e. they are all independent from one another. Put another way: for any fixity, precedence, category and nature, there exist identifiers with these exact properties.

We further explain those concepts below.

3.1.4.3 Fixity of identifiers

The fixity of an identifier answers the question ``where must this identifier be written?''.
·
a prefix is written before its argument, as sin in sin x or - in - y,
·
an infix is written between its arguments, as + in x + y or mod in x mod 3,
·
a mixfix is written among its arguments, as if ... then ... else ... in if c then 1 else 2.
In FoCaLize, as in maths, ordinary identifiers are always prefix and binary operators are always infix.
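The three fixities can be observed in OCaml, which follows the same conventions on this point (an illustration in a target language, not FoCaLize syntax itself):

```ocaml
(* The three fixities, illustrated in OCaml: ordinary identifiers
   are prefix, binary operators are infix, and a few constructs
   such as the conditional are mixfix. *)
let () =
  assert (sin 0.0 = 0.0);               (* prefix: sin before its argument *)
  assert (1 + 2 = 3);                   (* infix: + between its arguments *)
  assert (7 mod 3 = 1);                 (* infix: mod between its arguments *)
  assert ((if true then 1 else 2) = 1)  (* mixfix: if ... then ... else ... *)
```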

3.1.4.4 Precedence of identifiers

The precedence determines where implicit parentheses take place in a complex combination of symbols. For instance, according to the usual mathematical conventions:
·
1  +  2  *  3 means 1  +  (2  *  3) hence 7, it