FoCaLize
Reference Manual
1.0.0
January 2009
Authors
Thérèse Hardin,
François Pessaux,
Pierre Weis,
Damien Doligez
About FoCaLize
FoCaLize is the result of the collective work of several researchers,
listed below, who designed, defined, compiled, studied, extended,
used and debugged the preceding versions. They were helped by many
students who did summer internships under their supervision. They
would like to thank all these students and, more generally, all the
people who contributed to FoCaLize.
FoCaLize contributors
Philippe Ayrault (SPI-LIP6), William Bartlett (CPR-CEDRIC), Julien
Blond (SPI-LIP6), Sylvain Boulmé (SPI-LIP6), Matthieu Carlier
(CPR-CEDRIC), Damien Doligez (GALLIUM-INRIA), David Delahaye
(CPR-CEDRIC), Catherine Dubois (CPR-CEDRIC), Jean-Frédéric Etienne
(CPR-CEDRIC), Stéphane Fechter (SPI-LIP6), Mathieu Jaume (SPI-LIP6),
Lionel Habib (SPI-LIP6), Thérèse Hardin (SPI-LIP6), Charles
Morisset (SPI-LIP6), Ivan Noyer (SPI-LIP6), François Pessaux
(SPI-LIP6), Virgile Prevosto (SPI-LIP6), Renaud Rioboo (CPR-CEDRC),
Lien Tran (SPI-LIP6), Véronique Viguié Donzeau-Gouge (CPR-CNAM),
Pierre Weis (ESTIME-INRIA)
and their institutions
SPI (Semantics, Proofs and Implementations) is a team of LIP6
(Laboratoire d'Informatique de Paris 6) at UPMC (Pierre and Marie Curie
University)1.
CPR (Conception et Programmation Raisonnées) is a team of CEDRIC
(Centre d'Etudes et de Recherches du CNAM) at CNAM (Conservatoire National
des Arts et Métiers)2 and ENSIIE (Ecole Nationale d'Informatique pour
l'Industrie et l'Entreprise)3.
ESTIME and GALLIUM are teams of INRIA Rocquencourt4.
Thanks
The Foc project was first partially supported by LIP6 (Projet Foc, LIP6
1997), then by the Ministry of Research (Action Modulogic).
The Focal research team was then partially supported by the French ANR
project SSURF, ANR-06-SETI-016 (Safety and Security UndeR Focal). The project
also benefited from strong collaborations with the EDEMOI ANR project and
with the BERTIN and SAFERIVER companies.
The FoCaLize language and compiler development effort started around 2005.
The architecture conception and code rewriting started from scratch in 2006,
finally leading to the first focalizec compiler and FoCaLize system
distribution in January 2009.
This manual documents the completely revised system with the new syntax and
its semantic extensions.
1. UPMC-LIP6,
104 avenue du Président Kennedy, Paris 75016, France,
Firstname.Lastname@lip6.fr
2. CNAM-CEDRIC,
292 rue Saint Martin, 75003 Paris, France,
Firstname.Lastname@cnam.fr
3. ENSIIE-CEDRIC,
1 Square de la Résistance, 91025 Evry Cedex, France,
Lastname@ensiie.fr
4. INRIA,
Bat 8. Domaine de Voluceau, Rocquencourt, BP 105, F-78153 Le
Chesnay, France, Firstname.Lastname@inria.fr
The Foc project was launched in 1998 by T. Hardin and R. Rioboo
[HardinRiobooTSI04]5
with the objective of helping all stages of development of critical software
within the safety and security domains. The methods used in these domains
are evolving: ad-hoc and empirical approaches are being replaced by more
formal ones. For example, for high levels of safety, formal models of the
requirement/specification phase are increasingly adopted, as they
allow mechanized proofs, testing, or static analysis of the required
properties. In the same way, high-level assurance in system security calls
for the use of formal methods throughout the software development process,
and is often required at the specification level.
Thus the project aimed to elaborate an Integrated
Development Environment (IDE) able to provide high-level, justified
confidence to its users, while remaining easy to use by well-trained
engineers.
To ease the development of high-integrity systems with numerous software
components, an Integrated Development Environment (IDE) should provide
tools to formally express specifications, to describe design and
coding, and to ensure that specification requirements are met by the
corresponding code. This is not enough. First, standards for critical systems
ask for pertinent documentation, which has to be maintained through all the
revisions of the system life cycle. Second, the conformance evaluation
process of software is by nature a sceptical analysis: any proof
of code correctness must be easily redone on request, and traceability
must be eased. Third, design
and coding are difficult tasks. Research in software engineering has
demonstrated the help provided by object-oriented
features such as inheritance and late binding, and early research on
programming languages has pointed out the importance of abstraction
mechanisms such as modularity in maintaining invariants. There are
many other points which should also be considered when designing
an IDE for safe and/or secure systems, to ensure conformance with high
Evaluation Assurance or Safety Integrity Levels (EAL 5-7 or SIL 3-4)
and to ease the evaluation process according to various standards
(e.g. IEC 61508, Common Criteria, ...): handling of non-functional contents
of specifications, handling of dysfunctional behaviours and vulnerabilities
from the very beginning of development, fault avoidance, fault
detection by validation testing, and vulnerability and safety analysis.
Initial application testbed
When the Foc project was launched by Hardin and Rioboo,
only one specific domain was considered: Computer
Algebra. Algorithms used in this domain can be rather intricate and
difficult to test, and it is not rare that computer algebra systems
return an incorrect result, due to semantic flaws, compiler anomalies,
etc. Thus the idea was to design a language for specifying the
mathematics underlying these algorithms and for moving step by step to
different kinds of implementations, according to the specificities of the
problem under consideration6. The first step was to design
the semantics of such a language, trying to meet several
requirements: ease of expressing mathematical statements, a clear
distinction between a mathematical structure (semi-ring, polynomial,
...) and its different implementations, ease of development
(modularity, inheritance, parametrisation, abstraction, ...), runtime
efficiency, and confidence in the whole development (mechanized proofs,
...). After an initial phase of conceptual design, the Foc
semantics was submitted to a double test. On the one hand, this semantics
was specified in Coq and in a categorical model of type theories by
S. Boulmé (see his thesis [BoulmePhD00]), a work which
clarified the limits of this approach regarding the logical
background. On the other hand, before designing the syntax, it was
necessary to study the development style in such a
language. R. Rioboo [ThRRCalculemus, HardinRiobooTSI04] used the
OCaml language to try different solutions, which are recorded in
[HardinRiobooTSI04].
Then the time came to design the syntax of the language and the compiler. To
avoid risks of inconsistency, an original dependency analysis was
incorporated into the compiler (V. Prevosto's
thesis [PrevostoPhD03, TPHOL2002, PrevostoJAR02]) and the correctness of the
compiler (mostly written by V. Prevosto) with respect to Focal's semantics
was proved (by hand) [TLCA2005], a point which
brings satisfactory confidence in the language's correctness. Then
Rioboo began the development of a large computer algebra
library, which offers full specification
and implementation of the usual algebraic structures up to multivariate
polynomial rings with complex algorithms, first as a way to
extensively test the language and the (quite satisfactory)
efficiency of the produced code, and then to provide a standard library
of mathematical background. D. Doligez [ZenonBDD] started the
development of Zenon, an automatic prover based on the tableau method,
which takes a Focal statement and tries to build a proof of it, issuing
a Coq term when it succeeds. More recently, M. Carlier and
C. Dubois [CarlierDuboisLNCS2008] began the development of a
test tool for Focal.
Focal has already been used to develop sizeable examples, such as the
standard library and the computer algebra library. The library
dedicated to the algebra of access control models, developed by
M. Jaume and C. Morisset [jias06, fcsarspa06, MorissetPhd],
is another large example, which borrows implementations of orderings,
lattices and boolean algebras from the computer algebra library.
Focal was also very successfully used to formalize airport security
regulations, a work by D. Delahaye, J.-F. Etienne, C. Dubois and
V. Donzeau-Gouge [EDEMOI-All, EDEMOI-Model, EDEMOI-Proof]. This
last work led to the development of a translator [Focal-UML]
from Focal to UML for documentation purposes.
The FoCaLize development effort started in 2006 as a clear
continuation of the Foc and Focal efforts. The new system was rewritten
from scratch. A new language and syntax were designed and carefully
implemented with ease of use, expressivity and programmer
friendliness in mind. The addition of powerful data structure definitions,
together with the corresponding pattern-matching facility, brought new
expressive power.
The Zenon automatic theorem prover was also integrated into the
compiler and natively interfaced within the FoCaLize language. New
developments supporting recursive functions are on the way (in particular
for termination proofs).
A formal specification can be built by declaring names of functions
and values and by introducing
properties. Design and implementation can then be done incrementally
by adding definitions of functions and by proving that the implementation
meets the specification or design requirements. Thus, developing in
FoCaLize is a kind of refinement process from formal model to design
and code, done entirely within FoCaLize. Handling the global development
within the same environment brings conciseness and helps documentation
and reviewing.
Thus a FoCaLize development is organised as a hierarchy that may have
several roots. The upper levels of the hierarchy are built along the
specification stage while the lower ones correspond to
implementation and each node of the hierarchy corresponds to a progress
toward a complete implementation.
The FoCaLize system provides means for the developers to formally express
their specifications and to go step by step (in an incremental approach) to
design and implementation while proving that such an implementation
meets its specification or design requirements. The FoCaLize language offers
high level mechanisms such as inheritance, late binding, redefinition,
parametrization, etc. Confidence in proofs submitted by developers or
done automatically relies on formal proof verification. FoCaLize also
provides some automation of documentation production and management.
We would like to mention several works about safety and/or security
concerns within FoCaLize, especially the definition of a safety life
cycle by P. Ayrault, T. Hardin and F. Pessaux [TTSS08] and the
study of some traps within formal methods by E. Jaeger and
T. Hardin [traps].
The FoCaLize system in short
FoCaLize can be seen as an IDE still in development, which
gives a positive solution to the three requirements identified above:
- pertinent documentation is maintained within the system being written,
and its extraction is an automatic part of the compilation process,
- proofs are written using a high-level proof language, so that proofs
are easier to write and their verification is automatic and reliable,
- the framework provides powerful abstraction mechanisms to facilitate
design and development; however, these mechanisms are carefully ruled:
the compiler performs numerous validity checks to ensure that no
further development can inadvertently break the invariants or
invalidate the proofs; indeed, the compiler ensures that if a theorem
was based on assumptions that are now violated by the new development,
then the theorem is out of reach of the programmer.
5. They were members of the SPI (Semantics, Proofs,
Implementations) team of the LIP6 (Lab. Informatique de Paris 6)
at Université Pierre et Marie Curie (UPMC), Paris.
6. For example, Computer Algebra
libraries use several different representations of polynomials
according to the treatment to be done.
Before entering the precise description of FoCaLize, we give an
informal
presentation of nearly all its features, to help further reading of the
reference manual. Every construction or feature of FoCaLize will be
fully described in the following chapters.
The primitive entity of a FoCaLize development is the
species. It can be viewed as a record grouping ``things'' related
to the same concept. As in most modular design systems (e.g. object-oriented
languages, algebraic abstract types), the idea is to group a data
structure with the operations that process it. Since FoCaLize does not
only address data types and operations, among these ``things'' we also
find the declarations (specifications) of these operations, the
properties (which may represent
requirements) and their proofs.
We now describe each of these ``things'', called methods.
We now make these notions concrete with an example that we will extend
incrementally. We want to model some simple algebraic structures. Let's start
with the description of a ``setoid'' representing the data structure of
``things'' belonging to a set, which can be submitted to an
equality test and
exhibited (i.e. one can get a witness of existence of one of these
``things'').
species Setoid =
  signature ( = ) : Self -> Self -> bool ;
  signature element : Self ;
  property refl : all x in Self, x = x ;
  property symm : all x y in Self, x = y -> y = x ;
  property trans: all x y z in Self, x = y and y = z -> x = z ;
  let different (x, y) = basics#not_b (x = y) ;
end ;;
In this species, the representation is not explicitly
given (no keyword representation), since we don't need to fix
it in order to express the functions and properties our ``setoid''
requires. However, we can refer to it via Self, and it is in fact
a type variable. In the same way, we specify a signature for the
equality (operator =). We introduce the three properties that
an equality (an equivalence relation) must satisfy.
We complete the example with the definition of
the function different, which uses the name =
(here basics#not_b stands for the
function not_b, the boolean negation, coming from the FoCaLize
source file basics.fcl). It is possible right now to prove that
different is irreflexive, under the hypothesis that = is an
equivalence relation (i.e. that each implementation of = given
later will satisfy these properties).
Methods that are only declared can be used before they get a
real definition, thanks to the late-binding feature
provided by FoCaLize. In the same spirit, redefining a method is
allowed in FoCaLize, and it is
always the last version which is kept as the effective definition
inside the species.
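As a loose object-oriented analogy (all Python names here are hypothetical transliterations, not FoCaLize output), late binding resembles an abstract method that an already-defined method calls: different below is written once, in terms of an equality that is supplied, or redefined, only later.

```python
from abc import ABC, abstractmethod

class Setoid(ABC):
    """Sketch of the Setoid species: 'eq' is only declared (a signature)."""

    @abstractmethod
    def eq(self, x, y):
        """Declared equality; a real definition comes later."""

    def different(self, x, y):
        # Defined now, yet it calls the still-abstract 'eq': late binding
        # resolves the call to whichever definition a subclass provides.
        return not self.eq(x, y)

class IntSetoid(Setoid):
    """A later 'definition': integers compared by value."""
    def eq(self, x, y):
        return x == y

s = IntSetoid()
print(s.different(1, 2))  # True
print(s.different(3, 3))  # False
```

The analogy is only partial: unlike a Python subclass, a FoCaLize species also carries the properties that every future definition of eq must satisfy.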
1.2 Type of Species, Interfaces and Collections
The type of a species is obtained by removing definitions
and proofs. Thus, it is a kind of record type, made of all the method
types of the species. If the representation is still a type
variable, say a, then the species type is prefixed with an
existential binder ∃a. This binder is eliminated as
soon as the representation is instantiated (defined), and it
must be eliminated to obtain runnable code.
The interface of a species is obtained by abstracting the representation
type in the species type, and this abstraction
is permanent.
Beware! No special construction
is given to denote interfaces in the concrete syntax; they are
simply denoted by the name of their underlying species. Do not
confuse a species and its interface.
The species type remains totally implicit in the concrete syntax, being
just a step in building the species interface. It is used
during inheritance resolution.
Interfaces can be ordered by inclusion, which provides a very
simple notion of subtyping. This point will be commented on later.
A species is said to be complete if all declarations have
received definitions and all properties have received proofs.
When complete, a species can undergo an abstraction
of its representation to create a collection. The
interface of the collection is thus just the interface of the
complete species underlying it. A collection can hence be seen as an
abstract data type, only usable through the methods of its interface,
but with the guarantee that all methods/theorems are defined/proved.
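As a loose analogy (hypothetical names, not FoCaLize's actual compilation scheme), a collection behaves like an object whose representation is kept private: clients can only go through the interface methods, so the invariants established inside cannot be broken from outside.

```python
class NatColl:
    """Sketch of a collection: representation hidden, every method defined."""
    __slots__ = ("_repr",)      # internal representation, not part of the interface

    def __init__(self, n: int):
        self._repr = max(0, n)  # invariant kept by construction: never negative

    # The interface: the only sanctioned way to use the collection.
    def succ(self) -> "NatColl":
        return NatColl(self._repr + 1)

    def to_int(self) -> int:
        return self._repr

print(NatColl(2).succ().to_int())  # 3
```

The guarantee a real collection adds, and this sketch cannot express, is that every declared method is defined and every stated property is proved.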
1.3 Combining Bricks by Inheritance
A FoCaLize development is organised as a hierarchy which may have
several roots. Usually the upper levels of the hierarchy are built
during the specification stage, while the lower ones correspond to
implementations. Each node of the hierarchy, i.e. each species,
is a step toward a complete implementation. In the previous
example, forgetting different, we typically presented a kind of
``specification'' species, since it expressed only
signatures of functions to be implemented later and properties
whose proofs are to be given later.
We can now create a new, possibly more complex, species by
inheriting from a previously defined one. We say ``possibly more
complex'' because it can add new operations and properties, but it can
also merely bring real definitions to signatures and proofs
to properties, adding no new method.
Hence, in FoCaLize, inheritance serves two kinds of evolution. In the
first case, the evolution aims at making a species with more
operations while keeping those of its parents (or redefining some of
them). In the second case, the species only gets closer
to a runnable implementation, providing explicit definitions for
methods that were previously only declared.
Continuing our example, we want to extend our model to represent
``things'' with a multiplication and a neutral element for this
operation.
species Monoid inherits Setoid =
  signature ( * ) : Self -> Self -> Self ;
  signature one : Self ;
  let element = one * one ;
end ;;
We see here that we added new methods but also gave a definition
to element, stating that it is the application of the method *
to one twice, both of them being only declared. Here we
used inheritance in both of the ways presented above: making a more complex
entity by adding methods, and getting closer to the
implementation by explicitly defining element.
Multiple inheritance is available in FoCaLize. For the sake of simplicity,
the above example uses simple inheritance. When a method is inherited
from several parents, the order of the parents in the
inherits clause determines the method that is kept.
The type of a species built using inheritance is defined
as for other species: the method types retained inside
it are those of the methods present in the species once
inheritance is resolved.
A strong constraint in inheritance is that the types of inherited
and/or redefined methods must not change. This is required to
ensure the consistency of the FoCaLize model, hence of the developed
software. More precisely, if the representation is given by a type
expression containing some type variables, it can be further defined
by instantiation of these variables. In the same way, two signatures
have compatible types if they have a common unifier, thus, roughly
speaking, if they are compatible ML-like types. For example, if the
representation was not yet defined, thus still being a type variable,
it can be defined as int. And if a species S inherits a method called m
from S1 and S2, there is no type clash if S1!m
and S2!m can be unified; the method S!m then has the most general
unifier of these two types as its own type.
1.4 Combining Bricks by Parametrisation
Until now, we have only been able to enrich species.
However, we sometimes need to use a species not to take over
its methods, but rather as an ``ingredient'' to build
a new structure. For instance, a pair of setoids is a
new structure, using the previous species as the ``ingredient''
to create the structure of the pair. Indeed, the structure of a pair is
independent of the structure of each component it is made of. A pair
can be seen as parametrised by its two components.
Following this idea, FoCaLize allows two flavors of parametrisation.
1.4.1 Parametrisation by Collection Parameters
We first introduce collection parameters. They are
collections that the hosting species may use, through their
methods, to define its own ones.
A collection parameter is given a name C and an interface
I. The name C serves to call the methods of C which appear in
I. C can be instantiated by an effective parameter CE of
interface IE. CE is a collection and its interface IE must
contain I. Moreover, the collection and late-binding mechanisms
ensure that all methods appearing in I are indeed implemented
(defined for functions, proved for properties) in CE. Thus, no
runtime error due to the linkage of libraries can occur, and any property
stated in I can safely be used as a hypothesis.
Calling a species' method is
done via the ``bang'' notation:
!meth or
Self!meth for a method of the current
species (and in this case, even simpler: meth, since the
FoCaLize compiler will resolve scoping issues). To call
a collection parameter's method, the same notation is used:
A!element stands for the method element of the
collection parameter A.
To go on with our example, a pair of setoids has two components, hence a
species for pairs of setoids will have two
collection parameters. It is itself a setoid, a fact which is
simply recorded via the inheritance mechanism:
inherits Setoid gives to Setoid_product all the methods
of Setoid.
species Setoid_product (A is Setoid, B is Setoid) inherits Setoid =
  representation = (A * B) ;
  let ( = ) (x, y) =
    and_b
      (A!( = ) (first (x), first (y)),
       B!( = ) (scnd (x), scnd (y))) ;
  let create (x, y) in Self = basics#crp (x, y) ;
  let element = Self!create (A!element, B!element) ;
  proof of refl = by definition of ( = ) ;
end ;;
We express the representation of the product of two setoids as the
Cartesian product of the representations of the two parameters. In
A * B, * is the FoCaLize type constructor of pairs, A denotes
the representation of the first collection parameter, and B
that of the second collection parameter.
Next, we add a definition for = of Setoid_product,
relying on the methods = of A (A!( = )) and of B
(which are not yet defined). Similarly, we introduce a definition for
element by building a pair, using
the function create (which calls the predefined function basics#crp)
and the methods element of, respectively, A
and B. And we can prove that = of Setoid_product is
indeed reflexive, under the hypotheses made on A!( = )
and B!( = ). The part of FoCaLize used to write proofs will be
presented briefly later, in section ??.
This way, the species Setoid_product builds its methods from those of
its collection parameters. Note the
two different uses of Setoid in our species Setoid_product, which
inherits from Setoid and is parametrised
by Setoid.
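The parametrisation at work in Setoid_product can be mimicked, very loosely and with hypothetical names, by a class that receives its two ``ingredient'' setoids as constructor arguments and builds its own methods from theirs:

```python
class IntSetoid:
    """A stand-in for a setoid collection over integers."""
    def eq(self, x, y):
        return x == y
    def element(self):
        return 0

class SetoidProduct:
    """Pairs of elements of setoids A and B, themselves forming a setoid."""
    def __init__(self, a, b):
        self.a, self.b = a, b       # the two 'collection parameters'

    def eq(self, x, y):
        # Componentwise equality, delegating to A!( = ) and B!( = ).
        return self.a.eq(x[0], y[0]) and self.b.eq(x[1], y[1])

    def element(self):
        # A witness built from the parameters' witnesses.
        return (self.a.element(), self.b.element())

p = SetoidProduct(IntSetoid(), IntSetoid())
print(p.eq(p.element(), (0, 0)))  # True
```

What the sketch omits is precisely FoCaLize's added value: the arguments are checked against an interface, so every method used here is guaranteed to be defined and every assumed property proved.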
Why collection parameters and not simply species
parameters? There are two reasons. First, effective parameters must
provide definitions/proofs for all the methods of the required
interface: this is the contract. Thus, effective parameters must
be complete species. Second, we do not want the parametrisation
to introduce dependencies
on the parameters' representation definitions. For example, it is
impossible to express ``if A!representation is int and B!representation
is bool then A*B is a list of boolean values''. This would
dramatically restrict the possibilities of instantiating parameters, since
assumptions on the representation, possibly used in the
parametrised species to write its own methods,
could prevent collections having the right set of methods but
a different representation from being used as
effective parameters. Such a behaviour would make parametrisation too
weak to be usable. We chose to always hide the representation of a
collection parameter from the parametrised
hosting species. Hence the introduction of the notion of
collection, obtained by abstracting the representation from a
complete species.
1.4.2 Parametrisation by Entity Parameters
Let us imagine we want to make a species working on natural numbers
modulo a certain value. In the expression
5 modulo 2 is 1, both 5 and 2
are natural numbers. To ensure that the species will
consistently work with the same modulo, this value must be embedded
in the species. However, the species itself doesn't rely
on a particular value of the modulo. Hence this value is clearly a
parameter of the species, but a parameter whose value
interests us, not only its representation and the
methods acting on it. We call
such parameters entity parameters; their introduction rests upon
the introduction of a collection parameter, and they denote a
value having the type of the representation of this
collection parameter.
Let us first have a species representing natural numbers:
species IntModel =
  signature one : Self ;
  signature modulo : Self -> Self -> Self ;
end ;;
Note that IntModel can be later implemented in various ways,
using Peano's integers, machine integers, arbitrary-precision
arithmetic ...
We now build our species ``working modulo ...'', embedding
the value of this modulo:
species Modulo_work (Naturals is IntModel, n in Naturals) =
  let job1 (x in Naturals) in ... =
    ... Naturals!modulo (x, n) ... ;
  let job2 (x in Naturals, ...) in ... =
    ... ... Naturals!modulo (x, n) ... ... ;
end ;;
Using the entity parameter n, we ensure that the
species Modulo_work works for any value of the
modulo, but will always use the same value n of the modulo
everywhere inside the species.
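An entity parameter can be pictured, again as a loose analogy with hypothetical names, as a plain value fixed once at construction time, so that every method of the instance works with the same modulo:

```python
class ModuloWork:
    """Sketch of Modulo_work: 'n' plays the role of the entity parameter."""
    def __init__(self, n: int):
        self.n = n              # the same modulo, used everywhere in the species

    def job1(self, x: int) -> int:
        return x % self.n

    def job2(self, x: int, y: int) -> int:
        return (x + y) % self.n

m5 = ModuloWork(5)
print(m5.job1(12))    # 2
print(m5.job2(4, 3))  # 2
```

Two instances built with different moduli coexist safely, just as two collections may instantiate the same parametrised species with different effective parameters.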
As briefly introduced in ??, a species
needs to be fully defined to lead to executable code for its functions
and checkable proofs for its theorems. When a species is fully
defined, it can be turned into a collection. Hence, a collection
represents the final stage of the inheritance tree of a species
and leads to an effective data representation with
executable functions processing it.
For instance, provided that the previous
species IntModel has been turned into a fully-defined species
MachineNativeInt through inheritance steps, with a method
from_string to create the natural number represented by a
string, we could get a related collection by:
collection MachineNativeIntColl implements MachineNativeInt ;;
Next, to get a collection implementing arithmetic modulo 8, we
could extract from the species Modulo_work the following
collection:
collection Modulo_8_work implements Modulo_work
  (MachineNativeIntColl, MachineNativeIntColl!from_string (``8'')) ;;
As seen in this example, a species can be applied to effective
parameters by giving their values with the usual syntax of parameter
passing.
As said before, to ensure modularity and abstraction, the
representation of a
collection is hidden. This means that any software component
dealing with a collection will only be able to manipulate it
through the operations (methods) provided by its interface. This
point is especially important since it prevents other software
components from breaking invariants required by the internals
of the collection.
1.6 Properties, Theorems and Proofs
FoCaLize does not only aim at writing programs: it intends to encompass
both the executable model (i.e. the program) and the properties this model
must satisfy. For this reason, ``special'' methods deal with logic
instead of the purely behavioural aspects of the system: theorems,
properties and proofs.
Stating a property implies that a proof that it
holds will eventually be given. For theorems, the proof is
directly embedded in the theorem. Such proofs must be done by
the developer and are finally sent to the formal proof assistant
Coq, which automatically checks that the demonstration of the
property is consistent. Writing a proof can be done in several ways.
It can be written in ``FoCaLize's proof language'', a hierarchical proof
language in which the developer gives hints and directions for a proof. This
language is sent to an external theorem prover,
Zenon [Zenon, zenon0.4.1], developed by D. Doligez. This prover is
a first-order theorem prover based on the tableau method, incorporating
implementation novelties such as sharing. From these hints, Zenon attempts
to generate the proof automatically and to exhibit a Coq
term suitable for verification by Coq. Basic hints given by the
developer to Zenon are: ``prove by definition of a method''
(i.e. looking inside its body) and ``prove by property''
(i.e. using the logical statement of a theorem or property).
Around this hints mechanism, the language allows the developer to build the
proof by stating assumptions (which must obviously be demonstrated
next) that can be used to prove lemmas or parts of the whole
property. We show below an example of such a demonstration.
theorem order_inf_is_infimum : all x y i in Self,
  !order_inf(i, x) -> !order_inf(i, y) ->
  !order_inf(i, !inf(x, y))
proof:
<1>1 assume x in Self, assume y in Self,
assume i in Self, assume H1: !order_inf(i, x),
assume H2: !order_inf(i, y),
prove !order_inf(i, !inf(x, y))
<2>1 prove !equal(i, !inf(!inf(i, x), y))
by hypothesis H1, H2
property inf_left_substitution_rule,
equal_symmetric, equal_transitive
definition of order_inf
<2>9 qed
by step <2>1
property inf_is_associative, equal_transitive
definition of order_inf
<1>2 conclude
;
The important point is that Zenon works for the
developer: it searches for the proof itself; the developer does not
have to elaborate it formally ``from scratch''.
Like any automatic theorem prover, Zenon may fail to find a
demonstration. In this case, FoCaLize allows verbatim
Coq proofs to be written. The proof is then no longer automated, but
this leaves the developer the full expressive power of Coq.
Finally, the assumed keyword is the ultimate proof backdoor,
telling that the proof is not given and that the property must be
admitted. Obviously, a really safe development should not make use of
such ``proofs'', since they bypass the formal verification of the
software's model. However, such a feature remains needed, since
some ``well-known'' properties can never be proved on a computer.
For instance, ∀ x ∈ ℕ, x + 1 > x does not hold on a
computer with native integers. However, in a mathematical
framework this property holds and is needed to carry out other
proofs. Thus the developer may prove that all manipulated values
remain in an interval where this property holds, or may admit this
property, or may add code to detect overflow ...
Additionally, a development may be linked with external
code, trusted or not, for which properties cannot be proved inside
the FoCaLize part, since it does not belong to it. Expressing properties
of the FoCaLize part may require stating properties of the imported
code; these cannot be formally proved and must therefore be ``assumed''.
In the previous sections, we presented FoCaLize through its programming
model and, briefly, its syntax. We especially investigated the various
entities making up a FoCaLize program. We now address what a
FoCaLize program becomes once compiled. We recall that FoCaLize supports the
redefinition of functions, which makes it possible, for example, to
specialise code to a specific representation (for example, there
exists a generic implementation of integer addition modulo n, but
it can be redefined in arithmetic modulo 2 if boolean values
are used to represent the two values). It is also a very convenient
tool for software maintenance.
1.7.1 Consistency of the Software
All along the development cycle of a FoCaLize program, the compiler
keeps track of the dependencies between species, their
methods, the proofs, ... to ensure that modifications
of one of them will be detected in those depending on it.
FoCaLize considers two types of dependencies:
- The decl-dependency: a method A decl-depends on a
method B if the declaration of B is required to state A.
- The def-dependency: a method (and more especially, a
theorem) A def-depends on a method B if the
definition of B is required to state A (and more
especially, to prove the property stated by the theorem
A).
The redefinition of a function may invalidate the proofs that use
properties of the body of the redefined function. All the proofs
which truly depend on the definition are then erased by the compiler
and must be done again in the context updated with the new
definition. Thus the main difficulty is to choose the best level in
the hierarchy at which to do a proof. In [PrevostoJaume2003], Prevosto and
Jaume propose a coding style that minimises the number of proofs
to be redone in case of a redefinition, through a certain kind of
modularisation of the proofs.
FoCaLize currently compiles programs to two target languages: OCaml, to
get an executable piece of software, and Coq, to have a formal model
of the program, with theorems and proofs.
In the OCaml code generation, all
the logical aspects are discarded since they do not lead to executable
code.
Conversely, in Coq, all the methods are compiled,
i.e. both ``computational'' methods and logical methods with
their proofs. This allows Coq to check the overall consistency of
the system developed in FoCaLize.
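For instance, in the following schematic species (hypothetical names),
only the function reaches the OCaml output, while both the function
and the theorem, with its proof, are compiled to Coq:

```
species Counter =
  representation = int ;
  (* ``Computational'' method: compiled to both OCaml and Coq. *)
  let increment (x in Self) = x + 1 ;
  (* Logical method: discarded in the OCaml output, fully
     compiled (statement and proof) in the Coq output. *)
  theorem increment_spec : all x in Self, increment (x) = x + 1
    proof = by definition of increment ;
end ;;
```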
FoCaLize incorporates a tool named FocalTest
[CarlierDuboisLNCS2008] for integration/validation testing. It
makes it possible to automatically confront a property of the
specification with an implementation. It automatically generates test
cases, executes them and produces a test report as an XML document.
The property under test is used to generate the test cases and also
serves as an oracle. When a test case fails, it means that a
counterexample of the property has been found: the implementation does
not match the property; it can also indicate an error in the
specification.
The tool FocalTest automatically produces the test environment and
the drivers to conduct the tests. We benefit from the inheritance
mechanism to isolate the testing harness from the components written
by the programmer.
The testable properties are required to be broken down into a
precondition and a conclusion, both executable.
FocalTest currently offers purely random test case generation: it
generates test cases until the precondition is satisfied, and the
verdict of the test case is given by executing the conclusion. This
process can be expensive for some kinds of preconditions. To overcome
this drawback, a constraint-based generation is under development: it
directly produces test cases for which the precondition is satisfied.
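A testable property thus has the shape ``precondition implies
conclusion'', with both sides executable. For instance (the method
names below are hypothetical):

```
(* FocalTest draws random values for x, keeps those satisfying the
   executable precondition 'is_positive (x)', then executes the
   conclusion; any value making it false is reported as a
   counterexample. *)
property double_positive : all x in Self,
  is_positive (x) -> is_positive (double (x)) ;
```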
The tool called FoCaLizeDoc [MaarekCalculemus03] automatically
generates documentation, so that the documentation of a component is
always coherent with respect to its implementation.
This tool uses its own XML format that contains information coming not
only from structured comments (that are parsed and kept in the
program's abstract syntax tree) and FoCaLize concrete syntax, but also
from type inference and dependency analysis. From this XML
representation, and thanks to some XSLT stylesheets, it is possible to
generate HTML or LaTeX files. Although this documentation is
not the complete safety case, it can helpfully contribute to its
elaboration. In the same way, it is possible to produce UML
models [Focal-UML] as a means to provide graphical documentation
for FoCaLize specifications. The use of graphical notations appears
quite useful when interacting with end-users, as these tend to be more
intuitive and easier to grasp than their formal (or textual)
counterparts. This transformation is based on a formal schema and
captures every aspect of the FoCaLize language, so that it has been
possible to prove the soundness of this transformation (semantics
preservation).
FoCaLize's architecture is designed to easily plug in third-party
analyses that can use the internal structures elaborated by the
compiler from the source code. This makes it possible, for example, to
build dedicated documentation tools for custom purposes, just by
exploiting the information stored in the FoCaLize program's abstract
syntax tree, or any extra information possibly added by additional
processes and analyses.
To be able to develop with the FoCaLize environment, a few third-party
tools are required. All of them can be freely downloaded from their
related websites.
- The Objective Caml compiler (version ≥ 3.10.2),
available at http://caml.inria.fr. It is used to compile both
the FoCaLize system at installation stage from the tarball and
the FoCaLize compiler's output generated by the compilation of
your FoCaLize programs.
- The Coq Proof Assistant (version ≥ 8.1pl4),
available at http://coq.inria.fr. It is used to compile both
the FoCaLize libraries at installation stage from the tarball and
the FoCaLize compiler's output generated by the compilation of
your FoCaLize programs.
The FoCaLize compiler can generate dependency graphs from compiled
source code. It generates them in a format suitable to be processed
and displayed by the dotty tool suite of the ``Graphviz'' package. If
you plan to examine these graphs, you also need to install this
software, available at http://www.graphviz.org/.
FoCaLize was fully developed under Linux using free software. Hence,
any Unix-based operating system should support FoCaLize. The currently
tested Unix systems are: Fedora, Debian, Suse, BSD.
Windows users can run FoCaLize via the Unix-like
environment Cygwin, which provides both user and developer tools. This
software is freely distributed and available
at http://www.cygwin.com/.
From the official Cygwin web site:
``Cygwin is a Linux-like environment for Windows. It consists of
two parts: A DLL (cygwin1.dll) which acts as a Linux API emulation
layer providing substantial Linux API functionality. A collection of
tools which provide Linux look and feel.
The Cygwin DLL currently works with all recent, commercially released
x86 32 bit and 64 bit versions of Windows, with the exception of
Windows CE.
Cygwin is not a way to run native linux apps on Windows. You have to
rebuild your application from source if you want it to run on
Windows.
Cygwin is not a way to magically make native Windows apps aware of
UNIX ® functionality, like signals, ptys, etc. Again, you need to
build your apps from source if you want to take advantage of Cygwin
functionality.''
Under Cygwin, the required packages are the same as those listed
in ?? and ??. As stated in the Cygwin quotation above, you need to get
the source packages of this software and compile them yourself,
following the information provided in these packages.
The installation of FoCaLize itself is the same for all operating
systems and is described in the following section
(??).
FoCaLize is currently distributed as a tarball containing the whole
source code of the development environment. You must first extract the
archive (a directory will be created) by:
tar xvzf focalize-x.x.x.tgz
Next, go into the source directory:
cd focalize-x.x.x/
You now must configure the build process by:
./configure
The configuration script then asks for directories where to install
the FoCaLize components. You may just press enter to keep the default
installation directories.
latour:~/src/focalize$ ./configure ~/pkg
Where to install FoCaLize binaries ?
Default is /usr/local/bin.
Just press enter to use default location.
Where to install FoCaLize libraries ?
Default is /usr/local/lib/focalize.
Just press enter to use default location.
After the configuration ends, just build the system:
make all
And finally, get root privileges to install the FoCaLize system:
su
make install
2.5 Compilation process and outputs
We call compilation unit a file
containing source code for toplevel definitions, species and
collections. Visibility rules, described in section
??, are defined according to the compilation unit status.
From a compilation unit, the compiler issues several files, described
in the following.
A FoCaLize development contains both
``computational code'' (i.e. code performing operations that lead to
an effect, a result) and logical properties.
When compiled, two outputs are generated:
- The ``computational code'' is compiled into OCaml source
that can then be compiled with the OCaml compiler, leading to an
executable binary. In this pass, logical properties are discarded
since they do not lead to executable code.
- Both the ``computational code'' and the logical properties are
compiled into a Coq model. This model can then be sent to the
Coq proof assistant, which will verify the consistency of both the
``computational code'' and the logical properties (whose
proofs must obviously be provided) of the
FoCaLize development. This means that the generated Coq code is
not intended to be used to generate OCaml source code by
automated extraction. As stated above, the preferred way to produce
an executable is to use the generated OCaml code directly. In this
respect, Coq acts as an assessor of the development rather than as a
code generator.
More accurately, FoCaLize first generates pre-Coq code, i.e. a
file containing Coq syntax plus ``holes'' in place of the proofs
written in the FoCaLize Proof Language. Such files are
suffixed by ``.zv'' instead of directly ``.v''. When this
file is sent to Zenon, these ``holes'' are filled with effective
Coq code automatically generated by Zenon (if it succeeds in
finding a proof), hence leading to a pure Coq code file that can
be compiled by Coq.
In addition, several other outputs can be generated for documentation
or debug purposes. See the section ?? for
details.
Compiling a FoCaLize program involves several steps that are
automatically handled by the focalizec command. Using the command
line options, it is possible to tune the code generation steps as
described in ??.
- FoCaLize source compilation. This step takes the FoCaLize
source code and generates the OCaml and/or ``pre-''Coq code.
You can disable the code generation for one of these languages
(see page ??), or for both; in the latter case, only the FoCaLize
object code is produced, without any other output, and the process
ends at this point. If you disable one of the target languages, then
you won't get any generated file for it, hence there is no need to
address its related compilation process described below.
Assuming you generate code for both OCaml and Coq, you will get
two generated files: source.ml (the OCaml code) and
source.zv (the ``pre-''Coq code).
- OCaml code compilation. This step takes the generated
OCaml code (it is an OCaml source file) and compiles it. This
is done like any regular OCaml compilation; the only difference
is that the search path containing the FoCaLize installation path
and the directories of your own extra FoCaLize source files are
automatically passed to the OCaml compiler. Hence this step
acts like the manual invocation:
ocamlc -c -I /usr/local/lib/focalize -I mylibs
-I myotherlibs source.ml
This produces the OCaml object file source.cmo. Note that
you can also ask to compile the OCaml code in native mode; in this
case the ocamlopt version of the OCaml compiler is
selected (see the OCaml reference manual for more information) and
the object files are .cmx files instead of .cmo ones.
- ``Pre-''Coq code compilation. This step takes the
generated .zv file and attempts to produce a real Coq
.v source file by replacing the proofs written in the FoCaLize Proof
Language by effective Coq proofs found by the Zenon
theorem prover. Note that if Zenon fails to find a proof, a
hole will remain in the final Coq .v file. Such a hole
appears as the text ``TO_BE_DONE_MANUALLY.'' in place of
the effective proof. In this case, Coq will obviously fail
to compile the file, so the user must do the proof by hand or
modify the original FoCaLize source file to get a working proof.
This step acts like the manual invocation:
zvtov -new source.zv
For more about the Zenon options, consult section
??.
- Coq code compilation. This step takes the generated
.v code and compiles it with Coq. This is done like any
regular Coq compilation. The only difference is that the search
path containing the FoCaLize installation path and the directories
of your own extra FoCaLize source files are automatically passed
to the Coq compiler.
coqc -I /usr/local/lib/focalize -I mylibs
-I myotherlibs source.v
Once this step is done, you have the Coq object files and you
are sure that Coq validated your program model, properties and
proofs. The final ``assessor'' of the tool-chain accepted your
program.
Once all separate files are compiled, to get an executable from the
OCaml object files, you must link them together, providing the same
search path as above and the .cmo files corresponding to all
the OCaml files generated from all your FoCaLize .foc
files. You also need to add the .cmo files corresponding to the
modules of the standard library you use (currently, this must be done
by the user; next versions will automate this process).
ocamlc -I mylibs -I myotherlibs
install_dir/ml_builtins.cmo install_dir/basics.cmo
install_dir/sets.cmo ...
mylibs/src1.cmo mylibs/src2.cmo ...
myotherlibs/src3.cmo ...
source1.cmo source2.cmo ...
-o exec_name
The following characters are considered as blanks: space, newline,
horizontal tabulation, carriage return, line feed and form
feed. Blanks are ignored, but they separate adjacent identifiers,
literals and keywords that would otherwise be confused as one single
identifier, literal or keyword.
Comments, possibly spanning several lines, are introduced by the
two characters (*, with no intervening blanks, and terminated by
the two characters *), with no intervening blanks. Comments are
treated as blanks. Comments can occur inside string or character
literals (provided the * character is escaped) and can be nested. They
are discarded during the compilation process. Example:
species S =
...
let m (x in Self) = (* Another discarded comment. *)
...
end ;;
(* Another discarded comment at end of file. *)
Comments spanning a single line start with the two characters
-- and end with the end-of-line character.
Example:
species S =
let m (x in Self) = -- Another uni-line comment.
...
end ;;
Annotations are introduced by the three characters (**,
with no intervening blanks, and terminated by the two characters
*), with no intervening blanks.
Annotations cannot occur inside string or character literals and
cannot be nested. They must precede the construct they document.
In particular, a source file cannot end with an annotation.
Unlike comments, annotations are kept during the compilation process
and recorded in the compilation information (``.fo'' files). Annotations can
be processed later on by external tools that could analyze them to
produce a new FoCaLize source code accordingly.
For instance, the FoCaLize development environment provides the FoCaLizeDoc
automatic production tool that uses annotations to automatically generate
documentation.
Several annotations can be put in sequence for the same construct. We call
such a sequence an annotations block.
Using embedded tags in annotations allows third-party tools to easily find
out annotations that are meaningful to them, and safely ignore others.
For more information, consult
??.
Example:
(** Documentation for species S. *)
species S =
...
let m (x in Self) =
(** {@TEST} Annotation for the test generator. *)
(** {@MY_TAG_MAINTAIN} Annotation for maintainers. *)
... ;
end ;;
FoCaLize features a rich class of identifiers with sophisticated lexical
rules that provide fine distinction between the kind of notion a given
identifier can designate.
Sorting words to find out which kind of meaning they may have is a very common
conceptual categorization of names that we use when we write or read ordinary
English texts. We routinely distinguish between:
- a word only made of lowercase characters, that is supposed to be an
ordinary noun, such as "table", "ball", or a verb as in "is", or an
adjective as in "green",
- a word starting with an uppercase letter, that is supposed to be a name,
maybe a family or christian name, as in "Kennedy" or "David", or a location
name as in "London".
We use this distinctive look of words as a useful hint to help understand
phrases. For instance, we accept the phrase "my ball is green" as meaningful,
whereas "my Paris is green" is considered nonsense. This is simply because
"ball" is a regular noun and "Paris" is a name. The word "ball" has the right
lexical classification in the phrase, but "Paris" has not. It is also clear
that you can replace "ball" by another ordinary noun and get something
meaningful: "my table is green"; the same nonsense arises as well if you
replace "Paris" by another name: "my Kennedy is green".
Natural languages are far more complicated than computer languages, but
FoCaLize uses the same kind of tricks: the ``look'' of words helps a lot to
understand what the words are designating and how they can be used.
3.1.4.2 Conceptual properties of names
FoCaLize distinguishes 4 concepts for each name:
- the fixity assigns the place where an identifier must be written,
- the precedence decides the order of operations when
identifiers are combined together,
- the categorisation fixes which concept the identifier designates,
- the nature of a name can either be symbolic or alphanumeric.
These concepts are compositional, i.e. all of them are independent
from one another. Put another way: for any fixity, precedence, category and
nature, there exist identifiers with these exact properties.
We further explain those concepts below.
The fixity of an identifier answers the question ``where must this
identifier be written?''.
- a prefix is written before its argument, as sin in
sin x or - in - y,
- an infix is written between its arguments, as + in
x + y or mod in x mod 3,
- a mixfix is written among its arguments, as
if ... then ... else ... in
if c then 1 else 2.
In FoCaLize, as in mathematics, ordinary identifiers are always prefix and
binary operators are always infix.
The precedence determines where implicit parentheses take place in a
complex combination of symbols. For instance, according to the usual
mathematical conventions:
- 1 + 2 * 3 means 1 + (2 * 3), hence 7,
it