Xtext relies heavily on EMF internally, but it can also be used as the serialization back-end of other EMF-based tools. In this section we introduce the basic concepts of the Eclipse Modeling Framework (EMF) in the context of Xtext. If you want to learn more about EMF, we recommend reading the EMF book.
Xtext uses EMF models as the in-memory representation of any parsed text file. This in-memory object graph is called the Abstract Syntax Tree (AST). Depending on the community, this concept is also called document object model (DOM), semantic model, or simply model. We use model and AST interchangeably. Given the example model from the introduction, the AST looks similar to this:
The AST should contain the essence of your textual models and abstract over syntactical information. It is used by later processing steps, such as validation, compilation or interpretation. In EMF a model is made up of connected EObjects, each of which is an instance of an EClass. A set of EClasses is contained in a so-called EPackage; EClass and EPackage are both concepts of Ecore. In Xtext, meta models are either inferred from the grammar or predefined by the user (see the section on package declarations for details). The next diagram shows the meta model of our example:
The language in which the meta model is defined is called Ecore. In other words, the meta model is the Ecore model of your language. Ecore is an essential part of EMF. Your models instantiate the meta model, and your meta model instantiates Ecore. To put an end to this recursion, Ecore is defined in itself (it is an instance of itself).
The meta model defines the types of the semantic nodes as Ecore EClasses. EClasses are shown as boxes in the meta model diagram, so in our example, Model, Type, SimpleType, Entity, and Property are EClasses. An EClass can inherit from other EClasses. Multiple inheritance is allowed in Ecore, but cycles in the inheritance hierarchy are forbidden.
EClasses can have EAttributes for their simple properties. These are shown inside the EClass nodes. The example contains two EAttributes called name and one EAttribute called isMulti. The domain of values for an EAttribute is defined by its EDataType. Ecore ships with some predefined EDataTypes, which essentially refer to Java primitive types and other immutable classes like String. To distinguish them from the Java types, the EDataTypes are prefixed with an E. In our example, that’s EString and EBoolean.
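As a rough illustration of EClasses, inheritance, and EAttributes, the following sketch builds a fragment of the example meta model through the Ecore API. It assumes the EMF Ecore library (org.eclipse.emf.ecore) is on the classpath. The class name MetaModelSketch, the package name, and the namespace URI are hypothetical, and the placement of the name and isMulti attributes on Type and Property is an assumption about the diagram, not something stated in the text.

```java
import org.eclipse.emf.ecore.EAttribute;
import org.eclipse.emf.ecore.EClass;
import org.eclipse.emf.ecore.EDataType;
import org.eclipse.emf.ecore.EPackage;
import org.eclipse.emf.ecore.EcoreFactory;
import org.eclipse.emf.ecore.EcorePackage;

public class MetaModelSketch {

    /** Creates an EClass with the given name and registers it in the package. */
    private static EClass newClass(EPackage pkg, String name) {
        EClass eClass = EcoreFactory.eINSTANCE.createEClass();
        eClass.setName(name);
        pkg.getEClassifiers().add(eClass);
        return eClass;
    }

    /** Adds an EAttribute of the given EDataType to the owning EClass. */
    private static EAttribute addAttribute(EClass owner, String name, EDataType type) {
        EAttribute attribute = EcoreFactory.eINSTANCE.createEAttribute();
        attribute.setName(name);
        attribute.setEType(type);
        owner.getEStructuralFeatures().add(attribute);
        return attribute;
    }

    /** Builds a fragment of the example meta model programmatically. */
    public static EPackage buildPackage() {
        EPackage pkg = EcoreFactory.eINSTANCE.createEPackage();
        pkg.setName("domainmodel");                          // hypothetical name
        pkg.setNsPrefix("domainmodel");
        pkg.setNsURI("http://www.example.org/domainmodel");  // hypothetical URI

        EClass type = newClass(pkg, "Type");
        EClass simpleType = newClass(pkg, "SimpleType");
        EClass entity = newClass(pkg, "Entity");
        EClass property = newClass(pkg, "Property");

        // SimpleType and Entity inherit from Type.
        simpleType.getESuperTypes().add(type);
        entity.getESuperTypes().add(type);

        // Assumed placement: 'name' on Type and Property, 'isMulti' on Property.
        addAttribute(type, "name", EcorePackage.Literals.ESTRING);
        addAttribute(property, "name", EcorePackage.Literals.ESTRING);
        addAttribute(property, "isMulti", EcorePackage.Literals.EBOOLEAN);
        return pkg;
    }

    public static void main(String[] args) {
        EPackage pkg = buildPackage();
        EClass entity = (EClass) pkg.getEClassifier("Entity");
        // Entity declares no attribute itself but inherits 'name' from Type.
        System.out.println(entity.getEAllAttributes().size());
    }
}
```

In an Xtext project you would normally not write such code by hand, because the meta model is inferred from the grammar or loaded from an existing Ecore file. The sketch only makes the relationship between the concepts explicit.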
In contrast to EAttributes, EReferences point to other EClasses. The containment flag indicates whether an EReference is a containment reference or a cross-reference. In the diagram, references are edges and containment references are marked with a diamond. At the model level, each element can have at most one container, i.e. another element referring to it with a containment reference. This imposes a tree structure on the models, as can be seen in the sample model diagram. Cross-references, on the other hand, refer to elements that can be contained anywhere else. In the example, elements and properties are containment references, while type and extends are cross-references. For reasons of readability, we skipped the cross-references in the sample model diagram. Note that in contrast to other parser generators, Xtext creates ASTs with linked cross-references.
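The difference between containment and cross-references can be observed on dynamic instances. The following sketch assumes the EMF Ecore library is on the classpath. The class ContainmentSketch and its field names are hypothetical, and only the references elements, properties, and type from the example are modeled.

```java
import java.util.List;

import org.eclipse.emf.ecore.EClass;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.EPackage;
import org.eclipse.emf.ecore.EReference;
import org.eclipse.emf.ecore.ETypedElement;
import org.eclipse.emf.ecore.EcoreFactory;
import org.eclipse.emf.ecore.util.EcoreUtil;

public class ContainmentSketch {
    public final EReference elements;   // Model -> Type, containment
    public final EReference properties; // Entity -> Property, containment
    public final EReference type;       // Property -> Type, cross-reference
    public final EObject modelA, modelB, entity, simpleType, property;

    private static EClass newClass(EPackage pkg, String name) {
        EClass eClass = EcoreFactory.eINSTANCE.createEClass();
        eClass.setName(name);
        pkg.getEClassifiers().add(eClass);
        return eClass;
    }

    private static EReference newReference(EClass owner, String name, EClass target,
            boolean containment, int upperBound) {
        EReference reference = EcoreFactory.eINSTANCE.createEReference();
        reference.setName(name);
        reference.setEType(target);
        reference.setContainment(containment);
        reference.setUpperBound(upperBound);
        owner.getEStructuralFeatures().add(reference);
        return reference;
    }

    /** Returns the live list behind a many-valued reference. */
    @SuppressWarnings("unchecked")
    public static List<EObject> many(EObject owner, EReference reference) {
        return (List<EObject>) owner.eGet(reference);
    }

    public ContainmentSketch() {
        EPackage pkg = EcoreFactory.eINSTANCE.createEPackage();
        pkg.setName("sketch"); // hypothetical package name
        EClass modelClass = newClass(pkg, "Model");
        EClass typeClass = newClass(pkg, "Type");
        EClass simpleTypeClass = newClass(pkg, "SimpleType");
        EClass entityClass = newClass(pkg, "Entity");
        EClass propertyClass = newClass(pkg, "Property");
        simpleTypeClass.getESuperTypes().add(typeClass);
        entityClass.getESuperTypes().add(typeClass);

        int unbounded = ETypedElement.UNBOUNDED_MULTIPLICITY; // -1
        elements = newReference(modelClass, "elements", typeClass, true, unbounded);
        properties = newReference(entityClass, "properties", propertyClass, true, unbounded);
        type = newReference(propertyClass, "type", typeClass, false, 1);

        // Model level: dynamic instances of the EClasses defined above.
        modelA = EcoreUtil.create(modelClass);
        modelB = EcoreUtil.create(modelClass);
        entity = EcoreUtil.create(entityClass);
        simpleType = EcoreUtil.create(simpleTypeClass);
        property = EcoreUtil.create(propertyClass);

        many(modelA, elements).add(entity);
        many(modelA, elements).add(simpleType);
        many(entity, properties).add(property);
        property.eSet(type, simpleType); // cross-reference, no containment
    }

    public static void main(String[] args) {
        ContainmentSketch s = new ContainmentSketch();
        System.out.println(s.entity.eContainer() == s.modelA);
        // Adding the entity to a second container removes it from the first.
        many(s.modelB, s.elements).add(s.entity);
        System.out.println(many(s.modelA, s.elements).contains(s.entity));
    }
}
```

The cross-reference from property to simpleType does not affect the container of simpleType, which stays modelA. Only containment references determine the tree structure.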
Unlike associations in UML, EReferences in Ecore are always owned by one EClass and only navigable in the direction from the owner to the type. Bi-directional associations must be modeled as two references, each being the eOpposite of the other and each owned by one end of the association.
The common superclass of EAttribute and EReference is EStructuralFeature. It allows you to define a name and a cardinality by setting lowerBound and upperBound. Setting the latter to -1 means ‘unbounded’.
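Bi-directional references and cardinalities can be sketched as follows, again assuming the EMF Ecore library is on the classpath. The class OppositeSketch and the reference subTypes are hypothetical. The example meta model only contains extends, and subTypes is introduced here purely to have an opposite end to illustrate.

```java
import java.util.List;

import org.eclipse.emf.ecore.EClass;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.EPackage;
import org.eclipse.emf.ecore.EReference;
import org.eclipse.emf.ecore.ETypedElement;
import org.eclipse.emf.ecore.EcoreFactory;
import org.eclipse.emf.ecore.util.EcoreUtil;

public class OppositeSketch {
    public final EReference extendsRef;  // Entity -> Entity, 0..1
    public final EReference subTypesRef; // Entity -> Entity, 0..*, hypothetical
    public final EObject parent;
    public final EObject child;

    public OppositeSketch() {
        EcoreFactory factory = EcoreFactory.eINSTANCE;
        EPackage pkg = factory.createEPackage();
        pkg.setName("sketch"); // hypothetical package name

        EClass entity = factory.createEClass();
        entity.setName("Entity");
        pkg.getEClassifiers().add(entity);

        // Cardinality 0..1: at most one super entity.
        extendsRef = factory.createEReference();
        extendsRef.setName("extends");
        extendsRef.setEType(entity);
        extendsRef.setLowerBound(0);
        extendsRef.setUpperBound(1);

        // Cardinality 0..*: an upperBound of -1 means 'unbounded'.
        subTypesRef = factory.createEReference();
        subTypesRef.setName("subTypes");
        subTypesRef.setEType(entity);
        subTypesRef.setLowerBound(0);
        subTypesRef.setUpperBound(ETypedElement.UNBOUNDED_MULTIPLICITY);

        // A bi-directional association is two references that name each
        // other as eOpposite; both directions have to be set explicitly.
        extendsRef.setEOpposite(subTypesRef);
        subTypesRef.setEOpposite(extendsRef);

        entity.getEStructuralFeatures().add(extendsRef);
        entity.getEStructuralFeatures().add(subTypesRef);

        // Model level: setting one end updates the opposite end as well.
        parent = EcoreUtil.create(entity);
        child = EcoreUtil.create(entity);
        child.eSet(extendsRef, parent);
    }

    /** Returns the live list behind the many-valued subTypes reference. */
    @SuppressWarnings("unchecked")
    public List<EObject> subTypesOf(EObject entity) {
        return (List<EObject>) entity.eGet(subTypesRef);
    }

    public static void main(String[] args) {
        OppositeSketch s = new OppositeSketch();
        System.out.println(s.subTypesOf(s.parent).contains(s.child));
        System.out.println(s.subTypesRef.isMany());
    }
}
```

Because each reference is owned by exactly one EClass, the two ends remain separate features. EMF keeps them consistent at the model level once they are declared as opposites.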
The common supertype of EDataType and EClass is EClassifier. An EPackage acts as a namespace and container of EClassifiers.
We have summarized the most relevant concepts of Ecore in the following diagram: