Architecture

Comparison Process

This is the overview of the comparison process as a whole. Each of the six phases of the EMF Compare comparison process is briefly defined in the Overview; a much more in-depth explanation is given below, in the description of EMF Compare's default behavior.

Project Architecture

EMF Compare is built on top of the Eclipse platform. We depend on the Eclipse Modeling Framework (EMF), the Eclipse Compare framework and, finally, Eclipse Team, the framework upon which the repository providers (EGit, CVS, Subversive...) are built.

The EMF Compare extensions target specific extensions of the modeling framework: UML, the Graphical Modeling Framework (and its own extensions: Papyrus, EcoreTools, ...).

Although we are built atop components that are tightly coupled with the Eclipse platform, the core of EMF Compare, like EMF itself, can be run in a standalone application with no runtime dependencies on Eclipse.

The Comparison Model

EMF Compare uses a single model, whose root is a Comparison object, to represent all of the information regarding the comparison: matched objects, matched resources, detected differences, links between these references, etc. The root Comparison is created at the beginning of the Match process and undergoes a series of successive refinements during the remainder of the comparison: Diff, Equivalence, Dependencies... each add their own information to the Comparison.

So how exactly is all of the information the Comparison model can hold represented, and how can we make sense of it?

Match

A Match element represents the fact that the n compared versions contain elements that are essentially the same. For example, suppose we are comparing two different versions, v1 and v2, of a given model, which look like:

[Figure: the two compared versions of the model; image labels "Master" and "Borrowables"]

Comparing these two models, we'll have a Comparison model containing three matches:

  1. library <-> library
  2. Book <-> Novel
  3. title <-> title
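The three matches listed above can be pictured with a small, self-contained Java sketch. It is not EMF Compare's actual API: the `MatchSketch` class, its fields and the use of plain strings in place of model elements are hypothetical stand-ins. The nesting of the matches (library containing Book, Book containing title) and the choice of v1 as the left side are also assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a match; this is NOT the EMF Compare API. */
final class MatchSketch {
    final String left;   // element from the left version, or null if absent
    final String right;  // element from the right version, or null if absent
    final String origin; // common ancestor; null in a two-way comparison
    final List<MatchSketch> submatches = new ArrayList<>();

    MatchSketch(String left, String right, String origin) {
        this.left = left;
        this.right = right;
        this.origin = origin;
    }

    /** Nests a child match under this one and returns the child. */
    MatchSketch addSubmatch(MatchSketch child) {
        submatches.add(child);
        return child;
    }

    /** Counts this match plus every nested match. */
    int count() {
        int total = 1;
        for (MatchSketch child : submatches) {
            total += child.count();
        }
        return total;
    }

    /** Builds the three matches of the example (the nesting is assumed). */
    static MatchSketch example() {
        MatchSketch library = new MatchSketch("library", "library", null);
        MatchSketch book = library.addSubmatch(new MatchSketch("Book", "Novel", null));
        book.addSubmatch(new MatchSketch("title", "title", null));
        return library;
    }

    /** Prints the match tree, one match per line, indented by depth. */
    static void print(MatchSketch match, String indent) {
        System.out.println(indent + match.left + " <-> " + match.right);
        for (MatchSketch child : match.submatches) {
            print(child, indent + "  ");
        }
    }

    public static void main(String[] args) {
        print(example(), "");
    }
}
```

In a three-way comparison, the `origin` field of this sketch would hold the common ancestor's element instead of `null`.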

In other words, the comparison model contains an aggregate of the two or three compared models, in the form of Match elements linking the elements of all versions together. Differences will then be detected on these Match elements and added under them, thus allowing us to know both:

Diff

Diff elements are created during the differencing process in order to represent the actual modifications that can be detected within the source model(s). The Diff concept itself is only there as the super-class of the three main kinds of differences EMF Compare can detect in a model, namely ReferenceChange, AttributeChange and ResourceAttachmentChange. We'll come back to these three sub-classes in a short while.

Whatever their type, the differences share a number of common elements:

In order to ensure that the model stays coherent through individual merge operations, we've also decided to link differences together through a number of associations and references. For example, there are times when one difference cannot be merged without first merging another, and times when some differences are exactly equivalent to one another. In no specific order:

As mentioned above, there are only three kinds of differences that we will detect through EMF Compare, which will be sufficient for all use cases. ReferenceChange differences will be detected for every value of a reference for which we detect a change: the value was either added, deleted, or moved (within the reference or between distinct references). AttributeChange differences are the same, but for attributes instead of references. Lastly, the ResourceAttachmentChange differences, though very much like the ReferenceChanges we create for containment references, are specifically aimed at describing changes within the roots of one of the compared resources.
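The "one difference per changed value" rule for references can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not EMF Compare's implementation: `ReferenceDiffSketch`, `ValueChange` and `Kind` are hypothetical names, values are plain strings, and move detection is deliberately left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; these types are NOT part of EMF Compare. */
final class ReferenceDiffSketch {
    /** Moves are omitted here; only additions and deletions are reported. */
    enum Kind { ADD, DELETE }

    /** One change for one value of one reference (hypothetical type). */
    record ValueChange(String reference, String value, Kind kind) {}

    /** Emits one change per value that was added to or deleted from the reference. */
    static List<ValueChange> diff(String reference, List<String> before, List<String> after) {
        List<ValueChange> changes = new ArrayList<>();
        for (String value : after) {
            if (!before.contains(value)) {
                changes.add(new ValueChange(reference, value, Kind.ADD));
            }
        }
        for (String value : before) {
            if (!after.contains(value)) {
                changes.add(new ValueChange(reference, value, Kind.DELETE));
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Hypothetical reference "books" whose values changed between two versions.
        List<ValueChange> changes =
                diff("books", List.of("Dune", "Emma"), List.of("Emma", "Ulysses"));
        for (ValueChange change : changes) {
            System.out.println(change.kind() + " " + change.value() + " in " + change.reference());
        }
    }
}
```

The same shape would apply to an attribute-level sketch: only the kind of feature being compared differs.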

Conflict

Conflicts will only be detected during three-way comparisons. There can only be "conflicts" when we are comparing two different versions of the same model along with their common ancestor. In other words, we need to be able to compare two versions of a common element with a "reference" version of that element.

There are many different kinds of conflicts; to name a few:

Conflicts can be of two kinds. A PSEUDO conflict is a conflict where both sides of a comparison have changed with respect to their common ancestor, but where the two sides are actually now equal. In other words, the end result is that the left is now equal to the right, even though they are both different from their ancestor. This is the opposite of a REAL conflict, where the value on all three sides is different. In terms of merging, pseudo conflicts do not need any particular action, whilst real conflicts actually need resolution.
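The distinction between the two kinds can be expressed as a small decision function. The sketch below is a simplification that works on single values rather than on differences, and `ConflictSketch`, `Kind` and `classify` are hypothetical names rather than EMF Compare's API.

```java
import java.util.Objects;
import java.util.Optional;

/** Illustrative sketch only; NOT EMF Compare's conflict detector. */
final class ConflictSketch {
    enum Kind { REAL, PSEUDO }

    /**
     * Classifies one value seen on the three sides of a comparison.
     * Returns empty when at most one side changed, since there is then no conflict.
     */
    static Optional<Kind> classify(Object origin, Object left, Object right) {
        boolean leftChanged = !Objects.equals(origin, left);
        boolean rightChanged = !Objects.equals(origin, right);
        if (!leftChanged || !rightChanged) {
            return Optional.empty();
        }
        // Both sides changed: equal results form a pseudo conflict, different ones a real conflict.
        return Optional.of(Objects.equals(left, right) ? Kind.PSEUDO : Kind.REAL);
    }

    public static void main(String[] args) {
        System.out.println(classify("Book", "Novel", "Novel")); // both sides made the same change
        System.out.println(classify("Book", "Novel", "Tome"));  // both sides changed differently
        System.out.println(classify("Book", "Book", "Novel"));  // only the right side changed
    }
}
```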

There can be more than two differences conflicting with each other. For example, the deletion of an element from one side will most likely conflict with a number of differences from the other side.

Equivalence

EMF Compare uses Equivalence elements in order to link together a number of differences which can ultimately be considered to be the same. For example, Ecore's eOpposite references will be maintained in sync with one another. As such, modifying one of the two references will automatically update the second one accordingly. The manual modification and the automatic update are two distinct modifications of the model, resulting in two differences detected. However, merging either of these two differences will automatically merge the other one. Therefore both are marked as being equivalent to each other.

There can be more than two differences equivalent to each other, in which case all of them will be added to a single Equivalence object representing their relations.
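As a rough illustration of how a shared Equivalence object lets the merge of one difference carry over to the others, here is a minimal plain-Java sketch. It is written under assumptions: `EquivalenceSketch`, `Diff`, `Group` and `merge` are hypothetical names, and a boolean flag stands in for whatever merge state the real model tracks.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative sketch only; these types are NOT the EMF Compare API. */
final class EquivalenceSketch {
    /** A difference that may belong to at most one equivalence group. */
    static final class Diff {
        final String label;
        boolean merged;  // stand-in for a real merge state
        Group equivalence; // null when the diff has no equivalent diffs

        Diff(String label) {
            this.label = label;
        }
    }

    /** Holds every difference considered equivalent to the others in the group. */
    static final class Group {
        final Set<Diff> differences = new LinkedHashSet<>();

        void add(Diff diff) {
            differences.add(diff);
            diff.equivalence = this;
        }
    }

    /** Marks the diff as merged, along with every diff equivalent to it. */
    static void merge(Diff diff) {
        diff.merged = true;
        if (diff.equivalence != null) {
            for (Diff other : diff.equivalence.differences) {
                other.merged = true;
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical pair of opposite references updated together.
        Diff manual = new Diff("reference set manually");
        Diff automatic = new Diff("opposite reference updated automatically");
        Group group = new Group();
        group.add(manual);
        group.add(automatic);

        merge(manual);
        System.out.println(automatic.label + " merged: " + automatic.merged);
    }
}
```

Keeping a single group per set of equivalent differences, rather than pairwise links, means adding a third equivalent difference requires no change to the first two.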