Import and export of workflow files from/to EasySOA

This page discusses technical solutions to provide seamless import/export of workflow files from/to EasySOA. The goals, extracted from the EasySOA and workflow build process, are the following:

File saving/loading: Architects shall be able to work on workflow files (business, technical and executable files) stored on EasySOA as easily as if they were stored locally.
Model synchronization: When files are transferred to EasySOA, their content shall be reflected in the [EasySOA Core model](Final soa model design). In particular, the following information shall be taken into account:
- workflow context information
- business process hierarchy (which processes are called from a given process)
- service use by (technical) workflows
Constraints:
- The feature shall be available when using "vanilla" (not EasySOA-branded) editors
  - at least the solution should provide a "fallback" mode to slightly modified editors, lacking some features but still providing file transfer.
  - at best the solution should provide partial or full functionality to any editor without requiring any change
- Workflow files shall be versioned, i.e. each past version of the files shall be available as well as the current version.
- (optional) An offline mode should be available, while keeping the data's consistency -- to avoid spreading data on several storages.

Status

What's been done

We chose solution 2.

Work has started on the "ecm_support" branch of the JWT SVN repository. The model synchronization was identified as a priority, thus there is no file loading/saving support for now.

A good entry point to CMIS Synchronization would be the org.eclipse.jwt.transformation.ecm.atl.AbstractDmsSynchronizationTransformation class, in the transformations/jwt-transformation-ecm-atl project. It is extended in particular in the ecm/jwt-ecm-sync-demo project.

In this branch, the following projects are of interest:

- ecm/jwt-ecm-sync-base: the implementation of Model synchronization. It takes an EMF model as input and connects to a CMIS repository to make it look like the given EMF model.
- ecm/jwt-ecm-sync-cmis-model: the EMF model definition of a "CMIS synchronization target". It is used to describe "what we want" on a remote Document Management System, i.e. the "target". It is intended to be instantiated by synchronization users, for example by means of an ATL transformation from their own model to this one.
- ecm/jwt-ecm-sync-demo: a demonstration of how the synchronizer can be used. It features an ATL transformation from the JWT model to a simple model either on Alfresco or Nuxeo.
- releng/jwt-test-plugin-unofficial: tests were added in package org.eclipse.jwt.tests.ecm.cmis.sync. These tests are performed against default Nuxeo & Alfresco models at the moment. A system of Builder/Director was put in place: see org.eclipse.jwt.tests.ecm.cmis.sync.fixture.building for definition and org.eclipse.jwt.tests.ecm.cmis.sync.ManualTest for use. For the moment, as the name shows, validation of the result of this synchronization is purely manual: the tests only check for exceptions thrown during synchronization and synchronization stability, but do not check that the resulting remote repository is as expected. There is hope that this last check can be automated using a custom Builder which does not perform builds, but checks that they were performed instead.
- transformations/jwt-transformation-ecm and transformations/jwt-transformation-ecm-atl: various tools for implementation of model synchronization using the JWT transformation framework. Both projects are based on the transformations/jwt-transformation-base project, version 1.3 (see below).

To integrate model synchronization into the JWT transformations framework, this framework had to be updated. Until this update, it only allowed transformations to take strings as input, which prevented full introspection of a modeler instance, for example to easily generate pictures of a process. Now we can give any Java object as input to a transformation, including a modeler instance (WEEditor in JWT). These updates were performed in the transformations/jwt-transformation-base project.

NOTE: Keep in mind that, if using Nuxeo, version 5.5-HF09 is required. Previous versions were a bit buggy, especially relating to CMIS relationships support. Note that a bug is still around, preventing Nuxeo Relations from being in sync with CMIS relationships.

What's left to do

Model synchronization: Once the actual EasySOA Core model has been defined and implemented, we'll have to create an Eclipse plug-in based on ecm/jwt-ecm-sync-base and transformations/jwt-transformation-ecm-atl that:

- defines an ATL transformation extracting the information we want on Nuxeo from the JWT model
- defines the way remote (Nuxeo) objects are matched to local (EMF) ones, and which changes can be performed on the EasySOA repository (see respectively org.eclipse.jwt.ecm.sync.cmis.strategy.matching.MatchingStrategy and org.eclipse.jwt.ecm.sync.cmis.strategy.filtering.FilteringStrategy in the ecm/jwt-ecm-sync-base project)
- defines a JWT transformation and its input/output definitions. It may be useful to implement some editor-wide settings relative to these IO definitions (to avoid repeatedly typing EasySOA Core's URL, for instance)

That would close the model synchronization issue, provided Nuxeo's bug is fixed.

File saving/loading: For now, we have only studied the problem. The Filesystem-level implementation at the bottom of this page describes how we could implement this feature, but this would require starting a new Eclipse plugin from scratch. Integration with the model synchronization feature, in particular, promises to be tricky.

Solution #1: WebDAV mountpoint

Idea

Mount a remote Nuxeo view on the user's filesystem. Then, editors work as usual: they are "fooled", since they don't "know" about EasySOA. Users could browse this filesystem to save/load files at relevant places.

To improve user-friendliness, we could add wizards in editors to create the files in the right folder (based on the mountpoint, project name, workflow name, and so on).

Details

Data transfer

Since mounting filesystems on one another is typically an operating system job, the client-side solution will depend on the user's OS.

- On Windows, network drives
- CHECKED On Linux, davfs with possibly other solutions, including graphical front-ends (in GNOME's Nautilus for example)
- On Mac OS X, Finder seems to support WebDAV mounting natively.
- On other platforms, this remains to be investigated.

WebDAV seems supported in Nuxeo. There is some documentation about working with WebDAV in Nuxeo, too.

Working this way would probably induce some limitations; for example, we may need to create "Workflow" items in Nuxeo before being able to create workflow diagrams at the right place.

Synchronization to EasySOA Core model

Can we use WebDAV to change file "properties"?
- Yes we can, but not relationships. See http://www.slideshare.net/nuxeo/cmis-overview
- WebDAV spec: http://www.webdav.org/specs/#dav
Otherwise, there's still workflow definition parsing...
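
In WebDAV, custom properties are set with the PROPPATCH method (RFC 4918). The following is a minimal sketch of how such a request could be built with the JDK's HTTP client, without sending it. The class name, the resource URL, the property namespace and the property name are all assumptions made for illustration; they are not part of Nuxeo or EasySOA.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class WebDavProperties {
    /** Builds the body of a PROPPATCH request setting one custom property. */
    static String propPatchBody(String namespace, String name, String value) {
        return "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
                + "<D:propertyupdate xmlns:D=\"DAV:\" xmlns:Z=\"" + escape(namespace) + "\">\n"
                + "  <D:set><D:prop><Z:" + name + ">" + escape(value)
                + "</Z:" + name + "></D:prop></D:set>\n"
                + "</D:propertyupdate>\n";
    }

    /** Minimal XML escaping for text and attribute values. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Wraps the body in an (unsent) HTTP request using the PROPPATCH method. */
    static HttpRequest propPatch(URI resource, String namespace, String name, String value) {
        return HttpRequest.newBuilder(resource)
                .method("PROPPATCH",
                        HttpRequest.BodyPublishers.ofString(propPatchBody(namespace, name, value)))
                .header("Content-Type", "application/xml; charset=utf-8")
                .build();
    }
}
```

Such a request only covers properties of a single resource; as noted above, relationships between resources would still have to be handled by other means.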

Constraints

Compatibility with vanilla editors

Saving & loading are fully compatible with any desktop editor.

TODO Synchronization to EasySOA Core model

Compatibility with other ECMs

Probably OK, since WebDAV is a widespread standard.

Versioning

Versioning would be handled by Nuxeo, and should work out-of-the-box.

Offline mode

On Windows, the Briefcase ("Porte-documents" in French) enables working on cached versions of online documents. According to Alfresco's documentation, simply mounting a WebDAV space as a shared network drive would do the trick (this seems too simple, though).

On Linux and Mac OS, we still need to find a solution. TODO

JWT point of view

TODO Synchronization to EasySOA Core model

Solution #2: CMIS

Idea

Use CMIS technology to allow the editor to access Nuxeo. Insert a plugin in each editor, which uses a library performing file transfer with Nuxeo (saving and loading but also property/relationship changes), which itself uses CMIS technology. The plugins integrate the library in editors (with some UI modifications) and thus enable users to work with remote files.

For editors which can't be modified, Nuxeo's web interface is used to upload or download files.

Details

Data transfer

CMIS allows transferring files using either a REST API or a SOAP API.

Synchronization to EasySOA Core model

It probably wouldn't be trivial, as the diagram would have to be "split" into several documents in EasySOA, each of them having relationships. These relationships would have to be managed, especially when updating.

The whole "model synchronization" logic would have to be handled directly in the editor using the CMIS API, since CMIS allows creating, retrieving, updating and deleting objects and relationships, but not calling "business-specific", server-side methods.

Constraints

Compatibility with vanilla editors

By packaging the generic (editor-independent) parts of this feature as a library, we could probably get other Eclipse-based editors to work in a "fallback" mode with minimal, non-intrusive changes. It is highly unlikely, though, that we'll be able to integrate editors without any changes, or to integrate non-Java editors without a major effort.

Compatibility with other ECMs

According to a CMIS overview by Nuxeo, CMIS is implemented by, at least:

- Nuxeo
- IBM Filenet
- EMC Documentum
- Microsoft Sharepoint
- Open Text
- Alfresco

Versioning

Versioning would be handled by Nuxeo, and should work out-of-the-box.

However, it seems CMIS does not support tree versioning. See this CMIS tutorial, where it's said that only Documents (leaves) can be versioned. This might not be a real problem if we consider that trees are only versioned when releasing.

Offline mode

TODO

JWT point of view

Apache Chemistry's OpenCMIS doesn't seem to be included in Orbit (see the Orbit CVS).

Solution #3: dedicated library

Idea

Same as solution 2, but the plugin would use Nuxeo-specific protocols instead of using the CMIS standard.

This would, in particular, allow using Nuxeo's Content Automation in order to move the EasySOA-related business logic to the server side.

Generic, unidirectional model synchronization

We want not only to save workflow files to an ECM, but also to be able to alter the ECM's model to reflect the current workflow configuration. We'll call this process "model synchronization".

In this section, we discuss how to make the process of synchronizing information from the editor to an ECM generic, i.e. flexible enough to be adapted to any server-side model (we'll call it an "implementation").

Our goal is to update the ECM-side model to reflect the current state of the editor-specific workflow model, e.g.:

- add new information to the ECM
- but also update existing pieces of information in the ECM (from previous syncs)
- and remove obsolete pieces of information in the ECM (from previous syncs), such as removed subprocesses or service calls
- while keeping user-added pieces of information (comments added directly on the ECM, ...)

We do not want bi-directional synchronization: non-conflicting changes in the ECM must be preserved (see last bullet point above), but do not have to be propagated to the workflow file.
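
The four goals above can be sketched as a simple "diff" between the desired state (derived from the workflow model) and the current repository state. This is only an illustration under simplifying assumptions: entities are reduced to string keys and values, and the set of keys created by previous syncs is assumed to be known. `SyncPlan` and `SyncPlanner` are hypothetical names, not classes from the JWT code base.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Keys to create, overwrite and delete on the ECM. */
record SyncPlan(Set<String> toAdd, Set<String> toUpdate, Set<String> toRemove) {}

final class SyncPlanner {
    /**
     * @param desired       key -> value derived from the current workflow model
     * @param remote        key -> value currently stored on the ECM
     * @param generatedKeys remote keys known to come from previous syncs
     */
    static SyncPlan plan(Map<String, String> desired,
                         Map<String, String> remote,
                         Set<String> generatedKeys) {
        Set<String> toAdd = new TreeSet<>();
        Set<String> toUpdate = new TreeSet<>();
        Set<String> toRemove = new TreeSet<>();
        for (Map.Entry<String, String> entry : desired.entrySet()) {
            String key = entry.getKey();
            if (!remote.containsKey(key)) {
                toAdd.add(key);                       // new information
            } else if (!entry.getValue().equals(remote.get(key))) {
                toUpdate.add(key);                    // changed since last sync
            }
        }
        for (String key : remote.keySet()) {
            // Only data generated by a previous sync may be removed;
            // user-added data (e.g. comments) is left untouched.
            if (!desired.containsKey(key) && generatedKeys.contains(key)) {
                toRemove.add(key);
            }
        }
        return new SyncPlan(toAdd, toUpdate, toRemove);
    }
}
```

Nothing flows back from the ECM to the workflow model here, which matches the unidirectional requirement. Updates that would overwrite user-made changes are a separate question, discussed under conflict resolution.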

In order to build a generic implementation, we need to reduce coupling between the editor's model and the ECM's model. This can be achieved by introducing a well-defined, switchable "transformation layer" between the editor and the actual synchronization layer (which uses CMIS). The transformation layer would map the editor's model to an instance of the CMIS model which would then be processed by a generic CMIS synchronization layer:

Editor model ---[implementation-specific transformation]---> client-side, implementation-specific instance of the CMIS model ---[synchronization (CMIS calls)]---> server-side, implementation-specific model

We can imagine that there's probably still some implementation-specific behavior in the synchronization layer. For example, different implementations may need different ways to resolve conflicts automatically. This could be modeled by a set of rules to be provided by the implementation-specific part:

Editor model ---[implementation-specific transformation]---> client-side, implementation-specific instance of the CMIS model --\
                                                                                                                               |
                                                                                                                               +--[synchronization (CMIS calls)]---> server-side, implementation-specific model
                                                                                                                               |
                                                                               implementation-specific synchronisation rules --/
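
As an illustration of this layering, here is a minimal sketch in which the transformation is a pluggable function and the rules are a pluggable interface, while the synchronizer itself stays generic. All type names (`TargetObject`, `SyncRules`, `Synchronizer`, `SyncPipeline`) are hypothetical and unrelated to the actual classes of the ecm/jwt-ecm-sync-base project; CMIS calls are replaced by a list of operation descriptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for one object of the client-side CMIS target model. */
record TargetObject(String id, String type, String name) {}

/** Hypothetical implementation-specific rules consulted by the generic layer. */
interface SyncRules {
    /** Whether the synchronizer may write this object to the repository. */
    boolean allows(TargetObject object);
}

/** Generic layer: knows only about target objects and rules, not the editor model. */
final class Synchronizer {
    private final SyncRules rules;

    Synchronizer(SyncRules rules) {
        this.rules = rules;
    }

    /** Returns the operations a real version would perform as CMIS calls. */
    List<String> synchronize(List<TargetObject> target) {
        List<String> operations = new ArrayList<>();
        for (TargetObject object : target) {
            if (rules.allows(object)) {
                operations.add("upsert " + object.type() + " '" + object.name() + "'");
            }
        }
        return operations;
    }
}

final class SyncPipeline {
    /** Chains the switchable transformation with the generic synchronization. */
    static <M> List<String> run(M editorModel,
                                Function<M, List<TargetObject>> transformation,
                                SyncRules rules) {
        return new Synchronizer(rules).synchronize(transformation.apply(editorModel));
    }
}
```

Supporting another server-side model would then mean providing another transformation and another set of rules, without touching the synchronizer.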

The next subsections detail the important aspects of the generic synchronization, and the types of rules to be handled.

File location

A particular workflow "project" may contain several files (e.g. a "diagram" file and a "model" file, as in JWT for instance, but also potentially several model files). These files are to be stored on particular objects, which depend on the implementation model. Here we need to assume that, whatever the implementation model, each file's location on the ECM is:

- Independent of changes in this file
- Independent of changes in other files

This basically means that, once a file has been created on the ECM, it won't move unless a user intentionally moves it (using "Save as..."-like features). In particular, no change to the workflow model can affect the workflow model file's location, or any other file's location. This is needed to integrate with the save mechanism of Eclipse EMF Resources, which assumes that files' URIs are fixed unless the user moves them, and that the operation of saving a Resource can be performed independently of other Resources.

As a result, the "implementation-specific transformation" above will be subject to the following constraint: objects carrying a workflow-project-related file will always be present in the output (else their ID wouldn't be valid anymore, thus it would break the loading/saving features).
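
This constraint can be checked mechanically after each transformation. The sketch below assumes that objects are identified by string IDs and that the IDs of file-carrying objects are known from previous saves; `FileCarrierCheck` is a hypothetical helper, not an existing class.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

final class FileCarrierCheck {
    /**
     * Returns the IDs of file-carrying objects that are absent from the
     * transformation output. An empty result means the constraint holds.
     */
    static Set<String> missingCarriers(Set<String> fileCarrierIds,
                                       Collection<String> outputIds) {
        Set<String> missing = new TreeSet<>(fileCarrierIds);
        missing.removeAll(outputIds);
        return missing;
    }
}
```

A synchronizer could refuse to proceed, or at least warn, when the result is not empty.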

Object matching

In order to do proper "updates", as opposed to simply adding new objects in the ECM, we need a way to identify the server objects that "are the same" as local objects. These objects may be renamed, and potentially a lot of properties can change between two saves.

Solutions:

- Use pre-existing editor-specific IDs in the ECM. Requires changes in the ECM implementation model.
- Use CMIS IDs already available in the ECM and store them in the workflow file. Requires changes in the editor model.
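
Both solutions can be expressed as interchangeable matching strategies. The sketch below uses hypothetical types (`LocalElement`, `RemoteObject`, `ObjectMatcher`); it does not reproduce the signature of the actual MatchingStrategy interface of ecm/jwt-ecm-sync-base, and it assumes IDs are plain strings.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical editor-side element; cmisId is null until the first sync. */
record LocalElement(String editorId, String cmisId, String name) {}

/** Hypothetical ECM-side object; editorId is a custom property, possibly null. */
record RemoteObject(String cmisId, String editorId, String name) {}

interface ObjectMatcher {
    Optional<RemoteObject> match(LocalElement local, List<RemoteObject> remotes);
}

/** First solution: the ECM model stores the editor's ID in a dedicated property. */
final class EditorIdMatcher implements ObjectMatcher {
    public Optional<RemoteObject> match(LocalElement local, List<RemoteObject> remotes) {
        return remotes.stream()
                .filter(remote -> local.editorId().equals(remote.editorId()))
                .findFirst();
    }
}

/** Second solution: the workflow file stores the CMIS ID assigned by the ECM. */
final class CmisIdMatcher implements ObjectMatcher {
    public Optional<RemoteObject> match(LocalElement local, List<RemoteObject> remotes) {
        if (local.cmisId() == null) {
            return Optional.empty(); // never synchronized yet: must be created
        }
        return remotes.stream()
                .filter(remote -> local.cmisId().equals(remote.cmisId()))
                .findFirst();
    }
}
```

In both cases a renamed element is still matched, since the name plays no part in the comparison.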

Model consistency

We need a way to keep the server model consistent, i.e. we should not only add & update information, but also remove obsolete information.

In order to do this, we need a clear definition of what's only automatically-generated data (which should be automatically overwritten/removed if necessary) and what might be user-generated data (which should be left alone unless instructed otherwise).

Solutions:

- Per-type rules: this type of relationship (or document, or property) is automatically generated, that one is not. May be incompatible with some models where some relationships are both modified by users and automatic processes.
- Per-object rules: this object matches the rules (a property has the correct value), so it's been generated. May require implementation model modification to add dedicated properties (e.g. "isGenerated").

In practice, a combination of both might be ideal. We could then specify "general rules" per type and "exceptions" per object.
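
A minimal sketch of such a combination follows, assuming that remote entities expose a type name and a map of string properties, and that the per-object exception is carried by an "isGenerated" property as suggested above. `RemoteEntity` and `GeneratedDataPolicy` are hypothetical names.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical remote entity: a type name plus its properties. */
record RemoteEntity(String type, Map<String, String> properties) {}

final class GeneratedDataPolicy {
    private final Set<String> generatedTypes;

    /** @param generatedTypes types considered automatically generated by default */
    GeneratedDataPolicy(Set<String> generatedTypes) {
        this.generatedTypes = generatedTypes;
    }

    /**
     * The per-object exception (the "isGenerated" property, when present)
     * takes precedence over the general per-type rule.
     */
    boolean isGenerated(RemoteEntity entity) {
        String flag = entity.properties().get("isGenerated");
        if (flag != null) {
            return Boolean.parseBoolean(flag);
        }
        return generatedTypes.contains(entity.type());
    }
}
```

Only entities for which `isGenerated` returns true would be candidates for automatic overwriting or removal.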

Conflict resolution

The behavior to adopt when automatically generated data (from the editor model) conflicts with potentially user-defined data might change from one user to another, and from one entity (document, relationship, property, ...) to another.

We can imagine at least three potential behaviors:

- resolve in favor of the editor (overwrite)
- resolve in favor of the ECM (don't change)
- ask the user to choose
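
These three behaviors could be modeled as a policy selected per user or per entity type. The sketch below is an assumption-laden illustration: values are plain strings, the names `ConflictPolicy` and `ConflictResolver` are hypothetical, and the "ask the user" case is delegated to a callback so that the resolver stays independent of any UI.

```java
import java.util.function.BinaryOperator;

enum ConflictPolicy { PREFER_EDITOR, PREFER_ECM, ASK_USER }

final class ConflictResolver {
    /** Called for ASK_USER with (editorValue, ecmValue); returns the chosen value. */
    private final BinaryOperator<String> prompt;

    ConflictResolver(BinaryOperator<String> prompt) {
        this.prompt = prompt;
    }

    /** Returns the value that should end up on the ECM. */
    String resolve(ConflictPolicy policy, String editorValue, String ecmValue) {
        return switch (policy) {
            case PREFER_EDITOR -> editorValue;                   // overwrite
            case PREFER_ECM -> ecmValue;                         // don't change
            case ASK_USER -> prompt.apply(editorValue, ecmValue);
        };
    }
}
```

In an Eclipse editor, the callback would typically open a dialog; in a headless build it could fall back to one of the two automatic policies.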

Miscellaneous

Offline Nuxeo

In order to add offline support to any solution, we could set up a local instance of Nuxeo on each designer's machine, and keep it in sync with the central instance when the network is on.

This doesn't seem feasible, though: an old version of Nuxeo's documentation mentions it, but only in read-only mode. I (Y. Rodière) could not find a matching page in the current documentation. The closest matches were:

- a page about Android sync, which mentions "offline usage"
- a page with an "offline client" section, which, sadly, is empty

Eclipse editors

Eclipse editors (JWT) save process

When a save is requested for a given part, org.eclipse.ui.ISaveablePart.doSave(IProgressMonitor monitor) is called.

In WEEditor, the method then calls org.eclipse.emf.ecore.resource.Resource.save(Map<?, ?> options) on both the diagram and the model resource. Here is an excerpt of this method's Javadoc:

An implementation typically uses the URI converter of the containing resource set to create an output stream, and then delegates to save(OutputStream, Map).

The URIConverter interface seems to be implemented by org.eclipse.emf.ecore.resource.impl.ExtensibleURIConverterImpl only, and this class delegates most of its job to org.eclipse.emf.ecore.resource.URIHandler. URIHandlers are mainly responsible for creating input and output streams as well as deleting the content pointed by a URI. Default URIHandlers are the following:

- ArchiveURIHandlerImpl (archive:/ , for zipped files?)
- EFSURIHandlerImpl (Eclipse File System)
- FileURIHandlerImpl (file:/)
- PlatformResourceURIHandlerImpl (platform:/)

Eclipse editors (JWT) loading process

When an org.eclipse.ui.part.MultiPageEditorPart is created, the org.eclipse.ui.part.MultiPageEditorPart.createPages method is called.

In WEEditor, this method calls org.eclipse.jwt.we.editors.WEEditor.createModel, which:

- uses the editor input (obtained via getEditorInput()) to determine the files' URIs
- then uses org.eclipse.jwt.converter.Converter to convert the model file to the newest version. CAUTION, this involves code assuming the file is on the local filesystem.
- and finally calls org.eclipse.emf.ecore.resource.ResourceSet.getResource to load the files (first the model, then the diagram). As in org.eclipse.emf.ecore.resource.Resource.save, this uses a URIConverter to get the input stream.

Implementation alternatives

The actual implementation of our "import/export" feature in an Eclipse-based application can take different forms, from the user point of view.

- Remote resources in an Eclipse project, which would be loaded and saved transparently. According to the IFile interface definition, files can be remote, but it seems like a temporary state for not-fetched-yet resources. This could be handled at different levels:
  - at editor level: a specific editor which would directly provide the model to the sync facility
  - at filesystem level, using Eclipse File System.
- Imported resources, using the import wizard.
- Synchronized resources. The ECM would act as a Git or SVN repository. This involves adding team support to the plugin.

See also PTP remote projects, which seem to be a dead feature.

Editor-level implementation

File saving/loading

We could use org.eclipse.emf.ecore.resource.ResourceSet.setURIConverter to add our own URIHandler dealing with CMIS. The resources would thus be assigned (at creation time) CMIS-specific URIs (e.g. "cmis-rest://myecm:8080/rest/cmis/") which would enable using specific URIHandlers (see Eclipse save and load process above).

However, it seems that, when opening "things" in the package/project navigator using right-click > "open with", Eclipse expects these "things" to be Files (see org.eclipse.ui.actions.OpenWithMenu.openEditor, and org.eclipse.core.resources.IResource type hierarchy). The project explorer may then be somewhat broken by this solution. Resource links would not help, since they seem to use EFS to delegate their job to real File instances.
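
The scheme-based dispatch that a CMIS-specific URIHandler would rely on can be sketched with plain JDK types. The interfaces below are simplified stand-ins for EMF's URIHandler and URIConverter, not the real EMF API, and the "first handler that accepts the URI wins" rule reflects our understanding of ExtensibleURIConverterImpl. The handler stores content in memory where a real one would call CMIS; the URI is a made-up example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-in for EMF's URIHandler (only what this sketch needs). */
interface SimpleUriHandler {
    boolean canHandle(URI uri);
    OutputStream createOutputStream(URI uri) throws IOException;
}

/** Hypothetical handler for "cmis-rest" URIs, backed by an in-memory map. */
final class InMemoryCmisHandler implements SimpleUriHandler {
    final Map<URI, byte[]> stored = new HashMap<>();

    public boolean canHandle(URI uri) {
        return "cmis-rest".equals(uri.getScheme());
    }

    public OutputStream createOutputStream(URI uri) {
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                stored.put(uri, toByteArray()); // a real handler would upload here
            }
        };
    }
}

/** Simplified stand-in for the converter: the first accepting handler wins. */
final class SimpleUriConverter {
    private final List<SimpleUriHandler> handlers;

    SimpleUriConverter(List<SimpleUriHandler> handlers) {
        this.handlers = handlers;
    }

    OutputStream createOutputStream(URI uri) throws IOException {
        for (SimpleUriHandler handler : handlers) {
            if (handler.canHandle(uri)) {
                return handler.createOutputStream(uri);
            }
        }
        throw new IOException("No handler for " + uri);
    }
}
```

Because the content is only "uploaded" when the stream is closed, the caller keeps the usual write-then-close pattern and never needs to know that the target is remote.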

Synchronization

In order to synchronize the remote model, we would need to either:

- subclass one or several of the org.eclipse.emf.ecore.resource.Resource implementations, and change the default org.eclipse.emf.ecore.resource.Resource.Factory.Registry of the relevant org.eclipse.emf.ecore.resource.ResourceSet in order to create the subclasses in place of the base class. The subclass would handle the "save" method in a special way, not only serializing the resource in a file but also connecting to the ECM to update the model
- handle the synchronization externally, by inspecting the JWT model (for example in WEEditor.save, after saving the files)
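
The first option can be illustrated with a small sketch that does not depend on EMF. `ModelResource` is a hypothetical stand-in for an EMF Resource, and the synchronizer is a plain callback; the point is only that the subclass still has access to the in-memory model when `save` is called.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for an EMF Resource holding an in-memory model. */
class ModelResource<M> {
    protected final M model;
    protected final List<String> log;

    ModelResource(M model, List<String> log) {
        this.model = model;
        this.log = log;
    }

    void save() {
        log.add("serialized " + model); // a real Resource writes to a stream
    }
}

/** Subclass whose save also pushes the model to the ECM. */
class SynchronizingResource<M> extends ModelResource<M> {
    private final Consumer<M> synchronizer;

    SynchronizingResource(M model, List<String> log, Consumer<M> synchronizer) {
        super(model, log);
        this.synchronizer = synchronizer;
    }

    @Override
    void save() {
        super.save();               // write the file first
        synchronizer.accept(model); // then synchronize from the in-memory model
    }
}
```

The second option amounts to calling the same synchronizer from the editor's own save method, which avoids subclassing but ties the synchronization to one editor.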

Filesystem-level implementation

See Eclipse help on contributing an alternative filesystem. This subsection dealing with cross-filesystem projects (using links) and this page dealing with linked resources may be especially useful.

File saving/loading

EFS-CMIS lets you create folders in Eclipse projects which are actually links to folders on remote ECMs. It was tested against the Alfresco public repository and a local Nuxeo 5.5 instance.

When testing, JWT-WE had to be slightly modified in order to work with EFS-CMIS. The Converter assumes that all files are on the local filesystem (explicit calls to the FileInputStream constructor). Therefore, it had to be bypassed when using remote files.

Otherwise, this seems to work perfectly. It may have to be re-implemented in order to use it in JWT, though, since it uses Alfresco code and some "org.oasis_open.docs" (?) libraries.

Model synchronization

This does not seem possible at this level. The IFile interface is already defined and cannot be extended, since the purpose of EFS is to be generic. Moreover, by the time a file is saved, the Resources have already been serialized (to XML, for example): it is impossible to (elegantly) analyze these resources.
