mdutoo edited this page Feb 29, 2012 · 5 revisions

Current usage of SOAP in Nuxeo Platform

In the current version of Nuxeo Platform, we expose very few features via SOAP/WSDL.

The main Web Service API (Content Automation) is exposed via JAX-RS with JSON marshaling and provides neither a SOAP binding nor a WSDL contract.

Identified problems with SOAP / WSDL

The fact that our main WebService API does not use SOAP is tied to our bad experiences with WSDL/SOAP related technologies.

In this section I try to summarize the issues we have faced so far.

My goal is not to flame SOAP technologies but rather to share our limited experience and get some feedback.

Interoperability

Oddly enough, this is probably the point where we were the most disappointed with SOAP/WSDL technologies.

Issues when calling Java WebServices from the .Net world

.Net shops are by far the clients that ask the most for Web Services, probably because of the good tooling integrated by default in their IDE.

Unfortunately, we experienced several compatibility issues when doing this.

Of course there are type mapping issues, but we also ran into "stupid" restrictions, for example : the .Net client can only call methods that have no more than one parameter.

This was indeed a documented limitation in MSDN (I think it was when using JAX-RPC via JBossWS on the server side).

Issues in the Java world because of small differences in the JAX-WS version

We very recently had to integrate Nuxeo with a BPM engine from IBM.

This BPM engine used a slightly higher version of JAX-WS than us (XXX vs YYY), but that was enough to break interoperability.

So we upgraded the JAX-WS stack :

  • creating a backward compatibility issue

  • adding the requirement to put libraries in the JVM endorsed directory because of an XML parser version mismatch with the one in JDK 6

Usage of Scripting languages

Most scripting languages like JavaScript or Python are not very welcome in the SOAP world. At the very least, tooling is usually poor and the performance overhead is significant.

For example, the CMIS stack now offers a browser binding that is based on JSON/REST rather than SOAP.

To summarize, almost every time we had to do interoperability with SOAP (with JS, Java, .Net) we experienced issues.

So the promise of transparent interoperability via the SOAP stack did not hold, at least in the tests we did.

Binary streams

Since Nuxeo is an ECM platform, a lot of our APIs use binary streams as input parameters or return values.

There is nothing we can do about this.

In the context of SOAP, this is not that easy to manage.

We tried several solutions.

Brutal base64 encoding

This is probably the simplest and most compatible solution.

This is also a real pain :

  • from the network efficiency point of view

  • from the XML stack point of view (this can simply exhaust the JVM memory, depending on the SOAP XML parsing implementation)
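To give a rough idea of the network cost : base64 turns every 3 bytes of binary content into 4 ASCII characters, so the payload grows by about a third before any XML wrapping is added. A minimal sketch in Python (the function name is only illustrative) :

```python
import base64


def base64_overhead(payload: bytes) -> float:
    """Return the ratio encoded size / raw size for a binary payload."""
    encoded = base64.b64encode(payload)
    return len(encoded) / len(payload)


# 3 KiB of zero bytes, standing in for a (much larger) document blob.
blob = bytes(3 * 1024)
print(f"raw={len(blob)} bytes, ratio={base64_overhead(blob):.3f}")
```

On top of this size overhead, a parser that materializes the whole envelope may hold both the encoded text and the decoded bytes in memory at once, which is presumably where the memory problems come from.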

Return REST GET URLs via SOAP

For returned streams, the simple solution is to return a download URL.

This works quite well but :

  • it breaks the "magic soap proxy" paradigm : the client has to handle an authenticated download by hand

  • it does not work for input parameters (usually we cannot expect the client to also be an HTTP server)

NB : For input parameters, we could do a separate upload that returns a token, which is then passed as the parameter ...
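The "upload first, then pass a token" idea could look roughly like this. It is only a sketch with an in-memory store : `UploadStore`, `attach_blob` and the one-use token policy are assumptions made for illustration, not existing Nuxeo APIs.

```python
import uuid


class UploadStore:
    """Hypothetical server-side store for blobs uploaded ahead of a call."""

    def __init__(self):
        self._pending = {}  # upload token -> raw bytes

    def upload(self, data: bytes) -> str:
        """Step 1: plain HTTP upload, returns a token identifying the blob."""
        token = uuid.uuid4().hex
        self._pending[token] = data
        return token

    def consume(self, token: str) -> bytes:
        """Step 2: resolve the token received as a parameter (one use only)."""
        return self._pending.pop(token)


def attach_blob(store: UploadStore, doc_id: str, upload_token: str) -> dict:
    """Stand-in for a SOAP operation that takes a token instead of a stream."""
    data = store.consume(upload_token)
    return {"doc": doc_id, "size": len(data)}


store = UploadStore()
token = store.upload(b"%PDF-1.4 ...")
print(attach_blob(store, "doc-42", token))
```

The SOAP message then only carries a short string, at the price of a second request and some server-side state to clean up.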

MTOM

MTOM is probably the only standard solution for this, but it seems to create a lot of incompatibility issues.

Security

In the context of Nuxeo we need to manage authentication and security.

Although there are standards like WS-Security, they come with significant requirements :

  • on the server side WS stack (that may be provided by the app server)

  • on the client side

A classic workaround is to simply proxy (filter) SOAP calls, but in the context of Nuxeo this is not enough : we do not only need to check access to the method exposed as a Web Service, we also need to check access depending on the parameters (a data-level security issue that is classic in an ECM context).

Currently we have 2 solutions inside Nuxeo :

Token (Session) management

The user must first call a method with their credentials as parameters and receives a token in response. The token then has to be included as a parameter of each call.

It does work independently from the WS stack and client, but this approach has several drawbacks :

  • API is ugly

  • it breaks the basic principle of Stateless WS (since we need to maintain state on the server side)
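Both drawbacks are visible in a minimal sketch of such an API (the class and method names are hypothetical, not the actual Nuxeo ones) : every business method carries an extra `token` parameter, and the server has to keep a token table between calls.

```python
import secrets


class SessionService:
    """Hypothetical token-based service; names are illustrative only."""

    def __init__(self, users: dict):
        self._users = dict(users)   # username -> password
        self._sessions = {}         # token -> username: state kept server side

    def connect(self, username: str, password: str) -> str:
        """First call: exchange credentials for a token."""
        if self._users.get(username) != password:
            raise PermissionError("bad credentials")
        token = secrets.token_hex(16)
        self._sessions[token] = username
        return token

    def get_title(self, token: str, doc_id: str) -> str:
        """Business call: the token pollutes the signature of every method."""
        user = self._sessions.get(token)
        if user is None:
            raise PermissionError("unknown or expired token")
        return f"{doc_id} (as {user})"

    def disconnect(self, token: str) -> None:
        """Last call: drop the server-side state."""
        self._sessions.pop(token, None)


service = SessionService({"alice": "secret"})
token = service.connect("alice", "secret")
print(service.get_title(token, "doc-42"))
service.disconnect(token)
```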

WS Security Hack

For CMIS we have a kind of WS-Security where we pre-parse the SOAP envelope to extract the credentials and put them in a context that is accessible from the application layer, which then manages the security context initialization.

This does work, but this is clearly not the best solution.

Environment portability

Usually you write the code in a development environment : this means the SOAP proxies are generated against a test server. So you need to be able to change the target server URL without rebuilding everything.

We had issues when using JAX-WS / Metro when trying to change the protocol from http to https : we had to use reflection to change the fields inside the generated proxy code.

This may be very specific to a bug in the Metro stack we use by default, but this is another example of the bad experiences we had with WSDL.

Dynamic API

The whole point of Nuxeo services is to be extensible. This means I can very easily expose a new Operation or Chain as a service; part of this can be done from within Studio and exposed without any server restart.

As a first step, this is an issue if we use a "generate WSDL and binding on startup" approach, as we currently do with Metro or JBossWS.

But more generally, it is a problem to expose a dynamic model via a stack that statically generates the WSDL and client stubs.

For this I can see 2 different approaches :

Invoke pattern

We could build a static and generic WSDL/SOAP endpoint that supports the "invoke" model.

This would work, but we lose an interesting part of the SOAP/WSDL stack : the fact that you can have a meaningful and typesafe stub to access the service.
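A minimal sketch of the "invoke" model, with a hypothetical registry and a made-up operation id : the only thing a static contract would describe is `invoke(op_id, params)`, so a mistake in a parameter name is only caught when the call is executed.

```python
OPERATIONS = {}


def operation(op_id: str):
    """Register a function under an operation id (hypothetical registry)."""
    def register(func):
        OPERATIONS[op_id] = func
        return func
    return register


@operation("Demo.GetTitle")
def get_title(doc_id: str) -> str:
    return f"Title of {doc_id}"


def invoke(op_id: str, params: dict):
    """Single generic entry point: all a static WSDL would describe."""
    return OPERATIONS[op_id](**params)


print(invoke("Demo.GetTitle", {"doc_id": "doc-42"}))

# A misspelled parameter is only detected at call time, not by a generated stub.
try:
    invoke("Demo.GetTitle", {"docId": "doc-42"})
except TypeError as err:
    print("runtime failure:", err)
```

New operations can be registered without touching the contract, which is the benefit; the stub generated from it just cannot say anything about them.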

On demand WSDL generation

We could generate the WSDL on demand, but of course we still have the issue of the stub needing to be rebuilt each time.

SOAP anyway ?

As you can see, we have several issues with SOAP and WSDL :

  • maybe because we did not use the right techno stack

  • maybe because we did not use the right approach

  • maybe because of SOAP itself

Anyway, my goal is to find the least bad solution to have a real support for SOAP in our main Content Automation API :

  • this would make sense in the context of EasySOA (this would help integrations that need to be done in EasySOA)

  • because it does make sense in some contexts

Context dependency

Scripting world (JavaScript, Python, Ruby, PHP ...)

From our experience, no one asks for a SOAP binding when using a scripting technology.

They are usually happy with an HTTP/JSON API like Automation.

To improve their experience of the Web Service API we don't need to have a SOAP stack, but rather :

  • make the API simpler and more RESTful

or

  • provide a client lib (which is what we did for PHP, JS and Python)

Java World

In the Java world, people do ask for SOAP support, but as soon as we explain the issues and provide the native Java client lib the problems are solved.

.Net World

.Net developers are probably the ones who really need a SOAP binding.

Even providing a good .Net client lib or a simple RESTful http API won't fulfill their needs.

=> currently .Net is probably the first motivation for exposing Nuxeo Automation API via SOAP

Possible approach

For now I would say that the least bad solution would be to generate the WSDL on demand, depending on :

  • the set of operations (APIs) that need to be exposed

  • a parameter on how Blobs should be handled (MTOM vs URLs vs base64)

  • a parameter on how security should be handled (Session vs WS-Security vs HTTP auth)

This also means we must have an endpoint that can handle all the cases.
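As a rough sketch of what "on demand" could mean, the three inputs above can be captured in a small options object from which a WSDL 1.1 `portType` fragment is produced. `ExportOptions`, `build_port_type` and the message naming scheme are assumptions made for illustration; a complete WSDL would also need types, messages, a binding and a service element, and the blob and security modes would drive those parts, which are not generated here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace
BLOB_MODES = {"mtom", "url", "base64"}
SECURITY_MODES = {"session", "ws-security", "http-auth"}

ET.register_namespace("wsdl", WSDL_NS)


@dataclass
class ExportOptions:
    """Hypothetical description of one on-demand WSDL request."""
    operations: list
    blob_mode: str = "url"
    security_mode: str = "session"

    def __post_init__(self):
        if self.blob_mode not in BLOB_MODES:
            raise ValueError(f"unknown blob mode: {self.blob_mode}")
        if self.security_mode not in SECURITY_MODES:
            raise ValueError(f"unknown security mode: {self.security_mode}")


def build_port_type(name: str, options: ExportOptions) -> str:
    """Build only the portType fragment listing the exposed operations."""
    port_type = ET.Element(f"{{{WSDL_NS}}}portType", {"name": name})
    for op in options.operations:
        op_el = ET.SubElement(port_type, f"{{{WSDL_NS}}}operation", {"name": op})
        ET.SubElement(op_el, f"{{{WSDL_NS}}}input", {"message": f"tns:{op}Request"})
        ET.SubElement(op_el, f"{{{WSDL_NS}}}output", {"message": f"tns:{op}Response"})
    return ET.tostring(port_type, encoding="unicode")


options = ExportOptions(["FetchDocument", "AttachBlob"], blob_mode="mtom")
print(build_port_type("AutomationPortType", options))
```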

Brutal implementation

A fairly brutal approach would be to generate the WSDL with a templating system like Freemarker and implement the EndPoint as a bare Servlet that does XML parsing.

=> looks like a painful road especially for maintenance

Plug in a really pluggable SOAP stack

CXF seems to be better than Metro for that ???

Use existing features from Frascati

I still don't know exactly what Frascati does, but since it seems to be able to dynamically handle Proxy/Endpoint generation from an SCA model, it could be worth checking whether it can be reused / extended (even if that means embedding Frascati in Nuxeo).

OW (mdutoo) feedback

First, I believe that users are right to ask for a client connector that works best with their own development platform and choices : some want to see JSON/HTTP and, going a bit further, SOAP alone would not be enough because some users really only want to see Java. To provide them with that, you have to address interoperability issues that vary depending on what they want to see and on their own platform. SOAP technologies try to address all platforms and issues at the same time, but there's no magic : the more issues (security, binary...) and the more platforms you actually want to support, the harder it gets, the more constraints it brings and the more it impacts the user development model. And it may come to the point where, to make it work, you have no other choice but to put your hands in SOAP's innards - that is, XML.

So my answer is that :

  • if you only have (and will have) to support a single client platform, write a dedicated connector for that platform, making technology choices that will make interoperability and remoting easy to write and maintain while remaining efficient. For instance, if only .NET users want WSDL, write a .NET connector remoted using Automation.
  • otherwise, choose the interoperability technology that makes as many kinds of users happy as possible (here SOAP comes to mind). But be ready anyway to get your hands dirty at a low level to make it work, if you start stretching it on the number of platforms and features you want - so also choose it so that it helps you with that. Content-oriented remoting such as XML-based interoperability allows for easy low-level manipulation in any language and on any platform, while many platforms provide higher level tools such as XML transformation, and some service stacks (like CXF) are flexible enough to handle a lot of cases and pluggable enough to let you add your own if need be - not even talking about interoperability solutions that are dedicated to XML manipulation, that is ESBs.

So yes, CXF has a fully pluggable architecture with lots of plugins and Spring configuration, so it is also easy to develop new ones and configure them.

Now, FraSCAti is a service-oriented "middleware of middlewares" based on SCA & dynamic components. For instance, it integrates CXF to handle SOAP and REST, and Joram to handle JMS. You could do what you need (generating a WSDL interface from your Automation JSON interface and translating SOAP calls to Automation) within CXF alone, but FraSCAti additionally provides a (model-driven, code-generating) architecture that's made for doing it transparently, just like today it transparently generates WSDL from Java interface and SOAP to Java calls translation code. From my experience, this can be done by writing

  • either a FraSCAti binding, that you'll have to configure as many times as you have Automation services (but this FraSCAti configuration could be also generated using the new composite templates),
  • or a FraSCAti implementation, that will sit on its own and rely on your existing Automation service configuration files to know which services to expose. We're currently hacking the ServletImplementationVelocity in a similar way to expose velocity templates as SOAP services, see #73 and code.
  • a lighter answer could be to generate JAXRS interfaces that many client sides (even if it's not the point of JAXRS) know how to use to call a REST / JAXRS service, and such a FraSCAti client could then provide any other kind of service (SOAP, JMS...). However, it would require being even more platform independent (bye bye MTOM).

And further, what about generating even your client connectors ?

Some other feedback :

  • token / session API is ugly ? Not if you hide login/logouts in a proxy or aspect.
  • download using GET URL breaks "magic SOAP proxy" by asking for another authentication integration ? Yes, but SOAP is not that magic : it's not the best solution for everything (especially binaries, or privacy where HTTPS is far more efficient), and it almost always runs on HTTP anyways (so it's rather a "magic HTTP proxy"). So again, provide a client hiding differences in authentication schemes, and building on a lower common (HTTP) protocol.
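Hiding login/logout in a proxy, as suggested in the first point above, could be sketched like this. `FakeTokenService` is a stand-in for a remote token-based API and `AutoLoginProxy` is a hypothetical client-side wrapper; neither is an existing Nuxeo class.

```python
class FakeTokenService:
    """Stand-in for a remote token-based API (illustrative only)."""

    def __init__(self):
        self.calls = []

    def connect(self, user: str, password: str) -> str:
        self.calls.append("connect")
        return "token-1"

    def disconnect(self, token: str) -> None:
        self.calls.append("disconnect")

    def get_title(self, token: str, doc_id: str) -> str:
        self.calls.append("get_title")
        return f"title of {doc_id}"


class AutoLoginProxy:
    """Client-side proxy that hides the token from the caller."""

    def __init__(self, service, user: str, password: str):
        self._service = service
        self._user = user
        self._password = password

    def __getattr__(self, name):
        # Only reached for names not defined on the proxy itself,
        # i.e. the business methods of the wrapped service.
        method = getattr(self._service, name)

        def call(*args, **kwargs):
            token = self._service.connect(self._user, self._password)
            try:
                return method(token, *args, **kwargs)
            finally:
                self._service.disconnect(token)

        return call


service = FakeTokenService()
client = AutoLoginProxy(service, "alice", "secret")
print(client.get_title("doc-42"))  # no token in the caller's code
```

This version logs in and out around every call to keep the sketch short; a real proxy would more likely cache the token and only reconnect when it expires.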

INRIA ideas

=> ?
