Frontend

Alexander Kostetskiy edited this page Oct 12, 2018 · 20 revisions

Introduction

In order to support larger objects and achieve a higher throughput, Ambry is making a concerted effort towards making the whole stack (client, frontend, routing and backend) non-blocking. This document describes the design of the non-blocking front end, the REST framework behind it, the interaction with the routing library and the use of Netty as the NIO framework.

The blocking paradigm has several problems:

  1. Inability to support larger objects - this is the single biggest reason Ambry is making this effort.
  2. Client has to make a call and wait for the operation to finish before the thread is released - this is wasteful.
  3. Blocking paradigms don't play well with some frameworks (like Play).
  4. High memory pressure at our current front ends if we start supporting larger objects.

High Level Design

The non-blocking front end can be split into three well-defined components:

  1. Remote Service layer - Responsible for interacting with any service that could potentially make network calls or do heavy processing (e.g. Router library) to perform the requested operations.
  2. Scaling layer - Acts as a conduit for all data that goes in and out of the front end. Responsible for enforcing the non-blocking paradigm and providing scaling independent of all other components.
  3. NIO layer - Performs all network related operations including encoding/decoding HTTP.

Although these components interact very closely, they are very modular in the sense that they have clear boundaries and provide specific services. The components are all started by a RestServer that also enforces the start order required for interaction dependencies.
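The start-order idea can be sketched as follows. This is a minimal illustration, not the actual RestServer: SketchComponent, NamedComponent and SketchServer are hypothetical names, and the order used in the demo (NIO layer last) is an assumption made here so that no request is accepted before the layers behind it are ready.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical lifecycle contract; not Ambry's actual interface.
interface SketchComponent {
  void start();
  void shutdown();
}

// Records lifecycle events so that the ordering can be observed.
class NamedComponent implements SketchComponent {
  private final String name;
  private final List<String> log;

  NamedComponent(String name, List<String> log) {
    this.name = name;
    this.log = log;
  }

  @Override
  public void start() {
    log.add("start:" + name);
  }

  @Override
  public void shutdown() {
    log.add("shutdown:" + name);
  }
}

// Starts components in the given order and shuts them down in reverse.
class SketchServer {
  private final List<SketchComponent> startOrder;
  private final Deque<SketchComponent> started = new ArrayDeque<>();

  SketchServer(List<SketchComponent> startOrder) {
    this.startOrder = startOrder;
  }

  void start() {
    for (SketchComponent component : startOrder) {
      component.start();
      started.push(component); // the most recently started is shut down first
    }
  }

  void shutdown() {
    while (!started.isEmpty()) {
      started.pop().shutdown();
    }
  }
}

class StartOrderDemo {
  static List<String> run() {
    List<String> log = new ArrayList<>();
    // Assumed order: the NIO layer starts last.
    List<SketchComponent> order = Arrays.asList(
        new NamedComponent("remoteService", log),
        new NamedComponent("scaling", log),
        new NamedComponent("nio", log));
    SketchServer server = new SketchServer(order);
    server.start();
    server.shutdown();
    return log;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Shutting down in reverse order means the layer that accepts new work stops first, which lets in-flight work drain through the layers behind it.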

Component Description

The following sections describe the parts that make up each of these components along with a brief explanation of where they fit in.

Interaction enablers

In order to better understand the different layers and the rationale behind their design, it is useful to understand the tools that the layers can use to exchange data and control. Interaction enablers are interfaces that enable the different components to interact with each other in a way that is agnostic to the underlying implementations of each of the components. These interfaces are implemented by the components that generate/consume the data that needs to be shared.

ReadableStreamChannel

This is an interface that enables interaction between the front end and the library that contacts the remote services (like the Router library). Through this interface, data can be streamed (in the form of bytes) between the interacting pieces as if reading it through a channel (usually they are actually reading from an underlying network channel). An implementation of this interface is required through the RestRequest interface at the NIO layer. When a blob is being POSTed, the remote service library will "pull" data from the front end through a ReadableStreamChannel. If the remote service library needs to return response bodies, it will need to provide an implementation of this interface that can be used with the RestResponseHandler. ReadableStreamChannel is designed for asynchronous reads and focuses on avoiding copies and supporting back pressure naturally.
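The pull-based, copy-free style of reading described above can be sketched as follows. The interfaces here (ReadCallback, ChunkSink, ChunkSource) are simplified, hypothetical stand-ins chosen for illustration and should not be read as the actual ReadableStreamChannel API. The source hands out its own buffers without copying them and offers the next chunk only after the sink reports the previous one as written.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Mirrors the Callback shape used in this document.
interface ReadCallback<T> {
  void onCompletion(T result, Exception exception);
}

// Hypothetical sink; the consumer signals readiness by completing the callback.
interface ChunkSink {
  void write(ByteBuffer chunk, ReadCallback<Long> callback);
}

// Hypothetical, simplified stand-in for a readable stream channel.
interface ChunkSource {
  void readInto(ChunkSink sink, ReadCallback<Long> callback);
}

// Hands out its buffers one at a time, without copying them.
class BufferedChunkSource implements ChunkSource {
  private final Deque<ByteBuffer> chunks;
  private long totalBytes = 0;

  BufferedChunkSource(List<ByteBuffer> chunks) {
    this.chunks = new ArrayDeque<>(chunks);
  }

  @Override
  public void readInto(ChunkSink sink, ReadCallback<Long> callback) {
    ByteBuffer next = chunks.poll();
    if (next == null) {
      callback.onCompletion(totalBytes, null); // everything has been consumed
      return;
    }
    // The next chunk is offered only after the sink reports this one as done.
    // With a synchronous sink this recurses once per chunk, which is fine for a sketch.
    sink.write(next, (written, exception) -> {
      if (exception != null) {
        callback.onCompletion(totalBytes, exception);
      } else {
        totalBytes += written;
        readInto(sink, callback);
      }
    });
  }
}

class ChunkSourceDemo {
  static String run() {
    ChunkSource source = new BufferedChunkSource(Arrays.asList(
        ByteBuffer.wrap("hello ".getBytes(StandardCharsets.UTF_8)),
        ByteBuffer.wrap("world".getBytes(StandardCharsets.UTF_8))));
    StringBuilder received = new StringBuilder();
    long[] total = new long[1];
    ChunkSink sink = (chunk, callback) -> {
      int size = chunk.remaining();
      received.append(StandardCharsets.UTF_8.decode(chunk));
      callback.onCompletion((long) size, null); // ready for the next chunk
    };
    source.readInto(sink, (bytes, exception) -> total[0] = bytes);
    return received + ":" + total[0];
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints "hello world:11"
  }
}
```

Because the sink decides when to complete each write callback, a slow consumer naturally slows the source down without any explicit flow-control signal.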

RestRequest

This interface extends the ReadableStreamChannel interface and is implemented by the NIO layer. It enables interaction between all the components of the front end in a way that is agnostic to the NIO layer framework. In addition to helping the remote service library pull data from the client (through the front end), it enables the scaling and remote service layers to process the request correctly.

RestResponseChannel

This interface, implemented by the NIO layer, provides a way for the remote service and scaling layers to return processed responses to the client. The APIs it provides deal with bytes only and thus it is agnostic to the kind of data being returned. It is the responsibility of the NIO layer to encode the data into HTTP and send it over the network to the client.

Remote Service Layer

This layer mainly interacts with the remote service library by calling the right APIs but is also responsible for doing any pre-processing (such as ID transformations, antivirus checks, etc.) before making those calls. One instance of a single RemoteService is started by the RestServer.

In Ambry, this layer is usually a stateless singleton, i.e., it does not maintain state or context about the requests flowing through it. It is also responsible for pre-processing responses since responses arrive as callbacks from the Router library. Pre-processing usually involves setting response headers - the actual bytes are streamed out in the RestResponseHandler.

Scaling Layer

This layer is the core of the non-blocking front end. It enforces the non-blocking paradigm and acts as a conduit for data flowing between the remote service layer and the NIO layer. The framework consists of:

RestRequestHandler - This is the component that handles requests submitted by the NIO layer and hands them off to the remote service layer. Internally, it can maintain a number of scaling units that can be scaled independently of all other components. The number of scaling units has a direct impact on throughput and latency.

RestResponseHandler - This is the component that handles responses submitted by the remote service layer and streams the bytes to the network via the NIO layer. Internally, it can maintain a number of scaling units that can be scaled independently of all other components. The number of scaling units has a direct impact on throughput and latency.

AsyncRequestResponseHandler

AsyncRequestResponseHandler is an implementation of both RestRequestHandler and RestResponseHandler. It processes both requests and responses asynchronously. Requests are handled using one or more scaling units called AsyncRequestWorker. Due to the asynchronous nature of ReadableStreamChannel, response handling does not need scaling units. In order to process requests and responses, the following state is maintained:

  • Requests that are waiting to be processed (Request queue) - This is a queue of requests that are awaiting processing. Requests are enqueued by the NIO layer and dequeued and processed using the remote service layer.
  • Responses waiting to be sent out (Response set) - This is a set of responses that are ready to be streamed to the client. The responses are represented by a ReadableStreamChannel and will be sent over the provided RestResponseChannel. If an exception was provided, an appropriate error message is constructed and returned to the client.

The scaling units are CPU bound and perform all the CPU bound tasks.
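The request side of a scaling unit can be sketched as a queue drained by a dedicated thread. This is a minimal illustration and not AsyncRequestWorker itself: the class name SketchRequestWorker, the use of String to represent a request, and the Consumer standing in for the remote service layer are all assumptions made for this example.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical scaling unit: one thread draining one request queue.
class SketchRequestWorker implements Runnable {
  private final BlockingQueue<String> requests = new LinkedBlockingQueue<>();
  private final Consumer<String> remoteService;
  private volatile boolean running = true;

  SketchRequestWorker(Consumer<String> remoteService) {
    this.remoteService = remoteService;
  }

  // Called by the NIO layer; returns without waiting for processing.
  void submit(String request) {
    requests.add(request);
  }

  void shutdown() {
    running = false;
  }

  @Override
  public void run() {
    // Keep draining until shutdown is requested and the queue is empty.
    while (running || !requests.isEmpty()) {
      try {
        // Poll with a timeout so that the shutdown flag is re-checked periodically.
        String request = requests.poll(50, TimeUnit.MILLISECONDS);
        if (request != null) {
          remoteService.accept(request);
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }
}

class WorkerDemo {
  static List<String> run() {
    List<String> handled = new CopyOnWriteArrayList<>();
    CountDownLatch done = new CountDownLatch(3);
    SketchRequestWorker worker = new SketchRequestWorker(request -> {
      handled.add(request);
      done.countDown();
    });
    Thread thread = new Thread(worker, "sketch-request-worker");
    thread.start();
    worker.submit("GET /a");
    worker.submit("HEAD /a");
    worker.submit("DELETE /a");
    try {
      done.await(5, TimeUnit.SECONDS);
      worker.shutdown();
      thread.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return handled;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Throughput can then be adjusted by running more such workers, independently of how many threads the NIO layer uses.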

NIO Layer

The NIO layer is responsible for all network related operations including encoding/decoding HTTP. On the receiving side, the NIO framework is expected to provide a way to listen on a certain port for requests from clients, accept them, decode the HTTP data received and hand off this data to the scaling framework in a format that is agnostic to the NIO framework (RestRequest and RestResponseChannel). On the sending side, the NIO layer is expected to provide an implementation of RestResponseChannel to return processed responses back to the client.

The NIO layer also needs to maintain some state. For the layer as a whole, it needs to maintain the instance of RestRequestHandler that can be used for all channels and all requests. In addition, each channel might have to maintain some per-request state:

  • The RestRequest that it is currently processing (required state per request) - This is required per request if content is expected since content will have to be added to the RestRequest.
  • The RestResponseChannel (required state per request) - This has to be maintained per request since the RestResponseChannel has to be informed of any errors during NIO layer processing.

Component Interaction

The following sections describe how components interact with each other to execute operations.

Operation Execution

Much of this section uses Ambry and its Router library as a means of presenting the design. It should be easy to draw parallels and design any RemoteService that might need to be implemented. Some parts of the design and functionality of AsyncRequestResponseHandler are also presented and assumed to be in use.

Common operations

  • Receiving requests

When a request is received, the NIO layer first packages its own representation of an HTTP request into an implementation of RestRequest (that the NIO layer provides). It passes this RestRequest along with a RestResponseChannel (that can be used to return a response to the request) to the RestRequestHandler. The request is then enqueued to be handled asynchronously at the RestRequestHandler.

  • Receiving content

In GET, DELETE and HEAD requests, no valid content is expected. In a POST request, we expect content with the request. Any content received is added by the NIO layer to the RestRequest. Since the implementation of RestRequest is provided by the NIO layer, this can be done internally without involving the scaling layer. This content should be available for reading (at the remote service library) through the read operations of ReadableStreamChannel. Exceptions are thrown in case valid state transitions are not respected.
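The content handling and state checks described above can be sketched as follows. The class SketchRequest, its three states and its method names are hypothetical and chosen only for illustration; they are not the actual RestRequest implementation. The NIO layer adds chunks, the reader polls them, and an invalid transition (adding content after the last chunk or after close) raises an exception.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical request that accepts content until the last chunk arrives.
class SketchRequest {
  private enum State { RECEIVING, COMPLETE, CLOSED }

  private final Queue<ByteBuffer> content = new ArrayDeque<>();
  private State state = State.RECEIVING;

  // Called by the NIO layer for every content chunk that it decodes.
  synchronized void addContent(ByteBuffer chunk, boolean isLast) {
    if (state != State.RECEIVING) {
      throw new IllegalStateException("Cannot add content in state " + state);
    }
    content.add(chunk);
    if (isLast) {
      state = State.COMPLETE;
    }
  }

  // Called by the reader; returns null when no chunk is currently buffered.
  synchronized ByteBuffer nextChunk() {
    if (state == State.CLOSED) {
      throw new IllegalStateException("Request is closed");
    }
    return content.poll();
  }

  synchronized void close() {
    state = State.CLOSED;
    content.clear();
  }
}

class ContentStateDemo {
  static boolean rejectsContentAfterLastChunk() {
    SketchRequest request = new SketchRequest();
    request.addContent(ByteBuffer.wrap(new byte[] {1, 2}), false);
    request.addContent(ByteBuffer.wrap(new byte[] {3}), true);
    try {
      // The last chunk has already been seen, so this transition is invalid.
      request.addContent(ByteBuffer.wrap(new byte[] {4}), false);
      return false;
    } catch (IllegalStateException expected) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(rejectsContentAfterLastChunk());
  }
}
```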

  • Dequeuing requests inside the AsyncRequestResponseHandler

Every request submitted to the AsyncRequestResponseHandler is handed off to an AsyncRequestWorker. The AsyncRequestWorker has a thread that regularly dequeues RestRequests from the request queue in order to process them. The handling of a dequeued request depends on the type of request.

GET

  • Handling dequeued requests at the Remote Service (AmbryBlobStorageService)

For handleGet, AmbryBlobStorageService extracts the blob ID (and sub-resource) from the request, interacts with any required external services and does pre-processing of request data if required (all this is non-blocking).

For a GET request, we require both blob properties (to update headers) and the content of the blob. To this end, we create a Callback object for a getBlobInfo call first. This Callback object contains a function that needs to be called on operation completion and also encapsulates all the details required to make a subsequent getBlob call. The getBlobInfo method of the Router is then called with the blob ID and Callback.

public interface Callback<T> {  
  public void onCompletion(T result, Exception exception);  
}  
  • On getBlobInfo callback received

When the getBlobInfo callback is received, the response headers are populated. The Callback invokes the getBlob method of the Router with the blob ID and a new Callback that encapsulates all the information required to send a response:

public class HeadForGetCallback implements Callback<BlobInfo> {
  private final RestResponseHandler restResponseHandler;
  private final RestResponseChannel restResponseChannel;
  private final RestRequest restRequest;
  private final Router router;

  public HeadForGetCallback(RestResponseHandler restResponseHandler, RestResponseChannel restResponseChannel, RestRequest restRequest, Router router) {
    this.restResponseHandler = restResponseHandler;
    this.restResponseChannel = restResponseChannel;
    this.restRequest = restRequest;
    this.router = router;
  }

  @Override
  public void onCompletion(BlobInfo result, Exception exception) {
    if (exception == null) {
      // update headers in RestResponseChannel.
      // get blob id from RestRequest.
      // create GetCallback.
      router.getBlob(blobId, getCallback);
    } else {
      // forward the failure so that an error response reaches the client.
      restResponseHandler.handleResponse(restRequest, restResponseChannel, null, exception);
    }
  }
}
  • Router

At the Router, a future that will eventually contain the result of any operations invoked is created and returned immediately to AmbryBlobStorageService. This ensures that the thread of the AsyncRequestWorker is not blocked. For getBlobInfo, the result is a BlobInfo object and for getBlob, the result is a ReadableStreamChannel representing blob data.

The getBlobInfo callback is invoked with a BlobInfo when both the blob properties and user metadata are available. The getBlob callback is invoked with a ReadableStreamChannel representing blob data when at least one byte of the blob is available. In both cases, if there was an exception while executing the request, the Router invokes the callback with the exception that caused the request to fail.
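The combination of an immediately returned future and a completion callback can be sketched as follows. SketchRouter and OperationCallback are hypothetical names, a String stands in for BlobInfo, and the single-threaded executor merely simulates work that finishes later; none of this is the actual Router implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Mirrors the Callback shape used in this document.
interface OperationCallback<T> {
  void onCompletion(T result, Exception exception);
}

// Hypothetical router: the method returns at once and the work completes later.
class SketchRouter {
  private final ExecutorService executor = Executors.newSingleThreadExecutor();

  Future<String> getBlobInfo(String blobId, OperationCallback<String> callback) {
    CompletableFuture<String> future = new CompletableFuture<>();
    executor.execute(() -> {
      if (blobId.isEmpty()) {
        Exception failure = new IllegalArgumentException("empty blob ID");
        future.completeExceptionally(failure);
        callback.onCompletion(null, failure); // failures also arrive via the callback
      } else {
        String info = "info-for-" + blobId; // stands in for a BlobInfo
        future.complete(info);
        callback.onCompletion(info, null);
      }
    });
    return future; // the caller's thread is not blocked here
  }

  void close() {
    executor.shutdown();
  }
}

class RouterDemo {
  static String run(String blobId) {
    SketchRouter router = new SketchRouter();
    CompletableFuture<String> seenByCallback = new CompletableFuture<>();
    try {
      Future<String> future = router.getBlobInfo(blobId, (result, exception) ->
          seenByCallback.complete(exception == null ? result : "error: " + exception.getMessage()));
      // Waiting here is only for the demo; the front end reacts in the callback instead.
      String fromCallback = seenByCallback.get(5, TimeUnit.SECONDS);
      return fromCallback + " done=" + future.isDone();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      router.close();
    }
  }

  public static void main(String[] args) {
    System.out.println(run("abc")); // prints "info-for-abc done=true"
    System.out.println(run(""));    // prints "error: empty blob ID done=true"
  }
}
```

The caller can ignore the returned future entirely and rely on the callback, which is the style the callbacks in this document follow.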

  • On getBlob callback received

When the getBlob callback is received, any necessary headers are updated and the response is submitted to the RestResponseHandler (AsyncRequestResponseHandler). The ReadableStreamChannel - RestResponseChannel pair is added to the response set and the response reading is initiated (which is asynchronous because of the design of ReadableStreamChannel). Once the response reading is complete (which is known via the callback), all remaining state can be cleaned up.

public class GetCallback implements Callback<ReadableStreamChannel> {
  private final RestResponseHandler restResponseHandler;
  private final RestResponseChannel restResponseChannel;
  private final RestRequest restRequest;

  public GetCallback(RestResponseHandler restResponseHandler, RestResponseChannel restResponseChannel, RestRequest restRequest) {
    this.restResponseHandler = restResponseHandler;
    this.restResponseChannel = restResponseChannel;
    this.restRequest = restRequest;
  }

  @Override
  public void onCompletion(ReadableStreamChannel result, Exception exception) {
    // update headers if required.
    restResponseHandler.handleResponse(restRequest, restResponseChannel, result, exception);
  }
}

POST

  • Handling dequeued requests at the Remote Service (AmbryBlobStorageService)

For handlePost, AmbryBlobStorageService extracts the blob properties and user metadata from the request (headers). It also interacts with any required external services and does pre-processing of request data if required (all this is non-blocking). Further, it creates a Callback object for a putBlob call that contains a function that needs to be called on operation completion and also encapsulates all the information required to send a response.

The putBlob method in the Router is then invoked with blob properties, user metadata, a ReadableStreamChannel representing the data to be POSTed (this is the RestRequest itself) and the Callback.

  • Router

At the Router, the putBlob operation will return a Future of String, that will eventually contain the blob ID, immediately to AmbryBlobStorageService. This ensures that the thread of the AsyncRequestWorker is not blocked. The putBlob callback is invoked with a blob ID when the put is complete. If there was an exception while executing the request, the Router invokes the callback with the exception that caused the request to fail.

  • On putBlob callback received

When the putBlob callback is received, the headers are updated to include the blob ID as a part of the response (or the error thrown is transmitted) and the response is submitted to the RestResponseHandler, as in the following PostCallback:

public class PostCallback implements Callback<String> {
  private final RestResponseHandler restResponseHandler;
  private final RestResponseChannel restResponseChannel;
  private final RestRequest restRequest;

  public PostCallback(RestResponseHandler restResponseHandler, RestResponseChannel restResponseChannel, RestRequest restRequest) {
    this.restResponseHandler = restResponseHandler;
    this.restResponseChannel = restResponseChannel;
    this.restRequest = restRequest;
  }

  @Override
  public void onCompletion(String result, Exception exception) {
    if (result != null && exception == null) {
      // set blob ID as a response header.
    }
    restResponseHandler.handleResponse(restRequest, restResponseChannel, null, exception);
  }
}
  • Reading data at the Router and applying back pressure

The ReadableStreamChannel (represented by the RestRequest) given to the Router is read on demand by the Router. The Router will not "pull" unless it is ready to receive more data and this translates to a form of back pressure on the front end. Since the implementation of the RestRequest is provided by the NIO layer, it falls upon implementers of the NIO layer to transmit this back pressure through the network to the client.

In the Netty implementation, this is achieved through a two-step process. In the first step, when the POST request is received, we switch off auto read on the channel (auto read is on by default) after receiving a fixed number of content chunks. Once auto read is switched off, reading from the channel occurs on demand (switching off auto read transmits back pressure through the network protocol, TCP, to the client). In the second step, we couple the callbacks received from the AsyncWritableChannel provided by the Router to the on-demand read from the channel, i.e., on every write callback, we determine whether we are ready to pull more content from the channel and call channel.read() accordingly.
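The two steps can be illustrated with a small simulation that does not depend on Netty. SketchChannel and ChunkHandler are hypothetical classes written for this example; in Netty itself the corresponding switches are ChannelConfig.setAutoRead(false) and Channel.read(). The simulated channel delivers chunks eagerly until the handler has buffered a fixed number of them, after which one more chunk is pulled per completed write.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a network channel with an auto-read switch.
class SketchChannel {
  private final Deque<String> network = new ArrayDeque<>();
  private boolean autoRead = true;
  private ChunkHandler handler;

  SketchChannel(int chunkCount) {
    for (int i = 0; i < chunkCount; i++) {
      network.add("chunk-" + i);
    }
  }

  void setHandler(ChunkHandler handler) {
    this.handler = handler;
  }

  void setAutoRead(boolean autoRead) {
    this.autoRead = autoRead;
  }

  // Delivers one chunk if available and keeps going while auto read is on.
  void read() {
    do {
      String chunk = network.poll();
      if (chunk == null) {
        return;
      }
      handler.onChunk(chunk);
    } while (autoRead);
  }
}

// Buffers at most maxBuffered chunks and pulls more only when one is consumed.
class ChunkHandler {
  private final SketchChannel channel;
  private final int maxBuffered;
  private final Deque<String> buffered = new ArrayDeque<>();
  int peakBuffered = 0;
  int consumed = 0;

  ChunkHandler(SketchChannel channel, int maxBuffered) {
    this.channel = channel;
    this.maxBuffered = maxBuffered;
  }

  void onChunk(String chunk) {
    buffered.add(chunk);
    peakBuffered = Math.max(peakBuffered, buffered.size());
    if (buffered.size() >= maxBuffered) {
      channel.setAutoRead(false); // step 1: stop reading eagerly
    }
  }

  // Step 2: invoked when the consumer reports that a chunk was written.
  void onWriteComplete() {
    buffered.poll();
    consumed++;
    if (buffered.size() < maxBuffered) {
      channel.read(); // pull one more chunk on demand
    }
  }

  boolean hasBuffered() {
    return !buffered.isEmpty();
  }
}

class BackPressureDemo {
  static String run() {
    SketchChannel channel = new SketchChannel(10);
    ChunkHandler handler = new ChunkHandler(channel, 3);
    channel.setHandler(handler);
    channel.read(); // auto read delivers chunks until the handler switches it off
    while (handler.hasBuffered()) {
      handler.onWriteComplete(); // the consumer finished writing one chunk
    }
    return "consumed=" + handler.consumed + " peak=" + handler.peakBuffered;
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints "consumed=10 peak=3"
  }
}
```

All ten chunks are consumed, yet the buffer never holds more than three, which is the memory bound that back pressure is meant to provide.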

  • Decoding HTTP POST data

HTTP POST data might need to be decoded (in case of multipart data) and this is a compute-heavy operation. This is done in the context of the thread inside AsyncRequestWorker via a call to RestRequest.prepare().

DELETE

  • Handling dequeued requests at the Remote Service (AmbryBlobStorageService)

For handleDelete, AmbryBlobStorageService extracts the blob ID from the request. It also interacts with any required external services and does pre-processing of request data if required (all this is non-blocking). Further, it creates a Callback object for a deleteBlob call that contains a function that needs to be called on operation completion and also encapsulates all the information required to send a response. The deleteBlob method of the Router is then called with the blob ID and Callback.

  • Router

At the Router, the deleteBlob operation will return a Future of Void immediately to AmbryBlobStorageService. This ensures that the thread of the AsyncRequestWorker is not blocked. The deleteBlob callback is invoked with a null result when the delete is complete. If there was an exception while executing the request, the Router invokes the callback with the exception that caused the request to fail.

  • On deleteBlob callback received

When the deleteBlob callback is received, the headers are updated to indicate that the delete was accepted (or the error thrown is transmitted) and the response is submitted to the RestResponseHandler, as in the following DeleteCallback:

public class DeleteCallback implements Callback<Void> {
  private final RestResponseHandler restResponseHandler;
  private final RestResponseChannel restResponseChannel;
  private final RestRequest restRequest;

  public DeleteCallback(RestResponseHandler restResponseHandler, RestResponseChannel restResponseChannel, RestRequest restRequest) {
    this.restResponseHandler = restResponseHandler;
    this.restResponseChannel = restResponseChannel;
    this.restRequest = restRequest;
  }

  @Override
  public void onCompletion(Void result, Exception exception) {
    if (exception == null) {
      // set response status to ACCEPTED.
    }
    restResponseHandler.handleResponse(restRequest, restResponseChannel, null, exception);
  }
}

HEAD

  • Handling dequeued requests at the Remote Service (AmbryBlobStorageService)

For handleHead, AmbryBlobStorageService extracts the blob ID from the request. It also interacts with any required external services and does pre-processing of request data if required (all this is non-blocking). Further, it creates a Callback object for a getBlobInfo call that contains a function that needs to be called on operation completion and also encapsulates all the information required to send a response. The getBlobInfo method of the Router is then called with the blob ID and Callback.

  • Router

At the Router, the getBlobInfo operation will return a Future of BlobInfo immediately to AmbryBlobStorageService. This ensures that the thread of the AsyncRequestWorker is not blocked. The getBlobInfo callback is invoked with a BlobInfo when both the blob properties and user metadata are available. If there was an exception while executing the request, the Router invokes the callback with the exception that caused the request to fail.

  • On getBlobInfo callback received

When the getBlobInfo callback is received, the headers are populated with the properties and user metadata returned (or the error thrown is transmitted) and the response is submitted to the RestResponseHandler, as in the following HeadCallback:

public class HeadCallback implements Callback<BlobInfo> {
  private final RestResponseHandler restResponseHandler;
  private final RestResponseChannel restResponseChannel;
  private final RestRequest restRequest;

  public HeadCallback(RestResponseHandler restResponseHandler, RestResponseChannel restResponseChannel, RestRequest restRequest) {
    this.restResponseHandler = restResponseHandler;
    this.restResponseChannel = restResponseChannel;
    this.restRequest = restRequest;
  }

  @Override
  public void onCompletion(BlobInfo result, Exception exception) {
    if (result != null && exception == null) {
      // set response headers.
    }
    restResponseHandler.handleResponse(restRequest, restResponseChannel, null, exception);
  }
}