A draft design for iSCSI cluster #66

Open
carmark opened this issue Nov 6, 2017 · 0 comments

carmark commented Nov 6, 2017

Target

  • open source Scale-Out iSCSI cluster
  • fully implemented in user space with golang
  • ceph as the backend device
  • controlled by kubernetes

The gotgt™ cluster is an open source scale-out SAN solution designed for scalability and performance. It uses cloud-based technologies so that the storage cluster can be scaled up simply by adding nodes, at any time and without disrupting service.

The gotgt™ cluster includes a new iSCSI target implementation written in Go that runs in user space, which gives us more design freedom than traditional iSCSI targets such as SCST and LIO. The cluster can be packaged as Docker containers and controlled by Kubernetes, which should make it easy to install and operate.

gotgt™ is designed to provide a RESTful API for inspecting and operating resources such as LUs and targets. Easy-to-use graphical interfaces built on this API will let administrators without Linux expertise operate the cluster.

In gotgt™, an iSCSI disk can have many access paths, each identified by its own virtual IP address. These IP addresses are distributed across several storage nodes, and access to the same disk is load balanced across the nodes in a symmetric active/active fashion.

The gotgt™ cluster brings proven technologies that power many of today's largest clouds to the datacenter, with point-and-click interfaces. First, it has a self-developed iSCSI target written in Go that runs in user space, which is intended to provide better management interfaces and better performance. Second, it uses the Ceph storage engine. Ceph is a leading open source software-defined storage (SDS) solution that powers many private and public clouds, some with thousands of nodes. It is a fault-tolerant, self-healing and self-adapting system with no single point of failure. Third, it uses a distributed key-value store (such as Consul or etcd) for cloud-scale distributed metadata management.

Design details

[Diagram: iscsi_cluster]

In the above diagram, the gotgt™ cluster has four components:

  • Monitor
  • iSCSI target
  • Metadata storage
  • Storage devices (Ceph or others)

Monitor

The client (iSCSI initiator) first connects to the monitor component to learn which target will actually serve its login. The monitor service returns a login response that contains a redirect, and the client then logs in again to one or more scheduled iSCSI targets, depending on the user's settings.

Initial connection sequence of events:

  1. The initiator performs the discovery procedure by executing a
     discovery session.

  2. When the discovery session is used, the initiator opens a TCP
     connection to the discovery target portal, logs in and issues
     the "send targets" commands.  The target responds with a list
     of target names and their associated portals.  The initiator or
     user selects the portals associated with the specific target it
     is interested in establishing a session with.  The initiator
     terminates the discovery session and closes the associated TCP
     connection.

  3. Whichever discovery procedure is used, the initiator remembers
     the portals for this target as the "initial target portals".

  4. The initiator iterates through the initial target portals list
     until it succeeds in opening a TCP connection to one of them.

  5. The initiator then logs into the target, which may respond with
     a "target moved temporarily" redirect response, listing the
     redirect portal for the target.  The initiator remembers this
     as the "redirect portal."  The initiator then closes the TCP
     connection.

  6. The initiator then opens a TCP connection to the redirect
     portal and logs in.  The target accepts this login and the
     session proceeds to full feature phase.

  7. Data flow begins.
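
The redirect in step 5 can be sketched in Go as follows. This is a minimal sketch under stated assumptions, not the gotgt implementation: the `Portal` and `Monitor` types and the round-robin policy are hypothetical, since the design does not specify how the monitor schedules targets. The status values and the `TargetAddress` text key follow the iSCSI specification (RFC 7143).

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Login response status for a temporary redirect as defined by the iSCSI
// specification: Status-Class 0x01 (Redirection), Status-Detail 0x01
// (target moved temporarily).
const (
	statusClassRedirect  = 0x01
	statusDetailTempMove = 0x01
)

// Portal is a hypothetical description of one iSCSI target portal.
type Portal struct {
	Addr string // IP address or host name
	Port int    // TCP port, usually 3260
	TPGT int    // target portal group tag
}

// Monitor is a hypothetical scheduler that hands out portals in round-robin
// order. The real scheduling policy is an open design question.
type Monitor struct {
	mu      sync.Mutex
	portals []Portal
	next    int
}

// Redirect picks the next portal and returns the TargetAddress text key
// that a redirecting login response would carry.
func (m *Monitor) Redirect() (string, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if len(m.portals) == 0 {
		return "", errors.New("no target portals registered")
	}
	p := m.portals[m.next%len(m.portals)]
	m.next++
	return fmt.Sprintf("TargetAddress=%s:%d,%d", p.Addr, p.Port, p.TPGT), nil
}

func main() {
	// Placeholder addresses for two target nodes.
	m := &Monitor{portals: []Portal{
		{Addr: "10.0.0.11", Port: 3260, TPGT: 1},
		{Addr: "10.0.0.12", Port: 3260, TPGT: 1},
	}}
	for i := 0; i < 3; i++ {
		kv, err := m.Redirect()
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("status=%02x%02x %s\n", statusClassRedirect, statusDetailTempMove, kv)
	}
}
```

Because the redirect is temporary, the initiator keeps its initial target portals and returns to the monitor after a failure, which gives the monitor a chance to schedule a different node.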

Target node or interface failure sequence of events:

  1. The initiator has an iSCSI session established and TCP
     connection open to the redirect portal.  Full feature session
     in progress.  Data is flowing.

  2. The target fails.

  3. The initiator detects the failure of the TCP connection with
     the target.

  4. The initiator iterates through the list of initial target
     portals learned in the discovery process until it succeeds in
     opening a TCP connection to one of them.

  5. If the initiator succeeds in connecting to one of the initial
     target portals, it executes steps 5 and 6 in the "Initial
     connection sequence of events" section.

  6. If the initiator fails to connect to any of the initial target
     portals, it repeats steps 1 through 6 in the "Initial
     connection sequence of events" section.

  7. Data flow resumes.
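
Step 4 of the failure sequence, seen from the initiator side, might look like the sketch below. The function name `dialFirst`, the timeout and the portal addresses are assumptions made for illustration. A real initiator would continue with the login and redirect handling on the returned connection.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// dialFirst walks the list of initial target portals in order and returns
// the first TCP connection that opens, together with the portal address.
func dialFirst(portals []string, timeout time.Duration) (net.Conn, string, error) {
	for _, addr := range portals {
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err == nil {
			return conn, addr, nil
		}
	}
	return nil, "", errors.New("no initial target portal reachable")
}

func main() {
	// Placeholder addresses from the documentation range; replace with the
	// portals learned during discovery.
	portals := []string{"192.0.2.10:3260", "192.0.2.11:3260"}

	conn, addr, err := dialFirst(portals, 2*time.Second)
	if err != nil {
		// Corresponds to step 6: none of the portals answered, so the
		// initiator has to run discovery again.
		fmt.Println("rediscovery needed:", err)
		return
	}
	defer conn.Close()
	// Corresponds to step 5: log in and follow the redirect.
	fmt.Println("connected to", addr, "- next: login and follow redirect")
}
```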

iSCSI target

The iSCSI target component is based on the existing gotgt™ iSCSI target library project: https://github.com/gostor/gotgt.

Metadata storage

The metadata of the cluster and of the iSCSI resources (targets, LUs and TPGTs) is stored in a distributed key-value database, so that every service can read up-to-date data.

Either Consul or Etcd can be used as the metadata storage.
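
One possible way to lay out this metadata is sketched below. The key prefix `/gotgt/targets/`, the `TargetRecord` fields and the `KV` interface are all hypothetical. An in-memory map stands in for Consul or etcd so that the example stays self-contained, and a real deployment would back the same interface with the store's client library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// KV is a hypothetical minimal view of the metadata store.
type KV interface {
	Put(key string, value []byte)
	List(prefix string) map[string][]byte
}

// memKV is an in-memory stand-in for Consul or etcd.
type memKV map[string][]byte

func (m memKV) Put(key string, value []byte) { m[key] = value }

func (m memKV) List(prefix string) map[string][]byte {
	out := make(map[string][]byte)
	for k, v := range m {
		if strings.HasPrefix(k, prefix) {
			out[k] = v
		}
	}
	return out
}

// TargetRecord is a hypothetical metadata record for one iSCSI target.
type TargetRecord struct {
	IQN  string   `json:"iqn"`
	TPGT int      `json:"tpgt"`
	LUs  []string `json:"lus"`
	Node string   `json:"node"` // storage node currently serving the target
}

const targetPrefix = "/gotgt/targets/"

// saveTarget stores one record as JSON under its IQN.
func saveTarget(kv KV, t TargetRecord) error {
	b, err := json.Marshal(t)
	if err != nil {
		return err
	}
	kv.Put(targetPrefix+t.IQN, b)
	return nil
}

// listTargets decodes every record under the target prefix, sorted by IQN.
func listTargets(kv KV) ([]TargetRecord, error) {
	var out []TargetRecord
	for _, b := range kv.List(targetPrefix) {
		var t TargetRecord
		if err := json.Unmarshal(b, &t); err != nil {
			return nil, err
		}
		out = append(out, t)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].IQN < out[j].IQN })
	return out, nil
}

func main() {
	kv := memKV{}
	rec := TargetRecord{IQN: "iqn.2017-11.com.example:disk0", TPGT: 1, LUs: []string{"lu0"}, Node: "node-a"}
	if err := saveTarget(kv, rec); err != nil {
		fmt.Println("save failed:", err)
		return
	}
	targets, err := listTargets(kv)
	if err != nil {
		fmt.Println("list failed:", err)
		return
	}
	fmt.Printf("%+v\n", targets)
}
```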

Storage devices

This is the backend storage system. We prefer to use Ceph in this implementation.

Deployment scenario and its implied deliverables

A simple deployment scenario is similar to the diagram presented earlier in this document.

Client (iSCSI initiator) applications will work transparently with this clustered implementation of gotgt alongside other Docker [3] components, which allows the applications to move freely in a cloud environment.

This new gotgt cluster docker image provides multi-pathing and load balancing for the data transport and iSCSI sessions. In this way, the clustered iSCSI target library acts like an iSCSI transport gateway.

gotgt will include plug-in based interfaces for connecting to backend storage, including block storage such as Ceph. Simple file-based storage interfaces will be provided as well.
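
A plug-in style backend interface could be sketched as follows. The `Backend` interface, the `backends` registry and the helper names are hypothetical and do not describe the existing gotgt API. Only a file-based backend is shown, and a Ceph backend would register itself under another name in the same way.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// Backend is a hypothetical plug-in interface for a block device.
type Backend interface {
	io.ReaderAt
	io.WriterAt
	io.Closer
	Size() (int64, error)
}

// fileBackend stores the LU contents in a regular file.
type fileBackend struct{ f *os.File }

// openFileBackend opens or creates the file and sets it to the given size.
func openFileBackend(path string, size int64) (Backend, error) {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0600)
	if err != nil {
		return nil, err
	}
	if err := f.Truncate(size); err != nil {
		f.Close()
		return nil, err
	}
	return &fileBackend{f: f}, nil
}

func (b *fileBackend) ReadAt(p []byte, off int64) (int, error)  { return b.f.ReadAt(p, off) }
func (b *fileBackend) WriteAt(p []byte, off int64) (int, error) { return b.f.WriteAt(p, off) }
func (b *fileBackend) Close() error                             { return b.f.Close() }

func (b *fileBackend) Size() (int64, error) {
	fi, err := b.f.Stat()
	if err != nil {
		return 0, err
	}
	return fi.Size(), nil
}

// backends maps a plug-in name to its constructor.
var backends = map[string]func(path string, size int64) (Backend, error){
	"file": openFileBackend,
}

// roundTrip opens a 1 MiB backend of the given kind, writes data at off,
// reads it back, and removes the backing file afterwards.
func roundTrip(kind, path string, off int64, data []byte) ([]byte, error) {
	open, ok := backends[kind]
	if !ok {
		return nil, fmt.Errorf("unknown backend %q", kind)
	}
	b, err := open(path, 1<<20)
	if err != nil {
		return nil, err
	}
	defer os.Remove(path) // runs after Close
	defer b.Close()
	if _, err := b.WriteAt(data, off); err != nil {
		return nil, err
	}
	got := make([]byte, len(data))
	if _, err := b.ReadAt(got, off); err != nil {
		return nil, err
	}
	return got, nil
}

func main() {
	path := filepath.Join(os.TempDir(), "gotgt-demo.img")
	got, err := roundTrip("file", path, 4096, []byte("hello"))
	if err != nil {
		fmt.Println("backend error:", err)
		return
	}
	fmt.Printf("read back %q\n", got)
}
```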

Compatible, tested Docker images for the other components, such as Ceph (backend storage) and etcd (distributed key-value store), will be provided. For ease of management, load balancing and resilience (through replication), a Kubernetes image and example cluster configuration files for pods and services will also be provided.

In the longer term, an exported RESTful API will be defined so that a simple graphical interface can replace the text-based configuration files. A simple graphical configuration interface for iSCSI target cluster administration will also be provided.
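
As a rough illustration of what such an API could look like, the sketch below serves a read-only target list as JSON. The `/v1/targets` path, the `Target` fields and the handler name are hypothetical, because the API has not been defined yet. The example uses a test server from `net/http/httptest` so that it runs and terminates without a fixed port.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// Target is a hypothetical API representation of an iSCSI target.
type Target struct {
	Name string `json:"name"`
	LUs  []int  `json:"lus"`
}

// encodeTargets renders the target list as the JSON response body.
func encodeTargets(ts []Target) ([]byte, error) {
	return json.Marshal(ts)
}

// targetsHandler answers GET requests with the current target list.
func targetsHandler(ts []Target) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		body, err := encodeTargets(ts)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}
}

func main() {
	ts := []Target{{Name: "iqn.2017-11.com.example:disk0", LUs: []int{0, 1}}}

	mux := http.NewServeMux()
	mux.Handle("/v1/targets", targetsHandler(ts))

	// Temporary local server, used here only to exercise the handler.
	srv := httptest.NewServer(mux)
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/v1/targets")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(resp.Status, string(body))
}
```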

References

[1] https://tools.ietf.org/html/draft-gilligan-iscsi-fault-tolerance-00

[2] https://www.snia.org/sites/default/education/tutorials/2011/spring/networking/HufferdJohn-IP_Storage_Protocols-iSCSI.pdf

[3] https://www.docker.com
