Skip to content

Proposal: RtpsDiscovery Refactoring

Justin Wilson edited this page Nov 20, 2023 · 3 revisions

Motivation

  • Sedp is the largest source file by lines.

  • tsan reports dataraces.

  • SPDP and SEDP have taken on a number of additional responsibilities: security, XTypes, extended BITs, and ICE.

  • Current Organization RtpsDiscovery is the top-level class. RtpsDiscovery implements Discovery which is how the DCPS layer accesses it.

For each participant that is created in a domain, RtpsDiscovery creates an instance of Spdp. Spdp creates an instance of SpdpTransport for the SPDP protocol. It also creates an instance of Sedp for the SEDP protocol. Sedp creates an rtps_udp transport and attaches a number of TransportClients which represent the various writers and readers in Sedp.

The core of Spdp/Sedp is data structures that represent local/remote participants, publications, and subscriptions. There are also a number of auxiliary classes that are used to get around architectural issues including interacting with the ACE_Reactor and upcalls into user space.

Spdp and Sedp have the responsibility of managing the BITs for the participant. They also have the responsibility of managing the extra BITs, participating in ICE, and performing STUN connectivity checks with the RtpsRelay.

Goals

  • Make the code easier to understand, especially with respect to concurrency.
  • Make the code easier to test.
  • Reduce the number of “crutches” needed to overcome architectural deficiencies.

Out of Scope

  1. Changing the Discovery interface.
    1. While we might not want to change the Discovery interface, we have the option of adapting the existing discovery interface. The new interface would consists of a set of InternalTopics that represent local participants, publications, and subscriptions. The adaptor would convert the Discovery interface to writes on these topics. RtpsDiscovery and friends could then read these topics and take the appropriate actions, i.e., putting samples into the Sedp Writers and matching. The same approach could be used to represent remote participants, publications, and subscriptions. To complete the picture, the result of matching can be published to an internal topic and not reach all the way to the DataReaders and DataWriters. Again, one would need to introduce an adapter that routes that information to the necessary DataReader and/or DataWriter. The (hypothesized) advantage of doing this is that the discovery logic becomes completely isolated.
  2. Changing the TransportClient interface.
  3. Introducing RTPS writers and readers for ICE.
  4. Redoing configuration with internal DDS.
  5. Replacing extra BITs with internal DDS.

Proposal

  1. Change the interactions with ICE to use internal DDS.
    1. This is a pilot for changing other interfaces.
    2. Spdp will own the necessary readers and writers.
    3. Spdp will implement the necessary listeners.
  2. Introduce a new interface RtpsController.
    1. Spdp will implement RtpsController.
    2. Endpoints (TransportClients) will dispatch into RtpsController instead of Sedp.
    3. RtpsController will be mocked to facilitate unit testing.
  3. Update all of the entry points in Spdp.
    1. An entry point is a virtual method that is called asynchronously.
      1. These may be external, e.g., called via Discovery or TransportClient
      2. These may be internal, e.g., timer expires, SPDP message is received.
    2. An entry point should have the following form.
      1. acquire lock
      2. update state and schedule outcalls
      3. release lock
      4. execute “out calls” (not necessary if all outcalls are scheduled)
  4. Factor out the core data structures of SPDP/SEDP. These should be plain-old C++ classes. These class must not forward control. That is, the thread of control calling into one of these classes should not reach user space, the transport, etc. Candidates include DiscoveredParticipant, DiscoveredPublication, DiscoveredSubscription, etc.
  5. Factor out other reactive code like the SpdpTransport to dispatch to the RtpsController.
  6. Factor out I/O and (de)serialization.

Example: Receiving a publication

  1. Receiving a publication starts with the reception of an RTPS message by the transport.
  2. This is passed to a Reader (which is a TransportClient).
  3. The Reader will parse the message and perform any pre-processing.
  4. The Reader will then invoke the RtpsController (Spdp).
  5. The RtpsController will acquire the lock.
  6. The RtpsController will then update internal data structures and schedule “out calls”.
    1. Matching may result in associating or disassociating with local readers.
    2. Update the BITs can invoke listeners
    3. Executing these directly is not safe with the lock held.
  7. The RtpsController will then release the lock.
  8. The RtpsController will then execute “out calls”.
    1. This design would be used instead of scheduling.
    2. This design is not recommended but may be necessary for performance.

Example: Creating a subscription

  1. The user creates a DataReader.
  2. This call ultimately reaches the RtpsDiscovery instances and then Spdp.
  3. Spdp will acquire the lock.
  4. Spdp will then update the database of local readers.
  5. Spdp will write the subscription on the appropriate writers.
    1. The outcall should be scheduled but can probably be executed inline.
  6. Spdp will match the new reader against existing writers.
    1. This may be multi-step process that includes type look up (calls into transport) and calls into user space (on_subscription_match and BIT listeners).
    2. Calls into user space should be scheduled.
  7. Spdp will release the lock.

Testing

  1. Testing the “transformational” data structures that are the core of SPDP and SEDP should be straightforward. If these classes schedule “out calls” then suitable interfaces and mock must be created for testing. There should be no problem reaching 100% intentional unit test coverage.
  2. Testing pass throughs like the Readers is straight forward by mocking the RtpsController.
  3. Testing Spdp can be done my mocking the environment which includes local participants, publishers, subscribers, writers, readers, transport clients, and internal DDS entities. In general, the idea is that we should be able to see what messages Spdp would send in response to an action and what actions Spdp would take when receiving a message.
    1. It may be necessary/desirable to factor out low-level I/O and (de)serialization operations as these can be tested independently of the business logic of RTPS Discovery.

Development Plan

Refactor and Unit Test ICE (see Proposal: Refactor ICE)

ICE has three interfaces: Endpoint, AgentInfoListener, Agent. The goal is to eliminate Endpoint and AgentInfoListener. Most of the existing methods in Agent will be removed and replaced with getters for InternalTopics. The interfaces will be broken up into the following InternalTopics:

  • LocalAgentInfo

    • Type is AgentInfo
      • Key is username
      • May need to add member for guids that are serviced by the endpoint.
    • Written by endpoints (SPDP, SEDP, DATA)
      • When local address changes
      • When server reflexive address changes
      • When password changes
    • Read by discovery (SPDP, SEDP)
  • RemoteAgentInfo

    • Type is AgentInfo
      • Key is username
    • Written by discovery
    • Read by ICE Agent
  • IceRequest

    • Type

      struct IceRequest {
        // (key) Identifies the local endpoint.
        std::string local_username;
        // (key) Identifies the local entity.
        GUID_t local_guid;
        // (key) Identitifies the remote agent.
        std::string remote_username;  
        // (key) Identifies the remote entity.
        GUID_t remote_guid;
      };
      
    • Written by discovery

      • To start/stop ICE protocol
    • Read by ICE Agent

  • IceStatus

    • Type

      struct IceStatus {
        // (key) Identifies the local endpoint.
        std::string local_username;
        // (key) Identifies the local entity.
        GUID_t local_guid;
        // (key) Identitifies the remote agent.
        std::string remote_username;  
        // (key) Identifies the remote entity.
        GUID_t remote_guid;
        // The address of the remote.
        NetworkAddress remote_address;
        // Measured round-trip latency. 
        // TimeDuration latency; 
      };
      
    • Written by ICE Agent

      • Write when ICE is successful, new latency measurement.
      • Dispose/Unregister when ICE is stopped or fails
    • Read by Discovery to update ParticipantLocation and ConnectionRecord

    • Read by transport to use remote_address

  • SendStun

    • Type

      struct SendStun {
        // (key) Identifies the local endpoint that should send the message.
        std::string username;
        // The destination address.
        NetworkAddress address;
        // The STUN message to send.
        STUN::Message message;
      };
      
      • Infinite history.
      • Ideally, we would have publisher-side content filtering to handle the fanout.
        • We can consider multiple topics.
    • Written by ICE Agent to execute ICE protocol

    • Read by Endpoints

  • ReceiveStun

    • Type

      struct ReceiveStun {
        // Identifies the local endpoint that received the message.
        std::string username;
        // The local address that received the message.
        NetworkAddress local_address;
        // The remote address that sent the message. 
        NetworkAddress remote_address;
        // The STUN message that was received.
        STUN::Message message;
      };
      
      • No key
      • Infinite history.
    • Written by endpoints

    • Read by ICE Agent

After refactoring, we should be able to unit test the ICE Agent with high coverage.

Factor Out Association Records and Unit Test

These are “plain old C++ classes”.

Factor Out Discovered Participant, Publication, and Subscription and Unit Test

These are “plain old C++ classes”.

Factor Out LocalEndpoint, LocalParticipantMessage, LocalPublication, and LocalSubscription and Unit Test

These are “plain old C++ classes”.

Decide if a class called LocalParticipant should exist.

Decide what to do with LocalParticipantMessage.

Replace GuidPrefixWrapper

Comment from Son Dinh .

Fold type lookup login into progressive matching and association logic

The following classes are used in type lookup:

  • MatchingData
  • MatchingPair
  • TypeIdOrigSeqNumber

Progressive association logic was added to Sedp to not start association until both sides were (mostly) ready to do so. The type lookup logic should be converted.

Factor out SpdpTransport and Endpoint and Unit Test

SpdpTransport and Endpoint are concerned with sending and receiving RTPS messages. The goal is to factor out a layer for I/O, serialization, etc. while leaving the business logic in Spdp to make testing easier.

Notes on Spdp/Sedp interaction (as of 2022-10-07)

SPDP calls to SEDP

init and fini add a GUID to ignore list, check if a GUID is ignored ICE send a message through liveliness, secure participant discov, stateless/volatile associate, change, disassociate SEDP endpoints access the security handle registry, generate participant crypto handle testing/logging/stats support dynamic reconfiguration (*_now) access job queue and transport

SEDP calls to SPDP

get available builtin endpoints get the default locators get the crypto handle access the security plugins and handle registry, check if sec is enabled check if state is initialized, associated, shutting down get a DiscoveredParticipant, or get specific details like time discovered, flags, etc. get BIT subscriber get domain ID process inbound secure participant discov, stateless, or volatile msg get data payload for outbound secure participant discov ICE access the config object

Refactoring Suggestions

  • Objects and data that are used by both Sedp and Spdp, and that don’t themselves use either one, can be accessed through some common object (such as RtpsDiscovery or a newly created class for this purpose)
    • Domain ID
    • Ignore list
    • Security plugins and handle registry
    • Transport / Job Queue
    • BIT Subscriber
    • RtpsDiscoveryConfig
    • (probably more)
  • Remaining calls between the two (like associating Sedp) need to be evaluated to determine locking requirements. Hypothesis is that Sedp and Spdp do not need a common mutex and such calls can be made without holding the caller’s mutex. These calls can be direct or mediated by a common object or “scheduling” mechanism described above.
Clone this wiki locally