Serge Bóinn edited this page Jul 8, 2016 · 4 revisions

By now you understand serializers and deserializers, and you understand how to achieve concurrency using multiple connections. So now it's time to learn about YapDatabasePolicy rules, and how you can use them to achieve an extra performance boost.


Understanding object policy

Consider the following common scenario:

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath
{
    __block IMMessage *message = nil;
    
    // Access IMMessage on main thread (via uiConnection)
    [uiConnection readWithBlock:^(YapDatabaseReadTransaction *transaction){
        message = [[transaction ext:@"view"] objectAtIndexPath:indexPath withMappings:mappings];
    }];
    
    // configure and return cell...
}

- (void)didReceiveDeliveryReceiptForMessageId:(NSString *)messageId
{
    // Update IMMessage on background thread (via bgConnection)
    [bgConnection readWriteWithBlock:^(YapDatabaseReadWriteTransaction *transaction){
    
        IMMessage *message = [transaction objectForKey:messageId inCollection:@"messages"];
        
        // As per the recommended best practice, we copy the object before modifying it.
        // (-copy only requires NSCopying; -mutableCopy would require NSMutableCopying.)
        message = [message copy];
        message.delivered = YES;

        [transaction setObject:message forKey:messageId inCollection:@"messages"];
    }];
}

In the example code above, an object is being updated on 'bgConnection', but it will immediately be needed on the 'uiConnection' in order to redraw the corresponding cell in the tableView. As you might imagine, this is a very common scenario.

So what exactly happens? That is, how does the 'IMMessage' object go from 'bgConnection' to 'uiConnection'?

The exact answer depends upon how YapDatabase is configured. Specifically, it depends on what objectPolicy is set. If you look in YapDatabaseConnection.h you'll see this:

typedef enum {
	YapDatabasePolicyContainment = 0,
	YapDatabasePolicyShare       = 1,
	YapDatabasePolicyCopy        = 2,
} YapDatabasePolicy;

/**
 * YapDatabase can use various optimizations to reduce overhead and memory footprint.
 * The policy properties allow you to opt in to these optimizations when ready.
 * 
 * The default value is YapDatabasePolicyContainment.
 * It is the slowest, but also the safest policy.
 * The other policies are faster, but require a little more work, and little deeper understanding.
**/
@property (atomic, assign, readwrite) YapDatabasePolicy objectPolicy;
@property (atomic, assign, readwrite) YapDatabasePolicy metadataPolicy;

So if the databaseConnections are using the default objectPolicy (YapDatabasePolicyContainment), then here's what happens:

  • The 'bgConnection' serializes the object and writes it to disk
  • The 'uiConnection' then reads the serialized data from disk, and deserializes it

But YapDatabase can do better. You can do better. We can be much more efficient than this.


Copying objects between connections

The default policy, YapDatabasePolicyContainment, confines every object to the connection that fetched it. In this respect, it works conceptually like Core Data, where every NSManagedObject is tied to its NSManagedObjectContext.

An alternative to this is YapDatabasePolicyCopy. Here's how the above example plays out under YapDatabasePolicyCopy:

  • The 'bgConnection' serializes the object and writes it to disk (same as before)
  • The 'uiConnection' receives a copy of the object ([object copy]) automatically
  • The 'uiConnection' does NOT need to re-read it from disk (or deserialize it) if the object was previously in the cache

When a read-write transaction completes, a change-set is posted internally to the other connections. This allows other connections to automatically update their cache. This is how connections manage to keep their cache in-sync as they move from commit to commit.

So if we use YapDatabasePolicyCopy, then we can move objects from connection to connection without forcing a re-read from disk.

There is still the memory overhead of having multiple copies of the object, one per connection. And if the object doesn't support the NSCopying protocol, then it falls back to YapDatabasePolicyContainment for that particular object. But the disk overhead is reduced, and that's the biggest bottleneck.
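
For reference, here is a minimal sketch of what "supporting the NSCopying protocol" could look like for the IMMessage class from the example. The messageId and text properties are assumptions made for illustration; only delivered appears in the example above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical model class. Only `delivered` comes from the example above;
// the other properties are assumed for illustration.
@interface IMMessage : NSObject <NSCopying>
@property (nonatomic, copy) NSString *messageId; // assumed field
@property (nonatomic, copy) NSString *text;      // assumed field
@property (nonatomic, assign) BOOL delivered;
@end

@implementation IMMessage

// Invoked by -copy. Returns an independent instance carrying the same values.
- (id)copyWithZone:(NSZone *)zone
{
    IMMessage *copy = [[[self class] allocWithZone:zone] init];
    copy.messageId = self.messageId;
    copy.text = self.text;
    copy.delivered = self.delivered;
    return copy;
}

@end
```

With something like this in place, [message copy] yields a separate instance, so a change to the copy does not affect the original.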

Opting in

The policy settings can be configured for objects and metadata separately. And they can be configured for each connection separately.

The easiest way to configure your policy is to set it as the default policy when initializing your database. That way all connections inherit the correct policy automatically when they're created:

YapDatabase *database = [[YapDatabase alloc] initWithPath:databasePath];
database.defaultObjectPolicy = YapDatabasePolicyCopy;
database.defaultMetadataPolicy = YapDatabasePolicyCopy;

// All connections going forward will default to YapDatabasePolicyCopy when created.
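
If you would rather configure a single connection, a sketch along these lines should work. It uses the objectPolicy and metadataPolicy properties shown earlier. The helper name MakeUIConnection and the framework-style import are assumptions; adjust the import to match how YapDatabase is integrated in your project.

```objc
#import <YapDatabase/YapDatabase.h> // assumed import path

// Hypothetical helper: creates a connection and overrides its policies,
// regardless of the defaults set on the database.
static YapDatabaseConnection *MakeUIConnection(YapDatabase *database)
{
    YapDatabaseConnection *connection = [database newConnection];

    // Objects and metadata are configured independently.
    connection.objectPolicy   = YapDatabasePolicyCopy;
    connection.metadataPolicy = YapDatabasePolicyCopy;

    return connection;
}
```

Other connections created from the same database are unaffected and keep whatever defaults they inherited.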

Sharing objects between connections

You may have noticed in the example above that the code is copying objects before modifying them. This is the recommended practice from the Performance Primer article.

That is, other threads/connections have promised to never directly modify an object received from the database. Instead they promise to first make copies, and then modify the copy before writing it back to the database. Thus the updated object, once saved to the database, is thread-safe.

If this promise is kept (or better yet, if there are cool tools to enforce this promise), then YapDatabase can give itself an additional performance boost by directly passing the updated object to other connections. No copies, and no re-reading from disk if the object was already in the connection's cache. Thus multiple connections can share a single instance of this object. Reduced disk IO & reduced memory footprint!

This is how YapDatabasePolicyShare works.

Continuing our example, here's how it plays out under YapDatabasePolicyShare:

  • The 'bgConnection' serializes the object and writes it to disk (same as before)
  • The 'uiConnection' receives the object automatically
  • The 'uiConnection' does NOT need to re-read it from disk (or deserialize it) if the object was previously in the cache

This is similar to the copy policy, but without the overhead of making copies, or the memory cost of having multiple copies in RAM.

I'm still a little confused. What do you mean by "sharing objects between connections"? Can you give an example?

Say there are 2 connections: connectionA and connectionB.

Both connectionA and connectionB have their own separate caches. Say both connections have, in their cache, the object for key @"abc123". Then connectionA executes a read-write transaction, makes a copy of that object, modifies it, and saves it back into the database.

When connectionB updates to this latest commit, it internally processes a "change-set" from connectionA, and in doing so updates the contents of its own cache. Specifically, it replaces the object for key @"abc123" with the object from the "change-set" that was passed from connectionA. Thus both connectionA and connectionB hold, in their separate caches, a reference to the same object instance in memory for key @"abc123".
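
To make "the same object instance" concrete, here is a toy model of that scenario using plain Foundation collections. The dictionaries merely stand in for the two caches and the change-set. This is not how YapDatabase is implemented internally, and the ToyFortune class and function name are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for a database object.
@interface ToyFortune : NSObject
@property (nonatomic, copy) NSString *text;
@end

@implementation ToyFortune
@end

// Returns YES if, after "connectionB" applies the change-set,
// both caches reference one and the same object in memory.
static BOOL CachesShareOneInstance(void)
{
    NSMutableDictionary *cacheA = [NSMutableDictionary dictionary];
    NSMutableDictionary *cacheB = [NSMutableDictionary dictionary];

    // connectionA saves an updated object and keeps it in its cache.
    ToyFortune *updated = [[ToyFortune alloc] init];
    updated.text = @"You will ship on time";
    cacheA[@"abc123"] = updated;

    // The change-set carries a reference to that object (no copy is made).
    NSDictionary *changeset = @{ @"abc123" : updated };

    // connectionB applies the change-set to its own cache.
    [changeset enumerateKeysAndObjectsUsingBlock:^(id key, id obj, BOOL *stop) {
        cacheB[key] = obj;
    }];

    // Pointer comparison, not -isEqual:
    return cacheA[@"abc123"] == cacheB[@"abc123"];
}
```

Under the copy policy, the equivalent step would store [obj copy] instead, so the two caches would end up with separate instances.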

Of course, this policy can be a little "dangerous" if you start breaking "promises". Luckily YapDatabase has tools to help you keep your promise. Break your promise and it throws an exception.


Enforcing immutability using the sanitizer

As noted, the recommended best practice is to make copies of database objects before modifying them. There are multiple reasons this is consistently recommended:

  • it's straightforward and easy to do
  • it allows all your database objects to be safely passed among threads
  • it allows your database objects to be shared between connections
  • it's enforceable using a few tricks

In addition to having a serializer & deserializer, YapDatabase supports a sanitizer:

typedef id (^YapDatabaseSanitizer)(NSString *collection, NSString *key, id object);

- (id)initWithPath:(NSString *)path
        serializer:(YapDatabaseSerializer)serializer
      deserializer:(YapDatabaseDeserializer)deserializer
         sanitizer:(YapDatabaseSanitizer)sanitizer;

Whenever you invoke setObject:forKey:inCollection:, the sanitizer will automatically be run on the given object, before it enters the database system.

For a simple example, assume all objects you put into the database are NSStrings. For example:

[databaseConnection readWriteWithBlock:^(YapDatabaseReadWriteTransaction *transaction){

    NSString *fortune = [transaction objectForKey:fortuneCookieId inCollection:@"fortunes"];
    
    NSMutableString *updatedFortune = [fortune mutableCopy];
    [updatedFortune appendString:@" in bed"];

    [transaction setObject:updatedFortune forKey:fortuneCookieId inCollection:@"fortunes"];
}];

You'll notice that we're actually storing an NSMutableString. We could use the sanitizer to ensure that all string objects we store into the database are immutable. This ensures that we won't later forget about thread-safety, and start casting and mutating the string.

YapDatabaseSanitizer sanitizer = ^(NSString *collection, NSString *key, id object){
    
    if ([object isKindOfClass:[NSString class]])
        return [object copy]; // ensure NSString (immutable), not NSMutableString
    else
        return object;
};
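
A sanitizer like this only takes effect once it is handed to the database. Here is a sketch of that wiring, using the initializer shown above. The helper name OpenDatabase is hypothetical, and the use of the defaultSerializer and defaultDeserializer class methods assumes your YapDatabase version provides them; substitute your own serializer and deserializer blocks otherwise.

```objc
#import <YapDatabase/YapDatabase.h> // assumed import path

// Hypothetical helper: opens the database with a custom sanitizer.
static YapDatabase *OpenDatabase(NSString *databasePath, YapDatabaseSanitizer sanitizer)
{
    // Assumption: the class-level default (de)serializers are available.
    return [[YapDatabase alloc] initWithPath:databasePath
                                  serializer:[YapDatabase defaultSerializer]
                                deserializer:[YapDatabase defaultDeserializer]
                                   sanitizer:sanitizer];
}
```

From then on, every object passed to setObject:forKey:inCollection: on any connection of that database goes through the sanitizer first.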

Of course, this was a simple example. In real life, you will likely be storing your own custom objects. And it may be overkill to have separate mutable vs immutable versions! Not to mention the extra work involved. All we really want to do is mark an object as immutable.

Luckily, there are tricks you can use to mark your custom objects as immutable!

To find out how, check out the MyDatabaseObject wiki page.
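
As a rough illustration of what "marking an object as immutable" can mean, consider the sketch below. It is a hypothetical, hand-rolled class, not the MyDatabaseObject implementation: once sealed, the setter throws, and a copy is the only way to get a modifiable instance again.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class for illustration only.
@interface SealableMessage : NSObject <NSCopying>
@property (nonatomic, assign) BOOL delivered;
@property (nonatomic, readonly, getter=isImmutable) BOOL immutable;
- (void)makeImmutable;
@end

@implementation SealableMessage

- (void)makeImmutable
{
    _immutable = YES;
}

// Custom setter: refuses to change a sealed object.
- (void)setDelivered:(BOOL)delivered
{
    if (_immutable) {
        [NSException raise:NSInternalInconsistencyException
                    format:@"Attempt to mutate an immutable object. Copy it first."];
    }
    _delivered = delivered;
}

// A copy starts out unsealed, so it can be modified and saved back.
- (id)copyWithZone:(NSZone *)zone
{
    SealableMessage *copy = [[[self class] allocWithZone:zone] init];
    copy.delivered = self.delivered;
    return copy;
}

@end
```

One conceivable use is to seal each object as it enters the database, so that any later attempt to modify a shared instance without copying it fails loudly instead of silently corrupting state seen by other connections.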