
LongLivedReadTransactions

Robbie Hanson edited this page Nov 10, 2019 · 35 revisions

Transactions are great. Except when they're not.

From an abstract perspective, it's easy to think in terms of transactions. We execute a read-write transaction and make some changes to the database. And after the transaction is complete we know that future transactions will see the changes. But the problem is, we don't always think in terms of transactions. This is especially true on the main thread. Consider the following code:

override func tableView(_ tableView: UITableView, cellForRowAt indexPath: IndexPath) -> UITableViewCell {
    
  var onSaleItem: StoreItem? = nil
  dbConnection.read {(transaction) in
    if let ext = transaction.ext("view") as? YapDatabaseViewTransaction {
      onSaleItem = ext.object(atIndex: indexPath.row, inGroup: "sales") as? StoreItem
    }
  }
  
  // configure and return cell...
}

At first glance, this code looks correct. In fact, this is the natural and recommended way to write this code. But what about in terms of transactions? What happens if we execute a read-write transaction on a background thread and remove a bunch of sales items? And meanwhile the main thread is chugging away, populating the tableView or scrolling the tableView, and invoking the above dataSource method?

The answer is that things might get out of sync, at least temporarily. Views such as UITableView and UICollectionView require a stable data source. The underlying data needs to remain in a consistent state until the main thread is able to update the view and its underlying data in a synchronized fashion. And if the UI is busy, this may be several run loop cycles away.

This is where long-lived read-only transactions come in.

A read-only transaction represents an immutable snapshot-in-time of the database. But the block-based transaction architecture limits the duration of the transaction, and thus limits the duration of the snapshot. Long-lived transactions allow you to "bookmark" a snapshot of the database, and ensure that all future read-only transactions use the previously "bookmarked" snapshot. Furthermore, the long-lived architecture allows you to move your "bookmarked" snapshot forward in time in a single atomic operation.

The architecture was designed to make it easy to use YapDatabase without having to worry about asynchronous read-write issues. Your main thread can move from one steady-state to another. And the code-block above will work just fine, without worrying about transaction issues.

 

Getting Started

Step One is to begin a long-lived transaction:

override func viewDidLoad() {
  dbConnection.beginLongLivedReadTransaction()

  // ...
}

After you invoke beginLongLivedReadTransaction, future invocations of read or asyncRead (readWithBlock or asyncReadWithBlock in Objective-C) will use the snapshot that was "bookmarked" at the point in time when you called beginLongLivedReadTransaction.

For example, if the database was at commit #42, then your dbConnection stays at commit #42 until you explicitly allow it to move forward. Even as commits #43 and #44 complete, your dbConnection remains steady at #42.
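To make the "bookmark" idea concrete, here's a minimal sketch of the behavior described above. It's a toy model, not YapDatabase's implementation: ToyDatabase and ToyConnection are hypothetical types invented for illustration, and the only thing they borrow from the real API is the method name beginLongLivedReadTransaction.

```swift
// Toy model only: these types are hypothetical and are not part of YapDatabase.
final class ToyDatabase {
    // Every committed state, keyed by commit number.
    private(set) var snapshots: [Int: [String: String]]
    private(set) var latestCommit: Int

    init(startingCommit: Int, contents: [String: String] = [:]) {
        latestCommit = startingCommit
        snapshots = [startingCommit: contents]
    }

    // A read-write transaction: copy the latest state, mutate it,
    // then commit it under the next commit number.
    func readWrite(_ body: (inout [String: String]) -> Void) {
        var next = snapshots[latestCommit] ?? [:]
        body(&next)
        latestCommit += 1
        snapshots[latestCommit] = next
    }
}

final class ToyConnection {
    private let database: ToyDatabase
    private(set) var pinnedCommit: Int?

    init(database: ToyDatabase) { self.database = database }

    // "Bookmark" the latest commit; later reads keep using it.
    func beginLongLivedReadTransaction() {
        pinnedCommit = database.latestCommit
    }

    // Reads see the pinned snapshot if there is one, otherwise the latest commit.
    func read(_ key: String) -> String? {
        let commit = pinnedCommit ?? database.latestCommit
        return database.snapshots[commit]?[key]
    }
}

let database = ToyDatabase(startingCommit: 42, contents: ["price": "10"])
let uiConnection = ToyConnection(database: database)
uiConnection.beginLongLivedReadTransaction()   // pinned at commit #42

database.readWrite { $0["price"] = "12" }      // commit #43
database.readWrite { $0["price"] = "15" }      // commit #44

let beforeMove = uiConnection.read("price")    // still reads the commit #42 value
uiConnection.beginLongLivedReadTransaction()   // explicitly jump forward to #44
let afterMove = uiConnection.read("price")     // now reads the commit #44 value
```

The point of the sketch is that nothing a writer does changes what uiConnection reads until the connection itself asks to move forward.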

Now, obviously, you don't want to stay on the same commit/snapshot forever. When the database changes, you want to update your view.

Step Two is to listen for YapDatabaseModifiedNotification:

override func viewDidLoad() {
  
  dbConnection.beginLongLivedReadTransaction()
  
  let nc = NotificationCenter.default
  nc.addObserver(self,
                 selector: #selector(self.yapDatabaseModified(notification:)),
                 name: Notification.Name.YapDatabaseModified,
                 object: nil)
}

@objc func yapDatabaseModified(notification: Notification) {

  // End & Re-Begin the long-lived transaction atomically.
  // I.e. jump to latest commit. 
  dbConnection.beginLongLivedReadTransaction()

  // Maybe update view. See next few sections for the details.
}

 

Handling Multiple Modifications

YapDatabase is fast. And it can handle multiple read-write transactions extremely fast. In fact, it can often handle multiple read-write transactions in the time it takes your main thread to update its view once...

Which raises the question: What if multiple read-write transactions occur before my main thread invokes beginLongLivedReadTransaction?

The answer is that you may jump multiple commits. For example, if you were previously on snapshot 12, you may end up jumping to snapshot 14 (which is 2 read-write commits later).

In fact, you may also jump zero commits. For example, the main thread was busy scrolling, and multiple YapDatabaseModifiedNotification notifications got queued up on the main thread. The first time yapDatabaseModified is hit, you jump 2 commits. The second time yapDatabaseModified is hit, you jump zero commits (because the first call already moved you past them).

But no need to freak out! There's a clear way to handle it:

@objc func yapDatabaseModified(notification: Notification) {

  // End & Re-Begin the long-lived transaction atomically.
  // Also grab all the notifications for all the commits that I jump.
  let notifications = dbConnection.beginLongLivedReadTransaction()
  if notifications.count == 0 {
    return // already processed commit
  }
  
  // Maybe update view. See next section for the details.
}

The 'notifications' array corresponds to all the individual 'notification' objects that are pending. This allows your main thread to catch up in a single leap, as opposed to slowly processing each change one-at-a-time (which can be expensive if you're wanting to animate changes).
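The "jump two commits, then jump zero" behavior can be sketched with a small toy model. This is an illustration under stated assumptions, not the library's code: ToyNotification and ToyConnection are hypothetical, and the latestCommit parameter stands in for state the real connection tracks internally.

```swift
// Toy model only: hypothetical types, not YapDatabase's implementation.
struct ToyNotification: Equatable {
    let commit: Int
}

final class ToyConnection {
    private(set) var pinnedCommit: Int

    init(pinnedCommit: Int) { self.pinnedCommit = pinnedCommit }

    // Moves the bookmark to `latestCommit` and returns one notification
    // per commit that was jumped (possibly none).
    func beginLongLivedReadTransaction(latestCommit: Int) -> [ToyNotification] {
        guard latestCommit > pinnedCommit else { return [] }
        let jumped = ((pinnedCommit + 1)...latestCommit).map { ToyNotification(commit: $0) }
        pinnedCommit = latestCommit
        return jumped
    }
}

let connection = ToyConnection(pinnedCommit: 12)
let latestCommit = 14   // two read-write commits landed while the main thread was busy

// Two queued notifications are delivered back to back on the main thread.
let firstPass = connection.beginLongLivedReadTransaction(latestCommit: latestCommit)
let secondPass = connection.beginLongLivedReadTransaction(latestCommit: latestCommit)
```

Here firstPass carries the notifications for commits 13 and 14, and secondPass is empty, which is exactly the case the early return in yapDatabaseModified guards against.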

 

Did the change affect my view?

YapDatabaseModifiedNotification has all the juicy details on what changed. And there are various methods you can use to inspect the notification(s) to see if anything related to your view actually changed. Thus you can be more selective about potentially expensive view updates.

@objc func yapDatabaseModified(notification: Notification) {

  // End & Re-Begin the long-lived transaction atomically.
  // Also grab all the notifications for all the commits that I jump.
  let notifications = dbConnection.beginLongLivedReadTransaction()
  if notifications.count == 0 {
    return // already processed commit
  }

  // Update views if needed
  if dbConnection.hasChange(forKey: productId, inCollection: "products", in: notifications) {
    self.updateProductView()
  }
  if dbConnection.hasChange(forCollection: "shoppingCart", in: notifications) {
    self.updateShoppingCartImageAndBadge()
  }
}

You'll notice that all methods for detecting changes take an array of notifications, which handily matches what beginLongLivedReadTransaction returns. All such methods handle multiple notifications the same way they would handle a single notification.
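If it helps to picture what "checking a batch of notifications" means, here's a rough sketch. The types and free functions below are hypothetical stand-ins; they only mirror the shape of the hasChange calls shown above and say nothing about how YapDatabase tracks changes internally.

```swift
// Toy model only: hypothetical types, not YapDatabase's change tracking.
struct ToyChange: Hashable {
    let collection: String
    let key: String
}

struct ToyNotification {
    let commit: Int
    let changes: Set<ToyChange>
}

// True if any notification in the batch touched the given key.
func hasChange(forKey key: String, inCollection collection: String,
               in notifications: [ToyNotification]) -> Bool {
    let target = ToyChange(collection: collection, key: key)
    return notifications.contains { $0.changes.contains(target) }
}

// True if any notification in the batch touched anything in the collection.
func hasChange(forCollection collection: String,
               in notifications: [ToyNotification]) -> Bool {
    return notifications.contains { note in
        note.changes.contains { $0.collection == collection }
    }
}

// A batch covering two commits, as returned after jumping from #12 to #14.
let batch = [
    ToyNotification(commit: 13, changes: [ToyChange(collection: "products", key: "sku-1")]),
    ToyNotification(commit: 14, changes: [ToyChange(collection: "shoppingCart", key: "item-9")]),
]

let productChanged = hasChange(forKey: "sku-1", inCollection: "products", in: batch)
let cartChanged = hasChange(forCollection: "shoppingCart", in: batch)
let wishlistChanged = hasChange(forCollection: "wishlist", in: batch)
```

Because the check runs over the whole batch, a change made in any of the jumped commits is found with one call, and an untouched collection (wishlistChanged here) lets you skip the view update entirely.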

 

Keeping it Read-Only

What happens if I execute a read-write transaction on my dbConnection while it's in a longLivedReadTransaction?

In order to execute a read-write transaction, the dbConnection is forced to drop its long-lived read transaction, and move itself to the most recent commit in order to run the read-write transaction. The end result is that the dbConnection is no longer in a long-lived read transaction.

This is almost always a mistake. And one that's pretty easy to make. I speak from experience. But don't worry, we've got you covered...

By default, when compiling in DEBUG mode (#if DEBUG), you'll get an exception.

If you implicitly execute a readWrite transaction on a connection that is in a longLivedReadTransaction, then YapDatabase will throw an exception:

YapDatabaseException: YapDatabaseConnection had long-lived read transaction implicitly ended by executing a read-write transaction.

The exception allows you to quickly catch your mistakes (as opposed to waiting weeks before noticing some goofy timing bug that's nearly impossible to reproduce).

Yeah, but... couldn't I just restart the long-lived read transaction in the completionBlock?

Nope. At least not safely. This is not the solution you're looking for.

The whole idea of a long-lived read transaction is to provide a stable data source for your UI thread. So if you abandon your stable snapshot that glues together your UI and data source, you're gonna have a bad time. Imagine the following scenario:

  • You start a long-lived read-only transaction at snapshot #12
  • A background thread commits a read-write transaction (snapshot #13)
  • You accidentally run a read-write transaction on the connection that holds your long-lived read-only transaction (snapshot #14)
  • Your main thread gets a YapDatabaseModifiedNotification for snapshot #13

Do you need to process these changes? The database connection doesn't think so. After all, it's at snapshot #14. It thinks this is a change-set you've already processed.
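The scenario above can be replayed with a toy model. Everything here is an assumption made for illustration: ToyConnection is hypothetical, commit numbers are passed in by hand, and the real connection does far more than this. The sketch only shows why the change-set for snapshot #13 goes missing.

```swift
// Toy model only: hypothetical type, not YapDatabase's implementation.
final class ToyConnection {
    private(set) var snapshot: Int
    private(set) var isLongLived = false

    init(snapshot: Int) { self.snapshot = snapshot }

    // Moves to `latestCommit` and returns the commits jumped since the previous snapshot.
    func beginLongLivedReadTransaction(latestCommit: Int) -> [Int] {
        isLongLived = true
        guard latestCommit > snapshot else { return [] }
        let jumped = Array((snapshot + 1)...latestCommit)
        snapshot = latestCommit
        return jumped
    }

    // A read-write transaction must run against the newest state, so the
    // connection leaves its long-lived snapshot and ends up on its own new commit.
    func readWrite(latestCommit: Int) -> Int {
        isLongLived = false
        snapshot = latestCommit + 1
        return snapshot
    }
}

let ui = ToyConnection(snapshot: 0)
_ = ui.beginLongLivedReadTransaction(latestCommit: 12)   // UI is drawn from snapshot #12
var latest = 13                                          // background thread commits #13
latest = ui.readWrite(latestCommit: latest)              // accidental write produces #14

// The notification for #13 arrives. Restarting the long-lived transaction
// reports nothing, because the connection is already at #14.
let pending = ui.beginLongLivedReadTransaction(latestCommit: latest)
```

The UI was last drawn from snapshot #12, yet pending comes back empty, so the changes in #13 never reach the view update code.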

Could you jump through multiple flaming hoops and figure out a solution to this situation? I suppose. Is it worth the effort? Nope. Especially since there's a much easier solution.

So what's the recommended "best practice"?

It's easy. Keep it read-only. That is, keep your long-lived read-only dbConnection for read-only operations. This ensures your UI and data-source are always glued together properly.

Keep your UI connection read-only. And use a separate connection to perform read-write transactions.

func flagCommentAsInappropriate(_ commentId: String) {
  
  // Keep our UI connection read-only.
  // Use a separate connection for read-write transactions.
  // If we do this a lot, we'll use a dedicated "read-write" connection.
  // But for rarely used operations, an on-the-fly connection is probably fine.

  db.newConnection().asyncReadWrite {(transaction) in
    
    // Comment is assumed to be the app's own model class.
    if let comment = transaction.object(forKey: commentId, inCollection: "posts") as? Comment {
      comment.inappropriate = true
      
      transaction.setObject(comment, forKey: commentId, inCollection: "posts")
    }
  }
}

As you can see, it's pretty easy to do. And the extra concurrency means we're not going to block our UI, even if there's some other background operation that's blocking the read-write transaction with some big batch insert of freshly downloaded posts.

 

Important Warning

It is absolutely critical that you listen for YapDatabaseModifiedNotification and properly move your long-lived transactions forward. Failure to do so could significantly slow down the database.

In order to provide things like concurrency, snapshots, and long-lived transactions, the database uses SQLite's write-ahead log (WAL) mechanism. Here's a really high-level overview on how this works:

  • There is the database file, and a separate write-ahead log (WAL)
  • The WAL contains commits that have yet to be synced into the database file
  • As the oldest active read transaction (the minimum snapshot across all connections) moves forward in time, old commits are moved from the WAL to the database
  • If the WAL gets too big, the performance of the database begins to suffer
  • Typically this won't ever happen, unless you start a long-lived read transaction and never move it forward!

Think of each read-write commit as having a number, where the number is incremented by 1 each time. So the database file might reflect every commit up to, say, number 12. And each commit after number 12 is in the WAL, like so:

Database: {12}
WAL     : [13, 14]
            ^   ^
   connection1  ^
                ^
       connection2

In the picture above, connection1 is reading commit #13, and connection2 is on the latest (commit #14). At this point the database can move commit #13 into the database file. Why? Because every single connection is at or past commit #13. However, it cannot move commit #14 into the database file. Why? Because connection1 doesn't know about it yet. It's still on commit #13.

In most situations this is perfectly fine. Eventually connection1 will move forward, and all commits in the WAL will get moved into the database. And then the WAL file will get reset (become empty).

But if you use a long-lived read transaction, and don't bother to move your transaction forward, then you end up with a situation like this:

Danger, Will Robinson!

Database: {12}
WAL     : [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
            ^                                               ^
   connection1                                              ^
                                                            ^
                                                   connection2

As your WAL file continues to grow, the performance begins to degrade. If left unchecked for an extended number of commits, the SQLite database may become sluggish. What may be even worse is the startup time for a database that encounters a giant WAL file.
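The two pictures above boil down to one rule: a commit can leave the WAL only once every connection has reached it. Here's a minimal sketch of that rule. ToyStore is a hypothetical simplification written for this page; it is not SQLite's checkpoint algorithm and ignores everything except commit numbers.

```swift
// Toy model only: a hypothetical simplification, not SQLite's checkpoint algorithm.
struct ToyStore {
    var databaseCommit: Int   // newest commit already in the database file
    var wal: [Int]            // commits still waiting in the write-ahead log

    // Moves every WAL commit that all connections have reached into the
    // database file, and returns how many commits remain in the WAL.
    mutating func checkpoint(connectionSnapshots: [Int]) -> Int {
        // With no open readers, everything can be moved.
        let oldestReader = connectionSnapshots.min() ?? Int.max
        let movable = wal.filter { $0 <= oldestReader }
        if let newest = movable.max() {
            databaseCommit = newest
        }
        wal.removeAll { $0 <= oldestReader }
        return wal.count
    }
}

// First picture: connection1 on #13, connection2 on #14.
var healthy = ToyStore(databaseCommit: 12, wal: [13, 14])
let healthyRemaining = healthy.checkpoint(connectionSnapshots: [13, 14])

// Second picture: connection1 never moved past #13 while commits ran up to #25.
var stalled = ToyStore(databaseCommit: 12, wal: Array(13...25))
let stalledRemaining = stalled.checkpoint(connectionSnapshots: [13, 25])

// Once the stalled connection moves forward, the WAL can drain completely.
let afterCatchUp = stalled.checkpoint(connectionSnapshots: [25, 25])
```

In the healthy case only commit #14 is left waiting. In the stalled case commits #14 through #25 all stay in the WAL until the lagging connection moves forward, which is why a forgotten long-lived read transaction is the usual cause of a bloated WAL.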

The solution is quite simple. Just listen for the proper notification and move your long-lived transactions forward in time (as demonstrated above).

 

Cleanup

There is a public API for endLongLivedReadTransaction. Does this need to be called during dealloc?

Nope. There is no special cleanup that is required.

When a databaseConnection gets deallocated, it will automatically end the longLivedReadTransaction (if necessary) as part of its deallocation process.