Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit 6: Concurrency #40

Open
cheatsheet1999 opened this issue Sep 27, 2021 · 0 comments
Open

Unit 6: Concurrency #40

cheatsheet1999 opened this issue Sep 27, 2021 · 0 comments

Comments

@cheatsheet1999
Copy link
Owner

cheatsheet1999 commented Sep 27, 2021

Lock-based Concurrency Control and Recovery From Failures

Lesson Introduction: Lock-based Concurrency Control

Database systems are equipped with lock-based protocols to ensure that any transaction to a database cannot read or write data until it acquires an appropriate lock on the transaction. In a big data system where thousands of transactions can be executed simultaneously, it is critical to control the concurrency of transactions.

We will learn about the family of approaches called Lock-Based Concurrency Control. You will learn about the most widely used protocol, the Strict Two-phase Locking, or Strict 2PL.

Topic: Lock-Based Concurrency Control

First, using a dependency graph, we can tell whether a schedule is conflict serializable or not.

Lock based concurrency control
Strict Two-phase Locking (Strict 2PL) Protocol

  • Each transaction must obtain a shared lock on the object before reading, and an exclusive lock on the object before writing.
  • All locks held by a transaction are released when the transaction completes.
  • Strict 2PL allows only serializable schedules, it simplifies transaction aborts

To be more clear, before reading a transaction A, R(A), it needs to acquire shared lock, S(A). Whenever a transaction wants to write a data item W(A), it needs to acquire exclusive lock X(A).

It calls Two-phase because the transaction goes to two main phases.

  • A phase where the transaction keeps acquiring locks, either shared locks or exclusive locks.
  • Release locks at the end of the transaction. Here is a less strict version where you can release it in the second phase, start releasing locks step by step.

There is a separate module in the database system called lock manager, it controls all the locking going up, all the transactions, or the concurrency control module in the database system has to call the lock manager to acquire locks. So the lock manager basically receives locks and releases all locks as well.

Simple rule:

  • if one transaction already holds a shared lock on item A, if another transaction comes and wants to also acquire a shared lock on the same item, this is allowed. So the lock manager will grant that lock to the other transaction as well.
  • If the transaction has a shared lock on item A, but another transaction came and it wants to acquire an exclusive lock, that is Not allowed
  • If the original transaction already has an exclusive lock on item A, and another transaction comes in and wants a shared lock. It is not allowed
  • If an original transaction has an exclusive lock and another transaction also wants an exclusive lock on the data item

As we can see. Shared on Shared is allowed, otherwise, not allowed

Screen Shot 2021-09-27 at 1 55 33 PM

We have to wait for the transaction to release the lock first if it has an exclusive lock in order to acquire a shared or exclusive lock on it.
And if we want to have an exclusive lock, we have to wait for a shared lock also to be released.

Deadlock
A deadlock means that one transaction wants to acquire a lock on a data item A, and this data item A is already locked by another transaction B. At the same time, this other transaction B also wants to acquire a lock on a data item A that is already also locked by transaction B. They both waiting for each other, and it leads to a deadlock because nobody will release the lock until the other one does.

Deadlock Detection

  • Create a data structure called the waits-for graph. In the graph, there is an edge from Ti to Tj if Ti is waiting for Tj to release a lock

Ti => Tj

To detect deadlock, we periodically check for cycles in the graph. if there is a cycle, it means we have a deadlock and we have to break the cycle by aborting one of the transactions that are causing the cycle

IMG_BEF9110C3F28-1

No deadlocks because of no cycle ⬆️

Screen Shot 2021-09-27 at 3 09 51 PM

deadlocks because of cycle ⬆️


Topic: Database Recovery

As we already know from the previous unit, a transaction is a collection of actions that make consistent transformations of system states while preserving system consistency. There might be temporarily in an inconsistent state during execution in the middle, but that is fine. As long as the beginning and the end of the transaction are consistent states.

If we have a deadlock, for example, we may need to abort the transaction that causes the deadlock.
However, aborting a transaction is a failure.
Type of Failures:

  • Transaction failures
    • Transaction aborts
    • Avg 3% of transactions abort abnormally.
  • System failure
    • Failure of processor, main memory loss due to power failure
    • Main memory contents are lost, but secondary storage contents are safe
    • partial vs. total failure
  • Media failures
    • Failure of secondary storage devices such that the stored data is lost
  • Communication failures
    • Lost/undeliverable messages due to network unstable

Local Recovery Management - Architecture

  • Volatile storage
    • consists of the main memory of the computer system (RAM).
  • Stable / Persistent storage
    • Resilient to failures and loses its contents only in the presence of media failures

Main memory is smaller than the persistent storage so you have to select a few pages that we can actually store in the buffer. When computers shut down, all data in RAM is wiped out.

That is why we have a Recovery manager

  • Before sending data to the new stable database state to do recovery, we have a database log, which logs every single operation that is happening in transactions into persistent storage. Database Log is an append-only file

Recovering From a Crash

  • There are 3 phases in the Aries recovery algorithm:
    • Analysis: Scan the log forward (from the most recent checkpoint)
    • Redo: For all the transactions committed before the crash, we will apply for a redo of all the operations of this transaction.
    • Undo: For transactions that did not commit before the crash, undo all operations to the old values

This way, we ensure that we recover from failure. Transactions that are committed, are redone again, we are fine. For transactions that did not commit before the crash, they don't have a commit record in the log, undo them, and run them again later

Knowledge Check: Lock-Based Concurrency Control

  1. What kind of lock must a transaction acquire in order to read from a database?
  • Concurrency Lock
  • [Correct] Shared Lock
    (This is the type of lock a transaction needs to acquire in order to read from a database)
  • Exclusive Lock
  • Read Lock
  1. What kind of lock must a transaction acquire in order to write to a database?
  • [Correct!] Exclusive Lock
    (This is the type of lock a transaction needs to acquire in order to write to a database.)
  • Shared Lock
  • Write Lock
  • Atomicity Lock
  1. What is the name of the database module that manages lock-based concurrency control for transactions?
  • Concurrency Manager
  • Transaction Manager
  • [Correct] Lock Manager
    (Lock-based concurrency control is managed by a specific database module called lock manager)
  • Control Manager

Unit 5 & 6: Practice Quiz

  1. What is the result of the following database transaction if the initial values of both A and B are 100?

BEGIN

A=A+100

B=A+100

END


A = 100 + 100 = 200
B = 200 + 100 = 300
A=200, B=300

  1. How does a database achieve concurrency?
  • By putting transactions in a queue.
  • [Correct] By interleaving transactions.
    (A database can process multiple transactions concurrently by interleaving them.)
  • By processing transactions at the same time.
  • By processing transactions as they come in.
  1. Which of the following describes the atomicity principle of database transactions?
  • A database can only serve one user at a time.
  • A database can only process one query at a time.
  • [Correct]All of the components of a transaction run in their entirety, or none of them run at all.
  • A database can only process one transaction at a time.
  1. What kind of lock must a transaction acquire in order to read from a database?
  • Read Lock
  • Concurrency Lock
  • [Correct] Shared Lock
    (This is the type of lock a transaction needs to acquire in order read from a database.)
  • Exclusive Lock
  1. Consider the following two database transactions with initial values of A=500 and B=500:

T1: A=A+100, B=A-100

T2: A=A1.06, B=B1.06


Assuming T1 arrives first, what will be the final values of A and B if the database management system interleaves transactions?

T1: A = 500 + 100 = 600
T2: A = 600 * 1.06 = 636
T1: B = 636 - 100 = 536
T2: B = 536 * 1.06 = 568.16

A=636, B=568.16


Unit 5 & 6: Graded Quiz

  1. Database transactions are processed according to a principle that all of the components of a transaction run in their entirety or none of them run at all. What is the name of this principle?
  • Concurrency
  • Durability
  • [Correct] Atomicity
    (The atomicity principle states that all of the components of a transaction run in their entirety or none of them run at all)
  • Isolation
  1. Which of the following database operations requires a transaction to obtain a shared lock?
  • Delete Data
  • Update Data
  • [Correct] Read Data
    (A database transaction needs to obtain a shared lock in order to read data from a database)
  • Write Data
  1. Which of the following database operations does not require a transaction to obtain an exclusive lock?
  • [Correct] Read Data
    (Reading data from a database does not require an exclusive lock; a shared lock is sufficient)
  • Write Data
  • Delete Data
  • Update Data
  1. Consider the following two database transactions with initial values of A=0 and B=0:

T1: A=A+10, B=A+10

T2: A=A*1.1, B=B*1.1

Assuming T1 arrives first, what will be the final values of A and B if the two transactions are processed using a serial schedule?

T1: A = 0 + 10 = 10
T1: B = 10 + 10 = 20
T2: A = 10 * 1.1 = 11
T2: B = 20 * 1.1 = 22

A=11, B=22

  1. Consider the following two database transactions with initial values of A=0 and B=0:

T1: A=A+10, B=A+10

T2: A=A1.1, B=B1.1

Assuming T1 arrives first, what will be the final values of A and B if the database management system interleaves transactions?

T1: A = 0 + 10 = 10
T2: A = 10 = 10 * 1.1 = 11
T1: B = 11 + 10 = 21
T2: B = 21 * 1.1 = 23.1

A=11, B=23.1

  1. Why do database management systems implement transaction interleaving?
  • To achieve atomicity
  • To achieve isolation
  • [Correct] To achieve concurrency
    (Database management systems implement transaction interleaving in order to achieve concurrency, which is the ability of a database to execute multiple transactions simultaneously)
  • To achieve consistency
  1. Which of the following describes the isolation principle of database transactions?
  • [Correct] Each transaction runs as if there is no other transaction running in the database system simultaneously.
    (This is the definition of the isolation principle of database transactions)
  • All of the components of a database transaction run in their entirety, or none of them run at all.
  • A database management system can only process one transaction at a time.
  • A database management system can only process one query at a time.
  1. Which of the following describes the serializability property of database transactions?
  • All of the components of a transaction run in their entirety or none of them run at all.
  • Each transaction runs as if there are no other transactions running in the database system simultaneously.
  • A database management system can only process transactions in series, not concurrently.
  • [Correct] If several transactions are executed concurrently, the results must be the same as if they were executed one after another.
  1. Which of the following statements about the ACID properties (Atomicity, Consistency, Isolation, Durability) is correct?
  • [Correct] Isolation implies Serializability
    (Isolation means that concurrent changes are invisible, which implies serializability)
  • Atomicity implies Serializability
  • Consistency implies Serializability
  • Durability implies Serializability
  1. Which of the following describes the durability property of database transactions?
  • Each transaction runs as if there are no other transactions running in the database system simultaneously.
  • [Correct] Once a transaction commits, the system guarantees the results will never be lost, in spite of subsequent failures.
  • All of the components of a transaction run in their entirety, or none of them run at all.
  • If several transactions are executed concurrently, the results must be the same as if they were executed one after another.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant