관리자

7. Transactions

The slippery Concept of a Transaction

ACID

  • Atomicity

    • The transaction is aborted if some of writes in one grouped atomic transaction can not be completed.
    • main idea: what happens if a client wants to make several writes but what if some fails?
  • Consistency

    • the data should always follow some rules.
      • ex. in accounting system, credit sum == debit sum
    • it is a property of application
  • Isolation

    • concurrently executing transactions are isolated from each other
    • classic text describes isolation as serializability
    • (serializability): each transaction can pretend that it is the only transaction running on the entire database
  • Durability

    • Once a transaction has committed successfully, data written will not be forgotten, despite of hardware fault or database crash
    • single-node DB : data has been written to nonvolatile storage
    • replicated DB : successfully copied to some number of nodes

Single-Object and Multi-Object Operations

  • common philosophy: Abandon transaction entirely, rather than remain half-finished

  • leaderless replication : best effort

  • retry is good, though not perfect

    • transaction succeed, but network( DB → server ) failed

    • when failed due to overload → worse

    • only worth in transient errors

    • side effects

    • if client process fails while retrying, any data it was trying to write to the DB is lost

      Weak Isolation Levels

    • transaction isolation

      Rather than blindly relying on tools, we need to develop a good understanding of the kinds of concurrency problems that exist and how to prevent them.

      Read committed

    • no dirty read : read only committed

      • when updating multiple objects, keep sync/consistency
      • the same result even after rollback
    • no dirty write : overwrite only committed

      • with row-level locks
      • when transaction is ongoing, others see old value.

      Snapshot Isolation and Repeatable Read

    • Repeatable Read

      • Nonrepeatable read(read skew)
        • The data read at the end of the transaction is different with previous query
    • Snapshot Isolation

      • each transaction reads from a consistent snapshot of database

        (see only the old data)

      • Readers never block writers, writers never block readers

      • multi-version concurrency control(MVCC)

        • maintaining several version of an object side by side
        • data is tagged with its transaction id(txid)
      • visibility rules

        At the start of the transaction, make a list of proceeding transactions.

        1. Any write of above transactions are ignored
        2. Any writes made by aborted transactions are ignored.
        3. Any writes made by transactions with later transaction ID are ignored

        Preventing lost updates

      • client1: read → update

        client2: read - > update

        client1's update is lost

      • Prevent by locking

        • explicit lock
        • compare and set
        • conflict resolution and replication

        Write skews and Phantoms

      • write skew

        • Transactions update two different objects concurrently and there occurs conflict.

        • ex. at least 2 doctors should be on_call.

          • Bob changes to on_cal=false, Alice changes to on_cal=false, concurrently.
          • Now less than 2 doctors are on call.
          • Two different objects(record of Bob, record of Alice) are updated
        • restricted solution

          serialization, or explicit lock on rows

      • phantom

        • a write in one transaction changes the result of a search query in another tranascation

        • ex. select → condition → update

          between select and update, something changed.

        • SELECT FOR UPDATE may not help when SELECT returns 0 rows.

        • materializing conflict : create a table of all possible combinations, and put lock there.

          Serializability

        • the end result is the same as if they had executed one at a time, serially without any concurrency

          Two-phase Locking

        • read : shared lock

        • write : exclusive lock

        • first phase : when the locks are acquired,

          second phase : when all the locks are released

        • predicate lock

          • lock that belongs to all objects that match some search condition
          • applies even to objects that do not yet exist
        • index-range locking(next-key locking)

          • predicate lock has not good performance.
          • lock on index

          Serializable Snapshot Isolation

        • pessimistic concurrency control

          • if anything may go wrong, better wait
        • optimistic concurrency control

          • transaction continues. If anything goes wrong, abort
          • old idea
          • tend to be better than pessimistic control
        • Serializable Snapshot Isolation is promising.

          • optimistic approach
          • allows transactions without blocking

'Books > DDIA' 카테고리의 다른 글

6. Partition  (0) 2021.01.08
5. Replication  (0) 2021.01.07

+ Recent posts