Problematic rollback semantics with nested transactions #2767

rgalanakis · 2023-08-15T23:19:54Z

Hi there, thanks for building this library!

I am trying to do something like this:

with db.transaction():
  Thing.create()
  try:
    with db.transaction():
      raise SomeException()
  except SomeException as ex:
    assert len(Thing) == 1
    raise ex

But what's happening is:

BEGIN
Noop because we're in a transaction
INSERT
Exception is raised
ROLLBACK
ROLLBACK

That is, the BEGIN and ROLLBACK are unbalanced.

I believe you are aware this is the case, as you have recommended folks use atomic, which handles this balancing. However I do not want to use atomic, since it must create a savepoint, and I don't want to use savepoints (I find it makes the semantics of the transaction far more difficult to reason about, Python ecosystem seems to disagree but I'm not here to argue).

What I'd humbly suggest/ask is either:

Add support for atomic(savepoint=False), so we can use atomic without savepoints, to get the balanced transaction behavior.
Modify _transaction.__exit__ so it only rolls back if we're dealing with the top transaction, something like:

    def __exit__(self, exc_type, exc_val, exc_tb):
        try:
            if self.db.transaction_depth() == 1:
                try:
                  if exc_type: # This logic was outside the transaction_depth() == 1 check
                    self.rollback(False)
                  else:
                    self.commit(False)
                except:
                    self.rollback(False)
                    raise
        finally:
            self.db.pop_transaction()

I don't believe the current transaction rollback behavior is desirable since it breaks the semantics of context managers by leaking actions across contexts (the inner context manager rolls back the transaction begun by the outer transaction manager).

The text was updated successfully, but these errors were encountered:

coleifer · 2023-08-16T13:19:16Z

I am not going to change the semantics of the transaction() context-manager's rollback behavior in the presence of an unhandled exception. The explicit use of the transaction() context manager when nesting is, as you note, up to the user to manage -- if you want nested atomic blocks you ought to be using atomic() instead of transaction().

In other words, if you do not want the exception to cause a rollback, catch it inside the nested transaction() block.

Alternatively, you can implement your own transactional context manager if you prefer to only roll-back the outermost, but I think that's problematic since an unhandled exception in a transaction should imply that the whole transaction should be aborted (since there is, effectively, only one transaction).

I'm struggling to think of a real-world scenario where this problem would actually occur.

rgalanakis · 2023-08-16T17:48:28Z

I'm struggling to think of a real-world scenario where this problem would actually occur.

Sure.

An API unit test is wrapped in a transaction/atomic, doesn't matter.

The test setup involves fixturing a user.

The endpoint, for resetting a password, has code like:

with db.transaction():
  try:
    code = UserResetCode.verify_code_with_token(token)
  except UserResetCode.Unusable:
    return error_response()
  code.user.verify_email().save()

In the model:

class UserResetCode:
  @classmethod
  def verify_code_with_token(cls, token):
    code = cls.find_usable(token=token)
    with db.transaction():
      if not code:
          raise Unusable()
      # Mark the code used, etc

The Unusable exception, when it hits the transaction() in the verify_code_with_token method, rolls back all the data created in the unit test method. This wasn't a database exception, the transactions in the endpoint didn't end up opening a transaction, so I can't see why we'd want it to roll back the transaction.

The explicit use of the transaction() context manager when nesting is, as you note, up to the user to manage

That's what I'm trying to do. However I can't actually manage internal nesting, because it leaks and rolls back transactions opened up by outer contexts.

if you want nested atomic blocks you ought to be using atomic() instead of transaction().

Right, and I would, but I don't want automatic savepoints. Why not make the savepoint behavior of atomic configurable? While I never point to Django as an example of good design, its atomic does support savepoint=False at least.

EDIT (8/16/2023T10:50:00 Pacific): Fixed the example, there was a nested transaction in the verify method I originally missed.

coleifer · 2023-08-16T18:25:20Z

I'm still struggling to see it. It doesn't quite make sense to me, e.g., here:

    with db.transaction():
      if not code:
          raise Unusable()
      # Mark the code used, etc

Why wouldn't that exception be raised outside the transaction block?

The current semantics are such that raising an unhandled exception within a transaction-block will rollback. Part of the issue seems to me to be compounded by the decision to raise exceptions for flow-control.

So, with all this being said, the peewee docs are quite clear:

I don't intend to change it, as other people may rely on the current functionality -- and it should be quite easy to write your own context-manager subclass if you prefer the semantics you described in the original comment. Nor am I aware of any downsides to permitting savepoints to be used via the atomic() decorator in cases where nesting is variable depending on the way the code is called.

rgalanakis · 2023-08-17T04:08:36Z

Why wouldn't that exception be raised outside the transaction block?

It gets handled at the endpoint level, and an appropriate error is returned.

Part of the issue seems to me to be compounded by the decision to raise exceptions for flow-control.

Sure, and when I'm programming in Go I don't use exceptions for flow control, but we're dealing with a language featuring StopIteration- exceptions for flow control is part of standard Python design. It's also done frequently in Peewee, like XDoesNotExist errors.

as other people may rely on the current functionality

This feels like depending on undefined C compiler behavior but ok. I think you can easily define the behavior, though- inner transactions are no-ops, only the outermost transaction does does anything. This is how all the other similar mechanisms in other languages I've used work (I believe it's also this way in other Python frameworks but can't say for sure). Python, maybe because Django did it, seems to be the only one with savepoint being the default- 'nested transactions' being noops is standard/reasonable behavior.

Nor am I aware of any downsides to permitting savepoints to be used via the atomic() decorator in cases where nesting is variable depending on the way the code is called.

Are you saying adding a savepoint=False option to atomic is on the table?

coleifer · 2023-08-17T12:49:58Z

I think you misunderstood the main issue. Consider this example.

If this code is run with the changes you proposed, then any queries executed after the except block will not be executable because Postgres reports that the transaction state is aborted (due to the integrity error):

db = PostgresqlDatabase('peewee_test')

class Reg(db.Model):
    key = CharField(primary_key=True)

db.create_tables([Reg])

with db.transaction() as tx:
    Reg.create(key='k1')
    try:
        with db.transaction() as tx2:
            Reg.create(key='k1')
    except IntegrityError:
        pass

    print(len(Reg))  # Cannot do this, transaction state is messed-up.

With the current Peewee code, the above code runs and produces an output of 0.

Neither of these seem quite correct. That is why it is strongly recommended to use atomic() if you have varying levels of nesting.

rgalanakis · 2023-08-17T15:52:54Z

Maybe we should take a step back and look at the what I believe we can agree is a valid use case, and that there's no way to achieve it with peewee's transaction semantics (at least that I can find).

As a user of this library, I would like to be able to reason about a single transaction boundary, and any errors that reach said boundary will result in a rollback. This boundary respects both internal error handling (raising and catching of Python exceptions not having to do with the database) and database error handling (errors result in an aborted transaction).

With the current behavior, there is no way to achieve this- Python exceptions can result in database rollbacks, and database errors can be recovered from.

Put differently, as a user of this library, this library would be easier to reason about if the transaction semantics allowed this:

BEGIN;
INSERT;
INSERT;
INSERT;
-- ROLLBACK or COMMIT;

However, the current behavior requires I reason about this:

BEGIN;
INSERT;
-- Potential ROLLBACK
INSERT;
-- Potential ROLLBACK;
INSERT;
-- Potential ROLLBACK;
-- ROLLBACK or COMMIT

Do you disagree that users of this library should be able to reason about a single transaction boundary? Especially when it can be achieved in a relatively straightforward manner (making nested transaction noop, which is preferable, or adding a savepoint=False argument to atomic?

See #2767 for discussion.

coleifer · 2023-08-17T16:19:54Z

Yeah I'm starting to agree that your point-of-view on this one would be an improvement. I've made a small change to the _transaction.__exit__ handler to only initiate the rollback if it is the outermost one, which I think satisfies the original request. See 278c24d

I will note this in the changelog -- any users affected by the change I will deal with if/when they come up.

I am curious why the aversion to using savepoints, but nonetheless I do appreciate your persistence and agree that this is an improvement.

rgalanakis · 2023-08-17T16:31:03Z

Perfect, thanks Charles! I ended up subclassing _transaction to make that change, and monkeypatching .transaction, so I'll be glad to remove my patch.

The aversion to using savepoints is that 1) they add database calls that may not even be necessary. I have looked at DB logs of applications where most of the database traffic are SAVEPOINT and RELEASE, and the latency from the calls is not insignificant at that level of chatter, and especially 2) I find them hard to reason about. I try not to hide the fact that we're working with a database, and the semantics of the single transaction boundary are easier to reason about (and I use savepoints in specific instances, like recovering from your example of an integrity error)

By the way if you ever find yourself writing Ruby I'd highly recommend Sequel, it's a really great ORM that I think you'd find a lot of sympathetic design and ideas with the author/community.

Thanks again,
Rob

See coleifer#2767 for discussion.

coleifer · 2023-10-13T15:46:40Z

3.17.0 has been released, which contains this fix.

coleifer closed this as completed Aug 16, 2023

coleifer added a commit that referenced this issue Aug 17, 2023

Only roll-back in outermost @transaction decorator/ctx manager.

278c24d

See #2767 for discussion.

gshmu pushed a commit to gshmu/peewee that referenced this issue Aug 22, 2023

Only roll-back in outermost @transaction decorator/ctx manager.

5a005b5

See coleifer#2767 for discussion.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Problematic rollback semantics with nested transactions #2767

Problematic rollback semantics with nested transactions #2767

rgalanakis commented Aug 15, 2023

coleifer commented Aug 16, 2023

rgalanakis commented Aug 16, 2023 •

edited

Loading

coleifer commented Aug 16, 2023 •

edited

Loading

rgalanakis commented Aug 17, 2023

coleifer commented Aug 17, 2023

rgalanakis commented Aug 17, 2023 •

edited

Loading

coleifer commented Aug 17, 2023

rgalanakis commented Aug 17, 2023

coleifer commented Oct 13, 2023

Problematic rollback semantics with nested transactions #2767

Problematic rollback semantics with nested transactions #2767

Comments

rgalanakis commented Aug 15, 2023

coleifer commented Aug 16, 2023

rgalanakis commented Aug 16, 2023 • edited Loading

coleifer commented Aug 16, 2023 • edited Loading

rgalanakis commented Aug 17, 2023

coleifer commented Aug 17, 2023

rgalanakis commented Aug 17, 2023 • edited Loading

coleifer commented Aug 17, 2023

rgalanakis commented Aug 17, 2023

coleifer commented Oct 13, 2023

rgalanakis commented Aug 16, 2023 •

edited

Loading

coleifer commented Aug 16, 2023 •

edited

Loading

rgalanakis commented Aug 17, 2023 •

edited

Loading