We can't write the entries of one database into another #189

Open
Kerollmops opened this issue Jul 20, 2023 · 5 comments
Labels
enhancement New feature or request feedback A feedback from a user
Milestone

Comments

@Kerollmops
Member

Kerollmops commented Jul 20, 2023

Currently, heed is too restrictive with write transactions and does not permit certain basic operations, such as writing the content of one database into another, as the example below shows.

According to the documentation, write transactions were designed to support this kind of operation on the same database too, which means that the following example could even work with a single database.

It would therefore be possible for heed to replace &mut RwTxn with the less restrictive &RwTxn wherever it is currently required.

let mut wtxn = env.write_txn()?;
for result in database1.iter(&wtxn)? {
    let (k, v) = result?;
    database2.put(&mut wtxn, k, v)?; // does not compile: `&mut wtxn` is required while the iterator still holds `&wtxn`
}
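Until the API is relaxed, one way around the borrow conflict is to buffer owned copies of the entries first, so the shared borrow of the transaction ends before the first write. The sketch below is only a std-only model of that idea: `Txn`, `Db`, and `copy_db` are hypothetical stand-ins that mimic the `&`/`&mut` shape of the calls above, not heed's actual types, and buffering a whole database in memory is an assumption that only suits small databases.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a write transaction (hypothetical, not heed's `RwTxn`).
struct Txn {
    dbs: Vec<BTreeMap<Vec<u8>, Vec<u8>>>,
}

/// Toy database handle: just an index into the transaction's storage.
#[derive(Clone, Copy)]
struct Db(usize);

impl Db {
    /// Reading borrows the transaction immutably, like `iter(&wtxn)`.
    fn iter<'t>(self, txn: &'t Txn) -> impl Iterator<Item = (&'t [u8], &'t [u8])> {
        txn.dbs[self.0].iter().map(|(k, v)| (k.as_slice(), v.as_slice()))
    }

    /// Writing borrows the transaction mutably, like `put(&mut wtxn, ..)`.
    fn put(self, txn: &mut Txn, key: &[u8], value: &[u8]) {
        txn.dbs[self.0].insert(key.to_vec(), value.to_vec());
    }
}

/// Copies every entry of `src` into `dst` and returns how many were copied.
fn copy_db(txn: &mut Txn, src: Db, dst: Db) -> usize {
    // Collect owned copies: the shared borrow of `txn` ends with `collect`.
    let entries: Vec<(Vec<u8>, Vec<u8>)> = src
        .iter(&*txn)
        .map(|(k, v)| (k.to_vec(), v.to_vec()))
        .collect();
    // Now the transaction can be borrowed mutably for each write.
    for (k, v) in &entries {
        dst.put(txn, k, v);
    }
    entries.len()
}
```

The cost is one extra allocation per key and value, which is exactly the overhead the proposed `&RwTxn` change would avoid.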
@Kerollmops Kerollmops added enhancement New feature or request feedback A feedback from a user labels Jul 20, 2023
@Kerollmops Kerollmops added this to the v0.20.0 milestone Jul 20, 2023
@Kerollmops
Member Author

Hey @hyc 👋

Do you have any advice on these points:

  • Is it safe to use a write txn to read and write in two different databases at the same time?
  • Is it safe to iterate on one database with a cursor created from a write txn and write in the same database at the same time?
  • I suppose it is safe to iterate on one database and write the content of it in another one?

Have a nice day 🌞

@hyc

hyc commented Jul 22, 2023

  1. yes of course. ACID transactions would be pretty useless if they didn't support operations on multiple DBs in the same txn. That is the prime requirement for the C "Consistency" in ACID.
  2. yes. Note that the mtest*.c test programs already demonstrate this.
  3. yes.

@Kerollmops
Member Author

Kerollmops commented Jul 23, 2023

Thank you for the info, Howard!

Unfortunately, the changes made in #190 are invalid, as the following rule is no longer enforced at compile time:

Values returned from the database are valid only until a subsequent update operation, or the end of the transaction.

As for possible solutions to lift the limitation described in this issue: we could expose a new unsafe method on the RwTxn struct that splits it into several SplitRwTxn views (two in the example below). These split transactions implement DerefMut so that they behave like a normal RwTxn. By declaring the method unsafe, we can document both the safety concerns and the possibilities it unlocks.

/// Returns `N` views of a mutable transaction. Don't use it like you would use an `RwTxn`.
unsafe fn RwTxn::split<const N: usize>(&mut self) -> [SplitRwTxn; N];

let mut wtxn = env.write_txn()?;
let [mut wtxn1, mut wtxn2] = unsafe { wtxn.split() };
for result in database1.iter(&wtxn1)? {
    let (k, v) = result?;
    database2.put(&mut wtxn2, k, v)?;
}
// by dropping wtxn1 and wtxn2 you can commit the original wtxn
wtxn.commit()?;

@Kerollmops Kerollmops modified the milestones: v0.20.0, v0.21.0 Feb 18, 2024
@Kerollmops
Member Author

Kerollmops commented Jul 9, 2024

Hey @hyc 👋

Values returned from the database are valid only until a subsequent update operation, or the end of the transaction.

I am wondering if this sentence is about a subsequent update operation in the database or over the whole environment. Is the cursor cache shared between databases, and therefore, pointers can become invalid after new writes?

A transaction and its cursors must only be used by a single thread, and a thread may only have a single transaction at a time. If #MDB_NOTLS is in use, this does not apply to read-only transactions.
If [parent] is non-NULL, the new transaction will be a nested transaction, with the transaction indicated by parent as its parent. Transactions may be nested to any level. A parent transaction and its cursors may not issue any other operations than mdb_txn_commit and mdb_txn_abort while it has active child transactions.

Does that mean we can create a nested read-only transaction from a write transaction and send that read-only transaction to another thread?

Have a nice day 🌵

@hyc

hyc commented Jul 9, 2024

I am wondering if this sentence is about a subsequent update operation in the database or over the whole environment. Is the cursor cache shared between databases, and therefore, pointers can become invalid after new writes?

It is for the whole environment. While databases are generally independent, if a transaction gets a large enough number of dirty pages, buffers will get flushed and re-used.
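Because the rule applies to the whole environment, a value read from one database has to be copied before any write, even a write to a different database, if it is still needed afterwards. A minimal std-only sketch of how Rust borrows can express that: `ToyTxn` and `read_across_write` are hypothetical names for illustration, not heed or LMDB APIs, and the model only mirrors the borrowing rule, not LMDB's page handling.

```rust
use std::collections::HashMap;

/// Toy transaction over several named databases (hypothetical, not heed or LMDB).
#[derive(Default)]
struct ToyTxn {
    dbs: HashMap<&'static str, HashMap<Vec<u8>, Vec<u8>>>,
}

impl ToyTxn {
    /// The returned slice borrows the whole transaction, not just one database.
    fn get(&self, db: &str, key: &[u8]) -> Option<&[u8]> {
        self.dbs.get(db)?.get(key).map(|v| v.as_slice())
    }

    /// Any write needs exclusive access, so no borrowed value can be held across it.
    fn put(&mut self, db: &'static str, key: &[u8], value: &[u8]) {
        self.dbs
            .entry(db)
            .or_default()
            .insert(key.to_vec(), value.to_vec());
    }
}

/// Reads `key` from `src`, writes it to `dst`, and returns the value that was read.
fn read_across_write(
    txn: &mut ToyTxn,
    src: &str,
    dst: &'static str,
    key: &[u8],
) -> Option<Vec<u8>> {
    // Copy the bytes out first: the borrowed slice cannot be kept across `put`.
    let owned = txn.get(src, key)?.to_vec();
    txn.put(dst, key, &owned);
    Some(owned)
}
```

Holding the slice returned by `get` across the call to `put` would be rejected by the compiler in this model, which is the compile-time guarantee that taking `&mut` on every write provides.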

Does that mean we can create a nested read-only transaction from a write transaction and send that read-only transaction to another thread?

No. Read-only txns don't support nesting.


2 participants