Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(storage): replace MViewTable with CellBasedTable and add relational interfaces #1046

Merged
merged 12 commits into from
Mar 21, 2022

Conversation

wcy-fdu
Copy link
Contributor

@wcy-fdu wcy-fdu commented Mar 17, 2022

What's changed and what's your intention?

This is a draft of cell-based table, which first replaces(rename) MViewTable with CellBasedTable, because MViewTable is already cell-based.

Then this PR adds the relational interfaces for CellBasedTable, such as insert_row, update_row, delete_row , get_row and batch_insert_row. And its scan interface is the same as previous MViewTableIter.

There are some details to be updated, such as the starting key of scan.

Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests

Refer to a related PR or issue link (optional)

Tracking issue #849

@wcy-fdu wcy-fdu marked this pull request as draft March 17, 2022 12:57
@wcy-fdu wcy-fdu changed the title [WIP]feat(storage): add cell-based table WIP feat(storage): add cell-based table Mar 17, 2022
@wcy-fdu wcy-fdu changed the title WIP feat(storage): add cell-based table [WIP] feat(storage): add cell-based table Mar 17, 2022
@codecov
Copy link

codecov bot commented Mar 17, 2022

Codecov Report

Merging #1046 (b4615aa) into main (b54424d) will decrease coverage by 0.07%.
The diff coverage is 55.01%.

@@             Coverage Diff              @@
##               main    #1046      +/-   ##
============================================
- Coverage     70.28%   70.20%   -0.08%     
  Complexity     2766     2766              
============================================
  Files           997      997              
  Lines         83966    84087     +121     
  Branches       1790     1790              
============================================
+ Hits          59016    59034      +18     
- Misses        24059    24162     +103     
  Partials        891      891              
Flag Coverage Δ
java 61.01% <ø> (ø)
rust 72.34% <55.01%> (-0.11%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
rust/storage/src/lib.rs 100.00% <ø> (ø)
rust/storage/src/cell_based_row_serializer.rs 26.66% <26.66%> (ø)
rust/stream/src/executor/batch_query.rs 60.86% <33.33%> (ø)
rust/batch/src/executor/row_seq_scan.rs 61.42% <50.00%> (ø)
rust/storage/src/table/cell_based_table.rs 55.59% <55.35%> (+55.59%) ⬆️
...ust/stream/src/executor/mview/table_state_tests.rs 96.74% <100.00%> (+<0.01%) ⬆️
rust/stream/src/executor/mview/test_utils.rs 100.00% <100.00%> (ø)
rust/common/src/types/ordered_float.rs 22.50% <0.00%> (+0.19%) ⬆️
.../src/executor/managed_state/aggregation/extreme.rs 90.40% <0.00%> (+0.27%) ⬆️

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@wcy-fdu wcy-fdu requested review from twocode and st1page March 17, 2022 13:33
@wcy-fdu wcy-fdu changed the title [WIP] feat(storage): add cell-based table [WIP] feat(storage): replace MViewTable with CellBasedTable and add relational interfaces Mar 19, 2022
@wcy-fdu wcy-fdu requested a review from BugenZhao March 20, 2022 14:25
@wcy-fdu wcy-fdu marked this pull request as ready for review March 21, 2022 06:27
Copy link
Contributor

@st1page st1page left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest generally LGTM, some fields in struct CellBasedTable is duplicate or unnecessary, but we can do refactor later.

CellBasedTableRowIter::new(self.keyspace.clone(), self.column_descs.clone(), epoch).await
}

pub async fn get_for_test(&self, pk: Row, column_id: i32, epoch: u64) -> Result<Option<Datum>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we replaced it with get_cell? or in future PR is ok.

keyspace: Keyspace<S>,
pk: Vec<OrderedColumnDesc>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can add it back when we need it, it is for #588

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

&mut self,
pk: Row,
cell_value: Option<Row>,
column_descs: Vec<ColumnDesc>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
column_descs: Vec<ColumnDesc>,
column_descs: &[ColumnDesc],

&mut self,
pk: Row,
cell_value: Option<Row>,
column_descs: Vec<ColumnDesc>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
column_descs: Vec<ColumnDesc>,
column_descs: &[ColumnDesc],

pub async fn batch_insert_row(
&mut self,
rows: Vec<(Row, Option<Row>)>,
column_descs: Vec<ColumnDesc>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
column_descs: Vec<ColumnDesc>,
column_descs: &[ColumnDesc],

@wcy-fdu wcy-fdu changed the title [WIP] feat(storage): replace MViewTable with CellBasedTable and add relational interfaces feat(storage): replace MViewTable with CellBasedTable and add relational interfaces Mar 21, 2022
@wcy-fdu wcy-fdu merged commit dcd4b90 into main Mar 21, 2022
@wcy-fdu wcy-fdu deleted the wcy_cell_based_table branch March 21, 2022 09:20
Copy link
Member

@BugenZhao BugenZhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please test the implementation well. Thanks.

rust/batch/src/executor/row_seq_scan.rs Show resolved Hide resolved
rust/compute/tests/table_v2_materialize.rs Show resolved Hide resolved
Comment on lines +115 to +120
let state_store_get_res = self
.keyspace
.state_store()
.get(&key.clone(), epoch)
.await
.unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be incorrect. The prefix of the keyspace is not prepended in this way. Have you add some tests for these functions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides, what's the meaning for getting the value using pk as the key?

Comment on lines +136 to +141
let state_store_get_res = self
.keyspace
.state_store()
.get(&key.clone(), epoch)
.await
.unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

let column_id = column.column_id.get_id() as usize;
let mut cell_based_row_deserializer = CellBasedRowDeserializer::new(vec![column.clone()]);
let pk_and_row =
cell_based_row_deserializer.deserialize(&key, &state_store_get_res.unwrap())?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why unwrap the get result here?

Comment on lines +126 to +129
Some(pk_row) => {
return Ok(Some(pk_row.1.index(column_id).clone()));
}
None => Ok(None),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cell_based_row_deserializer.deserialize only returns Some when it encounter a new key. In this case, it always returns None. Please check the other usages first, for example, call deserializer.take().

cell_based_row_deserializer.deserialize(&key, &state_store_get_res.unwrap())?;
match pk_and_row {
Some(pk_row) => {
return Ok(Some(pk_row.1.index(column_id).clone()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value will only have one cell, given vec![column.clone()] as input. It seems incorrect to index here.

Comment on lines +143 to +144
let pk_and_row =
cell_based_row_deserializer.deserialize(&key, &state_store_get_res.unwrap())?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

rust/storage/src/table/cell_based_table.rs Show resolved Hide resolved
schema: Schema,

/// ColumnDesc contains strictly more info than `schema`.
column_descs: Vec<ColumnDesc>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we store the column_descs here, isn't passing another list of ColumnDesc for every get and insert redundant? We can simply pass some IDs there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants