-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Blind index #118
Blind index #118
Conversation
src/search.rs
Outdated
///The required length of the salt. | ||
const REQUIRED_LEN: usize = 32; | ||
|
||
///The result of creating a new index as well as indexing on a particular phrase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We return this structure when they generate the index tokens for a string?
src/search.rs
Outdated
pub sdk: IronSimpleSearch, | ||
} | ||
|
||
///Trait which gives the ability to create an index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the ability to create an index and the ability to initialize an existing index should be in separate traits. Everyone using the SDK needs initialize - the ability to create the index might not even be included in the normal import app.
src/search.rs
Outdated
///If you need to index terms immediately, see `create_index_and_initialize` which will return | ||
///the IronSimpleSearch for reuse. | ||
async fn create_index(&self, group_id: &GroupId) -> Result<DocumentEncryptUnmanagedResult>; | ||
///Create an index and also index the term. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't actually index anything, does it? Just returns the SDK "handle".
I'm not sure how useful it is to have these two separate. Maybe just have create_search_index
and initialize_search_index
?
src/search.rs
Outdated
///Create an index and encrypt it to the provided group_id. | ||
///If you need to index terms immediately, see `create_index_and_initialize` which will return | ||
///the IronSimpleSearch for reuse. | ||
async fn create_index(&self, group_id: &GroupId) -> Result<DocumentEncryptUnmanagedResult>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the DocumentEncryptUnmanagedResult
that this returns be the input initialize_search_index
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DocumentEncryptUnmanagedResult
has the people it was encrypted to, errors about the sharing, etc. Maybe I should change the structure to just include the relevant data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe give it an alias that better describes its usage here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review not complete, still trying to wrap my head around some of the details here, but I have some initial feedback that I hope will generate discussion and be helpful on it's own.
src/search.rs
Outdated
#[async_trait] | ||
pub trait SimpleSeachInitialize { | ||
///Given the encrypted salt and the edeks, decrypt them and give back the IronSimpleSearch object. | ||
async fn initialize_search_sdk( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming suggestions: initialize_search
src/search.rs
Outdated
///Given the encrypted salt and the edeks, decrypt them and give back the IronSimpleSearch object. | ||
async fn initialize_search_sdk( | ||
&self, | ||
encrypted_data: &[u8], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just call this the encrypted_salt
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to encrypted_salt_bytes to hopefully avoid confusion between the entire "EncryptedSalt" which is the deks + bytes and just the Aes bytes.
src/search.rs
Outdated
|
||
///The result of creating a new index as well as initializing an IronSimpleSdk. | ||
///If you only want to create the index, see create_index. | ||
pub struct CreatedIndexResult { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest: CreateIndexResult
or SearchIndexCreateResult
or IndexCreateResult
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of the suggestions, I would choose IndexCreateResult
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't create_index
return an Index
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe a BlindIndex
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about putting the encrypted_salt in the BlindIndexSearchSdk struct and getting rid of this altogether?
Maybe this would look like getting rid of create_index
(or not) and changing create_index_and_initialize_search
to also return Result<BlindIndexSearchSdk>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline that this type of construction doesn't make sense as the existence of a BlindIndexSearch
doesn't necessarily mean that an EncryptedBlindIndexSalt
exists.
So I'm back to suggesting IndexCreateResult
as a name here, unless someone has something better. I think what bothers me about the current name is that it starts with a verb.
src/search.rs
Outdated
///Create an index and encrypt it to the provided group_id. | ||
///If you need to index terms immediately, see `create_index_and_initialize` which will return | ||
///the IronSimpleSearch for reuse. | ||
async fn create_index(&self, group_id: &GroupId) -> Result<DocumentEncryptUnmanagedResult>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe give it an alias that better describes its usage here?
src/search.rs
Outdated
|
||
///The result of creating a new index as well as initializing an IronSimpleSdk. | ||
///If you only want to create the index, see create_index. | ||
pub struct CreatedIndexResult { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about putting the encrypted_salt in the BlindIndexSearchSdk struct and getting rid of this altogether?
Maybe this would look like getting rid of create_index
(or not) and changing create_index_and_initialize_search
to also return Result<BlindIndexSearchSdk>
src/search.rs
Outdated
|
||
#[async_trait] | ||
impl BlindIndexSearchInitialize for IronOxide { | ||
async fn initialize_search( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider rename to initialize_blind_index_search
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You said "ok", but didn't change it. Intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. Pushed.
src/search.rs
Outdated
} | ||
|
||
///Generate the list of tokens to use to find entries that match the search query, given the specified partition_id. | ||
pub fn tokenize_query(&self, query: &str, partition_id: Option<&str>) -> Vec<u32> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what partition_id
is. These two functions could use more descriptive docs, perhaps with an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some more text about it.
src/search.rs
Outdated
|
||
///Generate the list of tokens to use to find entries that match the search query, given the specified partition_id. | ||
/// query - The string you want to tokenize and hash | ||
/// partition_id - An extra string you want to include in every hash, this allows 2 queries with different partition_ids to produce a different set of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you forgot to end your
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what
Cargo.toml
Outdated
@@ -40,6 +40,7 @@ vec1 = "~1.4.0" | |||
# ironoxide requires rt-threaded at runtime, but not at compile time | |||
tokio = {version = "~0.2.0", features=["time", "rt-threaded"]} | |||
dashmap = "3.4.0" | |||
ironcore-search-helpers = { git = "https:/IronCoreLabs/search-helpers", rev = "38be96fe6160a739396cf505054a4e0dec6e0aa6", optional=true} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the plan to still add tags to this repo so we can point to a specific tag for versioning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out we have to publish it or we can't publish ironoxide. I'll do that.
|
||
///Generate the list of tokens to use to find entries that match the search query, given the specified partition_id. | ||
/// query - The string you want to tokenize and hash | ||
/// partition_id - An extra string you want to include in every hash, this allows 2 queries with different partition_ids to produce a different set of tokens for the same data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment looks like it was just copy/pasted from tokenize_query
where I assume it serves a different purpose even though the arguments are the same type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's because they currently do the same thing. I'll change it when it gets updated to do something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That time has come
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the comment. Let me know if you think it could be improved.
src/search.rs
Outdated
pub struct BlindIndexCreateResult { | ||
pub encrypted_salt: EncryptedBlindIndexSalt, | ||
pub sdk: BlindIndexSearch, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks to be unused now, which matches up with what we discussed in the meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
Co-Authored-By: Craig Colegrove <[email protected]>
TODO