-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Async reconcile: resource definitions and code file parser #2761
Async reconcile: resource definitions and code file parser #2761
Conversation
for _, resource := range p.resourcesForPath[path] { | ||
// Multiple entries in resourcesForPath may point to the same resource. | ||
// By adding resource.Paths to checkPaths, the outer loop will eventually clear those (maybe it already has). | ||
checkPaths = append(checkPaths, resource.Paths...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we do the deduplication here? Embedded sources seem to explode checkPaths quite a bit. The test with 2 models referencing once source leads to about 14 entries in this when one of the model is deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good observation. I tried investigating some simple deduplication checks / other logic changes, but wasn't able to come up with something that benchmarked better. For small list sizes, it seems appending to checkPaths is about the same cost as doing extra deduplication checks.
Given the small number of files that will usually be reparsed, I think it's okay to keep it slightly unoptimized (the seenPaths will still guard against duplicating any expensive ops, like checking file stat), and it's nice to keep the logic simpler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Always bench before optimisation :)
spec.DefaultTimeRange = tmp.DefaultTimeRange | ||
spec.AvailableTimeZones = tmp.AvailableTimeZones | ||
|
||
for i, dim := range tmp.Dimensions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where will the validation of duplicate dimension name and measure name go if not here? (Edited with clarification)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good. Any tweaks can go in new PRs
* API stubs and resource definitions * Revert admin proto change * Fix proto linter errors * Remove API stubs * New compiler * Adapt for duckdbsql changes * Split parser into multiple files * Fix tests * DuckDB parsing fixes and limits * Support schedule and remove loose ends * Basic test coverage * Fix web test * Support errors with line numbers for YAML and DuckDB SQL parse errors * Tests for reparse * Tests for embedded sources * Handle dirty parses * Move SQL annotation parsing to a separate package * Couple SQL and YAML files with same file stem * Review * Stem -> Node * Benchmark reparse * Use known fields for metrics views * Split dots in annotations * Fix errors from merge
The key files in this PR are the revamped resource definitions in
proto/rill/runtime/v1/resources.proto
and the new compiler for parsing projects into resources inruntime/compilers/rillv1
.Some implementation notes (in no particular order):
model.yaml
andmodel.sql
can set config on the same resourcekind
in the YAML or SQL)Migration
resource, to supportinit.sql
and other ad-hoc SQL statementsconnector
for any SQL-based resource, allowing e.g. models to be orchestrated on other DBs. Ifconnector == ""
, the new reconcile will use the default connector (which is DuckDB onstage.db
).{{ ref "other-model" }}
for models-referencing-models-- @annotation:
or{{ configure "key" "value" }}
syntax. The comment-based annotations are only supported for DuckDB SQL files.Notes about catalog changes: