Add support for querying HLLSketch types #454

Merged
brharrington merged 2 commits into Netflix-Skunkworks:main from HLLSketch on May 9, 2023

@bsyk bsyk commented May 9, 2023

HLLSketch types represent approximate distinct counts, so we can treat them like longsum/counters. Usage will be nuanced: we'll probably need to ensure callers are performing per-step queries for the results to make sense.
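
One possible reading of the per-step caveat is that distinct counts are not additive across steps, so adding them up the way one would for a counter inflates the result. The sketch below is an illustration only: exact `Set`s stand in for HLL sketches, and the object and value names are hypothetical, not taken from this PR.

```scala
// Illustration only: exact Sets stand in for HLL sketches, which approximate
// the same union and cardinality behavior. All names here are hypothetical.
object PerStepDistinct {
  // Ids observed in each step (one entry per query step).
  val steps: List[Set[String]] = List(
    Set("a", "b", "c"),
    Set("b", "c", "d"),
    Set("a", "d")
  )

  // Distinct count reported for each step on its own.
  val perStep: List[Long] = steps.map(_.size.toLong)

  // Summing the per-step values counts a repeated id once per step.
  val summed: Long = perStep.sum

  // The distinct count over the whole window needs a union of the steps.
  val overall: Long = steps.reduce(_ union _).size.toLong

  def main(args: Array[String]): Unit = {
    println(s"per-step: $perStep")
    println(s"summed: $summed, overall distinct: $overall")
  }
}
```

Each per-step value is meaningful by itself, but the sum across steps is not a distinct count for the full window.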

@@ -439,6 +439,7 @@ object DruidDatabaseActor {
       case Histogram(_) if metric.isDistSummary => Aggregation.distSummary(metric.name)
       case DataExpr.GroupBy(e, _) => toAggregation(metric, e)
       case DataExpr.Consolidation(e, _) => toAggregation(metric, e)
+      case _: DataExpr.Sum if metric.isSketch => Aggregation.distinct(metric.name)

@brharrington brharrington May 9, 2023

So if other aggregates are used with this type, what will the behavior be? Would it be better to just use the distinct behavior for any aggregate function?

@bsyk bsyk (PR author)

Yes, that's a good point. We wouldn't do anything different if called with max or min. I'll change it to always use this aggregator for columns of this type.
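
A minimal sketch of that follow-up, assuming the sketch check is made before the aggregate function is inspected. Every type and name below (`AggrFunc`, `Metric`, `Distinct`, `LongSum`, `SketchAggregation`, and so on) is a hypothetical stand-in for illustration; none of them is the actual Atlas or iep-apps API, and the final merged code may differ.

```scala
// Hypothetical stand-ins; these are not the real Atlas or iep-apps types.
sealed trait AggrFunc
case object Sum extends AggrFunc
case object Max extends AggrFunc
case object Min extends AggrFunc

final case class Metric(name: String, isSketch: Boolean)

sealed trait Aggregation
final case class Distinct(column: String) extends Aggregation
final case class LongSum(column: String) extends Aggregation
final case class LongMax(column: String) extends Aggregation
final case class LongMin(column: String) extends Aggregation

object SketchAggregation {
  // Sketch columns always map to the distinct aggregation, whichever
  // aggregate function the caller asked for. Other columns keep the
  // usual per-function mapping.
  def toAggregation(metric: Metric, af: AggrFunc): Aggregation = {
    if (metric.isSketch) Distinct(metric.name)
    else
      af match {
        case Sum => LongSum(metric.name)
        case Max => LongMax(metric.name)
        case Min => LongMin(metric.name)
      }
  }
}
```

With this shape, `SketchAggregation.toAggregation(Metric("users", isSketch = true), Max)` yields `Distinct("users")`, the same as it would for `Sum` or `Min`.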

@brharrington brharrington merged commit 72e8233 into Netflix-Skunkworks:main May 9, 2023
@bsyk bsyk deleted the HLLSketch branch May 9, 2023 22:29
manolama pushed a commit to manolama/iep-apps that referenced this pull request Oct 25, 2023