Add hyperloglog functions #717

Closed
pauldix opened this issue Jul 7, 2014 · 5 comments · Fixed by #20603

Comments

@pauldix
Member

pauldix commented Jul 7, 2014

For people that want to calculate distinct counts across large numbers of items, we should support HyperLogLog functions. I was thinking of something like this: http://antirez.com/news/75

Basically, have an aggregate that outputs a string, like the Redis implementation, plus one function for summing those strings and another for outputting the count. Like this:

select hllHash(user_id) from events group by time(1h) into 1h.hll.events

select hllSum(hllHash) from 1h.hll.events

select hllCount(hllSum(hllHash)) from 1h.hll.events where time > now() - 30d

select hllCount(hllHash(user_id)) from events group by time(1d) where time > now() - 7d

Notice the examples where we're chaining them together.
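
To make the chaining concrete, here is a minimal Python sketch of how the three proposed functions could fit together. It is an illustration only, not the InfluxDB or Redis implementation: the names `hll_hash`, `hll_sum`, and `hll_count` simply mirror the proposed `hllHash`, `hllSum`, and `hllCount`, and the register count (`P = 12`), the BLAKE2b hash, and the standard HyperLogLog bias constant are all assumptions made for the example.

```python
import hashlib
import math

P = 12        # assumed precision: 2**12 = 4096 one-byte registers
M = 1 << P


def hll_hash(values):
    """Aggregate: fold values into a sketch, returned as an opaque byte string."""
    registers = bytearray(M)
    for value in values:
        digest = hashlib.blake2b(str(value).encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")        # 64-bit hash of the value
        idx = h >> (64 - P)                      # top P bits choose a register
        rest = h & ((1 << (64 - P)) - 1)         # remaining 64 - P bits
        rank = (64 - P) - rest.bit_length() + 1  # leading zeros in rest, plus one
        if rank > registers[idx]:
            registers[idx] = rank
    return bytes(registers)


def hll_sum(sketches):
    """Merge sketches by taking the per-register maximum."""
    merged = bytearray(M)
    for sketch in sketches:
        if len(sketch) != M:
            raise ValueError("sketch has an unexpected number of registers")
        for i, r in enumerate(sketch):
            if r > merged[i]:
                merged[i] = r
    return bytes(merged)


def hll_count(sketch):
    """Estimate the number of distinct values represented by a sketch."""
    alpha = 0.7213 / (1 + 1.079 / M)             # bias constant for M >= 128
    raw = alpha * M * M / sum(2.0 ** -r for r in sketch)
    zeros = sketch.count(0)
    if raw <= 2.5 * M and zeros:
        # Small-range correction: fall back to linear counting.
        return round(M * math.log(M / zeros))
    return round(raw)


if __name__ == "__main__":
    hour1 = hll_hash(range(0, 6000))             # one hourly bucket of user ids
    hour2 = hll_hash(range(4000, 10000))         # an overlapping bucket
    print(hll_count(hll_sum([hour1, hour2])))    # approximate distinct users
```

Because merging is a per-register maximum, summing the hourly sketches gives exactly the sketch that hashing the whole range at once would give, which is what makes storing `hllHash` output per hour and rolling it up later workable.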

@ceeaspb

ceeaspb commented Aug 19, 2014

https://github.com/aggregateknowledge/postgresql-hll

Could InfluxDB store the HLL as a native data type rather than (or in addition to) a function?

@toddboom toddboom added the idea label Oct 23, 2014
@beckettsean beckettsean removed the idea label Apr 21, 2015
@beckettsean beckettsean added this to the Longer term milestone Apr 21, 2015
@howie

howie commented Jun 12, 2015

+1

@beckettsean
Contributor

As mentioned in my post to the mailing list, we are experimenting with simplifying our open GitHub Issues. This feature request has been rolled into an aggregate issue for all function requests, so that we can close this issue until we are ready to work on it.

You may continue to make comments here. Closing the issue does not mean we are rejecting this idea.

@lesam
Contributor

lesam commented Jan 26, 2021

Re-opening as this is now actively worked on for 1.x

@lesam
Contributor

lesam commented Mar 3, 2021

Note that this was implemented in 1.x (not 2.x). The corresponding issue to implement HLL in Flux is here: influxdata/flux#354
