[[feature collection]] requested Functions and query operators #5930

Open
15 of 32 tasks
beckettsean opened this issue Mar 7, 2016 · 107 comments

Comments

@beckettsean
Contributor

beckettsean commented Mar 7, 2016

This issue contains a list of related feature requests that are not on the near-term roadmap. The feature requests in this issue are all requests for new functions. If you want to request a function that is not already listed, please comment on this issue and we will add it to the checklist.

Aggregations

Selectors

Transformations

Operators

InfluxQL enhancements

@tbowmo

tbowmo commented Mar 20, 2016

Is there any timeline for these functions? (I'm very interested in the aggregate integral functions, to calculate kWh from watts). Seems that a feature request was opened a year ago.
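
Until an integral aggregate is available, the kWh calculation mentioned above can be approximated on the client. The following is a minimal Python sketch, assuming the power readings have already been fetched as `(timestamp, watts)` pairs sorted by time; the function name `energy_kwh` and the sample values are hypothetical and are not part of InfluxDB.

```python
from datetime import datetime, timedelta


def energy_kwh(samples):
    """Integrate (timestamp, watts) samples with the trapezoidal rule.

    Assumes samples are sorted by timestamp. Returns energy in kWh.
    """
    watt_seconds = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        dt = (t1 - t0).total_seconds()
        # Area of the trapezoid between two consecutive readings.
        watt_seconds += 0.5 * (p0 + p1) * dt
    # 3600 s per hour * 1000 W per kW
    return watt_seconds / 3_600_000.0


# Hypothetical readings: 1000 W for 30 minutes, then a ramp to 2000 W.
start = datetime(2016, 3, 20, 12, 0, 0)
samples = [
    (start, 1000.0),
    (start + timedelta(minutes=30), 1000.0),
    (start + timedelta(minutes=60), 2000.0),
]

print(energy_kwh(samples))  # 1.25
```

The trapezoidal rule is only one possible choice; a server-side integral function may interpolate differently between points.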

@beckettsean
Contributor Author

There is no timeline for each specific function/feature. All work on new functions was on hold while the query engine was refactored, and that refactoring was merged into InfluxDB 0.11. We plan to introduce a few functions with each release from now on.

@timgriffiths

It would also be great to get #3633 added to your list. This feature would be very useful on the client side when graphing data.

@sidgod

sidgod commented Jun 25, 2018

+1 for histogram()

@charlottetucker

+1 for CAST function. Seriously, why isn't this a thing?

@AnnapoorniS

Is a user-defined function feature available in InfluxDB? It would be a great help if somebody could share a sample of user-defined functions in InfluxDB, similar to stored procedures. Any suggestions? Thanks in advance.

@timhallinflux
Contributor

@AnnapoorniS I'd suggest looking at Flux https:/influxdata/platform/tree/master/query

@henriklb

I'm assuming that because #834 is checked, it is considered completed.

These functions still do not work (v1.5.1) and would be very useful for me:
sum(non_negative_derivative(value))
sum(derivative(value))

Thanks

@balakrishnabk

Requesting aggregation by weekday and week number, based on the timestamp.
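
As an illustration of what grouping by weekday or week number could mean, here is a minimal client-side Python sketch. It assumes points arrive as `(timestamp, value)` pairs with UTC timestamps and uses ISO 8601 week numbering; the helper names `iso_week`, `iso_weekday`, and `mean_by` are hypothetical. Note that the result depends on the time zone chosen for the timestamps.

```python
from collections import defaultdict
from datetime import datetime, timezone


def iso_week(ts):
    """Return (ISO year, ISO week number) for a timestamp."""
    year, week, _ = ts.isocalendar()
    return (year, week)


def iso_weekday(ts):
    """Return the ISO weekday: 1 = Monday ... 7 = Sunday."""
    return ts.isoweekday()


def mean_by(points, key):
    """Average point values grouped by key(timestamp)."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for ts, value in points:
        acc = sums[key(ts)]
        acc[0] += value
        acc[1] += 1
    return {k: total / count for k, (total, count) in sums.items()}


# Hypothetical data: a Monday and Tuesday in ISO week 32, a Monday in week 33.
points = [
    (datetime(2018, 8, 6, tzinfo=timezone.utc), 10.0),
    (datetime(2018, 8, 7, tzinfo=timezone.utc), 20.0),
    (datetime(2018, 8, 13, tzinfo=timezone.utc), 30.0),
]

print(mean_by(points, iso_weekday))  # {1: 20.0, 2: 20.0}
print(mean_by(points, iso_week))     # {(2018, 32): 15.0, (2018, 33): 30.0}
```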

@matthenning

Does #3552 fit in here?

@MarcoPignati

+1 for histogram #3674

@mrungue

mrungue commented Nov 14, 2018

I'd like to request a new summary function made specifically for time series visualization: Largest-Triangle-Three-Buckets.

I've been using mean to downsample data, but averaging time series data produces a smooth line that does not accurately represent the underlying data.

Sveinn Steinarsson wrote a thesis titled "Downsampling Time Series for Visual Representation" that describes and tests algorithms for that purpose. The d3fc library uses "Largest-Triangle-Three-Buckets" in the front end, but it would be best if InfluxDB could downsample the data so that as few bytes as possible are transmitted and processed.

+1

@n1nj4888

👍 +1 for timeshift functions - Seems like this has been requested multiple times by a number of users

@sfitts

sfitts commented Nov 30, 2018

The checklist implies that both "percentile + derivative" (#5150) and "Top accepts nested functions" (#2467, #5345) are addressed. However, the documentation says that neither percentile nor top supports nested functions (as of 1.7). So... is the checklist wrong? Can you do these without nested function support? Is the doc wrong?

@timhallinflux
Contributor

Hey @sfitts! (from the Forte days?!?) ... I believe you can do these now with subqueries. So, while it's not a direct nesting, it can be done that way. The primary focus going forward in terms of extending the query surface area is going to be done via Flux. https://docs.influxdata.com/flux/v0.7/

InfluxQL will, of course, continue to be supported. But there are challenges that we are going to address at the query engine layer and then open up the ability to address so many of these requests via Flux. Have a look, let us know!

@sfitts

sfitts commented Nov 30, 2018

@timhallinflux just after I wrote this it dawned on me that subqueries were probably the answer -- thanks for the confirmation. Also hadn't picked up on the fact that a query language replacement was in the works, so I'll definitely check that out.

(and yep -- I date back to Forte 👍)

@timhallinflux
Contributor

Good to reconnect! I want to be super clear.... we are not going to "replace" InfluxQL. As we continue forward, InfluxQL will continue to be the primary on-ramp and supported. But, in terms of working with time series data -- we determined that a functional language can be a powerful way to manipulate the functions, results, and simplify developer code (in the end). So many of these requests were part of our design center for Flux itself and ensuring that we can deliver on them. We have been listening, observing, and attempting to address many of these for multiple years now. We started and failed at least twice...a couple of attempts that never saw the light of day and weren't "ship worthy".

With Flux, we are on the brink of breaking through and delivering on this list (and more!) while continuing to support InfluxQL. For example, histogram is already in. So, we maintain the easy on-ramp via InfluxQL. If that is all you need...great! But if you need more power (and there are a number of time series use cases which certainly do, particularly given this list), Flux will be there. In InfluxDB 1.7, there are two query engines that run in parallel. In 2.0, the Flux engine will be the primary engine and InfluxQL will run in a compatibility mode on top of that engine. Hope that helps clarify!

@sfitts

sfitts commented Nov 30, 2018

Makes sense (and thanks for the clarification).

@dgnorton dgnorton added the 1.x label Jan 7, 2019
@tecmatia-dp

3 years and Boolean cast to Integer (1/0) is not yet implemented...

@uski

uski commented Jun 28, 2019

Thank you for the hard work. I just came here to express that more functions would definitely be very useful.
I am currently held back by the lack of a logarithmic mean aggregator, and I believe the query language is not flexible enough to let me compute it in the query itself.

So, +1 for log mean, please!

@stale stale bot added the wontfix label Sep 26, 2019
@influxdata influxdata deleted a comment from stale bot Sep 26, 2019
@stale stale bot removed the wontfix label Sep 26, 2019
@andrewthad

A Boolean type cast would be really useful at my workplace.

@gpratt3151

+1 for Histogram

@MartinMuzatko

MartinMuzatko commented Jan 18, 2020

I'd like to have CAST from string to boolean to integer
"true" => true => 1
"false" => false => 0

@anthosz

anthosz commented Mar 15, 2020

+1 to CAST from boolean to integer
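
The requested conversion chain (string to boolean to integer) can be sketched on the client side as follows. This is a minimal Python illustration, assuming the strings are exactly "true" or "false" apart from case and surrounding whitespace; the names `cast_bool` and `cast_int` are hypothetical and do not correspond to any InfluxQL function.

```python
def cast_bool(text):
    """Convert the strings "true"/"false" to a boolean; reject anything else."""
    normalized = text.strip().lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    raise ValueError(f"not a boolean string: {text!r}")


def cast_int(flag):
    """Convert a boolean to 1 or 0."""
    return int(flag)


print(cast_int(cast_bool("true")))   # 1
print(cast_int(cast_bool("false")))  # 0
```

Raising on unknown strings rather than defaulting to 0 is a design choice here; it keeps bad field values from silently counting as "false".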

@timhallinflux
Contributor

Casting boolean -> int has been implemented in Flux. This is available as a technical preview in 1.7 and we are just about to release 1.8 which includes some additional, significant updates.

https://docs.influxdata.com/flux/v0.50/introduction/flux-vs-influxql/

@garcipat

garcipat commented Mar 6, 2024

I'd like to request a new summary function made specifically for time series visualization: Largest-Triangle-Three-Buckets.

I've been using mean to downsample data, but averaging time series data produces a smooth line that does not accurately represent the underlying data.

Sveinn Steinarsson wrote a thesis titled "Downsampling Time Series for Visual Representation" that describes and tests algorithms for that purpose. The d3fc library uses "Largest-Triangle-Three-Buckets" in the front end, but it would be best if InfluxDB could downsample the data so that as few bytes as possible are transmitted and processed.

Has anybody worked on this and come up with at least an Influx aggregation function for it? I would love to have this; otherwise I will have to build it myself or fall back to a less accurate algorithm.
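
For reference, the algorithm itself is short. Below is a minimal client-side Python sketch of Largest-Triangle-Three-Buckets, written to follow the commonly published reference form of Steinarsson's algorithm. It is not an InfluxDB or Flux function; the name `lttb`, the `(x, y)` tuple layout, and the test data are assumptions made for illustration.

```python
def lttb(points, threshold):
    """Downsample sorted (x, y) points to `threshold` points using
    Largest-Triangle-Three-Buckets. The first and last points are always kept."""
    n = len(points)
    if threshold < 3:
        raise ValueError("threshold must be at least 3")
    if threshold >= n:
        return list(points)

    every = (n - 2) / (threshold - 2)  # bucket width, excluding the end points
    sampled = [points[0]]
    a = 0  # index of the most recently selected point

    for i in range(threshold - 2):
        # Average of the NEXT bucket serves as the third triangle corner.
        avg_start = int((i + 1) * every) + 1
        avg_end = min(int((i + 2) * every) + 1, n)
        nxt = points[avg_start:avg_end]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        # Pick the point in the CURRENT bucket forming the largest triangle
        # with the previously selected point and the next bucket's average.
        start = int(i * every) + 1
        end = int((i + 1) * every) + 1
        ax, ay = points[a]
        max_area, chosen = -1.0, start
        for j in range(start, end):
            px, py = points[j]
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay)) * 0.5
            if area > max_area:
                max_area, chosen = area, j
        sampled.append(points[chosen])
        a = chosen

    sampled.append(points[-1])
    return sampled


# Hypothetical series: flat at 0 with a single spike at x = 50.
points = [(float(i), 100.0 if i == 50 else 0.0) for i in range(100)]
result = lttb(points, 10)
print(len(result))              # 10
print((50.0, 100.0) in result)  # True
```

Unlike a windowed mean, which would flatten the spike in this example, the selected points are actual samples, so the peak survives the downsampling.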

@garcipat

To add something useful here regarding filtering of peaks: I implemented a simpler aggregation algorithm that compares values to the average of a time window. Maybe that helps someone else as well:

```
import "math"
import "join"

// Fields to include; adjust to your schema.
fields = ["test", "test2"]

// Raw points, split into 1-minute windows.
data = from(bucket: "MyBucket")
  |> range(start: 2024-03-11T13:00:00Z, stop: 2024-03-11T13:05:00Z)
  |> filter(fn: (r) => r._measurement == "measurement")
  |> filter(fn: (r) => contains(value: r._field, set: fields))
  |> window(every: 1m)

// Mean of each window, stamped with the window start so it can be joined back.
// Named windowMean so that it does not shadow the built-in mean().
windowMean = data
  |> mean()
  |> duplicate(column: "_start", as: "_time")

// Full join so every point keeps a row; fill() then carries the window mean
// forward to the points that had no matching timestamp.
diff = join.full(left: data, right: windowMean,
  on: (l, r) => l._time == r._time and l._field == r._field and l._measurement == r._measurement,
  as: (l, r) => {
    time = if exists l._time then l._time else r._time
    return {l with _time: time, _value: l._value, _mean: r._value}
  })
  |> fill(column: "_mean", usePrevious: true)
  |> filter(fn: (r) => exists r._value)
  // Absolute distance of each point from its window mean.
  |> map(fn: (r) => ({r with _diff: math.abs(x: (r._value - r._mean))}))
  // Per window, keep the value that deviates most from the mean.
  |> reduce(fn: (r, accumulator) => ({
        _value: if (exists r._value and r._diff > accumulator._maxDiff) then
            r._value
            else accumulator._value,
        _maxDiff: if r._diff > accumulator._maxDiff then
            r._diff
            else accumulator._maxDiff
    }), identity: { _value: 0.0, _maxDiff: 0.0 })
  |> duplicate(column: "_stop", as: "_time")
  |> window(every: inf)
  |> yield(name: "diff")
```

You can change the aggregation window (in the example 1m), the range and the filtered fields.
