[Extension] Add GluonTS extension #1903

Merged
merged 14 commits into from
Aug 25, 2022
1 change: 1 addition & 0 deletions examples/build.gradle
@@ -7,6 +7,7 @@ dependencies {
    implementation "org.apache.logging.log4j:log4j-slf4j-impl:${log4j_slf4j_version}"
    implementation project(":basicdataset")
    implementation project(":model-zoo")
    implementation project(":extensions:timeseries")

    runtimeOnly project(":engines:pytorch:pytorch-model-zoo")
    runtimeOnly project(":engines:tensorflow:tensorflow-model-zoo")
100 changes: 100 additions & 0 deletions extensions/timeseries/README.md
@@ -0,0 +1,100 @@
# TimeSeries support

This module contains the time series model support extension, based on [gluon-ts](https://github.com/awslabs/gluon-ts).

Right now, the package provides the `BaseTimeSeriesTranslator` and a transform package that allow you to run inference with your pre-trained time series model.

## Module structure

### Forecast

An abstract class representing the forecast result.

It contains the distribution of the results, the start date of the forecast, the frequency of the time series, etc. Users can retrieve this information by calling the corresponding accessor methods.

- `SampleForecast` extends `Forecast` and contains all the sample paths in the form of an `NDArray`. Users can query the prediction results by accessing the data in the samples.
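
As a rough illustration of how a sample-based forecast can answer quantile queries, the sketch below mimics the idea on plain `float[][]` arrays instead of `NDArray`: sort the samples along the sample axis, then pick the entry at index `round((numSamples - 1) * q)`. The class `SampleQuantileSketch` and its method are hypothetical stand-ins and are not part of this module.

```java
import java.util.Arrays;

/** Hypothetical plain-array stand-in for a sample-based forecast; not part of this module. */
final class SampleQuantileSketch {

    /** Computes the q-quantile per time step; samples has shape (numSamples, predictionLength). */
    static float[] quantile(float[][] samples, float q) {
        int numSamples = samples.length;
        int predictionLength = samples[0].length;
        // Index of the sorted sample that represents the requested quantile
        int sampleIdx = Math.round((numSamples - 1) * q);
        float[] result = new float[predictionLength];
        float[] column = new float[numSamples];
        for (int t = 0; t < predictionLength; t++) {
            for (int s = 0; s < numSamples; s++) {
                column[s] = samples[s][t];
            }
            Arrays.sort(column); // sort along the sample axis for this time step
            result[t] = column[sampleIdx];
        }
        return result;
    }

    public static void main(String[] args) {
        float[][] samples = {{3f, 10f}, {1f, 30f}, {2f, 20f}};
        // Median of three samples per time step; prints [2.0, 20.0]
        System.out.println(Arrays.toString(quantile(samples, 0.5f)));
    }
}
```

With only a handful of samples the quantile is coarse, since it always returns one of the sampled values rather than interpolating between them.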

### TimeSeriesData

The data entry for managing time series data during preprocessing; it is the input to the transform methods. It contains a list of key-value pairs mapping each time series field name to an `NDArray`.

### dataset

- FieldName -- The name field for time series data including START, TARGET, and so on.

### timefeature

This module contains all the methods for generating time features from the prediction frequency.

- Lag -- Generates a list of lags that are appropriate for the frequency.
- TimeFeature -- Generates a list of time features that are appropriate for the frequency.
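
To make the idea of a time feature concrete, here is a minimal sketch of a single day-of-week feature for a daily series. It assumes the GluonTS convention of scaling calendar values into the range [-0.5, 0.5]. The class `TimeFeatureSketch` is hypothetical, and how features are chosen for each frequency is not shown.

```java
import java.time.LocalDateTime;

/** Hypothetical sketch of one time feature for a daily series; not part of this module. */
final class TimeFeatureSketch {

    /** Returns the day of week for each step, scaled to [-0.5, 0.5] (assumed convention). */
    static float[] dayOfWeek(LocalDateTime start, int length) {
        float[] feature = new float[length];
        for (int i = 0; i < length; i++) {
            // java.time uses Monday = 1 ... Sunday = 7; shift to Monday = 0
            int dow = start.plusDays(i).getDayOfWeek().getValue() - 1;
            feature[i] = dow / 6f - 0.5f;
        }
        return feature;
    }

    public static void main(String[] args) {
        // 2022-08-22 is a Monday, so the first value is -0.5 and the seventh is 0.5
        float[] feature = dayOfWeek(LocalDateTime.of(2022, 8, 22, 0, 0), 7);
        for (float value : feature) {
            System.out.println(value);
        }
    }
}
```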

### transform

In general, a transform takes a `TimeSeriesData` and transforms it into another `TimeSeriesData` that may contain more fields. This is done by defining a set of "actions" applied to the raw dataset during training, or by invoking them directly in the translator during inference.

These actions usually create additional features or transform an existing feature.

#### convert

- Convert -- Converts the array shape during preprocessing.
- VstackFeatures.java -- Vertically stacks (vstack) the input name fields of the `TimeSeriesData`. It implements the `TimeSeriesTransform` interface to support the **training feature.**

#### feature

- Feature -- Adds time features during preprocessing.
- AddAgeFeature -- Creates the `FEAT_DYNAMIC_AGE` name field in the `TimeSeriesData`. Adds a feature whose value is small for distant past timestamps and increases monotonically as the timestamp approaches the current one. It implements the `TimeSeriesTransform` interface to support the **training feature.**
- AddObservedValueIndicator -- Creates the `OBSERVED_VALUES` name field in the `TimeSeriesData`. Adds a feature that equals 1 if the value is observed and 0 if the value is missing. It implements the `TimeSeriesTransform` interface to support the **training feature.**
- AddTimeFeature -- Creates the `FEAT_TIME` name field in the `TimeSeriesData`. Adds features whose values depend on the prediction frequency. It implements the `TimeSeriesTransform` interface to support the **training feature.**
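
The two simplest features above can be sketched on plain arrays. This is an illustration only: the `log10(2 + t)` form of the age feature follows the GluonTS convention and is an assumption here, missing values are assumed to be encoded as `NaN`, and the class `FeatureSketch` is hypothetical.

```java
/** Hypothetical plain-array sketch of two feature transforms; not part of this module. */
final class FeatureSketch {

    /** Age feature: small for the distant past, growing monotonically toward the present. */
    static float[] ageFeature(int length) {
        float[] age = new float[length];
        for (int t = 0; t < length; t++) {
            age[t] = (float) Math.log10(2.0 + t); // assumed GluonTS-style formula
        }
        return age;
    }

    /** Observed-value indicator: 1 where the target is present, 0 where it is NaN. */
    static float[] observedIndicator(float[] target) {
        float[] observed = new float[target.length];
        for (int i = 0; i < target.length; i++) {
            observed[i] = Float.isNaN(target[i]) ? 0f : 1f;
        }
        return observed;
    }

    public static void main(String[] args) {
        float[] target = {1f, Float.NaN, 3f};
        System.out.println(java.util.Arrays.toString(observedIndicator(target)));
        System.out.println(java.util.Arrays.toString(ageFeature(target.length)));
    }
}
```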

#### field

- Field -- Processes key-value data entries during preprocessing. It usually adds or removes features in the `TimeSeriesData`.
- RemoveFields -- Removes the input name fields. It implements the `TimeSeriesTransform` interface to support the **training feature.**
- SelectField -- Keeps only the input name fields. It implements the `TimeSeriesTransform` interface to support the **training feature.**
- SetField -- Sets the input name field to the given `NDArray`. It implements the `TimeSeriesTransform` interface to support the **training feature.**
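
The field operations can be pictured as simple edits on a key-value map. The sketch below uses a `Map<String, float[]>` as a stand-in for `TimeSeriesData`. The class `FieldSketch`, its method signatures, and the lowercase field names are hypothetical and do not reflect the actual API of this module.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical map-based sketch of the field transforms; not part of this module. */
final class FieldSketch {

    /** Returns a copy of the data without the named fields. */
    static Map<String, float[]> removeFields(Map<String, float[]> data, List<String> names) {
        Map<String, float[]> out = new LinkedHashMap<>(data);
        out.keySet().removeAll(names);
        return out;
    }

    /** Returns a copy of the data that keeps only the named fields. */
    static Map<String, float[]> selectFields(Map<String, float[]> data, List<String> names) {
        Map<String, float[]> out = new LinkedHashMap<>(data);
        out.keySet().retainAll(names);
        return out;
    }

    /** Returns a copy of the data with the named field set to the given value. */
    static Map<String, float[]> setField(Map<String, float[]> data, String name, float[] value) {
        Map<String, float[]> out = new LinkedHashMap<>(data);
        out.put(name, value);
        return out;
    }

    public static void main(String[] args) {
        Map<String, float[]> data = new LinkedHashMap<>();
        data.put("target", new float[] {1f, 2f, 3f});
        data.put("feat_time", new float[] {-0.5f, 0f, 0.5f});
        Map<String, float[]> withStatic = setField(data, "feat_static_cat", new float[] {0f});
        System.out.println(removeFields(withStatic, List.of("feat_time")).keySet());
        System.out.println(selectFields(withStatic, List.of("target")).keySet());
    }
}
```

Each method returns a new map rather than mutating its input, which keeps the steps easy to chain and to test in isolation.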

#### split

- Split -- Splits time series data for training and inference during preprocessing.
- InstanceSplit -- Splits time series data using the slice from the `Sampler`, for training and inference during preprocessing. It implements the `TimeSeriesTransform` interface to support the **training feature.**

### InstanceSampler

Samples the indices used for splitting, depending on whether the data is for training or inference.

`PredictionSampler` extends `InstanceSampler` for prediction, including testing and validation. It returns the end bound of the time series as the dividing line between the past and the future.
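
A minimal sketch of that behaviour on a plain array, assuming the split point is simply the series length and the "past" is the last few values before it. The class `PredictionSplitSketch` and the `contextLength` parameter are hypothetical names used for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch of prediction-time splitting; not part of this module. */
final class PredictionSplitSketch {

    /** At prediction time, everything observed is the past, so the split is the end. */
    static int predictionSplitIndex(float[] target) {
        return target.length;
    }

    /** Returns up to contextLength values immediately before the split index. */
    static float[] pastWindow(float[] target, int splitIndex, int contextLength) {
        int from = Math.max(0, splitIndex - contextLength);
        return Arrays.copyOfRange(target, from, splitIndex);
    }

    public static void main(String[] args) {
        float[] target = {1f, 2f, 3f, 4f, 5f};
        int split = predictionSplitIndex(target);
        // Prints [3.0, 4.0, 5.0]
        System.out.println(Arrays.toString(pastWindow(target, split, 3)));
    }
}
```

This sketch returns a shorter window when the series has fewer than `contextLength` values; padding such a window is left out.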

### translator

Existing time series model translators and their corresponding factories. Currently, `DeepARTranslator` and `TransformerTranslator` are available.

The following snippet demonstrates how to create a `DeepARTranslator` with `arguments`.

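```java
import ai.djl.timeseries.translator.DeepARTranslator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

Map<String, Object> arguments = new ConcurrentHashMap<>();
arguments.put("prediction_length", 28); // number of future time steps to forecast
arguments.put("use_feat_dynamic_real", false); // do not use dynamic real-valued features
DeepARTranslator.Builder builder = DeepARTranslator.builder(arguments);
DeepARTranslator translator = builder.build();
```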

If you want to write a translator for your own time series model, you can use the transform package for your data preprocessing.

See [examples](../src/main/java/ai/djl/timeseries/examples) for more details.

We plan to add the following features in the future:

- A `TimeSeriesDataset` class to support creating data entries and transforming raw CSV data, like in TimeSeries.
- More time series models that can be trained in DJL.
- ......

## Documentation

You can build the latest javadocs locally using the following command:

```sh
./gradlew javadoc
```

The javadocs output is built in the `build/doc/javadoc` folder.
29 changes: 29 additions & 0 deletions extensions/timeseries/build.gradle
@@ -0,0 +1,29 @@
group 'ai.djl.timeseries'

dependencies {
    api project(":api")
    api project(":basicdataset")
    api "tech.tablesaw:tablesaw-core:${tablesaw_version}"
    api "tech.tablesaw:tablesaw-jsplot:${tablesaw_version}"

    testImplementation("org.testng:testng:${testng_version}") {
        exclude group: "junit", module: "junit"
    }

    testImplementation "org.slf4j:slf4j-simple:${slf4j_version}"
    testImplementation project(":testing")

    testRuntimeOnly project(":engines:mxnet:mxnet-model-zoo")
}

publishing {
    publications {
        maven(MavenPublication) {
            pom {
                name = "TimeSeries for DJL"
                description = "TimeSeries for DJL"
                url = "http://www.djl.ai/extensions/${project.name}"
            }
        }
    }
}
1 change: 1 addition & 0 deletions extensions/timeseries/gradlew
@@ -0,0 +1,90 @@
/*
 * Copyright 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance
 * with the License. A copy of the License is located at
 *
 * http://aws.amazon.com/apache2.0/
 *
 * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
 * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
 * and limitations under the License.
 */
package ai.djl.timeseries;

import ai.djl.ndarray.NDArray;

import java.time.LocalDateTime;

/** An abstract class representing the forecast results for the time series data. */
public abstract class Forecast {

    protected LocalDateTime startDate;
    protected int predictionLength;
    protected String freq;

    /**
     * Constructs a {@code Forecast} instance.
     *
     * @param startDate the time series start date
     * @param predictionLength the time length of prediction
     * @param freq the prediction frequency
     */
    public Forecast(LocalDateTime startDate, int predictionLength, String freq) {
        this.startDate = startDate;
        this.predictionLength = predictionLength;
        this.freq = freq;
    }

    /**
     * Computes a quantile from the predicted distribution.
     *
     * @param q quantile to compute
     * @return value of the quantile across the prediction range
     */
    public abstract NDArray quantile(float q);

    /**
     * Computes a quantile from the predicted distribution.
     *
     * @param q quantile to compute, given as a string that is parsed as a float
     * @return value of the quantile across the prediction range
     */
    public NDArray quantile(String q) {
        return quantile(Float.parseFloat(q));
    }

    /**
     * Computes and returns the forecast mean.
     *
     * @return forecast mean
     */
    public abstract NDArray mean();

    /**
     * Computes the median of the forecast.
     *
     * @return value of the median
     */
    public NDArray median() {
        return quantile(0.5f);
    }

    /**
     * Returns the prediction frequency, such as "D" or "H".
     *
     * @return the prediction frequency
     */
    public String freq() {
        return freq;
    }

    /**
     * Returns the time length of the forecast.
     *
     * @return the prediction length
     */
    public int getPredictionLength() {
        return predictionLength;
    }
}
@@ -0,0 +1,95 @@
/*
 * Copyright 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance
 * with the License. A copy of the License is located at
 *
 * http://aws.amazon.com/apache2.0/
 *
 * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
 * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
 * and limitations under the License.
 */
package ai.djl.timeseries;

import ai.djl.ndarray.NDArray;

import java.time.LocalDateTime;

/**
 * A {@link Forecast} object, where the predicted distribution is represented internally as samples.
 */
public class SampleForecast extends Forecast {

    private NDArray samples;
    private int numSamples;

    /**
     * Constructs a {@code SampleForecast}.
     *
     * @param samples {@link NDArray} array of size (num_samples, prediction_length) (1D case) or
     *     (num_samples, prediction_length, target_dim) (multivariate case)
     * @param startDate start of the forecast
     * @param freq frequency of the forecast
     */
    public SampleForecast(NDArray samples, LocalDateTime startDate, String freq) {
        super(startDate, (int) samples.getShape().get(1), freq);
        this.samples = samples;
        this.numSamples = (int) samples.getShape().head();
    }

    /**
     * Returns the sample array sorted along the sample axis.
     *
     * @return the sorted sample array
     */
    public NDArray getSortedSamples() {
        return samples.sort(0);
    }

    /**
     * Returns the number of samples representing the forecast.
     *
     * @return the number of samples
     */
    public int getNumSamples() {
        return numSamples;
    }

    /** {@inheritDoc} */
    @Override
    public NDArray quantile(float q) {
        int sampleIdx = Math.round((numSamples - 1) * q);
        return getSortedSamples().get("{}, :", sampleIdx);
    }

    /**
     * Returns a new Forecast object with only the selected sub-dimension.
     *
     * @param dim the selected dim
     * @return a new {@link SampleForecast}.
     */
    public SampleForecast copyDim(int dim) {
        NDArray copySamples;
        if (samples.getShape().dimension() == 2) {
            copySamples = samples;
        } else {
            int targetDim = (int) samples.getShape().get(2);
            // Reject negative indices as well, to match the error message below
            if (dim < 0 || dim >= targetDim) {
                throw new IllegalArgumentException(
                        String.format(
                                "must set 0 <= dim < target_dim, but got dim=%d, target_dim=%d",
                                dim, targetDim));
            }
            copySamples = samples.get(":, :, {}", dim);
        }

        return new SampleForecast(copySamples, startDate, freq);
    }

    /** {@inheritDoc} */
    @Override
    public NDArray mean() {
        return samples.mean(new int[] {0});
    }
}