
V1 Scenarios need to be covered by tests #2498

Open
rogancarr opened this issue Feb 11, 2019 · 7 comments
Labels
API (Issues pertaining the friendly API), onnx (Exporting ONNX models or loading ONNX models), P2 (Priority of the issue for triage purpose: Needs to be fixed at some point.), test (related to tests)

Comments

@rogancarr
Contributor

rogancarr commented Feb 11, 2019

In issue #584, we laid out a set of scenarios that we'd like to cover for V1.0 of ML.NET. We need high-level functional tests to make sure that these work well in the 1.0 library.

Here is a list of tests that cover the scenarios; a rough sketch of the basic in-memory data I/O pattern follows the table. Let's use this issue as a top-level issue to track coverage of the APIs.

| Category | Scenarios | Link to Test | Completed PR | Blocked by Issue |
|---|---|---|---|---|
| Data I/O | I can use objects already in memory (as IEnumerable) as input to my ML pipeline/experiment | Link | #2518 | |
| Data I/O | I can use locally stored delimited files (.csv, .tsv, etc.) as input to my ML pipeline/experiment | Link | #2518 | |
| Data I/O | I can use locally stored binary files (.idv) as input to my ML pipeline/experiment | Link | #2518 | |
| Data I/O | I can go through any arbitrary data transformation / model training and save the output to disk as a delimited file (.csv, .tsv, etc.) | Link | #2518 | |
| Data I/O | I can go through any arbitrary data transformation / model training and save the output to disk as a binary file (.idv) | Link | #2518 | |
| Data I/O | I can go through any arbitrary data transformation / model training and convert the output to an IEnumerable | Link | #2518 | |
| Data I/O | I can use data from a SQL database by reading it into memory or to disk using an existing SQL reader and then use that as input to my ML pipeline/experiment (may be a sample) | | | |
| Data Transformation, Feature Engineering | I can take an existing ONNX model and get predictions from it (as both final output and as input to downstream pipelines) | | | |
| Data Transformation, Feature Engineering | Extensible transformation: it should be possible to write simple row-mapping transforms. Example: "I can add custom steps to my pipeline such as creating a new column that is the addition of two other columns, or easily add cosine similarity, without having to create my own build of ML.NET." | | #2803 | |
| Data Transformation, Feature Engineering | I can modify settings in the TextFeaturizer to update the number of word-grams and char-grams used, along with things like the normalization. | | #2803 | #2802 |
| Data Transformation, Feature Engineering | I can apply normalization to the columns of my data | | #2803 | |
| Data Transformation, Feature Engineering | I can take an existing TF model and get predictions from it or any layer in the model | | WIP (Rogan) | |
| Data Transformation, Feature Engineering | P1: I can take an existing TF model and use ML.NET APIs to identify the input and output nodes | | WIP (Rogan) | |
| Debugging | I can see how my data was read in to verify that I specified the schema correctly | | #2937 | |
| Debugging | I can see the output at the end of my pipeline to see which columns are available (score, probability, predicted label) | | #2937 | |
| Debugging | I can look at intermediate steps of the pipeline to debug my model. Example: if I were to have the text "Help I'm a bug!", I should be able to see the steps where it is normalized to "help i'm a bug", then tokenized into ["help", "i'm", "a", "bug"], then mapped into term numbers [203, 25, 3, 511], then projected into the sparse float vector {3:1, 25:1, 203:1, 511:1}, etc. | | #2937 | |
| Debugging | P1: I can access the information needed for understanding the progress of my training (e.g. number of trees trained so far out of how many) | | #2937 | |
| Evaluation | I can evaluate a model trained for any of my tasks on test data. The evaluation outputs metrics that are relevant to the task (e.g. AUC, accuracy, P/R, and F1 for binary classification) | | #2646 | |
| Evaluation | P1: I can get the data that will allow me to plot PR curves | | #2646 | #2645 |
| Explainability & Interpretability | I can get near-free (local) feature importance for scored examples (Feature Contributions) | | #2584 | |
| Explainability & Interpretability | I can view how much each feature contributed to each prediction for trees and linear models (Feature Contributions) | | #2584 | |
| Explainability & Interpretability | I can view the overall importance of each feature (Permutation Feature Importance, GetFeatureWeights) | | #2584 | |
| Explainability & Interpretability | I can train interpretable models (linear model, GAM) | | | |
| Introspective training | I can take an existing model file and inspect what transformers were included in the pipeline | | #2859 | |
| Introspective training | I can inspect the coefficients (weights and bias) of a linear model without much work. Easy to find via auto-complete. | | #2859 | |
| Introspective training | I can inspect the normalization coefficients of a normalizer in my pipeline without much work. Easy to find via auto-complete. | | #2859 | |
| Introspective training | I can inspect the trees of a boosted decision tree model without much work. Easy to find via auto-complete. | | #2859 | |
| Introspective training | I can inspect the topics after training an LDA transform. Easy to find via auto-complete. | | #2859 | |
| Introspective training | I can inspect a categorical transform and see which feature values map to which key values. Easy to find via auto-complete. | | #2859 | |
| Introspective training | I can access the GAM feature histograms through APIs | | #2859 | |
| Model files | I can train a model and save it as a file. This model includes the learner as well as the transforms (e.g. decomposability) | | | |
| Model files | I can use a model file in a completely different process to make predictions (e.g. decomposability) | | | |
| Model files | I can use newer versions of ML.NET with ML.NET model files of previous versions (for v1.x) | | test in V1.1 | |
| Model files | I can easily figure out which NuGets (and versions) I need to score an ML.NET model | | | |
| Model files | P2: I can move data between NimbusML and ML.NET (using IDV). Prepare with NimbusML and load with ML.NET | | V1.1 | |
| Model files | P2: I can use model files interchangeably between compatible versions of ML.NET and NimbusML. | | V1.1 | |
| Model files | P1: I can export ML.NET models to ONNX (limited to the existing internal functionality) | | | |
| Model files | I can save a model to text | | V1.1 | |
| Prediction | I can get predictions (scores, probabilities, predicted labels) for every row in a test dataset | | | |
| Prediction | I can reconfigure the threshold of my binary classification model based on analysis of the PR curves or other metric scores. | Link | | #2465 |
| Prediction | (Might not work?) I can map the score/probability for each class to the original class labels I provided in the pipeline (multiclass, binary classification). | | | |
| Tasks | I can train a model to do classification (binary and multiclass) | | #2646 | |
| Tasks | I can train a model to do regression | | #2646 | |
| Tasks | I can train a model to do anomaly detection | | #2646 | |
| Tasks | I can train a model to do recommendations | | #2646 | |
| Tasks | I can train a model to do ranking | | #2646 | |
| Tasks | I can train a model to do clustering | | #2646 | |
| Training | I can provide multiple learners and easily compare evaluation metrics between them. | | #2921 | |
| Training | I can use an initial predictor to update/train the model for some trainers (e.g. linear learners like averaged perceptron). Specifically, start the weights for the model from the existing weights. | | #2921 | |
| Training | Metacomponents smartly restrict their use to compatible components. Example: "When specifying what trainer OVA should use, a user will be able to specify any binary classifier. If they specify a regression or multi-class classifier, ideally that should be a compile error." | | #2921 | |
| Training | I can train TF models when I bring a TF model topology | | WIP (Rogan) | |
| Training | I can use OVA and easily add any binary classifier to it | | #2921 | |
| Use in web environments | I can use ML.NET models to make predictions in multi-threaded environments like ASP.NET. (This doesn't have to be inherent in the prediction engine but should be easy to do.) | | | |
| Validation | Cross-validation: I can take a pipeline and easily do cross validation on it without having to know how CV works. | Link | #2470 | |
| Validation | I can use a validation set in a pipeline for learners that support them (e.g. FastTree, GAM) | Link | #2503 | |
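
For reference, here is a rough sketch of the basic in-memory data I/O pattern that the first few Data I/O rows describe. This is only a sketch against the expected 1.0 API surface; the `HousingData` type and the output path are made up for illustration and are not the actual test code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.ML;

// Hypothetical row type used only for this sketch.
public class HousingData
{
    public float Size { get; set; }
    public float Price { get; set; }
}

public static class DataIoSketch
{
    public static void Run()
    {
        var mlContext = new MLContext(seed: 1);

        // Scenario: use objects already in memory (IEnumerable) as pipeline input.
        IEnumerable<HousingData> rows = new[]
        {
            new HousingData { Size = 1.1f, Price = 1.2f },
            new HousingData { Size = 2.1f, Price = 2.3f },
        };
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // Scenario: save the data to disk as a delimited file.
        using (var stream = File.Create("housing.tsv"))
            mlContext.Data.SaveAsText(data, stream, separatorChar: '\t');

        // Scenario: convert the data back to an IEnumerable.
        var roundTripped = mlContext.Data
            .CreateEnumerable<HousingData>(data, reuseRowObject: false)
            .ToList();
        Console.WriteLine($"Rows: {roundTripped.Count}");
    }
}
```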
@endintiers
Contributor

So we aren't going to support training directly from an IEnumerable backed by an EF DbContext in V1? If so, it would be good to note that somewhere, because it works up to a point and then fails in a confusing way.

@rogancarr
Contributor Author

rogancarr commented Feb 13, 2019

@endintiers For V1, we will only support training with IDataView, but that should still be possible with an IEnumerable backed by an EF DbContext. (@singlis and @Ivanidzo4ka)
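
Something along these lines is the pattern we have in mind (a sketch only; the context and entity types here are hypothetical, and note the thread-safety caveat discussed below):

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.ML;

// Hypothetical entity and context, for illustration only.
public class House
{
    public int Id { get; set; }
    public float Size { get; set; }
    public float Price { get; set; }
}

public class MyDbContext : DbContext
{
    public MyDbContext(DbContextOptions<MyDbContext> options) : base(options) { }
    public DbSet<House> Houses { get; set; }
}

public static class EfTrainingSketch
{
    public static IDataView Load(MLContext mlContext, MyDbContext db)
    {
        // DbSet<T> implements IEnumerable<T>, so it can be wrapped directly;
        // rows are pulled lazily from the database as the IDataView is cursored.
        return mlContext.Data.LoadFromEnumerable(db.Houses);
    }
}
```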

Would you mind giving an example of what you've been doing and how it's been failing?

@Ivanidzo4ka
Contributor

See #2159 for more details.

The actual problem is that EF Core DbContexts aren't thread-safe, so if cursors on multiple threads exhaust the pool and more than one of them tries to read the Enumerable for more rows at the same time: boom.

I believe setting conc to 1 on the MLContext should help, but that needs verification.
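
i.e. something like this (a hedged sketch; it assumes the preview-era MLContext constructor that still exposes a `conc` argument, which may not survive into 1.0):

```csharp
using Microsoft.ML;

public static class SingleThreadedContextSketch
{
    public static MLContext Create()
    {
        // Assumption: conc: 1 limits the environment to a single concurrent
        // thread, so only one cursor should be pulling rows from the
        // EF-backed enumerable at any time.
        return new MLContext(seed: 1, conc: 1);
    }
}
```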

@rogancarr
Contributor Author

Adding the SQL test back: I had misunderstood the requirements, and it looks like we can use an EF DbContext for it.

@endintiers
Contributor

Thanks. I have looked at how mlContext.CreateStreamingDataView could be modified to detect this case and create an IDataView that signals 'single-threaded source' downstream. It could be done; it's just the synchronization of buffer re-loads that is an issue. Given the time left before release, though, just setting conc to 1 is a good move. Sadly this will slow training on serious datasets with many available CPUs. I should volunteer to do at least this test (using EF Core In-Memory). In the real world, generating text files from the DB and training on them instead seems to be the best move.
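
Roughly what I mean by the text-file route (a sketch only; the column layout and path are made up, and it reuses the hypothetical MyDbContext from above):

```csharp
using System.IO;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Row type for the text loader; column indices match the TSV written below.
public class HouseRow
{
    [LoadColumn(0)] public float Size { get; set; }
    [LoadColumn(1)] public float Price { get; set; }
}

public static class DbToTextFileSketch
{
    public static IDataView Export(MLContext mlContext, MyDbContext db, string path)
    {
        // Materialize the query once, on a single thread, and write a TSV.
        var lines = db.Houses
            .AsEnumerable()
            .Select(h => $"{h.Size}\t{h.Price}");
        File.WriteAllLines(path, lines);

        // Train from the file; the text loader cursors it without touching EF.
        return mlContext.Data.LoadFromTextFile<HouseRow>(path, separatorChar: '\t');
    }
}
```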

@jwood803
Contributor

jwood803 commented Mar 9, 2019

I have a sample that reads data from a SQL database. I can create another one using one of the data sets used in the samples. Would the connection string be left out of the sample, since it isn't necessary for the sample?

@endintiers
Contributor

@jwood803 You should load the (textfile?) dataset into an in-memory database provider such as Microsoft.EntityFrameworkCore.InMemory. These are functionally equivalent to real DB providers and are used to build DB tests. You won't need a connection string.
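
For instance, roughly (a sketch only; it reuses the hypothetical MyDbContext/House types from the earlier sketch, the database name is arbitrary, and it needs the Microsoft.EntityFrameworkCore.InMemory package):

```csharp
using System.Linq;
using Microsoft.EntityFrameworkCore;
using Microsoft.ML;

public static class InMemoryDbTestSketch
{
    public static void Run()
    {
        // Build an EF Core in-memory database and seed it; in the test, the
        // seed rows could come from one of the existing sample datasets.
        var options = new DbContextOptionsBuilder<MyDbContext>()
            .UseInMemoryDatabase(databaseName: "MlNetSqlScenarioTest")
            .Options;

        using (var db = new MyDbContext(options))
        {
            db.Houses.AddRange(
                new House { Id = 1, Size = 1.1f, Price = 1.2f },
                new House { Id = 2, Size = 2.1f, Price = 2.3f });
            db.SaveChanges();

            var mlContext = new MLContext(seed: 1);
            var data = mlContext.Data.LoadFromEnumerable(db.Houses.ToList());
            // ... build and fit the pipeline on `data` as usual.
        }
    }
}
```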

@frank-dong-ms-zz frank-dong-ms-zz added the P2 Priority of the issue for triage purpose: Needs to be fixed at some point. label Jan 9, 2020
@harishsk harishsk added the onnx Exporting ONNX models or loading ONNX models label Apr 29, 2020