Adding Debugging Scenario tests for V1 APIs #2937

rogancarr · 2019-03-12T23:25:13Z

As laid out in #2498, we need scenarios to cover the Debugging functionality we want fully supported in V1.

Scenarios

I can see how my data was read in to verify that I specified the schema correctly
I can see the output at the end of my pipeline to see which columns are available (score, probability, predicted label)
I can look at intermediate steps of the pipeline to debug my model. Example: if I had the text "Help I'm a bug!", I should be able to see the steps where it is normalized to "help i'm a bug", then tokenized into ["help", "i'm", "a", "bug"], then mapped into term numbers [203, 25, 3, 511], then projected into the sparse float vector {3:1, 25:1, 203:1, 511:1}, etc.
(P1) I can access the information needed for understanding the progress of my training (e.g. number of trees trained so far out of how many)
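
To make the intermediate-steps scenario concrete, here is a minimal sketch of the four stages from the example in plain C#. It does not use the ML.NET API; the class name `TextStepsSketch`, its method names, and the vocabulary are all hypothetical, and the vocabulary is chosen only so that the term numbers match the ones quoted above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a hand-rolled stand-in for the stages a text
// featurization pipeline might expose. Not the ML.NET implementation.
public static class TextStepsSketch
{
    // Hypothetical vocabulary, picked to reproduce the term numbers in the example.
    private static readonly Dictionary<string, int> Vocabulary = new Dictionary<string, int>
    {
        ["help"] = 203,
        ["i'm"] = 25,
        ["a"] = 3,
        ["bug"] = 511,
    };

    // Step 1: lowercase the text and drop punctuation other than apostrophes.
    public static string Normalize(string text) =>
        new string(text.ToLowerInvariant()
            .Where(c => char.IsLetterOrDigit(c) || char.IsWhiteSpace(c) || c == '\'')
            .ToArray());

    // Step 2: split the normalized text on spaces.
    public static string[] Tokenize(string normalized) =>
        normalized.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    // Step 3: look up each token's term number.
    // Throws KeyNotFoundException for tokens outside the vocabulary.
    public static int[] MapToTerms(string[] tokens) =>
        tokens.Select(token => Vocabulary[token]).ToArray();

    // Step 4: count occurrences per term number, ordered by term number.
    public static SortedDictionary<int, float> ToSparseVector(int[] terms)
    {
        var vector = new SortedDictionary<int, float>();
        foreach (var term in terms)
        {
            vector.TryGetValue(term, out var count); // count stays 0 when the term is new
            vector[term] = count + 1f;
        }
        return vector;
    }
}
```

Calling `Normalize`, `Tokenize`, `MapToTerms`, and `ToSparseVector` in sequence on "Help I'm a bug!" walks through the same values as the example, which is the kind of step-by-step visibility the scenario asks the real pipeline to provide.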

codecov · 2019-03-13T21:22:48Z

Codecov Report

❗ No coverage uploaded for pull request base (master@4dbc327).
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master    #2937   +/-   ##
=========================================
  Coverage          ?   72.23%           
=========================================
  Files             ?      796           
  Lines             ?   142139           
  Branches          ?    16056           
=========================================
  Hits              ?   102668           
  Misses            ?    35091           
  Partials          ?     4380

Flag	Coverage Δ
#Debug	`72.23% <100%> (?)`
#production	`67.98% <ø> (?)`
#test	`88.39% <100%> (?)`

Impacted Files	Coverage Δ
test/Microsoft.ML.Functional.Tests/Debugging.cs	`100% <100%> (ø)`

abgoswam · 2019-03-14T03:33:34Z

test/Microsoft.ML.Functional.Tests/Debugging.cs

+
+ public LogWatcher()
+ {
+ Lines = new Dictionary<string, int>();


Do we need a Dictionary for this? #Resolved

Good question. I want to hold all the lines, keep a count of how many times I see a unique set of characters, and have the lookup be quick. A hash seems like the way to go, and a Dictionary is a type-safe hash (obligatory Stack Overflow link).

What do you think? I could also reverse the lookup, so that we have a dictionary with the lines of choice in it, and use LogWatcher to count the occurrences. Then I could assert on the number of occurrences (0, 1, 2, etc.) back in the main function.

I'll keep it as is because the perf tradeoff is nominal.
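
For readers following the Dictionary discussion, a minimal sketch of the counting approach described above might look like this. It is not the code from Debugging.cs: the class name `LogWatcherSketch` and the `Observe` and `CountOf` methods are hypothetical, and how log messages reach the watcher is deliberately left out.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: counts how many times each distinct log line is seen.
public sealed class LogWatcherSketch
{
    // Maps each distinct log line to the number of times it was observed.
    public Dictionary<string, int> Lines { get; } = new Dictionary<string, int>();

    // Record one occurrence of a log line (hypothetical entry point).
    public void Observe(string message)
    {
        Lines.TryGetValue(message, out var seen); // seen stays 0 for a new line
        Lines[message] = seen + 1;
    }

    // Number of times a line was observed; 0 if it never appeared.
    public int CountOf(string message) =>
        Lines.TryGetValue(message, out var seen) ? seen : 0;
}
```

A test could then assert `CountOf(expectedLine) > 0` for each expected line. The hash lookup keeps that check quick even when training emits many log lines, which is the tradeoff weighed above.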

abgoswam · 2019-03-14T03:34:05Z

test/Microsoft.ML.Functional.Tests/Debugging.cs

+ new TweetSentiment[]
+ {
+ new TweetSentiment { Sentiment = true, SentimentText = "I love ML.NET." },
+ new TweetSentiment { Sentiment = true, SentimentText = "I love TLC." },


TLC? #Resolved

Tiny little cakes?

lol, Tender Loving Care

shmoradims · 2019-03-14T18:09:15Z

test/Microsoft.ML.Functional.Tests/Debugging.cs

+
+ // Verify that columns can be inspected.
+ // Validate the tokens column.
+ var tokensColumn = transformedData.GetColumn<string[]>(transformedData.Schema["Features_TransformedText"]);


Where is the magic string "Features_TransformedText" coming from? #Resolved

This magic string is the tokenized column. It takes the output column name you give it and returns ${OutputColumnName}_TransformedText. I'll file a separate issue on it.

Issue #2957

shmoradims · 2019-03-14T18:15:21Z

test/Microsoft.ML.Functional.Tests/Debugging.cs

+ var expectedLines = new string[3] {
+ @"[Source=SdcaTrainerBase; Training, Kind=Info] Auto-tuning parameters: L2 = 0.001.",
+ @"[Source=SdcaTrainerBase; Training, Kind=Info] Auto-tuning parameters: L1Threshold (L1/L2) = 0.",
+ @"[Source=SdcaTrainerBase; Training, Kind=Info] Using best model from iteration 7."};


Is this text guaranteed to be constant across runs and OSs? #Resolved

Good question!

A fixed seed gives consistency across runs, and the tests passing on all OSs signifies that the text is consistent there too.

Adding Debugging Scenario tests for V1 APIs

c312ed5

rogancarr requested review from abgoswam and shmoradims March 12, 2019 23:25

Rogan Carr added 2 commits March 13, 2019 13:21

Merge branch 'master' into 2932_debugging_scenarios

ff593be

Updating to the new APIs in master.

2006be7

rogancarr requested review from artidoro and sfilipi March 13, 2019 21:49

Rogan Carr added 2 commits March 13, 2019 15:25

Merge branch 'master' into 2932_debugging_scenarios

82b5729

Fixing changes against master.

1a10d27

abgoswam reviewed Mar 14, 2019

abgoswam approved these changes Mar 14, 2019

shmoradims reviewed Mar 14, 2019

shmoradims approved these changes Mar 14, 2019

rogancarr merged commit d794383 into dotnet:master Mar 14, 2019

rogancarr deleted the 2932_debugging_scenarios branch March 14, 2019 18:40

rogancarr mentioned this pull request Mar 14, 2019

V1 Scenarios need to be covered by tests #2498

Open

ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
