LIME: “Why Should I Trust You?” Explaining the Predictions of Any Classifier

Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

University of Washington

KDD

What's New

This paper presents a unique way of explaining an individual prediction by showing feature importances, and of explaining the model as a whole by ranking features by their global importance, aggregated over predictions on a diverse set of instances.

Intuition

  • Explanations that are locally faithful to the model help users decide whether to trust its predictions


Hypothesis

  • Any instance can be reduced to a simplified instance, i.e., a binary vector marking the presence or absence of each feature
  • The simplified instance can be used to predict the outcome in a locality with an interpretable classifier such as linear regression or logistic regression
  • A local classifier over the simplified instance yields feature importances, which serve as the explanation
  • Aggregating feature importances over the dataset gives global importance (a minimal sketch of the simplified representation follows this list)
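
A minimal sketch of the simplified (interpretable) representation and its perturbations, assuming a bag-of-words text instance; every function and variable name here is illustrative, not taken from the paper's code.

```python
import numpy as np

def to_simplified(instance_words):
    """Map an instance to its interpretable representation: a binary vector
    marking the presence or absence of each feature (here, words)."""
    vocab = sorted(set(instance_words))
    x_prime = np.ones(len(vocab), dtype=int)   # every feature is present in the original instance
    return vocab, x_prime

def sample_perturbations(vocab, n_samples=500, seed=0):
    """Draw simplified instances z' by randomly switching features on and off."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=(n_samples, len(vocab)))

def to_original(vocab, z_prime_row):
    """Map a simplified instance back to the original space so the
    black-box model f can score it (keep only the features marked present)."""
    return [w for w, keep in zip(vocab, z_prime_row) if keep]
```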

Problem Formulation

  • The idea is to select a simple, explainable function g that estimates the model's prediction in the locality of the instance.

  • For each instance x, an explanation model g is fit to generate predictions similar to the actual model f over a locality defined by the kernel \pi_x; \Omega(g) is the complexity of the generated explanation, e.g., the depth of the tree if g is a decision tree, or the number of non-zero coefficients if g is a logistic regression (a minimal sketch of this fit follows the figure notes below).

    \xi(x)=\underset{g \in G}{\operatorname{argmin}}\ \mathcal{L}\left(f, g, \pi_{x}\right)+\Omega(g)

    \mathcal{L}\left(f, g, \pi_{x}\right)=\sum_{z, z^{\prime} \in \mathcal{Z}} \pi_{x}(z)\left(f(z)-g\left(z^{\prime}\right)\right)^{2}
  • The figure below gives a good intuition

    Source: Author

    • The blue and pink regions are separated by the global model f
    • The bold red cross is the instance being explained
    • The locality weights generated by the kernel \pi_x are indicated by the size of each cross and circle
    • The local function over the simplified input representation is learnt as the dashed linear boundary
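
A minimal sketch of the locally weighted fit behind \xi(x), assuming an exponential kernel over L2 distance in the simplified space and a ridge-regularized linear surrogate; the kernel width, the use of Ridge, and keeping only the top-ranked coefficients as a stand-in for \Omega(g) are illustrative choices, not the paper's exact K-Lasso procedure.

```python
import numpy as np
from sklearn.linear_model import Ridge

def explain_instance(f_scores, z_prime_samples, vocab, kernel_width=0.75, top_k=10):
    """Fit the local surrogate g by minimizing the kernel-weighted squared loss
    L(f, g, pi_x) = sum_z pi_x(z) * (f(z) - g(z'))^2.

    f_scores: black-box predictions f(z) for each perturbed sample (1-D array).
    z_prime_samples: binary simplified instances z' (n_samples x n_features).
    """
    x_prime = np.ones(z_prime_samples.shape[1])               # the instance itself, simplified
    distances = np.linalg.norm(z_prime_samples - x_prime, axis=1)
    pi_x = np.exp(-(distances ** 2) / kernel_width ** 2)      # locality weights

    g = Ridge(alpha=1.0)
    g.fit(z_prime_samples, f_scores, sample_weight=pi_x)      # weighted least squares = the loss above

    order = np.argsort(-np.abs(g.coef_))[:top_k]              # sparsity: keep only the top-k features
    return [(vocab[i], float(g.coef_[i])) for i in order]
```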
Experiment Setup and Testing Method

  • Are the explanations faithful to the model?
    • The actual model is chosen to be interpretable, i.e., a decision tree or sparse logistic regression
    • A gold set of features is derived, under the constraint that at most 10 features are used per instance
    • For each prediction on the test set, explanations are generated, and the fraction of gold features they recover is computed as the explanation's faithfulness (sketched at the end of this section)
    • LIME performs much better than the Parzen, greedy and random approaches
  • Should I trust this prediction?
    • 25% of the features are randomly labelled as "untrustworthy"
    • Gold labels for trustworthiness are created: if the black-box classifier changes its prediction when the untrustworthy features are removed, the prediction is marked "untrustworthy", and "trustworthy" otherwise
    • If the local function changes its prediction after removing the "untrustworthy" features, the prediction is flagged as "untrustworthy"
    • The accuracy of labelling predictions "untrustworthy" is computed; higher accuracy means the local function captures the features' impact well (also sketched at the end of this section)
    • LIME performs much better than the Parzen, greedy and random approaches
  • Can I trust this model?
    • Random noise features are added
    • Two classifiers with similar validation accuracy but different test accuracy are trained
    • The random noise features are marked as "untrustworthy"
    • Using the explanations, the classifier relying on fewer untrustworthy features is selected
    • This procedure recovers the better of the two classifiers
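
A minimal sketch of the faithfulness check, assuming hypothetical per-instance gold feature sets and ranked explanations given as (feature, weight) pairs; the names and data structures are illustrative, not from the paper's code.

```python
def faithfulness_recall(explanations, gold_features, top_k=10):
    """Average fraction of gold features recovered by the top-k explanation features."""
    recalls = []
    for expl, gold in zip(explanations, gold_features):
        picked = {feat for feat, _ in expl[:top_k]}   # top-k features by |weight|
        recalls.append(len(picked & set(gold)) / len(gold))
    return sum(recalls) / len(recalls)
```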
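
A minimal sketch of the trust-of-prediction simulation, assuming a black-box scoring function and explanations indexed by feature position; the 25% split and the prediction-change test follow the description above, while the function names, the masking-by-zero removal step and the sign-based vote are illustrative assumptions.

```python
import numpy as np

def trust_simulation(f_predict, explanations, X_test, untrust_frac=0.25, seed=0):
    """Label predictions (un)trustworthy using the black box as gold,
    then check how well the local explanations reproduce those labels."""
    rng = np.random.default_rng(seed)
    n_features = X_test.shape[1]
    untrusted = set(rng.choice(n_features, size=int(untrust_frac * n_features),
                               replace=False).tolist())

    # Gold: does the black-box prediction flip when untrustworthy features are removed?
    X_masked = X_test.copy()
    X_masked[:, list(untrusted)] = 0                      # assumed removal scheme
    gold = f_predict(X_test) != f_predict(X_masked)

    # Predicted: does the local surrogate's vote flip once the weights of
    # untrustworthy features are dropped from the explanation?
    flagged = []
    for expl in explanations:
        full = sum(w for _, w in expl)
        trusted_only = sum(w for idx, w in expl if idx not in untrusted)
        flagged.append((full > 0) != (trusted_only > 0))  # sign of the local score as the vote

    return float(np.mean(np.array(flagged) == np.array(gold)))
```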