add importance_sample method to NestedSamples and MCMCSamples #122

lukashergt · 2020-08-20T10:26:25Z

As suggested in #120, this PR implements a method for importance re-weighting the nested samples, which allows e.g. discarding/penalising samples as done in #120, but also allows adding to logL, e.g. with data from a different experiment.

Some things to keep in mind:

A distinct advantage of doing it this way, rather than manually computing log volumes, is that we get an error bar.
Reweighting doesn't solve everything, though. If the importance-sampled likelihood differs greatly from the one originally sampled over, there won't be enough sampling coverage to get decent posteriors and evidences. It would definitely be worth doing in a context similar to the Planck parameters table, where the reweighting is more minor.
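The coverage caveat above can be made concrete with a small stdlib-only sketch. It is not anesthetic code: the helper names `reweight` and `effective_sample_size` and the toy grid of samples are assumptions made for illustration. It compares the Kish effective sample size after a mild and after a severe change in likelihood.

```python
import math


def reweight(weights, logL_shift):
    """Multiply each weight by exp(logL_shift) and renormalise.

    logL_shift holds the per-sample change in log-likelihood.
    """
    m = max(logL_shift)  # subtract the maximum for numerical stability
    new = [w * math.exp(d - m) for w, d in zip(weights, logL_shift)]
    total = sum(new)
    return [w / total for w in new]


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Toy data (assumption): 1000 equally weighted samples on a grid in [-3, 3].
n = 1000
x = [-3 + 6 * i / (n - 1) for i in range(n)]
w = [1.0 / n] * n

# Minor reweighting: a broad extra Gaussian likelihood centred at 0.
minor = reweight(w, [-0.5 * (xi / 5.0) ** 2 for xi in x])
# Major reweighting: a narrow extra likelihood far out in the tail.
major = reweight(w, [-0.5 * ((xi - 2.9) / 0.01) ** 2 for xi in x])

ess_minor = effective_sample_size(minor)
ess_major = effective_sample_size(major)
```

In this toy setup `ess_minor` stays close to the original 1000 samples, while `ess_major` collapses to a handful, which is the situation where reweighted posteriors and evidences should not be trusted.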

Checklist:

I have performed a self-review of my own code
My code is PEP8 compliant (flake8 anesthetic tests)
My code contains compliant docstrings (pydocstyle --convention=numpy anesthetic)
New and existing unit tests pass locally with my changes (python -m pytest)
I have added tests that prove my fix is effective or that my feature works

codecov · 2020-08-20T10:30:00Z

Codecov Report

Merging #122 into master will increase coverage by 0.07%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #122      +/-   ##
==========================================
+ Coverage   92.72%   92.80%   +0.07%     
==========================================
  Files          16       16              
  Lines        1458     1473      +15     
==========================================
+ Hits         1352     1367      +15     
  Misses        106      106

Impacted Files	Coverage Δ
anesthetic/samples.py	`100.00% <100.00%> (ø)`

Last update 9b81fb9...955776a.

williamjameshandley · 2020-08-20T11:09:57Z

Postscript: my comments and Andrew's overlap (apologies, I was mid-review and did not see Andrew's).

williamjameshandley · 2020-08-20T13:15:51Z

@pstoecker, given your work on GAMBIT-related importance weighting, you will likely be interested in this PR when it is merged later today, as well as the conversation in #120.

williamjameshandley

Sorry to be a pain in requesting another change, but something I just realised when adjusting the title is that this PR is missing a key thing: an equivalent method for MCMCSamples, which would just reweight the logL column. NestedSamples should then call the reweighting portion via super(), and then recompute nlive with merge_nested_samples.
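The layering requested here could look roughly like the following sketch. The classes `ToyMCMCSamples` and `ToyNestedSamples` are hypothetical stand-ins built on plain lists, not the anesthetic classes, and the semantics assumed for the three actions (`'add'` sums the log-likelihoods, `'replace'` swaps them, `'mask'` treats `logL_new` as booleans) are a reading of the commit messages in this PR rather than a confirmed specification.

```python
import math


class ToyMCMCSamples:
    """Hypothetical stand-in holding parallel lists of logL values and weights."""

    def __init__(self, logL, weights):
        self.logL = list(logL)
        self.weights = list(weights)

    def importance_sample(self, logL_new, action='add'):
        """Return a reweighted copy; `self` is left untouched."""
        if action == 'add':
            delta = list(logL_new)
            logL = [a + b for a, b in zip(self.logL, delta)]
        elif action == 'replace':
            delta = [b - a for a, b in zip(self.logL, logL_new)]
            logL = list(logL_new)
        elif action == 'mask':
            # Assumption: logL_new is read as booleans, True keeps the sample.
            delta = [0.0 if keep else -math.inf for keep in logL_new]
            logL = list(self.logL)
        else:
            raise NotImplementedError(
                "action must be one of {'add', 'replace', 'mask'}")
        shift = max(delta)  # rescale so the largest factor is exp(0) = 1
        weights = [w * math.exp(d - shift)
                   for w, d in zip(self.weights, delta)]
        kept = [w > 0 for w in weights]  # drop samples with zero weight
        return type(self)([l for l, k in zip(logL, kept) if k],
                          [w for w, k in zip(weights, kept) if k])


class ToyNestedSamples(ToyMCMCSamples):
    """Hypothetical subclass reusing the reweighting through super()."""

    def importance_sample(self, logL_new, action='add'):
        samples = super().importance_sample(logL_new, action=action)
        # Placeholder for the nested-sampling-specific step (recomputing
        # nlive); a flag stands in for it in this sketch.
        samples.nlive_recomputed = True
        return samples
```

For example, `ToyNestedSamples([-1.0, -2.0, -3.0], [1.0, 1.0, 1.0]).importance_sample([True, False, True], action='mask')` returns a new two-sample object and leaves the original three samples in place.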

williamjameshandley

Excellent work.

One further optional addition is to add an inplace=False optional kwarg to importance_sample, in the same vein as NestedSamples.set_beta, and generally quite common in pandas for this kind of large-scale operation. This would be implemented simply as

def importance_sample(self, logL_new, action='add', inplace=False):
    ....
    if inplace:
        self = samples
    else:
        return samples

lukashergt · 2020-08-20T14:54:53Z

One further optional addition is to add an inplace=False optional kwarg to importance_sample, in the same vein as NestedSamples.set_beta, and generally quite common in pandas for this kind of large-scale operation. This would be implemented simply as
def importance_sample(self, logL_new, action='add', inplace=False):
    ....
    if inplace:
        self = samples
    else:
        return samples

The situation here is a tiny bit more complicated than for set_beta, as self might (or probably will) change shape here. I gave it a try and it didn't work as easily as anticipated, and now I'm wondering whether this is really needed.

Typically, inplace is used where the structure of the object stays intact and only some elements are modified, e.g. to get a performance gain. That is not the case here, so I'd vote for leaving this option out.
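One likely reason the in-place variant is less simple than it looks: in Python, assigning to `self` inside a method only rebinds a local name, so the caller's object is unchanged. The sketch below uses a hypothetical `Box` class (not anesthetic or pandas code) to contrast that with copying the new state into the existing object, which does persist even when the number of elements changes.

```python
class Box:
    """Hypothetical container used only to illustrate in-place updates."""

    def __init__(self, values):
        self.values = list(values)

    def filtered(self, keep):
        """Return a new, possibly shorter Box."""
        return Box(v for v, k in zip(self.values, keep) if k)

    def rebind(self, keep):
        # Rebinding the local name `self` has no effect outside this method.
        self = self.filtered(keep)

    def update_in_place(self, keep):
        # Copying the new state into the existing object does persist.
        self.__dict__.update(self.filtered(keep).__dict__)


b = Box([1, 2, 3])
b.rebind([True, False, True])
after_rebind = list(b.values)      # still all three values
b.update_in_place([True, False, True])
after_update = list(b.values)      # now only the kept values
```

Whether overwriting internal state like this is safe for a pandas-derived object is a separate question that this toy example does not address.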

lukashergt · 2020-08-20T15:00:25Z

Sorry to be a pain in requesting another change, but something I just realised when adjusting the title is that this PR is missing a key thing: an equivalent method for MCMCSamples, which would just reweight the logL column. NestedSamples should then call the reweighting portion via super(), and then recompute nlive with merge_nested_samples.

This thing made me wish for docstring inheritance (#22, #24) again... In the current version the docstrings for the MCMCSamples and the NestedSamples method are almost identical...

lukashergt · 2020-08-20T15:08:49Z

Many thanks to @andrewfowlie and @williamjameshandley for your helpful input!

williamjameshandley · 2020-08-21T14:40:14Z

This thing made me wish for docstring inheritance (#22, #24) again... In the current version the docstrings for the MCMCSamples and the NestedSamples method are almost identical...

I agree that the current setup requires a lot of work (which still results in inconsistencies). I am wary, however, of any method that results in non-readable strings in source code. One of the things I really like about Python is that the documentation and source are tightly linked in a text file by convention.

It would be nice if there were a way of automatically checking docstring consistency via a script (which was fed/interpreted the logical structure), but I'm not sure if such a thing exists.
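As a rough idea of what such a script might do, the sketch below compares the docstrings of methods that a subclass overrides and flags pairs that have drifted apart. It uses only the standard library; the function names, the similarity threshold of 0.8, and the toy classes `Base` and `Derived` are all assumptions for illustration, and a plain text comparison like this does not interpret the logical structure of a numpy-style docstring.

```python
import difflib
import inspect


def docstring_similarity(base, derived, name):
    """Similarity in [0, 1] between the docstrings of one overridden method."""
    doc_base = inspect.cleandoc(vars(base)[name].__doc__ or "")
    doc_derived = inspect.cleandoc(vars(derived)[name].__doc__ or "")
    return difflib.SequenceMatcher(None, doc_base, doc_derived).ratio()


def inconsistent_overrides(base, derived, threshold=0.8):
    """Names of methods overridden in `derived` whose docstrings differ a lot."""
    shared = [n for n in vars(derived)
              if n in vars(base) and inspect.isfunction(vars(derived)[n])]
    return [n for n in shared
            if docstring_similarity(base, derived, n) < threshold]


# Toy classes (assumptions) standing in for a base class and its subclass.
class Base:
    def reweight(self):
        """Reweight the samples."""

    def summary(self):
        """Print a summary."""


class Derived(Base):
    def reweight(self):
        """Reweight the samples."""

    def summary(self):
        """Completely different text about nested sampling runs."""


flagged = inconsistent_overrides(Base, Derived)
```

Here `flagged` contains only `'summary'`, since the two `reweight` docstrings are identical.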

lukashergt added 2 commits August 20, 2020 11:14

add importance_reweighting method to NestedSamples

6caf252

version bump to 2.0.0-beta.3

902526f

lukashergt requested a review from williamjameshandley August 20, 2020 10:42

williamjameshandley requested changes Aug 20, 2020

lukashergt added 6 commits August 20, 2020 12:40

change add and replace args to action arg taking a string

bb7a013

rename importance_reweighting to importance_sample

bfd5183

append merge_nested_samples docstring to reflect additional utility for passing a single run

fc52995

add test for importance_samples

d129ba8

add action='mask' to importance_sample and corresponding test

45562a0

add 'mask' to the sets in docstring and NotImplementedError message

c56a429

lukashergt requested a review from williamjameshandley August 20, 2020 12:49

williamjameshandley requested changes Aug 20, 2020

change action default from 'replace' to 'add'

611a868

williamjameshandley changed the title ~~add importance_reweighting method to NestedSamples~~ add importance_sample method to NestedSamples Aug 20, 2020

williamjameshandley requested changes Aug 20, 2020

lukashergt added 3 commits August 20, 2020 14:43

fix importance_sample tests

4bafa5f

move importance_sample to MCMCSamples and inherit in NestedSamples

f4d315a

fix merge_nested_samples docstring: two->one

955776a

williamjameshandley changed the title ~~add importance_sample method to NestedSamples~~ add importance_sample method to NestedSamples and MCMCSamples Aug 20, 2020

williamjameshandley approved these changes Aug 20, 2020

lukashergt merged commit 3db7903 into handley-lab:master Aug 20, 2020

williamjameshandley mentioned this pull request Aug 20, 2020

importance_sample with inplace=False #124

Closed

5 tasks

This was referenced Aug 20, 2020

priors, likelihoods, logzero, -np.inf and np.na #125

Open

importance_sample with inplace=False #126

Merged

williamjameshandley mentioned this pull request Sep 30, 2020

Both logL and logL_birth need modifying for importance sampling. #128

Closed
