priors, likelihoods, logzero, -np.inf and np.na #125

Open
williamjameshandley opened this issue Aug 20, 2020 · 3 comments

Comments

@williamjameshandley
Collaborator

In nested sampling, there is a concept called 'logzero' which is important, but neither well-defined, nor consistently implemented.

In general, if at runtime a likelihood returns 'logzero' (which in MultiNest and PolyChord is user-configurable, defaulting to a large negative floating-point number, -1e30), then nested sampling interprets this as 'excluded at the prior level', which has a non-trivial effect on the evidence. Likelihoods slightly larger than this threshold (e.g. -1e29) effectively do not participate in the posterior, but do not act to truncate the prior volume.
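To make the threshold semantics concrete, here is a small numpy sketch (illustrative only, not MultiNest/PolyChord source) of how values at, just above, and well above logzero behave:

```python
import numpy as np

# Illustrative sketch of the threshold convention (not MultiNest/PolyChord
# source). logzero is the user-configurable default mentioned above.
logzero = -1e30

logL = np.array([-1e30, -1e29, -10.0])

# At or below logzero: interpreted as 'excluded at the prior level',
# truncating the prior volume and affecting the evidence.
prior_zero = logL <= logzero

# Just above logzero (e.g. -1e29): carries essentially no posterior
# weight, but does not truncate the prior volume.
posterior_weight = np.exp(logL - logL.max())
```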

There are two concepts here:

  1. zero likelihood: absolutely excluded by data
  2. zero prior: excluded a-priori

As MultiNest and PolyChord are written in FORTRAN, which does not have a platform-independent representation of 'infinity' or 'nan', the best they can do in that language is this slight hack: define a threshold float below which values are 'prior zero'. They have no exact notion of 'likelihood zero', but get away with loglikelihood values a little above the threshold.

We can't do anything to change FORTRAN, but anesthetic can probably do better than this to avoid gotchas, and support a better notion in languages that have nans and infs. At the moment anesthetic assumes np.inf represents zero prior, and still has the slight hack of treating 'very low likelihood' as equivalent to 'zero likelihood'.

One proposition which I would like to discuss is to define

  1. zero likelihood: -np.inf
  2. zero prior: np.nan

This potentially has issues stemming from the fact that NaN compares as neither greater than, less than, nor equal to anything (including itself). Any mask on logL_birth (for which the initial live points would have NaN in that column) would therefore have to be carefully defined, but in my view this is unlikely to affect users.
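To illustrate the masking gotcha (a sketch of the proposed convention, not current anesthetic code): comparisons against a NaN-bearing logL_birth column silently return False, so 'zero prior' entries must be selected with an explicit np.isnan test:

```python
import numpy as np

# Proposed convention: -np.inf for zero likelihood, np.nan for zero prior
# (e.g. the logL_birth entries of the initial live points).
logL_birth = np.array([np.nan, -np.inf, -5.0])

# NaN compares False with everything, so a naive mask silently drops it:
naive_mask = logL_birth > -np.inf            # [False, False, True]

# Explicit tests are needed to handle each case deliberately:
initial_live = np.isnan(logL_birth)          # zero prior
zero_like = np.isneginf(logL_birth)          # zero likelihood
born_in_run = ~initial_live & ~zero_like
```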

@andrewfowlie, @lukashergt, @appetrosyan, thoughts and suggestions are welcome.

This has come up in a host of past issues #70, #71, #120, #122, Cobaya#77. Feel free to tag more that I have undoubtedly missed, and anybody else who might be interested in this issue.

@andrewfowlie
Collaborator

andrewfowlie commented Aug 21, 2020

Is it possible to use a new type of object inside the pandas DataFrame? It would give something smaller than any float, including -np.inf:

class PriorZero(object):
    """Sentinel that compares below every float, including -np.inf."""
    def __eq__(self, other):
        return isinstance(other, PriorZero)
    def __lt__(self, other):
        return True
    def __gt__(self, other):
        return False
    def __le__(self, other):
        return True
    def __ge__(self, other):
        return self == other

If that is not possible, something similar but inheriting from np.float64:

class PriorZero(np.float64):
    def __eq__(self, other):
        return isinstance(other, PriorZero)
    def __lt__(self, other):
        return True
    def __gt__(self, other):
        return False
    def __le__(self, other):
        return True
    def __ge__(self, other):
        return self == other

Not sure these are necessarily the best way forward, just the only other possibilities I could imagine beyond -np.inf and np.nan.
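For what it's worth, a quick sanity check (a sketch, not anesthetic code) that the float64-derived version does compare below -np.inf under these overrides — note that PriorZero() with no argument carries the underlying value 0.0, which the overrides deliberately ignore:

```python
import numpy as np

class PriorZero(np.float64):
    """Sentinel comparing below every float, including -np.inf."""
    def __eq__(self, other):
        return isinstance(other, PriorZero)
    def __lt__(self, other):
        return True
    def __gt__(self, other):
        return False
    def __le__(self, other):
        return True
    def __ge__(self, other):
        return self == other

p = PriorZero()           # underlying float value is 0.0
below_inf = p < -np.inf   # True: __lt__ ignores the stored value
first = sorted([0.0, -np.inf, p])[0]  # the sentinel sorts to the front
```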

@williamjameshandley
Collaborator Author

Hi @andrewfowlie. This is definitely possible and would achieve the same idea, with more robust treatment than NaN. My concern is that even if derived from np.float64, the pandas logL_birth column would still have dtype object, and we would lose all the fast vectorised operations which make Python a reasonable choice for anesthetic.

@andrewfowlie
Collaborator

I see. I’m not sure if my suggestion is the best way. But if performance is the only concern, it might be worth quickly checking that the cost is as big as you fear.
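In that spirit, a quick check one could run (a sketch; PriorZero stripped here to a bare np.float64 subclass, since the comparison overrides don't affect dtype inference):

```python
import numpy as np
import pandas as pd

class PriorZero(np.float64):
    """Bare float64 subclass standing in for the proposed sentinel."""

values = [PriorZero(-1e30), -10.0, -5.0]

# Left to its own inference, pandas coerces float64 subclasses back to a
# plain float64 column, silently losing the sentinel type...
coerced = pd.Series(values)

# ...so preserving the type needs dtype=object, which trades numpy's
# vectorised kernels for per-element Python dispatch on every operation.
boxed = pd.Series(values, dtype=object)
```

Timing e.g. `coerced.sum()` against `boxed.sum()` on a realistically sized column would then quantify the cost directly.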
