miniF2F-Isabelle accuracy #30

Open
wangzhihao-coder opened this issue Aug 12, 2024 · 1 comment

@wangzhihao-coder

I used the Docker image from the PISA repository and the prediction file from output.zip in your repository (path/outputs/DeepSeekMath-Base/miniF2F-Isabelle-test/results/cot/predictions.json). However, my accuracy is only about 10%, compared with the reported 24.6%. What could explain this difference?

@wyt2000

wyt2000 commented Sep 2, 2024

Like @wangzhihao-coder, I also tried to reproduce the results, but without using Docker. While following the PISA tutorial, I ran into a package version mismatch in SBT. After fixing it, the PISA server started successfully. However, the evaluation results (miniF2F-Isabelle-test: 21.72, miniF2F-Isabelle-valid: 22.13) were still worse than those reported in the paper. Can anyone help?
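
To put the gap in concrete terms, here is a rough sketch that converts the percentages above into solved-problem counts. It assumes each miniF2F split contains 244 problems, which is not stated in this thread; `SPLIT_SIZE` and `solved_count` are hypothetical names used only for illustration.

```python
# Assumption: each miniF2F split (test and valid) has 244 problems.
SPLIT_SIZE = 244


def solved_count(accuracy_percent, total=SPLIT_SIZE):
    """Return the whole number of solved problems closest to the given accuracy."""
    return round(accuracy_percent / 100 * total)


# Accuracy figures quoted in this thread, in percent.
quoted = {
    "reported test": 24.6,
    "reproduced test": 21.72,
    "reproduced valid": 22.13,
}

for name, acc in quoted.items():
    n = solved_count(acc)
    # Print the count and the accuracy it implies, so the rounding can be checked.
    print(f"{name}: {acc}% is about {n}/{SPLIT_SIZE} ({100 * n / SPLIT_SIZE:.2f}%)")
```

Under that assumption, the difference between 21.72 and 24.6 on the test split is about seven problems, so a small number of proofs timing out or failing to check in the local PISA setup could plausibly account for it.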
