Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't generate guards when all arguments have fixed known types #981

Open
wants to merge 4,491 commits into
base: master
Choose a base branch
from

Conversation

undingen
Copy link
Contributor

Don't generate argument guards for callattr & runtimeCalls if we know that the classes can't change.
My first approach kept track of every individual arg and tracked if it needed a guard but this had a too large overhead - I could not archive a speedup. This approach now only supports the case where all args have fixed classes.

                                   upstream/master:  origin/guards_know:
           django_template3.py             2.2s (2)             2.2s (2)  +1.3%
                 pyxl_bench.py             1.8s (2)             1.8s (2)  -0.2%
     sqlalchemy_imperative2.py             2.3s (2)             2.3s (2)  +0.6%
                pyxl_bench2.py             1.3s (2)             1.3s (2)  -2.5%
       django_template3_10x.py            14.1s (2)            14.3s (2)  +1.6%
             pyxl_bench_10x.py            13.6s (2)            13.4s (2)  -1.6%
 sqlalchemy_imperative2_10x.py            17.8s (2)            17.6s (2)  -1.3%
            pyxl_bench2_10x.py            11.7s (2)            11.1s (2)  -5.6%
                       geomean                 5.1s                 5.1s  -1.0%

@undingen undingen added the wip label Oct 23, 2015
@undingen undingen force-pushed the guards_known_cls branch 3 times, most recently from 9c8f88f to 329d617 Compare October 27, 2015 14:03
@undingen undingen removed the wip label Oct 27, 2015
@undingen
Copy link
Contributor Author

I fear that this commit is introducing random test failures :-(
I have been seeing lately already some but I think this commit is increasing the number of it. (There is this thread id assert we already know since a long time but there is now also a newer on inside the GC handler of generators)

@undingen
Copy link
Contributor Author

So I restarted the travis ci several and could not see a problem with this patch but the perf improvement is now much smaller:

                                  upstream/master:  origin/guards_know:
           django_template3.py             2.1s (4)             2.1s (4)  +0.5%
                 pyxl_bench.py             1.8s (4)             1.8s (4)  +0.9%
     sqlalchemy_imperative2.py             2.2s (4)             2.2s (4)  +1.0%
                pyxl_bench2.py             1.1s (4)             1.0s (4)  -0.9%
       django_template3_10x.py            13.8s (4)            14.1s (4)  +1.9%
             pyxl_bench_10x.py            13.4s (4)            13.0s (4)  -3.2%
 sqlalchemy_imperative2_10x.py            16.9s (4)            16.8s (4)  -0.5%
            pyxl_bench2_10x.py             9.1s (4)             8.9s (4)  -1.8%
                       geomean                 4.7s                 4.7s  -0.3%

kmod and others added 14 commits April 13, 2016 21:19
I think this adds some pretty heavy debugging that we might not always want,
but turning this on is always my first debugging step.
I don't quite remember this but I think the goal was to ignore core/from_llvm, not all of core/
Or rather, add some more refcount-related calls to it to help
it get down to zero refs at the end.
Our underlying implementation still looks pretty different, but
rather than implement some newly-needed APIs completely from scratch,
I copied in some of CPython's implementation.

The result is a bit messy (multiple ways of doing similar things),
but I think it's a step in the right direction.

Regardless, this commit adds "clean up thread-local storage when the
local object dies" functionality, as well as better cleanup when
there are multiple threads.  I think this should help with the fork
issues as well.
I wasn't sure what we would have to do in this case --
we don't really have any way of cleaning up the data referenced
by those other threads.

Fortunately(?), CPython doesn't do much cleanup of those threads
(cleans up their metadata but doesn't try to clean up any references
held by the thread), so we don't have to do much either.  Just set
a flag saying that this happened and that we should skip asserting
that we got down to 0 refs.
kmod and others added 25 commits May 25, 2016 00:42
that is just based off of pulling our latest release
PyObject_New: register the type if the type is not yet registered
The problem is that we emit an llvm "unreachable" instruction, and then
continue to emit other code, which fails the verifier.

endBlock(DEAD) is supposed to be the right way to handle that, but there
is some more work that would need to be done there to get that working
properly.

So just do the easy thing for now -- create a new BB so that it's ok to
emit more code.
We were doing a "call bumpUse() early" optimization to free up registers
when we can, but as a side-effect it looked to the refcounter like the
reference was done being used.
Not using it in this commit, just wanted to
get the unmodified version in so it's easier to see the changes.
This is for adding a guard on a non-immortal object, since we need
to be safe against that object getting deallocated and a different object
is allocated in its spot.

We had support for this already, but it leaked memory.  The biggest was
that we never freed our runtimeICs, so if those ended up getting any GC
references in them, then we would leak memory.  So I started freeing those,
but then that exposed the issue that the ICInvalidators expect that their
dependent ICs never get freed.  So I added back a mapping from ICSlotInfo->
ICInvalidators that reference them.
until I realize that it's because we were passing more tests than we expected.
The behavior changed in CPython 2.7.4, and Travis-CI runs 2.7.3.
Switch to CPython's descrobject.c
before we added a it as a module which made code fail
which does something like __builtins__["unicode"]
Some packaging / distributing updates
update list of failing cpython tests
exec, input: if globals has no __builtins__ add it as a dictwrapper
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants