Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kryo fail to wipe out generic type for collection serializer when reference objects #411

Closed
yumengzhao92 opened this issue Mar 18, 2016 · 21 comments

Comments

@yumengzhao92
Copy link

kryo-bugreport.zip

Blow is a brief description of when the bug happens(Please run KryoSerializationIssue in attached file to see what I mean):

Given classes A, B
Class A {List s1, List s2}
Class B is an ArrayList

An instance a of A and an instance b of B are to be deserialized by Kryo.

What’s special about a and b and how they are combined to produce the bug is that:

  • a.s2 is a reference of a.s1, and s2 is deserialized after s1
  • b is deserialized after a
  • b has the same class type as a.s2 (both of them are ArrayList in this case) but they have different generic types - the generic type of a.s2 is String, the generic type of b is Double

When a.s2 is to be deserialized, the CollectionSerializer’ genericType in ArrayList Registration is set to String. This genericType is usually wiped out (set to null) when CollectionSerializer is entered to read elements of the collection.
But this step is bypassed because a.s2 is a reference to earlier object and can be retrieved from recorded objects thus CollectionSerializer is not entered.
This has caused the CollectionSerializer of ArrayList Registration inside classToRegistration Map to hang on to the uncleared generic type String, even if the next ArrayList to be deserialized is a Collection of Doubles (or any other non-primitive types).

When deserialize b, Kryo is trying to deserialize String instead of Double and cause problems.

So to generalize the case, i believe an internal bug exists when Kryo deserializes two classes in such a way that,
the last field to be deserialized of the first class is the same collection type as the second class, and the last field instance is a reference to an earlier object.
This will cause Kryo to carry over the wrong generic type to deserialize the second class and throw errors.

@yumengzhao92
Copy link
Author

One possible solution might be to add
serializer.setGenerics(this, null)
to Kryo.readObject and Kryo.readObjectOrNull method if (stackSize == REF)

** Reads an object using the specified serializer. The registered serializer is ignored. */
    public <T> T readObject (Input input, Class<T> type, Serializer serializer) {
        if (input == null) throw new IllegalArgumentException("input cannot be null.");
        if (type == null) throw new IllegalArgumentException("type cannot be null.");
        if (serializer == null) throw new IllegalArgumentException("serializer cannot be null.");
        beginObject();
        try {
            T object;
            if (references) {
                int stackSize = readReferenceOrNull(input, type, false);
                if (stackSize == REF) {
                                        #  ADD HERE
                                        serializer.setGenerics(this, null)
                                        return (T)readObject;
                                 }              
                                object = (T)serializer.read(this, input, type);
                if (stackSize == readReferenceIds.size) reference(object);
            } else
                object = (T)serializer.read(this, input, type);
            if (TRACE || (DEBUG && depth == 1)) log("Read", object);
            return object;
        } finally {
            if (--depth == 0 && autoReset) reset();
        }
    }

Although i am not sure whether this would cause other problems..

@magro
Copy link
Collaborator

magro commented Mar 19, 2016

Thanks for reporting! Can you have a look at #385 and #377 (also have a look at this comment) and see how your solution compares to the other one / how it handles the other issue? Thanks!

@yumengzhao92
Copy link
Author

My solution is basically the same as #385 suggested. Not sure if add serializer.setGenerics(this, null) to readObject (Input input, Class type, Serializer serializer) would be sufficient. Maybe we should consider adding serializer.setGenerics(this, null) to all read methods after determining the current object is an reference to earlier objects.

@magro
Copy link
Collaborator

magro commented Mar 22, 2016

@romix WDYT, can you step in here?

@romix
Copy link
Collaborator

romix commented Mar 22, 2016

I'll have a look.

I'm going to first add generics-related tests from all reported issues and then see how they fail with the current version. Then I'll try to add some of the proposed changes and see if it solves some/all of the issues.

I suspect that the change proposed earlier in this thread does not solve all problems and I need to dig deeper.

@magro
Copy link
Collaborator

magro commented Mar 22, 2016

Great, thanks!

@romix
Copy link
Collaborator

romix commented Mar 22, 2016

I had a quick look. As I expected, the proposed one-liner fix does not fix all problems related to generics.

And indeed, as it was mentioned in #377, there is something wrong with the commit @4764dee63cf65ceb59364f731ac444f7fab765b3. Non-generic classes with generic super-classes seem to be handled in a wrong way under certain circumstances.

I'm looking into it.

@romix
Copy link
Collaborator

romix commented Mar 23, 2016

OK. I found the actual issue, but the proper fix is likely to be very non-trivial. It make take some time to get it right.

@magro
Copy link
Collaborator

magro commented Mar 23, 2016

Great that you found it!

@magro
Copy link
Collaborator

magro commented Apr 8, 2016

@romix how's your progress here, do you have a rough guess for a ETA? IMHO we should wait with 3.1 for this, or do you think differently?

@romix
Copy link
Collaborator

romix commented Apr 11, 2016

@magro I didn't have enough time to solve this problem. It may take a while. Therefore I don't think it should be a show-stopper for the next release. After all, this problem existed since the moment when generics support was introduced. And we've had a number of releases in the meantime.

@magro
Copy link
Collaborator

magro commented Apr 11, 2016

@romix Ok. What about commit 4764dee, should we revert it until the broader solution is there? Can you share your findings so far (if it doesn't take too much time)?

@romix
Copy link
Collaborator

romix commented Apr 11, 2016

@magro In short, there are problems in the analysis phase, i.e. in the FieldSerializerGenericsUtil and the way how it associates and propagates the collected information about generic parameters with classes. And there are even bigger runtime problems in the ObjectField, when it comes to setting the generic parameters of a field properly.

I'm probably near to having a solution, which would solve both, but I'm not sure about its performance. One of the issues is that we do not have different FieldSerializer instances for different instantiations of the same generic class, e.g. MyClass<Integer> and MyClass<String> use the same serializer instance. As a result, we need to always call at runtime setGenerics, because we have to set a proper context. If we would treat such instantiations as different serializer instances, we wouldn't need it. But then we'd most likely need to register such instantiations as different classes, so that they get different serialization class IDs.

As for 4764dee, I don't have a strong opinion about it. Overall, we are wrong with and without this change. The fix I'm working on is includes among other things an improvement to 4764dee.

@magro
Copy link
Collaborator

magro commented Apr 12, 2016

@romix Thanx for this explanation! Honestly today I don't fully understand generics handling in FieldSerializer and related classes, but I'd like to better understand the problem(s) and the solutions (so that I may better help in the future supporting this). If you have new unit tests that show what's currently not working perhaps I could look at them and step/debug through the code to get a better understanding (you might create a new branch in your fork). Probably I could also use existing tests to see what's there so far...

Regarding performance the best way is probably to get it correct at first and then see how to optimize it.

@magro
Copy link
Collaborator

magro commented Jun 1, 2016

@romix how's your progress here, did you have time working on this?

@mikehearn
Copy link

mikehearn commented Jun 2, 2016

I suspect I'm hitting a related issue ... actually we get a JVM crash (!) when using nested generics like Foo<Bar<Baz>> and the serialisation logs suggest that Kryo is erasing Baz and treating it as just Foo<Bar>. It's a bit hard to minimise a test case, but are nested generics fully tested/expected to work?

From looking at the code, it seems to assume that type variables always can be resolved to a Class but in the case of nested generics I'd expect a type variable to be mapped to a Type.

@magro
Copy link
Collaborator

magro commented Jun 2, 2016

There are issues with nested generics, which is this issue about and others mentioned above.

@romix
Copy link
Collaborator

romix commented Jun 2, 2016

@magro Sorry, I'm still too busy at work :-( Didn't have any time to look into this issue any further. What I might do in the meantime is to commit my current changes as a new branch, so that others can have a look and may be even finish it. I'll try to do it over the weekend.

@magro
Copy link
Collaborator

magro commented Jun 3, 2016

@romix Ok, thanks!

@magro
Copy link
Collaborator

magro commented Jul 6, 2016

Should be fixed with kryo 4.0.0 and optimized generics off by default.

NathanSweet added a commit that referenced this issue Aug 7, 2017
For example, a field that is `ArrayList<Value<Integer>>`. Previously Kryo supported only the first level of type parameters: the ArrayList serializer would know it has Value objects, but the Value serializer wouldn't know it has Integer objects. Now it does!

closes #411
closes #463
@NathanSweet
Copy link
Member

OP's test passes in the kryo-5.0.0-dev branch. All edge cases with generics should be solid there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

5 participants