Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Struct discriminated unions with explicit layout for more compact size #9368

Closed
isaevdan opened this issue Jun 2, 2020 · 8 comments
Closed

Comments

@isaevdan
Copy link

isaevdan commented Jun 2, 2020

I'm currently working on project that requires efficient memory management and we do have a base type which is created millions of times during of application lifecycle. For now we use reference DU, but this actually leads to performance drop as we need to create lots of small objects all the time.

We consider to move to struct DU to avoid heap allocation, but because of native implementation doesn't use explicit layout and our DU has 5 cases it will result in quite big object while only small part of it will be used. This may result in worse performance. So we will probably look into some custom implementation with layouting, but we will lose pattern matching feature and will have to do a major refactoring.

Is there any reasoning why struct DU wasn't implement with Explicit layout? Is it possible to add such feature in future?

@abelbraaksma
Copy link
Contributor

An interesting request, but since a struct DU internally requires more data, afaik mainly to store the Tag, which is an int, allowing control over the layout may not yield the results you would expect.

Perhaps a better option in your case is to use ExplicitLayout with a normal struct and write a bunch of matching active patterns to cover the different cases.

It does make one wonder, btw, why a struct-DU has to have the size of all its members (plus 4), esp if the type is blittable. I don't know enough of the inner complexities, but it seems reasonable to assume that a struct-DU allows for overlapping fields by default (or at user option), since at any given time only one field will be set. For instance: take the largest single blittable field, add 4 bytes for the Tag, resize to fit alignment and then use FieldOffset to make all fields overlapping. I don't know if the F# team has considered (and dismissed) that option, but it seems like it would put much less stress on the stack.

This method of overlap is applied for normal DU's, though (using polymorphism internally), meaning in many cases the struct-DU will end up being larger than the class-DU (but on the heap).

@isaevdan
Copy link
Author

isaevdan commented Jun 2, 2020

@abelbraaksma well this is sort of what I meant, maybe I didn't explain it clearly. Currently struct DU is implemented as a struct with all fields side by side, and it seems that it is possible and highly efficient to implement it as Tag at 0 offset and all other fields directly after Tag. Besides benefit of putting less pressure on stack it would decrease size of array of struct DUs a lot. Indeed current implementation of reference cells does implement this overlapping. That's why our migration to struct DUs can actually decrease performance in our case and that's why we consider custom implementation. Thanks for suggestion about active patterns can actually simplify work with this custom structs. Need to look a bit more into them to make sure they have comparable performance to out of the patterm matching

@abelbraaksma
Copy link
Contributor

abelbraaksma commented Jun 2, 2020

Currently struct DU is implemented as a struct with all fields side by side, and it seems that it is possible and highly efficient to implement it as Tag at 0 offset and all other fields directly after Tag

@isaevdan, while this is true for the way it is coded up, it is not true for the internal memory layout after JIT'ing. In fact, in almost all cases, the JIT will rearrange the layout to something more useful. For instance, if you have, say, byte | long | byte | short, the JIT will turn it into long | byte | byte | short (or something similar), to prevent a type to cross alignment boundaries, and to align types on a favorable position. It will often also add padding, esp. at the end, so that successive structs are laid out nicely on cache boundaries.

This process is dependent on CPU architecture and it is possible it is different for different JITs.

In other words, how it is laid out in code is irrelevant w.r.t. optimizations. My suggestion, however, was to let the blittable types overlap, while still maintaining padding and alignment. Whether this is feasible, and/or whether this leads to better or worse performance is something I cannot yet answer ;).

Once you use ExplicitLayout + FieldOffset, the burden of proper layout lies with you, and you will have to take care of the correct choice per CPU architecture. Note that other kinds of layout only affect marshaling and do not affect managed code.

Need to look a bit more into them to make sure they have comparable performance to out of the patterm matching

As a hint, partial matching patterns are executed for each match, complete active patterns (without |__|) at the end) are executed once for all cases in a match and are therefore often faster (see https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/active-patterns). As always, measure before you optimize ;).

@isaevdan
Copy link
Author

isaevdan commented Jun 2, 2020

@abelbraaksma Thanks for this insight. As I understand in case of struct DU, only 2 fields are needed to be located in memory Tag and Value, where Tag has fixed size and Value depends on the largest value size in DU. So there should be not so much combinations to optimally locate fields and probably in fact it's ok to always keep Tag first and Value second.

@abelbraaksma
Copy link
Contributor

FYI, related: #5215

@Happypig375
Copy link
Member

fsharp/fslang-suggestions#699

@TIHan
Copy link
Contributor

TIHan commented Jun 15, 2020

I always avoid creating struct DUs mainly because of how large the struct can become.

I think a reasonable first step is the merging of fields of the same type.

@dsyme
Copy link
Contributor

dsyme commented Apr 20, 2022

Covered by the language suggestion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants