Ocsigen_http_com: avoid O(response) allocation by using Lwt_io, 2X speedup #84

Merged (1 commit) on Aug 16, 2015

Contributor

@mfp mfp commented Aug 12, 2015

Lwt_chan (deprecated) is now built atop Lwt_io and allocates a buffer on every
write. Refer to #49.

This makes ocsigenserver over 2X faster at serving large files, and also gives
a measurable improvement (~20%) for smaller ones (in the 10 KB range).

Contributor Author

mfp commented Aug 12, 2015

AFAIK the build failure is bogus: ocsigenserver did build and install correctly. It also works for me on 4.02.1.

Contributor Author

mfp commented Aug 12, 2015

Before this PR, serving a 22 KB file made ocsigenserver allocate roughly: (a) 22 KB in the major heap (the Lwt_chan inefficiency addressed by this PR), (b) 8 KB in the major heap (the buffer used to read the file, reused by the stream), (c) 8 KB in the C heap (malloc via Lwt_bytes), and (d) 6 KB (an inefficiency in the stream generated by File_content, which performs a String.sub buf 0 available in the last pass, when there is not enough data left to fill the buffer entirely).
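To make the breakdown concrete, here is a small sketch that adds up the major-heap figures for a given file size. The function names are hypothetical (they are not part of ocsigenserver), and the 8 KB read buffer is simply the value quoted above.

```ocaml
(* Hypothetical accounting helpers; not ocsigenserver code. *)
let kb = 1024

(* Read buffer size quoted in the comment above (assumed fixed at 8 KB). *)
let buffer_size = 8 * kb

(* Size of the final, partial read: what String.sub copies in (d). *)
let last_chunk file_size = file_size mod buffer_size

(* Major-heap bytes per request before the PR: (a) + (b) + (d). *)
let major_heap_before file_size =
  file_size + buffer_size + last_chunk file_size

(* With (a) removed, only (b) + (d) remain. *)
let major_heap_after file_size =
  buffer_size + last_chunk file_size

let () =
  let size = 22 * kb in
  Printf.printf "before: %d KB, after: %d KB\n"
    (major_heap_before size / kb) (major_heap_after size / kb)
```

For the 22 KB example this gives 36 KB before and 14 KB after, so 22 of the 36 KB come from (a).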

This PR removes (a), i.e. 22 of the 36 KB allocated in the major heap, and gives a ~20% speedup. #85 removes (b) for tiny files and yields a ~10% boost.
I estimate we can get that same +10% for the 22 KB file example by using a buffer pool in Ocsigen_senders.File_content.result_of_content (the buffer would be returned to the pool in the stream finalizer).
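A buffer pool of the kind suggested here could look roughly like the sketch below. It is only an illustration using the standard library: the type and function names (`pool`, `acquire`, `release`, `with_buffer`) and the `max_idle` cap are assumptions, not existing ocsigenserver or Lwt APIs. In the real server, `release` would be called from the stream finalizer.

```ocaml
(* Minimal free-list buffer pool (hypothetical sketch, stdlib only). *)
type pool = {
  buf_size : int;               (* size of every buffer in this pool *)
  max_idle : int;               (* cap on buffers kept around when unused *)
  mutable free : Bytes.t list;  (* buffers available for reuse *)
  mutable idle : int;           (* length of [free] *)
}

let create ~buf_size ~max_idle = { buf_size; max_idle; free = []; idle = 0 }

(* Reuse a pooled buffer if one is available, otherwise allocate. *)
let acquire p =
  match p.free with
  | b :: rest ->
    p.free <- rest;
    p.idle <- p.idle - 1;
    b
  | [] -> Bytes.create p.buf_size

(* Return a buffer to the pool; drop it if the pool is already full. *)
let release p b =
  if Bytes.length b = p.buf_size && p.idle < p.max_idle then begin
    p.free <- b :: p.free;
    p.idle <- p.idle + 1
  end

(* Run [f] with a buffer and give it back afterwards, even on exceptions. *)
let with_buffer p f =
  let b = acquire p in
  match f b with
  | v -> release p b; v
  | exception e -> release p b; raise e
```

Capping the number of idle buffers keeps memory bounded after a burst of concurrent requests, at the cost of some reallocation when the burst repeats.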

I don't know how important (c) is. There's a hardcoded constant in the GC that adjusts the GC speed so that a full major cycle is completed by the time 1 GB has been allocated outside OCaml's heap (see the "Interfacing C with OCaml" chapter of the manual). That means a full major GC cycle would be completed, on average, at most every ~120000 files served (if we only consider the Lwt_bytes buffer allocation), so about once every 10 s, which doesn't really sound like a lot. However, even if the effect on speed is limited, this makes memory usage balloon when it could be essentially constant (and if it didn't balloon, that would mean more GC work is being performed).
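The estimate above can be reproduced with a couple of divisions. This is a back-of-the-envelope sketch: the 1 GB budget and 8 KB per file come from the comment, while the request rate of 12000 files/s is an assumption inferred from "~120000 files" and "once every 10 s". The code assumes a 64-bit platform, since 1 GB does not fit in a 31-bit OCaml int.

```ocaml
(* Back-of-the-envelope GC arithmetic; all names are illustrative. *)
let gib = 1024 * 1024 * 1024   (* assumes 64-bit ints *)

(* Files served before the off-heap budget forces a full major cycle. *)
let files_per_major_cycle ~off_heap_budget ~bytes_per_file =
  off_heap_budget / bytes_per_file

(* Time between forced major cycles at a given request rate. *)
let seconds_per_cycle ~files_per_cycle ~requests_per_second =
  float_of_int files_per_cycle /. requests_per_second

let () =
  let files =
    files_per_major_cycle ~off_heap_budget:gib ~bytes_per_file:(8 * 1024) in
  Printf.printf "%d files per cycle, one cycle every %.1f s\n"
    files
    (seconds_per_cycle ~files_per_cycle:files ~requests_per_second:12000.)
```

With exact powers of two this comes out at 131072 files per cycle, the same order of magnitude as the ~120000 quoted above.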

At any rate, (c) can also be addressed by using a buffer pool (thanks to the new ?buffer param in Lwt >= 2.5.0).

As for (d) (up to 8 KB - 1 bytes allocated at the end of the file read), I think the best way to address it is a second pool of buffers whose sizes are powers of two between word_size * Max_young_wosize (256) and Ocsigen_config.get_filebuffersize / 2: likely 1024 (to account for 32-bit platforms), 2048, and 4096. The last read would then be "decomposed" into a sum of buffers of decreasing power-of-two size, plus a final Bytes.sub of the remaining size (which is thus allocated in the minor heap, at little cost). The pool could probably also hold smaller powers of two if we want to avoid even more allocation; going down to 128 bytes might be OK.
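The decomposition of the last read can be sketched as a greedy split over the pooled sizes. `decompose` is a hypothetical helper, not existing ocsigenserver code; it only computes which buffer sizes would be taken from the pools and how many bytes are left for the final Bytes.sub.

```ocaml
(* Split [n] bytes into pooled buffer sizes plus a small remainder.
   [sizes] must be sorted in decreasing order, e.g. [4096; 2048; 1024].
   Returns (pooled sizes used, bytes left for a fresh Bytes.sub). *)
let decompose ~sizes n =
  let rec go acc n = function
    | [] -> (List.rev acc, n)
    | s :: rest as all ->
      if n >= s then go (s :: acc) (n - s) all  (* take a buffer of size s *)
      else go acc n rest                        (* s is too big, try smaller *)
  in
  go [] n sizes

let () =
  let pooled, rest = decompose ~sizes:[4096; 2048; 1024] 7000 in
  Printf.printf "pooled: %s, remainder: %d bytes\n"
    (String.concat "+" (List.map string_of_int pooled)) rest
```

For a 7000-byte final read this uses the 4096 and 2048 buffers and leaves 856 bytes, which is below the smallest pooled size and so would be small enough for the minor heap under the sizes proposed above.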

vbmithr added a commit that referenced this pull request Aug 16, 2015
Ocsigen_http_com: avoid O(response) allocation by using Lwt_io, 2X speedup
@vbmithr vbmithr merged commit 1e3c2e1 into ocsigen:2.6 Aug 16, 2015
Member

vbmithr commented Aug 16, 2015

Thanks
