Ocsigen_http_com: avoid O(response) allocation by using Lwt_io, 2X speedup #84

mfp · 2015-08-12T15:51:11Z

Lwt_chan (deprecated) is now built atop Lwt_io and allocates a buffer on every
write. Refer to #49.

This makes ocsigenserver over 2X faster at serving large files, and also gives
a measurable improvement (~20%) for smaller ones (in the 10 KB range).

mfp · 2015-08-12T18:08:29Z

AFAIK the build failure is bogus: ocsigenserver did build and install correctly. It also works for me on 4.02.1.

mfp · 2015-08-12T19:04:53Z

Before this PR, in order to serve a 22 KB file, ocsigenserver would allocate around (a) 22 KB in the major heap (the Lwt_chan inefficiency addressed by this PR) + (b) 8 KB in the major heap (the buffer used to read the file, reused by the stream) + (c) 8 KB in the C heap (malloc via Lwt_bytes) + (d) 6 KB in the major heap (an inefficiency in the stream generated by File_content, which performs a String.sub buf 0 available in the last pass, when there is not enough data left to fill the buffer entirely).

This PR removed (a) (22 out of 36 KB in the major heap) and gave +20% speed. #85 removes (b) for tiny files, and yields a ~10% boost.
I estimate we can also get that +10% (for the 22KB file example) by using a buffer pool in Ocsigen_senders.File_content.result_of_content (the buffer would be returned to the pool in the stream finalizer).
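To make the buffer pool idea concrete, here is a minimal sketch using only the standard library. The module name `Buffer_pool` and all of its functions are hypothetical, not part of Ocsigen or Lwt; in the real code the `release` call would presumably live in the stream finalizer rather than in a `with_buffer` wrapper.

```ocaml
(* Hypothetical fixed-size buffer pool; none of these names exist in Ocsigen. *)
module Buffer_pool = struct
  type t = {
    buf_size : int;             (* size of every buffer handed out *)
    max_idle : int;             (* cap on buffers kept around for reuse *)
    mutable idle : Bytes.t list;
    mutable idle_count : int;
    mutable allocated : int;    (* number of buffers ever created *)
  }

  let create ~buf_size ~max_idle =
    { buf_size; max_idle; idle = []; idle_count = 0; allocated = 0 }

  (* Reuse an idle buffer if there is one, otherwise allocate a fresh one. *)
  let acquire pool =
    match pool.idle with
    | b :: rest ->
      pool.idle <- rest;
      pool.idle_count <- pool.idle_count - 1;
      b
    | [] ->
      pool.allocated <- pool.allocated + 1;
      Bytes.create pool.buf_size

  (* Return a buffer to the pool; drop it if the pool is already full. *)
  let release pool b =
    if Bytes.length b = pool.buf_size && pool.idle_count < pool.max_idle
    then begin
      pool.idle <- b :: pool.idle;
      pool.idle_count <- pool.idle_count + 1
    end

  (* Run [f] with a pooled buffer and give it back even if [f] raises. *)
  let with_buffer pool f =
    let b = acquire pool in
    match f b with
    | r -> release pool b; r
    | exception e -> release pool b; raise e
end
```

With such a pool, serving many small files one after another would keep reusing the same 8 KB buffer instead of allocating a new one in the major heap for each response.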

I don't know how important (c) is. There's a hardcoded constant in the GC that adjusts the GC speed so that a full major run is completed by the time 1 GB has been allocated outside OCaml's heap (refer to the "Interfacing C with OCaml" chapter of the manual). That means a full major GC run would be completed on average at most after ~120000 files are served (if we only consider the Lwt_bytes buffer allocation), so once every 10s, which doesn't really sound like a lot. However, even if the effect on speed is limited, this makes memory usage balloon when it could be essentially constant (and if it didn't balloon, that would mean more GC work is being performed).
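The estimate above can be checked with a few lines of arithmetic. This is only a sketch: it assumes a decimal gigabyte, one 8 KB Lwt_bytes buffer per file, and a 64-bit platform; the function name is made up for the example. The "once every 10s" figure then implies a rate of roughly 12000 requests per second, which is inferred from the numbers in the comment rather than stated in it.

```ocaml
(* How many files can be served before 1 GB has been allocated outside
   the OCaml heap, if each file costs one 8 KB Lwt_bytes buffer?
   [files_per_major_cycle] is a hypothetical helper for this example. *)
let files_per_major_cycle ~threshold_bytes ~bytes_per_file =
  threshold_bytes / bytes_per_file

let files = files_per_major_cycle
    ~threshold_bytes:1_000_000_000
    ~bytes_per_file:(8 * 1024)

(* Request rate implied by "one full major run every 10 s". *)
let implied_requests_per_second = files / 10

let () =
  Printf.printf "files per major cycle: %d, implied req/s: %d\n"
    files implied_requests_per_second
```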

At any rate, (c) can also be addressed by using a buffer pool (thanks to the new ?buffer param in Lwt >= 2.5.0).

As for (d) (up to 8 KB - 1 bytes allocated at the end of the file read), I think the best way to address it is to have another pool of buffers whose sizes are powers of two between word_size * Max_young_wosize (256) and Ocsigen_config.get_filebuffersize / 2 (so likely 1024, to account for 32-bit platforms, plus 2048 and 4096), and to "decompose" the last read as a sum of buffers of decreasing power-of-two size, with the last piece being a Bytes.sub of the remaining size (which will thus be allocated in the minor heap, at little cost). In fact the buffer pool could probably go down to lower powers of two if we want to avoid even more allocation; down to 128 bytes might be OK.
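The decomposition of the last read can be sketched as follows. This only computes which pooled buffer sizes would be used and how many bytes are left for the final Bytes.sub; the function name `decompose` and its parameters are hypothetical, and it assumes the last read is smaller than twice the largest pooled size, so each size is used at most once.

```ocaml
(* Split [n] bytes into power-of-two chunk sizes, largest first, using
   sizes from [max_size] down to [min_size]; whatever is left (always
   smaller than [min_size] when n < 2 * max_size) is the remainder that
   would be copied with Bytes.sub. Hypothetical helper, not Ocsigen code. *)
let decompose ~min_size ~max_size n =
  let rec go size remaining acc =
    if size < min_size then (List.rev acc, remaining)
    else if remaining >= size then
      go (size / 2) (remaining - size) (size :: acc)
    else
      go (size / 2) remaining acc
  in
  go max_size n []

let () =
  let chunks, rest = decompose ~min_size:1024 ~max_size:4096 6000 in
  Printf.printf "pooled: %s, remainder: %d\n"
    (String.concat " + " (List.map string_of_int chunks)) rest
```

For a 6000-byte tail with pooled sizes 1024, 2048 and 4096, this picks the 4096 and 1024 buffers and leaves 880 bytes for the small minor-heap copy.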

vbmithr · 2015-08-16T09:53:48Z

Thanks

vbmithr added a commit that referenced this pull request Aug 16, 2015

Merge pull request #84 from mfp/decrease-gc-pressure

1e3c2e1

vbmithr merged commit 1e3c2e1 into ocsigen:2.6 Aug 16, 2015

This was referenced Aug 16, 2015

Use buffer pool to avoid large buffer allocation for connections and responses #87

Open

Huge GC pressure when serving files with Cohttp_lwt_unix.Server.respond_file mirage/ocaml-cohttp#207

Open
