Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements #89 by adding a FileResponse class #94

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

scallopedllama
Copy link

This Pull Request adds a new class FileResponse which will asynchronously stream file data back to a client.

It can be used in two ways:

  • A shared pointer to an instance of a FileResponse can be returned from a PathHandler to stream that file back
  • Another Response object can instantiate a FileResponse and use its respond method to send file data back on a provided ResponseWriter shared pointer

In either situation, the file data is read and queued up for transmission on the server thread to prevent stalling other requests. If zlib support is enabled and it is explicitly enabled within the constructor, the data response can be compressed with the deflate method.

When the file being sent back is uncompressed, file resuming with the Range header is supported. Also, the If-Modified-Since, If-Unmodified-Since, and If-Range headers are supported, using the file's last modification time to determine this. Support for these headers required the addition of a webtime string parser to be added to the StringUtil class.

When the file being sent back is compressed, it is not compressed ahead of time but while it is being loaded from the disk. As a consequence, the final data size is unknown so a Content-Length header is not provided and the Range header is no longer supported though the If-Modified-Since and If-Unmodified-Since header will still be acknowledged. If the client supports compression but would rather have file resuming ability, compression is disabled if the incoming Accept-Encoding header does not have deflate in it.

This Pull Request implements #89

Add a string webtime parser that returns a time_t to go with
the webtime generator that takes a time_t and returns a string.
@codecov
Copy link

codecov bot commented Aug 8, 2018

Codecov Report

Merging #94 into master will decrease coverage by 2.23%.
The diff coverage is 0%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master    #94      +/-   ##
========================================
- Coverage   39.23%    37%   -2.24%     
========================================
  Files          40     42       +2     
  Lines        2039   2162     +123     
========================================
  Hits          800    800              
- Misses       1239   1362     +123
Impacted Files Coverage Δ
src/main/c/seasocks/Response.h 100% <ø> (ø) ⬆️
src/main/c/seasocks/ZlibContext.cpp 0% <ø> (ø) ⬆️
src/main/c/Response.cpp 90.47% <0%> (-9.53%) ⬇️
src/main/c/Connection.cpp 22% <0%> (-0.03%) ⬇️
src/main/c/StringUtil.cpp 75% <0%> (-6.36%) ⬇️
src/main/c/seasocks/ResponseCodeDefs.h 3.03% <0%> (-0.2%) ⬇️
src/main/c/seasocks/util/FileResponse.h 0% <0%> (ø)
src/main/c/util/FileResponse.cpp 0% <0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4daa1e3...0c473a4. Read the comment docs.

@hoytech
Copy link
Contributor

hoytech commented Aug 8, 2018

Personally I would consider if it's possible for your application to use a server that specialises in file serving (nginx is good) because it can probably do it much more efficiently (ie using sendfile(2), pre-gzipped files, etc).

That said, I'm not against this as long as it doesn't increase the attack surface for apps that just want to use websockets (which of course is the main point of seasocks).

Also I quickly grepped your patch and didn't see any Vary headers, you'll probably want to add a "Vary: Accept-Encoding": https://blog.stackpath.com/accept-encoding-vary-important

@hoytech
Copy link
Contributor

hoytech commented Aug 9, 2018

I just realised my comment may have come off sounding discouraging, sorry for that. I actually do think this is a worthwhile patch!

Also, this was just my opinion and I of course don't speak for @mattgodbolt nor about the direction he is planning for seasocks.

Cheers!

@scallopedllama
Copy link
Author

Personally I would consider if it's possible for your application to use a server that specialises in file serving (nginx is good) because it can probably do it much more efficiently (ie using sendfile(2), pre-gzipped files, etc).

Definitely would have used an engine like that if most of the things that my application was doing was returning files but for the most part it's programmatic. Person accesses address, C++ function is called, data is returned. So far Seasocks has been a fantastic match for the common case. It's just that a few addresses return an actual file from the filesystem which is why I need this thing. Doesn't need to be fancy, just work well enough.

I'll look into the Vary header, thanks for the great link

A shared pointer to a FileResponse can be returned from a PathHandler's handle()
method to asynchronously return the indicated file back to a client.
It understands and supports the Range header for non-multipart byte ranges along
with the If-Range, If-Modified-Since, and If-Unmodified-Since conditional headers.

Additionally, if enabled during construction and the client supports it,
the file data being sent back to the client is compressed using the
built in zlib support for deflate Content-Encoding.
This does not compress ahead of time and cache the compressed data,
instead the data is compressed while it is sent back to the client.
Because of this, content length is not known ahead of time and resume
features are disabled.
Add a static fileResponse method that creates and returns a
file response, modeled after the text, json, and html response
methods.
@scallopedllama
Copy link
Author

Added the Vary header to the response so it will be cached correctly.

@offa
Copy link
Collaborator

offa commented Aug 9, 2018

Wow, you've put a lot of work into this! I've added some implementation related comments.

Btw. some tests would be good.

@@ -52,6 +53,7 @@ class Response {
static std::shared_ptr<Response> textResponse(const std::string& response);
static std::shared_ptr<Response> jsonResponse(const std::string& response);
static std::shared_ptr<Response> htmlResponse(const std::string& response);
static std::shared_ptr<Response> fileResponse(const Request &request, const std::string &filePath, const std::string &contentType, bool allowCompression, bool allowCaching);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a forward declaration possible here (to avoid the inclusion of Request.h)?

@@ -0,0 +1,71 @@
// Copyright (c) 2018, Joe Balough
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scallopedllama / @mattgodbolt does this require an user to handle two licenses – the default text and this one?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm no lawyer, but I don't think the copyright affects the license. In other projects I have the author put their name, as long as the license remains intact below

#include "seasocks/ToString.h"
#include "seasocks/ZlibContext.h"

using namespace seasocks;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be better wrapped into namespace seasocks { ... }.


using namespace seasocks;

constexpr size_t ReadWriteBufferSize = 16 * 1024;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably anonymous namespace here and variables usually start with lower case (just to be consistent).


// If compression is allowed, try to initialize the zlib context here
// It will throw a runtime_error if zlib was diabled in the build
if (sendCompressedData) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should improve this. Would it help you to have some way to directly determine whether compression is enabled / disabled?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also probably we can avoid the if (sendCompressedData) { ... } checks and extract it to a single place (at least it would make sense).

}
bytesLeft -= bytesRead;
// compress data if possible
if (sendCompressedData) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above; maybe we can improve the zlib stuff in this function.

std::string rangeString = _request.getHeader("Range");

// Check for multipart range
std::regex multipartRegex("^\\S+=[0-9]+-[0-9]*, [0-9]+-[0-9]*");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instances of std::regex should be used with care as it may have poor performance on creation. It makes sense to make it const and create it only once therefore. Actually moving it as a const-variable into the anonymous namespace should do the trick.

(Same to line 240).

@@ -125,6 +126,15 @@ std::string webtime(time_t time) {
return buf;
}

time_t webtime(std::string time) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Const-Ref here?

@@ -48,6 +48,8 @@ void replace(std::string& string, const std::string& find, const std::string& re
bool caseInsensitiveSame(const std::string &lhs, const std::string &rhs);

std::string webtime(time_t time);
// Returns -1 on error
time_t webtime(std::string time);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it reasonable to throw an exception in case of an error here?


private:

ResponseCode parseHeaders(const off_t fileSize, const time_t fileLastModified, bool sendCompressedData, off_t& fileTransferStart, off_t& fileTransferEnd) const;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible avoid out-parameters and eg. return a tuple.

@offa
Copy link
Collaborator

offa commented Aug 9, 2018

I've added some implementation related comments.

… which are not visible until Submit review is pressed …

@mattgodbolt
Copy link
Owner

@offa how do you feel about this change? I skim-read the comments and all looks OK to me...are we waiting on the author to address anything?

@offa
Copy link
Collaborator

offa commented Sep 5, 2018

I think some of the comments should be addressed before. And we should get some tests in. But actually there's nothing dramatic, just a few things which have to be done.

We / I can help @scallopedllama to fix them of course. But which way do we go?

  1. Contribute via PR to this PR
  2. Contribute directly here (can we?!)
  3. Merge to a development branch and once finished to master

I'm fine with all of them. We could also merge this to master and then work on it, but this is probably not the best way to go. @scallopedllama / @mattgodbolt what do you think?

@offa
Copy link
Collaborator

offa commented Sep 25, 2018

ping @scallopedllama

@mattgodbolt
Copy link
Owner

Another ping @scallopedllama :) Thanks again for this, would be great to get it merged in. I'd prefer working off-master (another branch in this project) is fine, if we need to work on the repo here.

@scallopedllama
Copy link
Author

scallopedllama commented Oct 17, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants