
Enforced data size in channels 2.1.7 prevents large file uploads #1242

Closed
wants to merge 3 commits

Conversation


@agateblue agateblue commented Feb 7, 2019

Fix #1240

Todo:

  • Test case to reproduce the issue
  • Fix :D

I think the actual fix will be a bit tricky because right now, we rely on the Content-Length header value to accept or refuse the request. Mimicking Django's implementation would probably be a better option:

  • If the request CONTENT_TYPE is multipart, then we need to enforce the limit when we lazily read the body
  • Else, we can use CONTENT_LENGTH to accept or refuse the request upfront
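A minimal sketch of that branching, with hypothetical helper names (this is not the actual channels code; Django's default limit of 2.5 MB is used for illustration):

```python
# Hypothetical sketch of the proposed branching; not the actual channels code.
DATA_UPLOAD_MAX_MEMORY_SIZE = 2_621_440  # Django's default limit: 2.5 MB


class RequestDataTooBig(Exception):
    """Raised when the declared body size exceeds the configured limit."""


def is_multipart(headers):
    """Multipart bodies are mostly file data, which Django excludes from
    the limit, so their check must wait until the body is parsed."""
    return headers.get("content-type", "").startswith("multipart/form-data")


def check_upfront(headers):
    """Refuse non-multipart requests whose declared size exceeds the limit."""
    if is_multipart(headers):
        return  # deferred: the multipart parser enforces the limit lazily
    try:
        content_length = int(headers.get("content-length", 0))
    except ValueError:
        content_length = 0
    if content_length > DATA_UPLOAD_MAX_MEMORY_SIZE:
        raise RequestDataTooBig("body exceeds DATA_UPLOAD_MAX_MEMORY_SIZE")
```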

If that looks good to you, then I can start working on a fix :)

cc @Zarathustra2 @carltongibson

@carltongibson
Member

Okay, awesome. I’m not going to get to this until next week now, BUT your breakdown is exactly what I was half-thinking, so if you’d like to sketch that out too, that would be great. (First draft doesn’t have to be perfect.)

@agateblue
Author

@carltongibson I think I got a working fix, it took less time than expected :D

I've added another test case (multipart + file data + too large POST data) just to ensure we've got that right too. Let me know what you think. Basically:

  • if we're in a multipart request, we skip the channels check and let Django perform it lazily via the multipart parser when accessing POST data
  • otherwise, we fail early and loudly when instantiating the request

@agateblue
Author

I tested on my project and it does indeed fix the denied file upload issue, by the way!

channels/http.py (review thread, resolved)
@jpic
Contributor

jpic commented Feb 7, 2019

Why didn't you set DATA_UPLOAD_MAX_MEMORY_SIZE=None in your project, which should disable this check entirely? Or why did you decide not to increase your DATA_UPLOAD_MAX_MEMORY_SIZE?

@agateblue
Author

> Why didn't you set DATA_UPLOAD_MAX_MEMORY_SIZE=None in your project, which should disable this check entirely? Or why did you decide not to increase your DATA_UPLOAD_MAX_MEMORY_SIZE?

@jpic as explained in #1240, there is a bug in the current implementation. As stated in Django's documentation about DATA_UPLOAD_MAX_MEMORY_SIZE:

> The check is done when accessing request.body or request.POST and is calculated against the total request size excluding any file upload data.

I do want to benefit from this protection in my project, I just don't want to have it applied to file uploads, which is the behaviour implemented in Django.
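For reference, the documented Django behaviour being described: DATA_UPLOAD_MAX_MEMORY_SIZE caps non-file form data only, while in-memory file handling is governed by a separate setting. A plain settings sketch, not code from the PR (values shown are Django's defaults):

```python
# settings.py sketch — Django's documented defaults, for illustration only
DATA_UPLOAD_MAX_MEMORY_SIZE = 2_621_440  # 2.5 MB; request data excluding file uploads
FILE_UPLOAD_MAX_MEMORY_SIZE = 2_621_440  # 2.5 MB; threshold before uploads spill to a temp file
# DATA_UPLOAD_MAX_MEMORY_SIZE = None would disable the size check entirely
```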

@jpic
Contributor

jpic commented Feb 7, 2019

Nice, thanks @EliotBerriot. So, your implementation works for me in the sense that it does what it's meant to do. However, I'm wondering whether this isn't the occasion to move that check "upon access", as it is in Django, rather than keeping it in the constructor.

I mean, having a too-big request should prevent me from reading it with RequestDataTooBig, but should it really prevent me from instantiating a request object at all? Sorry if this is out of the scope of your PR, which does fix a bug.

@agateblue
Author

> I mean, having a too-big request should prevent me from reading it with RequestDataTooBig, but should it really prevent me from instantiating a request object at all? Sorry if this is out of the scope of your PR, which does fix a bug.

@jpic oh, I understand what you mean. Well, that's really an API design decision. I don't see anything inherently wrong with the current approach. In terms of security/resource usage, failing upon instantiation is probably the best choice.

Now, if we ever needed to instantiate a big request, failing in the constructor could indeed be impractical. I'm trying to think of a few situations/use cases where we'd need to do that, but I cannot find any right now :)

@agateblue
Author

@jpic I've pushed a commit that factors out the boundary / multipart request handling, let me know what you think about it :)
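A factored-out multipart/boundary helper could look roughly like this (hypothetical names, sketched from the Content-Type header format; not the code from the actual commit):

```python
def parse_multipart_boundary(content_type):
    """Return (is_multipart, boundary) for a Content-Type header value.

    Hypothetical helper illustrating the kind of factoring discussed;
    not the code from the actual commit.
    """
    if not content_type.startswith("multipart/form-data"):
        return False, None
    # Parameters follow the media type, separated by semicolons,
    # e.g. 'multipart/form-data; boundary=abc123'
    for param in content_type.split(";")[1:]:
        key, _, value = param.strip().partition("=")
        if key.lower() == "boundary":
            return True, value.strip('"')
    return True, None
```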

@agateblue agateblue changed the title WIP: Enforced data size in channels 2.1.7 prevent large file uploads Enforced data size in channels 2.1.7 prevent large file uploads Feb 8, 2019
An excerpt of the test body from the review thread:

    + b'Content-Disposition: form-data; name="pdf"; filename="book.pdf"\r\n\r\n'
    + b"FAKEPDFBYTESGOHERETHISISREALLYLONGBUTNOTUSEDTOCOMPUTETHESIZEOFTHEREQUEST"
    + END_BOUNDARY
    )
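The excerpt above starts mid-expression; for context, a self-contained version of that kind of multipart test body might be assembled like this (BOUNDARY and payload values are illustrative, not the PR's exact constants):

```python
# Illustrative multipart/form-data body, in the spirit of the PR's tests.
BOUNDARY = b"BOUNDARY"
START_BOUNDARY = b"--" + BOUNDARY + b"\r\n"
END_BOUNDARY = b"\r\n--" + BOUNDARY + b"--\r\n"

body = (
    START_BOUNDARY
    + b'Content-Disposition: form-data; name="pdf"; filename="book.pdf"\r\n\r\n'
    + b"FAKEPDFBYTESGOHERE"  # file payload: excluded from Django's size check
    + END_BOUNDARY
)
content_type = b"multipart/form-data; boundary=" + BOUNDARY
```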
Contributor

If body is the same as above, would it work to define a module- or class-level variable?

Author

@jpic I can, but in my experience we usually don't need the same level of factoring in tests as in runtime code, and I tend to avoid shared state between tests.

Tests should be readable and independent. If we try to factor everything out, it becomes much harder to understand and maintain tests. Quick example: it took me literally 30 seconds to write those tests, because I could read the tests above, copy-paste the code, and edit it as needed.

It would have taken me far more time if I had to jump between 5 constants and understand whether each matched my needs, then figure out whether I should create a new one for my test (which may be slightly different) or edit the existing one in place (which could break other tests). What if someone edits that constant in the future? They would need to understand their tests, and mine, to avoid breaking anything.

I like that you can screenshot/copy-paste a test and understand what's going on. That's far more important than factoring, IMHO, which quickly becomes a nightmare with huge codebases and thousands of tests.

That being said, I'm not maintaining this project, and I'll happily move the whole body into a constant if you think it's the best thing :)

Contributor

Oh, I can live with anything; it was just my suggestion, because I like to refactor tests after writing them. Let's see what others think about it ;)

Congrats on your contribution, and have a great day!

@agateblue
Author

Hey there, what are we doing about this? :)

@carltongibson
Member

Hi @EliotBerriot did you see #1240 (comment)?

@agateblue
Author

Hi @carltongibson :) I did, but I don't really know what it means for the current PR, hence my question. Do we close this? Do we merge this? Do I need to amend this PR?

I understand the need to refactor how channels handles request bodies. I still think there was a backward-incompatible change and non-standard behaviour (according to Django's documentation) in the 2.1.7 release, and in that context I'd say fixing the issue is still relevant. It does not prevent a refactor at a later point.

Setting DATA_UPLOAD_MAX_MEMORY_SIZE=None works in theory but I don't want to allow anyone to send requests with megabytes of data in the body, unless it's a file.

@carltongibson
Member

Hey @EliotBerriot. Yes, I understand. It doesn't make any sense from a project-management POV to merge this, only to alter it again immediately. Better to think the approach through, do it right, and then roll a release with appropriate change notes so people know what has occurred.

If you're in a super rush here, you can deploy your branch (using pip's VCS support) or create a middleware that just wraps the check for you, which you can remove later.
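One hedged sketch of that middleware idea: wrap the ASGI application and, for multipart requests only, drop the Content-Length header that the upfront check reads, leaving Django's lazy multipart check to enforce limits. All names here are illustrative; this is a stopgap sketch, not code from channels:

```python
class MultipartSizeCheckBypass:
    """ASGI middleware sketch: hide Content-Length from the upfront size
    check for multipart requests only. Illustrative stopgap, not channels code."""

    def __init__(self, inner):
        self.inner = inner

    async def __call__(self, scope, receive, send):
        # ASGI headers are a list of (name, value) byte pairs
        headers = dict(scope.get("headers", []))
        if headers.get(b"content-type", b"").startswith(b"multipart/form-data"):
            scope = dict(scope)  # copy so we don't mutate the server's scope
            scope["headers"] = [
                (name, value)
                for name, value in scope["headers"]
                if name != b"content-length"
            ]
        return await self.inner(scope, receive, send)
```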

I am grateful for your effort here, and in particular the test cases that help show what's going on!

At heart, in patching one issue we've revealed a sub-optimal area. We'll improve that and make progress. I appreciate that, having found it, you want a fix released now, but as I said: measure twice, cut once. (It's top of my list.)

@agateblue
Author

> Hey @EliotBerriot. Yes, I understand. It doesn't make any sense from a project-management POV to merge this, only to alter it again immediately. Better to think the approach through, do it right, and then roll a release with appropriate change notes so people know what has occurred.

Sure, this makes complete sense, and we can close this. I wanted to avoid having an open PR lying around for months :D

> If you're in a super rush here, you can deploy your branch (using pip's VCS support) or create a middleware that just wraps the check for you, which you can remove later.

For now, I've pinned the previous version, and I'll follow progress on this topic as well. (I'm not sure I'll be able to help with the dev part, but I can test things, which is still something.)

Thank you for the dedication and maintenance effort :)

@agateblue agateblue closed this Feb 14, 2019
@carltongibson carltongibson reopened this Feb 14, 2019
@carltongibson
Member

No, don't close it. 😀

It won't be months. It'll be a week or so. (I just need an hour or two to sit down with it.)

I need it open so I don't think I've done it, when I haven't. Thanks. 👍

@carltongibson
Member

OK, @EliotBerriot. Thanks for your efforts here.

I've begun work on #1251, which incorporates the tests you've provided (very helpful, thank you!), so I'll close this now. I'll ping you on #1251 when that's ready for you to run. 👍
