TLS corrupting buffer? #42635

Closed
ThePedroo opened this issue Apr 7, 2022 · 11 comments
Labels
question Issues that look for answers.

Comments

@ThePedroo

Version

17.7.2

Platform

Linux 47a7b9095f7d 5.13.0-1023-gcp #28~20.04.1-Ubuntu SMP Wed Mar 30 03:51:07 UTC 2022 x86_64 GNU/Linux

Subsystem

tls / https?

What steps will reproduce the bug?

Create a socket client using TLS and https.
Make the socket server send 2 messages one right after the other.

How often does it reproduce? Is there a required condition?

I don't think so, but when the server sends 2 messages at once, the buffer looks corrupted: calling .toString('utf8') on it shows 2 JSON objects instead of 1.

What is the expected behavior?

.toString('utf8') shows only one JSON object; then, a few ms later, the data event fires again with the other one.

What do you see instead?

(screenshot attached in the original issue)

Additional information

Nothing, I think.

@ThePedroo
Author

By the way, the same thing seems to happen with the latest LTS version of Node.

@mscdex
Contributor

mscdex commented Apr 7, 2022

In general when it comes to streams (tls or otherwise), you should never make assumptions about the chunks you will receive. You might get 1 byte, you might get 1000 bytes, you might get multi-byte characters split across chunks.
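To make the point about split characters concrete, here is a small sketch. The two-chunk split is contrived for illustration and is not taken from the issue. It compares decoding each chunk on its own with decoding through `StringDecoder` from Node's `string_decoder` module:

```javascript
const { StringDecoder } = require('string_decoder');

// "é" is two bytes in UTF-8 (0xC3 0xA9). Pretend the stream delivered
// the message split between those two bytes.
const whole = Buffer.from('{"name":"José"}', 'utf8');
const splitAt = whole.indexOf(0xc3) + 1;
const chunks = [whole.subarray(0, splitAt), whole.subarray(splitAt)];

// Naive approach: decode every chunk independently.
// Each half of "é" turns into the replacement character U+FFFD.
const naive = chunks.map((c) => c.toString('utf8')).join('');

// StringDecoder holds back the incomplete byte until the rest arrives.
const decoder = new StringDecoder('utf8');
const decoded = chunks.map((c) => decoder.write(c)).join('') + decoder.end();

console.log(naive === whole.toString('utf8'));   // false
console.log(decoded === whole.toString('utf8')); // true
```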

There are a few solutions to this problem, depending on the data stream you're reading from, including:

  1. Readable node streams have a setEncoding() method that will automatically convert stream data to strings and will automatically take care of characters split across chunks. However, this only works if your stream doesn't contain binary data.

  2. If your data stream is something like newline-delimited JSON, then you could just buffer raw chunks in an array until you find a newline. At that point you can just Buffer.concat() all of the chunks up until the newline and then convert to a string that you can JSON.parse().

  3. If your data stream is a mix of binary and strings and your streams or packets do not have length fields, then you will need to explicitly create a StringDecoder. Pass it the string's raw bytes and it will output only completed characters.
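Option 2 above can be sketched roughly as follows. This assumes the peer really does send newline-delimited JSON, which the issue does not confirm. `createLineParser` is a hypothetical helper name, not a Node API, and the sample messages are made up.

```javascript
// Hypothetical helper: buffers raw chunks until a newline, then parses one JSON message.
function createLineParser(onMessage) {
  let pending = []; // raw pieces seen since the last newline

  return function onData(chunk) {
    let start = 0;
    let nl;
    // A single chunk may contain zero, one, or many newlines.
    while ((nl = chunk.indexOf(0x0a, start)) !== -1) {
      pending.push(chunk.subarray(start, nl));
      const line = Buffer.concat(pending).toString('utf8');
      pending = [];
      start = nl + 1;
      if (line.trim() !== '') onMessage(JSON.parse(line));
    }
    // Keep whatever follows the last newline for the next chunk.
    if (start < chunk.length) pending.push(chunk.subarray(start));
  };
}

// Two messages arriving in one chunk, then one message split across two chunks.
const received = [];
const onData = createLineParser((msg) => received.push(msg));
onData(Buffer.from('{"id":1}\n{"id":2}\n'));
onData(Buffer.from('{"id"'));
onData(Buffer.from(':3}\n'));
console.log(received.length); // 3
```

With a real socket this would be wired up as `socket.on('data', onData)`, without calling `setEncoding()`, so that the handler receives Buffers. Decoding only after `Buffer.concat()` means a multi-byte character split across chunks is reassembled before it is converted to a string.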

@mscdex closed this as completed Apr 7, 2022
@mscdex added the question label Apr 7, 2022
@ThePedroo
Author

I already call setEncoding, but it doesn't decode correctly: it leaves some special characters that make JSON.parse fail.
And is it normal for 2 responses from the ws to arrive in one data event?

@ThePedroo
Author

(two screenshots attached in the original issue)

@ThePedroo
Author

@mscdex Btw, it seems the problem is in decoding to UTF-8, not that the buffer being sent is incorrect. Is there a way to save the raw buffer to a file, so I can decode it in another application and see whether it's a problem in my code or maybe a bug?
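For reference, one way to capture the raw bytes is to leave the socket in Buffer mode (no `setEncoding()`) and append every chunk to a file. A minimal sketch, where `createDumper` and the file name are hypothetical and the sample bytes stand in for real socket data:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: returns a function that appends each raw chunk to filePath.
function createDumper(filePath) {
  fs.writeFileSync(filePath, ''); // start with an empty file
  return (chunk) => fs.appendFileSync(filePath, chunk);
}

const dumpPath = path.join(os.tmpdir(), 'socket-dump.bin'); // made-up location
const dump = createDumper(dumpPath);

// Stand-ins for two 'data' events; with a real socket: socket.on('data', dump)
dump(Buffer.from([0x81, 0x02]));
dump(Buffer.from([0x7b, 0x7d]));

// Inspect the bytes as hex rather than as text.
console.log(fs.readFileSync(dumpPath).toString('hex')); // 81027b7d
```

Viewing the dump as hex (or with a tool such as `xxd`) shows any non-text bytes that a UTF-8 decode would turn into odd characters.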

@mscdex
Contributor

mscdex commented Apr 8, 2022

If you're manually handling (e.g. not using an existing library/module) something like WebSockets, you need to first parse the packets appropriately according to the specification. If you're not already doing this, this is most likely the cause of the binary data from your socket.

@ThePedroo
Author

Hmm, I tried reading the parse function in ws/WebSocket, but I never understood what it was doing. Even if I'm not handling this correctly, is it normal for TLS to deliver 2 socket responses at once, or is that just something else I need to handle? I don't know how to handle it, but I believe it involves parsing the buffer byte by byte, the way the ws/WebSocket parse loop does. Right?

@ThePedroo
Author

And shouldn't socket.setEncoding('utf8') decode it correctly?

@mscdex
Contributor

mscdex commented Apr 9, 2022

If you're using ws, you'll have to take it up with them if you are having problems using their code.

@ThePedroo
Author

Oh, nope, not using ws lol, I was just trying to see how they parse it.

@mscdex
Contributor

mscdex commented Apr 10, 2022

WebSocket is not just a bare socket, it is a protocol complete with a well-defined packet format. You need to either read up on the WebSocket specification and parse this data yourself according to that specification or use a third party library/module (e.g. ws) to do it for you. The choice is up to you. Either way, node is doing nothing wrong here.
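As a rough illustration of what parsing according to the specification involves, here is a minimal sketch of the base framing from RFC 6455. `parseFrames` is a hypothetical helper, not a Node or ws API, and the sample payloads are made up. It ignores fragmentation (`fin === false`), control-frame handling, and extensions, so it is not a complete implementation.

```javascript
// Hypothetical helper: extracts every complete frame found in `buf`.
// Returns the frames plus the leftover bytes to prepend to the next chunk.
function parseFrames(buf) {
  const frames = [];
  let offset = 0;
  while (buf.length - offset >= 2) {
    const b0 = buf[offset];
    const b1 = buf[offset + 1];
    const fin = (b0 & 0x80) !== 0;
    const opcode = b0 & 0x0f; // 0x1 text, 0x2 binary, 0x8 close, 0x9 ping, 0xA pong
    const masked = (b1 & 0x80) !== 0;
    let length = b1 & 0x7f;
    let header = 2;
    if (length === 126) {
      if (buf.length - offset < 4) break;
      length = buf.readUInt16BE(offset + 2);
      header = 4;
    } else if (length === 127) {
      if (buf.length - offset < 10) break;
      const big = buf.readBigUInt64BE(offset + 2);
      if (big > BigInt(Number.MAX_SAFE_INTEGER)) throw new RangeError('frame too large');
      length = Number(big);
      header = 10;
    }
    if (masked) header += 4; // the 4-byte masking key follows the length
    if (buf.length - offset < header + length) break; // incomplete, wait for more data
    let payload = buf.subarray(offset + header, offset + header + length);
    if (masked) {
      const key = buf.subarray(offset + header - 4, offset + header);
      payload = Buffer.from(payload); // copy so the input is not modified
      for (let i = 0; i < payload.length; i++) payload[i] ^= key[i & 3];
    }
    frames.push({ fin, opcode, payload });
    offset += header + length;
  }
  return { frames, rest: buf.subarray(offset) };
}

// Builds a small unmasked text frame (server-to-client style, payload under 126 bytes).
const textFrame = (text) => {
  const body = Buffer.from(text, 'utf8');
  return Buffer.concat([Buffer.from([0x81, body.length]), body]);
};

// Two frames arriving in a single chunk.
const chunk = Buffer.concat([textFrame('{"n":1}'), textFrame('{"n":2}')]);
const { frames, rest } = parseFrames(chunk);
const messages = frames.map((f) => JSON.parse(f.payload.toString('utf8')));
console.log(messages.length, rest.length); // 2 0
```

Under these assumptions, the header bytes in front of each payload (here `0x81` and the length byte) are the kind of non-text data that shows up as stray characters when the whole chunk is decoded as UTF-8, and two frames in one chunk look like two JSON documents in one `data` event.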
