Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐞 Param string data buffer is reused, causing unexpected side effects #426

Closed
ViRb3 opened this issue May 29, 2020 · 13 comments
Closed

Comments

@ViRb3
Copy link

ViRb3 commented May 29, 2020

Fiber version/commit

v1.10.1

Issue description

The method fiber.Ctx.Params(key string) string returns strings that share the same data buffer. This means that once a new string is returned, all previously returned strings will be changed to the new string's content, no matter how much they have been copied and passed around, since their data buffer is the same. This behavior is clearly seen in PoC below. unsafe is used to parse the header of each string returned from the Params method and then its content is printed. Notice how the data pointer is identical for all strings, meaning that they all use the same data buffer.

The only workaround right now is to explicitly copy the data buffer into a new buffer and create a string from the new buffer:

newBuffer := make([]byte, len(result))
copy(newBuffer, result)
newResult := string(newBuffer)

This behavior is absolutely unexpected, as strings in Go are expected to be immutable. I have not tracked down exactly what causes this issue, but I suspect that it may be a side effect from an inherited optimization by fasthttp. If this is true, I suspect that there are other methods that behave the same way as well. This behavior should be changed, or a clear documentation notice should be placed that indicates this side effect.

PoC

package main

import (
	"fmt"
	"github.com/gofiber/fiber"
	"net/http"
	"reflect"
	"sync"
	"unsafe"
)

var results []string
var wg sync.WaitGroup

func main() {
	go server()
	client()
	wg.Wait()
}

func client() {
	wg.Add(3)
	http.Get("http://127.0.0.1:3000/aaa")
	http.Get("http://127.0.0.1:3000/bbbb")
	http.Get("http://127.0.0.1:3000/c")
}

func server() {
	app := fiber.New()
	app.Get("/:test", func(c *fiber.Ctx) {
		result := c.Params("test")
		hdr := (*reflect.StringHeader)(unsafe.Pointer(&result))
		results = append(results, result)
		fmt.Printf("Param string: %+v Value:%s\n", hdr, result)
		fmt.Printf("Results: %+v\n", results)
		wg.Done()
	})
	app.Listen(3000)
}

Output

Param string: &{Data:824633868529 Len:3} Value:aaa
Results: [aaa]
Param string: &{Data:824633868529 Len:4} Value:bbbb
Results: [bbb bbbb]
Param string: &{Data:824633868529 Len:1} Value:c
Results: [cbb cbbb c]
@welcome
Copy link

welcome bot commented May 29, 2020

Thanks for opening your first issue here! 🎉 Be sure to follow the issue template! If you need help or want to chat with us, join us on Discord https://gofiber.io/discord

@Fenny
Copy link
Member

Fenny commented May 30, 2020

We use the same method as Fasthttp to convert strings and bytes without memory allocation, but requires you to never change the original byte array, but that's a reasonable contract in many cases. To persist values after returning from the handler, you can use the build-in Immutable setting in Fiber.

Reference: #185

app := fiber.New(&fiber.Settings{
    Immutable: true,
})

However, we could make an exception to allocate parameters when c.Params() is called and keep 0 allocations on matching / hot paths. @ReneWerner87 thoughts?

@ViRb3
Copy link
Author

ViRb3 commented May 30, 2020

Oh, I was not aware of the immutable setting. I wouldn't mind keeping the current behavior, but there should be clear documentation that states such side effects - perhaps a general warning in the start of the project, so people are aware that this is even a possible side effect, and maybe note down the scope of returned variables on each method's documentation, like fasthttp does it. It took me a full day to track down this issue just because I just didn't expect it to be coming from fiber.

@Fenny
Copy link
Member

Fenny commented May 30, 2020

You are right regarding the documentation, we could add some sort of warning/disclaimer regarding this issue and explain how it would be solvable using the Immutable setting or allocate/copy specific variables by using the default string or []byte types.

If you have any suggestions on how we could phrase/summary this behavior, I will add it to all README's and API Docs right away.

Suggestion:

Returned value is valid until the handler returns, do not store references, make copies instead or use the Immutable setting to do this for you. Keep in mind that this may result in memory allocation+copy.

@ViRb3
Copy link
Author

ViRb3 commented May 30, 2020

My recommendation would be to make two changes.

First, add a dedicated section in the beginning of the docs that outlines this concept as a whole. I created a PR for that, mentioned above. Any feedback is very welcome.

Second, for each method that returns a non-immutable value, like the Params method discussed in this issue, add a warning to its docs, like the one you proposed. I allowed myself to re-phrase it to make it a bit shorter:

Returned value is only valid within the handler. Do not store any references. Make copies or use the Immutable setting instead.

@Fenny
Copy link
Member

Fenny commented May 30, 2020

Awesome work @ViRb3, I merged your PR in the docs. I will add your short phrase to each non-immutable method this weekend.

Feel free to join us on Discord to have a chat!

@Fenny Fenny closed this as completed May 30, 2020
@gabzim
Copy link

gabzim commented Jul 10, 2021

@Fenny when you say not to keep references to it, is it ok to pass it to the service layer if the handler will wait for the service to return before it returns? example:

app.Get("/users/:id", func(c *fiber.Ctx) {
		userID := c.Params("id")
		// is this ok?
		user, err := userService.FindByID(userID)
                // ... rest of the code
	})

Assuming above that userID is only used to make a query to the DB and not stored anywhere in state beyond that use.

@ReneWerner87
Copy link
Member

Yeah should be okay, the string is there until the response is written.

@calbot
Copy link

calbot commented Aug 14, 2021

I just got hit by this issue. I would like to suggest, (if it works) to return a new type something like below unsafestring.
This would make it explicit that the mutability MUST be considered when using the string. Go strings are always spoken of as immutable so it should probably be a new type. This way you could pass your unsafestring into functions and it would be clear that it requires special consideration. If you have fairy complicated program you could easily lose track of which strings are allowed to outlive the caller's scope.

Params, Query would return unsafestring instead of string

//Use the Copy method if you need it outside the scope
//of the fiber handler. It's ok to use UNSAFE to avoid a copy 
//if you know that that the result won't outlive the fiber handler

type unsafestring string

func (u unsafestring) Copy() string {
	return fiberUtils.ImmutableString(string(u))
}

func (u unsafestring) UNSAFE() string {
	return string(u)
}

@DiggidyDave
Copy link

Also just got hit by this. This is insidious. Not only was it causing values to change underneath us, it was also changing the values that we were logging to trace through the system and try to find the issue.

I appreciate what you're doing here, but this should be an opt-in setting IMHO. It would be much more discoverable as a performance feature that you can enable that can be called out in documentation for people looking to improve performance. I do see that it is in the docs now as suggested above, but I suspect very few people look up and read to the bottom the documentation for ctx.Get for example.

@gaby
Copy link
Member

gaby commented May 19, 2024

@DiggidyDave This is highly explained in the First Page of the documentation. See here https://docs.gofiber.io/#zero-allocation

It also has an example of how to turn off this feature for the full framework.

@DiggidyDave
Copy link

DiggidyDave commented May 19, 2024

Sure, anyone who actually starts with the docs will probably see it. Others, like me, may start with the readme on github and will only consult docs to fill gaps in an otherwise very intuitive API. Also, someone may see it and brush it off as an edge case that doesn't apply to them at the time and skim past it, only to be bitten by it later. If you are banking on folks seeing the docs, why not have the riskiest part of that assumption be opt-in and super discoverable in the docs?

It was just a suggestion, no major complaints here. Love the project, just hit this and thought I'd mention it. :-)

@gaby
Copy link
Member

gaby commented May 19, 2024

@DiggidyDave Thanks for the suggestion, I will update the README to mention this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants