
to fix v1.5.8 Usage return #223

Merged: 4 commits, Apr 4, 2023

Conversation

@nasa1024 (Contributor) commented Apr 3, 2023

In PR #215 I found that I used the wrong way to add Usage: when stream = true, the Usage field will not be returned.
[screenshot: usage]
Usage should be calculated by the user themselves.

@sashabaranov (Owner)

@nasa1024 hey, you meant to remove this line in this PR? (see diff)

@nasa1024 (Contributor, Author) commented Apr 3, 2023

Here is the right way to calculate usage:

go get github.com/pkoukk/tiktoken-go

Here is an example of how to get usage:

package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"io"

	"github.com/pkoukk/tiktoken-go"
	openai "github.com/sashabaranov/go-openai"
)

func main() {
	chat()
	stream()
}

func chat() {
	client := openai.NewClient("") // pass your OpenAI API key
	resp, err := client.CreateChatCompletion(
		context.Background(),
		openai.ChatCompletionRequest{
			Model:     openai.GPT3Dot5Turbo,
			MaxTokens: 20,
			Messages: []openai.ChatCompletionMessage{
				{
					Role:    openai.ChatMessageRoleAssistant,
					Content: "你好,我叫小梦",
				},
				{
					Role:    openai.ChatMessageRoleUser,
					Content: "什么是算法稳定币?",
				},
			},
		},
	)

	if err != nil {
		fmt.Printf("ChatCompletion error: %v\n", err)
		return
	}

	fmt.Printf("response: %#v\n", resp)
}

func stream() {
	c := openai.NewClient("") // pass your OpenAI API key
	ctx := context.Background()
	var response openai.ChatCompletionResponse
	var completionText string

	req := openai.ChatCompletionRequest{
		Model:     openai.GPT3Dot5Turbo,
		MaxTokens: 20,
		Messages: []openai.ChatCompletionMessage{
			{
				Role:    openai.ChatMessageRoleAssistant,
				Content: "你好,我叫小梦",
			},
			{
				Role:    openai.ChatMessageRoleUser,
				Content: "什么是算法稳定币?",
			},
		},
		Stream: true,
	}
	stream, err := c.CreateChatCompletionStream(ctx, req)
	if err != nil {
		fmt.Printf("ChatCompletionStream error: %v\n", err)
		return
	}
	defer stream.Close()

	fmt.Printf("Stream response: ")
	for {
		resp, err := stream.Recv() // use a new name so the outer `response` is not shadowed
		if errors.Is(err, io.EOF) {
			fmt.Println("\nStream finished")
			break
		}

		if err != nil {
			fmt.Printf("\nStream error: %v\n", err)
			return
		}

		completionText += resp.Choices[0].Delta.Content
		fmt.Print(resp.Choices[0].Delta.Content) // Printf with a non-constant format string is a vet error
	}

	str, err := json.Marshal(req)
	if err != nil {
		fmt.Printf("json.Marshal error: %v\n", err)
		return
	}
	fmt.Printf("Request: %s\n", string(str))

	// The API never returns usage for streamed completions, so compute it locally.
	response.Usage.PromptTokens = NumTokensFromMessages(req.Messages, openai.GPT3Dot5Turbo)
	response.Usage.CompletionTokens = getTokenByModel(completionText, openai.GPT3Dot5Turbo)
	response.Usage.TotalTokens = response.Usage.PromptTokens + response.Usage.CompletionTokens
	fmt.Printf("response: %#v\n", response)
}

// getTokenByModel returns the number of tokens in text under the given model's encoding.
func getTokenByModel(text string, model string) (num_tokens int) {
	tkm, err := tiktoken.EncodingForModel(model)
	if err != nil {
		fmt.Printf("EncodingForModel error: %v\n", err)
		return
	}

	token := tkm.Encode(text, nil, nil)

	return len(token)
}

// NumTokensFromMessages counts the prompt tokens for a slice of chat messages,
// following the OpenAI cookbook's counting rules.
func NumTokensFromMessages(messages []openai.ChatCompletionMessage, model string) (num_tokens int) {
	tkm, err := tiktoken.EncodingForModel(model)
	if err != nil {
		err = fmt.Errorf("EncodingForModel: %v", err)
		fmt.Println(err)
		return
	}

	var tokens_per_message int
	var tokens_per_name int
	if model == "gpt-3.5-turbo-0301" || model == "gpt-3.5-turbo" {
		tokens_per_message = 4
		tokens_per_name = -1
	} else if model == "gpt-4-0314" || model == "gpt-4" {
		tokens_per_message = 3
		tokens_per_name = 1
	} else {
		fmt.Println("Warning: model not found. Using cl100k_base encoding.")
		tokens_per_message = 3
		tokens_per_name = 1
	}

	for _, message := range messages {
		num_tokens += tokens_per_message
		num_tokens += len(tkm.Encode(message.Content, nil, nil))
		if message.Name != "" {
			num_tokens += tokens_per_name
		}
	}
	num_tokens += 3 // every reply is primed with <|start|>assistant<|message|>
	return num_tokens
}

I'm sorry for the increase in your workload and the errors in the commit history.

@nasa1024 (Contributor, Author) commented Apr 3, 2023

@collinvandyck
The prompt token counts here are not equal:

response: openai.ChatCompletionResponse{ID:"chatcmpl-71EycXmMs6ofZi4IdbUwEaYP1T8iR", Object:"chat.completion", Created:1680530254, Model:"gpt-3.5-turbo-0301", Choices:[]openai.ChatCompletionChoice{openai.ChatCompletionChoice{Index:0, Message:openai.ChatCompletionMessage{Role:"assistant", Content:"算法稳定币是一种数字货币,与其他稳定", Name:""}, FinishReason:"length"}}, Usage:openai.Usage{PromptTokens:35, CompletionTokens:20, TotalTokens:55}}
Stream response: 算法稳定币是一种基于算法的数字货币,
Stream finished
Request: {"model":"gpt-3.5-turbo","messages":[{"role":"assistant","content":"你好,我叫小梦"},{"role":"user","content":"什么是算法稳定币?"}],"max_tokens":20,"stream":true}response: openai.ChatCompletionResponse{ID:"", Object:"", Created:0, Model:"", Choices:[]openai.ChatCompletionChoice(nil), Usage:openai.Usage{PromptTokens:33, CompletionTokens:20, TotalTokens:53}}

Maybe we need to discuss this some more.

@nasa1024 (Contributor, Author) commented Apr 3, 2023

@nasa1024 hey, you meant to remove this line in this PR? (see diff)

I don't know what I should do, I'm so sorry!
XD
Sir, have a nice day.

@collinvandyck

@nasa1024 I believe that @sashabaranov is saying that because usage does not come back in the actual HTTP response for stream completions, adding omitempty is not really the right approach, since it will always be empty. I think it should be removed altogether.

Also thanks for the call out on how to calculate tokens manually. This seems like a cool thing to have in go-openai -- the Recv code could possibly use this and just calculate the usage for you based on the model that generated the stream response.
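For illustration, here is a minimal sketch of what that could look like: a hypothetical wrapper, not part of go-openai, that reuses the NumTokensFromMessages and getTokenByModel helpers from the example above.

// streamWithUsage is a hypothetical sketch: it drains the stream, accumulates
// the completion text, and computes Usage locally, since the API returns no
// usage data when Stream is true.
func streamWithUsage(ctx context.Context, c *openai.Client, req openai.ChatCompletionRequest) (openai.Usage, error) {
	stream, err := c.CreateChatCompletionStream(ctx, req)
	if err != nil {
		return openai.Usage{}, err
	}
	defer stream.Close()

	var completionText string
	for {
		resp, err := stream.Recv()
		if errors.Is(err, io.EOF) {
			break
		}
		if err != nil {
			return openai.Usage{}, err
		}
		completionText += resp.Choices[0].Delta.Content
	}

	// Count tokens locally with the tiktoken-based helpers shown earlier.
	usage := openai.Usage{
		PromptTokens:     NumTokensFromMessages(req.Messages, req.Model),
		CompletionTokens: getTokenByModel(completionText, req.Model),
	}
	usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
	return usage, nil
}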


codecov bot commented Apr 3, 2023

Codecov Report

Merging #223 (6e88d53) into master (bee0656) will increase coverage by 0.92%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #223      +/-   ##
==========================================
+ Coverage   72.51%   73.44%   +0.92%     
==========================================
  Files          21       21              
  Lines         593      625      +32     
==========================================
+ Hits          430      459      +29     
- Misses        124      125       +1     
- Partials       39       41       +2     
Impacted Files    Coverage Δ
chat_stream.go    88.00% <ø> (ø)

... and 2 files with indirect coverage changes


@sashabaranov (Owner)

@nasa1024 As @collinvandyck correctly mentioned — please just remove the Usage field from the response, it would not be there anyway.

chat_stream.go (outdated review thread)

@nasa1024 (Contributor, Author) left a comment:
remove the Usage field from the response
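For context, here is a sketch of what the stream response struct looks like with the field removed. This is an approximation of the go-openai structs at the time, not the exact merged diff.

// ChatCompletionStreamResponse without the Usage field. The field list is an
// approximation of the library at the time, not a verbatim copy of the diff.
type ChatCompletionStreamResponse struct {
	ID      string                       `json:"id"`
	Object  string                       `json:"object"`
	Created int64                        `json:"created"`
	Model   string                       `json:"model"`
	Choices []ChatCompletionStreamChoice `json:"choices"`
}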

@kaijietti commented Apr 18, 2023

#223 (comment)
@nasa1024
In the function func NumTokensFromMessages(messages []openai.ChatCompletionMessage, model string) (num_tokens int):

for _, message := range messages {
	num_tokens += tokens_per_message
	num_tokens += len(tkm.Encode(message.Content, nil, nil))
	if message.Name != "" {
		num_tokens += tokens_per_name
	}
}

This appears to also be missing:
num_tokens += len(tkm.Encode(message.Role, nil, nil))
and
num_tokens += len(tkm.Encode(message.Name, nil, nil))

===
Reference: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb

@nicewook

At the moment (6/27), token counting has changed slightly.

// below link may not work on Chrome(error: Unable to render code block)
// then, use FireFox
// OpenAI Cookbook: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
func NumTokensFromMessages(messages []openai.ChatCompletionMessage, model string) (numTokens int) {
	tkm, err := tiktoken.EncodingForModel(model)
	if err != nil {
		err = fmt.Errorf("encoding for model: %v", err)
		log.Println(err)
		return
	}

	var tokensPerMessage, tokensPerName int

	if model == "gpt-3.5-turbo-0613" ||
		model == "gpt-3.5-turbo-16k-0613" ||
		model == "gpt-4-0314" ||
		model == "gpt-4-32k-0314" ||
		model == "gpt-4-0613" ||
		model == "gpt-4-32k-0613" {
		tokensPerMessage = 3
		tokensPerName = 1 // per the cookbook, a named message adds one token for these models
	} else if model == "gpt-3.5-turbo-0301" {
		tokensPerMessage = 4 // every message follows <|start|>{role/name}\n{content}<|end|>\n
		tokensPerName = -1   // if there's a name, the role is omitted
	} else if model == "gpt-3.5-turbo" {
		log.Println("warning: gpt-3.5-turbo may update over time. Returning num tokens assuming gpt-3.5-turbo-0613.")
		return NumTokensFromMessages(messages, "gpt-3.5-turbo-0613")
	} else if model == "gpt-4" {
		log.Println("warning: gpt-4 may update over time. Returning num tokens assuming gpt-4-0613.")
		return NumTokensFromMessages(messages, "gpt-4-0613")
	} else {
		err := errors.New("warning: model not found. Using cl100k_base encoding")
		log.Println(err)
		return
	}

	for _, message := range messages {
		numTokens += tokensPerMessage
		numTokens += len(tkm.Encode(message.Content, nil, nil))
		numTokens += len(tkm.Encode(message.Role, nil, nil))
		numTokens += len(tkm.Encode(message.Name, nil, nil))
		if message.Name != "" {
			numTokens += tokensPerName
		}
	}
	numTokens += 3 // every reply is primed with <|start|>assistant<|message|>
	return numTokens
}
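For reference, a call site might look like this (a hypothetical snippet that assumes the function above and its imports are in scope):

messages := []openai.ChatCompletionMessage{
	{Role: openai.ChatMessageRoleUser, Content: "什么是算法稳定币?"},
}
numTokens := NumTokensFromMessages(messages, "gpt-3.5-turbo-0613")
log.Printf("prompt tokens: %d", numTokens)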
