This repository has been archived by the owner on Mar 19, 2024. It is now read-only.
fastText uses std::string to represent text data, which is an alias for std::basic_string<char>. Whether char is signed or unsigned is implementation-defined. However, the result of
uint32_t Dictionary::hash(const std::string& str) const {
  uint32_t h = 2166136261;
  for (size_t i = 0; i < str.size(); i++) {
    h = h ^ uint32_t(str[i]);
    h = h * 16777619;
  }
  return h;
}
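depends on whether char is signed or unsigned. For any byte of 0x80 or above (that is, every byte of a non-ASCII UTF-8 character), a signed char is sign-extended when converted to uint32_t, so the XOR also flips the upper 24 bits of h, whereas an unsigned char is zero-extended and only touches the low 8 bits.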
To guarantee portability of binary fastText models, std::basic_string<signed char> should be used throughout the library. This should match the current behavior on Linux and OS X (and possibly other platforms), but would change the behavior on Android systems, where char apparently often defaults to unsigned char. That is, anyone using a pretrained binary fastText model there has been getting wrong results.
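To make the discrepancy concrete, here is a minimal sketch that pins the signedness with an explicit cast inside the loop instead of relying on the platform's char. The function names hash_signed and hash_unsigned are hypothetical and are not part of fastText; the cast through int8_t is only one possible alternative to changing the string type throughout the library, and it assumes the usual two's-complement conversion for bytes above 0x7F.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: same loop as Dictionary::hash, but every byte is
// explicitly treated as signed, so bytes >= 0x80 are sign-extended on
// every platform (matching what a signed-char platform does today).
uint32_t hash_signed(const std::string& str) {
  uint32_t h = 2166136261u;
  for (std::size_t i = 0; i < str.size(); i++) {
    h = h ^ uint32_t(int8_t(str[i]));
    h = h * 16777619u;
  }
  return h;
}

// Hypothetical helper: every byte is explicitly treated as unsigned, so
// bytes >= 0x80 are zero-extended (what an unsigned-char platform computes).
uint32_t hash_unsigned(const std::string& str) {
  uint32_t h = 2166136261u;
  for (std::size_t i = 0; i < str.size(); i++) {
    h = h ^ uint32_t(uint8_t(str[i]));
    h = h * 16777619u;
  }
  return h;
}
```

For pure ASCII input such as "hello" the two functions agree, because every byte is below 0x80. For a string containing a byte such as 0xE9 they differ, which is why a model trained on one kind of platform can look up the wrong buckets on the other.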