Not all words are converted #78

Open
turgic opened this issue Mar 15, 2019 · 2 comments

Comments

turgic commented Mar 15, 2019

I have two files.
The first file has one row containing: magn�sienne
The second file has one row containing: magn�sienne
The result for one file is:
magnésienne
For the other:
magn?sienne
Implementation:
$return[] = $this->utf8Encoding($row);

private function utf8Encoding($datas)
{
    foreach ($datas as $key => $data) {
        $datas[$key] = Encoding::fixUTF8($data);
    }
    return $datas;
}
Do you have any idea why? Thanks in advance.

millenniumtree commented May 24, 2019

Here's an explanation of that special character:

U+FFFD � REPLACEMENT CHARACTER used to replace an unknown, unrecognized or unrepresentable character

So if your data source has already replaced the original character with that dummy character, there may be no way to get it back.
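One way to check whether a row has already lost its original character is to look for the UTF-8 byte sequence of U+FFFD (EF BF BD). This is only a minimal sketch: the helper name `containsReplacementChar` and the sample strings are hypothetical and not part of the library.

```php
<?php
// Hypothetical helper: true if the string already contains U+FFFD,
// meaning the original byte was replaced before the data reached us.
function containsReplacementChar(string $text): bool
{
    // U+FFFD is encoded in UTF-8 as the three bytes EF BF BD.
    return strpos($text, "\xEF\xBF\xBD") !== false;
}

// Sample data, assumed for illustration only.
$lossy  = "magn\xEF\xBF\xBDsienne"; // literal U+FFFD: the original byte is gone
$latin1 = "magn\xE9sienne";         // ISO-8859-1 "é": the original byte is still there

var_dump(containsReplacementChar($lossy));  // bool(true)
var_dump(containsReplacementChar($latin1)); // bool(false)
```

If this returns true for a row, no encoding fixer can restore the "é"; the file would have to be re-exported from the original source.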

@garrettw

Although those two lines may look the same in your text editor, the underlying bytes may differ. The second could contain a literal U+FFFD replacement character (bytes EF BF BD in UTF-8), while the first could contain some other byte sequence that your editor does not recognize, such as a single-byte ISO-8859-1 "é" (0xE9), and therefore displays as the replacement character.
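To see what each file really contains, it may help to dump the raw bytes instead of trusting the editor. The sketch below assumes the mbstring extension is available; `describeBytes` and the two sample rows are hypothetical stand-ins for the rows read from your files.

```php
<?php
// Hypothetical helper: report the raw bytes of a string and
// whether those bytes form valid UTF-8.
function describeBytes(string $text): array
{
    return [
        'hex'        => bin2hex($text),
        'valid_utf8' => mb_check_encoding($text, 'UTF-8'),
    ];
}

// Assumed contents, for illustration only.
$rowA = "magn\xE9sienne";         // file saved as ISO-8859-1
$rowB = "magn\xEF\xBF\xBDsienne"; // "é" already replaced by U+FFFD

print_r(describeBytes($rowA)); // hex contains "e9", valid_utf8 is false
print_r(describeBytes($rowB)); // hex contains "efbfbd", valid_utf8 is true
```

A row that is invalid UTF-8 with a lone high byte such as `e9` can usually still be converted. A row that is valid UTF-8 and contains `efbfbd` has already lost the character.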
