Hyphens / dashes normalization? #11

Open
danielweck opened this issue May 21, 2020 · 2 comments
Labels: enhancement, help wanted

Comments

danielweck commented May 21, 2020

There are many different dash variants, all of which can be normalized to -:
https://www.compart.com/en/unicode/category/Pd
http://jkorpela.fi/dashes.html

However, to keep things simple:
https://en.wikipedia.org/wiki/Wikipedia:Hyphens_and_dashes

=>

['–', '-'],
['—', '-'],
['−', '-'],
['‒', '-'],
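
As a rough illustration of how pairs like these could be applied, here is a minimal sketch. The names `dashReplacements` and `applyReplacements` are hypothetical and not part of the library's API. The characters are written as escapes so they are easy to tell apart.

```javascript
// Hypothetical replacement pairs in [from, to] form.
const dashReplacements = [
	['\u2013', '-'], // en dash
	['\u2014', '-'], // em dash
	['\u2212', '-'], // minus sign
	['\u2012', '-'], // figure dash
];

// Hypothetical helper: applies every pair, in order, to the whole string.
function applyReplacements(string, replacements) {
	let result = string;
	for (const [from, to] of replacements) {
		result = result.split(from).join(to);
	}
	return result;
}

console.log(applyReplacements('2019\u20132020 \u2014 x \u2212 y', dashReplacements));
// => '2019-2020 - x - y'
```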

danielweck (Author) commented

Side note: there is already an "underscore" transliteration for a similar-looking Arabic character:

['ـ', '_'],

sindresorhus (Owner) commented May 26, 2020

We can do:

string.replace(/\p{Dash_Punctuation}/gu, '-');

to cover all the dashes.
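
One caveat worth checking before relying on this: of the four characters proposed above, U+2212 MINUS SIGN has the general category Sm (Math_Symbol), not Pd, so `\p{Dash_Punctuation}` alone would leave it untouched. A minimal sketch, assuming the minus sign should still be normalized (`normalizeDashes` is a hypothetical helper name):

```javascript
// Pd (Dash_Punctuation) matches the en dash, em dash, and figure dash.
const dashPunctuation = /\p{Dash_Punctuation}/gu;

// U+2212 MINUS SIGN is category Sm, so it is added to the class explicitly.
const dashOrMinus = /[\p{Dash_Punctuation}\u2212]/gu;

// Hypothetical helper name, for illustration only.
const normalizeDashes = string => string.replace(dashOrMinus, '-');

// en dash, em dash, figure dash, minus sign
const sample = 'a\u2013b\u2014c\u2012d\u2212e';

console.log(sample.replace(dashPunctuation, '-')); // the minus sign is left as-is
console.log(normalizeDashes(sample)); // => 'a-b-c-d-e'
```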


Full reference: https://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt

sindresorhus added the enhancement and help wanted labels on May 26, 2020