Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 biome lint panics when encountering multi-byte characters #4181

Closed
1 task done
TatuUlmanen opened this issue Oct 5, 2024 · 4 comments · Fixed by #4186
Closed
1 task done

🐛 biome lint panics when encountering multi-byte characters #4181

TatuUlmanen opened this issue Oct 5, 2024 · 4 comments · Fixed by #4186
Assignees
Labels
L-JavaScript Language: JavaScript and super languages S-Bug-confirmed Status: report has been confirmed as a valid bug

Comments

@TatuUlmanen
Copy link

TatuUlmanen commented Oct 5, 2024

Environment information

CLI:
  Version:                      1.9.3
  Color support:                true

Platform:
  CPU Architecture:             x86_64
  OS:                           linux

Environment:
  BIOME_LOG_PATH:               unset
  BIOME_LOG_PREFIX_NAME:        unset
  BIOME_CONFIG_PATH:            unset
  NO_COLOR:                     unset
  TERM:                         "xterm-256color"
  JS_RUNTIME_VERSION:           "v18.17.1"
  JS_RUNTIME_NAME:              "node"
  NODE_PACKAGE_MANAGER:         "bun/1.1.29"

Biome Configuration:
  Status:                       Loaded successfully
  Formatter disabled:           false
  Linter disabled:              false
  Organize imports disabled:    false
  VCS disabled:                 true

Workspace:
  Open Documents:               0

What happened?

When running biome lint on a JavaScript file containing (only) the expression a + '€', Biome encounters an internal panic. The error message indicates an issue with byte indexing for the Euro symbol (€), which uses more than one byte. Here is the exact error:

Biome encountered an unexpected error
Source Location: crates/biome_js_factory/src/utils.rs:34:36
Thread Name: biome::worker_1
Message: byte index 1 is not a char boundary; it is inside '€' (bytes 0..3) of `€`

The original code that triggered the error was return v.toFixed(2).replace('.', ',') + ' €'; which I was able to trim down to just a + '€' in a separate file while still producing the same error.

If I change the original code to use template literals instead, no panic happens:

`${v.toFixed(2).replace('.', ',')} €`

Expected result

Biome should correctly parse and lint the file without crashing, even when handling non-ASCII characters like the Euro symbol.

Code of Conduct

  • I agree to follow Biome's Code of Conduct
@dyc3
Copy link
Contributor

dyc3 commented Oct 5, 2024

Can you provide a playground link or a reproduction repo?

@dyc3 dyc3 added the S-Needs repro Status: needs a reproduction label Oct 5, 2024
@TatuUlmanen
Copy link
Author

TatuUlmanen commented Oct 5, 2024

@dyc3 This is easily reproducible by running the following in an empty directory:

bun init
bun add --dev --exact @biomejs/biome
echo "a + '€'" > index.ts
bunx biome lint index.ts
Biome encountered an unexpected error

This is a bug in Biome, not an error in your code, and we would appreciate it if you could report it to https:/biomejs/biome/issues/ along with the following information to help us fixing the issue:

Source Location: crates/biome_js_factory/src/utils.rs:34:36
Thread Name: biome::worker_0
Message: byte index 1 is not a char boundary; it is inside '€' (bytes 0..3) of `€`

index.ts internalError/panic  INTERNAL  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  ✖ processing panicked: byte index 1 is not a char boundary; it is inside '€' (bytes 0..3) of `€`

  ⚠ This diagnostic was derived from an internal Biome error. Potential bug, please report it if necessary.


Checked 0 files in 203ms. No fixes applied.
internalError/io ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  ✖ No files were processed in the specified paths.

The same happens when ran via npm.

Running in WSL2 on Ubuntu 22.04.4. My rustc version is 1.81.0 if that matters.

@TatuUlmanen
Copy link
Author

It appears the issue was introduced in a recent commit where the string iteration was changed from using char_indices() to bytes().enumerate(). The change to processing strings as raw bytes caused the error, as bytes().enumerate() does not respect UTF-8 character boundaries. This leads to panics when the code attempts to slice strings containing multi-byte characters, like , at invalid byte offsets.

To handle multi-byte UTF-8 characters safely, the file should be reverted to use char_indices() instead of bytes().enumerate(). This ensures that the string is processed as valid UTF-8 characters, preventing the panic from occurring.

The relevant code change would be:

let mut iter = unescaped_string.char_indices();

Instead of:

let mut iter = unescaped_string.bytes().enumerate();

@TatuUlmanen TatuUlmanen changed the title 🐛 biome lint panics when encountering euro symbol 🐛 biome lint panics when encountering multi-byte characters Oct 5, 2024
@dyc3 dyc3 added L-JavaScript Language: JavaScript and super languages S-Bug-confirmed Status: report has been confirmed as a valid bug and removed S-Needs repro Status: needs a reproduction labels Oct 5, 2024
@Conaclos Conaclos self-assigned this Oct 6, 2024
@Conaclos
Copy link
Member

Conaclos commented Oct 6, 2024

@TatuUlmanen

As a temporary workaround you can disable the fix of the linter rule style/useTemplate:

{
  "linter": {
    "rules": {
      "style": {
        "useTemplate": {
          "level": "warn",
          "fix": "off"
        }
      }
    }
  }
}

or you can completely disable the rule:

{
  "linter": {
    "rules": {
      "style": {
        "useTemplate": "off"
      }
    }
  }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
L-JavaScript Language: JavaScript and super languages S-Bug-confirmed Status: report has been confirmed as a valid bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants