[BUG] Line number out by one for syntax errors in script with shebang #435

hdwalters · 2024-08-31T19:32:44Z

If the parser detects a syntax error in the input file, it highlights the offending token:

hwalters@Wintermute ~/git/amber (master) 
$ cat stdtest.ab 
import { len } from "std/text"
let string = "SHOUTY text"
echo "Original: {string}"
echo "Length: {len(string)}"
echo "Upper: {upper(string)}"
echo "Lower: {lower(string)}"

hwalters@Wintermute ~/git/amber (master) 
$ amber stdtest.ab 
 ERROR  Function 'upper' does not exist
at stdtest.ab:5:15

4| echo "Length: {len(string)}"
5| echo "Upper: {upper(string)}" <========= "upper" is highlighted
6| echo "Lower: {lower(string)}"

However, if the input file has a shebang line, the line number is out by one:

hwalters@Wintermute ~/git/amber (master) 
$ cat stdtest.ab 
#!/usr/bin/env amber <===================== Added shebang line
import { len } from "std/text"
let string = "SHOUTY text"
echo "Original: {string}"
echo "Length: {len(string)}"
echo "Upper: {upper(string)}"
echo "Lower: {lower(string)}"

hwalters@Wintermute ~/git/amber (master) 
$ ./stdtest.ab 
 ERROR  Function 'upper' does not exist
at ./stdtest.ab:5:15 <===================== Should be "6:15"

4| echo "Original: {string}"
5| echo "Length: {len(string)}" <========== "{len(" is highlighted
6| echo "Upper: {upper(string)}"

It looks like the recent change to support shebang lines (see #286) failed to take account of this. The problem appears to be that function AmberCompiler::strip_off_shebang() removes the shebang line from the input file before passing the file contents to the tokenizer, but Logger::snippet() in the heraclitus compiler rereads the input file to generate the code snippet.

The input file is tokenized (setting the line and column number on each token) inside the heraclitus compiler, so by the time the tokenizer returns to AmberCompiler::tokenize(), it's too late to fix up the line numbers. I therefore believe the only way to fix this is to comment out the shebang line in the data passed to the tokenizer, instead of removing it:

// #!/usr/bin/env amber
import { len } from "std/text"
...

The only downside is that if syntax errors are ever detected in the shebang itself (which I do not believe will happen, given that we're ignoring it anyway) they will be off by three characters. Any objections before I create a PR?

The text was updated successfully, but these errors were encountered:

hdwalters added the bug Something isn't working label Aug 31, 2024

hdwalters mentioned this issue Sep 2, 2024

Comment shebang line before passing to tokenizer #437

Merged

Mte90 closed this as completed in #437 Sep 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Line number out by one for syntax errors in script with shebang #435

[BUG] Line number out by one for syntax errors in script with shebang #435

hdwalters commented Aug 31, 2024

[BUG] Line number out by one for syntax errors in script with shebang #435

[BUG] Line number out by one for syntax errors in script with shebang #435

Comments

hdwalters commented Aug 31, 2024