Incorrect parsing of large floating point number: one bit offset #48648

Ilia-Kosenkov · 2021-02-23T14:58:51Z

Description

Parsing large floating-point numbers sometimes produces a value that differs from the correctly rounded result by one bit.
In particular, "6250000000000000000000000000000000e-12" triggers this error.
The problem is illustrated in this sharplab.io sample.

Let @double = double.Parse("6250000000000000000000000000000000e-12", NumberStyles.Any).
The first indication of the parsing problem is that the default string representation of @double is "6.250000000000001E+21" (note the extra 1 at the end).
The (little-endian) byte representation (obtained with e.g. BitConverter.GetBytes) is
"F74AE1C7022D7544".
This representation is unfortunately incorrect.
The correctly rounded double-precision value for 6.25e21 has the byte representation
"F64AE1C7022D7544".
The difference (for this particular number) is one bit (F6 vs F7).
The sharplab sample mentioned above illustrates this problem.
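
The two byte strings are in memory order, which on a little-endian machine is the reverse of the usual most-significant-byte-first hex notation. Here is a minimal sketch of that relationship, assuming a little-endian host. `ByteView` and `ToByteString` are illustrative names, not part of the report or of the runtime.

```csharp
using System;

public static class ByteView
{
    // Hex string of the bytes in memory order, as BitConverter.GetBytes returns them.
    public static string ToByteString(double value) =>
        BitConverter.ToString(BitConverter.GetBytes(value)).Replace("-", "");

    public static void Main()
    {
        // Bit patterns written most-significant byte first.
        double expected = BitConverter.Int64BitsToDouble(0x44752D02C7E14AF6);
        double observed = BitConverter.Int64BitsToDouble(0x44752D02C7E14AF7);

        // On a little-endian host (BitConverter.IsLittleEndian == true):
        Console.WriteLine(ToByteString(expected)); // F64AE1C7022D7544
        Console.WriteLine(ToByteString(observed)); // F74AE1C7022D7544

        // The two patterns are adjacent doubles: they differ by exactly 1 in the last place.
        long ulps = BitConverter.DoubleToInt64Bits(observed) - BitConverter.DoubleToInt64Bits(expected);
        Console.WriteLine(ulps); // 1
    }
}
```

Because the sketch builds both values from their bit patterns instead of parsing text, its output does not depend on whether the runtime it runs on has the parsing bug.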

This issue was discovered when dotnet's parsing methods were run against a large set of floating-point tests provided in this repository:
https://github.com/nigeltao/parse-number-fxx-test-data

The test case discussed in this issue is found in data/ibm-fpgen.txt, line 64724:

7C00 63A96816 44752D02C7E14AF6 6250000000000000000000000000000000e-12
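
Each line of the test data holds the expected bit patterns for the 16-bit, 32-bit, and 64-bit formats in hex, followed by the input text. The following sketch checks one such line against `double.Parse`. `TestLine`, `Split`, and `DoubleMatches` are hypothetical helper names, and the use of `NumberStyles.Float` with the invariant culture is an assumption, since the report itself uses `NumberStyles.Any`.

```csharp
using System;
using System.Globalization;

public static class TestLine
{
    // Splits one test-data line into its three expected bit patterns and the input text.
    public static (ushort F16, uint F32, ulong F64, string Input) Split(string line)
    {
        string[] parts = line.Split(' ', 4);
        return (Convert.ToUInt16(parts[0], 16),
                Convert.ToUInt32(parts[1], 16),
                Convert.ToUInt64(parts[2], 16),
                parts[3]);
    }

    // True when double.Parse produces exactly the expected 64-bit pattern.
    public static bool DoubleMatches(string line)
    {
        var (_, _, expected, input) = Split(line);
        double parsed = double.Parse(input, NumberStyles.Float, CultureInfo.InvariantCulture);
        return (ulong)BitConverter.DoubleToInt64Bits(parsed) == expected;
    }

    public static void Main()
    {
        const string line = "7C00 63A96816 44752D02C7E14AF6 6250000000000000000000000000000000e-12";
        // Prints "mismatch" on a runtime affected by this issue, "ok" otherwise.
        Console.WriteLine(DoubleMatches(line) ? "ok" : "mismatch");
    }
}
```

Comparing bit patterns instead of formatted strings keeps the check independent of how the runtime prints doubles.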

There may be other (similar) issues, which I have not found yet.

Configuration

So far tested on Windows 10 x64 20H2 19042.804

dotnet @ Windows:
- 5.0.103
- 6.0.100-preview.1.21103.13
dotnet @ WSL2:
- 5.0.103

Whatever backend sharplab.io uses also reproduces this error.

Regression?

Did not test earlier versions of dotnet.

Other information

This report would not be possible without the https://github.com/nigeltao/parse-number-fxx-test-data project and its contributors.

I verified this particular parsing case against the Rust implementation, see this rust demo.

Maybe related to #48119
Partially inspired by discussion around #48646.

dotnet-issue-labeler · 2021-02-23T14:58:54Z

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

Ilia-Kosenkov · 2021-02-23T15:31:22Z

This issue may arise here

runtime/src/libraries/System.Private.CoreLib/src/System/Number.NumberToFloatingPointBits.cs

Line 217 in 4396f09

mantissa = RightShiftWithRounding(mantissa, -normalMantissaShift, hasZeroTail);

And here

runtime/src/libraries/System.Private.CoreLib/src/System/Number.NumberToFloatingPointBits.cs

Lines 737 to 755 in 4396f09

    private static ulong RightShiftWithRounding(ulong value, int shift, bool hasZeroTail)
    {
        // If we'd need to shift further than it is possible to shift, the answer
        // is always zero:
        if (shift >= 64)
        {
            return 0;
        }

        ulong extraBitsMask = (1UL << (shift - 1)) - 1;
        ulong roundBitMask = (1UL << (shift - 1));
        ulong lsbBitMask = 1UL << shift;

        bool lsbBit = (value & lsbBitMask) != 0;
        bool roundBit = (value & roundBitMask) != 0;
        bool hasTailBits = !hasZeroTail || (value & extraBitsMask) != 0;

        return (value >> shift) + (ShouldRoundUp(lsbBit, roundBit, hasTailBits) ? 1UL : 0);
    }

There should be no rounding for this number (if I understand this correctly).
I am unable to trace what causes ShouldRoundUp to return true instead of false.
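
The rounding decision can be explored in isolation. Note that 6.25e21 = 5^23 * 2^19, and 5^23 = 11920928955078125 needs 54 bits, one more than the 53-bit significand of a double, so the value sits exactly halfway between two doubles. The sketch below copies the quoted method and assumes that `ShouldRoundUp` implements round-half-to-even, since its body is not shown in this thread. The inputs (5^23 shifted by one bit) are chosen for illustration and are not necessarily the intermediate values the runtime passes. `RoundingSketch` and `ToBits` are hypothetical names.

```csharp
using System;

public static class RoundingSketch
{
    // Assumed round-half-to-even rule: round up when above the midpoint,
    // or exactly at the midpoint with an odd kept value.
    public static bool ShouldRoundUp(bool lsbBit, bool roundBit, bool hasTailBits) =>
        roundBit && (hasTailBits || lsbBit);

    // Same logic as the method quoted above.
    public static ulong RightShiftWithRounding(ulong value, int shift, bool hasZeroTail)
    {
        if (shift >= 64)
        {
            return 0;
        }
        ulong extraBitsMask = (1UL << (shift - 1)) - 1;
        ulong roundBitMask = 1UL << (shift - 1);
        ulong lsbBitMask = 1UL << shift;
        bool lsbBit = (value & lsbBitMask) != 0;
        bool roundBit = (value & roundBitMask) != 0;
        bool hasTailBits = !hasZeroTail || (value & extraBitsMask) != 0;
        return (value >> shift) + (ShouldRoundUp(lsbBit, roundBit, hasTailBits) ? 1UL : 0);
    }

    // Packs a 53-bit significand and an unbiased exponent into binary64 bits (positive values only).
    public static ulong ToBits(ulong significand53, int unbiasedExponent) =>
        ((ulong)(unbiasedExponent + 1023) << 52) | (significand53 & ((1UL << 52) - 1));

    public static void Main()
    {
        ulong fivePow23 = 11920928955078125UL; // 54 bits wide, lowest bit set

        ulong tie = RightShiftWithRounding(fivePow23, 1, hasZeroTail: true);
        ulong notTie = RightShiftWithRounding(fivePow23, 1, hasZeroTail: false);

        // 6.25e21 = (5^23 / 2) * 2^20, so the leading bit has weight 2^72.
        Console.WriteLine(ToBits(tie, 72).ToString("X16"));    // 44752D02C7E14AF6
        Console.WriteLine(ToBits(notTie, 72).ToString("X16")); // 44752D02C7E14AF7
    }
}
```

Under these assumptions, the expected result (F6) comes out when the tail is reported as zero, and the observed result (F7) comes out when `hasZeroTail` is false. That suggests the flag, not the shift itself, is the input worth tracing.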

ghost · 2021-02-23T16:13:17Z

Tagging subscribers to this area: @tannergooding, @pgovind
See info in area-owners.md if you want to be subscribed.

Author:	Ilia-Kosenkov
Assignees:	-
Labels:	`area-System.Numerics`, `untriaged`
Milestone:	-

tannergooding · 2021-02-23T16:43:18Z

@pgovind, can you check if this is handled by your trailing zero fix?

Ilia-Kosenkov · 2021-02-23T17:24:01Z

Just to clarify: after running the parsing tests on all of the test datasets (in this folder, 5 268 191 strings), this is the only value that fails, so the problem is probably quite rare.

tannergooding · 2021-02-23T17:41:37Z

Thanks! We had also run another large suite from ES6 covering 100m inputs and hadn't found any failures there either, so I expect you're right.

The Roslyn implementation doesn't have this bug, so it was likely introduced by one of the refactorings or perf improvements on the libraries side.
Given the large number of trailing zeros with a negative exponent here, I'm hoping that it's covered by the fix that @pgovind recently made.

tannergooding · 2021-02-23T17:42:43Z

We should make sure to add a regression test for this before closing the issue in either case.

pgovind · 2021-02-23T19:16:33Z

Hmm, weird, this is not fixed by https://github.com/dotnet/runtime/pull/47666/files. Investigating locally.

pgovind · 2021-02-23T20:06:11Z

So, this is similar to #46827. #47666 does not fix this because #47666 only works when we encounter a decimal point in the input. Since there is no decimal point in the input here, we never calculate the number of trailing zeros.

The fix here is to carefully update the trailing zero calculation to take such cases into account.
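
To illustrate the idea only: trailing zero digits carry the same information whether or not the significand contains a decimal point. The helper below is a hypothetical sketch and is not the change made in the runtime. `TrailingZeros` and `CountTrailingZeroDigits` are invented names, and the input is assumed to be a bare significand with no sign or exponent.

```csharp
using System;

public static class TrailingZeros
{
    // Counts trailing '0' digits of a significand, treating "625000" and "625.000" alike.
    public static int CountTrailingZeroDigits(string significand)
    {
        int count = 0;
        for (int i = significand.Length - 1; i >= 0; i--)
        {
            char c = significand[i];
            if (c == '.')
            {
                continue; // a decimal point does not end the run of zeros
            }
            if (c != '0')
            {
                break;
            }
            count++;
        }
        return count;
    }

    public static void Main()
    {
        string noPoint = "625" + new string('0', 31); // same shape as the failing input's digits
        Console.WriteLine(CountTrailingZeroDigits(noPoint));   // 31
        Console.WriteLine(CountTrailingZeroDigits("625.000")); // 3
        Console.WriteLine(CountTrailingZeroDigits("6.25"));    // 0
    }
}
```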

…umbers (#48857)
* Track trailing zeros only for floating point numbers
* Undo previous unit test change
* Add a roundtrip unit test
* Move check outside the loop
* Globalization
* Also fix #48648 and unit tests
* Assert and unit test
Co-authored-by: Prashanth Govindarajan

dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Feb 23, 2021

jkotas added the area-System.Numerics label Feb 23, 2021

pgovind self-assigned this Feb 23, 2021

pgovind removed the untriaged New issue has not been triaged by the area owner label Feb 23, 2021

pgovind pushed a commit to pgovind/runtime that referenced this issue Feb 24, 2021

Also fix dotnet#48648 and unit tests

9897a27

Ilia-Kosenkov added a commit to Ilia-Kosenkov/Backports that referenced this issue Feb 24, 2021

HotFixing dotnet/runtime#48648

336716e

Ilia-Kosenkov added a commit to Ilia-Kosenkov/Backports that referenced this issue Feb 24, 2021

HotFixing dotnet/runtime#48648 (#4)

b5a2b6f

pgovind mentioned this issue Feb 24, 2021

Track trailing zeros only for floating point numbers #48608

Merged

ghost closed this as completed in 5b04977 Feb 26, 2021

github-actions bot pushed a commit that referenced this issue Feb 27, 2021

Also fix #48648 and unit tests

975a63a

pgovind mentioned this issue Feb 27, 2021

[release/6.0-preview2] Track trailing zeros only for floating point numbers #48857

Merged

ghost locked as resolved and limited conversation to collaborators Mar 29, 2021

This issue was closed.
