Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Add Span<T> Binary Reader/Writer APIs #24400

Merged
merged 11 commits into from
Oct 12, 2017
Merged

Conversation

ahsonkhan
Copy link
Member

@ahsonkhan ahsonkhan commented Oct 3, 2017

From https:/dotnet/corefx/issues/24144 (part of https:/dotnet/corefx/issues/24174)

Edit: Addressed changes from the API review.

// System.Memory.dll
namespace System.Buffers.Binary {
    public static class BinaryPrimitives {
        public static T ReadMachineEndian<T>(ReadOnlySpan<byte> buffer) where T : struct;
        public static short ReadInt16BigEndian(ReadOnlySpan<byte> buffer);
        public static short ReadInt16LittleEndian(ReadOnlySpan<byte> buffer);
        public static int ReadInt32BigEndian(ReadOnlySpan<byte> buffer);
        public static int ReadInt32LittleEndian(ReadOnlySpan<byte> buffer);
        public static long ReadInt64BigEndian(ReadOnlySpan<byte> buffer);
        public static long ReadInt64LittleEndian(ReadOnlySpan<byte> buffer);
        public static ushort ReadUInt16BigEndian(ReadOnlySpan<byte> buffer);
        public static ushort ReadUInt16LittleEndian(ReadOnlySpan<byte> buffer);
        public static uint ReadUInt32BigEndian(ReadOnlySpan<byte> buffer);
        public static uint ReadUInt32LittleEndian(ReadOnlySpan<byte> buffer);
        public static ulong ReadUInt64BigEndian(ReadOnlySpan<byte> buffer);
        public static ulong ReadUInt64LittleEndian(ReadOnlySpan<byte> buffer);
        public static byte ReverseEndianness(byte value);
        public static short ReverseEndianness(short value);
        public static int ReverseEndianness(int value);
        public static long ReverseEndianness(long value);
        public static sbyte ReverseEndianness(sbyte value);
        public static ushort ReverseEndianness(ushort value);
        public static uint ReverseEndianness(uint value);
        public static ulong ReverseEndianness(ulong value);
        public static bool TryReadMachineEndian<T>(ReadOnlySpan<byte> buffer, out T value) where T : struct;
        public static bool TryReadInt16BigEndian(ReadOnlySpan<byte> buffer, out short value);
        public static bool TryReadInt16LittleEndian(ReadOnlySpan<byte> buffer, out short value);
        public static bool TryReadInt32BigEndian(ReadOnlySpan<byte> buffer, out int value);
        public static bool TryReadInt32LittleEndian(ReadOnlySpan<byte> buffer, out int value);
        public static bool TryReadInt64BigEndian(ReadOnlySpan<byte> buffer, out long value);
        public static bool TryReadInt64LittleEndian(ReadOnlySpan<byte> buffer, out long value);
        public static bool TryReadUInt16BigEndian(ReadOnlySpan<byte> buffer, out ushort value);
        public static bool TryReadUInt16LittleEndian(ReadOnlySpan<byte> buffer, out ushort value);
        public static bool TryReadUInt32BigEndian(ReadOnlySpan<byte> buffer, out uint value);
        public static bool TryReadUInt32LittleEndian(ReadOnlySpan<byte> buffer, out uint value);
        public static bool TryReadUInt64BigEndian(ReadOnlySpan<byte> buffer, out ulong value);
        public static bool TryReadUInt64LittleEndian(ReadOnlySpan<byte> buffer, out ulong value);
        public static bool TryWriteInt16BigEndian(Span<byte> buffer, short value);
        public static bool TryWriteInt16LittleEndian(Span<byte> buffer, short value);
        public static bool TryWriteInt32BigEndian(Span<byte> buffer, int value);
        public static bool TryWriteInt32LittleEndian(Span<byte> buffer, int value);
        public static bool TryWriteInt64BigEndian(Span<byte> buffer, long value);
        public static bool TryWriteInt64LittleEndian(Span<byte> buffer, long value);
        public static bool TryWriteUInt16BigEndian(Span<byte> buffer, ushort value);
        public static bool TryWriteUInt16LittleEndian(Span<byte> buffer, ushort value);
        public static bool TryWriteUInt32BigEndian(Span<byte> buffer, uint value);
        public static bool TryWriteUInt32LittleEndian(Span<byte> buffer, uint value);
        public static bool TryWriteUInt64BigEndian(Span<byte> buffer, ulong value);
        public static bool TryWriteUInt64LittleEndian(Span<byte> buffer, ulong value);
        public static void WriteMachineEndian<T>(Span<byte> buffer, ref T value) where T : struct;
        public static void WriteInt16BigEndian(Span<byte> buffer, short value);
        public static void WriteInt16LittleEndian(Span<byte> buffer, short value);
        public static void WriteInt32BigEndian(Span<byte> buffer, int value);
        public static void WriteInt32LittleEndian(Span<byte> buffer, int value);
        public static void WriteInt64BigEndian(Span<byte> buffer, long value);
        public static void WriteInt64LittleEndian(Span<byte> buffer, long value);
        public static void WriteUInt16BigEndian(Span<byte> buffer, ushort value);
        public static void WriteUInt16LittleEndian(Span<byte> buffer, ushort value);
        public static void WriteUInt32BigEndian(Span<byte> buffer, uint value);
        public static void WriteUInt32LittleEndian(Span<byte> buffer, uint value);
        public static void WriteUInt64BigEndian(Span<byte> buffer, ulong value);
        public static void WriteUInt64LittleEndian(Span<byte> buffer, ulong value);
        public static bool TryWriteMachineEndian<T>(Span<byte> buffer, ref T value) where T : struct;
    }
}

cc @KrzysztofCwalina, @stephentoub, @GrabYourPitchforks, @jkotas, @safern

Code Coverage when running on a Little Endian machine:
!image
image

@jkotas
Copy link
Member

jkotas commented Oct 3, 2017

FWIW, I still think all Span related stuff should be in a single assembly - I do not see a point in spreading it over multiple assemblies.

@mmitche
Copy link
Member

mmitche commented Oct 3, 2017

@dotnet-bot test this please (ignore, testing)

public static long Reverse(this long value) { throw null; }
public static ulong Reverse(this ulong value) { throw null; }

public static T Read<T>(this Span<byte> buffer) where T : struct { throw null; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Read<T> and Write<T> APIs should probably prevent callers from using non-blittable T, otherwise we risk reading and writing memory addresses to the underlying stream. (Additionally, are there cases where the field offsets can change between runtimes or architectures? I couldn't find any examples.)

Copy link
Member Author

@ahsonkhan ahsonkhan Oct 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding this, we are considering adding a Primitive attribute to constrain T to blittable types only (structs without reference types) which we will add to Read/Write.

cc @KrzysztofCwalina, @jaredpar

public static byte Reverse(this byte value) { throw null; }
public static short Reverse(this short value) { throw null; }
public static ushort Reverse(this ushort value) { throw null; }
public static int Reverse(this int value) { throw null; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember this coming up in earlier review - do we really want extension methods hanging off of the primitive integer types? That seems like it'll confuse people or pollute Intellisense.

@ahsonkhan ahsonkhan changed the title [WIP] Add Span<T> Binary Reader/Writer APIs Add Span<T> Binary Reader/Writer APIs Oct 3, 2017
public static long ReverseEndianness(long value) { throw null; }
public static ulong ReverseEndianness(ulong value) { throw null; }

public static T ReadCurrentEndianness<T>(ReadOnlySpan<byte> buffer) where T : struct { throw null; }
Copy link
Member

@KrzysztofCwalina KrzysztofCwalina Oct 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer MachineEndian. Current seems like something that can change. And Endianness is something we use in Reverse, not in Read.

@ahsonkhan
Copy link
Member Author

System\Buffers\Binary\ReaderBigEndian.cs(32,17): error CS0103: The name 'BitConverter' does not exist in the current context [D:\j\workspace\windows-TGrou---c60886e1\src\System.Buffers\src\System.Buffers.csproj]

How do I reference BitConverter for certain build configurations like UWP NetNative?

/// Reads a structure of type T out of a read-only span of bytes.
/// </summary>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static T ReadCurrentEndianness<T>(ReadOnlySpan<byte> buffer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we verify that T is a "primitive"/"blittable" and fail fast if it's not?

Copy link
Member

@jkotas jkotas Oct 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is a good idea to the approximate blittable check here - just like what we have done for Span: https:/dotnet/corefx/blob/master/src/System.Memory/src/System/Span.cs#L87
https:/dotnet/coreclr/blob/master/src/mscorlib/shared/System/Span.cs#L100

Since you are insisting on having this in separate .dll from Span, it means that there needs to be a second copy of the SpanHelpers machinery compiled into the portable System.Memory to perform this check...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I am not insisting on System.Buffers.dll. We discussed this at the API review and I think most people felt it's safer in System.Buffers.dll (because maybe we will not in-box the types there ever). If we are sure there is no difference from OOB perspective, I don't think anybody had any stron preferences between these two dlls.

Copy link
Member

@jkotas jkotas Oct 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

System.Buffers is inbox in .NET Core, so it has to play by the OOB+inbox rules already.

Actually, what happens when I am going to reference System.Buffers with these additions in my .NET Core 2.0 app? Is it going to work - what build of System.Buffers I am going to get?

@@ -30,13 +39,17 @@
</ItemGroup>
<ItemGroup Condition="'$(IsPartialFacadeAssembly)' == 'true'">
<ReferenceFromRuntime Include="System.Private.CoreLib" />
<ProjectReference Include="..\..\System.Runtime\src\System.Runtime.csproj" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use ReferenceFromRuntime?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean something like this?

    <ReferenceFromRuntime Include="System.Runtime" />
    <!-- <ProjectReference Include="..\..\System.Runtime\src\System.Runtime.csproj" /> -->

What would that do? I am already referencing System.Private.CoreLib.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind, if you are referencing implementation directly it has to be referenced to the Project directly and you've already referenced from runtime System.Private.CoreLib. That is correct, the only thing that is not correct is that all the ProjectReference that we add in a src project has to be to src projects not reference assembly.

So the reference below has to be a reference to the src project as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, got it. will fix that.

Copy link
Member

@GrabYourPitchforks GrabYourPitchforks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the "is blittable?" check is being addressed, and that was the only concern I had.

{
if (BinaryHelpers.IsReferenceOrContainsReferences<T>())
{
Environment.FailFast($"Cannot read a span into the non-blittable type <{typeof(T).Name}>.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should throw exception, not actually FailFast.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An exception can be handled, i.e. I am not sure we can change from an exception to language constraint.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed blitable constrain (dotnet/csharplang#206) is stronger than this check. You won't be able to start enforcing the blitable constraint without breaking something regardless.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let's just throw for now and push for blittable

public static T ReadMachineEndian<T>(ReadOnlySpan<byte> buffer)
where T : struct
{
if (BinaryHelpers.IsReferenceOrContainsReferences<T>())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should call RuntimeHelpers.IsReferenceOrContainsReferences<T>() when you are building against runtime that supports this API.

/// Determine if a type is eligible for storage in unmanaged memory.
/// Portable equivalent of RuntimeHelpers.IsReferenceOrContainsReferences&lt;T&gt;()
/// </summary>
public static bool IsReferenceOrContainsReferences<T>() => IsReferenceOrContainsReferencesCore(typeof(T));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should actually read as follows for best performance.

private static class IsReferenceImpl<T> {
  public static bool IsReferenceOrContainsReferences = IsReferenceOrContainsReferenceCore(typeof(T));
}

public static bool IsReferenceOrContainsReferences<T>() => IsReferenceImpl<T>.IsReferenceOrContainsReferences;

Moving the <T> onto a class rather than in a method allows the JITter to cache the value per T.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it should be done exactly like it is done for Span: https:/dotnet/corefx/blob/master/src/System.Memory/src/System/SpanHelpers.cs#L129

If we end up compiling in multiple copies, the implementation should be moved under Common.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we have the blittable constraint, this code will no longer be necessary here. Should we still move it to Common?

Copy link
Member

@jkotas jkotas Oct 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do not need to move it now that we have decided to add these APIs to System.Memory instead.

@joshfree joshfree added this to the 2.1.0 milestone Oct 4, 2017
@joshfree
Copy link
Member

joshfree commented Oct 4, 2017

https://ci3.dot.net/job/dotnet_corefx/job/master/job/windows-TGroup_all+CGroup_Debug+AGroup_x64+TestOuter_false_prtest/3834/console

19:10:26 System\Buffers\Binary\Reader.cs(103,17): error CS0103: The name 'Environment' does not exist in the current context [D:\j\workspace\windows-TGrou---0d2c9ac4\src\System.Buffers\src\System.Buffers.csproj]
19:10:26 System\Buffers\Binary\Reader.cs(122,17): error CS0103: The name 'Environment' does not exist in the current context [D:\j\workspace\windows-TGrou---0d2c9ac4\src\System.Buffers\src\System.Buffers.csproj]
19:10:26 System\Buffers\Binary\ReaderBigEndian.cs(18,17): error CS0103: The name 'BitConverter' does not exist in the current context [D:\j\workspace\windows-TGrou---0d2c9ac4\src\System.Buffers\src\System.Buffers.csproj]
19:10:26 System\Buffers\Binary\ReaderBigEndian.cs(32,17): error CS0103: The name 'BitConverter' does not exist in the current context [D:\j\workspace\windows-TGrou---0d2c9ac4\src\System.Buffers\src\System.Buffers.csproj]

@karelz
Copy link
Member

karelz commented Oct 11, 2017

What is the remaining part of the API review? (it is getting muddy :()

@karelz karelz added the blocked Issue/PR is blocked on something - see comments label Oct 11, 2017
@ahsonkhan
Copy link
Member Author

What is the remaining part of the API review? (it is getting muddy :()

There is nothing pending for review specific to these APIs. This PR takes into account the concerns from our previous review and I don't think we need to block it.

cc @KrzysztofCwalina, @terrajobst

@karelz
Copy link
Member

karelz commented Oct 12, 2017

@ahsonkhan in that case please work with @terrajobst offline to clarify that on the API review issue - it is not marked as approved yet as there was substantial feedback: https:/dotnet/corefx/issues/24144#issuecomment-334279677

Under normal circumstances, it means that feedback should be incorporated, then last (quick) API review should happen. If you think we can do second part offline over email, that's fine.

@joshfree
Copy link
Member

https:/dotnet/corefx/issues/24144 is marked as api-reviewed
/cc @weshaggard

@joshfree joshfree removed the blocked Issue/PR is blocked on something - see comments label Oct 12, 2017
@@ -7,6 +7,7 @@
<CLSCompliant>false</CLSCompliant>
<DocumentationFile>$(OutputPath)$(MSBuildProjectName).xml</DocumentationFile>
<IsPartialFacadeAssembly Condition="'$(TargetGroup)' == 'netcoreapp' OR '$(TargetGroup)' == 'uap'">true</IsPartialFacadeAssembly>
<DefineConstants Condition="'$(IsPartialFacadeAssembly)' == 'true'">$(DefineConstants);IsPartialFacade</DefineConstants>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsPartialFacade doesn't really seem to capture a real condition here. Can we instead switch the define to something that captures the thing you are trying to use? See https:/dotnet/corefx/blob/master/Documentation/coding-guidelines/project-guidelines.md#define-naming-convention for our guidelines.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to resolve this issue in a separate PR given this one is already green. I could keep it the way I had before, which is separate netcoreapp and uap constants and do the 'or' in code.

So, replace #if IsPartialFacade with #if netcoreapp || uap. Does that work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or give it a FEATURE_NAME name specific to what you are trying to use.

public static T ReadMachineEndian<T>(ReadOnlySpan<byte> buffer)
where T : struct
{
#if IsPartialFacade
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should always call SpanHelpers and have the #ifdef only in the one place which is in SpanHelpers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would still end up with the #ifdef for the ThrowHelper so it won't reduce the number of places we have to ifdef. I am not certain how to work around it (there is already an internal ThrowHelper class in mscorlib).

Copy link
Member

@weshaggard weshaggard Oct 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't the ThrowHelper call also be in the SpanHelper #ifdef?

<Reference Include="System.Runtime.InteropServices" />
<Reference Condition="'$(TargetGroup)' != 'netstandard1.1'" Include="System.Numerics.Vectors" />
<Reference Condition="'$(TargetGroup)' != 'netstandard1.1'" Include="System.Runtime.CompilerServices.Unsafe" />
<ProjectReference Condition="'$(TargetGroup)' == 'netstandard1.1'" Include="..\..\System.Runtime.CompilerServices.Unsafe\ref\System.Runtime.CompilerServices.Unsafe.csproj" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you have to change from ref to src here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also why do we need this to be a ProjectReference at all? A normal reference should work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From @safern, #24400 (comment)

That is correct, the only thing that is not correct is that all the ProjectReference that we add in a src project has to be to src projects not reference assembly.
So the reference below has to be a reference to the src project as well.

For netstandard, there is a normal reference.
But for netstandard1.1, we need a project reference since the Unsafe class was added in netstandard1.6.
https://apisof.net/catalog/System.Runtime.CompilerServices.Unsafe
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But for netstandard1.1, we need a project reference since the Unsafe class was added in netstandard1.6.

I'm not sure I understand this. If we are building this library for netstandard1.1 then it cannot depend on a netstanard1.6 library. You might be getting away with this because of the ProjectReference but I still believe this should just be a Reference.

@ahsonkhan ahsonkhan merged commit 563ce5f into dotnet:master Oct 12, 2017
@ahsonkhan ahsonkhan deleted the AddBinaryAPIs branch October 12, 2017 23:07
pjanotti pushed a commit to pjanotti/corefx that referenced this pull request Oct 31, 2017
* Adding skeleton for Binary APIs.

* Adding Binary APIs and tests.

* Adding performance tests.

* Fix reference and add perf test project to solution.

* Addressing feedback from API Review

* Renaming Read/Write and verifying T is blittable.

* Adding helper which validates that the type T is blittable.

* Move Binary APIs from Buffers.dll to Memory.dll.

* Fix Write APIs and use correct condition for is partial facade.

* Add reference to S.R.Extensions for BitConverter for netstandard1.1

* Adding test for reading ROSpan into a struct with references.
@GF-Huang
Copy link

GF-Huang commented May 8, 2021

The docs is not clear enough. What the mean of as little endian? Is it read the data into little endian uint16 or the source span is consider as little endian?

image

@danmoseley
Copy link
Member

Hi @GF-Huang this issue is long closed. Could you please open an issue here? https:/dotnet/dotnet-api-docs

@danmoseley
Copy link
Member

Meantime though I expect @GrabYourPitchforks can answer this.

@GrabYourPitchforks
Copy link
Member

@GF-Huang It means within the span, the least significant byte of the number's binary representation comes at the beginning.

So, if your span contains the two bytes [ 00 01 ] (in that order), then on all platforms ReadUInt16BigEndian will return 1, and on all platforms ReadUInt16LittleEndian will return 256.

@danmoseley
Copy link
Member

@GF-Huang perhaps you'd like to offer a clarification to this doc? Simply click the pencil button in the top right of any API doc page to edit it.

@GF-Huang
Copy link

Thanks for clarification.

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Adding skeleton for Binary APIs.

* Adding Binary APIs and tests.

* Adding performance tests.

* Fix reference and add perf test project to solution.

* Addressing feedback from API Review

* Renaming Read/Write and verifying T is blittable.

* Adding helper which validates that the type T is blittable.

* Move Binary APIs from Buffers.dll to Memory.dll.

* Fix Write APIs and use correct condition for is partial facade.

* Add reference to S.R.Extensions for BitConverter for netstandard1.1

* Adding test for reading ROSpan into a struct with references.


Commit migrated from dotnet/corefx@563ce5f
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.