Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add armv7l (32-bit arm) build #124

Closed
coletdjnz opened this issue Sep 17, 2023 · 40 comments
Closed

Add armv7l (32-bit arm) build #124

coletdjnz opened this issue Sep 17, 2023 · 40 comments
Assignees
Labels
Milestone

Comments

@coletdjnz
Copy link
Contributor

coletdjnz commented Sep 17, 2023

curl-impersonate has recently added builds for armv7l: lwthiker/curl-impersonate@9303ce3

Currently these are only present in the new 0.6.0 alpha release, but once that is stabilized it would be great if we can also get some armv7 builds running for curl_cffi too.

https:/lwthiker/curl-impersonate/releases/tag/v0.6.0-alpha.1

@perklet
Copy link
Collaborator

perklet commented Sep 17, 2023

Sure, let's wait for the official release.

@bjia56
Copy link
Contributor

bjia56 commented Oct 3, 2023

Fwiw, I have an older build of curl_cffi for armv7l here: https:/bjia56/armv7l-wheels

It uses the changes I contributed upstream to curl-impersonate, but builds libraries against an older glibc than the cross compiler, so should have greater Linux version compatibility.

@bjia56
Copy link
Contributor

bjia56 commented Oct 3, 2023

If there's interest for upstream curl_cffi to be able to build armv7l wheels and pull in the releases from upstream curl-impersonate, I highly recommend modifying setup.py to download the shared objects based on detected architecture as part of build, so the piwheels project can automatically pull it into their build system.

@perklet
Copy link
Collaborator

perklet commented Oct 3, 2023

I'm trying to incorporate upstream changes on this branch. Basically, I just changed the version numbers.

I haven't touched the armv7l parts yet, since it seems that CI build process needs to change for cross compiling. PRs are welcome if you would like to work on this.

@perklet
Copy link
Collaborator

perklet commented Nov 17, 2023

I'm tring to build my fork with newer browser versions support, but it failed for armv7l with zstd related errors. Do you happen to know about that? @bjia56

@bjia56
Copy link
Contributor

bjia56 commented Nov 17, 2023

Haven't seen that before, it looks like the build might require cross compiling zstd (a compression library) before linking. Perhaps it is a newer dependency to chrome?

@bjia56
Copy link
Contributor

bjia56 commented Nov 17, 2023

It looks like it also failed on arm64 with same linking error. Did your change enable zstd? I see it disabled in the upstream build. Could also be because you bumped the Ubuntu builder version, newer libraries exist on the system and it's trying to pull in zstd. Might need to change the curl-impersonate Makefile to specifically turn off this library.

@bjia56
Copy link
Contributor

bjia56 commented Nov 17, 2023

btw - I'm starting to test your build of curl-impersonate in my armv7l wheels repo. It'll be nice to get a newer curl build in, thanks for your effort on updating the build!

edit: The build for curl 8.1.1 succeeded on my armv7l docker container, which is running centos 7. So I suspect it's due to the newer Ubuntu image.

@perklet perklet added this to the v0.6 milestone Nov 22, 2023
@perklet perklet linked a pull request Dec 11, 2023 that will close this issue
2 tasks
@perklet perklet self-assigned this Dec 11, 2023
@perklet
Copy link
Collaborator

perklet commented Dec 31, 2023

Just checked replated projects and it seems that cibuildwheel does not support building arm32 wheels directly and there is not bdist channel on PyPI for that. I will try to build a wheel to the releases on GitHub later. Moving this out of #163

@perklet perklet removed a link to a pull request Dec 31, 2023
2 tasks
@bjia56
Copy link
Contributor

bjia56 commented Jan 10, 2024

Just checked replated projects and it seems that cibuildwheel does not support building arm32 wheels directly and there is not bdist channel on PyPI for that. I will try to build a wheel to the releases on GitHub later. Moving this out of #163

If setup.py downloads the libcurl binaries from gh (based on architecture) as part of the build (instead of a preprocess makefile step), then piwheels.org will be able to automatically produce armv7l wheels built on real 32 bit raspberry pi hardware.

@perklet
Copy link
Collaborator

perklet commented Jan 10, 2024

I see, at least we can make setup.py download the .so files when they are not downloaded already.

@perklet
Copy link
Collaborator

perklet commented Jan 22, 2024

@bjia56 We've merged download logic to setup.py, do you have a arm32 machine to verify if the sdist works in #192 ?

@T-256
Copy link
Contributor

T-256 commented Feb 17, 2024

Since sdist now working, we need just push a release to pypi then piwheels.org will then able to produce armv7l bdist wheel automatically.

@bjia56
Copy link
Contributor

bjia56 commented Feb 22, 2024

piwheels is still experiencing build errors for 0.6.0: https://www.piwheels.org/project/curl-cffi/

@bjia56
Copy link
Contributor

bjia56 commented Feb 22, 2024

Maybe this (and the armv7l below) should just be arm? https:/yifeikong/curl_cffi/blob/main/libs.json#L63

@perklet
Copy link
Collaborator

perklet commented Feb 23, 2024

Maybe this (and the armv7l below) should just be arm? https:/yifeikong/curl_cffi/blob/main/libs.json#L63

Updated in v0.6.1, does it work?

@bjia56
Copy link
Contributor

bjia56 commented Feb 23, 2024

Unfortunately not quite, it looks like the latest so download fails because arm-linux-gnu.tar.gz needs to be arm-linux-gnueabihf.tar.gz

@perklet
Copy link
Collaborator

perklet commented Feb 23, 2024

Unfortunately not quite, it looks like the latest so download fails because arm-linux-gnu.tar.gz needs to be arm-linux-gnueabihf.tar.gz

Which one should be changed then? Here or curl-impersonate? I'm not really an expert in embedded compiler names.

@bjia56
Copy link
Contributor

bjia56 commented Feb 23, 2024

It's probably more correct to change in curl-cffi to specify the platform triplet of arm-linux-gnueabihf in libs.json (for this and other platforms as well). arm32 is finicky since it's got a wide number of optional hardware features that vary from manufacturer to manufacturer, so the gnueabihf is needed to say that there needs to be hardware floating point support on the machine.

@bjia56
Copy link
Contributor

bjia56 commented Feb 23, 2024

I guess changing the sysname will fix it without adding another platform triplet property https:/yifeikong/curl_cffi/blob/main/libs.json#L70

@perklet
Copy link
Collaborator

perklet commented Feb 23, 2024

Do you mind opening a PR? I'm not sure if we have the .so for armv6l and its corresponding values.

@bjia56
Copy link
Contributor

bjia56 commented Feb 23, 2024

Sure. The arm builds are built against armv6l which will also work on armv7l, so both platforms will work fine with the same download.

@bjia56
Copy link
Contributor

bjia56 commented Feb 23, 2024

#255 opened. Trying to build it on my raspberry pi is also having other errors related to missing types such as struct curl_ws_frame, which I assume is due to newer curl headers than what's in the system apt repo. Is there any way curl.h can be bundled into the build process? Or has that been bundled into the sdist already?

@perklet
Copy link
Collaborator

perklet commented Feb 23, 2024

Yes, it's in the sdist tarball.

@bjia56
Copy link
Contributor

bjia56 commented Feb 27, 2024

For folks following this thread - there are still some pending items to work through for piwheels to properly build wheels that can just be downloaded and used directly. In the meantime, I will be maintaining prebuilt wheels (with so's bundled in) over on my armv7l index here: https://bjia56.github.io/armv7l-wheels/curl-cffi/

@perklet
Copy link
Collaborator

perklet commented Feb 27, 2024

gcc.txt

Here is the resutls from my RPi Zero 2W, running gcc -mcpu=native -march=native -Q --help=target

>>> import platform
>>> platform.uname()
uname_result(system='Linux', node='raspberrypi', release='6.1.21+', version='#1642 Mon Apr  3 17:19:14 BST 2023', machine='armv6l')

Which shows this:

>>> from curl_cffi import requests
>>> requests.get("https://www.zhihu.com")
Illegal instruction

@bjia56
Copy link
Contributor

bjia56 commented Feb 28, 2024

Do you also get illegal instruction when using the curl-impersonate-chrome executable?

@bjia56
Copy link
Contributor

bjia56 commented Feb 28, 2024

Can you try something like this to identify which instruction is involved: https://stackoverflow.com/a/40223712

@perklet
Copy link
Collaborator

perklet commented Feb 28, 2024

Do you also get illegal instruction when using the curl-impersonate-chrome executable?

Yes. But I found out that it works for http site.

@bjia56
Copy link
Contributor

bjia56 commented Feb 28, 2024

My guess is boringssl is trying to do some hardware-optimized assembly that isn't available for that particular raspberry pi model. I can test with my raspberry pi 1 (also armv6) later today or tomorrow

@perklet
Copy link
Collaborator

perklet commented Feb 28, 2024

Can you try something like this to identify which instruction is involved: https://stackoverflow.com/a/40223712

>0x1f3644    vld1.64     {d18-d19}, [r0 :128]  
 0x1f3648    mov r0, #2                        
 0x1f364c    vmov.i32    q8, #0  ; 0x00000000  
 0x1f3650    add r6, r4, #76     ; 0x4c        
 0x1f3654    strb        r0, [r4, #44]   ; 0x2c
 0x1f3658    mov r0, #0                        
 0x1f365c    str r0, [r4, #40]   ; 0x28        
 0x1f3660    mov r3, #102400     ; 0x19000     
 0x1f3664    ldr r2, [r1, #4]                  
 0x1f3668    ldr r1, [r1, #8]                  
 0x1f366c    str r2, [r4]                      

It seems that the last instruction is a neon instruction, which is not available on my RPi zero W. (I mistakenly said it's a Zero 2, but it's not)

yifei@raspberrypi:~/curl-impersonate-v0.6.1.arm-linux-gnueabihf $ cat /proc/cpuinfo 
processor	: 0
model name	: ARMv6-compatible processor rev 7 (v6l)
BogoMIPS	: 697.95
Features	: half thumb fastmult vfp edsp java tls 
CPU implementer	: 0x41
CPU architecture: 7
CPU variant	: 0x0
CPU part	: 0xb76
CPU revision	: 7

Hardware	: BCM2835
Revision	: 9000c1
Serial		: 000000001c892f54
Model		: Raspberry Pi Zero W Rev 1.1

Maybe armv6l is too old to be supported.

@bjia56
Copy link
Contributor

bjia56 commented Feb 28, 2024

Yeah, that matches my suspicion. There's a couple root causes I can think of:

  • Zig compiler is erroneously enabling NEON (should be fixable with CFLAGS)
  • BoringSSL is erroneously including NEON on all arm builds (a lot harder to fix)

I can take a look at the first item. If it turns out that the second one is the issue, then I might say it's too much hassle to support and we should just stick with armv7l and beyond.

Alternatively, we can try compiling with --mfloat-abi=soft and disable hardware acceleration altogether. If so, we might want to split out the arm builds into soft/hard and have armv6 pull down soft, armv7 pull down hard.

@bjia56
Copy link
Contributor

bjia56 commented Feb 28, 2024

Testing Zig with the following code (courtesy of ChatGPT) implies that NEON is not enabled by the compiler. I see BoringSSL is doing some runtime CPU capability magic, might be possible to disable that with CFLAGS.

#include <stdio.h>

// Function to detect NEON support
int detectNEON() {
    #if defined(__ARM_NEON) || defined(__ARM_NEON__)
        return 1;
    #else
        return 0;
    #endif
}

int main() {
    if (detectNEON()) {
        printf("NEON instructions are enabled by the compiler.\n");
    } else {
        printf("NEON instructions are not enabled by the compiler.\n");
    }
    return 0;
}

@Kartatz
Copy link

Kartatz commented Feb 28, 2024

Some time ago I got this same issue when compiling BoringSSL to arm-tizen*. BoringSSL is indeed enabling Neon for all arm builds by default.

At that time I fixed this by compiling BoringSSL with -DOPENSSL_NO_ASM=ON (CMake flag). Not sure if there is a better approach for this rather than disabling ASM entirely.

@perklet
Copy link
Collaborator

perklet commented Mar 1, 2024

@Kartatz I did the same for the Windows build, still havn't got time to check whether it's the best approach. I left an issue as a reminder.

@bjia56
Copy link
Contributor

bjia56 commented Mar 1, 2024

I've tried a bunch of stuff over on lexiforest/curl-impersonate#49 but unfortunately the compiler still spits out three vld1.64 instructions. To my understanding, the BoringSSL assembly parts have been disabled, yet NEON code is still generated.

One such case is in this function:
https:/google/boringssl/blob/d24a38200fef19150eef00cad35b138936c08767/ssl/extensions.cc#L3303
where the disassembly looks like:

001a0710 <_ZN4bssl31ssl_setup_extension_permutationEPNS_13SSL_HANDSHAKEE>:
  1a0710:       e92d48f0        push    {r4, r5, r6, r7, fp, lr}
  1a0714:       e24dd060        sub     sp, sp, #96     ; 0x60
  1a0718:       e1a04000        mov     r4, r0
  1a071c:       e5900004        ldr     r0, [r0, #4]
  1a0720:       e3a07001        mov     r7, #1
  1a0724:       e1d007b9        ldrh    r0, [r0, #121]  ; 0x79
  1a0728:       e3100b01        tst     r0, #1024       ; 0x400
  1a072c:       0a00002e        beq     1a07ec <_ZN4bssl31ssl_setup_extension_permutationEPNS_13SSL_HANDSHAKEE+0xdc>
  1a0730:       e1a0600d        mov     r6, sp
  1a0734:       e3a01060        mov     r1, #96 ; 0x60
  1a0738:       e1a00006        mov     r0, r6
  1a073c:       eb0121f4        bl      1e8f14 <RAND_bytes>
  1a0740:       e3a07000        mov     r7, #0
  1a0744:       e3500000        cmp     r0, #0
  1a0748:       0a000025        beq     1a07e4 <_ZN4bssl31ssl_setup_extension_permutationEPNS_13SSL_HANDSHAKEE+0xd4>
  1a074c:       e3a00000        mov     r0, #0
  1a0750:       eb02b8d6        bl      24eab0 <OPENSSL_free>
  1a0754:       e3a00019        mov     r0, #25
  1a0758:       eb02b8a8        bl      24ea00 <OPENSSL_malloc>
  1a075c:       e3500000        cmp     r0, #0
  1a0760:       0a00001f        beq     1a07e4 <_ZN4bssl31ssl_setup_extension_permutationEPNS_13SSL_HANDSHAKEE+0xd4>
  1a0764:       e1a05000        mov     r5, r0
  1a0768:       e28f0090        add     r0, pc, #144    ; 0x90
  1a076c:       f4600aef        vld1.64 {d16-d17}, [r0 :128]
  1a0770:       e3a00018        mov     r0, #24
  1a0774:       e3011110        movw    r1, #4368       ; 0x1110
  1a0778:       e286605c        add     r6, r6, #92     ; 0x5c
  1a077c:       e5c50018        strb    r0, [r5, #24]
  1a0780:       e3010514        movw    r0, #5396       ; 0x1514
  1a0784:       e3410716        movt    r0, #5910       ; 0x1716
  1a0788:       e3a07000        mov     r7, #0
  1a078c:       e5850014        str     r0, [r5, #20]
  1a0790:       e1a00005        mov     r0, r5
  1a0794:       f4400a0d        vst1.8  {d16-d17}, [r0]!
  1a0798:       e3411312        movt    r1, #4882       ; 0x1312
  1a079c:       e5801000        str     r1, [r0]
  1a07a0:       e7960107        ldr     r0, [r6, r7, lsl #2]
  1a07a4:       e2871019        add     r1, r7, #25
  1a07a8:       eb0438b7        bl      2aea8c <__aeabi_uidivmod>
  1a07ac:       e0852007        add     r2, r5, r7
  1a07b0:       e2477001        sub     r7, r7, #1
  1a07b4:       e7d50001        ldrb    r0, [r5, r1]
  1a07b8:       e3770018        cmn     r7, #24
  1a07bc:       e5d23018        ldrb    r3, [r2, #24]
  1a07c0:       e5c20018        strb    r0, [r2, #24]
  1a07c4:       e7c53001        strb    r3, [r5, r1]
  1a07c8:       1afffff4        bne     1a07a0 <_ZN4bssl31ssl_setup_extension_permutationEPNS_13SSL_HANDSHAKEE+0x90>
  1a07cc:       e5940220        ldr     r0, [r4, #544]  ; 0x220
  1a07d0:       eb02b8b6        bl      24eab0 <OPENSSL_free>
  1a07d4:       e3a00019        mov     r0, #25
  1a07d8:       e3a07001        mov     r7, #1
  1a07dc:       e5845220        str     r5, [r4, #544]  ; 0x220
  1a07e0:       e5840224        str     r0, [r4, #548]  ; 0x224
  1a07e4:       e3a00000        mov     r0, #0
  1a07e8:       eb02b8b0        bl      24eab0 <OPENSSL_free>
  1a07ec:       e1a00007        mov     r0, r7
  1a07f0:       e28dd060        add     sp, sp, #96     ; 0x60
  1a07f4:       e8bd88f0        pop     {r4, r5, r6, r7, fp, pc}
  1a07f8:       e320f000        nop     {0}
  1a07fc:       e320f000        nop     {0}
  1a0800:       03020100        .word   0x03020100
  1a0804:       07060504        .word   0x07060504
  1a0808:       0b0a0908        .word   0x0b0a0908
  1a080c:       0f0e0d0c        .word   0x0f0e0d0c

This looks to me like some loop is getting vectorized, since there's no clear crypto calls being made in this section. However, attempting to disable compiler vectorization with Clang's flags hasn't made a difference.

My hunch is that there's probably some bug in the Zig compiler that's still producing NEON instructions even with multiple attempts to disable it. I might be able to poke around at the compiler code sometime later to see if there's a root cause, but for now I suggest we drop armv6 until further notice.

@bjia56
Copy link
Contributor

bjia56 commented Mar 1, 2024

Looks like I spoke too soon. Propagating the -mcpu flag directly down to Zig seems to work, instead of going through CFLAGS. I was able to get a build that runs without the illegal instruction error on my Raspberry Pi 1. Perhaps something (maybe cmake?) in the layers of build toolchains messed things up.

@bjia56
Copy link
Contributor

bjia56 commented Mar 1, 2024

Also appears that keeping BoringSSL's assembly and CPU feature detection in the build works as well, so we should still be able to build one distribution for both armv6 and armv7, and dynamically use NEON when hardware has the support.

@T-256
Copy link
Contributor

T-256 commented Mar 3, 2024

I see wheels at https://www.piwheels.org/project/curl-cffi/ built successful on v0.6.2. Can you verify these wheels works as expected?

@bjia56
Copy link
Contributor

bjia56 commented Mar 3, 2024

The build on piwheels works for me.

@perklet perklet closed this as completed Mar 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants