now: node-test-binary-arm doesn't finish #774

Closed
refack opened this issue Jun 27, 2017 · 5 comments

refack (Contributor) commented Jun 27, 2017

Ref: nodejs/node#13857 (comment)
Ref: nodejs/node#13940 (comment)
Ref: https://ci.nodejs.org/job/node-test-binary-arm/8899/

rvagg (Member) commented Jun 27, 2017

Been discussing this in #node-build fwiw

I pulled out the SSD that the Pi's use via NFS and this seems to have had a much bigger hit on their perf than I expected. It's possible it'll sort itself out if the machines are left alone, but probably not.

I've been rearranging the network over here, replacing some key components with some (exciting and fun) new networking gear (my own gear, not specifically for the ARM cluster, but it should improve things for the cluster a great deal). The SSD is going back, but in a new host, so pulling it was meant to be just one step in a larger reconfiguration. I might pull arm-fanned out of test-pull-request for now until this is done; it might be another 24 hours until it's complete, but I'll keep folks updated here.

rvagg (Member) commented Jun 27, 2017

External access to the ARM cluster is disabled for the moment, but the machines on that network are all back online; the Pi's are still not activated in test-commit, though. I have a new gateway and parent switch in place that lets me do some fancier config, and it should mean faster throughput internally too.

refack (Contributor, Author) commented Jun 27, 2017

I wanted to ask whether it's worth taking node-test-commit-arm-fanned out of node-test-commit, but I see that's already been done 👍

jinx

rvagg (Member) commented Jun 29, 2017

Okie dokie! We have progress. I've moved to a new gateway/router and parent switch, along with a dedicated NAS that now holds the SSD we were using for the Pi's. And it's all much faster and more profesh all round. We're up to NFSv4; I don't know whether that makes much difference, but there is chatter suggesting it should be faster.

I've put node-test-commit-arm-fanned back into node-test-commit, so it's back in play.

I've also disconnected the switch that the Pi 3's were all attached to; it was the most ancient piece of hardware in the stack (an old Dell thing that I've had in my pile for many years). I've updated and reconnected all of the Pi 3's; there are enough ports in the other two switches, so they now share a switch with the Pi 2's. They are back in node-test-commit-arm-fanned and appear to be working exactly as planned, without the git / slowdown problems we were seeing before.

SSH access back into the cluster (including the macOS machine) is active again. Folks with access to nodejs_build_test should be able to get back in if you have the config set up right for the jump host and all of the internal hostnames. One change: test-requireio_arm-ubuntu1404-arm64_xgene-1 is now on 192.168.2.4 rather than 192.168.2.1; would someone mind updating that in the ansible config if it's there?

It's not quite as simple as turning it all back on though (of course!); we have a couple of things to clean up, both of which can be seen in the console output of test runs (e.g. https://ci.nodejs.org/job/node-test-binary-arm/8907/RUN_SUBSET=5,label=pi3-raspbian-jessie/console):

  • parallel/test-fs-readdir-ucs2 is failing on all of the Pi's: Error: EINVAL: invalid argument, open '/home/iojs/build/workspace/node-test-binary-arm/test/tmp.0/=���'. This is likely due to the test directory being on an NFS share exported from a ZFS volume (see the probe sketch just after this list). I'm going to have to defer to @nodejs/testing to figure out how to handle that special case. Let me know if there's something I should be doing for that mount to enable UCS2 (no promises that it's achievable on my end, though).
  • Because of the failure, the error message is getting into the tap output file and it's not parsable. org.tap4j.parser.ParserException: Error parsing TAP Stream: Error parsing YAML coming from Caused by: unacceptable character '�' (0x4) special characters are not allowed. This will obviously go away when we fix up test-fs-readdir-ucs2 but suggests that we might need to do something to protect against this in future.
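
As a quick probe of that theory (a sketch only, assuming the EINVAL comes from the mount rejecting filenames that are not valid UTF-8, e.g. a ZFS dataset with utf8only=on behind the NFS export; the script and its names are illustrative and not part of the test suite), something like this could be run on a Pi against the NFS-mounted test directory:

```python
# Hypothetical probe (not part of the Node.js test suite): check whether a
# directory accepts filenames that are not valid UTF-8, which is roughly what
# parallel/test-fs-readdir-ucs2 needs when it creates its UCS-2-named file.
import os
import sys
import tempfile

def accepts_non_utf8_names(directory):
    # b'\xff\xfe' can never occur in valid UTF-8, so a mount that enforces
    # UTF-8-only names (e.g. ZFS utf8only=on) should reject this with EINVAL.
    name = b'ucs2-probe-\xff\xfe\x01'
    path = os.path.join(os.fsencode(directory), name)
    try:
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    except OSError as err:
        print('rejected: %s' % err)
        return False
    os.close(fd)
    os.unlink(path)
    return True

if __name__ == '__main__':
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    print('%s accepts non-UTF-8 names: %s' % (target, accepts_non_utf8_names(target)))
```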

In addition, these machines are out of action; I'm going to turn them off completely until I get back from vacation. I know at least one of them has a corrupt SD card that I need to replace: test-requireio_bengl-debian7-arm_pi1p-1, test-requireio_joeyvandijk-debian7-arm_pi2-1, test-requireio_ceejbot-debian7-arm_pi2-1 (bad card), test-requireio_kahwee-debian8-arm_pi3-1.

refack (Contributor, Author) commented Jun 29, 2017

> Because of the failure, the error message is getting into the tap output file and it's not parsable. org.tap4j.parser.ParserException: Error parsing TAP Stream: Error parsing YAML coming from Caused by: unacceptable character '�' (0x4) special characters are not allowed. This will obviously go away when we fix up test-fs-readdir-ucs2 but suggests that we might need to do something to protect against this in future.

+1
Do you think a fix in core's tools/test.py is needed? Something encoding-related, like nodejs/node#12786 or nodejs/node-gyp#1203?
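
Whatever the right place for it turns out to be, one rough idea (a sketch only, not actual tools/test.py code; the function name and regex are illustrative) is to escape control characters in captured test output before it is embedded in the TAP stream, so a stray 0x04 can't break the YAML that tap4j parses:

```python
# Sketch of a sanitizer that could sit in front of the TAP writer: escape C0
# control characters (other than tab, LF and CR) and DEL so that bytes like
# 0x04 never end up inside the YAML diagnostics that tap4j has to parse.
import re

_CONTROL = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]')

def sanitize_tap_text(text):
    # Replace each disallowed character with a visible \xNN escape.
    return _CONTROL.sub(lambda m: '\\x%02x' % ord(m.group(0)), text)

print(sanitize_tap_text('unacceptable character \x04 in test output'))
# -> the 0x04 byte is rendered as the literal text "\x04"
```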
