This repository has been archived by the owner on Feb 22, 2023. It is now read-only.

[ci] migrate Cirrus to Apple Silicon #5693

Closed
wants to merge 53 commits into from

Conversation

fkorotkov
Contributor

Migrates Cirrus CI to use the new M1-powered infra.

Pre-launch Checklist

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • I read the Tree Hygiene wiki page, which explains my responsibilities.
  • I read and followed the relevant style guides and ran the auto-formatter. (Unlike the flutter/flutter repo, the flutter/plugins repo does use dart format.)
  • I signed the CLA.
  • The title of the PR starts with the name of the plugin surrounded by square brackets, e.g. [shared_preferences]
  • I listed at least one issue that this PR fixes in the description above.
  • I updated pubspec.yaml with an appropriate new version according to the pub versioning philosophy, or this PR is exempt from version changes.
  • I updated CHANGELOG.md to add a description of the change, following repository CHANGELOG style.
  • I updated/added relevant documentation (doc comments with ///).
  • I added new tests to check the change I am making, or this PR is test-exempt.
  • All existing and new tests are passing.

If you need help, consider asking for advice on the #hackers-new channel on Discord.

@stuartmorgan
Contributor

Seeing if I can repro the failures locally for investigation.

Before I forget though: ios-build_all_plugins and macos-build_all_plugins tasks should stay on Intel, so we have build coverage of both architectures.

@stuartmorgan
Contributor

Podspec issues fall into two categories:

  • macOS: ld: warning: [...]/Pods/FlutterMacOS/FlutterMacOS.framework/FlutterMacOS, building for macOS-arm64 but attempting to link with file built for macOS-x86_64. @cbracken Shouldn't that framework be universal on master?
  • iOS dependencies. E.g.: ld: building for iOS Simulator, but linking in object file built for iOS, file '[...]/Pods/GoogleMaps/Maps/Frameworks/GoogleMaps.framework/GoogleMaps' for architecture arm64. @jmagman Is this something we can adjust about how pod does its validation build, the way we have for the Runner?

In the short term though, we could just leave that task on Intel machines so it doesn't block moving the rest.

@stuartmorgan
Contributor

I filed flutter/flutter#103515 for the podspec validation issues since it doesn't need to block this PR, so we can continue discussion of that part there.

@stuartmorgan
Contributor

stuartmorgan commented May 11, 2022

I haven't been able to reproduce the video_player_avfoundation failure in native-test locally. Is it possible that it's specific to this virtualization environment?

@fkorotkov
Contributor Author

@stuartmorgan, you are referring to this failure of [VideoPlayerTests testHLSControls]:

2022-05-11 11:00:26.656214-0700 Runner[4971:29643] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[FlutterError objectForKeyedSubscript:]: unrecognized selector sent to instance 0x6000028e1e20'
*** First throw call stack:
(
	0   CoreFoundation                      0x00000001011575e4 __exceptionPreprocess + 236
	1   libobjc.A.dylib                     0x0000000100e7613c objc_exception_throw + 56
	2   CoreFoundation                      0x0000000101166b64 +[NSObject(NSObject) instanceMethodSignatureForSelector:] + 0
	3   CoreFoundation                      0x000000010115b854 ___forwarding___ + 1440
	4   CoreFoundation                      0x000000010115d8ec _CF_forwarding_prep_0 + 92
	5   RunnerTests                         0x0000000123b06ee0 __35-[VideoPlayerTests testPlugin:uri:]_block_invoke + 68
	6   Runner                              0x0000000100aa6668 -[FLTVideoPlayer observeValueForKeyPath:ofObject:change:context:] + 1500
	7   Foundation                          0x0000000101cbe08c NSKeyValueNotifyObserver + 288
	8   Foundation                          0x0000000101cc14e4 NSKeyValueDidChange + 372
	9   Foundation                          0x0000000101cbd570 NSKeyValueDidChangeWithPerThreadPendingNotifications + 148
	10  AVFCore                             0x000000011655fde0 -[AVPlayerItem _changeStatusToFailedWithError:] + 524
	11  AVFCore                             0x000000011657ba40 __avplayeritem_fpItemNotificationCallback_block_invoke + 2100
	12  libdispatch.dylib                   0x0000000109002244 _dispatch_call_block_and_release + 24
	13  libdispatch.dylib                   0x0000000109003a98 _dispatch_client_callout + 16
	14  libdispatch.dylib                   0x000000010901141c _dispatch_main_queue_drain + 976
	15  libdispatch.dylib                   0x000000010901103c _dispatch_main_queue_callback_4CF + 40
	16  CoreFoundation                      0x00000001010c5218 __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 12
	17  CoreFoundation                      0x00000001010bf69c __CFRunLoopRun + 2432
	18  CoreFoundation                      0x00000001010be804 CFRunLoopRunSpecific + 572
	19  XCTestCore                          0x0000000116e3a810 -[XCTWaiter waitForExpectations:timeout:enforceOrder:] + 872
	20  XCTestCore                          0x0000000116e0d1c4 -[XCTestCase(AsynchronousTesting) waitForExpectationsWithTimeout:handler:] + 212
	21  RunnerTests                         0x0000000123b04680 -[VideoPlayerTests testPlugin:uri:] + 1784
	22  RunnerTests                         0x0000000123b02e90 -[VideoPlayerTests testHLSControls] + 188
	23  CoreFoundation                      0x000000010115daa0 __invoking___ + 144
	24  CoreFoundation                      0x000000010115afc8 -[NSInvocation invoke] + 300
	25  XCTestCore                          0x0000000116e5fcdc +[XCTFailableInvocation invokeStandardConventionInvocation:completion:] + 84
	26  XCTestCore                          0x0000000116e5fc80 __90+[XCTFailableInvocation invokeInvocation:withTestMethodConvention:lastObservedErrorIssue:]_block_invoke_3 + 24
	27  XCTestCore                          0x0000000116e5f6d8 __81+[XCTFailableInvocation invokeWithAsynchronousWait:lastObservedErrorIssue:block:]_block_invoke.13 + 80
	28  XCTestCore                          0x0000000116e1ea90 +[XCTSwiftErrorObservation observeErrorsInBlock:] + 140
	29  XCTestCore                          0x0000000116e5f564 +[XCTFailableInvocation invokeWithAsynchronousWait:lastObservedErrorIssue:block:] + 428
	30  XCTestCore                          0x0000000116e5fa00 +[XCTFailableInvocation invokeInvocation:withTestMethodConvention:lastObservedErrorIssue:] + 252
	31  XCTestCore                          0x0000000116e5fd54 +[XCTFailableInvocation invokeInvocation:lastObservedErrorIssue:] + 76
	32  XCTestCore                          0x0000000116e4d51c __24-[XCTestCase invokeTest]_block_invoke.287 + 112
	33  XCTestCore                          0x0000000116e16d44 -[XCTestCase(XCTIssueHandling) _caughtUnhandledDeveloperExceptionPermittingControlFlowInterruptions:caughtInterruptionException:whileExecutingBlock:] + 172
	34  XCTestCore                          0x0000000116e4d0ec -[XCTestCase invokeTest] + 836
	35  XCTestCore                          0x0000000116e4e8ac __26-[XCTestCase performTest:]_block_invoke.396 + 44
	36  XCTestCore                          0x0000000116e16d44 -[XCTestCase(XCTIssueHandling) _caughtUnhandledDeveloperExceptionPermittingControlFlowInterruptions:caughtInterruptionException:whileExecutingBlock:] + 172
	37  XCTestCore                          0x0000000116e4e2fc __26-[XCTestCase performTest:]_block_invoke.375 + 476
	38  XCTestCore                          0x0000000116e33a30 +[XCTContext runInContextForTestCase:markAsReportingBase:block:] + 220
	39  XCTestCore                          0x0000000116e4df2c -[XCTestCase performTest:] + 316
	40  XCTestCore                          0x0000000116e04b64 -[XCTest runTest] + 60
	41  XCTestCore                          0x0000000116e36a4c -[XCTestSuite runTestBasedOnRepetitionPolicy:testRun:] + 156
	42  XCTestCore                          0x0000000116e368b8 __27-[XCTestSuite performTest:]_block_invoke + 208
	43  XCTestCore                          0x0000000116e362c0 __59-[XCTestSuite _performProtectedSectionForTest:testSection:]_block_invoke + 40
	44  XCTestCore                          0x0000000116e33a30 +[XCTContext runInContextForTestCase:markAsReportingBase:block:] + 220
	45  XCTestCore                          0x0000000116e36268 -[XCTestSuite _performProtectedSectionForTest:testSection:] + 164
	46  XCTestCore                          0x0000000116e36558 -[XCTestSuite performTest:] + 212
	47  XCTestCore                          0x0000000116e04b64 -[XCTest runTest] + 60
	48  XCTestCore                          0x0000000116e36a4c -[XCTestSuite runTestBasedOnRepetitionPolicy:testRun:] + 156
	49  XCTestCore                          0x0000000116e368b8 __27-[XCTestSuite performTest:]_block_invoke + 208
	50  XCTestCore                          0x0000000116e362c0 __59-[XCTestSuite _performProtectedSectionForTest:testSection:]_block_invoke + 40
	51  XCTestCore                          0x0000000116e33a30 +[XCTContext runInContextForTestCase:markAsReportingBase:block:] + 220
	52  XCTestCore                          0x0000000116e36268 -[XCTestSuite _performProtectedSectionForTest:testSection:] + 164
	53  XCTestCore                          0x0000000116e36558 -[XCTestSuite performTest:] + 212
	54  XCTestCore                          0x0000000116e04b64 -[XCTest runTest] + 60
	55  XCTestCore                          0x0000000116e36a4c -[XCTestSuite runTestBasedOnRepetitionPolicy:testRun:] + 156
	56  XCTestCore                          0x0000000116e368b8 __27-[XCTestSuite performTest:]_block_invoke + 208
	57  XCTestCore                          0x0000000116e362c0 __59-[XCTestSuite _performProtectedSectionForTest:testSection:]_block_invoke + 40
	58  XCTestCore                          0x0000000116e33a30 +[XCTContext runInContextForTestCase:markAsReportingBase:block:] + 220
	59  XCTestCore                          0x0000000116e36268 -[XCTestSuite _performProtectedSectionForTest:testSection:] + 164
	60  XCTestCore                          0x0000000116e36558 -[XCTestSuite performTest:] + 212
	61  XCTestCore                          0x0000000116e04b64 -[XCTest runTest] + 60
	62  XCTestCore                          0x0000000116e066b8 __89-[XCTTestRunSession executeTestsWithIdentifiers:skippingTestsWithIdentifiers:completion:]_block_invoke + 112
	63  XCTestCore                          0x0000000116e33a30 +[XCTContext runInContextForTestCase:markAsReportingBase:block:] + 220
	64  XCTestCore                          0x0000000116e065a8 -[XCTTestRunSession executeTestsWithIdentifiers:skippingTestsWithIdentifiers:completion:] + 260
	65  XCTestCore                          0x0000000116e6d5cc __72-[XCTExecutionWorker enqueueTestIdentifiersToRun:testIdentifiersToSkip:]_block_invoke_2 + 116
	66  XCTestCore                          0x0000000116e6d704 -[XCTExecutionWorker runWithError:] + 116
	67  XCTestCore                          0x0000000116e30f44 __25-[XCTestDriver _runTests]_block_invoke.322 + 60
	68  XCTestCore                          0x0000000116e0f470 -[XCTestObservationCenter _observeTestExecutionForBlock:] + 312
	69  XCTestCore                          0x0000000116e30c34 -[XCTestDriver _runTests] + 1452
	70  XCTestCore                          0x0000000116e05144 _XCTestMain + 116
	71  libXCTestBundleInject.dylib         0x0000000100d638c0 __copy_helper_block_e8_32s + 0
	72  CoreFoundation                      0x00000001010c5580 __CFRUNLOOP_IS_CALLING_OUT_TO_A_BLOCK__ + 20
	73  CoreFoundation                      0x00000001010c4854 __CFRunLoopDoBlocks + 408
	74  CoreFoundation                      0x00000001010bf018 __CFRunLoopRun + 764
	75  CoreFoundation                      0x00000001010be804 CFRunLoopRunSpecific + 572
	76  GraphicsServices                    0x0000000102a1b60c GSEventRunModal + 160
	77  UIKitCore                           0x000000010efa9d2c -[UIApplication _run] + 992
	78  UIKitCore                           0x000000010efae8c8 UIApplicationMain + 112
	79  Runner                              0x0000000100aa24c0 main + 104
	80  dyld                                0x0000000100ce1cd8 start_sim + 20
	81  ???                                 0x0000000100d95088 0x0 + 4309209224
	82  ???                                 0x3d51800000000000 0x0 + 4418453446915522560

Could it be because I updated SimRuntime to 15-4 from 15-0 and controls changed? I assume it's a native video player.

Sorry I'm not too familiar with flutter/plugins. BTW how did you try to reproduce the issue?

@stuartmorgan
Contributor

Could it be because I updated SimRuntime to 15-4 from 15-0 and controls changed? I assume it's a native video player.

We don't use native controls, just the video output itself. Also, this isn't a UI test, it's a unit test.

But I just realized that I was misunderstanding the exception here; I thought it was memory stomping, but it's actually just us getting an error we aren't expecting and the test not being set up to handle that gracefully. I'll push an update that should give us better error output, which may tell us what's going on.

BTW how did you try to reproduce the issue?

I've tried running from Xcode, but also running from the command line directly with the same commands the Cirrus config uses (modified slightly to only run that one package.)

@stuartmorgan
Contributor

The failure should now give us the underlying AVFoundation video load error message; we'll see if that's enlightening.

@stuartmorgan
Contributor

"VideoError: Failed to load video: The operation could not be completed"

Not super helpful. But for some reason AVFoundation isn't loading an HLS video in CI.

@fkorotkov
Contributor Author

I wonder if it's related to Metal/Graphics config of the VM. Do you think it might be related? Let me check it.

.cirrus.yml Outdated
Comment on lines 365 to 366
# TODO(stuartmorgan): Move this to M1 once
# https:/flutter/flutter/issues/103515 is resolved.
Member

The Flutter 3.0 arm64 version of macOS framework has been published, can we try on arm64 again?
https:/CocoaPods/Specs/blob/master/Specs/4/2/c/FlutterMacOS/3.0.0/FlutterMacOS.podspec.json

Contributor

I just ran locally and still got FlutterMacOS (2.10.2). Maybe it's not propagated yet?

Member

That's possible; sometimes it takes a while. I'll check locally tomorrow.

Contributor

Turns out I had to run pod repo update.

@stuartmorgan
Contributor

I wonder if it's related to Metal/Graphics config of the VM. Do you think it might be related? Let me check it.

I wondered that as well; I had been hoping the error would give us a hint as to whether that was likely. I just tried pushing a version that logs more info, but I don't know if it'll be useful or not.

@fkorotkov
Contributor Author

I was able to reproduce it locally in a VM and saw this weird dialog:

Screen Shot 2022-05-11 at 10 24 23 PM

Here are details of it when I clicked .

report.txt

It seems to contain a bit more info than the logs. Might be helpful.

If you want to try to repro locally here are the steps:

First start a VM:

brew install cirruslabs/cli/tart
tart clone ghcr.io/cirruslabs/macos-monterey-xcode:latest monterey-xcode
tart run --no-graphics monterey-xcode

I prefer to use "Screen Sharing" because it supports copy/paste from host. That's why I run with --no-graphics and then do tart ip --wait monterey-xcode in a separate terminal to get an IP for Screen Sharing. Username is admin and password is admin.
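
The Screen Sharing step can be scripted roughly as follows. This is a minimal sketch, not part of the repository or of tart itself: the function names `screen_sharing_url` and `connect_to_vm` are hypothetical, the `tart ip --wait` invocation is copied verbatim from the comment above (other tart versions may expect different arguments), and `open` with a `vnc://` URL assumes a macOS host.

```shell
# Local VM name chosen in the "tart clone" step above.
VM_NAME="monterey-xcode"

# Build a Screen Sharing URL for the given address.
# The "admin" user comes from the comment above; the password is typed in the dialog.
screen_sharing_url() {
  printf 'vnc://admin@%s\n' "$1"
}

# Hypothetical wrapper: look up the VM's IP and open Screen Sharing (macOS host only).
connect_to_vm() {
  local ip
  # Invocation copied from the comment above; adjust if your tart version differs.
  ip="$(tart ip --wait "$VM_NAME")" || return 1
  open "$(screen_sharing_url "$ip")"
}
```

With the VM already running via `tart run --no-graphics monterey-xcode`, calling `connect_to_vm` in a second terminal should bring up the Screen Sharing login prompt.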

@stuartmorgan
Contributor

stuartmorgan commented May 12, 2022

The NSException (and thus the dialog and crash log) are just symptoms of the actual failure; it looks like you are not running with the changes I pushed here earlier. The problem is that we're getting an error in this test in the first place; the exception is just because the test assertions were expecting a different type of object (a normal result, not an error object).

@fkorotkov
Contributor Author

@stuartmorgan I got the same error locally on my M1 Max device. How can I debug it? I'm running the latest changes but don't see any extra information.

Also which exact command are you using to run only VideoPlayerTests. I didn't figure it out and don't want to run all tests each time. 😅

@fkorotkov
Contributor Author

@stuartmorgan it seems the newer version of the framework bundled with Xcode 13.3.1 added buffering events, and there was an issue where the assertion was fulfilled on the first event rather than on the initialized one. Fixed in 82365e5, which made the tests pass on my local machine.

@fkorotkov
Contributor Author

I'm a bit concerned about the frequent timeouts. It seems the CI script is hanging. Compare these two runs:

  1. Seems everything is done but the script is just hanging.
  2. In a successful run there is a run overview printed.

In case of a time out Cirrus agent reports a process tree and I can see this cirrus-ci-agent -> bash -> bash -> dart -> dart -> simctl chain of processes on the timed out task.

@stuartmorgan have you seen similar flaky time outs on the current Intel Macs? Maybe you also have an idea where this left over simctl is not cleaned up?

@stuartmorgan
Contributor

I still see the video player error in the latest CI run. With the added logging it's:

[VideoPlayerTests testHLSControls] : failed - VideoError: Failed to load video: The operation couldn’t be completed. (CoreMediaErrorDomain error -12746.) (CoreMediaErrorDomain:-12746)

It looks like that error is kCMClockError_InvalidParameter. No idea what that would be.

@stuartmorgan
Contributor

  1. Seems everything is done but the script is just hanging.

Nope, everything is not done there. It's finished the build for that package, but the next step is to run the integration tests on a simulator, which there's no logging for.

In case of a time out Cirrus agent reports a process tree and I can see this cirrus-ci-agent -> bash -> bash -> dart -> dart -> simctl chain of processes on the timed out task.

The flutter tool runs simctl as part of running on simulator, so it sounds like what is happening is that simctl is randomly hanging forever in the virtualized environment.

@stuartmorgan have you seen similar flaky time outs on the current Intel Macs?

No, we don't see this kind of hang in our current CI.
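
One possible way to keep a hung child process such as simctl from consuming the whole task timeout is to put a deadline on the step that launches it. The following is only a sketch under stated assumptions: `run_with_deadline` is a hypothetical helper (neither this repository nor Cirrus provides it), and it signals only the direct child, so a grandchild like simctl may still need separate cleanup.

```shell
# Hypothetical helper: run a command, sending it SIGTERM if it exceeds a deadline.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
  local seconds="$1"
  shift

  # Start the real command in the background and remember its PID.
  "$@" &
  local pid=$!

  # Watchdog: after the deadline, terminate the command if it is still running.
  # Output is discarded so the watchdog never holds the caller's pipes open.
  ( sleep "$seconds"; kill -TERM "$pid" 2>/dev/null ) >/dev/null 2>&1 &
  local watchdog=$!

  # Collect the command's exit status (128 + signal number if it was killed).
  local status=0
  wait "$pid" || status=$?

  # The command finished or was killed; the watchdog is no longer needed.
  kill "$watchdog" 2>/dev/null || true
  wait "$watchdog" 2>/dev/null || true

  return "$status"
}
```

A CI step wrapped this way fails after the chosen number of seconds instead of hanging silently, which at least makes the flaky runs quicker to spot and retry.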

@@ -192,6 +192,7 @@ - (void)testTransformFix {
__block NSDictionary<NSString *, id> *initializationEvent;
[player onListenWithArguments:nil
eventSink:^(id event) {
NSLog(@"event: %@", event);
Contributor

This is the same logging that the assertion failure in the next line already does (which is less detail than what is in the [event isKindOfClass:[FlutterError class]] branch below), so you're not going to get new information here.

You could try logging information about the original error itself, in the case AVPlayerItemStatusFailed: section of FLTVideoPlayerPlugin.m, but I didn't have any luck with that in the CI.

Contributor Author

What if it's not a FlutterError? This logging helped me locally before to find out about the initialization. Let's see what it will print. 🤷

@stuartmorgan
Contributor

Also which exact command are you using to run only VideoPlayerTests. I didn't figure it out and don't want to run all tests each time. 😅

dart ./script/tool/bin/flutter_plugin_tools.dart native-test --ios --ios-destination "platform=iOS Simulator,name=iPhone 11,OS=latest" --packages=video_player_avfoundation

(Which is as close as possible to the CI command, but modified to filter to that package.)
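
For repeated local runs, that command can be wrapped so the package name is a parameter. This is a minimal sketch: the function name `run_ios_native_tests` and the `DRY_RUN` variable are hypothetical and not part of the repository tooling; only the dart invocation itself is taken from the comment above.

```shell
# Hypothetical helper: run the iOS native tests for a single package.
# Set DRY_RUN=1 to print the command instead of executing it.
run_ios_native_tests() {
  local package="$1"

  # Rebuild the argument list as the command from the comment above,
  # with the package filter filled in.
  set -- dart ./script/tool/bin/flutter_plugin_tools.dart native-test \
    --ios \
    --ios-destination "platform=iOS Simulator,name=iPhone 11,OS=latest" \
    --packages="$package"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Printed for inspection only; quoting is not preserved in this output.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Run from the repository root, `run_ios_native_tests video_player_avfoundation` is then equivalent to the command quoted above.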

@fkorotkov
Contributor Author

I can reproduce the failure locally in a VM. Here is the interesting output:

VideoError: Failed to load video: The operation couldn’t be completed. (CoreMediaErrorDomain error -12746.)

Googling it showed a few results including one related to Flutter. I wonder if it's related to a fact that the VM doesn't have an audio output. Let me try to investigate a bit more.

@stuartmorgan
Contributor

Googling it showed a few results including one related to Flutter.

Note that that's a different error code though. The textual part of the message is a vague catch-all, so it's likely that the error code is the most interesting part.

I wonder if it's related to a fact that the VM doesn't have an audio output. Let me try to investigate a bit more.

I was going to say that seemed unlikely since it's a clock error code, but then I saw CMAudioDeviceClockCreate. So perhaps the issue is that it's trying to create a clock for a non-existent audio device, and failing.

@fkorotkov
Contributor Author

Yes, it's a missing audio device. I installed a virtual device in the VM and the test passed. Fixing it in cirruslabs/macos-image-templates#47. I will need to rebuild the VM image and release it, which will take a few hours. Will report back once it's ready.

@stuartmorgan
Contributor

Great! Meanwhile I'll push some minor fixes to reduce the unrelated red. I think we're just down to the simctl issue now.

@fkorotkov
Contributor Author

Pardon the noise. I've rebased the branch and am not sure yet what happened.

@fkorotkov
Contributor Author

Oh. I guess my original branch was based off master and I rebased on top of main. I'll just close the PR. There is still investigation going on and I don't want to spam even more. Sorry for the noise once again!

@fkorotkov fkorotkov closed this May 20, 2022
@stuartmorgan
Contributor

@stuartmorgan do I understand it correctly that each integration test is a separate App that is getting installed into the simulator and the simulator is never getting reset?

Yes, that's correct.

I wonder if resetting the simulator or at least uninstalling the app after running drive might help.

🤷🏻 I would expect that the simulator can handle having a bunch of other apps installed, and it's never been a problem for us before. But doing that shouldn't break anything if you want to try it. (I do expect it would make runs somewhat slower; I'm not sure how much worse having each test be a clear simulator launch would be.)

stuartmorgan added a commit to stuartmorgan/plugins that referenced this pull request May 20, 2022
This adopts the new Apple silicon images for:
- linting
- macOS platform tests
- iOS build-all

This gives us build coverage across both architectures for both iOS and
macOS.

Ideally we would use ARM for iOS platform tests instead of build-all,
but driving the iOS tests currently has flaky hangs on ARM. See
flutter#5693 for details. Completing
that part of the migration is left as a TODO.