Fix scheduler ci endpoint test #40

Merged 1 commit into feat-dist-scheduler on Jun 3, 2024

Conversation

cicoyle
Owner

@cicoyle cicoyle commented Jun 3, 2024

Fix scheduler ci endpoint test

Signed-off-by: Cassandra Coyle <[email protected]>
@cicoyle cicoyle requested review from artursouza and yaron2 June 3, 2024 17:35
@cicoyle cicoyle merged commit afe239a into feat-dist-scheduler Jun 3, 2024
cicoyle added a commit that referenced this pull request Jun 26, 2024
* add back in scheduler app that I forgot to add in. fix go.mod

Signed-off-by: Cassandra Coyle <[email protected]>

* add a bunch of integration tests and run make lint

Signed-off-by: Cassandra Coyle <[email protected]>

* Convenient ignore for .local folder.

Signed-off-by: Artur Souza <[email protected]>

* Add scheduler to the list of binaries in Makefile.

Signed-off-by: Artur Souza <[email protected]>

* Move placement interface out of internal package.

Signed-off-by: Artur Souza <[email protected]>

* Move actor config out of internal package.

Signed-off-by: Artur Souza <[email protected]>

* Move actor api level out of internal.

Signed-off-by: Artur Souza <[email protected]>

* Add placement client to scheduler.

Signed-off-by: Artur Souza <[email protected]>

* update go.mod and rm the appID prefix from being returned to users

Signed-off-by: Cassandra Coyle <[email protected]>

* fix up some make lint issues

Signed-off-by: Cassandra Coyle <[email protected]>

* Make data to be bytes and add it to scheduler's trigger callback.

Signed-off-by: Artur Souza <[email protected]>

* Triggers reminders using scheduler service.

Signed-off-by: Artur Souza <[email protected]>

* Address comments.

Signed-off-by: Artur Souza <[email protected]>

* Update pkg/actors/actors.go

Co-authored-by: Cassie Coyle <[email protected]>
Signed-off-by: Artur Souza <[email protected]>

* Revert Job's data back to Any.

Signed-off-by: Artur Souza <[email protected]>

* integration tests for daprd connecting to scheduler and performing CRUD ops on 10 jobs. req validation

Signed-off-by: Cassandra Coyle <[email protected]>

* add error integration tests for schedule job

Signed-off-by: Cassandra Coyle <[email protected]>

* Apply suggestions from code review

Co-authored-by: Josh van Leeuwen <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>

* fix imports and general cleanup and fix issue with require.Len on slice from PR review feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* add helper for error reason string construction

Signed-off-by: Cassandra Coyle <[email protected]>

* build out scheduler server as grpc server for testing to force scheduler server errors. add tests for all other tests

Signed-off-by: Cassandra Coyle <[email protected]>

* rename constructReason file

Signed-off-by: Cassandra Coyle <[email protected]>

* metadata -> errMetadata to avoid confusion

Signed-off-by: Cassandra Coyle <[email protected]>

* updates based on PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* Update tests/integration/suite/scheduler/api/jobs.go

Co-authored-by: Josh van Leeuwen <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>

* fix names due to added PR feedback thru UI

Signed-off-by: Cassandra Coyle <[email protected]>

* Apply suggestions from code review

Co-authored-by: Josh van Leeuwen <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>

* incorporate Josh's feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* update error strings based on PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* Add etcd data dir, Listen addr, cleanup (#15)

* pulling out changes from existing PRs to be cleaner

Signed-off-by: Cassandra Coyle <[email protected]>

* cherrypick listen addr logic from other PR commit

Signed-off-by: Cassandra Coyle <[email protected]>

* Make scheduler host address singular

Signed-off-by: joshvanl <[email protected]>

* wip

Signed-off-by: Cassandra Coyle <[email protected]>

* put back etcd name to fix err

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>
Signed-off-by: joshvanl <[email protected]>
Co-authored-by: joshvanl <[email protected]>

* fix issue running tests in parallel by adding initial cluster info and more etcd config

Signed-off-by: Cassandra Coyle <[email protected]>

* change scheduler logLevel back to info for test

Signed-off-by: Cassandra Coyle <[email protected]>

* wip

Signed-off-by: Cassandra Coyle <[email protected]>

* got ha for 2 schedulers working and am able to see the db in etcd via the cli

Signed-off-by: Cassandra Coyle <[email protected]>

* move tests around based on PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* move tests around based on PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* cleanup of comments and update test to schedule against a random scheduler, but ensure data exists on all schedulers

Signed-off-by: Cassandra Coyle <[email protected]>

* cleanup and switch initialCluster to []string

Signed-off-by: Cassandra Coyle <[email protected]>

* update etcdClientPorts to be a map to de-dup the lookup logic

Signed-off-by: Cassandra Coyle <[email protected]>

* update test framework to have EtcdClientPort on scheduler for easier lookup later

Signed-off-by: Cassandra Coyle <[email protected]>

* make lint

Signed-off-by: Cassandra Coyle <[email protected]>

* cleanup

Signed-off-by: Cassandra Coyle <[email protected]>

* update test framework etcd-client-ports

Signed-off-by: Cassandra Coyle <[email protected]>

* fix import

Signed-off-by: Cassandra Coyle <[email protected]>

* add note to rm etcdctl cmd later

Signed-off-by: Cassandra Coyle <[email protected]>

* rm raw command for library func

Signed-off-by: Cassandra Coyle <[email protected]>

* add initial connections pool for scheduler server of sidecars connecting

Signed-off-by: Cassandra Coyle <[email protected]>

* initial streaming from scheduler to sidecar at trigger time works. need to cleanup still and write tests

Signed-off-by: Cassandra Coyle <[email protected]>

* cleanup scheduler server.go

Signed-off-by: Cassandra Coyle <[email protected]>

* fix stream ctx to exit go routine properly

Signed-off-by: Cassandra Coyle <[email protected]>

* kill runtime go routine if ctx is cancelled or if it errs and pass err out of go routine

Signed-off-by: Cassandra Coyle <[email protected]>

* Create Scheduler Charts (#12)

* initial charts. need to fix caching issue to confirm

Signed-off-by: Cassandra Coyle <[email protected]>

* Adds ETCD data dir option, use emptyDir in statefulset

Signed-off-by: joshvanl <[email protected]>

* fix healthz port

Signed-off-by: Cassandra Coyle <[email protected]>

* updates to charts

Signed-off-by: Cassandra Coyle <[email protected]>

* wip. need to fix connectivity issue, but added to injector and updated client code to be an abstract wrapper

Signed-off-by: Cassandra Coyle <[email protected]>

* Make scheduler host address singular

Signed-off-by: joshvanl <[email protected]>

* Update cmd/scheduler/options/options.go

Co-authored-by: Cassie Coyle <[email protected]>
Signed-off-by: Josh van Leeuwen <[email protected]>

* rm reminders service name condition

Signed-off-by: Cassandra Coyle <[email protected]>

* add listen address for scheduler

Signed-off-by: Cassandra Coyle <[email protected]>

* rebase in trigger reminders via scheduler PR and fix actors scheduler client

Signed-off-by: Cassandra Coyle <[email protected]>

* ha -> replicaCount, update conditionals including it and use Etcd over ETCD vars

Signed-off-by: Cassandra Coyle <[email protected]>

* rm volumeClaimTemplates and cleanup

Signed-off-by: Cassandra Coyle <[email protected]>

* fix version issue

Signed-off-by: Cassandra Coyle <[email protected]>

* rm diff

Signed-off-by: Cassandra Coyle <[email protected]>

* updates to ensure ha for k8s works

Signed-off-by: Cassandra Coyle <[email protected]>

* condense var into 1 line

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>
Signed-off-by: joshvanl <[email protected]>
Signed-off-by: Josh van Leeuwen <[email protected]>
Co-authored-by: joshvanl <[email protected]>

* remove the sidecar connections from the connPool in scheduler when appropriate. general cleanup. add logs

Signed-off-by: Cassandra Coyle <[email protected]>

* pkg/runtime/runtime.go

Signed-off-by: Cassandra Coyle <[email protected]>

* schedulerManager added, streaming works for n sidecars however, it doesn't stream back to the proper appID. this needs to be fixed

Signed-off-by: Cassandra Coyle <[email protected]>

* jobs now stream to the proper appID

Signed-off-by: Cassandra Coyle <[email protected]>

* fix runtime go routines to close properly. this schedules jobs properly to 2 sidecars of a diff appID

Signed-off-by: Cassandra Coyle <[email protected]>

* minConnCount -> max. fix lock on nsappid

Signed-off-by: Cassandra Coyle <[email protected]>

* add backoff retry inf logic for sidecar connecting

Signed-off-by: Cassandra Coyle <[email protected]>

* cleanup

Signed-off-by: Cassandra Coyle <[email protected]>

* random index for appID connection streaming choice and read lock for better efficiency

Signed-off-by: Cassandra Coyle <[email protected]>

* send only job data & metadata on stream. remove ConnectHosts.

Signed-off-by: Cassandra Coyle <[email protected]>

* string ptr for addresses to string. cleanup. add cmd options to testing framework for scheduler: placement, maxConns, maxWaitTime, add sidecar to notls test to get it to pass since scheduler needs a sidecar to stream back to at trigger time

Signed-off-by: Cassandra Coyle <[email protected]>

* make max sidecar conns -1 pending load tests

Signed-off-by: Cassandra Coyle <[email protected]>

* wip

Signed-off-by: Cassandra Coyle <[email protected]>

* testing my push

Signed-off-by: Cassandra Coyle <[email protected]>

* Adds closeCh to signal shutdown of watcher

Signed-off-by: joshvanl <[email protected]>

* still need to polish code, but wanted to push bc its working for reconnecting and having sidecar wait for scheduler to come up

Signed-off-by: Cassandra Coyle <[email protected]>

* Update to latest cron lib. (#17)

* Update to latest cron lib.

Signed-off-by: Artur Souza <[email protected]>

* Implement DeleteReminder

Signed-off-by: Artur Souza <[email protected]>

* Fix IT for cron after internal schema change in etcd store.

Signed-off-by: Artur Souza <[email protected]>

* Fix IT for scheduler.

Signed-off-by: Artur Souza <[email protected]>

* Update cmd/daprd/options/options.go

Co-authored-by: Cassie Coyle <[email protected]>
Signed-off-by: Artur Souza <[email protected]>

* Use "@every" to convert period to schedule.

Signed-off-by: Artur Souza <[email protected]>

* Add GetJob again and assume namespace in metadata.
Other comments.

Signed-off-by: Artur Souza <[email protected]>

* Address comments.

Signed-off-by: Artur Souza <[email protected]>

* remove debug line.

Signed-off-by: Artur Souza <[email protected]>

* Don't double close etcd.

Signed-off-by: Artur Souza <[email protected]>

---------

Signed-off-by: Artur Souza <[email protected]>
Co-authored-by: Cassie Coyle <[email protected]>

* streaming test ensuring triggered job logged on proper sidecar. adding wip rotate scheduler client if conn err. fixing to switch back to what I originally had.

Signed-off-by: Cassandra Coyle <[email protected]>

* rm unused rotate client on conn err. instead proceed to rotate schedulers per crud operation

Signed-off-by: Cassandra Coyle <[email protected]>

* gs

Signed-off-by: Cassandra Coyle <[email protected]>

* unexport manager structs where possible. updates based on PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* make lint

Signed-off-by: Cassandra Coyle <[email protected]>

* rm \n

Signed-off-by: Cassandra Coyle <[email protected]>

* Fix scheduler client ref. (#20)

Signed-off-by: Artur Souza <[email protected]>

* use runner manager for go routines for scheduler manager & refactor/update streaming code in scheduler manager to be cleaner

Signed-off-by: Cassandra Coyle <[email protected]>

* update log spacing bc that was failing the streaming test. rm extra lock on pool

Signed-off-by: Cassandra Coyle <[email protected]>

* make lint

Signed-off-by: Cassandra Coyle <[email protected]>

* update actor code to use schedulerManager for next client and rm schedulerClient

Signed-off-by: Cassandra Coyle <[email protected]>

* update scheduler server conn pool to be in internal dir

Signed-off-by: Cassandra Coyle <[email protected]>

* mv struct creation to schedulerManager code, not runtime

Signed-off-by: Cassandra Coyle <[email protected]>

* rm maxConnsPerAppID and maxWaitTime

Signed-off-by: Cassandra Coyle <[email protected]>

* add appID + ns as first class fields and comment out ttl since updating to the latest cron lib breaks it. this will be added by artur in a followup PR

Signed-off-by: Cassandra Coyle <[email protected]>

* update notls test to rm time.sleep and assertEventually. update ns+appID to be in func and not caller logic

Signed-off-by: Cassandra Coyle <[email protected]>

* rm double string concatenation per josh PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* rm extra setting of scheduler addr on daprd in framework since its set higher up in exec area

Signed-off-by: Cassandra Coyle <[email protected]>

* rm unused var

Signed-off-by: Cassandra Coyle <[email protected]>

* update trigger func to be latest lib

Signed-off-by: Cassandra Coyle <[email protected]>

* metadata of ns+appID added to scheduler logic, and removed from sidecar side

Signed-off-by: Cassandra Coyle <[email protected]>

* rm comment

Signed-off-by: Cassandra Coyle <[email protected]>

* simplify nextIdx for scheduler based on PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* add code todos

Signed-off-by: Cassandra Coyle <[email protected]>

* pr feedback: watchJobs

Signed-off-by: Cassandra Coyle <[email protected]>

* mv client funcs to client dir

Signed-off-by: Cassandra Coyle <[email protected]>

* fix weird spacing in proto file

Signed-off-by: Cassandra Coyle <[email protected]>

* add schedulerManager Options struct

Signed-off-by: Cassandra Coyle <[email protected]>

* log.Errorf everywhere with addr

Signed-off-by: Cassandra Coyle <[email protected]>

* use atomic int instead of rand for next scheduler client. rm lock

Signed-off-by: Cassandra Coyle <[email protected]>
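A rough sketch of lock-free round-robin selection with an atomic counter, the general technique this commit names. The `clientPicker` type and its fields are hypothetical stand-ins, not the scheduler manager's actual types.

```go
package main

import "sync/atomic"

// clientPicker is a hypothetical stand-in for a list of scheduler clients,
// represented here by their addresses.
type clientPicker struct {
	addrs []string
	next  atomic.Uint64
}

// Next returns the next address in round-robin order without taking a lock.
// It returns "" when no addresses are configured.
func (p *clientPicker) Next() string {
	if len(p.addrs) == 0 {
		return ""
	}
	// Add returns the incremented value, so subtract 1 to start at index 0.
	i := p.next.Add(1) - 1
	return p.addrs[i%uint64(len(p.addrs))]
}
```

Compared with a random index, the counter spreads calls evenly across clients, and the atomic increment removes the need for a mutex around the selection.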

* rm SideCarConnDetails struct

Signed-off-by: Cassandra Coyle <[email protected]>

* move ns, appID concat to pool pkg

Signed-off-by: Cassandra Coyle <[email protected]>

* Update pkg/scheduler/server/internal/pool.go

Co-authored-by: Josh van Leeuwen <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>

* rm ctx per PR feedback since its unused

Signed-off-by: Cassandra Coyle <[email protected]>

* Apply suggestions from code review

Co-authored-by: Josh van Leeuwen <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>

* Update tests/integration/framework/process/daprd/options.go

Co-authored-by: Josh van Leeuwen <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>

* added todo to rm appID + ns from test for streaming

Signed-off-by: Cassandra Coyle <[email protected]>

* Productionises distributed scheduler (#22)

* Fixes warning “Running http and grpc server on single port. This is not recommended for production.”

Signed-off-by: Elena Kolevska <[email protected]>

* Suffixes data dirs with instance id

Signed-off-by: Elena Kolevska <[email protected]>

* Adds space quota parameter

Signed-off-by: Elena Kolevska <[email protected]>

* Sets default quota to 2GB

Signed-off-by: Elena Kolevska <[email protected]>

* Adds compaction parameters

Signed-off-by: Elena Kolevska <[email protected]>

* Updates helm charts

Signed-off-by: Elena Kolevska <[email protected]>

* Adds namespace to data dir name. Renames etcdID to just ID.

Signed-off-by: Elena Kolevska <[email protected]>

---------

Signed-off-by: Elena Kolevska <[email protected]>

* merge in master and comment out actorInternal in scheduler server.go

Signed-off-by: Cassandra Coyle <[email protected]>

* Send Triggered Job to App (#23)

* able to send triggered job back to app via the app channel from daprd sidecar using both grpc and http protocols

Signed-off-by: Cassandra Coyle <[email protected]>

* change sidecar receiving job to debug level to still validate the scheduler stream

Signed-off-by: Cassandra Coyle <[email protected]>

* grpc test

Signed-off-by: Cassandra Coyle <[email protected]>

* wip

Signed-off-by: Cassandra Coyle <[email protected]>

* some cleanup

Signed-off-by: Cassandra Coyle <[email protected]>

* update test framework grpc app to add the OnJobEventFn and update test to use it. grpc appcallback test passes

Signed-off-by: Cassandra Coyle <[email protected]>

* wip http test

Signed-off-by: Cassandra Coyle <[email protected]>

* added http working test. need to make lint

Signed-off-by: Cassandra Coyle <[email protected]>

* update tests with stub for interface func for triggerJob to app now since its in the app channel interface

Signed-off-by: Cassandra Coyle <[email protected]>

* defer release of ch

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* go-etcd-cron (#25)

* go-etcd-cron

Signed-off-by: joshvanl <[email protected]>

* Fix multi-scheduler int test

Signed-off-by: joshvanl <[email protected]>

* Review comments

Signed-off-by: joshvanl <[email protected]>

* Rename schedule app job type to job

Signed-off-by: joshvanl <[email protected]>

---------

Signed-off-by: joshvanl <[email protected]>

* fix charts (#24)

* restore test file diff, keep chart changes

Signed-off-by: Cassandra Coyle <[email protected]>

* fix read-only err

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* rm space in go.mod

Signed-off-by: Cassandra Coyle <[email protected]>

* Bidirectional job trigger & ack. (#26)

* Bidirectional job trigger & ack.

Adds job ack from scheduler client for when job is finished processing
and can be ticked.

Adds mTLS authorization to scheduler API server.

Adds integration tests for scheduler Jobs and Actor Reminders.

Signed-off-by: joshvanl <[email protected]>

* Review comments & reconnect scheduler int test

Signed-off-by: joshvanl <[email protected]>

* Update go-etcd-cron

Signed-off-by: joshvanl <[email protected]>

* Linting

Signed-off-by: joshvanl <[email protected]>

---------

Signed-off-by: joshvanl <[email protected]>

* Charts: Adds option to use dynamic PVC provisioned volume for Scheduler statefulset (#27)

* Charts: Adds option to use PVC for Scheduler statefulset

Adds optional `dapr_scheduler.cluster.persistentVolumeClaimName` helm
chart values option to change the scheduler data dir volume to use the
referenced PersistentVolumeClaim, rather than an empty dir, making ETCD
data persistent across pod restarts.

Also changes the volume and mount paths so that all schedulers share the
same root mount path, but write to a sub directory of the form
"/<namespace>/<scheduler-id>".

Signed-off-by: joshvanl <[email protected]>

* Update scheduler volume to use volumeClaimTemplate

Signed-off-by: joshvanl <[email protected]>

---------

Signed-off-by: joshvanl <[email protected]>

* add opts.HealthzListenAddress to fix make lint

Signed-off-by: Cassandra Coyle <[email protected]>

* Scheduler e2e tests (#28)

* merge & fix http status code check

Signed-off-by: Cassandra Coyle <[email protected]>

* triggered job e2e test for http app

Signed-off-by: Cassandra Coyle <[email protected]>

* update test iteration nums

Signed-off-by: Cassandra Coyle <[email protected]>

* rm time.sleep -> assert.eventually

Signed-off-by: Cassandra Coyle <[email protected]>

* rm local test changes

Signed-off-by: Cassandra Coyle <[email protected]>

* update test name

Signed-off-by: Cassandra Coyle <[email protected]>

* tweaks

Signed-off-by: Cassandra Coyle <[email protected]>

* grpc e2e works, need to cleanup grpc test

Signed-off-by: Cassandra Coyle <[email protected]>

* rm grpc test and combine into http test. keep both apps tho. need to cleanup local test changes in scheduler_test

Signed-off-by: Cassandra Coyle <[email protected]>

* cleanup local test changes

Signed-off-by: Cassandra Coyle <[email protected]>

* make lint

Signed-off-by: Cassandra Coyle <[email protected]>

* mv things around

Signed-off-by: Cassandra Coyle <[email protected]>

* cleanup

Signed-off-by: Cassandra Coyle <[email protected]>

* thread -> goroutine

Signed-off-by: Cassandra Coyle <[email protected]>

* Update clients.go

Signed-off-by: Cassie Coyle <[email protected]>

* rm commented line

Signed-off-by: Cassandra Coyle <[email protected]>

* Apply suggestions from code review

Co-authored-by: Josh van Leeuwen <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>

* PR review updates

Signed-off-by: Cassandra Coyle <[email protected]>

* review updates. add code todo

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>
Co-authored-by: Josh van Leeuwen <[email protected]>

* Retry clients if first try fails (#29)

* continuously retry scheduler clients if it fails upon the first try

Signed-off-by: Cassandra Coyle <[email protected]>

* Apply suggestions from code review

Co-authored-by: Josh van Leeuwen <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>

* fix indentation after UI committing

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>
Co-authored-by: Josh van Leeuwen <[email protected]>

* make lint

Signed-off-by: Cassandra Coyle <[email protected]>

* forgot to push this diff for e2e tests (#30)

Signed-off-by: Cassandra Coyle <[email protected]>

* Scheduler reminder preview feature (#31)

* initial feature flag

Signed-off-by: Cassandra Coyle <[email protected]>

* rename protos. add getReminder scheduler logic

Signed-off-by: Cassandra Coyle <[email protected]>

* fix typos

Signed-off-by: Cassandra Coyle <[email protected]>

* undo exporting the internal struct. added and exported func to call for reminderPeriod

Signed-off-by: Cassandra Coyle <[email protected]>

* fix typo

Signed-off-by: Cassandra Coyle <[email protected]>

* tweaks

Signed-off-by: Cassandra Coyle <[email protected]>

* total credit for this code goes to @artursouza

Signed-off-by: Cassandra Coyle <[email protected]>

* test for actor reminder scheduler preview feature

Signed-off-by: Cassandra Coyle <[email protected]>

* make lint

Signed-off-by: Cassandra Coyle <[email protected]>

* PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* v1.0 -> v1.0-alpha1 (#32)

* v1.0 -> v1.0-alpha1

Signed-off-by: Cassandra Coyle <[email protected]>

* v1.0-alpha1 for e2e app

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* revert some unintentional changes (#33)

Signed-off-by: Cassandra Coyle <[email protected]>

* make lint

Signed-off-by: Cassandra Coyle <[email protected]>

* try to fix e2e (#34)

* try to fix e2e

Signed-off-by: Cassandra Coyle <[email protected]>

* rm test changes since env var now

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* Adds actor reminder scheduler perf tests (#35)

* Adds actor reminder scheduler perf tests

Signed-off-by: joshvanl <[email protected]>

* Run daprd in debug mode for test relying on new debug line

Signed-off-by: joshvanl <[email protected]>

---------

Signed-off-by: joshvanl <[email protected]>

* Adds trigger performance tests (#36)

Signed-off-by: joshvanl <[email protected]>

* update rpc names, fix mtls test (#37)

Signed-off-by: Cassandra Coyle <[email protected]>

* Fix scheduler int tests (#38)

* fix idtypes test

Signed-off-by: Cassandra Coyle <[email protected]>

* tweaks to fix some tests

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* make tests pass consistently (#39)

Signed-off-by: Cassandra Coyle <[email protected]>

* fix endpoint test (#40)

Signed-off-by: Cassandra Coyle <[email protected]>

* add ttl and up schedule time to see if it fixes the test (#41)

Signed-off-by: Cassandra Coyle <[email protected]>

* fix %d -> %f for perf tests

Signed-off-by: Cassandra Coyle <[email protected]>

* Updates go-etcd-cron to main to include dequeue delete (#42)

Also updated the scheduler crud test to remove a race condition on testing
results by waiting longer to observe/not observe triggers.

Signed-off-by: joshvanl <[email protected]>

* fix go.sum

Signed-off-by: Cassandra Coyle <[email protected]>

* Feat dist scheduler perf tune (#43)

* Increase dueTime for reminder

Signed-off-by: joshvanl <[email protected]>

* Adds scheduler app as scope to config state store

Signed-off-by: joshvanl <[email protected]>

---------

Signed-off-by: joshvanl <[email protected]>

* Tune V2 (#44)

Signed-off-by: joshvanl <[email protected]>

* Enable HA for perf tests (#46)

Signed-off-by: joshvanl <[email protected]>

* rm ha (#47)

Signed-off-by: Cassandra Coyle <[email protected]>

* Enable scheduler reminders in workflow performance test (#48)

Signed-off-by: joshvanl <[email protected]>

* Enable scheduler reminders in workflow performance test (#49)

Signed-off-by: joshvanl <[email protected]>

* Fix workflow perf config option (#50)

Signed-off-by: joshvanl <[email protected]>

* Make runtime scheduler app channel use same interface. Set status code on HTTP send job to gRPC status code (#51)

Signed-off-by: joshvanl <[email protected]>

* Updates go-etcd-cron (#52)

Signed-off-by: joshvanl <[email protected]>

* Updates go.sum (#53)

Signed-off-by: joshvanl <[email protected]>

* Updates go-etcd-cron (#54)

Signed-off-by: joshvanl <[email protected]>

* Fix buildx from arm64 host OS.

Signed-off-by: Artur Souza <[email protected]>

* Force docker type for docker image.

Signed-off-by: Artur Souza <[email protected]>

* Enable scheduler for workflows (#55)

* Enable scheduler for workflows

Enables scheduler to work with actor workflows.

Updates go-etcd-cron library to fix job name validation errors, and
ensure actor reminders are triggered on due_time, even when a schedule
is set.

Executes `actorRuntime` reminder execution handling in runtime scheduler
client trigger handler, ensuring internal workflow actors are invoked.

Advertises internal workflow `dapr.internal.default.xyz.workflow` and
`dapr.internal.default.xyz.activity` actor types to scheduler to enable
receiving workflow actor reminders.

Adds scheduler workflow integration tests.

Moves scheduler actor reminders as an implementation of the actor
reminder interface, removing the no-op implementation.

Signed-off-by: joshvanl <[email protected]>

* Adds namespace to scheduler advertised internal work actor type

Signed-off-by: joshvanl <[email protected]>

---------

Signed-off-by: joshvanl <[email protected]>

* Adds CallActorReminder internal gRPC to hop reminder calls to correct host (#57)

Signed-off-by: joshvanl <[email protected]>

* Always use localnamespace in actor invocation, and use correct target AppID in reminder (#58)

Signed-off-by: joshvanl <[email protected]>

* Restart scheduler between workflow runs (#59)

Fix data race condition in remote scheduler integration test

Signed-off-by: joshvanl <[email protected]>

* rename type->JobTargetMetadata per proposal feedback (#56)

* rename type->kind per proposal feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* JobMetadataKind->JobTargetMetadata

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* Test 100k Workflows with Scheduler (#60)

* 100k workflows

Signed-off-by: Cassandra Coyle <[email protected]>

* just do the 100k workflows

Signed-off-by: Cassandra Coyle <[email protected]>

* update test.js

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* Scheduler 100k workflows (#61)

* 100k workflows

Signed-off-by: Cassandra Coyle <[email protected]>

* just do the 100k workflows

Signed-off-by: Cassandra Coyle <[email protected]>

* update test.js

Signed-off-by: Cassandra Coyle <[email protected]>

* up timing and mem

Signed-off-by: Cassandra Coyle <[email protected]>

* 1k->100

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* up all mem limits (#62)

Signed-off-by: Cassandra Coyle <[email protected]>

* try rm-ing ctx timeout for actor reminder for workflow (#63)

Signed-off-by: Cassandra Coyle <[email protected]>

* Scheduler fix workflow perf (#65)

* update existing scenarios to test 100k workflows

Signed-off-by: Cassandra Coyle <[email protected]>

* only do it for 2 of tests

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* add rate (#66)

Signed-off-by: Cassandra Coyle <[email protected]>

* 100k -> 10k (#67)

Signed-off-by: Cassandra Coyle <[email protected]>

* upping/rm every timeout I could find

Signed-off-by: Cassandra Coyle <[email protected]>

* up timeout (#68)

Signed-off-by: Cassandra Coyle <[email protected]>

* reset workflow timeouts and vals/config (#69)

Signed-off-by: Cassandra Coyle <[email protected]>

* Update charts/dapr/charts/dapr_scheduler/templates/dapr_scheduler_poddisruptionbudget.yaml

Signed-off-by: Alessandro (Ale) Segala <[email protected]>

* Scheduler fix int tests: quorum/notls on windows (#64)

* up time for windows int tests

Signed-off-by: Cassandra Coyle <[email protected]>

* up int test times to pass

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* Update pkg/api/errors/errors.go

Co-authored-by: Alessandro (Ale) Segala <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>

* Update tests/perf/actor_reminder/actor_reminder_test.go

Co-authored-by: Mike Nguyen <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>

* gofmt

Signed-off-by: Cassandra Coyle <[email protected]>

* Reset Workflow Config & Fix License (#71)

* fix license yr & fireAt -> fire_at from PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* fix proto spacing and dont use empty

Signed-off-by: Cassandra Coyle <[email protected]>

* update return type and rm deprecation warning

Signed-off-by: Cassandra Coyle <[email protected]>

* reset activity_actor.go diff

Signed-off-by: Cassandra Coyle <[email protected]>

* fix license and reset workflow config

Signed-off-by: Cassandra Coyle <[email protected]>

* fix flaky idtypes test, PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* rm legacy code bits per PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* uint64 -> string per PR feedback

Signed-off-by: Cassandra Coyle <[email protected]>

* reset listen-addr and uint64 type

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* rename scheduler api -> jobs api (#72)

Signed-off-by: Cassandra Coyle <[email protected]>

* update e2e app endpoint

Signed-off-by: Cassandra Coyle <[email protected]>

* Scheduler workflow perf test app + original workflow perf test app (#73)

* add another app to test scheduler instead of replacing the existing one

Signed-off-by: Cassandra Coyle <[email protected]>

* add : for new var

Signed-off-by: Cassandra Coyle <[email protected]>

* general cleanup of diff + grpc api -> jobs instead of scheduler

Signed-off-by: Cassandra Coyle <[email protected]>

* cleanup

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* PR feedback updates + up test time (#74)

* update rand -> crypto/rand

Signed-off-by: Cassandra Coyle <[email protected]>

* update uuid -> id per PR feedback + up test time

Signed-off-by: Cassandra Coyle <[email protected]>

* uuid -> id, rand -> id++

Signed-off-by: Cassandra Coyle <[email protected]>

* uuid -> id

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>

* up test time

Signed-off-by: Cassandra Coyle <[email protected]>

* change default listen addr, up timeouts

Signed-off-by: Cassandra Coyle <[email protected]>

* reset int test timeout

Signed-off-by: Cassandra Coyle <[email protected]>

* rm redundant err check, debug loglevel for notls- (I will revert this)

Signed-off-by: Cassandra Coyle <[email protected]>

* mv err log to actual err conditional block

Signed-off-by: Cassandra Coyle <[email protected]>

* add getJobFn

Signed-off-by: Cassandra Coyle <[email protected]>

* localhost -> 127.0.0.1

Signed-off-by: Cassandra Coyle <[email protected]>

* trying @JoshVanL suggested changes to fix CI failures...TY

Signed-off-by: Cassandra Coyle <[email protected]>

* make lint

Signed-off-by: Cassandra Coyle <[email protected]>

* tweak notls: localhost->127.0.0.1

Signed-off-by: Cassandra Coyle <[email protected]>

* tweak ctx handling. add debug loglevel (will revert)

Signed-off-by: Cassandra Coyle <[email protected]>

* idtypes debug (will revert)

Signed-off-by: Cassandra Coyle <[email protected]>

* notls debug loglevel again and 127.0.0.1 instead of localhost

Signed-off-by: Cassandra Coyle <[email protected]>

* acct for windows path being different in etcd. / vs \

Signed-off-by: Cassandra Coyle <[email protected]>

* revert int test debug log level

Signed-off-by: Cassandra Coyle <[email protected]>

* log line added to idtypes

Signed-off-by: Cassandra Coyle <[email protected]>

* rm log for testing

Signed-off-by: Cassandra Coyle <[email protected]>

* rm log levels

Signed-off-by: Cassandra Coyle <[email protected]>

* Move OnJobEvent to OnJobEventAlpha1 (#75)

Signed-off-by: Artur Souza <[email protected]>

* Fix pubsub E2E. (#76)

Signed-off-by: Artur Souza <[email protected]>

* make lint

Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>
Signed-off-by: Artur Souza <[email protected]>
Signed-off-by: Cassie Coyle <[email protected]>
Signed-off-by: joshvanl <[email protected]>
Signed-off-by: Josh van Leeuwen <[email protected]>
Signed-off-by: Elena Kolevska <[email protected]>
Signed-off-by: Alessandro (Ale) Segala <[email protected]>
Co-authored-by: Artur Souza <[email protected]>
Co-authored-by: Josh van Leeuwen <[email protected]>
Co-authored-by: Elena Kolevska <[email protected]>
Co-authored-by: Dapr Bot <[email protected]>
Co-authored-by: Alessandro (Ale) Segala <[email protected]>
Co-authored-by: Mike Nguyen <[email protected]>