Automatically retry lost/timed out LIFX requests #91157

Merged
merged 30 commits into from
Apr 16, 2023


bdraco
Member

@bdraco bdraco commented Apr 10, 2023

Proposed change

The LIFX integration now allows an update to take up to 90s and retries update requests up to 5 times before declaring the device unavailable, to account for UDP drops and device connection instability.

Previously, the integration declared the device unavailable after 5 attempts, with a maximum response timeout of 1.65s per message and a total timeout of 9 seconds.
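As a rough illustration of the behaviour described above (not the code in this PR), the following sketch retries a request a fixed number of times, bounding each attempt with its own timeout and the whole exchange with an overall deadline. The names `request_with_retries`, `send`, `per_attempt_timeout` and `overall_timeout` are hypothetical, and the default per-attempt value is an arbitrary placeholder; only the 5 attempts and the 90s budget come from the description.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def request_with_retries(
    send: Callable[[], Awaitable[T]],
    attempts: int = 5,
    per_attempt_timeout: float = 1.0,  # placeholder value, not taken from the PR
    overall_timeout: float = 90.0,
) -> T:
    """Call send() until it answers or the attempt/deadline budget is spent."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + overall_timeout
    for _ in range(attempts):
        remaining = deadline - loop.time()
        if remaining <= 0:
            break  # overall budget exhausted before all attempts were used
        try:
            # Never wait longer than what is left of the overall budget.
            return await asyncio.wait_for(
                send(), timeout=min(per_attempt_timeout, remaining)
            )
        except asyncio.TimeoutError:
            continue  # treat as a lost UDP message and send it again
    raise asyncio.TimeoutError(f"no response after up to {attempts} attempts")
```

A caller would mark the device unavailable only when this raises, so a single dropped message costs one extra attempt rather than an availability flap.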

#78876 (comment)

fixes #78876

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • I have followed the perfect PR recommendations
  • The code has been formatted using Black (black --fast homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
  • Untested files have been added to .coveragerc.

To help with the load of incoming pull requests:

@home-assistant

Hey there @Djelibeybi, mind taking a look at this pull request as it has been labeled with an integration (lifx) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of lifx can trigger bot actions by commenting:

  • @home-assistant close Closes the pull request.
  • @home-assistant rename Awesome new title Renames the pull request.
  • @home-assistant reopen Reopen the pull request.
  • @home-assistant unassign lifx Removes the current integration label and assignees on the pull request, add the integration domain after the command.

@bdraco
Member Author

bdraco commented Apr 10, 2023

There are still some issues with the lib:

The fire-and-forget background task can be garbage collected at any time because no reference is held to it, so it's just going to fail randomly.

The wait_for needs to be changed to an async_timeout.
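The comment does not spell out the reason, so the following only shows the shape of such a change: the awaited reply is bounded by a timeout context manager instead of being passed to `asyncio.wait_for`. `await_reply` is a hypothetical helper; the import falls back to the third-party `async_timeout` package on interpreters older than Python 3.11, where `asyncio.timeout` does not exist.

```python
import asyncio
from typing import Awaitable, Optional

try:
    from asyncio import timeout  # Python 3.11+
except ImportError:
    from async_timeout import timeout  # third-party package on older Pythons


async def await_reply(reply: Awaitable[bytes], seconds: float) -> Optional[bytes]:
    """Wait for reply inside the calling task; return None if it times out."""
    try:
        async with timeout(seconds):
            return await reply
    except asyncio.TimeoutError:
        return None
```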

But all of this can be done in parallel

@bdraco bdraco changed the title Automatically retry failed LIFX requests Automatically retry lost/timesLIFX requests Apr 10, 2023
@bdraco bdraco changed the title Automatically retry lost/timesLIFX requests Automatically retry lost/timed out LIFX requests Apr 10, 2023
@bdraco
Member Author

bdraco commented Apr 10, 2023

I should be able to pare this down a bit since the lib has retry support built in, but it doesn't have much flexibility to configure retries per request type.

Probably need to fix some of the underlying issues with the lib to make it reliable as well

@bdraco bdraco marked this pull request as ready for review April 10, 2023 18:32
@frenck frenck added the smash label (indicator this PR is close to finish for merging or closing) Apr 11, 2023
@Djelibeybi
Contributor

It's probably worth waiting for aiolifx to incorporate all your PRs so that this can be refactored accordingly.

@bdraco
Member Author

bdraco commented Apr 12, 2023

Unless I'm missing something, I don't think any of the PRs I have open to aiolifx will materially change this. They should make it more reliable since the tasks will be less likely to be prematurely garbage collected, but with the retry logic, even if that does happen, the next attempt would hopefully take care of it, assuming it didn't get prematurely garbage collected as well.

@bdraco
Member Author

bdraco commented Apr 13, 2023

aiolifx bump #91324

@frenck frenck merged commit 9625444 into dev Apr 16, 2023
@frenck frenck deleted the lifx_retries branch April 16, 2023 12:27
@bdraco
Member Author

bdraco commented Apr 16, 2023

Thanks

@github-actions github-actions bot locked and limited conversation to collaborators Apr 17, 2023
Successfully merging this pull request may close these issues.

Lifx integration with many devices frequently goes unavailable
3 participants