
etcd revision occurs Inconsistent #13654

Closed
didihongsheng opened this issue Jan 28, 2022 · 5 comments
Labels
priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@didihongsheng

etcd version : 3.5.0
k8s version : 1.20

We are running in a production environment. The k8s log below shows that the revision has a problem:

image

So we checked the endpoint status in the etcd cluster, as shown below:
image

The etcd raft index looks healthy, but the revision is inconsistent: one node is far behind. I know the revision is maintained in mvcc, but normally, if the raft index is healthy across the cluster, the revisions should not diverge like this. I don't know the reason.
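The symptom described above (raft index in step, revision far behind on one member) can be checked mechanically. Below is a minimal sketch, not an etcd tool: the `MemberStatus` type, the `find_revision_outliers` helper, the thresholds, and the sample numbers are all hypothetical. The values from the original screenshots are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class MemberStatus:
    """Per-member values as read from an endpoint status report (hypothetical type)."""
    endpoint: str
    raft_index: int
    revision: int


def find_revision_outliers(members, max_raft_lag=100, max_revision_lag=1000):
    """Return endpoints whose raft index keeps up while their revision lags.

    Both thresholds are illustrative assumptions, not etcd defaults.
    """
    top_raft = max(m.raft_index for m in members)
    top_rev = max(m.revision for m in members)
    outliers = []
    for m in members:
        raft_ok = top_raft - m.raft_index <= max_raft_lag
        rev_behind = top_rev - m.revision > max_revision_lag
        if raft_ok and rev_behind:
            outliers.append(m.endpoint)
    return outliers


# Hypothetical sample data: the third member applies raft entries
# but its mvcc revision has fallen far behind the other two.
members = [
    MemberStatus("10.0.0.1:2379", raft_index=5_000_120, revision=3_200_450),
    MemberStatus("10.0.0.2:2379", raft_index=5_000_118, revision=3_200_449),
    MemberStatus("10.0.0.3:2379", raft_index=5_000_119, revision=3_150_002),
]
print(find_revision_outliers(members))
```

A small difference in raft index or revision between members is expected while writes are in flight, which is why the sketch compares against thresholds rather than requiring equality.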

@ptabor added the priority/important-soon label Jan 31, 2022
@ahrtr
Member

ahrtr commented Feb 7, 2022

Is there a reason you set up an etcd 3.5.0 cluster for K8s 1.20? Note that K8s 1.20 is supposed to work with etcd 3.4.13, see go.mod#L447. There is a similar issue, 13547.

Can you always see this issue when running K8s 1.20 together with etcd 3.5.0?

@serathius
Member

@didihongsheng One possibility is that you are running etcd instances with slightly different apply logic. This could be caused by an incorrectly performed cluster upgrade.

Could you specify whether this cluster was created on v3.5.0 or upgraded to it? Which versions did you run before this? Did you have any problems during the upgrade?

@didihongsheng
Author

"Can you always see this issue ..."

This issue appeared only once, when the K8s control plane crashed. I then removed the problematic member, deleted its data directory, and added it back with member add.

@didihongsheng
Author

The version before 3.5 was 3.3. We upgraded from 3.3 to 3.4, then from 3.4 to 3.5, and I didn't run into any problems.

@serathius
Member

A data corruption issue was found in the v3.5.[0-2] releases. Please upgrade to v3.5.3.
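For anyone auditing several clusters, the affected range stated above (v3.5.0 through v3.5.2) can be turned into a quick check. This is only a sketch: `needs_upgrade` is a hypothetical helper, and it assumes plain `X.Y.Z` version strings with an optional leading `v`.

```python
import re


def needs_upgrade(version: str) -> bool:
    """Return True if the version falls in the v3.5.0 to v3.5.2 range named in this thread."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor) == (3, 5) and patch <= 2


# The reporter's cluster version versus the recommended fix version.
for v in ("3.5.0", "v3.5.3"):
    print(v, needs_upgrade(v))
```

The check covers only the range mentioned in this thread and says nothing about other release lines.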
