Pods stuck in Terminating or Init:0/2 state after file share went down and came back up #693
Comments
@andyzhangx: should be related to this bug: kubernetes/kubernetes#121851; force-deleting the pod can work around it.

@andyzhangx: @achaikaJH there is already a PR to fix it: #694

@achaikaJH: @andyzhangx Sorry it took me so long to respond. Thank you for your help with this issue!
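
For reference, the force-delete workaround mentioned in the comments amounts to a single kubectl command; the pod and namespace names below are hypothetical placeholders, not anything from this issue:

```sh
# Force-delete the stuck pod (hypothetical names); --grace-period=0 --force
# removes the pod object from the API server without waiting for the kubelet
# to confirm that the SMB mount was torn down.
kubectl delete pod my-app-pod -n my-namespace --grace-period=0 --force
```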
What happened:
I'm using the csi-smb driver to connect to Windows DFS file shares from multiple pods and clusters on Azure AKS v1.26.3. The same share path might be assigned to different pods, and sometimes the source for the PV varies but ultimately resolves to the same DFS root. For example (the original PV specs are not shown here; an illustrative sketch follows the list):
PV1:
PV2:
PV3:
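
For illustration, a PV of this shape might look like the sketch below. This is an assumption-laden example, not the reporter's actual config: the server, share path, secret name, and volumeHandle are all made up; only the driver name smb.csi.k8s.io comes from the csi-smb driver itself.

```sh
# Hypothetical SMB PV pointing at a DFS path (all names are assumptions).
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  mountOptions:
    - dir_mode=0777
    - file_mode=0777
  csi:
    driver: smb.csi.k8s.io
    volumeHandle: pv1-dfs-share        # must be unique across PVs
    volumeAttributes:
      source: //dfs.example.com/root/share
    nodeStageSecretRef:
      name: smbcreds
      namespace: default
EOF
```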
Last week the DFS server went down for several minutes and then came back up, but the PVs in the clusters never recovered. If I try to restart a pod, it goes into Terminating state and stays there forever; meanwhile the new pod goes into Init:0/2 state and is also stuck.
Events from the pod stuck in "Terminating" state:
Events from the pod stuck in "Init:0/2" state:
Events from the csi-smb-node smb container:
I noticed that there have been no new events since Friday.
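
The events and logs referenced above would typically be collected with standard kubectl commands along these lines (pod and namespace names are hypothetical):

```sh
# Pod-level events for the stuck pods.
kubectl describe pod my-app-pod -n my-namespace
# All recent events in the namespace, oldest first.
kubectl get events -n my-namespace --sort-by=.lastTimestamp
# Logs from the smb container of the csi-smb-node DaemonSet pod on the node.
kubectl logs csi-smb-node-xxxxx -n kube-system -c smb
```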
One workaround I found: if I cordon the node and force-delete the pod, it starts on another node, but only if that node didn't previously have this share mounted. Spelled out with hypothetical node and pod names, that looks like the sketch below.
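
```sh
# Keep new pods off the affected node, then force-delete the stuck pod so the
# controller reschedules it onto another node (node/pod names are made up).
kubectl cordon aks-nodepool1-12345678-vmss000000
kubectl delete pod my-app-pod -n my-namespace --grace-period=0 --force
# Re-enable scheduling once the replacement pod is running elsewhere.
kubectl uncordon aks-nodepool1-12345678-vmss000000
```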
What you expected to happen:
I expected the SMB connection to reconcile after the file share became available again.
How to reproduce it:
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`): 1.26.3
- Kernel (e.g. `uname -a`): 5.15.0-1041-azure