After more investigation I found that the Linkerd CNI script monitors only specific events (CREATE and DELETE), and these are not triggered when the calico pod restarts.
monitor() {
  inotifywait -m "${HOST_CNI_NET}" -e create,delete |
    while read -r directory action filename; do
      if [[ "$filename" =~ .*.(conflist|conf)$ ]]; then
        echo "Detected change in $directory: $action $filename"
        sync "$filename" "$action" "$cni_conf_sha"
        # When file exists (i.e we didn't deal with a DELETE ev)
        # then calculate its sha to be used the next turn.
        if [[ -e "$directory/$filename" && "$action" != 'DELETE' ]]; then
          cni_conf_sha="$(sha256sum "$directory/$filename" | while read -r s _; do echo "$s"; done)"
        fi
      fi
    done
}
These are the events on the CNI directory when I reproduce the issue by killing calico-node on the node where Linkerd CNI runs:
root@linkerd-cni-9h2rc:/linkerd $ inotifywait -m /host/etc/cni/net.d
Setting up watches.
Watches established.
/host/etc/cni/net.d/ OPEN,ISDIR
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE,ISDIR
/host/etc/cni/net.d/ MODIFY calico-kubeconfig
/host/etc/cni/net.d/ OPEN calico-kubeconfig
/host/etc/cni/net.d/ MODIFY calico-kubeconfig
/host/etc/cni/net.d/ CLOSE_WRITE,CLOSE calico-kubeconfig
/host/etc/cni/net.d/ OPEN,ISDIR
/host/etc/cni/net.d/ ACCESS,ISDIR
/host/etc/cni/net.d/ MODIFY 10-calico.conflist
/host/etc/cni/net.d/ OPEN 10-calico.conflist
/host/etc/cni/net.d/ ACCESS,ISDIR
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE,ISDIR
/host/etc/cni/net.d/ MODIFY 10-calico.conflist
/host/etc/cni/net.d/ CLOSE_WRITE,CLOSE 10-calico.conflist
/host/etc/cni/net.d/ OPEN 10-calico.conflist
/host/etc/cni/net.d/ ACCESS 10-calico.conflist
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE 10-calico.conflist
/host/etc/cni/net.d/ OPEN 10-calico.conflist
/host/etc/cni/net.d/ ACCESS 10-calico.conflist
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE 10-calico.conflist
/host/etc/cni/net.d/ OPEN,ISDIR
/host/etc/cni/net.d/ ACCESS,ISDIR
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE,ISDIR
/host/etc/cni/net.d/ OPEN 10-calico.conflist
/host/etc/cni/net.d/ ACCESS 10-calico.conflist
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE 10-calico.conflist
/host/etc/cni/net.d/ OPEN,ISDIR
/host/etc/cni/net.d/ ACCESS,ISDIR
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE,ISDIR
/host/etc/cni/net.d/ OPEN 10-calico.conflist
/host/etc/cni/net.d/ ACCESS 10-calico.conflist
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE 10-calico.conflist
/host/etc/cni/net.d/ OPEN,ISDIR
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE,ISDIR
/host/etc/cni/net.d/ MODIFY calico-kubeconfig
/host/etc/cni/net.d/ OPEN calico-kubeconfig
/host/etc/cni/net.d/ MODIFY calico-kubeconfig
/host/etc/cni/net.d/ CLOSE_WRITE,CLOSE calico-kubeconfig
/host/etc/cni/net.d/ OPEN,ISDIR
/host/etc/cni/net.d/ ACCESS,ISDIR
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE,ISDIR
/host/etc/cni/net.d/ OPEN 10-calico.conflist
/host/etc/cni/net.d/ ACCESS 10-calico.conflist
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE 10-calico.conflist
/host/etc/cni/net.d/ OPEN,ISDIR
/host/etc/cni/net.d/ ACCESS,ISDIR
/host/etc/cni/net.d/ ACCESS,ISDIR
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE,ISDIR
/host/etc/cni/net.d/ OPEN 10-calico.conflist
/host/etc/cni/net.d/ ACCESS 10-calico.conflist
/host/etc/cni/net.d/ CLOSE_NOWRITE,CLOSE 10-calico.conflist
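The log contains no CREATE or DELETE events for 10-calico.conflist, only MODIFY and CLOSE_WRITE (plus reads), so the monitor never fires. I guess that if the Linkerd CNI script also monitored MODIFY events, that could solve this issue.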
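To make the suggestion concrete, here is a rough sketch of what the change might look like. It is not the actual Linkerd fix. `handle_events` and `on_cni_change` are hypothetical names I made up for illustration (`on_cni_change` stands in for the existing `sync` and sha logic), and the regex with an escaped dot is my own assumption, not what the script currently uses.

```shell
#!/usr/bin/env bash
# Sketch only: monitor() extended to react to MODIFY as well as CREATE/DELETE.
# HOST_CNI_NET is taken from the original script; the default path below is
# the one seen in the inotifywait output above.
HOST_CNI_NET="${HOST_CNI_NET:-/host/etc/cni/net.d}"

# Hypothetical placeholder for the real sync + sha256 handling.
on_cni_change() {
  local directory="$1" action="$2" filename="$3"
  echo "Detected change in $directory: $action $filename"
}

# Reads "directory action filename" lines (the default `inotifywait -m`
# output format) from stdin and reacts only to CNI config files.
handle_events() {
  local directory action filename
  while read -r directory action filename; do
    # Assumption: escaped dot, so only real .conf/.conflist suffixes match.
    if [[ "$filename" =~ \.(conflist|conf)$ ]]; then
      on_cni_change "$directory" "$action" "$filename"
    fi
  done
}

monitor() {
  # Same watch as before, with `modify` added to the event list.
  inotifywait -m "${HOST_CNI_NET}" -e create,delete,modify | handle_events
}
```

With this, the `MODIFY 10-calico.conflist` lines from the log would reach the handler, while `calico-kubeconfig` would still be ignored. Note that the log shows two MODIFY events per write but a single CLOSE_WRITE, so watching `close_write` instead of `modify` might avoid handling the same write twice; I have not tested either variant against the real `sync` function.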