Installation Issue 2024.6.4

Installed linkerd-cni 2024.6.4 via Helm - everything looks good: pods are 2/2 with the repair controller enabled.
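
For reference, this is how I'm checking that the CNI DaemonSet and its pods are healthy (no chart-specific names assumed):

kubectl -n linkerd-cni get ds,pods -o wide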

But when installing Linkerd itself, it keeps getting stuck on linkerd-network-validator. We're running on rke2 (k3s). Using Helm for the installs:

helm install --debug --namespace linkerd linkerd2 \
  --set controllerImage=oursite:20443/linkerd/controller \
  --set debugContainer.image.name=oursite:20443/linkerd/debug \
  --set global.proxy.image.name=oursite:20443/linkerd/proxy \
  --set global.proxyInit.image.name=oursite:20443/linkerd/proxy-init \
  --set-file identityTrustAnchorsPEM=./goodlocation/linkerd-root-ca.crt \
  --set-file identity.issuer.tls.crtPEM=./goodlocation/linkerd-issuer.crt \
  --set-file identity.issuer.tls.keyPEM=./goodlocation/linkerd-issuer.key \
  --set identity.issuer.crtExpiry="2025-07-03T09:29:18Z" \
  ./ \
  --values ./values/dev/values.yaml \
  --set cniEnabled=true

Specifically, the linkerd-destination pod reports an incomplete status for linkerd-network-validator with this error: Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox - plugin type="linkerd-cni" name="linkerd-cni" failed (add): exit status 127
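
For what it's worth, exit status 127 is conventionally the shell's "command not found" code, so my working theory is that the linkerd-cni binary (or something it execs) can't be found or run on the node. A sanity check I'm trying directly on the node - paths are just the chart defaults from the ConfigMap below:

file /opt/cni/bin/linkerd-cni
# running it by hand should produce a CNI usage/config error, not exit 127
/opt/cni/bin/linkerd-cni </dev/null; echo "exit code: $?"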

Has anybody seen this? Although linkerd-cni itself looks okay with no log issues, I'm guessing there's some sort of configuration or linkage problem.

I'm speculating because I don't think it is creating the 05-linkerd-cni.conflist (or anything similar)…

However, ZZZ-linkerd-cni-kubeconfig exists under /etc/cni/net.d, and the linkerd-cni binary is in /opt/cni/bin.
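
To confirm that, I'm listing both CNI dirs on the node; the jq filter is just a convenience to see whether linkerd-cni merged itself into an existing conflist as an extra plugin entry:

ls -l /etc/cni/net.d /opt/cni/bin
jq '.plugins[].type' /etc/cni/net.d/*.conflist

One caveat I'm unsure about: since this is rke2, containerd may be pointed at /var/lib/rancher/rke2/agent/etc/cni/net.d rather than /etc/cni/net.d, so I'm also double-checking which directories the runtime is actually configured with.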

Thoughts?

More info:

From the events:
(combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "4b…35": plugin type="linkerd-cni" name="linkerd-cni" failed (add): exit status 127

apiVersion: v1
data:
  cni_network_config: |-    
    {
      "name": "linkerd-cni",
      "type": "linkerd-cni",
      "log_level": "debug",
      "policy": {
          "type": "k8s",
          "k8s_api_root": "https://kubernetes.default.svc:443",
          "k8s_auth_token": "/var/run/secrets/kubernetes.io/serviceaccount/token"
      },
      "kubernetes": {
          "kubeconfig": "/etc/cni/net.d/ZZZ-linkerd-cni-kubeconfig"
      },
      "linkerd": {
        "incoming-proxy-port": 4143,
        "outgoing-proxy-port": 4140,
        "proxy-uid": 2102,
        "ports-to-redirect": [],
        "inbound-ports-to-ignore": ["4191","4190"],
        "simulate": false,
        "use-wait-flag": false,
        "iptables-mode": "legacy",
        "ipv6": false
      }
    }
  dest_cni_bin_dir: /opt/cni/bin
  dest_cni_net_dir: /etc/cni/net.d
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: linkerd-cni
    meta.helm.sh/release-namespace: linkerd-cni
  creationTimestamp: "2024-07-03T19:45:27Z"
  labels:
    app.kubernetes.io/managed-by: Helm
    linkerd.io/cni-resource: "true"
  name: linkerd-cni-config
  namespace: linkerd-cni
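
Given dest_cni_net_dir above, the installer should be writing a conflist into /etc/cni/net.d. To see what the install container actually sees on the host, I'm running the below - the /host/... mount path and the install-cni container name are assumptions based on the stock chart, so adjust if yours differ:

kubectl -n linkerd-cni exec ds/linkerd-cni -c install-cni -- ls -l /host/etc/cni/net.d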

linkerd check just gives me this:
kubernetes-api

√ can initialize the client
√ can query the Kubernetes API

kubernetes-version

√ is running the minimum Kubernetes API version

linkerd-existence

√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods

  • No running pods for "linkerd-destination"

…and that last step just keeps spinning.
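
In the meantime, I'm pulling the destination pod's events directly (the label selector is the standard control-plane one, as far as I know):

kubectl -n linkerd get pods -l linkerd.io/control-plane-component=destination
kubectl -n linkerd describe pod -l linkerd.io/control-plane-component=destination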

Are you using Cilium as the CNI by any chance?

If not, are there CNI logs we can look at?
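
For example, something along these lines (names assume the stock linkerd-cni DaemonSet, so adjust to your setup):

kubectl -n linkerd-cni logs ds/linkerd-cni --all-containers --tail=200
# node-level view; on rke2, kubelet/containerd run under the rke2-agent unit (rke2-server on server nodes)
journalctl -u rke2-agent --since "1 hour ago" | grep -i cni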