Linkerd multicluster: probe-gateway mirrored from cluster X has no endpoints

× probe services able to communicate with all gateway mirrors

  • probe-gateway-xxx.linkerd-multicluster mirrored from cluster [xxx] has no endpoints
  • see https://linkerd.io/2/checks/#l5d-multicluster-gateways-endpoints for hints

Got the error above - how do I troubleshoot this? I am sorry, but the description in the docs is not really helpful.


Hi @leonids2005 - can you confirm if the probe-gateway has endpoints on the remote clusters?

You can check whether both Kubernetes and Linkerd see the endpoints in the remote cluster (where the service actually lives before being mirrored), and ensure they agree, by running the following:

linkerd dg endpoints <probe_gateway_svc_name>.<namespace>.svc.cluster.local && \
  kubectl get endpoints <probe_gateway_svc_name> -n <namespace>

There are no endpoints in probe-gateway on either cluster… the question is how to troubleshoot this - why are they not there?

Maybe a bit more detail: my setup is a bit unusual. I use a local installation with OrbStack, Kind, and MetalLB. But everything works in “other” scenarios.

During the multicluster creation process, for some reason, the endpoints in the probe service are not created.

With OrbStack and Kind, you’ll likely have to edit the API server’s address. Kind sets up a port forward and sets the API server address in your $KUBECONFIG to something like “https://127.0.0.1:59240” – and that won’t work when your other cluster tries to use it to find mirrored Services.

The simplest way to do this is to use the IP address of the Kind control plane Node:

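# $clustername below is the name kubectl uses for the cluster (e.g. "kind-foo" for a Kind cluster named "foo")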
NODEADDR=$(kubectl get nodes -ojsonpath='{.items[0].status.addresses[0].address}')
kubectl config set clusters.$clustername.server "https://${NODEADDR}:6443"

(If you have a multinode Kind cluster, you may need to be careful about which Node you choose.)

Do that before linkerd mc link, or delete the Link resources and then rerun the linkerd mc link command. Let us know how it goes!
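
For example, a minimal sketch of the relink, assuming kubectl contexts named kind-east and kind-west (your context, cluster, and Link names will differ):

# remove the stale Link for "east" from the cluster that mirrors it
kubectl --context=kind-west delete link east -n linkerd-multicluster

# regenerate the Link now that the kubeconfig has the corrected API server address
linkerd --context=kind-east multicluster link --cluster-name east | \
  kubectl --context=kind-west apply -f -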


If I need bidirectional communication, I need to run this for every cluster before linking the clusters. I use a one-node cluster… but if I need more, how do I choose? Is there a flag in the link command to override the address?

There is a flag for the link command to override it, yes (--api-server-address). I’ve gotten into the habit of resetting the API server address on Kind clusters just so I don’t have to remember the flag – you do it once, right after creating the cluster, and it lasts for the life of the cluster. 🙂
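
If you’d rather use the flag, here’s a sketch reusing the NODEADDR variable from above (the kind-east/kind-west context names are hypothetical):

linkerd --context=kind-east multicluster link --cluster-name east \
  --api-server-address="https://${NODEADDR}:6443" | \
  kubectl --context=kind-west apply -f -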

With multiple-Node Kind clusters, I’m pretty sure you use the control plane Node rather than any worker Nodes.

Yeah, just tried creating a three-Node cluster named foo, and this worked:

NODEADDR=$(kubectl get node foo-control-plane -ojsonpath='{.status.addresses[0].address}')
kubectl config set clusters.kind-foo.server "https://${NODEADDR}:6443"

Note that to kubectl the cluster name is kind-foo, even though Kind thinks of it as just foo, and also that the jsonpath changes because I’m asking for a single Node by name rather than for all Nodes and taking the first.
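
If you’re not sure which name kubectl is using, you can list them:

kubectl config get-clusters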


I have exactly the same issue, but on GKE. I already have two clusters and exactly followed the steps in https://linkerd.io/2.15/tasks/multicluster/#install-linkerd

But after running this command, I got a probe error:
linkerd --context=west multicluster check

linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
        * east
√ remote cluster access credentials are valid
        * east
√ clusters share trust anchors
        * east
√ service mirror controller has required permissions
        * east
√ service mirror controllers are running
        * east
× probe services able to communicate with all gateway mirrors
        probe-gateway-east.linkerd-multicluster mirrored from cluster [east] has no endpoints
    see https://linkerd.io/2/checks/#l5d-multicluster-gateways-endpoints for hints
√ multicluster extension proxies are healthy
√ multicluster extension proxies are up-to-date
√ multicluster extension proxies and cli versions match

Status check results are ×

What have I missed?

cluster west => 3 nodes
cluster east => 3 nodes

I also get ALIVE False from this command:

linkerd --context=west multicluster gateways

CLUSTER  ALIVE    NUM_SVC      LATENCY  
east     False          0            - 

My feeling is that we have the same problem here - we need to make sure the address is set correctly and points to the control plane node’s IP address.

Yeah. When you’re looking at multicluster problems, here’s my top three list of things that get folks in trouble:

  1. Your clusters don’t actually have network connectivity.
    This is generally more of an issue with pod-to-pod than with gateway-based multicluster, but it is always the first thing to check. In all cases, a pod in one cluster needs to be able to talk to the external IP of the linkerd-gateway Service in the linkerd-multicluster namespace of the other clusters, and to the API servers in the other clusters. For pod-to-pod, additionally, a pod in one cluster needs to be able to communicate directly with a pod in the others.

  2. Your clusters don’t have good addresses for the other clusters’ API servers.
    If you’re trying to set up multicluster between cluster A and cluster B, then you need to link them together, which means that in cluster A you’ll apply a Link object that contains (among other things) a set of Kubernetes credentials to use when asking questions of the API server in cluster B. While kubectl might be using port forwarding to talk to the cluster B API server via 127.0.0.1, that won’t work for the Link object. It needs cluster B’s real API server address, nothing else (the sketch after this list shows one way to check which address a Link is actually using).

  3. Your clusters don’t have a shared trust anchor.
    For mTLS (and thus Linkerd) to work, the clusters have to have a linked trust hierarchy. By far the simplest way to do that is for both clusters to share the same trust anchor.
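
Here’s a minimal sketch of one way to check all three from west against east. It assumes kubectl contexts named west and east, a Link named east, and Linkerd’s default object names (the cluster-credentials-east Secret and linkerd-identity-trust-roots ConfigMap names may differ in your setup):

# 1. Connectivity: find east's gateway external IP and probe the default gateway port (4143)
#    (on some load balancers the address is under ...hostname rather than ...ip)
GATEWAY_IP=$(kubectl --context=east get svc linkerd-gateway -n linkerd-multicluster \
  -ojsonpath='{.status.loadBalancer.ingress[0].ip}')
kubectl --context=west run net-test --rm -it --image=busybox --restart=Never -- \
  nc -zv -w 5 "$GATEWAY_IP" 4143

# 2. API server address: the Link's credentials Secret embeds a kubeconfig, and its
#    server: entry must be east's real API server address, not 127.0.0.1
kubectl --context=west get secret cluster-credentials-east -n linkerd-multicluster \
  -ojsonpath='{.data.kubeconfig}' | base64 -d | grep 'server:'

# 3. Shared trust anchor: the trust bundles in the two clusters should be identical
diff \
  <(kubectl --context=west get cm linkerd-identity-trust-roots -n linkerd \
      -ojsonpath='{.data.ca-bundle\.crt}') \
  <(kubectl --context=east get cm linkerd-identity-trust-roots -n linkerd \
      -ojsonpath='{.data.ca-bundle\.crt}')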

So those things are where I would start. @mojtaba, if you check the troubleshooting URL, one of the things it lists is a problem with the external IP of the gateway service in the target cluster…
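
A quick way to see whether that Service actually got an external IP (again assuming an east context name):

kubectl --context=east get svc linkerd-gateway -n linkerd-multicluster

If EXTERNAL-IP shows <pending> or <none> there, the probe has nothing to reach.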

cluster needs to be able to talk to the external IP of the linkerd-gateway

But as I understand it, connectivity with the “external” cluster’s control plane is also required to get the information for replication.

@leonids2005 🤦 You’re correct. I’ve edited my answer above to include that, mea culpa!

For the first item, we checked that it is connected; we tested using the connectivity test in gcloud.

For the second item, could you show me step by step how we can check it in practice? Because when I look at the config in ~/.kube/config, the IPs seem to be there and there is no 127.0.0.1.

For the last item you mentioned, I exactly followed the official document to create the self-signed certificates => Multi-cluster communication | Linkerd
Is there anything else needed for that part?

Sorry, one more question. On my local setup with Kind clusters, I had to reapply certificates daily, although my certificates are not expired. What could I be doing wrong here?

After making it work successfully in a local environment, we are facing a problem running it on GKE clusters. The problem is that they are configured as private clusters, so connectivity is a problem. I am sure we are not the first people to have to solve this - can you please recommend an approach to setting up these clusters/connectivity?

If we cannot make the control plane available to the remote cluster, we can probably manually “mirror” the services. Are there any hidden problems with this approach?