Services cannot talk to mirrored services

Hi, new to Linkerd so I need some extra explanations.

Currently, I am trying to set up 2 clusters that can talk to each other to run some experiments. The idea is to take the sock-shop / hotel reservation microservice apps and deploy half the services on one cluster and the other half on the other cluster. The current setup is 2 Ubuntu VMs running K3s. I’ve followed the tutorials and got the east → west connection working (though when I repeat the same steps from west to east, the connection does not work and I get complaints about ServiceAccounts, ClusterRoleBindings, Roles and RoleBindings). I export my services on east, and west can see them. I can ping the services from west. However, if I deploy the frontend service on west, it cannot communicate with the services in east. From the docs I would expect Linkerd to route the frontend’s requests to east in the background.

Any idea how to properly set this up so that the frontend in west can communicate with all services in east? (I get it that in a real scenario this probably makes no sense, however I need to be able to deploy different microservices on different clusters and ensure the app still works as a whole).

Thanks in advance. Any tips on a more robust setup are also appreciated, as the current linking is very unstable and needs to be redone almost every time the host computer reboots.

The first questions are always: what version of Linkerd are you using? and what does linkerd check say?

Given that we can take a look at what might be going on. :slight_smile:

I’ve slightly updated my setup: I now have north, east and west. Each is a VM in VirtualBox (host Windows 11, guest Ubuntu Server 24). The VMs are connected on a host-only network. I’m trying to link the clusters like this: north → east → west.

I realized the reason I could not get services to talk was that I was missing the traffic splitter. Now I can configure the traffic splitter so that requests are sent to another copy of the service on a different cluster. Nonetheless, my multicluster connectivity is very hit or miss. Sometimes running linkerd mc gateways returns True and sometimes False, even though I’m changing absolutely nothing.

Do you have any idea why it is so unstable? I spend the majority of time waiting for the moment where the gateway starts working again.

I had the two-cluster setup working for a moment, but then when I tried updating it to add north everything broke haha… Anyway, here are the details:

Linkerd version:

Client version: edge-25.7.2
Server version: edge-25.7.2

Linkerd check on east:

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready
√ cluster networks contains all node podCIDRs
√ cluster networks contains all pods
√ cluster networks contains all services

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ proxy-init container runs as root user if docker container runtime is used

linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor

linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ can retrieve the control plane version
√ control plane is up-to-date
√ control plane and cli versions match

linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
√ control plane proxies are up-to-date
√ control plane proxies and cli versions match

linkerd-extension-checks
------------------------
√ namespace configuration for extensions

It hangs here forever; nothing else is reported. I suppose it has something to do with linkerd mc check also hanging after √ probe services able to communicate with all gateway mirrors * west.

Linkerd check on west:

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready
√ cluster networks contains all node podCIDRs
√ cluster networks contains all pods
√ cluster networks contains all services

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ proxy-init container runs as root user if docker container runtime is used

linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor

linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ can retrieve the control plane version
√ control plane is up-to-date
√ control plane and cli versions match

linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
√ control plane proxies are up-to-date
√ control plane proxies and cli versions match

linkerd-extension-checks
------------------------
√ namespace configuration for extensions

linkerd-multicluster
--------------------
√ Link CRD exists
√ multicluster extension proxies are healthy
√ multicluster extension proxies are up-to-date
√ multicluster extension proxies and cli versions match

linkerd-viz
-----------
√ linkerd-viz Namespace exists
√ can initialize the client
√ linkerd-viz ClusterRoles exist
√ linkerd-viz ClusterRoleBindings exist
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
√ tap API service is running
√ linkerd-viz pods are injected
√ viz extension pods are running
√ viz extension proxies are healthy
√ viz extension proxies are up-to-date
√ viz extension proxies and cli versions match
√ prometheus is installed and configured correctly
√ viz extension self-check

linkerd-smi
-----------
√ linkerd-smi extension Namespace exists
√ SMI extension service account exists
√ SMI extension pods are injected
√ SMI extension pods are running
√ SMI extension proxies are healthy

Status check results are √

Linkerd check on north:

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready
√ cluster networks contains all node podCIDRs
√ cluster networks contains all pods
√ cluster networks contains all services

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ proxy-init container runs as root user if docker container runtime is used

linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor

linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ can retrieve the control plane version
√ control plane is up-to-date
√ control plane and cli versions match

linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
√ control plane proxies are up-to-date
√ control plane proxies and cli versions match

linkerd-extension-checks
------------------------
√ namespace configuration for extensions

linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
        * east
√ Link and CLI versions match
        * east
× remote cluster access credentials are valid
            * failed to connect to API for cluster: [east]: Get "https://192.168.56.101:6443/version?timeout=30s": net/http: TLS handshake timeout
    see https://linkerd.io/2/checks/#l5d-smc-target-clusters-access for hints
× clusters share trust anchors
    Problematic clusters:
    * east: unable to fetch anchors: Get "https://192.168.56.101:6443/api/v1/namespaces/linkerd/configmaps/linkerd-config?timeout=30s": net/http: TLS handshake timeout
    see https://linkerd.io/2/checks/#l5d-multicluster-clusters-share-anchors for hints
√ service mirror controller has required permissions
        * east
√ service mirror controllers are running
        * east
√ extension is managing controllers
        * east
× probe services able to communicate with all gateway mirrors
        liveness checks failed for east
    see https://linkerd.io/2/checks/#l5d-multicluster-gateways-endpoints for hints
√ multicluster extension proxies are healthy
√ multicluster extension proxies are up-to-date
√ multicluster extension proxies and cli versions match

linkerd-viz
-----------
√ linkerd-viz Namespace exists
√ can initialize the client
√ linkerd-viz ClusterRoles exist
√ linkerd-viz ClusterRoleBindings exist
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
√ tap API service is running
√ linkerd-viz pods are injected
√ viz extension pods are running
√ viz extension proxies are healthy
√ viz extension proxies are up-to-date
√ viz extension proxies and cli versions match
√ prometheus is installed and configured correctly
√ viz extension self-check

linkerd-smi
-----------
√ linkerd-smi extension Namespace exists
√ SMI extension service account exists
√ SMI extension pods are injected
√ SMI extension pods are running
√ SMI extension proxies are healthy

Status check results are ×

Thank you.

Did you generate the links with multicluster link or multicluster link-gen? And, wait, which traffic splitter are you talking about?

I used multicluster link-gen, namely this command: linkerd multicluster link-gen --cluster-name east --api-server-address https://192.168.56.101:6443 > link.yaml (then applied link.yaml on the other cluster).

I’m talking about this traffic splitter: Multi-cluster communication | Linkerd
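
For reference, this is roughly the shape of the TrafficSplit involved. It is only a sketch: the helper `render_split`, the namespace `sock-shop`, the service `catalogue` and the context name `west` are hypothetical placeholders rather than my exact manifests, and it assumes the mirrored service follows the usual `<service>-<cluster>` naming and that the linkerd-smi extension is installed.

```shell
# Sketch only: all names passed in below are placeholders (assumptions).
# render_split NAMESPACE SERVICE TARGET_CLUSTER
# Prints a TrafficSplit that sends half the traffic for SERVICE to the
# local service and half to its mirror, SERVICE-TARGET_CLUSTER.
render_split() {
  ns="$1"
  svc="$2"
  cluster="$3"
  cat <<EOF
apiVersion: split.smi-spec.io/v1alpha1
kind: TrafficSplit
metadata:
  name: ${svc}
  namespace: ${ns}
spec:
  service: ${svc}
  backends:
  - service: ${svc}
    weight: 50
  - service: ${svc}-${cluster}
    weight: 50
EOF
}

# Print the manifest for a hypothetical "catalogue" service mirrored from east.
render_split sock-shop catalogue east

# To apply it on the cluster where the frontend runs (context name assumed):
# render_split sock-shop catalogue east | kubectl --context=west apply -f -
```

Setting the local backend's weight to 0 and the mirror's to 100 would send everything to the other cluster, which is closer to what I need when a service only exists on one side.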

Also, I reinstalled Linkerd on everything, which got the gateways and all to work. However, the gateways still die every once in a while and then never come back online, even though I changed absolutely nothing. Sometimes reapplying the link.yaml helps, but other times it does not. The errors from check change: sometimes they say the probes fail, sometimes that the certificate is invalid or the trust anchor is invalid, etc.

Do you know why this keeps happening? Maybe there is something I can do to make the gateway connections more stable?
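
Since the failing checks on north are timeouts talking to https://192.168.56.101:6443, one thing that might help narrow this down is logging over time whether that address is reachable at all from the other VMs. Below is a minimal sketch, not a definitive diagnostic: the function names `probe` and `watch_endpoint` are made up for illustration, it relies on `bash` and coreutils `timeout` being installed, and it only tests that a TCP connection opens, so it would not catch a TLS handshake that stalls after the connection is established.

```shell
# Sketch only: probe and watch_endpoint are hypothetical helper names.

# probe HOST PORT
# Succeeds if a TCP connection to HOST:PORT opens within 3 seconds.
probe() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# watch_endpoint HOST PORT COUNT INTERVAL_SECONDS
# Probes HOST:PORT COUNT times and prints one timestamped line per attempt.
watch_endpoint() {
  host="$1"
  port="$2"
  count="$3"
  interval="$4"
  i=0
  while [ "$i" -lt "$count" ]; do
    now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    if probe "$host" "$port"; then
      echo "$now $host:$port reachable"
    else
      echo "$now $host:$port UNREACHABLE"
    fi
    i=$((i + 1))
    sleep "$interval"
  done
}

# Example (run on north; address taken from the check output above):
# watch_endpoint 192.168.56.101 6443 60 10
```

If the log shows UNREACHABLE at the same moments the gateways report False, that would point at the VM network rather than at Linkerd itself; if the port stays reachable while the checks fail, the problem is more likely in TLS or in the link credentials.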