We adopted Linkerd to provide TLS between pods inside our Kubernetes cluster. We have several services that communicate with each other using gRPC over a secure connection.
When the services communicate using a public FQDN, everything works fine, since TLS is terminated at the AWS NLB level. But we would also like to enable communication over the Kubernetes network using the Kubernetes service hostname (service.namespace.svc.cluster.local). After reading the documentation, we assumed that once you inject linkerd-proxy you can communicate securely without any additional configuration.
Did we understand this wrong? Do we need to configure something extra, such as routing traffic from the Kubernetes service to linkerd-proxy instead of to the microservice container?
In general, cloud load balancers can’t be meshed: there’s nothing running inside the cluster to which Linkerd can attach its proxy. That means that the first hop from the LB to a meshed Pod will be unencrypted. All the communications between meshed Pods will still be encrypted; it’s just that first hop which is cleartext.
The most straightforward way to get everything encrypted is to not have the NLB terminate TLS. Instead, have the NLB direct traffic to a meshed Kubernetes ingress controller, and let the ingress controller terminate TLS.
@YuriiP Any communication between meshed pods in a Kubernetes cluster will be mTLS’d without you having to do any further configuration. You should not have to manually route traffic from K8s services to linkerd-proxy; that is all taken care of for you when you mesh the pods.
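A quick way to confirm that meshed traffic really is mTLS’d is the viz extension’s edges command (the check output later in this thread suggests linkerd-viz is installed; the namespace here is the `services` namespace from the example, swap in your own):

```shell
# List connections between meshed workloads; the SECURED column shows
# whether each edge is protected by Linkerd's mTLS, and SRC/DST identities
# show which ServiceAccounts the proxies authenticated as.
linkerd viz edges deployment -n services
```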
Hi @William. We’re actually seeing the opposite behaviour: all gRPC requests between meshed pods fail, no matter whether we use the internal Kubernetes service domain or the pod IP.
Ok. So there are two pods, service-1 and service-2, both in the same namespace (services).
Executing the following command in pod-1:
grpcurl service-2.example.com:443 HelloWorld
returns the following error: Error invoking method "HelloWorld": given method name "HelloWorld" is not in expected format: 'service/method' or 'service.method'. That’s fine, since the method name is not in the correct format; the important thing is that the gRPC request itself reached the target successfully. As you can see, it works when we use the public FQDN, since TLS is terminated on the NLB side.
Let’s try the same command, but using the internal Kubernetes service domain:
Response is: Failed to dial target host "service-2.services.svc.cluster.local:7233": tls: first record does not look like a TLS handshake
service-2 linkerd-proxy logs:
[ 0.003504s] INFO ThreadId(01) linkerd2_proxy: release 2.210.0 (85db2fc) by linkerd on 2023-09-21T21:24:58Z
[ 0.004292s] INFO ThreadId(01) linkerd2_proxy::rt: Using single-threaded proxy runtime
[ 0.004986s] INFO ThreadId(01) linkerd2_proxy: Admin interface on 0.0.0.0:4191
[ 0.005003s] INFO ThreadId(01) linkerd2_proxy: Inbound interface on 0.0.0.0:4143
[ 0.005006s] INFO ThreadId(01) linkerd2_proxy: Outbound interface on 127.0.0.1:4140
[ 0.005009s] INFO ThreadId(01) linkerd2_proxy: Tap interface on 0.0.0.0:4190
[ 0.005012s] INFO ThreadId(01) linkerd2_proxy: Local identity is default.temporal.serviceaccount.identity.linkerd.cluster.local
[ 0.005014s] INFO ThreadId(01) linkerd2_proxy: Identity verified via linkerd-identity-headless.linkerd.svc.cluster.local:8080 (linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local)
[ 0.005017s] INFO ThreadId(01) linkerd2_proxy: Destinations resolved via linkerd-dst-headless.linkerd.svc.cluster.local:8086 (linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local)
[ 0.034203s] INFO ThreadId(02) daemon:identity: linkerd_app: Certified identity id=default.temporal.serviceaccount.identity.linkerd.cluster.local
[ 0.713850s] INFO ThreadId(01) outbound:proxy{addr=10.43.238.84:7233}:service{ns= name=service port=0}: linkerd_proxy_api_resolve::resolve: No endpoints
[ 1.796331s] INFO ThreadId(01) outbound: linkerd_app_core::serve: Connection closed error=connect timed out after 1s client.addr=10.1.1.239:47054
[ 32.802473s] INFO ThreadId(01) outbound: linkerd_app_core::serve: Connection closed error=connect timed out after 1s client.addr=10.1.1.239:35382
linkerd-check output:
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
kubernetes-version
------------------
√ is running the minimum Kubernetes API version
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready
√ cluster networks contains all pods
√ cluster networks contains all services
linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ proxy-init container runs as root user if docker container runtime is used
linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor
linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days
linkerd-version
---------------
√ can determine the latest version
‼ cli is up-to-date
is running version 2.13.5 but the latest stable version is 2.14.1
see https://linkerd.io/2.13/checks/#l5d-version-cli for hints
control-plane-version
---------------------
√ can retrieve the control plane version
√ control plane is up-to-date
‼ control plane and cli versions match
control plane running stable-2.14.1 but cli running stable-2.13.5
see https://linkerd.io/2.13/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
√ control plane proxies are up-to-date
‼ control plane proxies and cli versions match
linkerd-destination-5476c5bc64-bjnkn running stable-2.14.1 but cli running stable-2.13.5
see https://linkerd.io/2.13/checks/#l5d-cp-proxy-cli-version for hints
linkerd-multicluster
--------------------
× Link CRD exists
multicluster.linkerd.io/Link CRD is missing: the server could not find the requested resource
see https://linkerd.io/2.13/checks/#l5d-multicluster-link-crd-exists for hints
linkerd-viz
-----------
√ linkerd-viz Namespace exists
√ can initialize the client
√ linkerd-viz ClusterRoles exist
√ linkerd-viz ClusterRoleBindings exist
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
√ tap API service is running
√ linkerd-viz pods are injected
√ viz extension pods are running
√ viz extension proxies are healthy
√ viz extension proxies are up-to-date
‼ viz extension proxies and cli versions match
metrics-api-6bbd6d4bc7-4cnrn running stable-2.14.1 but cli running stable-2.13.5
see https://linkerd.io/2.13/checks/#l5d-viz-proxy-cli-version for hints
√ prometheus is installed and configured correctly
√ viz extension self-check
Status check results are ×
That’s super useful information! The short answer is that this is exactly what you would expect: while Linkerd encrypts your traffic in transit, it is transparent to your application. The longer answer is that when you use Linkerd, you should continue to run your apps as if they are sending data in plain text. Linkerd’s proxy encrypts the data from proxy to proxy; the traffic from app to proxy and from proxy to app should always be plain text.
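Concretely, the client inside the mesh should speak plaintext and let the proxies handle mTLS. Using the service name and port from earlier in this thread, the in-cluster call would look something like this (`-plaintext` is grpcurl’s flag for skipping TLS on the client side):

```shell
# In-cluster, meshed: speak plaintext to the service; linkerd-proxy
# transparently upgrades the connection to mTLS between the pods.
grpcurl -plaintext service-2.services.svc.cluster.local:7233 list

# Public FQDN, by contrast: the client must speak TLS, because the
# NLB terminates it before the traffic reaches the cluster.
grpcurl service-2.example.com:443 list
```

This also explains the earlier error: dialing the internal address without `-plaintext` makes grpcurl expect a TLS handshake from the application, which is answering in plain text, hence "first record does not look like a TLS handshake".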