Outbound calls failure

Darshan · February 23, 2026, 12:57pm

Hi team, recently we've started facing issues with outbound calls from many pods when they try to connect to services within the same cluster. The proxy was OOM-killed; we fixed that and restarted the Linkerd control plane components, including the destination and identity pods.

However, even after stabilizing the control plane, the connection issues persisted for the affected application pods. The problem was only resolved after we manually restarted each impacted application pod.

We would like to better understand the underlying behavior here:

Why does restarting the Linkerd control plane (destination/identity) not automatically restore connectivity for already-running pods?
Is there a way to refresh or resync metadata or proxy state without restarting the application pods?
Does the linkerd-proxy sidecar maintain any persistent state that requires a full pod restart after certain failure scenarios?

Our goal is to understand whether this behavior is expected and whether there is a cleaner recovery approach than restarting all affected application pods.

Adding a few logs we found in the proxy sidecar container during the outage:

worker must set a failure if it exits prematurely

thread 'main' panicked at /__w/linkerd2-proxy/linkerd2-proxy/linkerd/proxy/balance/queue/src/service.rs:73:18:

[238763.453630s]  WARN ThreadId(01) outbound: linkerd_app_core::serve: Server failed to become ready error=buffer's worker closed unexpectedly client.addr=10.2.57.34:52068

Linking a previous ticket we raised for a similar issue, although that one was due to certificate expiry: linkerd-identity-issuer-not-refreshing-certificates-as-expected. We had to restart all application pods to restore our services.

Flynn · February 26, 2026, 5:00pm

Hey @Darshan, sorry for the delay here! I’m just back from vacation.

Linkerd proxies currently set up their identities at startup. Notably, this includes loading the identity issuer certificate and trust anchor, which means that if the identity issuer or trust anchor ever expire, you must restart the proxies to get an updated trust chain. That’s what you were seeing here. We’re actively working on making this smoother, but the data-plane restart is very important at the moment.
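As a rough illustration of the data-plane restart described above, here is a minimal Python sketch that builds `kubectl rollout restart` commands for a list of meshed workloads, so that pods are replaced gradually rather than deleted by hand. The workload list, the function names, and the dry-run default are assumptions made for illustration; nothing here comes from Linkerd itself, and it assumes `kubectl` is installed and pointed at the right cluster.

```python
import subprocess
from typing import Iterable, List, Tuple

# (namespace, kind, name); kind is e.g. "deployment", "statefulset", "daemonset".
Workload = Tuple[str, str, str]


def rollout_restart_cmd(namespace: str, kind: str, name: str) -> List[str]:
    """Build the kubectl argv that triggers a rolling restart of one workload."""
    return ["kubectl", "rollout", "restart", f"{kind}/{name}", "-n", namespace]


def restart_all(workloads: Iterable[Workload], dry_run: bool = True) -> List[List[str]]:
    """Print (dry run) or execute a rolling restart for every workload given."""
    cmds = [rollout_restart_cmd(ns, kind, name) for ns, kind, name in workloads]
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if kubectl exits non-zero.
            subprocess.run(cmd, check=True)
    return cmds


if __name__ == "__main__":
    # Hypothetical workloads; replace with the meshed workloads in your cluster.
    affected = [
        ("shop", "deployment", "cart"),
        ("shop", "statefulset", "orders-db"),
    ]
    restart_all(affected, dry_run=True)
```

Running it with `dry_run=True` only prints the commands, which makes it easy to review the list before anything is restarted.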

Darshan · February 28, 2026, 7:25pm

Hi @Flynn, understood that we need to restart the proxies to get an updated trust chain. But our main concern here is that after the OOM of the control plane components and the restart of the identity and destination pods, the outbound call failures still persisted until we restarted the application pods. Is there a better way to fix the issue without restarting application pods? Or can we expect the service mesh to recover automatically in such cases, without an application pod restart, in upcoming Linkerd versions?
