linkerd-identity-issuer not refreshing certificates as expected

Recently we’ve started facing an issue with outbound calls for some pods when they try to connect to services within the same cluster.

We can see the following logs in the proxy:

[9515905.481616s] WARN ThreadId(01) outbound:proxy{addr=10.247.15.20:4191}:forward{addr=10.247.15.20:4191}: linkerd_reconnect: Failed to connect error=invalid peer certificate: Expired
[9515905.500380s] WARN ThreadId(01) outbound:proxy{addr=10.247.59.3:9996}:forward{addr=10.247.59.3:9996}: linkerd_reconnect: Failed to connect error=invalid peer certificate: Expired
[9515905.500801s] WARN ThreadId(01) policy:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}:endpoint{addr=10.247.59.3:8090}: linkerd_reconnect: Failed to connect error=endpoint 10.247.59.3:8090: invalid peer certificate: Expired error.sources=[invalid peer certificate: Expired]

Further investigation revealed that there had been a cert renewal for the linkerd-identity service, and a few pods had not had their certs refreshed for more than 2 months, including some linkerd-destination and proxy pods.

sum(control_identity_cert_refresh_timestamp_seconds) by (pod) < (time() - 60 * 24 * 3600)

According to the docs these certs should be refreshed every 24 hours, and for most other pods this seems to work fine, even within the same deployment.
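For a single suspect pod, the raw numbers can also be pulled straight from the proxy’s admin endpoint on port 4191. A minimal sketch (namespace and pod name illustrative, and assuming the identity cert metrics use the same naming as the query above):

# Forward the linkerd-proxy admin port of a suspect pod (names illustrative)
kubectl -n my-namespace port-forward pod/my-app-7d9f8c6b5-abcde 4191:4191 &

# Dump the identity cert metrics exposed by the proxy
curl -s http://localhost:4191/metrics | grep identity_cert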

We don’t see any errors in the linkerd-identity-issuer pods either.
To recover, we had to delete the impacted pods, as I don’t see a way to force a cert refresh via the identity service.
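Concretely, recovery was just a restart of the impacted workloads so their proxies re-bootstrap identity; a sketch of what we ran (workload names illustrative):

# Roll the impacted meshed workload (deployment name illustrative)
kubectl -n my-namespace rollout restart deploy/my-app

# Or delete the affected pods directly and let the controller recreate them
kubectl -n my-namespace delete pod my-app-7d9f8c6b5-abcde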

Can someone help with further debugging, or is this a Linkerd issue?

Setup:

We used the official Linkerd Helm chart to install Linkerd without any custom tuning.
version: edge-24.11.5
k8s version: 1.31
cert-manager for automatic cert renewal (rough Certificate sketch below)
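For reference, the issuer Certificate managed by cert-manager follows the usual pattern from the Linkerd docs; a rough sketch, not our exact manifest (issuer name and durations illustrative):

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-identity-issuer
  namespace: linkerd
spec:
  secretName: linkerd-identity-issuer
  duration: 48h      # issuer cert lifetime (illustrative)
  renewBefore: 25h   # renew well before expiry (illustrative)
  issuerRef:
    name: linkerd-trust-anchor   # CA issuer name (illustrative)
    kind: Issuer
  commonName: identity.linkerd.cluster.local
  dnsNames:
  - identity.linkerd.cluster.local
  isCA: true
  privateKey:
    algorithm: ECDSA
  usages:
  - cert sign
  - crl sign
  - server auth
  - client auth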

Hey @Shubham — yes, restarting pods is indeed how to force a reissue.

I’m more curious about the identity issuer rotation – how was that done? Did the control plane get restarted afterward?

Hi @Flynn, thanks for the reply.

We use automatic cert rotation via cert-manager. That shouldn’t require a control plane restart unless it’s the trust anchor. During the issue we restarted the control plane in the sequence identity-issuer >> destination/proxy; otherwise linkerd-proxy was throwing a postStartHook error when trying to connect to the control plane pods.
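For the record, the restart itself was nothing special; roughly the following (deployment names as in the default Helm install):

# Restart identity first so it serves certs signed by the current issuer
kubectl -n linkerd rollout restart deploy/linkerd-identity
kubectl -n linkerd rollout status deploy/linkerd-identity

# Then the rest of the control plane, then the affected workload proxies
kubectl -n linkerd rollout restart deploy/linkerd-destination
kubectl -n linkerd rollout restart deploy/linkerd-proxy-injector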

Hi @Flynn, we faced another instance of the same problem in another cluster. This time there had been no recent cert renewal for any of the controller pods.

[706995.441763s] ERROR ThreadId(02) identity:identity{server.addr=linkerd-identity-headless.linkerd.svc.cluster.local:8080}: linkerd_proxy_identity_client::certify: Failed to obtain identity error=status: Unknown, message: "controller linkerd-identity-headless.linkerd.svc.cluster.local:8080: endpoint 10.200.39.163:8080: connection error: received fatal alert: CertificateExpired", details: [], metadata: MetadataMap { headers: {} } error.sources=[controller linkerd-identity-headless.linkerd.svc.cluster.local:8080: endpoint 10.200.39.163:8080: connection error: received fatal alert: CertificateExpired, endpoint 10.200.39.163:8080: connection error: received fatal alert: CertificateExpired, connection error: received fatal alert: CertificateExpired, received fatal alert: CertificateExpired]
[707002.681759s]  INFO ThreadId(01) outbound:proxy{addr=10.20.233.112:8080}:service{ns=spr-apps name=live-reporting-ms-tier1-svc port=8080}:endpoint{addr=10.200.5.23:8080}:rescue{client.addr=10.200.49.6:41100}: linkerd_app_core::errors::respond: gRPC request failed error=endpoint 10.200.5.23:8080: connection error: received fatal alert: CertificateExpired error.sources=[connection error: received fatal alert: CertificateExpired, received fatal alert: CertificateExpired]

Resolved after the same restart sequence as above.
Still not sure why it happened to these specific workloads; there was no notable resource throttling on them. Can you suggest a method or metric to detect or debug this in the future? Thanks.
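To make the question concrete: is something along these lines the right idea for detection? A sketch of a Prometheus alerting rule, reusing the same metric as the query above (max rather than sum, since it’s a timestamp; the 48h threshold is illustrative given the documented ~24h refresh):

groups:
- name: linkerd-identity
  rules:
  - alert: LinkerdIdentityCertNotRefreshed
    # Fires for pods whose identity cert refresh timestamp is older than 48h
    expr: max by (pod) (control_identity_cert_refresh_timestamp_seconds) < (time() - 48 * 3600)
    for: 1h
    labels:
      severity: warning
    annotations:
      summary: "Pod {{ $labels.pod }} has not refreshed its Linkerd identity cert in over 48h"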