Hi, I recently joined a company that was using Linkerd in production. Long story short: due to an issue with Linkerd, certs weren't being renewed by cert-manager, which caused a bunch of deployments to fall over after being rotated, as they didn't have new certs. Linkerd injection was removed for critical services, and I've been tasked with looking at cert monitoring before we proceed with implementing it in production again. What's the best way to do this? I would like to set up some cert monitors in Datadog. Thanks.
Hi @go4brendon,
Since you've already integrated Linkerd with cert-manager, you can take advantage of Datadog's built-in cert-manager integration. It uses the metrics exposed by cert-manager to monitor certificate validity (e.g., remaining days until expiration) and the number of ready/unready certificates.
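To make the "remaining days until expiration" idea concrete, here is a rough Python sketch of the arithmetic such a monitor relies on. It assumes you have the raw text from cert-manager's Prometheus metrics endpoint and that it contains the certmanager_certificate_expiration_timestamp_seconds gauge with name and namespace labels. The function names and the 14-day threshold are made up for illustration and are not part of cert-manager or Datadog.

```python
import re
import time

# Gauge exposed by cert-manager: expiry time as a Unix timestamp in seconds.
EXPIRY_METRIC = "certmanager_certificate_expiration_timestamp_seconds"

# Matches lines such as: metric_name{label="value",...} 1.7e+09
_SAMPLE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def days_until_expiry(metrics_text, now=None):
    """Return {(namespace, name): days remaining} for each certificate.

    Hypothetical helper: only reads the expiry gauge and skips everything
    else, including comment lines that start with '#'.
    """
    now = time.time() if now is None else now
    remaining = {}
    for line in metrics_text.splitlines():
        match = _SAMPLE.match(line.strip())
        if not match or match.group("name") != EXPIRY_METRIC:
            continue
        labels = dict(_LABEL.findall(match.group("labels")))
        key = (labels.get("namespace", ""), labels.get("name", ""))
        remaining[key] = (float(match.group("value")) - now) / 86400
    return remaining


def expiring_soon(metrics_text, threshold_days=14, now=None):
    """List certificates with fewer than threshold_days left (assumed threshold)."""
    remaining = days_until_expiry(metrics_text, now)
    return sorted(key for key, days in remaining.items() if days < threshold_days)
```

A Datadog monitor on the integration's expiration metric would apply the same kind of threshold. Pick a threshold comfortably larger than the window in which cert-manager is expected to renew, so an alert means a renewal was actually missed.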
What was the issue that prevented the certificates from being renewed?
Thanks @GTRekter, I believe the issue stemmed from a bug in the Linkerd deployment in one of the versions. You're indeed correct, Datadog is pulling cert_manager metrics. Do you perhaps have a template for the dashboard you linked in the first screenshot? That would be useful, thank you.
Never mind, I realised this is a custom cert-manager dashboard, so I will do the same. I do think I'm missing some metrics though, and I don't see logs.
That's interesting. If you've opened a GitHub issue for the Linkerd bug, could you share the link? I'd be happy to take a look and see what's going on under the hood.