Linkerd proxy is causing Redis to fail on upgrade to latest Redis version

Hi,

We are running Linkerd enterprise-2.16.1.
In our clusters we have Redis version 7.2.4 running with:

architecture: replication
sentinel:
  enabled: true

We use the Bitnami Redis Helm chart (oci://registry-1.docker.io/bitnamicharts/redis).

This has been working fine. All Redis pods are meshed, and I can see that the proxy-injector also adds the annotation config.linkerd.io/opaque-ports: "6379". We should probably add the Sentinel port 26379 there as well, but that has not been a problem so far.

Today I tried to upgrade Redis to version 7.4.1 with Helm chart version 20.2.1.
After the upgrade there are a lot of connection issues: Redis pods cannot connect to each other anymore, and applications cannot connect to Redis.
I see errors like these in the Linkerd proxy logs:

linkerd-proxy [   200.305512s]  INFO ThreadId(01) outbound: linkerd_app_core::serve: Connection closed error=connect timed out after 1s client.addr=10.241.75.1:46842 server.addr=10.241.73.252:6379
linkerd-proxy [   200.821799s]  INFO ThreadId(01) outbound: linkerd_app_core::serve: Connection closed error=connect timed out after 1s client.addr=10.241.75.1:52114 server.addr=10.241.73.252:26379

If I disable the Linkerd proxy for the Redis pods then everything works fine. But since we are using Linkerd, we want to run all the pods in our application namespace in the service mesh.

I need some help figuring out what is going on and how to fix it.

Adding this annotation to the Redis pods also makes Redis work:
config.linkerd.io/skip-inbound-ports: "6379,26379"
This is not really a solution, though, since we want to run all pods in the service mesh with mTLS.
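
For reference, this is roughly how we apply that workaround through the chart values; it is only a sketch, assuming the same replica.podAnnotations key used in the repro further down:

replica:
  podAnnotations:
    linkerd.io/inject: enabled
    # Workaround only: this bypasses the proxy for the Redis and Sentinel ports,
    # so that traffic is no longer mTLS'd by Linkerd
    config.linkerd.io/skip-inbound-ports: "6379,26379"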

This looks like a bug in Linkerd to me.

It is very easy to reproduce. Create a test-values.yaml file with this content:

architecture: replication
sentinel:
  enabled: true
replica:
  podAnnotations:
    linkerd.io/inject: enabled
    config.linkerd.io/opaque-ports: "6379,26379"

Then install Redis using this command:

helm install redis oci://registry-1.docker.io/bitnamicharts/redis -f test-values.yaml

The Redis pods' proxy logs will now be flooded with errors like
INFO ThreadId(01) outbound: linkerd_app_core::serve: Connection closed error=connect timed out after 1s
and the Redis pods cannot connect to each other.
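
If it helps, this is roughly how I check for those errors; the pod name is an assumption based on the default Bitnami naming for a release called redis with sentinel enabled:

# Tail the Linkerd proxy log on one of the Redis pods and filter for the timeout errors
kubectl logs redis-node-0 -c linkerd-proxy --tail=200 | grep "Connection closed"
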
@Flynn @william I would be grateful if someone could take a look at this issue.

I was wrong. It does not work with this annotation:
config.linkerd.io/skip-inbound-ports: "6379,26379"
The only way to make the latest version of Redis work is to disable the Linkerd proxy.

Mmm. I had one occasion where the skip-inbound-ports annotation did not help as a workaround. Now it works again. But this is equivalent to disabling Linkerd for Redis, so it is not really a solution.

Are those connection closed errors coming from the Redis proxies, or from the apps trying to connect to Redis?

The errors are from the Redis proxies. As you may know, the Redis pods connect to each other for both Sentinel and replication. From what I heard when things started failing, other apps using Redis had connection issues too.

But to investigate, it is simplest to just deploy Redis with the configuration above; it will start failing right away. It works fine without Linkerd, and it also works with the old version of Redis and Linkerd enabled, which was our configuration before we attempted the Redis upgrade.
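
If it is useful, this is roughly how I confirm from Redis's own side that replication is broken; the pod name, container name, and REDIS_PASSWORD environment variable are assumptions based on the default Bitnami setup:

# Ask the Redis container for its replication status; on a healthy cluster
# the master should report its connected replicas
kubectl exec redis-node-0 -c redis -- sh -c 'redis-cli -a "$REDIS_PASSWORD" info replication'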