I’ve raised a ticket on GitHub, but would like to know whether anyone else has hit this kind of problem or has advice. Essentially, one app that opens outgoing HTTP connections to a k8s Service accumulates thousands of ESTABLISHED TCP connections, sometimes until the count exceeds 8000 and the linkerd container crashes.
The problem is specific to a single cluster even though we run common code and use many other clusters with the same cloud provider.
Linkerd Enterprise 2.15.5-0
opened 04:18PM - 04 Dec 24 UTC
bug
### What is the issue?
We are running Linkerd enterprise-2.15.5-0.
A Deployment's Linkerd-meshed pods open TCP port 80 connections to a Service with Pods running Varnish behind it.
We are seeing more than 2000 established TCP connections on the outgoing pod's side. On occasion we've seen 8000+ connections, which caused the linkerd container to crash. Under normal operations we would expect a two- or three-digit count at most. This is only happening in a single k8s cluster, although all of our clusters run common code and are in the same provider's cloud.
Linkerd does not appear to be closing these HTTP connections properly. The connections are, after all, going to the Service IP.
The Java methods that we use on the outgoing pod are:
```
HttpUtils http = new HttpUtils();
http.setUrl(url);
http.setMethod(HttpUtils.GET);
http.execute();
byte buffer[] = http.getBodyAsBinary();
http.close();
```
We think this may be related to a similar open connection issue that we raised in the past: https://github.com/linkerd/linkerd2/issues/9724
We are planning to try `opaque-ports` to confirm our theory that Linkerd is leaving the connections open.
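For anyone wanting to run the same test, here is a minimal sketch of how the `opaque-ports` experiment could be set up. It assumes the annotation is placed on the destination Service; `varnish-svc` and `default` are hypothetical placeholder names, not taken from our cluster. The script only prints the `kubectl` command so it can be reviewed before anything is changed.

```shell
#!/bin/sh
# Build the Linkerd annotation that marks one or more ports as opaque,
# i.e. proxied as raw TCP without HTTP protocol detection.
opaque_ports_annotation() {
  # $1: comma-separated port list, e.g. "80"
  printf 'config.linkerd.io/opaque-ports=%s' "$1"
}

# Hypothetical names: replace with the real destination Service and namespace.
SERVICE="varnish-svc"
NAMESPACE="default"

# Print the command instead of running it, so it can be checked first.
echo kubectl annotate service "$SERVICE" -n "$NAMESPACE" \
  "$(opaque_ports_annotation 80)" --overwrite
```

If the connection count drops back to normal with port 80 marked opaque, that would point at the proxy's HTTP connection handling rather than at the application. The destination workload may need the same annotation for the setting to apply on both sides.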
### How can it be reproduced?
We do not have a way to reproduce outside of this k8s cluster.
### Logs, error output, etc
Current connection counts on the affected outgoing pods are shown below. The IP address is that of the destination Service:
```
╰─◦ kubectl get pods -l app=<app label here> -oname | xargs -I% kubectl exec % -- /bin/sh -c "netstat -an | grep 10.39.250.148 | wc -l"
4722
4197
4795
4167
4670
5114
4722
4860
4410
4421
4733
4747
3870
3850
4205
4238
4275
4749
4223
4966
4589
4182
3740
4240
4977
3943
4481
```
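The counts above include every `netstat` line that mentions the Service IP, whatever its TCP state. A rough sketch of how the same output could be broken down per state (ESTABLISHED, TIME_WAIT, CLOSE_WAIT, and so on) follows. The helper name `tcp_states_for` is made up for illustration, and it assumes the usual Linux `netstat -an` column layout, with the foreign address in column 5 and the state in column 6.

```shell
#!/bin/sh
# Tally TCP connection states for one destination IP.
# Reads `netstat -an` output on stdin; $1 is the destination IP.
tcp_states_for() {
  awk -v ip="$1" '
    # Keep TCP rows whose foreign address starts with "<ip>:" so that,
    # for example, 10.39.250.14 does not match 10.39.250.148.
    $1 ~ /^tcp/ && index($5, ip ":") == 1 { count[$6]++ }
    END { for (s in count) print s, count[s] }
  ' | sort
}

# Example usage against one pod (pod name is a placeholder):
#   kubectl exec <pod> -- netstat -an | tcp_states_for 10.39.250.148
```

A large CLOSE_WAIT count would suggest the local side is not closing sockets, whereas a large ESTABLISHED count suggests that neither side has closed the connection yet.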
### output of `linkerd check -o short`
```
╰─◦ linkerd check -o short
linkerd-version
---------------
‼ cli is up-to-date
is running version 2.15.5-0 but the latest enterprise version is 2.16.2-1
see https://linkerd.io/2/checks/#l5d-version-cli for hints
control-plane-version
---------------------
‼ control plane is up-to-date
is running version 2.15.5-0 but the latest enterprise version is 2.16.2-1
see https://linkerd.io/2/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
---------------------------
‼ control plane proxies are up-to-date
some proxies are not running the current version:
* linkerd-destination-d89467495-8bxtv (enterprise-2.15.5-0)
* linkerd-destination-d89467495-n6th5 (enterprise-2.15.5-0)
* linkerd-destination-d89467495-v48q4 (enterprise-2.15.5-0)
* linkerd-identity-d55486b6c-k4nmc (enterprise-2.15.5-0)
* linkerd-identity-d55486b6c-msh5l (enterprise-2.15.5-0)
* linkerd-identity-d55486b6c-v49j5 (enterprise-2.15.5-0)
* linkerd-proxy-injector-ff7d897fc-97bj9 (enterprise-2.15.5-0)
* linkerd-proxy-injector-ff7d897fc-df5wl (enterprise-2.15.5-0)
* linkerd-proxy-injector-ff7d897fc-nf8lh (enterprise-2.15.5-0)
see https://linkerd.io/2/checks/#l5d-cp-proxy-version for hints
linkerd-viz
-----------
‼ viz extension proxies are up-to-date
some proxies are not running the current version:
* metrics-api-9979bc8f5-rdf7v (enterprise-2.15.5-0)
* prometheus-fd6b67bcc-tmpfs (enterprise-2.15.5-0)
* tap-7ddb4f85b4-m7mg7 (enterprise-2.15.5-0)
* tap-7ddb4f85b4-nqzsl (enterprise-2.15.5-0)
* tap-7ddb4f85b4-qtms7 (enterprise-2.15.5-0)
* tap-injector-b8fbc5b78-996j6 (enterprise-2.15.5-0)
* tap-injector-b8fbc5b78-d8rcz (enterprise-2.15.5-0)
* tap-injector-b8fbc5b78-fszsv (enterprise-2.15.5-0)
* tap-injector-b8fbc5b78-nns8h (enterprise-2.15.5-0)
* tap-injector-b8fbc5b78-xnxn2 (enterprise-2.15.5-0)
* web-9c7b78c9f-l49d2 (enterprise-2.15.5-0)
see https://linkerd.io/2/checks/#l5d-viz-proxy-cp-version for hints
Status check results are √
```
### Environment
Kubernetes version: v1.29.10
Provider: GKE
Host OS: Ubuntu Jammy
Linkerd Version: enterprise-2.15.5-0
### Possible solution
_No response_
### Additional context
_No response_
### Would you like to work on fixing this bug?
None