Hi everyone, I am new to Linkerd and I have a question about retries. I have configured the following ServiceProfile:
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  creationTimestamp: null
  name: report.test-linkerd.svc.cluster.local
  namespace: test-linkerd
spec:
  routes:
  - name: All GET Requests
    condition:
      method: GET
      pathRegex: ".*"
    isRetryable: true
  - name: All POST Requests
    condition:
      method: POST
      pathRegex: ".*"
    isRetryable: true
  retryBudget:
    retryRatio: 0.2
    minRetriesPerSecond: 10
    ttl: 10s
The profile defines two routes, one for GET and one for POST, both marked as retryable, and sets a retry budget for the service.
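For reference, here is a rough sketch of what this budget should permit, assuming the documented meaning of the fields: retryRatio caps retries as a fraction of original requests, minRetriesPerSecond is an extra allowance on top of that, and ttl is the window over which requests are counted. The function name and the simple additive formula are illustrative assumptions, not the proxy's actual algorithm.

```python
def max_retries_in_window(
    requests_in_window: int,
    retry_ratio: float = 0.2,           # retryRatio from the profile
    min_retries_per_second: int = 10,   # minRetriesPerSecond from the profile
    ttl_seconds: int = 10,              # ttl: 10s from the profile
) -> float:
    """Approximate upper bound on retries allowed over one ttl window.

    Simplified model (an assumption for illustration): the ratio-based
    allowance plus the per-second floor accumulated over the window.
    """
    ratio_allowance = retry_ratio * requests_in_window
    floor_allowance = min_retries_per_second * ttl_seconds
    return ratio_allowance + floor_allowance


if __name__ == "__main__":
    # Hypothetical load: 50 original requests within the 10s window.
    print(max_retries_in_window(50))
```

Under this simplified model, even with very little traffic the budget leaves room for retries, so the budget itself seems unlikely to be what blocks them.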
When I try to trigger a retry by killing the report pod, no retries happen.
I started investigating with the following command:
➜ ~ linkerd viz routes --to deploy/report -n test-linkerd -o wide deploy/api-gateway
ROUTE              SERVICE           EFFECTIVE_SUCCESS  EFFECTIVE_RPS  ACTUAL_SUCCESS  ACTUAL_RPS  LATENCY_P50  LATENCY_P95  LATENCY_P99
All GET Requests   portfolio-report  -                  -              -               -           -            -            -
All POST Requests  portfolio-report  -                  -              -               -           -            -            -
[DEFAULT]          portfolio-report  -                  -              -               -           -            -            -
The routes show no traffic at all, so they do not appear to be matched, even though the api-gateway logs show that requests are being made to the report service.
Then I ran the following command to check the metrics of the report service:
➜ ~ linkerd diagnostics proxy-metrics -n test-linkerd deploy/report | grep route_response_total
# HELP route_response_total Total count of HTTP responses.
# TYPE route_response_total counter
route_response_total{direction="inbound",dst="report.test-linkerd.svc.cluster.local:80",rt_route="All POST Requests",status_code="200",classification="success",grpc_status="",error=""} 5
route_response_total{direction="inbound",dst="report.test-linkerd.svc.cluster.local:8090",rt_route="All GET Requests",status_code="200",classification="success",grpc_status="",error=""} 9
route_response_total{direction="inbound",dst="report.test-linkerd.svc.cluster.local:8090",rt_route="All GET Requests",status_code="304",classification="success",grpc_status="",error=""} 44
And I see that the report service is receiving requests with the routes All GET Requests and All POST Requests.
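To make these counters easier to compare, a small sketch like the one below can tally route_response_total per direction and route. It assumes the proxy-metrics output has been captured as text; the function name and regexes are illustrative, and the sample lines are the ones shown above.

```python
import re
from collections import defaultdict

# Matches one route_response_total sample line: labels in braces, then the value.
LINE_RE = re.compile(r'^route_response_total\{(?P<labels>[^}]*)\}\s+(?P<value>\d+(?:\.\d+)?)$')
# Matches individual key="value" label pairs.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def tally_route_responses(metrics_text: str) -> dict:
    """Sum route_response_total by (direction, rt_route)."""
    totals = defaultdict(int)
    for line in metrics_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skips "# HELP", "# TYPE", and unrelated lines
        labels = dict(LABEL_RE.findall(match.group("labels")))
        key = (labels.get("direction", ""), labels.get("rt_route", ""))
        totals[key] += int(float(match.group("value")))
    return dict(totals)


SAMPLE = '''
# HELP route_response_total Total count of HTTP responses.
# TYPE route_response_total counter
route_response_total{direction="inbound",dst="report.test-linkerd.svc.cluster.local:80",rt_route="All POST Requests",status_code="200",classification="success",grpc_status="",error=""} 5
route_response_total{direction="inbound",dst="report.test-linkerd.svc.cluster.local:8090",rt_route="All GET Requests",status_code="200",classification="success",grpc_status="",error=""} 9
route_response_total{direction="inbound",dst="report.test-linkerd.svc.cluster.local:8090",rt_route="All GET Requests",status_code="304",classification="success",grpc_status="",error=""} 44
'''

if __name__ == "__main__":
    for (direction, route), count in sorted(tally_route_responses(SAMPLE).items()):
        print(direction, route, count)
```

The key includes the direction label so that inbound and outbound counters stay separate; every sample captured from the report proxy here is labeled inbound.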
Additionally, in the dashboard, I do not see any requests from the api-gateway to the report service. They do show as “meshed”, but there is no green bar or metrics.
The only other clue I have found is the logs of the linkerd proxy container:
{"timestamp":"[ 1355.773887s]","level":"INFO","fields":{"message":"Connection closed","error":"connection closed before message completed","client.addr":"10.62.71.212:35084","server.addr":"10.62.68.10:8080"},"target":"linkerd_app_core::serve","spans":[{"name":"inbound"}],"threadId":"ThreadId(1)"}
Am I missing something in the configuration of the service profile?
Am I misunderstanding how retries work in Linkerd?
Any help would be appreciated. Thanks!
Additional Info:
Client version: stable-2.14.10
Server version: stable-2.14.10
linkerd check and linkerd viz check both come back all good.