Description
The DSM (Data Streams Monitoring) Queue tab shows empty CloudWatch metric graphs for SQS queues consumed by Lambda functions. The root cause is a mismatch between how datadog-lambda-python identifies queues in DSM checkpoints and how the AWS CloudWatch integration tags SQS metrics.
Root Cause
In datadog_lambda/tracing.py, _dsm_set_checkpoint() passes the full eventSourceARN to set_consume_checkpoint():
# tracing.py line ~262
source_arn = first_record.get("eventSourceARN", "")
# ...
_dsm_set_checkpoint(context_json, event_type, source_arn)  # full ARN
This results in a DSM checkpoint with:
topic:arn:aws:sqs:eu-west-2:123456789012:my-queue.fifo
The DSM Queue tab then uses this full ARN value to construct CloudWatch metric queries:
sum:aws.sqs.number_of_messages_received{queuename:arn:aws:sqs:eu-west-2:123456789012:my-queue.fifo}
But the AWS CloudWatch integration tags SQS metrics with just the short queue name (from the CloudWatch QueueName dimension):
queuename:my-queue.fifo
Result: the query returns no data, and all Queue tab graphs are empty.
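The mismatch can be reproduced in isolation. The sketch below is illustrative only: build_query and query_matches are hypothetical helpers written for this issue, not Datadog code, and they assume the Queue tab simply substitutes the DSM topic value into a queuename filter as described above.

```python
# Hypothetical helpers for illustration; not part of any Datadog library.
FULL_ARN = "arn:aws:sqs:eu-west-2:123456789012:my-queue.fifo"


def build_query(topic: str) -> str:
    """Build a metric query that filters on the queuename tag."""
    return "sum:aws.sqs.number_of_messages_received{queuename:%s}" % topic


def query_matches(query: str, metric_tags: set) -> bool:
    """Return True if the tag filter inside the braces exists on the metric."""
    tag_filter = query[query.index("{") + 1 : query.rindex("}")]
    return tag_filter in metric_tags


# Tag as emitted by the AWS CloudWatch integration (short queue name).
metric_tags = {"queuename:my-queue.fifo"}

arn_query = build_query(FULL_ARN)                         # what the Lambda path produces
short_query = build_query(FULL_ARN.rsplit(":", 1)[-1])    # what the tag actually needs

print(query_matches(arn_query, metric_tags))
print(query_matches(short_query, metric_tags))
```

Only the query built from the short queue name matches the tag that exists on the metric.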
Comparison with botocore SDK path
The botocore instrumentation in dd-trace-py correctly extracts the short name:
# ddtrace/internal/datastreams/botocore.py
def get_queue_name(params):
    queue_url = params["QueueUrl"]
    url = parse.urlparse(queue_url)
    return url.path.rsplit("/", 1)[-1]  # returns "my-queue.fifo"
Both handle_sqs_sns_produce() and handle_sqs_receive() use this short name as the DSM topic: tag. When SQS messages are consumed by a long-running process polling with sqs.receive_message(), the botocore instrumentation handles the DSM checkpoint and the Queue tab works correctly. The bug is specific to Lambda functions triggered by SQS event source mappings, where datadog-lambda-python handles the checkpoint instead.
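To make the difference between the two paths concrete, here is a minimal sketch that derives the topic value both ways for the same queue. The function names queue_name_from_url and queue_name_from_arn, and the example queue URL, are assumptions made for this illustration; only the URL parsing mirrors the botocore snippet quoted above.

```python
from urllib import parse


def queue_name_from_url(queue_url: str) -> str:
    """Mirror the botocore path: take the last path segment of the queue URL."""
    return parse.urlparse(queue_url).path.rsplit("/", 1)[-1]


def queue_name_from_arn(source_arn: str) -> str:
    """Proposed Lambda path: take the last colon-separated field of the ARN."""
    return source_arn.rsplit(":", 1)[-1]


# Example values for the same queue (the URL is an assumed example).
queue_url = "https://sqs.eu-west-2.amazonaws.com/123456789012/my-queue.fifo"
source_arn = "arn:aws:sqs:eu-west-2:123456789012:my-queue.fifo"

botocore_topic = queue_name_from_url(queue_url)
lambda_topic_today = source_arn                     # current behaviour: full ARN
lambda_topic_fixed = queue_name_from_arn(source_arn)
```

With the ARN shortened, both paths produce the same topic value, so producer and consumer checkpoints refer to the queue by one name.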
Expected Behavior
_dsm_set_checkpoint() should extract the short queue name from the ARN before passing it to set_consume_checkpoint(), e.g.:
queue_name = source_arn.rsplit(":", 1)[-1] # "my-queue.fifo"
set_consume_checkpoint(event_type, queue_name, carrier_get, manual_checkpoint=False)
This would align the Lambda consumption path with the botocore SDK path, and the DSM Queue tab CloudWatch queries would match the actual queuename tag values.
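A slightly more defensive variant of the extraction might look like the sketch below. It assumes that _dsm_set_checkpoint() can also receive empty strings or ARNs for non-SQS event sources, which this issue has not verified; sqs_queue_name is a hypothetical helper name, not an existing function in datadog-lambda-python.

```python
def sqs_queue_name(source_arn: str) -> str:
    """Return the short queue name for an SQS ARN, else the input unchanged.

    An SQS ARN has six colon-separated fields:
    arn:<partition>:sqs:<region>:<account-id>:<queue-name>
    """
    parts = source_arn.split(":")
    is_sqs_arn = (
        len(parts) == 6
        and parts[0] == "arn"
        and parts[2] == "sqs"
        and parts[5] != ""
    )
    # Leave anything that is not a well-formed SQS ARN untouched
    # (assumption: other event sources keep their current behaviour).
    return parts[5] if is_sqs_arn else source_arn


print(sqs_queue_name("arn:aws:sqs:eu-west-2:123456789012:my-queue.fifo"))
```

Returning the input unchanged for anything that is not an SQS ARN keeps the change scoped to the SQS case reported here.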
Workaround
Users can manually change the metric filter from queuename to dd_resource_key (which contains the full ARN) to see data. However, this must be done for each graph individually, and the change doesn't persist.
Environment
- datadog-lambda v8.123.0
- dd-trace-py v4.6.0
- Lambda Extension v92-next
- Python 3.14
- SQS FIFO queue consumed via Lambda event source mapping
- AWS region: eu-west-2