mssql_config
The mssql_config
block configures the mssql
integration, an embedded version of sql_exporter
that lets you collect Microsoft SQL Server metrics.
It is recommended that you have a dedicated user set up for monitoring an mssql instance. The user for monitoring must have the following grants in order to populate the metrics:
GRANT VIEW ANY DEFINITION TO <MONITOR_USER>
GRANT VIEW SERVER STATE TO <MONITOR_USER>
Quick configuration example
To get started, define the MSSQL connection string in Grafana Agent’s integration block:
metrics:
wal_directory: /tmp/wal
integrations:
mssql:
enabled: true
connection_string: "sqlserver://[user]:[pass]@localhost:1433"
Full reference of options:
# Enables the MSSQL integration, allowing the Agent to automatically
# collect metrics for the specified MSSQL instance.
[enabled: <boolean> | default = false]
# Sets an explicit value for the instance label when the integration is
# self-scraped. Overrides inferred values.
#
# The default value for this integration is the host:port of the provided connection_string.
[instance: <string>]
# Automatically collect metrics from this integration. If disabled,
# the MSSQL integration is run but not scraped and thus not
# remote-written. Metrics for the integration are exposed at
# /integrations/mssql/metrics and can be scraped by an external
# process.
[scrape_integration: <boolean> | default = <integrations_config.scrape_integrations>]
# How often should the metrics be collected? Defaults to
# prometheus.global.scrape_interval.
[scrape_interval: <duration> | default = <global_config.scrape_interval>]
# The timeout before considering the scrape a failure. Defaults to
# prometheus.global.scrape_timeout.
[scrape_timeout: <duration> | default = <global_config.scrape_timeout>]
# Allows for relabeling labels on the target.
relabel_configs:
[- <relabel_config> ... ]
# Relabel metrics coming from the integration, lets you drop series
# that you don't care about from the integration.
metric_relabel_configs:
[ - <relabel_config> ... ]
# How frequently the WAL is truncated for this integration.
[wal_truncate_frequency: <duration> | default = "60m"]
#
# Exporter-specific configuration options
#
# The connection_string to use to connect to the MSSQL instance.
# It is specified in the form of: "sqlserver://<USERNAME>:<PASSWORD>@<HOST>:<PORT>"
connection_string: <string>
# The maximum number of open database connections to the MSSQL instance.
[max_open_connections: <int> | default = 3]
# The maximum number of idle database connections to the MSSQL instance.
[max_idle_connections: <int> | default = 3]
# The timeout for scraping metrics from the MSSQL instance.
[timeout: <duration> | default = "10s"]
# Embedded MSSQL query configuration for specifying custom MSSQL Prometheus metrics.
# See https://github.com/burningalchemist/sql_exporter#collectors for more details how to specify your metric configurations.
query_config:
[- <metrics> ... ]
[- <queries> ... ]]
Authentication
By default, the USERNAME
and PASSWORD
used within the connection_string
argument corresponds to a SQL Server username and password.
If Grafana Agent is running in the same Windows domain as the SQL Server, then you can use the parameter authenticator=winsspi
within the connection_string
to authenticate without any additional credentials.
sqlserver://@<HOST>:<PORT>?authenticator=winsspi
If you want to use Windows credentials to authenticate, instead of SQL Server credentials, you can use the parameter authenticator=ntlm
within the connection_string
.
The USERNAME
and PASSWORD
then corresponds to a Windows username and password.
The Windows domain may need to be prefixed to the username with a trailing \
.
sqlserver://<DOMAIN\USERNAME>:<PASSWORD>@<HOST>:<PORT>?authenticator=ntlm
Custom metrics
You can use the optional query_config
parameter to retrieve custom Prometheus metrics for a MSSQL instance.
If this is defined, the new configuration will be used to query your MSSQL instance and create whatever Prometheus metrics are defined. If you want additional metrics on top of the default metrics, the default configuration must be used as a base.
The default configuration used by this integration is as follows:
collector_name: mssql_standard
metrics:
- metric_name: mssql_local_time_seconds
type: gauge
help: 'Local time in seconds since epoch (Unix time).'
values: [unix_time]
query: |
SELECT DATEDIFF(second, '19700101', GETUTCDATE()) AS unix_time
- metric_name: mssql_connections
type: gauge
help: 'Number of active connections.'
key_labels:
- db
values: [count]
query: |
SELECT DB_NAME(sp.dbid) AS db, COUNT(sp.spid) AS count
FROM sys.sysprocesses sp
GROUP BY DB_NAME(sp.dbid)
#
# Collected from sys.dm_os_performance_counters
#
- metric_name: mssql_deadlocks_total
type: counter
help: 'Number of lock requests that resulted in a deadlock.'
values: [cntr_value]
query: |
SELECT cntr_value
FROM sys.dm_os_performance_counters WITH (NOLOCK)
WHERE counter_name = 'Number of Deadlocks/sec' AND instance_name = '_Total'
- metric_name: mssql_user_errors_total
type: counter
help: 'Number of user errors.'
values: [cntr_value]
query: |
SELECT cntr_value
FROM sys.dm_os_performance_counters WITH (NOLOCK)
WHERE counter_name = 'Errors/sec' AND instance_name = 'User Errors'
- metric_name: mssql_kill_connection_errors_total
type: counter
help: 'Number of severe errors that caused SQL Server to kill the connection.'
values: [cntr_value]
query: |
SELECT cntr_value
FROM sys.dm_os_performance_counters WITH (NOLOCK)
WHERE counter_name = 'Errors/sec' AND instance_name = 'Kill Connection Errors'
- metric_name: mssql_page_life_expectancy_seconds
type: gauge
help: 'The minimum number of seconds a page will stay in the buffer pool on this node without references.'
values: [cntr_value]
query: |
SELECT top(1) cntr_value
FROM sys.dm_os_performance_counters WITH (NOLOCK)
WHERE counter_name = 'Page life expectancy'
- metric_name: mssql_batch_requests_total
type: counter
help: 'Number of command batches received.'
values: [cntr_value]
query: |
SELECT cntr_value
FROM sys.dm_os_performance_counters WITH (NOLOCK)
WHERE counter_name = 'Batch Requests/sec'
- metric_name: mssql_log_growths_total
type: counter
help: 'Number of times the transaction log has been expanded, per database.'
key_labels:
- db
values: [cntr_value]
query: |
SELECT rtrim(instance_name) AS db, cntr_value
FROM sys.dm_os_performance_counters WITH (NOLOCK)
WHERE counter_name = 'Log Growths' AND instance_name <> '_Total'
- metric_name: mssql_buffer_cache_hit_ratio
type: gauge
help: 'Ratio of requests that hit the buffer cache'
values: [BufferCacheHitRatio]
query: |
SELECT (a.cntr_value * 1.0 / b.cntr_value) * 100.0 as BufferCacheHitRatio
FROM sys.dm_os_performance_counters a
JOIN (SELECT cntr_value, OBJECT_NAME
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Buffer cache hit ratio base'
AND OBJECT_NAME = 'SQLServer:Buffer Manager') b ON a.OBJECT_NAME = b.OBJECT_NAME
WHERE a.counter_name = 'Buffer cache hit ratio'
AND a.OBJECT_NAME = 'SQLServer:Buffer Manager'
- metric_name: mssql_checkpoint_pages_sec
type: gauge
help: 'Checkpoint Pages Per Second'
values: [cntr_value]
query: |
SELECT cntr_value
FROM sys.dm_os_performance_counters
WHERE [counter_name] = 'Checkpoint pages/sec'
#
# Collected from sys.dm_io_virtual_file_stats
#
- metric_name: mssql_io_stall_seconds_total
type: counter
help: 'Stall time in seconds per database and I/O operation.'
key_labels:
- db
value_label: operation
values:
- read
- write
query_ref: mssql_io_stall
#
# Collected from sys.dm_os_process_memory
#
- metric_name: mssql_resident_memory_bytes
type: gauge
help: 'SQL Server resident memory size (AKA working set).'
values: [resident_memory_bytes]
query_ref: mssql_process_memory
- metric_name: mssql_virtual_memory_bytes
type: gauge
help: 'SQL Server committed virtual memory size.'
values: [virtual_memory_bytes]
query_ref: mssql_process_memory
- metric_name: mssql_available_commit_memory_bytes
type: gauge
help: 'SQL Server available to be committed memory size.'
values: [available_commit_limit_bytes]
query_ref: mssql_process_memory
- metric_name: mssql_memory_utilization_percentage
type: gauge
help: 'The percentage of committed memory that is in the working set.'
values: [memory_utilization_percentage]
query_ref: mssql_process_memory
- metric_name: mssql_page_fault_count_total
type: counter
help: 'The number of page faults that were incurred by the SQL Server process.'
values: [page_fault_count]
query_ref: mssql_process_memory
#
# Collected from sys.dm_os_sys_info
#
- metric_name: mssql_server_total_memory_bytes
type: gauge
help: 'SQL Server committed memory in the memory manager.'
values: [committed_memory_bytes]
query_ref: mssql_os_sys_info
- metric_name: mssql_server_target_memory_bytes
type: gauge
help: 'SQL Server target committed memory set for the memory manager.'
values: [committed_memory_target_bytes]
query_ref: mssql_os_sys_info
#
# Collected from sys.dm_os_sys_memory
#
- metric_name: mssql_os_memory
type: gauge
help: 'OS physical memory, used and available.'
value_label: 'state'
values: [used, available]
query: |
SELECT
(total_physical_memory_kb - available_physical_memory_kb) * 1024 AS used,
available_physical_memory_kb * 1024 AS available
FROM sys.dm_os_sys_memory
- metric_name: mssql_os_page_file
type: gauge
help: 'OS page file, used and available.'
value_label: 'state'
values: [used, available]
query: |
SELECT
(total_page_file_kb - available_page_file_kb) * 1024 AS used,
available_page_file_kb * 1024 AS available
FROM sys.dm_os_sys_memory
queries:
# Populates `mssql_io_stall` and `mssql_io_stall_total`
- query_name: mssql_io_stall
query: |
SELECT
cast(DB_Name(a.database_id) as varchar) AS [db],
sum(io_stall_read_ms) / 1000.0 AS [read],
sum(io_stall_write_ms) / 1000.0 AS [write]
FROM
sys.dm_io_virtual_file_stats(null, null) a
INNER JOIN sys.master_files b ON a.database_id = b.database_id AND a.file_id = b.file_id
GROUP BY a.database_id
# Populates `mssql_resident_memory_bytes`, `mssql_virtual_memory_bytes`, mssql_available_commit_memory_bytes,
# and `mssql_memory_utilization_percentage`, and `mssql_page_fault_count_total`
- query_name: mssql_process_memory
query: |
SELECT
physical_memory_in_use_kb * 1024 AS resident_memory_bytes,
virtual_address_space_committed_kb * 1024 AS virtual_memory_bytes,
available_commit_limit_kb * 1024 AS available_commit_limit_bytes,
memory_utilization_percentage,
page_fault_count
FROM sys.dm_os_process_memory
# Populates `mssql_server_total_memory_bytes` and `mssql_server_target_memory_bytes`.
- query_name: mssql_os_sys_info
query: |
SELECT
committed_kb * 1024 AS committed_memory_bytes,
committed_target_kb * 1024 AS committed_memory_target_bytes
FROM sys.dm_os_sys_info