Compare commits

...

243 Commits
master ... main

Author SHA1 Message Date
Ulric Qin 99fbdae121 refactor boardPutConfigs 2022-11-11 12:11:39 +08:00
kongfei605 aa26ddfb48
Merge pull request #1263 from ccfos/router_easyjson
regenerate easyjson file for router_opentsdb
2022-11-10 12:50:22 +08:00
kongfei ba5aba9cdf sync main branch code 2022-11-10 12:47:46 +08:00
kongfei 3400803672 regenerate easyjosn file for router_opentsdb 2022-11-10 12:41:16 +08:00
kongfei605 f11377b289
replace json with easyjson for router (#1261) 2022-11-10 11:11:20 +08:00
kongfei 1165312532 replace json with easyjson for router 2022-11-10 11:00:38 +08:00
JellyTony 8a145d5ba2
feat: 报警脚本超时时间改为可配置 (#1253)
* Update docker-compose.yaml

* Update docker-compose.yaml

* feat: 报警脚本超时时间改为可配置

* feat: docker 镜像Alerting 增加 超时时间

Co-authored-by: ulricqin <ulricqin@qq.com>
Co-authored-by: JeffreyBool <zhanggaoyuan@mediatrack.cn>
2022-11-04 15:18:36 +08:00
47 352415662a
feat:CAS and OAuth2 login (#1236)
* Feat(cas login):Add CAS login

Signed-off-by: root <foursevenlove@gmail.com>

* Fix(CAS login):1.print logs of CAS Authentication Response's Attributes 2.modify fileds of ssoClient and CAS config.

Signed-off-by: root <foursevenlove@gmail.com>

* Fix(CAS login):Fields modifing

Signed-off-by: root <foursevenlove@gmail.com>

* Feat(OAuth Login):1.Add OAuth2 login 2.Add display name

Signed-off-by: root <foursevenlove@gmail.com>

* Fix(webapi.conf):Add example

Signed-off-by: root <foursevenlove@gmail.com>

* fix(webapi.conf):Modify default value of username in OAuth2

Signed-off-by: root <foursevenlove@gmail.com>

* Fix:Error handling

Signed-off-by: root <foursevenlove@gmail.com>

Signed-off-by: root <foursevenlove@gmail.com>
2022-11-02 14:31:59 +08:00
Ulric Qin 65d8f80637 Merge branch 'main' of github.com:ccfos/nightingale 2022-11-02 08:35:06 +08:00
Ulric Qin b3700c7251 add Headers configuration demo 2022-11-02 08:34:49 +08:00
chenginger 106a8e490a
alert mute cannot refresh the bug (#1242)
bugfix:mute cannot be refreshed after being modified
2022-11-01 15:41:01 +08:00
Ulric Qin 5332f797a6 add alert duration in wecom.tpl 2022-10-30 17:11:51 +08:00
Ulric Qin aff0dbfea1 use json-iterator/go instead encoding/json 2022-10-28 10:22:04 +08:00
Ulric Qin da5dd683d6 bugfix 2022-10-25 09:46:32 +08:00
zheng 15892d6e57
规则名称支持变量 (#1217)
* 规则名称支持变量

* parse rule_name
2022-10-20 20:18:15 +08:00
xtan fbff60eefb
docs: pg init sql (#1210)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-10-20 12:32:13 +08:00
xtan 62867ddbf2
feat: conf file password supports ciphertext (#1207)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-10-20 12:31:48 +08:00
Ulric Qin 5d4acb6cc3 update sql 2022-10-19 12:25:50 +08:00
Yening Qin b893483d26
prom client support add header (#1203)
* prom client support add header
2022-10-18 20:44:38 +08:00
Ulric Qin 4130a5df02 board: support ident field 2022-10-18 19:46:08 +08:00
Ulric Qin 445d03e096 Merge branch 'main' of github.com:ccfos/nightingale 2022-10-12 20:37:34 +08:00
Ulric Qin 577c402a5b support: callback_del 2022-10-12 20:37:21 +08:00
zheng 40bbbfd475
bugfix: duplicate recoding rule metric name (#1186) 2022-10-12 18:36:04 +08:00
Ulric Qin 0d05ad85f2 callback_add callback_del 2022-10-12 18:29:59 +08:00
Ulric Qin e70622d18c bugfix: update cluster when heartbeat 2022-10-02 21:21:17 +08:00
gengleiming 562f98ddaf
bug: user update by multifields, param need '...' (#1170)
bug: 多字段更新用户时,参数作为slice接受,需要拆包之后再往下传入。该bug导致user.Update方法只能成功更新第一个参数对应的字段
2022-09-26 14:45:58 +08:00
Ulric Qin ee07969c8a code refactor 2022-09-23 10:52:52 +08:00
Ulric Qin 5b0e24cd40 code refactor 2022-09-23 10:12:12 +08:00
Ulric Qin 78b2e54910 bugfix: do not make event if target is nil 2022-09-23 10:06:10 +08:00
Ulric Qin 2e64c83632 bugfix: fix nil target 2022-09-23 10:03:10 +08:00
Ulric Qin 537d5d2386 add newline for categraf config.toml 2022-09-17 09:24:19 +08:00
Ulric Qin 86899b8c48 update README 2022-09-16 19:40:34 +08:00
Ulric Qin fcc45ebf2a update readme 2022-09-16 19:39:41 +08:00
KurolZ 95727e9c00
feat: support for sharing dashboards (#1150)
* feat: support for sharing dashboards

* merge dashboard get interface

* update read-only permission verification
2022-09-16 15:03:12 +08:00
Yening Qin 3a3ad5d9d9
feat: add configs service api (#1155)
* add configs service api
2022-09-16 12:06:39 +08:00
xtan 7209da192f
docs: fix pg init sql (#1154)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-09-15 18:28:09 +08:00
Ulric Qin 98f3508424 add configuration ForceUseServerTS 2022-09-09 13:44:50 +08:00
Ulric Qin c33900ee1b code refactor 2022-09-07 13:57:17 +08:00
Ulric Qin a2490104b9 add freedomkk-qfeng as active contributor 2022-09-07 13:56:51 +08:00
Ulric Qin 1a25c3804e add lsy1990 as active contributor 2022-09-07 13:55:12 +08:00
xtan 23eb766c14
feat: alert_subscribe add name and disabled (#1145)
* feat: alert_subscribe add name and disabled

* feat: alert_subscribe add name and disabled

Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-09-07 12:24:38 +08:00
SunnyBoy-WYH a7bad003f5
feat: alert-mute support edit and disable (#1144)
* batch query prom for single panel

* make code better:

1.extract server/api.go

2.make webapi reading prom with reusing server's API,not a new prom client

* clear code

* clear code

* format code
clear code

* move reader.go,reuse webapi/prom/prom.go clusterTypes clients cache

* clear code,extract common method

* feat: add edit and disabled for alert mute

* fix cr problem

* disabled add default 0
2022-09-05 12:11:50 +08:00
xtan e4e48cfda0
docs: pg init sql (#1142)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-09-03 18:59:31 +08:00
Ulric Qin d0ce4c25e5 Merge branch 'main' of github.com:ccfos/nightingale 2022-09-02 12:19:47 +08:00
Ulric Qin 01aea821b9 extract ident from append tags 2022-09-02 12:19:34 +08:00
xtan bdc1c1c60b
feat: compatible with redis4 to 7 (#1141)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-09-01 18:10:31 +08:00
Yening Qin 09f37b8076
refactor: change alert mute clean (#1140)
refactor: change alert mute clean
2022-09-01 11:18:04 +08:00
kongfei605 fc4c4b96bf
Merge pull request #1138 from ccfos/mm_notification
mm notification support at someone
2022-08-31 23:05:28 +08:00
kongfei 5c60c2c85e mm notification support at someone 2022-08-31 23:03:29 +08:00
kongfei 1e9bd900e9 update notify.py 2022-08-31 17:55:26 +08:00
kongfei605 1ca000af2c
Merge pull request #1137 from kongfei605/notification
add mattermost notification
2022-08-31 15:57:23 +08:00
kongfei 81fade557b fix dingtalk notification url 2022-08-31 15:45:59 +08:00
kongfei b82f646636 update configurarion in docker 2022-08-31 15:37:34 +08:00
kongfei 26a3d2dafa add mm notification with notify plugin 2022-08-31 15:31:27 +08:00
kongfei 5e931ebe8e add mm notification 2022-08-31 14:11:28 +08:00
Ulric Qin 8c45479c02 add primary key 2022-08-29 11:27:53 +08:00
Ulric Qin 940313bd4e use big nodata interval 2022-08-27 18:15:56 +08:00
xiaoziv 5057cd0ae6
add id column for table user_group_member and role_operation (#1126)
Co-authored-by: Ziv <xiaozheng@tuya.com>
2022-08-27 10:40:11 +08:00
Ulric Qin a4be2c73ac Merge branch 'main' of github.com:ccfos/nightingale 2022-08-27 10:35:27 +08:00
Ulric Qin a38e50d6b8 bugfix: server hearbeat 2022-08-27 10:35:15 +08:00
laiwei 89f66dd5d1
improve commuinity guide (#1133)
* improve community governance

* improve guide

* update contributors guide
2022-08-26 19:54:02 +08:00
ulricqin 3963470603
add configuration ForceUseServerTS (#1128) 2022-08-22 23:22:58 +08:00
xiaoziv 640b6e6825
fix: add board check when del group (#1124)
* fix: add board check when del group

* Update busi_group.go

Co-authored-by: Ziv <xiaozheng@tuya.com>
Co-authored-by: ulricqin <ulricqin@qq.com>
2022-08-22 19:08:40 +08:00
ulricqin e7d2c45f9d
Manage bindings of n9e-server and datasource in web (#1127)
* manage bindings of n9e-server and datasource

* fix sync memsto
2022-08-22 18:39:29 +08:00
Yening Qin 80ee54898a
feat: alert rule support cate (#1123)
* alert rule support cate

* his_event add cate

* change RecoverEvent time

* add get event api

* event query by cate
2022-08-22 14:17:17 +08:00
xtan fe68cebbf9
docs: sync pg init sql (#1122)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-08-19 15:57:06 +08:00
Ulric Qin c1fec215a9 add some api for server cluster bindings 2022-08-18 09:27:43 +08:00
Ulric Qin 388228a631 collect target total number 2022-08-17 20:16:42 +08:00
ulricqin b4ddd03691
read prom url from database (#1119)
* add model alerting_engine

* heartbeat using db

* reader.Client from database

* fix sql
2022-08-17 17:20:42 +08:00
Yening Qin b92e4abf86
feat: support handle event api (#1113)
* support handle event service api
2022-08-17 11:22:49 +08:00
Ulric Qin a1c458b764 use hostname:port as identity 2022-08-17 10:28:57 +08:00
laiwei acb4b8e33e
improve community governance (#1115) 2022-08-16 14:20:57 +08:00
zheng 54eab51e54
添加告警规则执行日志 (#1112) 2022-08-15 17:17:28 +08:00
HK.MF be89fde030
add aws cloudwatch rds metrics descriptions
Co-authored-by: e <hackermofrom@gmail.com>
2022-08-13 19:52:07 +08:00
jeff 37711ea6b2
add ping监控指标中文说明 (#1110)
add ping监控指标中文说明
2022-08-12 16:42:26 +08:00
xiaoziv 3b5c8d8357
optimize error report (#1109)
* optimize error report

* code refactor

* add /-/reload as reload route like prometheus

Co-authored-by: ziv <xiaozheng@tuya.com>
2022-08-12 14:10:03 +08:00
JellyTony 635369e3fd
Update docker-compose.yaml (#1107)
* Update docker-compose.yaml

* Update docker-compose.yaml

Co-authored-by: ulricqin <ulricqin@qq.com>
2022-08-12 13:16:02 +08:00
ulricqin 6c2c945bd9
event.Cluster use target.Cluster instead of rule.Cluster (#1108) 2022-08-12 13:13:06 +08:00
xiaoziv 48d24c79d6
use slim base image (#1105)
Co-authored-by: ziv <xiaozheng@tuya.com>
2022-08-11 19:35:33 +08:00
xiaoziv c6a1761a7b
support tpls reload (#1104)
Co-authored-by: ziv <xiaozheng@tuya.com>
2022-08-11 17:05:41 +08:00
Ulric Qin 23d7e5a7de add disk_util for target table 2022-08-10 17:05:29 +08:00
xiaoziv b1b2c7d6b0
feat: support ident disk usage metric (#1100)
* feat: support ident disk usage metric

* code refactor

Co-authored-by: ziv <xiaozheng@tuya.com>
2022-08-10 17:00:49 +08:00
Ulric Qin f34c3c6a2c comment default WriteRelabels 2022-08-10 16:53:41 +08:00
Ulric Qin 454dc7f983 go mod tidy 2022-08-10 16:52:51 +08:00
Resurgence c1e92b56b9
feat: add write_relabel action before n9e remote writing to multi tsdb (#1098)
* add write relabel config

* change parse relabel Regex field time when config loaded
2022-08-10 16:50:52 +08:00
xiaoziv fd93fd7182
feat: support i18n metric desc (#1097)
* support i18n metric desc

* code refactor

* code refactor

Co-authored-by: ziv <xiaozheng@tuya.com>
2022-08-10 13:21:11 +08:00
Ulric Qin 1a446f0749 fix configurations: TargetMetrics 2022-08-10 10:36:02 +08:00
Ulric Qin f18ed76593 escape TargetMetrics 2022-08-09 20:07:17 +08:00
Ulric Qin 9b3a9f29d9 extract promql to webapi.conf 2022-08-09 20:01:54 +08:00
Ulric Qin 49965fd5d5 fix target mem util 2022-08-09 17:19:27 +08:00
Ulric Qin a248e054fa add some host metrics for targets get api 2022-08-09 17:11:24 +08:00
ning bbb35d36be fix: categraf panic when use docker compose 2022-08-09 10:44:18 +08:00
xiaoziv fd3e51cbb1
fix i18n header bug (#1095)
Co-authored-by: ziv <xiaozheng@tuya.com>
2022-08-08 20:28:36 +08:00
xiaoziv bd0480216c
feat: support i18n request headerkey (#1094)
Co-authored-by: ziv <xiaozheng@tuya.com>
2022-08-08 19:02:14 +08:00
Ulric Qin 2c963258cf code refactor 2022-08-08 15:26:11 +08:00
Yening Qin b4f267fb01
feat: prom support tls (#1091) 2022-08-08 12:17:52 +08:00
xiaoziv ea46401db2
remove record rule check (#1090)
Co-authored-by: ziv <xiaozheng@tuya.com>
2022-08-06 18:15:04 +08:00
xiaoziv 58e777eb00
support graph url (#1088)
Co-authored-by: ziv <xiaozheng@tuya.com>
2022-08-06 18:12:33 +08:00
xiaoziv 04a9161f75
feat: support rule convert from prometheus/vmalert (#1087)
* feat: support rule convert from prometheus/vmalert

* Update rule_converter.py

* Update rule_converter.py

Co-authored-by: ulricqin <ulricqin@qq.com>
2022-08-04 20:06:54 +08:00
xtan 1ed8f38833
feat: add first trigger time (#1086)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-08-04 19:29:44 +08:00
Ulric Qin bb17751a81 fix typo 2022-08-02 12:21:01 +08:00
ulricqin a8dcb1fe83
add retry controller for poster (#1082) 2022-08-02 12:20:02 +08:00
Ulric Qin 1ea30e03a4 check user exists when refresh token 2022-08-01 14:44:22 +08:00
kongfei605 ba0eafa065
docker compose use latest version of n9e and categraf (#1079) 2022-07-29 17:38:28 +08:00
xtan c78c8d07f2
refactor: error info return (#1077)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-07-29 17:38:03 +08:00
Ulric Qin 8fe9e57c03 Merge branch 'main' of github.com:ccfos/nightingale 2022-07-29 17:35:49 +08:00
Ulric Qin 64646d2ace refactor linux dashboard 2022-07-29 17:35:23 +08:00
ning e747e73145 add debug log for ldap login 2022-07-29 15:38:45 +08:00
xtan 896f85efdf
refactor: add error log (#1076)
* refactor: add error log

* refactor: update error log

* refactor: fix error log

Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-07-29 11:41:39 +08:00
Ulric Qin 77e4499a32 refactor linux dashboard 2022-07-27 19:05:00 +08:00
ulricqin 7c351e09e5
add api: /board/:bid/pure (#1073) 2022-07-27 14:30:35 +08:00
xiaoziv 14ad3b1b0a
fix proxy auth username error (#1072) 2022-07-27 14:13:48 +08:00
Ulric Qin 184867d07c feature: query busigroup by ident 2022-07-27 13:13:17 +08:00
Ulric Qin 3476b95b35 fix: query busigroup by ident 2022-07-26 18:23:14 +08:00
Ulric Qin 76e105c93a query busigroup by ident 2022-07-26 17:59:57 +08:00
Ulric Qin 39705787c9 Merge branch 'main' of github.com:ccfos/nightingale 2022-07-26 15:54:42 +08:00
Ulric Qin 293680a9cd use english comma 2022-07-26 15:54:25 +08:00
Yening Qin 05005357fb
feat: push event api add mute (#1070) 2022-07-25 16:05:35 +08:00
ulricqin ba7ff133e6
modify prometheus query batch response format (#1068) 2022-07-23 17:50:16 +08:00
ulricqin 0bd7ba9549
code refactor notify (#1066) 2022-07-22 18:12:42 +08:00
ulricqin 17c7361620
code refactor notify plugin (#1065) 2022-07-22 17:56:52 +08:00
lsy1990 c45cbd02cc
supply plugin to notify maintainer (#1063) 2022-07-22 17:02:49 +08:00
hwloser 04cb501ab4
[fix] fix the docker problem of apple chip (#1060)
Co-authored-by: huanwei <huanwei@huanweideMacBook-Pro.local>
2022-07-21 14:46:27 +08:00
Yening Qin ba6f089c78
fix: get alert rules by api (#1059)
* fix event push api
2022-07-19 12:10:02 +08:00
Ulric Qin ab0cb6fc47 Merge branch 'main' of github.com:ccfos/nightingale 2022-07-18 17:08:07 +08:00
Ulric Qin 2847a315b1 add server-dash.json 2022-07-18 17:05:45 +08:00
Yening Qin 65439df7fb
fix event push api (#1057) 2022-07-18 14:37:31 +08:00
laiwei b6436b09ce
update community governance (#1056)
* update readme to add badge
* update community gov
2022-07-17 21:57:16 +08:00
Ulric Qin 92354d5765 code refactor 2022-07-17 13:22:16 +08:00
SunnyBoy-WYH 05651ad744
Query batch feature (#1052)
* batch query prom for single panel

* make code better:

1.extract server/api.go

2.make webapi reading prom with reusing server's API,not a new prom client

* clear code

* clear code

* format code
clear code

* move reader.go,reuse webapi/prom/prom.go clusterTypes clients cache

* clear code,extract common method
2022-07-17 12:52:33 +08:00
Ulric Qin b7ff82d722 alertSubscribePut can modify cluster 2022-07-13 19:24:09 +08:00
Ulric Qin a285966560 fix func RecordingRuleGetsByCluster 2022-07-13 11:01:27 +08:00
xiaoziv 538880b0e0
[feature] support multiple cluster config with mute&subscribe (#1046)
* [feature] support multiple cluster config with mute&subscribe

* [feature] support multiple cluster config with mute&subscribe
2022-07-13 10:56:57 +08:00
kongfei605 299270f74e
keep build version in Makefile consistency with goreleaser (#1047) 2022-07-12 23:43:24 +08:00
xiaoziv 9c69362650
[feat(#984)] multiple cluster support (#1045)
* [feat(#984)] multiple cluster support

* add stats ClusterAll handle
2022-07-12 19:30:42 +08:00
ulricqin d508aef7e5
fix mute: parse regexp (#1044) 2022-07-12 16:39:23 +08:00
Ulric Qin 616674b643 code refactor 2022-07-11 13:10:40 +08:00
zheng 94847d9059
get rule node (#1042) 2022-07-11 13:06:11 +08:00
Ulric Qin cbd416495c modify server.conf in docker env 2022-07-11 13:04:04 +08:00
Ulric Qin cc32194fb6 code refactor 2022-07-10 11:27:36 +08:00
Ulric Qin f5e2b43526 go mod tidy 2022-07-10 10:28:43 +08:00
xiaoziv 5bc8f0b9b1
Feature mute enhancement (#1041)
* [feature(#1029)] alert mute enhancement

* handle error of proxy user Add
2022-07-09 22:06:33 +08:00
xtan 7359a69223
fix: fix plugin error (#1038)
* fix: fix plugin error

* fix-plugin

Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-07-08 18:21:09 +08:00
xtan 04d64d09d7
fix: fix version info (#1036)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-07-08 16:22:15 +08:00
xiaoziv 43343182e4
[feature] add proxy auth support (#1035)
Co-authored-by: ziv <xiazoheng@tuya.com>
2022-07-08 15:19:22 +08:00
Ulric Qin 072ab98fcf use ForwardDuration in goroutine 2022-07-08 12:53:32 +08:00
Ulric Qin 35ef6b9265 duplicate label key checker 2022-07-08 12:02:57 +08:00
Ulric Qin eaa53f2533 check duplicate label key 2022-07-08 11:48:44 +08:00
Ulric Qin de322c4daf add n9e_server.json 2022-07-08 10:03:04 +08:00
Ulric Qin 936c751a93 Merge branch 'main' of github.com:ccfos/nightingale 2022-07-08 09:48:22 +08:00
Ulric Qin 796a7014a1 use goroutine to forward data 2022-07-08 09:48:08 +08:00
xtan f4368302ea
fix: pg sql for recording rule (#1034)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-07-07 15:53:12 +08:00
kongfei605 01e611a9f9
auto release with github action (#1032)
* auto release with github action

* build arm64 artifacts
2022-07-07 14:17:23 +08:00
Yening Qin 315e0ef903
fix: get clusters by api (#1030) 2022-07-07 12:29:35 +08:00
Ulric Qin 98d5dfff8e add namespace and subsystem prefix for metrics 2022-07-07 12:23:06 +08:00
Ulric Qin 6b4705608b add forward stat 2022-07-07 12:13:45 +08:00
Ulric Qin 5907817cba n9e-server: add http request stat 2022-07-07 10:52:04 +08:00
Ulric Qin aa97ac54d1 register GaugeSampleQueueSize 2022-07-07 10:17:15 +08:00
Ulric Qin 8fe548aba9 rename mapkey alertname to rulename 2022-07-07 10:06:34 +08:00
Tripitakav 18a9288b75
fix mute bug (#1025)
Co-authored-by: tripitakav <chengzhi.shang@longbridge.sg>
2022-07-07 10:05:39 +08:00
ulricqin fe82886f09
report sample queue size (#1027)
* report sample queue size

* report sample channel size
2022-07-07 10:00:08 +08:00
xtan 32e6993eea
fix: fix event api for service (#1026)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-07-07 09:58:05 +08:00
ning 56b61909a3 fix: event service api 2022-07-07 09:44:26 +08:00
ulricqin 2ef541cdd7
refactor recording rule and and field disabled (#1022) 2022-07-06 17:21:14 +08:00
laiwei 6b1d283cda
Merge pull request #1019 from ccfos/community-guide
add community guide and governance docs (draft)
2022-07-06 17:08:14 +08:00
laiwei c8e5566c81 add stargazers chart 2022-07-06 16:25:54 +08:00
laiwei 7f3d9df089 update 2022-07-06 16:20:58 +08:00
laiwei 99aa4dbca8 update format 2022-07-06 16:12:59 +08:00
Ulric Qin c193b8abd4 remove drop table sql 2022-07-06 16:08:24 +08:00
laiwei 4efdc4f169 update format 2022-07-06 16:06:13 +08:00
Tripitakav 1304a4630b
Add recording rule (#1015)
* add prometheus recording rules

* fix recording rule sql

* add record rule note

* fix copy error

* add some regx

Co-authored-by: 尚承志 <chengzhi.shang@longbridge.sg>
2022-07-06 15:58:08 +08:00
laiwei 2cc3f939a7 Merge branch 'main' into community-guide 2022-07-06 15:44:11 +08:00
laiwei d0260e564c Merge remote-tracking branch 'origin/main' into community-guide 2022-07-06 15:40:52 +08:00
laiwei 2a24179423 update readme to add community governance 2022-07-06 15:40:25 +08:00
laiwei 34082b44f1 Merge branch 'main' of github.com:ccfos/nightingale 2022-07-06 13:12:04 +08:00
UlricQin bfe340d24d upgrade 5.9.4 2022-07-05 16:54:15 +08:00
xtan a9288e376d
feat: persist notify cur number (#1013)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-07-05 16:42:20 +08:00
laiwei cb9a03d010 update guide 2022-07-05 00:43:36 +08:00
laiwei e62366b755 community guide 2022-07-05 00:37:01 +08:00
Ulric Qin 2a2a96d9fc add contains funcmap 2022-07-04 20:03:11 +08:00
ysyneu 64a671ae13
update kafka alerts and dashboard (#1012)
* update kafka alerts and dashboard

* update kafka dashboard

Co-authored-by: yushuangyu <yushuangyu@flashcat.cloud>
2022-07-04 19:56:51 +08:00
Ulric Qin 45945876d8 update README 2022-07-03 08:57:13 +08:00
Henry Chia 90dacd0085
fix typo (#1004)
* 修改拼写错误

修改拼写错误
exsits -> exists

* Update router_login.go
2022-06-29 19:08:58 +08:00
ning 540ef68dc8 fix: alert mute add by service 2022-06-29 11:11:12 +08:00
zheng 54cc981956
fix ForDuration (#999) 2022-06-28 16:13:23 +08:00
xtan 2e8ea354d7
refactor: now categraf v0.1.6 is fine (#993)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
Co-authored-by: ulricqin <ulric.qin@gmail.com>
2022-06-28 11:01:23 +08:00
Ulric Qin 217f52294e update categraf image version to v0.1.9 2022-06-27 15:06:14 +08:00
Ulric Qin f9b2675077 upgrade 5.9.3 2022-06-27 15:01:34 +08:00
Ulric Qin 95fd2d99b2 code refactor 2022-06-27 12:28:40 +08:00
Ulric Qin dba2b23e9e code refactor 2022-06-27 12:25:26 +08:00
Ulric Qin acbc199143 code refactor 2022-06-27 12:23:25 +08:00
Ulric Qin 2449c8715e update doc 2022-06-27 12:18:42 +08:00
Ulric Qin c62593c0eb update README 2022-06-23 10:37:47 +08:00
Ulric Qin 00cbc9342f modify vx-qrcode.png 2022-06-22 16:56:08 +08:00
Ulric Qin df5a3a37f2 refactor docker-compose.yaml for categraf 2022-06-22 10:41:59 +08:00
xtan 5ec14c588b
Feat:update docker-compose from telegraf to categraf (#992)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-06-22 10:25:57 +08:00
Ulric Qin d78a3a638a update issue template 2022-06-20 15:20:50 +08:00
Ulric Qin 19d2cbfa27 add issue_template 2022-06-20 15:17:10 +08:00
chenxuan f9af916352
fix alert put api not verify bug (#987) 2022-06-20 11:50:14 +08:00
xtan 90db12b513
Fix:fix target_up nodata judge for prometheus scrape (#986) 2022-06-17 22:44:25 +08:00
Ulric Qin 7d326ef306 use metrics as hash key 2022-06-17 09:56:10 +08:00
Ulric Qin d0b005fb14 code refactor: set createBy when update metric_view 2022-06-16 13:17:58 +08:00
Ulric Qin 118060cf77 UPDATE README 2022-06-15 15:38:11 +08:00
Ulric Qin d2ef68daac upgrade 5.9.2 2022-06-15 14:40:43 +08:00
Ulric Qin 8393b93c53 Merge branch 'main' of github.com:ccfos/nightingale 2022-06-15 14:01:22 +08:00
Ulric Qin 63adcc2cd9 bugfix for alert-aggr-views 2022-06-15 14:01:01 +08:00
xtan 60c842c704
fix: NotifyMaxNumber for postgres db (#978)
* fix: NotifyMaxNumber for postgres db

* fix: sql for pg

Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-06-13 10:56:14 +08:00
Ulric Qin f6fd6aed7f add some categraf alerts.json 2022-06-11 17:54:52 +08:00
Ulric Qin cb92368e5b add categraf dashboard 2022-06-11 17:40:43 +08:00
Ulric Qin 8cd97db362 add some categraf dashboard 2022-06-11 17:38:10 +08:00
Ulric Qin 94e1359895 fix handler: NotifyMaxNumber 2022-06-10 17:49:48 +08:00
Ulric Qin 1bcc5b77ec remote write and read: support header 2022-06-10 17:37:33 +08:00
Ulric Qin ae622e0c08 fix 2022-06-10 16:36:47 +08:00
Ulric Qin c951f7d822 support max notify number 2022-06-10 16:26:53 +08:00
Ulric Qin 6a366acc74 modify log level 2022-06-10 15:39:45 +08:00
Ulric Qin a5f7d5e9cf modify log level 2022-06-10 15:15:13 +08:00
Ulric Qin ea2249c30c forward samples in sequence 2022-06-10 14:20:18 +08:00
Ulric Qin a8c60c9f2b alert_aggr_view support modify by admin 2022-06-10 13:55:26 +08:00
xtan 0581e02cf3
Feat:add common template functions (#976)
* Feat:增加常用模板函数

* Feat:修改增加模板函数的实现方式

Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-06-08 20:40:45 +08:00
Ulric Qin efec811b91 update README 2022-06-08 13:40:06 +08:00
Ulric Qin f85209c817 add gif in README 2022-06-08 11:19:57 +08:00
Ulric Qin 7fda5a9a4b update README 2022-06-08 11:17:03 +08:00
Ulric Qin ab689fc0db update README 2022-06-08 11:12:51 +08:00
Ulric Qin cdcdbe8f70 update README 2022-06-08 11:11:15 +08:00
Ulric Qin 46ca8a409a update README 2022-06-08 10:53:58 +08:00
Ulric Qin e5c1641b6b code refactor: move struct ReaderOptions to config 2022-06-07 18:00:11 +08:00
Ulric Qin 3e475e7e08 Merge branch 'main' of github.com:ccfos/nightingale 2022-06-07 17:56:49 +08:00
Ulric Qin 3899144f8f add header for writer post 2022-06-07 17:56:23 +08:00
ning b8cb9e7734 fix: linux_by_telegraf dashboard 2022-06-07 14:17:17 +08:00
xtan c62b9edf87
fix:pg数据库脚本同步 (#974)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-06-06 11:48:57 +08:00
ning 0e5aea40e8 Merge branch 'main' of github.com:ccfos/nightingale 2022-06-02 11:07:40 +08:00
ning 1dbfcd3dc8 refactor: service api 2022-06-02 11:07:31 +08:00
xtan a4ef5fca46
pg数据库初始化脚本字段同步 (#969)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-06-01 19:57:17 +08:00
Ulric Qin 7cf309345f Merge branch 'main' of github.com:ccfos/nightingale 2022-06-01 12:58:36 +08:00
Ulric Qin 495632a064 fix alert rule delete by service 2022-06-01 12:58:09 +08:00
xtan f6591e80ea
Feat:提供基于Postgres的数据库初始化脚本 (#967)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-05-31 18:06:42 +08:00
Ulric Qin ab5e8c366e code refactor 2022-05-31 14:44:57 +08:00
Ulric Qin ce35e23a0f modify alert rule verify 2022-05-31 13:08:09 +08:00
Ulric Qin ece263ea45 update readme 2022-05-30 09:44:19 +08:00
Ulric Qin f777318cc7 update README 2022-05-30 09:13:44 +08:00
Ulric Qin c29f3ecdeb update README 2022-05-30 09:12:16 +08:00
Ulric Qin 8f8740ad94 update README 2022-05-30 09:10:58 +08:00
Ulric Qin 9acabba761 use go1.18 and tidy go.mod 2022-05-30 09:08:54 +08:00
204 changed files with 18909 additions and 10874 deletions

View File

@ -1,12 +0,0 @@
---
name: Bug Report
about: Report a bug encountered while operating Nightingale
labels: kind/bug
---
**夜莺版本**:
**问题和复现方法**:

67
.github/ISSUE_TEMPLATE/bug_report.yml vendored Normal file
View File

@ -0,0 +1,67 @@
name: Bug Report
description: Report a bug encountered while running Nightingale
labels: ["kind/bug"]
body:
- type: markdown
attributes:
value: |
Thanks for taking time to fill out this bug report!
The more detailed the form is filled in, the easier the problem will be solved.
- type: textarea
id: config
attributes:
label: Relevant server.conf | webapi.conf
description: Place config in the toml code section. This will be automatically formatted into toml, so no need for backticks.
render: toml
validations:
required: true
- type: textarea
id: logs
attributes:
label: Relevant logs
description: categraf | telegraf | server | webapi | prometheus | chrome request/response ...
render: text
validations:
required: true
- type: input
id: system-info
attributes:
label: System info
description: Include nightingale version, operating system, and other relevant details
placeholder: ex. n9e 5.9.2, n9e-fe 5.5.0, categraf 0.1.0, Ubuntu 20.04, Docker 20.10.8
validations:
required: true
- type: textarea
id: reproduce
attributes:
label: Steps to reproduce
description: Describe the steps to reproduce the bug.
value: |
1.
2.
3.
...
validations:
required: true
- type: textarea
id: expected-behavior
attributes:
label: Expected behavior
description: Describe what you expected to happen when you performed the above steps.
validations:
required: true
- type: textarea
id: actual-behavior
attributes:
label: Actual behavior
description: Describe what actually happened when you performed the above steps.
validations:
required: true
- type: textarea
id: additional-info
attributes:
label: Additional info
description: Include gist of relevant config, logs, etc.
validations:
required: false

View File

@ -1,5 +1,5 @@
blank_issues_enabled: false
contact_links:
- name: Nightingale community
url: https://n9e.didiyun.com/community/
about: List of communication channels for the Nightingale community.
- name: Nightingale docs
url: https://n9e.github.io/
about: You may want to read through the document before asking questions.

View File

@ -1,26 +1,32 @@
name: Go
name: Release
on:
push:
branches: [ main ]
pull_request:
branches: [ main ]
tags:
- 'v*'
env:
GO_VERSION: 1.18
jobs:
build:
name: Build
goreleaser:
runs-on: ubuntu-latest
steps:
- name: Set up Go 1.17
uses: actions/setup-go@v1
with:
go-version: 1.17
id: go
- name: Check out code into the Go module directory
uses: actions/checkout@v2
- name: Build
run: make
- name: Checkout Source Code
uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Setup Go Environment
uses: actions/setup-go@v3
with:
go-version: ${{ env.GO_VERSION }}
- uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Run GoReleaser
uses: goreleaser/goreleaser-action@v3
with:
version: latest
args: release --rm-dist
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

88
.goreleaser.yaml Normal file
View File

@ -0,0 +1,88 @@
before:
hooks:
# You may remove this if you don't use go modules.
- go mod tidy
snapshot:
name_template: '{{ .Tag }}'
checksum:
name_template: 'checksums.txt'
changelog:
skip: true
builds:
- id: build
hooks:
pre:
- ./fe.sh
main: ./src/
binary: n9e
env:
- CGO_ENABLED=0
goos:
- linux
goarch:
- amd64
- arm64
ldflags:
- -s -w
- -X github.com/didi/nightingale/v5/src/pkg/version.VERSION={{ .Tag }}-{{.Commit}}
archives:
- id: n9e
builds:
- build
format: tar.gz
format_overrides:
- goos: windows
format: zip
name_template: "n9e-v{{ .Version }}-{{ .Os }}-{{ .Arch }}"
wrap_in_directory: false
files:
- docker/*
- etc/*
- pub/*
release:
github:
owner: ccfos
name: nightingale
name_template: "v{{ .Version }}"
dockers:
- image_templates:
- flashcatcloud/nightingale:{{ .Version }}-amd64
goos: linux
goarch: amd64
ids:
- build
dockerfile: docker/Dockerfile.goreleaser
extra_files:
- pub
use: buildx
build_flag_templates:
- "--platform=linux/amd64"
- image_templates:
- flashcatcloud/nightingale:{{ .Version }}-arm64v8
goos: linux
goarch: arm64
ids:
- build
dockerfile: docker/Dockerfile.goreleaser
extra_files:
- pub
use: buildx
build_flag_templates:
- "--platform=linux/arm64/v8"
docker_manifests:
- name_template: flashcatcloud/nightingale:{{ .Version }}
image_templates:
- flashcatcloud/nightingale:{{ .Version }}-amd64
- flashcatcloud/nightingale:{{ .Version }}-arm64v8
- name_template: flashcatcloud/nightingale:latest
image_templates:
- flashcatcloud/nightingale:{{ .Version }}-amd64
- flashcatcloud/nightingale:{{ .Version }}-arm64v8

View File

@ -2,10 +2,15 @@
NOW = $(shell date -u '+%Y%m%d%I%M%S')
RELEASE_VERSION = 5.9.0
APP = n9e
SERVER_BIN = $(APP)
ROOT:=$(shell pwd -P)
GIT_COMMIT:=$(shell git --work-tree ${ROOT} rev-parse 'HEAD^{commit}')
_GIT_VERSION:=$(shell git --work-tree ${ROOT} describe --tags --abbrev=14 "${GIT_COMMIT}^{commit}" 2>/dev/null)
TAG=$(shell echo "${_GIT_VERSION}" | awk -F"-" '{print $$1}')
RELEASE_VERSION:="$(TAG)-$(GIT_COMMIT)"
# RELEASE_ROOT = release
# RELEASE_SERVER = release/${APP}
# GIT_COUNT = $(shell git rev-list --all --count)
@ -17,6 +22,9 @@ all: build
build:
go build -ldflags "-w -s -X github.com/didi/nightingale/v5/src/pkg/version.VERSION=$(RELEASE_VERSION)" -o $(SERVER_BIN) ./src
build-linux:
GOOS=linux GOARCH=amd64 go build -ldflags "-w -s -X github.com/didi/nightingale/v5/src/pkg/version.VERSION=$(RELEASE_VERSION)" -o $(SERVER_BIN) ./src
# start:
# @go run -ldflags "-X main.VERSION=$(RELEASE_TAG)" ./cmd/${APP}/main.go web -c ./configs/config.toml -m ./configs/model.conf --menu ./configs/menu.yaml
run_webapi:

165
README.md
View File

@ -1,84 +1,121 @@
<img src="doc/img/ccf-n9e.png" width="240">
<p align="center">
<a href="https://github.com/ccfos/nightingale">
<img src="doc/img/ccf-n9e.png" alt="nightingale - cloud native monitoring" width="240" /></a>
<p align="center">夜莺是一款开源的云原生监控系统,采用 all-in-one 的设计,提供企业级的功能特性,开箱即用的产品体验。推荐升级您的 Prometheus + AlertManager + Grafana 组合方案到夜莺</p>
</p>
# Nightingale
<p align="center">
<img alt="GitHub latest release" src="https://img.shields.io/github/v/release/ccfos/nightingale"/>
<a href="https://n9e.github.io">
<img alt="Docs" src="https://img.shields.io/badge/docs-get%20started-brightgreen"/></a>
<a href="https://hub.docker.com/u/flashcatcloud">
<img alt="Docker pulls" src="https://img.shields.io/docker/pulls/flashcatcloud/nightingale"/></a>
<img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/ccfos/nightingale">
<img alt="GitHub forks" src="https://img.shields.io/github/forks/ccfos/nightingale">
<a href="https://github.com/ccfos/nightingale/graphs/contributors">
<img alt="GitHub contributors" src="https://img.shields.io/github/contributors-anon/ccfos/nightingale"/></a>
<img alt="License" src="https://img.shields.io/badge/license-Apache--2.0-blue"/>
</p>
Nightingale is an enterprise-level cloud-native monitoring system, which can be used as drop-in replacement of Prometheus for alerting and management.
[English](./README_EN.md) | [中文](./README.md)
[English](./README.md) | [中文](./README_ZH.md)
## Highlighted Features
## Introduction
Nightingale is an cloud-native monitoring system by All-In-On design, support enterprise-class functional features with an out-of-the-box experience. We recommend upgrading your `Prometheus` + `AlertManager` + `Grafana` combo solution to Nightingale.
- **开箱即用**
- 支持 Docker、Helm Chart、云服务等多种部署方式集数据采集、监控告警、可视化为一体内置多种监控仪表盘、快捷视图、告警规则模板导入即可快速使用**大幅降低云原生监控系统的建设成本、学习成本、使用成本**
- **专业告警**
- 可视化的告警配置和管理,支持丰富的告警规则,提供屏蔽规则、订阅规则的配置能力,支持告警多种送达渠道,支持告警自愈、告警事件管理等;
- **云原生**
- 以交钥匙的方式快速构建企业级的云原生监控体系,支持 [**Categraf**](https://github.com/flashcatcloud/categraf)、Telegraf、Grafana-agent 等多种采集器,支持 Prometheus、VictoriaMetrics、M3DB、ElasticSearch 等多种数据库,兼容支持导入 Grafana 仪表盘,**与云原生生态无缝集成**
- **高性能,高可用**
- 得益于夜莺的多数据源管理引擎,和夜莺引擎侧优秀的架构设计,借助于高性能时序库,可以满足数亿时间线的采集、存储、告警分析场景,节省大量成本;
- 夜莺监控组件均可水平扩展,无单点,已在上千家企业部署落地,经受了严苛的生产实践检验。众多互联网头部公司,夜莺集群机器达百台,处理数亿级时间线,重度使用夜莺监控;
- **灵活扩展,中心化管理**
- 夜莺监控,可部署在 1 核 1G 的云主机,可在上百台机器集群化部署,可运行在 K8s 中;也可将时序库、告警引擎等组件下沉到各机房、各 Region兼顾边缘部署和中心化统一管理**解决数据割裂,缺乏统一视图的难题**
- **开放社区**
- 托管于[中国计算机学会开源发展委员会](https://www.ccf.org.cn/kyfzwyh/),有[**快猫星云**](https://flashcat.cloud)和众多公司的持续投入,和数千名社区用户的积极参与,以及夜莺监控项目清晰明确的定位,都保证了夜莺开源社区健康、长久的发展。活跃、专业的社区用户也在持续迭代和沉淀更多的最佳实践于产品中;
- **Multiple prometheus data sources management**: manage all alerts and dashboards in one centralized visually view;
- **Out-of-the-box alert rule**: built-in multiple alert rules, reuse alert rules template by one-click import with detailed explanation of metrics;
- **Multiple modes for visualizing data**: out-of-the-box dashboards, instance customize views, expression browser and Grafana integration;
- **Multiple collection clients**: support using Promethues Exporter、Telegraf、Datadog Agent to collecting metrics;
- **Integration of multiple storage**: support Prometheus, M3DB, VictoriaMetrics, Influxdb, TDEngine as storage solutions, and original support for PromQL;
- **Fault self-healing**: support the ability to self-heal from failures by configuring webhook;
> 如果您在使用 Prometheus 过程中,有以下的一个或者多个需求场景,推荐您无缝升级到夜莺:
#### If you are using Prometheus and have one or more of the following requirement scenarios, it is recommended that you upgrade to Nightingale:
- Prometheus、Alertmanager、Grafana 等多个系统较为割裂,缺乏统一视图,无法开箱即用;
- 通过修改配置文件来管理 Prometheus、Alertmanager 的方式,学习曲线大,协同有难度;
- 数据量过大而无法扩展您的 Prometheus 集群;
- 生产环境运行多套 Prometheus 集群,面临管理和使用成本高的问题;
- Multiple systems such as Prometheus, Alertmanager, Grafana, etc. are fragmented and lack a unified view and cannot be used out of the box;
- The way to manage Prometheus and Alertmanager by modifying configuration files has a big learning curve and is difficult to collaborate;
- Too much data to scale-up your Prometheus cluster;
- Multiple Prometheus clusters running in production environments, which faced high management and usage costs;
> 如果您在使用 Zabbix有以下的场景推荐您升级到夜莺
#### If you are using Zabbix and have the following scenarios, it is recommended that you upgrade to Nightingale:
- 监控的数据量太大,希望有更好的扩展解决方案;
- 学习曲线高,多人多团队模式下,希望有更好的协同使用效率;
- 微服务和云原生架构下监控数据的生命周期多变、监控数据维度基数高Zabbix 数据模型不易适配;
- Monitoring too much data and wanting a better scalable solution;
- A high learning curve and a desire for better efficiency of collaborative use in a multi-person, multi-team model;
- Microservice and cloud-native architectures with variable monitoring data lifecycles and high monitoring data dimension bases, which are not easily adaptable to the Zabbix data model;
> 如果您在使用 [Open-Falcon](https://github.com/open-falcon/falcon-plus),我们更推荐您升级到夜莺:
- 关于 Open-Falcon 和夜莺的详细介绍,请参考阅读:[云原生监控的十个特点和趋势](https://mp.weixin.qq.com/s?__biz=MzkzNjI5OTM5Nw==&mid=2247483738&idx=1&sn=e8bdbb974a2cd003c1abcc2b5405dd18&chksm=c2a19fb0f5d616a63185cd79277a79a6b80118ef2185890d0683d2bb20451bd9303c78d083c5#rd)。
> 我们推荐您使用 [Categraf](https://github.com/flashcatcloud/categraf) 作为首选的监控数据采集器:
- [Categraf](https://github.com/flashcatcloud/categraf) 是夜莺监控的默认采集器采用开放插件机制和 all-in-one 的设计同时支持 metric、log、trace、event 的采集。Categraf 不仅可以采集 CPU、内存、网络等系统层面的指标也集成了众多开源组件的采集能力支持K8s生态。Categraf 内置了对应的仪表盘和告警规则开箱即用。
## Getting Started
- [快速安装](https://mp.weixin.qq.com/s/iEC4pfL1TgjMDOWYh8H-FA)
- [详细文档](https://n9e.github.io/)
- [社区分享](https://n9e.github.io/docs/prologue/share/)
## Screenshots
<img src="doc/img/intro.gif" width="480">
#### If you are using [open-falcon](https://github.com/open-falcon/falcon-plus), we recommend you to upgrade to Nightingale
- For more information about open-falcon and Nightingale, please refer to read [Ten features and trends of cloud-native monitoring](https://mp.weixin.qq.com/s?__biz=MzkzNjI5OTM5Nw==&mid=2247483738&idx=1&sn=e8bdbb974a2cd003c1abcc2b5405dd18&chksm=c2a19fb0f5d616a63185cd79277a79a6b80118ef2185890d0683d2bb20451bd9303c78d083c5#rd)。
## Architecture
## Quickstart
- [n9e.github.io/quickstart](https://n9e.github.io/quickstart/)
<img src="doc/img/arch-product.png" width="480">
## Documentation
- [n9e.github.io](https://n9e.github.io/)
夜莺监控可以接收各种采集器上报的监控数据(比如 [Categraf](https://github.com/flashcatcloud/categraf)、telegraf、grafana-agent、Prometheus并写入多种流行的时序数据库中可以支持Prometheus、M3DB、VictoriaMetrics、Thanos、TDEngine等提供告警规则、屏蔽规则、订阅规则的配置能力提供监控数据的查看能力提供告警自愈机制告警触发之后自动回调某个webhook地址或者执行某个脚本提供历史告警事件的存储管理、分组查看的能力。
## Example of use
#### You can import and set MySQL-related alert rules:
<img src="doc/img/mysql-alerts.png" width="680">
<img src="doc/img/arch-system.png" width="480">
#### You can import and set the host-related dashboard
<img src="doc/img/n9e-node-dashboard.png" width="680">
#### You can also easily view all active alerts and historical alerts in Nightingale:
<img src="https://n9e.github.io/intro/alert-events.png" width="680">
## System Architecture
#### A typical Nightingale deployment architecture:
<img src="https://n9e.github.io/intro/arch-system.png" width="680">
#### Typical deployment architecture using VictoriaMetrics as storage:
<img src="https://n9e.github.io/fc-monitoring-vm.png" width="680">
## Contact us and feedback questions
- We recommend that you use [github issue](https://github.com/didi/nightingale/issues) as the preferred channel for issue feedback and requirement submission;
- You can join our WeChat group - [Nightingale WeChat Group](https://s3-gz01.didistatic.com/n9e-pub/image/n9e-wx.png);
<img src="https://n9e.github.io/cloudmon.png" width="180">
夜莺 v5 版本的设计非常简单,核心是 server 和 webapi 两个模块webapi 无状态放到中心端承接前端请求将用户配置写入数据库server 是告警引擎和数据转发模块,一般随着时序库走,一个时序库就对应一套 server每套 server 可以只用一个实例也可以多个实例组成集群server 可以接收 Categraf、Telegraf、Grafana-Agent、Datadog-Agent、Falcon-Plugins 上报的数据,写入后端时序库,周期性从数据库同步告警规则,然后查询时序库做告警判断。每套 server 依赖一个 redis。
## Contributing
We welcome your participation in the Nightingale open source project and open source community in a variety of ways:
- Feedback on problems and bugs => [github issue](https://github.com/didi/nightingale/issues)
- Additional and improved documentation => [n9e.github.io](https://n9e.github.io/)
- Share your best practices and insights on using Nightingale => [User Story](https://github.com/didi/nightingale/issues/897)
- Join our community events => [Nightingale wechat group](https://s3-gz01.didistatic.com/n9e-pub/image/n9e-wx.png)
- Submit code to make Nightingale better =>[github PR](https://github.com/didi/nightingale/pulls)
<img src="doc/img/install-vm.png" width="480">
## TODO
- [x] deploy nightingale in docker
- [x] export /metrics endpoint
- [x] notify.py support feishu
- [ ] notify.py support sms
- [ ] notify.py support voice
- [x] support remote write api
- [ ] support pushgateway api
如果单机版本的时序数据库(比如 Prometheus 性能有瓶颈或容灾较差,我们推荐使用 [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics)VictoriaMetrics 架构较为简单性能优异易于部署和运维架构图如上。VictoriaMetrics 更详尽的文档,还请参考其[官网](https://victoriametrics.com/)。
## Community
开源项目要更有生命力,离不开开放的治理架构和源源不断的开发者和用户共同参与,我们致力于建立开放、中立的开源治理架构,吸纳更多来自企业、高校等各方面对云原生监控感兴趣、有热情的开发者,一起打造有活力的夜莺开源社区。关于《夜莺开源项目和社区治理架构(草案)》,请查阅 [COMMUNITY GOVERNANCE](./doc/community-governance.md).
**我们欢迎您以各种方式参与到夜莺开源项目和开源社区中来,工作包括不限于**
- 补充和完善文档 => [n9e.github.io](https://n9e.github.io/)
- 分享您在使用夜莺监控过程中的最佳实践和经验心得 => [文章分享](https://n9e.github.io/docs/prologue/share/)
- 提交产品建议 =》 [github issue](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md)
- 提交代码,让夜莺监控更快、更稳、更好用 => [github pull request](https://github.com/didi/nightingale/pulls)
**尊重、认可和记录每一位贡献者的工作**是夜莺开源社区的第一指导原则,我们提倡**高效的提问**,这既是对开发者时间的尊重,也是对整个社区知识沉淀的贡献:
- 提问之前请先查阅 [FAQ](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq)
- 我们使用[GitHub Discussions](https://github.com/ccfos/nightingale/discussions)作为交流论坛,有问题可以到这里搜索、提问
- 我们也推荐你加入微信群,和其他夜莺用户交流经验 (请先加好友:[UlricGO](https://www.gitlink.org.cn/UlricQin/gist/tree/master/self.jpeg) 备注:夜莺加群+姓名+公司)
## Who is using
您可以通过在 **[Who is Using Nightingale](https://github.com/ccfos/nightingale/issues/897)** 登记您的使用情况,分享您的使用经验。
## Stargazers
[![Stargazers over time](https://starchart.cc/ccfos/nightingale.svg)](https://starchart.cc/ccfos/nightingale)
## Contributors
<a href="https://github.com/ccfos/nightingale/graphs/contributors">
<img src="https://contrib.rocks/image?repo=ccfos/nightingale" />
</a>
## License
Nightingale with [Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE) open source license.
[Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE)
## Contact Us
推荐您关注夜莺监控公众号,及时获取相关产品和社区动态:
<img src="doc/img/n9e-vx-new.png" width="120">

68
README_EN.md Normal file
View File

@ -0,0 +1,68 @@
<img src="doc/img/ccf-n9e.png" width="240">
Nightingale is an enterprise-level cloud-native monitoring system, which can be used as drop-in replacement of Prometheus for alerting and management.
[English](./README_EN.md) | [中文](./README.md)
## Introduction
Nightingale is an cloud-native monitoring system by All-In-On design, support enterprise-class functional features with an out-of-the-box experience. We recommend upgrading your `Prometheus` + `AlertManager` + `Grafana` combo solution to Nightingale.
- **Multiple prometheus data sources management**: manage all alerts and dashboards in one centralized visually view;
- **Out-of-the-box alert rule**: built-in multiple alert rules, reuse alert rules template by one-click import with detailed explanation of metrics;
- **Multiple modes for visualizing data**: out-of-the-box dashboards, instance customize views, expression browser and Grafana integration;
- **Multiple collection clients**: support using Promethues Exporter、Telegraf、Datadog Agent to collecting metrics;
- **Integration of multiple storage**: support Prometheus, M3DB, VictoriaMetrics, Influxdb, TDEngine as storage solutions, and original support for PromQL;
- **Fault self-healing**: support the ability to self-heal from failures by configuring webhook;
#### If you are using Prometheus and have one or more of the following requirement scenarios, it is recommended that you upgrade to Nightingale:
- Multiple systems such as Prometheus, Alertmanager, Grafana, etc. are fragmented and lack a unified view and cannot be used out of the box;
- The way to manage Prometheus and Alertmanager by modifying configuration files has a big learning curve and is difficult to collaborate;
- Too much data to scale-up your Prometheus cluster;
- Multiple Prometheus clusters running in production environments, which faced high management and usage costs;
#### If you are using Zabbix and have the following scenarios, it is recommended that you upgrade to Nightingale:
- Monitoring too much data and wanting a better scalable solution;
- A high learning curve and a desire for better efficiency of collaborative use in a multi-person, multi-team model;
- Microservice and cloud-native architectures with variable monitoring data lifecycles and high monitoring data dimension bases, which are not easily adaptable to the Zabbix data model;
#### If you are using [open-falcon](https://github.com/open-falcon/falcon-plus), we recommend you to upgrade to Nightingale
- For more information about open-falcon and Nightingale, please refer to read [Ten features and trends of cloud-native monitoring](https://mp.weixin.qq.com/s?__biz=MzkzNjI5OTM5Nw==&mid=2247483738&idx=1&sn=e8bdbb974a2cd003c1abcc2b5405dd18&chksm=c2a19fb0f5d616a63185cd79277a79a6b80118ef2185890d0683d2bb20451bd9303c78d083c5#rd)。
## Quickstart
- [n9e.github.io/quickstart](https://n9e.github.io/docs/install/compose/)
## Documentation
- [n9e.github.io](https://n9e.github.io/)
## Example of use
<img src="doc/img/intro.gif" width="680">
## System Architecture
#### A typical Nightingale deployment architecture:
<img src="doc/img/arch-system.png" width="680">
#### Typical deployment architecture using VictoriaMetrics as storage:
<img src="doc/img/install-vm.png" width="680">
## Contact us and feedback questions
- We recommend that you use [github issue](https://github.com/didi/nightingale/issues) as the preferred channel for issue feedback and requirement submission;
- You can join our WeChat group
<img src="doc/img/n9e-vx-new.png" width="180">
## Contributing
We welcome your participation in the Nightingale open source project and open source community in a variety of ways:
- Feedback on problems and bugs => [github issue](https://github.com/didi/nightingale/issues)
- Additional and improved documentation => [n9e.github.io](https://n9e.github.io/)
- Share your best practices and insights on using Nightingale => [User Story](https://github.com/didi/nightingale/issues/897)
- Join our community events => [Nightingale wechat group](https://s3-gz01.didistatic.com/n9e-pub/image/n9e-wx.png)
- Submit code to make Nightingale better =>[github PR](https://github.com/didi/nightingale/pulls)
## License
Nightingale with [Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE) open source license.

View File

@ -1,89 +0,0 @@
<img src="doc/img/ccf-n9e.png" width="240">
# Nightingale
[English](./README.md) | [中文](./README_ZH.md)
## 介绍
> Nightingale is an enterprise-level cloud-native monitoring system, which can be used as drop-in replacement of Prometheus for alerting and management.
>
>夜莺是一款开源的云原生监控系统,采用 All-In-One 的设计,提供企业级的功能特性,开箱即用的产品体验。推荐升级您的 `Prometheus` + `AlertManager` + `Grafana` 组合方案到夜莺。
- 内置丰富的Dashboard、好用实用的告警管理、自定义视图、故障自愈
- Dashboard和告警策略支持一键导入详细的指标分类和解释
- 支持多 Prometheus 数据源管理以一个集中的视图来管理所有的告警和dashboard
- 支持 Prometheus、M3DB、VictoriaMetrics、Influxdb、TDEngine 等多种时序库作为存储方案;
- 原生支持 PromQL
- 支持 Exporter 作为数据采集方案;
- 支持 Telegraf 作为监控数据采集方案;
- 支持对接 Grafana 作为补充可视化方案;
#### 如果您在使用 Prometheus 过程中,有以下的一个或者多个需求场景,推荐您升级到夜莺:
- Prometheus、Alertmanager、Grafana 等多个系统较为割裂,缺乏统一视图,无法开箱即用;
- 通过修改配置文件来管理 Prometheus、Alertmanager 的方式,学习曲线大,协同有难度;
- 数据量过大而无法扩展您的 Prometheus 集群;
- 生产环境运行多套 Prometheus 集群,面临管理和使用成本高的问题;
#### 如果您在使用Zabbix有以下的场景推荐您升级到夜莺
- 监控的数据量太大,希望有更好的扩展解决方案;
- 学习曲线高,多人多团队模式下,希望有更好的协同使用效率;
- 微服务和云原生架构下监控数据的生命周期多变、监控数据维度基数高Zabbix数据模型不易适配
#### 如果您在使用[open-falcon](https://github.com/open-falcon/falcon-plus),我们更推荐您升级到夜莺:
- 关于open-falcon和夜莺的详细介绍请参考阅读[云原生监控的十个特点和趋势](https://mp.weixin.qq.com/s?__biz=MzkzNjI5OTM5Nw==&mid=2247483738&idx=1&sn=e8bdbb974a2cd003c1abcc2b5405dd18&chksm=c2a19fb0f5d616a63185cd79277a79a6b80118ef2185890d0683d2bb20451bd9303c78d083c5#rd)。
## 快速安装部署
- [n9e.github.io/quickstart](https://n9e.github.io/quickstart/)
## 详细文档
- [n9e.github.io](https://n9e.github.io/)
## 产品演示
#### 您可以直接导入并生成 MySQL 相关的告警策略:
<img src="doc/img/mysql-alerts.png" width="680">
#### 您可以直接导入并生成主机相关的 dashboard
<img src="doc/img/n9e-node-dashboard.png" width="680">
#### 您也可以在夜莺中方便的查看所有活跃的告警以及历史告警:
<img src="https://n9e.github.io/intro/alert-events.png" width="680">
## 系统架构
#### 一个典型的 Nightingale 部署架构:
<img src="https://n9e.github.io/intro/arch-system.png" width="680">
#### 使用 VictoriaMetrics 作为时序数据库的典型部署架构:
<img src="https://n9e.github.io/fc-monitoring-vm.png" width="680">
## 联系我们和反馈问题
- 我们推荐您优先使用[github issue](https://github.com/didi/nightingale/issues)作为首选问题反馈和需求提交的通道;
- 您可以加入我们的微信群组——[Nightingale 微信群组](https://s3-gz01.didistatic.com/n9e-pub/image/n9e-wx.png)
- 当然,推荐您关注夜莺监控公众号,及时获取相关产品动态
<img src="https://n9e.github.io/cloudmon.png" width="180">
## 参与到夜莺开源项目和社区
我们欢迎您以各种方式参与到夜莺开源项目和开源社区中来,工作包括不限于:
- 反馈使用中遇到的问题和Bug => [github issue](https://github.com/didi/nightingale/issues)
- 补充和完善文档 => [n9e.github.io](https://n9e.github.io/)
- 分享您在使用夜莺监控过程中的最佳实践和经验心得 => [夜莺User Story](https://github.com/didi/nightingale/issues/897)
- 参与我们的社区活动 => [Nightingale 微信群组](https://s3-gz01.didistatic.com/n9e-pub/image/n9e-wx.png)
- 提交代码,让夜莺监控更快、更稳、更好用 =>[github PR](https://github.com/didi/nightingale/pulls)
## TODO
- [x] deploy nightingale in docker
- [x] export /metrics endpoint
- [x] notify.py support feishu
- [ ] notify.py support sms
- [ ] notify.py support voice
- [x] support remote write api
- [ ] support pushgateway api
## License
夜莺监控,采用[Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE)开源许可证。

View File

@ -0,0 +1,7 @@
## Active Contributors
- [xiaoziv](https://github.com/xiaoziv)
- [tanxiao1990](https://github.com/tanxiao1990)
- [bbaobelief](https://github.com/bbaobelief)
- [freedomkk-qfeng](https://github.com/freedomkk-qfeng)
- [lsy1990](https://github.com/lsy1990)

5
doc/committers.md Normal file
View File

@ -0,0 +1,5 @@
## Committers
- [YeningQin](https://github.com/710leo)
- [FeiKong](https://github.com/kongfei605)
- [XiaqingDai](https://github.com/jsers)

View File

@ -0,0 +1,80 @@
[夜莺监控](https://github.com/ccfos/nightingale "夜莺监控")是一款开源云原生监控系统由滴滴设计开发2020 年 3 月份开源之后,凭借其优秀的产品设计、灵活性架构和明确清晰的定位,夜莺监控快速发展为国内最活跃的企业级云原生监控方案。[截止当前](具体指2022年8月 "截止当前"),在 [Github](https://github.com/ccfos/nightingale "Github") 上已经迭代发布了 **70** 多个版本,获得了 **5K** 多个 Star**80** 多位代码贡献者。快速的迭代,也让夜莺监控的用户群越来越大,涉及各行各业。
更进一步,夜莺监控于 2022 年 5 月 11 日,正式捐赠予中国计算机学会开源发展委员会 [CCF ODC](https://www.ccf.org.cn/kyfzwyh/ "CCF ODC"),为 CCF ODC 成立后接受捐赠的第一个开源项目。
开源项目要更有生命力,离不开开放的治理架构和源源不断的开发者共同参与。夜莺监控项目加入 CCF 开源大家庭后,能在计算机学会的支持和带动下,进一步结合云原生、可观测、国产化等多个技术发展的需求,建立开放、中立的开源治理架构,打造更专业、有活力的开发者社区。
**今天,我们郑重发布夜莺监控开源社区治理架构,并公示相关的任命和社区荣誉,期待开源的道路上,一起同行。**
# 夜莺监控开源社区架构
### User|用户
> 欢迎任何个人、公司以及组织,使用夜莺监控,并积极的反馈 bug、提交功能需求、以及相互帮助我们推荐使用 [Github Issue](https://github.com/ccfos/nightingale/issues "Github Issue") 来跟踪 bug 和管理需求。
社区用户,可以通过在 **[Who is Using Nightingale](https://github.com/ccfos/nightingale/issues/897 "Who is Using Nightingale")** 登记您的使用情况,并分享您使用夜莺监控的经验,将会自动进入 **[END USERS](https://github.com/ccfos/nightingale/blob/main/doc/end-users.md "END USERS")** 文件列表,并获得社区的 **VIP Support**
### Contributor|贡献者
> 欢迎每一位用户,包括但不限于以下方式参与到夜莺开源社区并做出贡献:
1. 在 [Github Issue](https://github.com/ccfos/nightingale/issues "Github Issue") 中积极参与讨论,参与社区活动;
1. 提交代码补丁;
1. 翻译、修订、补充和完善[文档](https://n9e.github.io "文档")
1. 分享夜莺监控的使用经验,积极布道;
1. 提交建议 / 批评;
年度累计向 [CCFOS/NIGHTINGALE](https://github.com/ccfos/nightingale "CCFOS/NIGHTINGALE") 提交 **5** 个PR被合并或者因为其他贡献被**项目管委会**一致认可,将会自动进入到 **[ACTIVE CONTRIBUTORS](https://github.com/ccfos/nightingale/blob/main/doc/active-contributors.md "ACTIVE CONTRIBUTORS")** 列表,并获得夜莺开源社区颁发的证书,享有夜莺开源社区一定的权益和福利。
所有向 [CCFOS/NIGHTINGALE](https://github.com/ccfos/nightingale "CCFOS/NIGHTINGALE") 提交过PR被合并或者做出过重要贡献的 Contributor都会被永久记载于 [CONTRIBUTORS](https://github.com/ccfos/nightingale/blob/main/doc/contributors.md "CONTRIBUTORS") 列表。
### Committer|提交者
> Committer 是指拥有 [CCFOS/NIGHTINGALE](https://github.com/ccfos/nightingale "CCFOS/NIGHTINGALE") 代码仓库写操作权限的贡献者。原则上 Committer 能够自主决策某个代码补丁是否可以合入到夜莺代码仓库,但是项目管委会拥有最终的决策权。
Committer 承担以下一个或多个职责:
- 积极回应 Issues
- Review PRs
- 参加开发者例行会议,积极讨论项目规划和技术方案;
- 代表夜莺开源社区出席相关技术会议并做演讲;
Committer 记录并公示于 **[COMMITTERS](https://github.com/ccfos/nightingale/blob/main/doc/committers.md "COMMITTERS")** 列表,并获得夜莺开源社区颁发的证书,以及享有夜莺开源社区的各种权益和福利。
### PMC|项目管委会
> PMC项目管委会作为一个实体来管理和领导夜莺项目为整个项目的发展全权负责。项目管委会相关内容记录并公示于文件[PMC](https://github.com/ccfos/nightingale/blob/main/doc/pmc.md "PMC").
- 项目管委会成员(PMC Member),从 Contributor 或者 Committer 中选举产生,他们拥有 [CCFOS/NIGHTINGALE](https://github.com/ccfos/nightingale "CCFOS/NIGHTINGALE") 代码仓库的写操作权限,拥有 Nightingale 社区相关事务的投票权、以及提名 Committer 候选人的权利。
- 项目管委会主席(PMC Chair),从项目管委会成员中投票产生。管委会主席是 **[CCF ODC](https://www.ccf.org.cn/kyfzwyh/ "CCF ODC")** 和项目管委会之间的沟通桥梁,履行特定的项目管理职责。
## Communication|沟通机制
1. 我们推荐使用邮件列表来反馈建议(待发布);
2. 我们推荐使用 [Github Issue](https://github.com/ccfos/nightingale/issues "Github Issue") 跟踪 bug 和管理需求;
3. 我们推荐使用 [Github Milestone](https://github.com/ccfos/nightingale/milestones "Github Milestone") 来管理项目进度和规划;
4. 我们推荐使用腾讯会议来定期召开项目例会(会议 ID 待发布);
## Documentation|文档
1. 我们推荐使用 [Github Pages](https://n9e.github.io "Github Pages") 来沉淀文档;
2. 我们推荐使用 [Gitlink Wiki](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq "Gitlink Wiki") 来沉淀 FAQ
## Operation|运营机制
1. 我们定期组织用户、贡献者、项目管委会成员之间的沟通会议,讨论项目开发的目标、方案、进度,以及讨论相关需求的合理性、优先级等议题;
2. 我们定期组织 meetup (线上&线下),创造良好的用户交流分享环境,并沉淀相关内容到文档站点;
3. 我们定期组织夜莺开发者大会,分享 [best user story](https://n9e.github.io/docs/prologue/share/ "best user story")、同步年度开发目标和计划、讨论新技术方向等;
## Philosophy|社区指导原则
>尊重、认可和记录每一位贡献者的工作。
## 关于提问的原则
按照**尊重、认可、记录每一位贡献者的工作**原则,我们提倡**高效的提问**,这既是对开发者时间的尊重,也是对整个社区的知识沉淀的贡献:
1. 提问之前请先查阅 [FAQ](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq "FAQ")
2. 提问之前请先搜索 [Github Issues](https://github.com/ccfos/nightingale/issues "Github Issue")
3. 我们优先推荐通过提交 [Github Issue](https://github.com/ccfos/nightingale/issues "Github Issue") 来提问,如果[有问题点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Fbug&template=bug_report.yml "有问题点击这里") | [有需求建议点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md "有需求建议点击这里")
最后,我们推荐你加入微信群,针对相关开放式问题,相互交流咨询 (请先加好友:[UlricGO](https://www.gitlink.org.cn/UlricQin/gist/tree/master/self.jpeg "UlricGO") 备注:夜莺加群+姓名+公司,交流群里会有开发者团队和专业、热心的群友回答问题)。

5
doc/contributors.md Normal file
View File

@ -0,0 +1,5 @@
## Contributors
<a href="https://github.com/ccfos/nightingale/graphs/contributors">
<img src="https://contrib.rocks/image?repo=ccfos/nightingale" />
</a>

5
doc/end-users.md Normal file
View File

@ -0,0 +1,5 @@
## End Users
- [中移动](https://github.com/ccfos/nightingale/issues/897#issuecomment-1086573166)
- [inke](https://github.com/ccfos/nightingale/issues/897#issuecomment-1099840636)
- [方正证券](https://github.com/ccfos/nightingale/issues/897#issuecomment-1110492461)

BIN
doc/img/alert-events.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 450 KiB

BIN
doc/img/arch-product.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 127 KiB

BIN
doc/img/arch-system.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 198 KiB

BIN
doc/img/install-vm.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 141 KiB

BIN
doc/img/intro.gif Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.1 MiB

BIN
doc/img/n9e-vx-new.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 331 KiB

BIN
doc/img/vm-cluster-arch.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 127 KiB

BIN
doc/img/wx.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 99 KiB

View File

@ -1,211 +0,0 @@
[
{
"name": "JVM监控大盘",
"tags": "",
"configs": "{\"var\":[{\"name\":\"java_app\",\"definition\":\"label_values(jvm_info, label)\",\"selected\":[],\"multi\":true,\"allOption\":true}]}",
"chart_groups": [
{
"name": "JVM统计信息",
"weight": 0,
"charts": [
{
"configs": "{\"name\":\"jvm版本信息\",\"QL\":[{\"PromQL\":\"avg(jvm_info{java_app=~\\\"$java_app\\\"}) without (runtime,vendor)\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"Java进程启动时间\",\"link\":\"\",\"QL\":[{\"PromQL\":\"(time() - process_start_time_seconds{java_app=~\\\"$java_app\\\"})/3600\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":\"humantime\"},\"version\":1,\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "JVM内存使用",
"weight": 1,
"charts": [
{
"configs": "{\"name\":\"nonheap 非堆区\",\"QL\":[{\"PromQL\":\"jvm_memory_bytes_used{java_app=~\\\"$java_app\\\",area=\\\"nonheap\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"heap堆区\",\"QL\":[{\"PromQL\":\"jvm_memory_bytes_used{java_app=~\\\"$java_app\\\",area=\\\"heap\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"提交给 Java虚拟机使用的内存量 heap 堆区\",\"QL\":[{\"PromQL\":\"jvm_memory_bytes_committed{java_app=~\\\"$java_app\\\",area=\\\"heap\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"提交给 Java虚拟机使用的内存量 nonheap 非堆区\",\"QL\":[{\"PromQL\":\"jvm_memory_bytes_committed{java_app=~\\\"$java_app\\\",area=\\\"nonheap\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"jvm最大内存\",\"QL\":[{\"PromQL\":\"jvm_memory_bytes_max{java_app=~\\\"$java_app\\\",area=\\\"heap\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"jvm初始化内存\",\"QL\":[{\"PromQL\":\"jvm_memory_bytes_init{java_app=~\\\"$java_app\\\",area=\\\"heap\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":2,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"jvm内存使用百分比% heap堆区\",\"QL\":[{\"PromQL\":\"100 * jvm_memory_bytes_used{java_app=~\\\"$java_app\\\",area=\\\"heap\\\"}/jvm_memory_bytes_max{java_app=~\\\"$java_app\\\",area=\\\"heap\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":2,\"i\":\"6\"}}",
"weight": 0
}
]
},
{
"name": "JVM内存池",
"weight": 2,
"charts": [
{
"configs": "{\"name\":\"jvm内存池分pool展示\",\"QL\":[{\"PromQL\":\"jvm_memory_pool_bytes_max{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"JVM 缓冲池使用缓存大小\",\"QL\":[{\"PromQL\":\"jvm_buffer_pool_used_bytes{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"JVM 缓冲池的字节容量\",\"QL\":[{\"PromQL\":\"jvm_buffer_pool_capacity_bytes{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"JVM 缓冲池使用的字节大小\",\"QL\":[{\"PromQL\":\"jvm_buffer_pool_used_bytes{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "jvm gc情况",
"weight": 3,
"charts": [
{
"configs": "{\"name\":\"新生代gc耗时 1分钟\",\"QL\":[{\"PromQL\":\"increase(jvm_gc_collection_seconds_sum{java_app=~\\\"$java_app\\\",gc=\\\"G1 Young Generation\\\" }[1m])\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"老生代gc耗时 1分钟\",\"QL\":[{\"PromQL\":\"increase(jvm_gc_collection_seconds_sum{java_app=~\\\"$java_app\\\",gc=\\\"G1 Old Generation\\\" }[1m])\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"新生代gc次数 1分钟\",\"QL\":[{\"PromQL\":\"increase(jvm_gc_collection_seconds_count{java_app=~\\\"$java_app\\\",gc=\\\"G1 Young Generation\\\" }[1m])\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"老生代gc次数 1分钟\",\"QL\":[{\"PromQL\":\"increase(jvm_gc_collection_seconds_count{java_app=~\\\"$java_app\\\",gc=\\\"G1 Old Generation\\\" }[1m])\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"新生代平均gc耗时 秒\",\"QL\":[{\"PromQL\":\"jvm_gc_collection_seconds_sum{java_app=~\\\"$java_app\\\",gc=\\\"G1 Young Generation\\\" }/jvm_gc_collection_seconds_count{java_app=~\\\"$java_app\\\",gc=\\\"G1 Young Generation\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"老生代平均gc耗时\",\"QL\":[{\"PromQL\":\"jvm_gc_collection_seconds_sum{java_app=~\\\"$java_app\\\",gc=\\\"G1 Old Generation\\\"}/jvm_gc_collection_seconds_count{java_app=~\\\"$java_app\\\",gc=\\\"G1 Old Generation\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":2,\"i\":\"5\"}}",
"weight": 0
}
]
},
{
"name": "jvm线程情况",
"weight": 4,
"charts": [
{
"configs": "{\"name\":\"当前线程数\",\"QL\":[{\"PromQL\":\"jvm_threads_current{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"守护线程数\",\"QL\":[{\"PromQL\":\"jvm_threads_daemon{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"死锁线程数\",\"QL\":[{\"PromQL\":\"jvm_threads_deadlocked{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"活动线程峰值\",\"QL\":[{\"PromQL\":\"jvm_threads_peak{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"自JVM启动后启动的线程总量包括daemon,non-daemon和终止了的\",\"QL\":[{\"PromQL\":\"jvm_threads_started_total{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"当前TERMINATED线程个数\",\"QL\":[{\"PromQL\":\"jvm_threads_state{java_app=~\\\"$java_app\\\",state=\\\"TERMINATED\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":2,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"当前RUNNABLE线程个数\",\"QL\":[{\"PromQL\":\"jvm_threads_state{java_app=~\\\"$java_app\\\",state=\\\"RUNNABLE\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":2,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"当前NEW线程个数\",\"QL\":[{\"PromQL\":\"jvm_threads_state{java_app=~\\\"$java_app\\\",state=\\\"NEW\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":2,\"i\":\"7\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"当前TIMED_WAITING线程个数\",\"QL\":[{\"PromQL\":\"jvm_threads_state{java_app=~\\\"$java_app\\\",state=\\\"TIMED_WAITING\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":4,\"i\":\"8\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"当前BLOCKED线程个数\",\"QL\":[{\"PromQL\":\"jvm_threads_state{java_app=~\\\"$java_app\\\",state=\\\"BLOCKED\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":4,\"i\":\"9\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"当前WAITING线程个数\",\"QL\":[{\"PromQL\":\"jvm_threads_state{java_app=~\\\"$java_app\\\",state=\\\"WAITING\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":4,\"i\":\"10\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"当前线程状态汇总\",\"QL\":[{\"PromQL\":\"jvm_threads_state{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":4,\"i\":\"11\"}}",
"weight": 0
}
]
},
{
"name": "加载类情况",
"weight": 5,
"charts": [
{
"configs": "{\"name\":\"jvm 当前加载的类个数\",\"QL\":[{\"PromQL\":\"jvm_classes_loaded{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"jvm启动以来加载的类总个数\",\"QL\":[{\"PromQL\":\"jvm_classes_loaded_total{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"jvm启动以来卸载的类总个数\",\"QL\":[{\"PromQL\":\"jvm_classes_unloaded_total{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "机器指标(配置了java.lang才有)",
"weight": 6,
"charts": [
{
"configs": "{\"name\":\"java进程打开fd数\",\"QL\":[{\"PromQL\":\"os_open_file_descriptor_count{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"机器总内存\",\"QL\":[{\"PromQL\":\"os_total_memory_size{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"机器可用内存\",\"QL\":[{\"PromQL\":\"os_free_memory_size{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"机器近期cpu使用率%\",\"link\":\"https://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/OperatingSystemMXBean.html#getSystemCpuLoad()\",\"QL\":[{\"PromQL\":\"100 * os_system_cpu_load{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"java进程cpu使用\",\"link\":\"https://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/OperatingSystemMXBean.html#getProcessCpuLoad()\",\"QL\":[{\"PromQL\":\"os_process_cpu_load{java_app=~\\\"$java_app\\\"}\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"name\":\"jvm cpu百分比\",\"QL\":[{\"PromQL\":\"100 *(os_process_cpu_load{java_app=~\\\"$java_app\\\"}/os_system_cpu_load{java_app=~\\\"$java_app\\\"})\"}],\"legend\":false,\"highLevelConfig\":{\"shared\":true,\"sharedSortDirection\":\"desc\",\"precision\":\"short\",\"formatUnit\":1000},\"version\":1,\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":2,\"i\":\"5\"}}",
"weight": 0
}
]
}
]
}
]

7
doc/pmc.md Normal file
View File

@ -0,0 +1,7 @@
### PMC Chair
- [laiwei](https://github.com/laiwei)
### PMC Co-Chair
- [UlricQin](https://github.com/UlricQin)
### PMC Member

234
doc/server-dash.json Normal file
View File

@ -0,0 +1,234 @@
{
"name": "夜莺大盘",
"tags": "",
"configs": {
"var": [],
"panels": [
{
"targets": [
{
"refId": "A",
"expr": "rate(n9e_server_samples_received_total[1m])"
}
],
"name": "每秒接收的数据点个数",
"options": {
"tooltip": {
"mode": "all",
"sort": "none"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "off"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 0,
"y": 0,
"i": "53fcb9dc-23f9-41e0-bc5e-121eed14c3a4",
"isResizable": true
},
"id": "53fcb9dc-23f9-41e0-bc5e-121eed14c3a4"
},
{
"targets": [
{
"refId": "A",
"expr": "rate(n9e_server_alerts_total[10m])"
}
],
"name": "每秒产生的告警事件个数",
"options": {
"tooltip": {
"mode": "all",
"sort": "none"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "off"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 12,
"y": 0,
"i": "47fc6252-9cc8-4b53-8e27-0c5c59a47269",
"isResizable": true
},
"id": "f70dcb8b-b58b-4ef9-9e48-f230d9e17140"
},
{
"targets": [
{
"refId": "A",
"expr": "n9e_server_alert_queue_size"
}
],
"name": "告警事件内存队列长度",
"options": {
"tooltip": {
"mode": "all",
"sort": "none"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "off"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 0,
"y": 4,
"i": "ad1af16c-de0c-45f4-8875-cea4e85d51d0",
"isResizable": true
},
"id": "caf23e58-d907-42b0-9ed6-722c8c6f3c5f"
},
{
"targets": [
{
"refId": "A",
"expr": "n9e_server_http_request_duration_seconds_sum/n9e_server_http_request_duration_seconds_count"
}
],
"name": "数据接收接口平均响应时间(单位:秒)",
"options": {
"tooltip": {
"mode": "all",
"sort": "desc"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "noraml"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 12,
"y": 4,
"i": "64c3abc2-404c-4462-a82f-c109a21dac91",
"isResizable": true
},
"id": "6b8d2db1-efca-4b9e-b429-57a9d2272bc5"
},
{
"targets": [
{
"refId": "A",
"expr": "n9e_server_sample_queue_size"
}
],
"name": "内存数据队列长度",
"options": {
"tooltip": {
"mode": "all",
"sort": "desc"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "off"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 0,
"y": 8,
"i": "1c7da942-58c2-40dc-b42f-983e4a35b89b",
"isResizable": true
},
"id": "bd41677d-40d3-482e-bb6e-fbd25df46d87"
},
{
"targets": [
{
"refId": "A",
"expr": "avg(n9e_server_forward_duration_seconds_sum/n9e_server_forward_duration_seconds_count)"
}
],
"name": "数据发往TSDB平均耗时单位",
"options": {
"tooltip": {
"mode": "all",
"sort": "desc"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {
"decimals": 8
},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "noraml"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 12,
"y": 8,
"i": "eed94a0b-954f-48ac-82e5-a2eada1c8a3d",
"isResizable": true
},
"id": "c8642e72-f384-46a5-8410-1e6be2953c3c"
}
],
"version": "2.0.0"
}
}

View File

@ -1,4 +1,5 @@
FROM python:2
FROM python:2.7.8-slim
#FROM python:2
#FROM ubuntu:21.04
WORKDIR /app

View File

@ -0,0 +1,14 @@
FROM --platform=$BUILDPLATFORM python:2.7.8-slim
WORKDIR /app
ADD n9e /app
ADD http://download.flashcat.cloud/wait /wait
RUN mkdir -p /app/pub && chmod +x /wait
ADD pub /app/pub/
RUN chmod +x n9e
EXPOSE 19000
EXPOSE 18000
CMD ["/app/n9e", "-h"]

View File

@ -1,5 +1,4 @@
#!/bin/sh
if [ $# -ne 1 ]; then
echo "$0 <tag>"
exit 0

View File

@ -0,0 +1,51 @@
[global]
# whether print configs
print_configs = false
# add label(agent_hostname) to series
# "" -> auto detect hostname
# "xx" -> use specified string xx
# "$hostname" -> auto detect hostname
# "$ip" -> auto detect ip
# "$hostname-$ip" -> auto detect hostname and ip to replace the vars
hostname = "$HOSTNAME"
# will not add label(agent_hostname) if true
omit_hostname = false
# s | ms
precision = "ms"
# global collect interval
interval = 15
[global.labels]
source="categraf"
# region = "shanghai"
# env = "localhost"
[writer_opt]
# default: 2000
batch = 2000
# channel(as queue) size
chan_size = 10000
[[writers]]
url = "http://nserver:19000/prometheus/v1/write"
# Basic auth username
basic_auth_user = ""
# Basic auth password
basic_auth_pass = ""
# timeout settings, unit: ms
timeout = 5000
dial_timeout = 2500
max_idle_conns_per_host = 100
[http]
enable = false
address = ":9100"
print_access = false
run_mode = "release"

View File

@ -0,0 +1,5 @@
# # collect interval
# interval = 15
# # whether collect per cpu
# collect_per_cpu = false

View File

@ -0,0 +1,11 @@
# # collect interval
# interval = 15
# # By default stats will be gathered for all mount points.
# # Set mount_points will restrict the stats to only the specified mount points.
# mount_points = ["/"]
# Ignore mount points by filesystem type.
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
ignore_mount_points = ["/boot"]

View File

@ -0,0 +1,6 @@
# # collect interval
# interval = 15
# # By default, categraf will gather stats for all devices including disk partitions.
# # Setting devices will restrict the stats to the specified devices.
# devices = ["sda", "sdb", "vd*"]

View File

@ -0,0 +1,63 @@
# # collect interval
# interval = 15
[[instances]]
# # append some labels for series
# labels = { region="cloud", product="n9e" }
# # interval = global.interval * interval_times
# interval_times = 1
## Docker Endpoint
## To use TCP, set endpoint = "tcp://[ip]:[port]"
## To use environment variables (ie, docker-machine), set endpoint = "ENV"
endpoint = "unix:///var/run/docker.sock"
## Set to true to collect Swarm metrics(desired_replicas, running_replicas)
gather_services = false
gather_extend_memstats = false
container_id_label_enable = true
container_id_label_short_style = true
## Containers to include and exclude. Globs accepted.
## Note that an empty array for both will include all containers
container_name_include = []
container_name_exclude = []
## Container states to include and exclude. Globs accepted.
## When empty only containers in the "running" state will be captured.
## example: container_state_include = ["created", "restarting", "running", "removing", "paused", "exited", "dead"]
## example: container_state_exclude = ["created", "restarting", "running", "removing", "paused", "exited", "dead"]
# container_state_include = []
# container_state_exclude = []
## Timeout for docker list, info, and stats commands
timeout = "5s"
## Specifies for which classes a per-device metric should be issued
## Possible values are 'cpu' (cpu0, cpu1, ...), 'blkio' (8:0, 8:1, ...) and 'network' (eth0, eth1, ...)
## Please note that this setting has no effect if 'perdevice' is set to 'true'
perdevice_include = []
## Specifies for which classes a total metric should be issued. Total is an aggregated of the 'perdevice' values.
## Possible values are 'cpu', 'blkio' and 'network'
## Total 'cpu' is reported directly by Docker daemon, and 'network' and 'blkio' totals are aggregated by this plugin.
## Please note that this setting has no effect if 'total' is set to 'false'
total_include = ["cpu", "blkio", "network"]
## Which environment variables should we use as a tag
##tag_env = ["JAVA_HOME", "HEAP_SIZE"]
## docker labels to include and exclude as tags. Globs accepted.
## Note that an empty array for both will include all labels as tags
docker_label_include = []
docker_label_exclude = ["annotation*", "io.kubernetes*", "*description*", "*maintainer*", "*hash", "*author*"]
## Optional TLS Config
# use_tls = false
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false

View File

@ -0,0 +1,2 @@
# # collect interval
# interval = 15

View File

@ -0,0 +1,124 @@
# # collect interval
# interval = 15
# file: /proc/vmstat
[white_list]
oom_kill = 1
nr_free_pages = 0
nr_alloc_batch = 0
nr_inactive_anon = 0
nr_active_anon = 0
nr_inactive_file = 0
nr_active_file = 0
nr_unevictable = 0
nr_mlock = 0
nr_anon_pages = 0
nr_mapped = 0
nr_file_pages = 0
nr_dirty = 0
nr_writeback = 0
nr_slab_reclaimable = 0
nr_slab_unreclaimable = 0
nr_page_table_pages = 0
nr_kernel_stack = 0
nr_unstable = 0
nr_bounce = 0
nr_vmscan_write = 0
nr_vmscan_immediate_reclaim = 0
nr_writeback_temp = 0
nr_isolated_anon = 0
nr_isolated_file = 0
nr_shmem = 0
nr_dirtied = 0
nr_written = 0
numa_hit = 0
numa_miss = 0
numa_foreign = 0
numa_interleave = 0
numa_local = 0
numa_other = 0
workingset_refault = 0
workingset_activate = 0
workingset_nodereclaim = 0
nr_anon_transparent_hugepages = 0
nr_free_cma = 0
nr_dirty_threshold = 0
nr_dirty_background_threshold = 0
pgpgin = 0
pgpgout = 0
pswpin = 0
pswpout = 0
pgalloc_dma = 0
pgalloc_dma32 = 0
pgalloc_normal = 0
pgalloc_movable = 0
pgfree = 0
pgactivate = 0
pgdeactivate = 0
pgfault = 0
pgmajfault = 0
pglazyfreed = 0
pgrefill_dma = 0
pgrefill_dma32 = 0
pgrefill_normal = 0
pgrefill_movable = 0
pgsteal_kswapd_dma = 0
pgsteal_kswapd_dma32 = 0
pgsteal_kswapd_normal = 0
pgsteal_kswapd_movable = 0
pgsteal_direct_dma = 0
pgsteal_direct_dma32 = 0
pgsteal_direct_normal = 0
pgsteal_direct_movable = 0
pgscan_kswapd_dma = 0
pgscan_kswapd_dma32 = 0
pgscan_kswapd_normal = 0
pgscan_kswapd_movable = 0
pgscan_direct_dma = 0
pgscan_direct_dma32 = 0
pgscan_direct_normal = 0
pgscan_direct_movable = 0
pgscan_direct_throttle = 0
zone_reclaim_failed = 0
pginodesteal = 0
slabs_scanned = 0
kswapd_inodesteal = 0
kswapd_low_wmark_hit_quickly = 0
kswapd_high_wmark_hit_quickly = 0
pageoutrun = 0
allocstall = 0
pgrotated = 0
drop_pagecache = 0
drop_slab = 0
numa_pte_updates = 0
numa_huge_pte_updates = 0
numa_hint_faults = 0
numa_hint_faults_local = 0
numa_pages_migrated = 0
pgmigrate_success = 0
pgmigrate_fail = 0
compact_migrate_scanned = 0
compact_free_scanned = 0
compact_isolated = 0
compact_stall = 0
compact_fail = 0
compact_success = 0
htlb_buddy_alloc_success = 0
htlb_buddy_alloc_fail = 0
unevictable_pgs_culled = 0
unevictable_pgs_scanned = 0
unevictable_pgs_rescued = 0
unevictable_pgs_mlocked = 0
unevictable_pgs_munlocked = 0
unevictable_pgs_cleared = 0
unevictable_pgs_stranded = 0
thp_fault_alloc = 0
thp_fault_fallback = 0
thp_collapse_alloc = 0
thp_collapse_alloc_failed = 0
thp_split = 0
thp_zero_page_alloc = 0
thp_zero_page_alloc_failed = 0
balloon_inflate = 0
balloon_deflate = 0
balloon_migrate = 0

View File

@ -0,0 +1,2 @@
# # collect interval
# interval = 15

View File

@ -0,0 +1,5 @@
# # collect interval
# interval = 15
# # whether collect platform specified metrics
collect_platform_fields = true

View File

@ -0,0 +1,8 @@
# # collect interval
# interval = 15
# # whether collect protocol stats on Linux
# collect_protocol_stats = false
# # setting interfaces will tell categraf to gather these explicit interfaces
# interfaces = ["eth0"]

View File

@ -0,0 +1,2 @@
# # collect interval
# interval = 15

View File

@ -0,0 +1,8 @@
# # collect interval
# interval = 15
# # force use ps command to gather
# force_ps = false
# # force use /proc to gather
# force_proc = false

View File

@ -0,0 +1,5 @@
# # collect interval
# interval = 15
# # whether collect metric: system_n_users
# collect_user_number = false

View File

@ -0,0 +1,35 @@
[logs]
## key 占位符
api_key = "ef4ahfbwzwwtlwfpbertgq1i6mq0ab1q"
## 是否开启日志采集
enable = false
## 接受日志的server地址
send_to = "127.0.0.1:17878"
## 发送日志的协议 http/tcp
send_type = "http"
## 是否压缩发送
use_compress = false
## 是否采用ssl
send_with_tls = false
##
batch_wait = 5
## 日志offset信息保存目录
run_path = "/opt/categraf/run"
## 最多同时采集多少个日志文件
open_files_limit = 100
## 定期扫描目录下是否有新增日志
scan_period = 10
##
frame_size = 10
##
collect_container_all = true
## 全局的处理规则
[[logs.Processing_rules]]
## 单个日志采集配置
[[logs.items]]
## file/journald
type = "file"
## type=file时 path必填type=journald时 port必填
path = "/opt/tomcat/logs/*.txt"
source = "tomcat"
service = "my_service"

View File

@ -6,6 +6,7 @@ networks:
services:
mysql:
platform: linux/x86_64
image: "mysql:5.7"
container_name: mysql
hostname: mysql
@ -79,7 +80,7 @@ services:
sh -c "/wait && /app/ibex server"
nwebapi:
image: ulric2019/nightingale:5.8.0
image: flashcatcloud/nightingale:latest
container_name: nwebapi
hostname: nwebapi
restart: always
@ -107,7 +108,7 @@ services:
sh -c "/wait && /app/n9e webapi"
nserver:
image: ulric2019/nightingale:5.8.0
image: flashcatcloud/nightingale:latest
container_name: nserver
hostname: nserver
restart: always
@ -134,19 +135,22 @@ services:
command: >
sh -c "/wait && /app/n9e server"
telegraf:
image: "telegraf:1.20.3"
container_name: "telegraf"
hostname: "telegraf01"
categraf:
image: "flashcatcloud/categraf:latest"
container_name: "categraf"
hostname: "categraf01"
restart: always
environment:
TZ: Asia/Shanghai
HOST_PROC: /hostfs/proc
HOST_SYS: /hostfs/sys
HOST_MOUNT_PREFIX: /hostfs
volumes:
- ./telegrafetc/telegraf.conf:/etc/telegraf/telegraf.conf
- ./categraf/conf:/etc/categraf/conf
- /:/hostfs
- /var/run/docker.sock:/var/run/docker.sock
ports:
- "8125:8125/udp"
- "8092:8092/udp"
- "8094:8094/tcp"
- "9100:9100/tcp"
networks:
- nightingale
depends_on:

View File

@ -35,4 +35,4 @@ Interval = 1000
# rpc servers
Servers = ["ibex:20090"]
# $ip or $hostname or specified string
Host = "telegraf01"
Host = "categraf01"

View File

@ -41,10 +41,12 @@ CREATE TABLE `user_group` (
insert into user_group(id, name, create_at, create_by, update_at, update_by) values(1, 'demo-root-group', unix_timestamp(now()), 'root', unix_timestamp(now()), 'root');
CREATE TABLE `user_group_member` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint unsigned not null,
`user_id` bigint unsigned not null,
KEY (`group_id`),
KEY (`user_id`)
KEY (`user_id`),
PRIMARY KEY(`id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
insert into user_group_member(group_id, user_id) values(1, 1);
@ -52,7 +54,7 @@ insert into user_group_member(group_id, user_id) values(1, 1);
CREATE TABLE `configs` (
`id` bigint unsigned not null auto_increment,
`ckey` varchar(191) not null,
`cval` varchar(1024) not null default '',
`cval` varchar(4096) not null default '',
PRIMARY KEY (`id`),
UNIQUE KEY (`ckey`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
@ -70,10 +72,12 @@ insert into `role`(name, note) values('Standard', 'Ordinary user role');
insert into `role`(name, note) values('Guest', 'Readonly user role');
CREATE TABLE `role_operation`(
`id` bigint unsigned not null auto_increment,
`role_name` varchar(128) not null,
`operation` varchar(191) not null,
KEY (`role_name`),
KEY (`operation`)
KEY (`operation`),
PRIMARY KEY(`id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
-- Admin is special, who has no concrete operation but can do anything.
@ -123,7 +127,10 @@ insert into `role_operation`(role_name, operation) values('Standard', '/job-tpls
insert into `role_operation`(role_name, operation) values('Standard', '/job-tasks');
insert into `role_operation`(role_name, operation) values('Standard', '/job-tasks/add');
insert into `role_operation`(role_name, operation) values('Standard', '/job-tasks/put');
insert into `role_operation`(role_name, operation) values('Standard', '/recording-rules');
insert into `role_operation`(role_name, operation) values('Standard', '/recording-rules/add');
insert into `role_operation`(role_name, operation) values('Standard', '/recording-rules/put');
insert into `role_operation`(role_name, operation) values('Standard', '/recording-rules/del');
-- for alert_rule | collect_rule | mute | dashboard grouping
CREATE TABLE `busi_group` (
@ -158,13 +165,16 @@ CREATE TABLE `board` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint not null default 0 comment 'busi group id',
`name` varchar(191) not null,
`ident` varchar(200) not null default '',
`tags` varchar(255) not null comment 'split by space',
`public` tinyint(1) not null default 0 comment '0:false 1:true',
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
`update_at` bigint not null default 0,
`update_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
UNIQUE KEY (`group_id`, `name`)
UNIQUE KEY (`group_id`, `name`),
KEY(`ident`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
-- for dashboard new version
@ -223,6 +233,7 @@ CREATE TABLE `chart_share` (
CREATE TABLE `alert_rule` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint not null default 0 comment 'busi group id',
`cate` varchar(128) not null,
`cluster` varchar(128) not null,
`name` varchar(255) not null,
`note` varchar(1024) not null default '',
@ -230,10 +241,10 @@ CREATE TABLE `alert_rule` (
`algorithm` varchar(255) not null default '',
`algo_params` varchar(255),
`delay` int not null default 0,
`severity` tinyint(1) not null comment '0:Emergency 1:Warning 2:Notice',
`severity` tinyint(1) not null comment '1:Emergency 2:Warning 3:Notice',
`disabled` tinyint(1) not null comment '0:enabled 1:disabled',
`prom_for_duration` int not null comment 'prometheus for, unit:s',
`prom_ql` varchar(8192) not null comment 'promql',
`prom_ql` text not null comment 'promql',
`prom_eval_interval` int not null comment 'evaluate interval',
`enable_stime` char(5) not null default '00:00',
`enable_etime` char(5) not null default '23:59',
@ -243,6 +254,7 @@ CREATE TABLE `alert_rule` (
`notify_channels` varchar(255) not null default '' comment 'split by space: sms voice email dingtalk wecom',
`notify_groups` varchar(255) not null default '' comment 'split by space: 233 43',
`notify_repeat_step` int not null default 0 comment 'unit: min',
`notify_max_number` int not null default 0 comment '',
`recover_duration` int not null default 0 comment 'unit: s',
`callbacks` varchar(255) not null default '' comment 'split by space: http://a.com/api/x http://a.com/api/y',
`runbook_url` varchar(255),
@ -259,13 +271,19 @@ CREATE TABLE `alert_rule` (
CREATE TABLE `alert_mute` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint not null default 0 comment 'busi group id',
`prod` varchar(255) not null default '',
`note` varchar(1024) not null default '',
`cate` varchar(128) not null,
`cluster` varchar(128) not null,
`tags` varchar(4096) not null default '' comment 'json,map,tagkey->regexp|value',
`cause` varchar(255) not null default '',
`btime` bigint not null default 0 comment 'begin time',
`etime` bigint not null default 0 comment 'end time',
`disabled` tinyint(1) not null default 0 comment '0:enabled 1:disabled',
`create_at` bigint not null default 0,
`create_by` varchar(64) not null default '',
`update_at` bigint not null default 0,
`update_by` varchar(64) not null default '',
PRIMARY KEY (`id`),
KEY (`create_at`),
KEY (`group_id`)
@ -273,7 +291,10 @@ CREATE TABLE `alert_mute` (
CREATE TABLE `alert_subscribe` (
`id` bigint unsigned not null auto_increment,
`name` varchar(255) not null default '',
`disabled` tinyint(1) not null default 0 comment '0:enabled 1:disabled',
`group_id` bigint not null default 0 comment 'busi group id',
`cate` varchar(128) not null,
`cluster` varchar(128) not null,
`rule_id` bigint not null default 0,
`tags` varchar(4096) not null default '' comment 'json,map,tagkey->regexp|value',
@ -339,6 +360,25 @@ CREATE TABLE `metric_view` (
insert into metric_view(name, cate, configs) values('Host View', 0, '{"filters":[{"oper":"=","label":"__name__","value":"cpu_usage_idle"}],"dynamicLabels":[],"dimensionLabels":[{"label":"ident","value":""}]}');
CREATE TABLE `recording_rule` (
`id` bigint unsigned not null auto_increment,
`group_id` bigint not null default '0' comment 'group_id',
`cluster` varchar(128) not null,
`name` varchar(255) not null comment 'new metric name',
`note` varchar(255) not null comment 'rule note',
`disabled` tinyint(1) not null default 0 comment '0:enabled 1:disabled',
`prom_ql` varchar(8192) not null comment 'promql',
`prom_eval_interval` int not null comment 'evaluate interval',
`append_tags` varchar(255) default '' comment 'split by space: service=n9e mod=api',
`create_at` bigint default '0',
`create_by` varchar(64) default '',
`update_at` bigint default '0',
`update_by` varchar(64) default '',
PRIMARY KEY (`id`),
KEY `group_id` (`group_id`),
KEY `update_at` (`update_at`)
) ENGINE=InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `alert_aggr_view` (
`id` bigint unsigned not null auto_increment,
`name` varchar(191) not null default '',
@ -356,6 +396,7 @@ insert into alert_aggr_view(name, rule, cate) values('By RuleName', 'field:rule_
CREATE TABLE `alert_cur_event` (
`id` bigint unsigned not null comment 'use alert_his_event.id',
`cate` varchar(128) not null,
`cluster` varchar(128) not null,
`group_id` bigint unsigned not null comment 'busi group id of rule',
`group_name` varchar(255) not null default '' comment 'busi group name',
@ -375,8 +416,10 @@ CREATE TABLE `alert_cur_event` (
`notify_channels` varchar(255) not null default '' comment 'split by space: sms voice email dingtalk wecom',
`notify_groups` varchar(255) not null default '' comment 'split by space: 233 43',
`notify_repeat_next` bigint not null default 0 comment 'next timestamp to notify, get repeat settings from rule',
`notify_cur_number` int not null default 0 comment '',
`target_ident` varchar(191) not null default '' comment 'target ident, also in tags',
`target_note` varchar(191) not null default '' comment 'target note',
`first_trigger_time` bigint,
`trigger_time` bigint not null,
`trigger_value` varchar(255) not null,
`tags` varchar(1024) not null default '' comment 'merge data_tags rule_tags, split by ,,',
@ -390,6 +433,7 @@ CREATE TABLE `alert_cur_event` (
CREATE TABLE `alert_his_event` (
`id` bigint unsigned not null AUTO_INCREMENT,
`is_recovered` tinyint(1) not null,
`cate` varchar(128) not null,
`cluster` varchar(128) not null,
`group_id` bigint unsigned not null comment 'busi group id of rule',
`group_name` varchar(255) not null default '' comment 'busi group name',
@ -408,8 +452,10 @@ CREATE TABLE `alert_his_event` (
`notify_recovered` tinyint(1) not null comment 'whether notify when recovery',
`notify_channels` varchar(255) not null default '' comment 'split by space: sms voice email dingtalk wecom',
`notify_groups` varchar(255) not null default '' comment 'split by space: 233 43',
`notify_cur_number` int not null default 0 comment '',
`target_ident` varchar(191) not null default '' comment 'target ident, also in tags',
`target_note` varchar(191) not null default '' comment 'target note',
`first_trigger_time` bigint,
`trigger_time` bigint not null,
`trigger_value` varchar(255) not null,
`recover_time` bigint not null default 0,
@ -472,3 +518,13 @@ CREATE TABLE `task_record`
KEY (`create_at`, `group_id`),
KEY (`create_by`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;
CREATE TABLE `alerting_engines`
(
`id` int unsigned NOT NULL AUTO_INCREMENT,
`instance` varchar(128) not null default '' comment 'instance identification, e.g. 10.9.0.9:9090',
`cluster` varchar(128) not null default '' comment 'target reader cluster',
`clock` bigint not null,
PRIMARY KEY (`id`),
UNIQUE KEY (`instance`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;

View File

@ -0,0 +1,615 @@
-- n9e version 5.8 for postgres
CREATE TABLE users (
id bigserial,
username varchar(64) not null ,
nickname varchar(64) not null ,
password varchar(128) not null default '',
phone varchar(16) not null default '',
email varchar(64) not null default '',
portrait varchar(255) not null default '' ,
roles varchar(255) not null ,
contacts varchar(1024) ,
maintainer smallint not null default 0,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE users ADD CONSTRAINT users_pk PRIMARY KEY (id);
ALTER TABLE users ADD CONSTRAINT users_un UNIQUE (username);
COMMENT ON COLUMN users.username IS 'login name, cannot rename';
COMMENT ON COLUMN users.nickname IS 'display name, chinese name';
COMMENT ON COLUMN users.portrait IS 'portrait image url';
COMMENT ON COLUMN users.roles IS 'Admin | Standard | Guest, split by space';
COMMENT ON COLUMN users.contacts IS 'json e.g. {wecom:xx, dingtalk_robot_token:yy}';
insert into users(id, username, nickname, password, roles, create_at, create_by, update_at, update_by) values(1, 'root', '超管', 'root.2020', 'Admin', date_part('epoch',current_timestamp)::int, 'system', date_part('epoch',current_timestamp)::int, 'system');
CREATE TABLE user_group (
id bigserial,
name varchar(128) not null default '',
note varchar(255) not null default '',
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE user_group ADD CONSTRAINT user_group_pk PRIMARY KEY (id);
CREATE INDEX user_group_create_by_idx ON user_group (create_by);
CREATE INDEX user_group_update_at_idx ON user_group (update_at);
insert into user_group(id, name, create_at, create_by, update_at, update_by) values(1, 'demo-root-group', date_part('epoch',current_timestamp)::int, 'root', date_part('epoch',current_timestamp)::int, 'root');
CREATE TABLE user_group_member (
id bigserial,
group_id bigint not null,
user_id bigint not null
) ;
ALTER TABLE user_group_member ADD CONSTRAINT user_group_member_pk PRIMARY KEY (id);
CREATE INDEX user_group_member_group_id_idx ON user_group_member (group_id);
CREATE INDEX user_group_member_user_id_idx ON user_group_member (user_id);
insert into user_group_member(group_id, user_id) values(1, 1);
CREATE TABLE configs (
id bigserial,
ckey varchar(191) not null,
cval varchar(4096) not null default ''
) ;
ALTER TABLE configs ADD CONSTRAINT configs_pk PRIMARY KEY (id);
ALTER TABLE configs ADD CONSTRAINT configs_un UNIQUE (ckey);
CREATE TABLE role (
id bigserial,
name varchar(191) not null default '',
note varchar(255) not null default ''
) ;
ALTER TABLE role ADD CONSTRAINT role_pk PRIMARY KEY (id);
ALTER TABLE role ADD CONSTRAINT role_un UNIQUE ("name");
insert into role(name, note) values('Admin', 'Administrator role');
insert into role(name, note) values('Standard', 'Ordinary user role');
insert into role(name, note) values('Guest', 'Readonly user role');
CREATE TABLE role_operation(
id bigserial,
role_name varchar(128) not null,
operation varchar(191) not null
) ;
ALTER TABLE role_operation ADD CONSTRAINT role_operation_pk PRIMARY KEY (id);
CREATE INDEX role_operation_role_name_idx ON role_operation (role_name);
CREATE INDEX role_operation_operation_idx ON role_operation (operation);
-- Admin is special, who has no concrete operation but can do anything.
insert into role_operation(role_name, operation) values('Guest', '/metric/explorer');
insert into role_operation(role_name, operation) values('Guest', '/object/explorer');
insert into role_operation(role_name, operation) values('Guest', '/help/version');
insert into role_operation(role_name, operation) values('Guest', '/help/contact');
insert into role_operation(role_name, operation) values('Standard', '/metric/explorer');
insert into role_operation(role_name, operation) values('Standard', '/object/explorer');
insert into role_operation(role_name, operation) values('Standard', '/help/version');
insert into role_operation(role_name, operation) values('Standard', '/help/contact');
insert into role_operation(role_name, operation) values('Standard', '/users');
insert into role_operation(role_name, operation) values('Standard', '/user-groups');
insert into role_operation(role_name, operation) values('Standard', '/user-groups/add');
insert into role_operation(role_name, operation) values('Standard', '/user-groups/put');
insert into role_operation(role_name, operation) values('Standard', '/user-groups/del');
insert into role_operation(role_name, operation) values('Standard', '/busi-groups');
insert into role_operation(role_name, operation) values('Standard', '/busi-groups/add');
insert into role_operation(role_name, operation) values('Standard', '/busi-groups/put');
insert into role_operation(role_name, operation) values('Standard', '/busi-groups/del');
insert into role_operation(role_name, operation) values('Standard', '/targets');
insert into role_operation(role_name, operation) values('Standard', '/targets/add');
insert into role_operation(role_name, operation) values('Standard', '/targets/put');
insert into role_operation(role_name, operation) values('Standard', '/targets/del');
insert into role_operation(role_name, operation) values('Standard', '/dashboards');
insert into role_operation(role_name, operation) values('Standard', '/dashboards/add');
insert into role_operation(role_name, operation) values('Standard', '/dashboards/put');
insert into role_operation(role_name, operation) values('Standard', '/dashboards/del');
insert into role_operation(role_name, operation) values('Standard', '/alert-rules');
insert into role_operation(role_name, operation) values('Standard', '/alert-rules/add');
insert into role_operation(role_name, operation) values('Standard', '/alert-rules/put');
insert into role_operation(role_name, operation) values('Standard', '/alert-rules/del');
insert into role_operation(role_name, operation) values('Standard', '/alert-mutes');
insert into role_operation(role_name, operation) values('Standard', '/alert-mutes/add');
insert into role_operation(role_name, operation) values('Standard', '/alert-mutes/del');
insert into role_operation(role_name, operation) values('Standard', '/alert-subscribes');
insert into role_operation(role_name, operation) values('Standard', '/alert-subscribes/add');
insert into role_operation(role_name, operation) values('Standard', '/alert-subscribes/put');
insert into role_operation(role_name, operation) values('Standard', '/alert-subscribes/del');
insert into role_operation(role_name, operation) values('Standard', '/alert-cur-events');
insert into role_operation(role_name, operation) values('Standard', '/alert-cur-events/del');
insert into role_operation(role_name, operation) values('Standard', '/alert-his-events');
insert into role_operation(role_name, operation) values('Standard', '/job-tpls');
insert into role_operation(role_name, operation) values('Standard', '/job-tpls/add');
insert into role_operation(role_name, operation) values('Standard', '/job-tpls/put');
insert into role_operation(role_name, operation) values('Standard', '/job-tpls/del');
insert into role_operation(role_name, operation) values('Standard', '/job-tasks');
insert into role_operation(role_name, operation) values('Standard', '/job-tasks/add');
insert into role_operation(role_name, operation) values('Standard', '/job-tasks/put');
insert into role_operation(role_name, operation) values('Standard', '/recording-rules');
insert into role_operation(role_name, operation) values('Standard', '/recording-rules/add');
insert into role_operation(role_name, operation) values('Standard', '/recording-rules/put');
insert into role_operation(role_name, operation) values('Standard', '/recording-rules/del');
CREATE TABLE recording_rule (
id bigserial NOT NULL,
group_id bigint not null default 0,
cluster varchar(128) not null,
name varchar(255) not null,
note varchar(255) not null,
disabled smallint not null,
prom_ql varchar(8192) not null,
prom_eval_interval int not null,
append_tags varchar(255) default '',
create_at bigint default 0,
create_by varchar(64) default '',
update_at bigint default 0,
update_by varchar(64) default '',
CONSTRAINT recording_rule_pk PRIMARY KEY (id)
) ;
CREATE INDEX recording_rule_group_id_idx ON recording_rule (group_id);
CREATE INDEX recording_rule_update_at_idx ON recording_rule (update_at);
COMMENT ON COLUMN recording_rule.group_id IS 'group_id';
COMMENT ON COLUMN recording_rule.name IS 'new metric name';
COMMENT ON COLUMN recording_rule.note IS 'rule note';
COMMENT ON COLUMN recording_rule.disabled IS '0:enabled 1:disabled';
COMMENT ON COLUMN recording_rule.prom_ql IS 'promql';
COMMENT ON COLUMN recording_rule.prom_eval_interval IS 'evaluate interval';
COMMENT ON COLUMN recording_rule.append_tags IS 'split by space: service=n9e mod=api';
-- for alert_rule | collect_rule | mute | dashboard grouping
CREATE TABLE busi_group (
id bigserial,
name varchar(191) not null,
label_enable smallint not null default 0,
label_value varchar(191) not null default '' ,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE busi_group ADD CONSTRAINT busi_group_pk PRIMARY KEY (id);
ALTER TABLE busi_group ADD CONSTRAINT busi_group_un UNIQUE ("name");
COMMENT ON COLUMN busi_group.label_value IS 'if label_enable: label_value can not be blank';
insert into busi_group(id, name, create_at, create_by, update_at, update_by) values(1, 'Default Busi Group', date_part('epoch',current_timestamp)::int, 'root', date_part('epoch',current_timestamp)::int, 'root');
CREATE TABLE busi_group_member (
id bigserial,
busi_group_id bigint not null ,
user_group_id bigint not null ,
perm_flag char(2) not null
) ;
ALTER TABLE busi_group_member ADD CONSTRAINT busi_group_member_pk PRIMARY KEY (id);
CREATE INDEX busi_group_member_busi_group_id_idx ON busi_group_member (busi_group_id);
CREATE INDEX busi_group_member_user_group_id_idx ON busi_group_member (user_group_id);
COMMENT ON COLUMN busi_group_member.busi_group_id IS 'busi group id';
COMMENT ON COLUMN busi_group_member.user_group_id IS 'user group id';
COMMENT ON COLUMN busi_group_member.perm_flag IS 'ro | rw';
insert into busi_group_member(busi_group_id, user_group_id, perm_flag) values(1, 1, 'rw');
-- for dashboard new version
CREATE TABLE board (
id bigserial not null ,
group_id bigint not null default 0 ,
name varchar(191) not null,
ident varchar(200) not null default '',
tags varchar(255) not null ,
public smallint not null default 0,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE board ADD CONSTRAINT board_pk PRIMARY KEY (id);
ALTER TABLE board ADD CONSTRAINT board_un UNIQUE (group_id,"name");
COMMENT ON COLUMN board.group_id IS 'busi group id';
COMMENT ON COLUMN board.tags IS 'split by space';
COMMENT ON COLUMN board.public IS '0:false 1:true';
CREATE INDEX board_ident_idx ON board (ident);
-- for dashboard new version
CREATE TABLE board_payload (
id bigint not null ,
payload text not null
);
ALTER TABLE board_payload ADD CONSTRAINT board_payload_un UNIQUE (id);
COMMENT ON COLUMN board_payload.id IS 'dashboard id';
-- deprecated
CREATE TABLE dashboard (
id bigserial,
group_id bigint not null default 0 ,
name varchar(191) not null,
tags varchar(255) not null ,
configs varchar(8192) ,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE dashboard ADD CONSTRAINT dashboard_pk PRIMARY KEY (id);
-- deprecated
-- auto create the first subclass 'Default chart group' of dashboard
CREATE TABLE chart_group (
id bigserial,
dashboard_id bigint not null,
name varchar(255) not null,
weight int not null default 0
) ;
ALTER TABLE chart_group ADD CONSTRAINT chart_group_pk PRIMARY KEY (id);
-- deprecated
CREATE TABLE chart (
id bigserial,
group_id bigint not null ,
configs text,
weight int not null default 0
) ;
ALTER TABLE chart ADD CONSTRAINT chart_pk PRIMARY KEY (id);
CREATE TABLE chart_share (
id bigserial,
cluster varchar(128) not null,
configs text,
create_at bigint not null default 0,
create_by varchar(64) not null default ''
) ;
ALTER TABLE chart_share ADD CONSTRAINT chart_share_pk PRIMARY KEY (id);
CREATE INDEX chart_share_create_at_idx ON chart_share (create_at);
CREATE TABLE alert_rule (
id bigserial NOT NULL,
group_id int8 NOT NULL DEFAULT 0,
cate varchar(128) not null default '' ,
"cluster" varchar(128) NOT NULL,
"name" varchar(255) NOT NULL,
note varchar(1024) NOT NULL,
severity int2 NOT NULL,
disabled int2 NOT NULL,
prom_for_duration int4 NOT NULL,
prom_ql text NOT NULL,
prom_eval_interval int4 NOT NULL,
enable_stime bpchar(5) NOT NULL DEFAULT '00:00'::bpchar,
enable_etime bpchar(5) NOT NULL DEFAULT '23:59'::bpchar,
enable_days_of_week varchar(32) NOT NULL DEFAULT ''::character varying,
enable_in_bg int2 NOT NULL DEFAULT 0,
notify_recovered int2 NOT NULL,
notify_channels varchar(255) NOT NULL DEFAULT ''::character varying,
notify_groups varchar(255) NOT NULL DEFAULT ''::character varying,
notify_repeat_step int4 NOT NULL DEFAULT 0,
notify_max_number int4 not null default 0,
recover_duration int4 NOT NULL DEFAULT 0,
callbacks varchar(255) NOT NULL DEFAULT ''::character varying,
runbook_url varchar(255) NULL,
append_tags varchar(255) NOT NULL DEFAULT ''::character varying,
create_at int8 NOT NULL DEFAULT 0,
create_by varchar(64) NOT NULL DEFAULT ''::character varying,
update_at int8 NOT NULL DEFAULT 0,
update_by varchar(64) NOT NULL DEFAULT ''::character varying,
prod varchar(255) NOT NULL DEFAULT ''::character varying,
algorithm varchar(255) NOT NULL DEFAULT ''::character varying,
algo_params varchar(255) NULL,
delay int4 NOT NULL DEFAULT 0,
CONSTRAINT alert_rule_pk PRIMARY KEY (id)
);
CREATE INDEX alert_rule_group_id_idx ON alert_rule USING btree (group_id);
CREATE INDEX alert_rule_update_at_idx ON alert_rule USING btree (update_at);
COMMENT ON COLUMN alert_rule.group_id IS 'busi group id';
COMMENT ON COLUMN alert_rule.severity IS '0:Emergency 1:Warning 2:Notice';
COMMENT ON COLUMN alert_rule.disabled IS '0:enabled 1:disabled';
COMMENT ON COLUMN alert_rule.prom_for_duration IS 'prometheus for, unit:s';
COMMENT ON COLUMN alert_rule.prom_ql IS 'promql';
COMMENT ON COLUMN alert_rule.prom_eval_interval IS 'evaluate interval';
COMMENT ON COLUMN alert_rule.enable_stime IS '00:00';
COMMENT ON COLUMN alert_rule.enable_etime IS '23:59';
COMMENT ON COLUMN alert_rule.enable_days_of_week IS 'split by space: 0 1 2 3 4 5 6';
COMMENT ON COLUMN alert_rule.enable_in_bg IS '1: only this bg 0: global';
COMMENT ON COLUMN alert_rule.notify_recovered IS 'whether notify when recovery';
COMMENT ON COLUMN alert_rule.notify_channels IS 'split by space: sms voice email dingtalk wecom';
COMMENT ON COLUMN alert_rule.notify_groups IS 'split by space: 233 43';
COMMENT ON COLUMN alert_rule.notify_repeat_step IS 'unit: min';
COMMENT ON COLUMN alert_rule.recover_duration IS 'unit: s';
COMMENT ON COLUMN alert_rule.callbacks IS 'split by space: http://a.com/api/x http://a.com/api/y';
COMMENT ON COLUMN alert_rule.append_tags IS 'split by space: service=n9e mod=api';
CREATE TABLE alert_mute (
id bigserial,
group_id bigint not null default 0 ,
cate varchar(128) not null default '' ,
prod varchar(255) NOT NULL DEFAULT '' ,
note varchar(1024) not null default '',
cluster varchar(128) not null,
tags varchar(4096) not null default '' ,
cause varchar(255) not null default '',
btime bigint not null default 0 ,
etime bigint not null default 0 ,
disabled smallint not null default 0 ,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE alert_mute ADD CONSTRAINT alert_mute_pk PRIMARY KEY (id);
CREATE INDEX alert_mute_create_at_idx ON alert_mute (create_at);
CREATE INDEX alert_mute_group_id_idx ON alert_mute (group_id);
COMMENT ON COLUMN alert_mute.group_id IS 'busi group id';
COMMENT ON COLUMN alert_mute.tags IS 'json,map,tagkey->regexp|value';
COMMENT ON COLUMN alert_mute.btime IS 'begin time';
COMMENT ON COLUMN alert_mute.etime IS 'end time';
COMMENT ON COLUMN alert_mute.disabled IS '0:enabled 1:disabled';
CREATE TABLE alert_subscribe (
id bigserial,
"name" varchar(255) NOT NULL default '',
disabled int2 NOT NULL default 0 ,
group_id bigint not null default 0 ,
cate varchar(128) not null default '' ,
cluster varchar(128) not null,
rule_id bigint not null default 0,
tags jsonb not null ,
redefine_severity smallint default 0 ,
new_severity smallint not null ,
redefine_channels smallint default 0 ,
new_channels varchar(255) not null default '' ,
user_group_ids varchar(250) not null ,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE alert_subscribe ADD CONSTRAINT alert_subscribe_pk PRIMARY KEY (id);
CREATE INDEX alert_subscribe_group_id_idx ON alert_subscribe (group_id);
CREATE INDEX alert_subscribe_update_at_idx ON alert_subscribe (update_at);
COMMENT ON COLUMN alert_subscribe.disabled IS '0:enabled 1:disabled';
COMMENT ON COLUMN alert_subscribe.group_id IS 'busi group id';
COMMENT ON COLUMN alert_subscribe.tags IS 'json,map,tagkey->regexp|value';
COMMENT ON COLUMN alert_subscribe.redefine_severity IS 'is redefine severity?';
COMMENT ON COLUMN alert_subscribe.new_severity IS '0:Emergency 1:Warning 2:Notice';
COMMENT ON COLUMN alert_subscribe.redefine_channels IS 'is redefine channels?';
COMMENT ON COLUMN alert_subscribe.new_channels IS 'split by space: sms voice email dingtalk wecom';
COMMENT ON COLUMN alert_subscribe.user_group_ids IS 'split by space 1 34 5, notify cc to user_group_ids';
CREATE TABLE target (
id bigserial,
group_id bigint not null default 0 ,
cluster varchar(128) not null ,
ident varchar(191) not null ,
note varchar(255) not null default '' ,
tags varchar(512) not null default '' ,
update_at bigint not null default 0
) ;
ALTER TABLE target ADD CONSTRAINT target_pk PRIMARY KEY (id);
ALTER TABLE target ADD CONSTRAINT target_un UNIQUE (ident);
CREATE INDEX target_group_id_idx ON target (group_id);
COMMENT ON COLUMN target.group_id IS 'busi group id';
COMMENT ON COLUMN target."cluster" IS 'append to alert event as field';
COMMENT ON COLUMN target.ident IS 'target id';
COMMENT ON COLUMN target.note IS 'append to alert event as field';
COMMENT ON COLUMN target.tags IS 'append to series data as tags, split by space, append external space at suffix';
CREATE TABLE metric_view (
id bigserial,
name varchar(191) not null default '',
cate smallint not null ,
configs varchar(8192) not null default '',
create_at bigint not null default 0,
create_by bigint not null default 0 ,
update_at bigint not null default 0
) ;
ALTER TABLE metric_view ADD CONSTRAINT metric_view_pk PRIMARY KEY (id);
CREATE INDEX metric_view_create_by_idx ON metric_view (create_by);
COMMENT ON COLUMN metric_view.cate IS '0: preset 1: custom';
COMMENT ON COLUMN metric_view.create_by IS 'user id';
insert into metric_view(name, cate, configs) values('Host View', 0, '{"filters":[{"oper":"=","label":"__name__","value":"cpu_usage_idle"}],"dynamicLabels":[],"dimensionLabels":[{"label":"ident","value":""}]}');
CREATE TABLE alert_aggr_view (
id bigserial,
name varchar(191) not null default '',
rule varchar(2048) not null default '',
cate smallint not null ,
create_at bigint not null default 0,
create_by bigint not null default 0 ,
update_at bigint not null default 0
) ;
ALTER TABLE alert_aggr_view ADD CONSTRAINT alert_aggr_view_pk PRIMARY KEY (id);
CREATE INDEX alert_aggr_view_create_by_idx ON alert_aggr_view (create_by);
COMMENT ON COLUMN alert_aggr_view.cate IS '0: preset 1: custom';
COMMENT ON COLUMN alert_aggr_view.create_by IS 'user id';
insert into alert_aggr_view(name, rule, cate) values('By BusiGroup, Severity', 'field:group_name::field:severity', 0);
insert into alert_aggr_view(name, rule, cate) values('By RuleName', 'field:rule_name', 0);
CREATE TABLE alert_cur_event (
id bigserial NOT NULL,
cate varchar(128) not null default '' ,
"cluster" varchar(128) NOT NULL,
group_id int8 NOT NULL,
group_name varchar(255) NOT NULL DEFAULT ''::character varying,
hash varchar(64) NOT NULL,
rule_id int8 NOT NULL,
rule_name varchar(255) NOT NULL,
rule_note varchar(2048) NOT NULL DEFAULT 'alert rule note'::character varying,
severity int2 NOT NULL,
prom_for_duration int4 NOT NULL,
prom_ql varchar(8192) NOT NULL,
prom_eval_interval int4 NOT NULL,
callbacks varchar(255) NOT NULL DEFAULT ''::character varying,
runbook_url varchar(255) NULL,
notify_recovered int2 NOT NULL,
notify_channels varchar(255) NOT NULL DEFAULT ''::character varying,
notify_groups varchar(255) NOT NULL DEFAULT ''::character varying,
notify_repeat_next int8 NOT NULL DEFAULT 0,
notify_cur_number int4 not null default 0,
target_ident varchar(191) NOT NULL DEFAULT ''::character varying,
target_note varchar(191) NOT NULL DEFAULT ''::character varying,
first_trigger_time int8,
trigger_time int8 NOT NULL,
trigger_value varchar(255) NOT NULL,
tags varchar(1024) NOT NULL DEFAULT ''::character varying,
rule_prod varchar(255) NOT NULL DEFAULT ''::character varying,
rule_algo varchar(255) NOT NULL DEFAULT ''::character varying,
CONSTRAINT alert_cur_event_pk PRIMARY KEY (id)
);
CREATE INDEX alert_cur_event_hash_idx ON alert_cur_event USING btree (hash);
CREATE INDEX alert_cur_event_notify_repeat_next_idx ON alert_cur_event USING btree (notify_repeat_next);
CREATE INDEX alert_cur_event_rule_id_idx ON alert_cur_event USING btree (rule_id);
CREATE INDEX alert_cur_event_trigger_time_idx ON alert_cur_event USING btree (trigger_time, group_id);
COMMENT ON COLUMN alert_cur_event.id IS 'use alert_his_event.id';
COMMENT ON COLUMN alert_cur_event.group_id IS 'busi group id of rule';
COMMENT ON COLUMN alert_cur_event.group_name IS 'busi group name';
COMMENT ON COLUMN alert_cur_event.hash IS 'rule_id + vector_pk';
COMMENT ON COLUMN alert_cur_event.rule_note IS 'alert rule note';
COMMENT ON COLUMN alert_cur_event.severity IS '1:Emergency 2:Warning 3:Notice';
COMMENT ON COLUMN alert_cur_event.prom_for_duration IS 'prometheus for, unit:s';
COMMENT ON COLUMN alert_cur_event.prom_ql IS 'promql';
COMMENT ON COLUMN alert_cur_event.prom_eval_interval IS 'evaluate interval';
COMMENT ON COLUMN alert_cur_event.callbacks IS 'split by space: http://a.com/api/x http://a.com/api/y';
COMMENT ON COLUMN alert_cur_event.notify_recovered IS 'whether notify when recovery';
COMMENT ON COLUMN alert_cur_event.notify_channels IS 'split by space: sms voice email dingtalk wecom';
COMMENT ON COLUMN alert_cur_event.notify_groups IS 'split by space: 233 43';
COMMENT ON COLUMN alert_cur_event.notify_repeat_next IS 'next timestamp to notify, get repeat settings from rule';
COMMENT ON COLUMN alert_cur_event.target_ident IS 'target ident, also in tags';
COMMENT ON COLUMN alert_cur_event.target_note IS 'target note';
COMMENT ON COLUMN alert_cur_event.tags IS 'merge data_tags rule_tags, split by ,,';
CREATE TABLE alert_his_event (
id bigserial NOT NULL,
is_recovered int2 NOT NULL,
cate varchar(128) not null default '' ,
"cluster" varchar(128) NOT NULL,
group_id int8 NOT NULL,
group_name varchar(255) NOT NULL DEFAULT ''::character varying,
hash varchar(64) NOT NULL,
rule_id int8 NOT NULL,
rule_name varchar(255) NOT NULL,
rule_note varchar(2048) NOT NULL DEFAULT 'alert rule note'::character varying,
severity int2 NOT NULL,
prom_for_duration int4 NOT NULL,
prom_ql varchar(8192) NOT NULL,
prom_eval_interval int4 NOT NULL,
callbacks varchar(255) NOT NULL DEFAULT ''::character varying,
runbook_url varchar(255) NULL,
notify_recovered int2 NOT NULL,
notify_channels varchar(255) NOT NULL DEFAULT ''::character varying,
notify_groups varchar(255) NOT NULL DEFAULT ''::character varying,
notify_cur_number int4 not null default 0,
target_ident varchar(191) NOT NULL DEFAULT ''::character varying,
target_note varchar(191) NOT NULL DEFAULT ''::character varying,
first_trigger_time int8,
trigger_time int8 NOT NULL,
trigger_value varchar(255) NOT NULL,
recover_time int8 NOT NULL DEFAULT 0,
last_eval_time int8 NOT NULL DEFAULT 0,
tags varchar(1024) NOT NULL DEFAULT ''::character varying,
rule_prod varchar(255) NOT NULL DEFAULT ''::character varying,
rule_algo varchar(255) NOT NULL DEFAULT ''::character varying,
CONSTRAINT alert_his_event_pk PRIMARY KEY (id)
);
CREATE INDEX alert_his_event_hash_idx ON alert_his_event USING btree (hash);
CREATE INDEX alert_his_event_rule_id_idx ON alert_his_event USING btree (rule_id);
CREATE INDEX alert_his_event_trigger_time_idx ON alert_his_event USING btree (trigger_time, group_id);
COMMENT ON COLUMN alert_his_event.group_id IS 'busi group id of rule';
COMMENT ON COLUMN alert_his_event.group_name IS 'busi group name';
COMMENT ON COLUMN alert_his_event.hash IS 'rule_id + vector_pk';
COMMENT ON COLUMN alert_his_event.rule_note IS 'alert rule note';
COMMENT ON COLUMN alert_his_event.severity IS '0:Emergency 1:Warning 2:Notice';
COMMENT ON COLUMN alert_his_event.prom_for_duration IS 'prometheus for, unit:s';
COMMENT ON COLUMN alert_his_event.prom_ql IS 'promql';
COMMENT ON COLUMN alert_his_event.prom_eval_interval IS 'evaluate interval';
COMMENT ON COLUMN alert_his_event.callbacks IS 'split by space: http://a.com/api/x http://a.com/api/y';
COMMENT ON COLUMN alert_his_event.notify_recovered IS 'whether notify when recovery';
COMMENT ON COLUMN alert_his_event.notify_channels IS 'split by space: sms voice email dingtalk wecom';
COMMENT ON COLUMN alert_his_event.notify_groups IS 'split by space: 233 43';
COMMENT ON COLUMN alert_his_event.target_ident IS 'target ident, also in tags';
COMMENT ON COLUMN alert_his_event.target_note IS 'target note';
COMMENT ON COLUMN alert_his_event.last_eval_time IS 'for time filter';
COMMENT ON COLUMN alert_his_event.tags IS 'merge data_tags rule_tags, split by ,,';
CREATE TABLE task_tpl
(
id serial,
group_id int not null ,
title varchar(255) not null default '',
account varchar(64) not null,
batch int not null default 0,
tolerance int not null default 0,
timeout int not null default 0,
pause varchar(255) not null default '',
script text not null,
args varchar(512) not null default '',
tags varchar(255) not null default '' ,
create_at bigint not null default 0,
create_by varchar(64) not null default '',
update_at bigint not null default 0,
update_by varchar(64) not null default ''
) ;
ALTER TABLE task_tpl ADD CONSTRAINT task_tpl_pk PRIMARY KEY (id);
CREATE INDEX task_tpl_group_id_idx ON task_tpl (group_id);
COMMENT ON COLUMN task_tpl.group_id IS 'busi group id';
COMMENT ON COLUMN task_tpl.tags IS 'split by space';
CREATE TABLE task_tpl_host
(
ii serial,
id int not null ,
host varchar(128) not null
) ;
ALTER TABLE task_tpl_host ADD CONSTRAINT task_tpl_host_pk PRIMARY KEY (id);
CREATE INDEX task_tpl_host_id_idx ON task_tpl_host (id,host);
COMMENT ON COLUMN task_tpl_host.id IS 'task tpl id';
COMMENT ON COLUMN task_tpl_host.host IS 'ip or hostname';
CREATE TABLE task_record
(
id bigserial,
group_id bigint not null ,
ibex_address varchar(128) not null,
ibex_auth_user varchar(128) not null default '',
ibex_auth_pass varchar(128) not null default '',
title varchar(255) not null default '',
account varchar(64) not null,
batch int not null default 0,
tolerance int not null default 0,
timeout int not null default 0,
pause varchar(255) not null default '',
script text not null,
args varchar(512) not null default '',
create_at bigint not null default 0,
create_by varchar(64) not null default ''
) ;
ALTER TABLE task_record ADD CONSTRAINT task_record_pk PRIMARY KEY (id);
CREATE INDEX task_record_create_at_idx ON task_record (create_at,group_id);
CREATE INDEX task_record_create_by_idx ON task_record (create_by);
COMMENT ON COLUMN task_record.id IS 'ibex task id';
COMMENT ON COLUMN task_record.group_id IS 'busi group id';
CREATE TABLE alerting_engines
(
id bigserial NOT NULL,
instance varchar(128) not null default '',
cluster varchar(128) not null default '',
clock bigint not null
) ;
ALTER TABLE alerting_engines ADD CONSTRAINT alerting_engines_pk PRIMARY KEY (id);
ALTER TABLE alerting_engines ADD CONSTRAINT alerting_engines_un UNIQUE (instance);
COMMENT ON COLUMN alerting_engines.instance IS 'instance identification, e.g. 10.9.0.9:9090';
COMMENT ON COLUMN alerting_engines.cluster IS 'target reader cluster';

View File

@ -0,0 +1,30 @@
[
{
"name": "HTTP地址探测失败",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "http_response_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -0,0 +1,243 @@
[
{
"name": "监控对象失联",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "max_over_time(target_up[130s]) == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "机器负载-CPU较高请关注",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "cpu_usage_idle{cpu=\"cpu-total\"} < 25",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "机器负载-内存较高,请关注",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "mem_available_percent < 25",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "硬盘-IO有点繁忙",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "rate(diskio_io_time[1m])/10 > 99",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "硬盘-预计再有4小时写满",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "predict_linear(disk_free[1h], 4*3600) < 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网卡-入向有丢包",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "increase(net_drop_in[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网卡-出向有丢包",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "increase(net_drop_out[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网络连接-TME_WAIT数量超过2万",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "netstat_tcp_time_wait > 20000",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -0,0 +1,302 @@
[
{
"name": "MysqlInnodbLogWaits",
"note": "MySQL innodb log writes stalling",
"severity": 2,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "rate(mysql_global_status_innodb_log_waits[15m]) > 10",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlInnodbLogWaits"
]
},
{
"name": "MysqlSlaveIoThreadNotRunning",
"note": "MySQL Slave IO thread not running",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) mysql_slave_status_slave_io_running == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveIoThreadNotRunning"
]
},
{
"name": "MysqlSlaveReplicationLag",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) (mysql_slave_status_seconds_behind_master - mysql_slave_status_sql_delay) > 30",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveReplicationLag"
]
},
{
"name": "MysqlSlaveSqlThreadNotRunning",
"note": "MySQL Slave SQL thread not running",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) mysql_slave_status_slave_sql_running == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveSqlThreadNotRunning"
]
},
{
"name": "Mysql刚刚有重启请注意",
"note": "MySQL has just been restarted, less than one minute ago",
"severity": 3,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_global_status_uptime < 60",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlRestarted"
]
},
{
"name": "Mysql实例挂了",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_up == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlDown"
]
},
{
"name": "Mysql打开了很多文件句柄请注意",
"note": "More than 80% of MySQL files open",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_open_files) / avg by (instance)(mysql_global_variables_open_files_limit) * 100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlHighOpenFiles"
]
},
{
"name": "Mysql最近一分钟有慢查询出现",
"note": "MySQL server mysql has some new slow query",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "increase(mysql_global_status_slow_queries[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlowQueries"
]
},
{
"name": "Mysql有超过60%的连接是running状态",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_threads_running) / avg by (instance) (mysql_global_variables_max_connections) * 100 > 60",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlHighThreadsRunning"
]
},
{
"name": "Mysql连接数已超过80%",
"note": "More than 80% of MySQL connections are in use",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_threads_connected) / avg by (instance) (mysql_global_variables_max_connections) * 100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlTooManyConnections"
]
}
]

View File

@ -0,0 +1,30 @@
[
{
"name": "网络地址探活失败",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "net_response_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -0,0 +1,30 @@
[
{
"name": "NTP时间偏移太大",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "ntp_offset_ms > 1000 or ntp_offset_ms < -1000",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -0,0 +1,30 @@
[
{
"name": "PING地址探测失败",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "ping_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -0,0 +1,62 @@
[
{
"name": "进程监控-有进程数为0某进程可能挂了",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "procstat_lookup_count == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "进程监控-进程句柄限制过小",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "procstat_rlimit_num_fds_soft < 2048",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -0,0 +1,182 @@
[
{
"name": "Redis Ping 延迟高大于100毫秒",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "redis_ping_use_seconds > 0.1",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=HighPingLatency"
]
},
{
"name": "Redis内存使用率较高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "redis_maxmemory > 0 and (redis_used_memory / redis_maxmemory) > 0.85",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisHighMemoryUsage"
]
},
{
"name": "Redis出现拒绝连接",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "(rate(redis_rejected_connections[5m])) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisRejectedConnHigh"
]
},
{
"name": "Redis刚刚有重启请注意",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "redis_uptime_in_seconds < 600",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisLowUptime"
]
},
{
"name": "Redis较低的命中率",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "rate(redis_keyspace_hits[5m])\n/\n(rate(redis_keyspace_misses[5m]) + rate(redis_keyspace_hits[5m]))\n< 0.9",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisLowHitRatio"
]
},
{
"name": "Redis驱逐率较高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "(sum(rate(redis_evicted_keys[5m])) / sum(redis_keyspace_keys)) > 0.1",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisHighKeysEvictionRatio"
]
}
]

View File

@ -0,0 +1,19 @@
[
{
"name": "HTTP探测",
"tags": "",
"configs": "",
"chart_groups": [
{
"name": "Default chart group",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"max(http_response_result_code) by (target)\",\"legend\":\"UP?\"},{\"expr\":\"max(http_response_response_code) by (target)\",\"refId\":\"B\",\"legend\":\"status code\"},{\"expr\":\"max(http_response_response_time) by (target)\",\"refId\":\"C\",\"legend\":\"latency(s)\"},{\"expr\":\"max(http_response_cert_expire_timestamp) by (target) - time()\",\"refId\":\"D\",\"legend\":\"cert expire\"}],\"name\":\"URL Details\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelValuesToRows\",\"aggrDimension\":\"target\"},\"options\":{\"valueMappings\":[],\"standardOptions\":{}},\"overrides\":[{\"properties\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"UP\",\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"special\":1,\"from\":1},\"result\":{\"text\":\"DOWN\",\"color\":\"#e90f0f\"}}],\"standardOptions\":{}},\"matcher\":{\"value\":\"A\"}},{\"type\":\"special\",\"matcher\":{\"value\":\"D\"},\"properties\":{\"standardOptions\":{\"util\":\"humantimeSeconds\"}}}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":4,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,119 @@
[
{
"name": "MySQL Overview - 模板",
"tags": "Prometheus MySQL",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(mysql_global_status_uptime, instance)\"}]}",
"chart_groups": [
{
"name": "Basic Info",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"min(mysql_global_status_uptime{instance=~\\\"$instance\\\"})\"}],\"name\":\"MySQL Uptime\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":1800},\"result\":{\"color\":\"#ec7718\"}},{\"type\":\"range\",\"match\":{\"from\":1800},\"result\":{\"color\":\"#369603\"}}],\"standardOptions\":{\"util\":\"humantimeSeconds\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_queries{instance=~\\\"$instance\\\"}[5m])\"}],\"name\":\"Current QPS\",\"description\":\"mysql_global_status_queries\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":100},\"result\":{\"color\":\"#05a31f\"}},{\"type\":\"range\",\"match\":{\"from\":100},\"result\":{\"color\":\"#ea3939\"}}],\"standardOptions\":{\"decimals\":2}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"avg(mysql_global_variables_innodb_buffer_pool_size{instance=~\\\"$instance\\\"})\"}],\"name\":\"InnoDB Buffer Pool\",\"description\":\"**InnoDB Buffer Pool Size**\\n\\nInnoDB maintains a storage area called the buffer pool for caching data and indexes in memory. Knowing how the InnoDB buffer pool works, and taking advantage of it to keep frequently accessed data in memory, is one of the most important aspects of MySQL tuning. The goal is to keep the working set in memory. In most cases, this should be between 60%-90% of available memory on a dedicated database host, but depends on many factors.\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"bytesIEC\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(increase(mysql_global_status_table_locks_waited{instance=~\\\"$instance\\\"}[5m]))\"}],\"name\":\"Table Locks Waited(5min)\",\"description\":\"**Table Locks**\\n\\nMySQL takes a number of different locks for varying reasons. In this graph we see how many Table level locks MySQL has requested from the storage engine. In the case of InnoDB, many times the locks could actually be row locks as it only takes table level locks in a few specific cases.\\n\\nIt is most useful to compare Locks Immediate and Locks Waited. If Locks waited is rising, it means you have lock contention. Otherwise, Locks Immediate rising and falling is normal activity.\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":1},\"result\":{\"color\":\"#e70d0d\"}},{\"type\":\"range\",\"match\":{\"to\":1},\"result\":{\"color\":\"#53b503\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Connections",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(mysql_global_status_threads_connected{instance=~\\\"$instance\\\"})\",\"legend\":\"Connections\"},{\"expr\":\"sum(mysql_global_status_max_used_connections{instance=~\\\"$instance\\\"})\",\"legend\":\"Max Used Connections\"},{\"expr\":\"sum(mysql_global_variables_max_connections{instance=~\\\"$instance\\\"})\",\"legend\":\"Max Connections\"},{\"expr\":\"sum(rate(mysql_global_status_aborted_connects{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Aborted Connections\"}],\"name\":\"MySQL Connections\",\"description\":\"**Max Connections** \\n\\nMax Connections is the maximum permitted number of simultaneous client connections. By default, this is 151. Increasing this value increases the number of file descriptors that mysqld requires. If the required number of descriptors are not available, the server reduces the value of Max Connections.\\n\\nmysqld actually permits Max Connections + 1 clients to connect. The extra connection is reserved for use by accounts that have the SUPER privilege, such as root.\\n\\nMax Used Connections is the maximum number of connections that have been in use simultaneously since the server started.\\n\\nConnections is the number of connection attempts (successful or not) to the MySQL server.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(mysql_global_status_threads_connected{instance=~\\\"$instance\\\"})\",\"legend\":\"Threads Connected\"},{\"expr\":\"sum(mysql_global_status_threads_running{instance=~\\\"$instance\\\"})\",\"legend\":\"Threads Running\"}],\"name\":\"MySQL Client Thread Activity\",\"description\":\"Threads Connected is the number of open connections, while Threads Running is the number of threads not sleeping.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Query Performance",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_created_tmp_tables{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Tables\"},{\"expr\":\"sum(rate(mysql_global_status_created_tmp_disk_tables{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Disk Tables\"},{\"expr\":\"sum(rate(mysql_global_status_created_tmp_files{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Files\"}],\"name\":\"MySQL Temporary Objects\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.64,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_select_full_join{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Full Join\"},{\"expr\":\"sum(rate(mysql_global_status_select_full_range_join{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Full Range Join\"},{\"expr\":\"sum(rate(mysql_global_status_select_range{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Range\"},{\"expr\":\"sum(rate(mysql_global_status_select_range_check{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Range Check\"},{\"expr\":\"sum(rate(mysql_global_status_select_scan{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Scan\"}],\"name\":\"MySQL Select Types\",\"description\":\"**MySQL Select Types**\\n\\nAs with most relational databases, selecting based on indexes is more efficient than scanning an entire table's data. Here we see the counters for selects not done with indexes.\\n\\n* ***Select Scan*** is how many queries caused full table scans, in which all the data in the table had to be read and either discarded or returned.\\n* ***Select Range*** is how many queries used a range scan, which means MySQL scanned all rows in a given range.\\n* ***Select Full Join*** is the number of joins that are not joined on an index, this is usually a huge performance hit.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"stack\":\"off\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.41},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_sort_rows{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Rows\"},{\"expr\":\"sum(rate(mysql_global_status_sort_range{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Range\"},{\"expr\":\"sum(rate(mysql_global_status_sort_merge_passes{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Merge Passes\"},{\"expr\":\"sum(rate(mysql_global_status_sort_scan{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Scan\"}],\"name\":\"MySQL Sorts\",\"description\":\"**MySQL Sorts**\\n\\nDue to a query's structure, order, or other requirements, MySQL sorts the rows before returning them. For example, if a table is ordered 1 to 10 but you want the results reversed, MySQL then has to sort the rows to return 10 to 1.\\n\\nThis graph also shows when sorts had to scan a whole table or a given range of a table in order to return the results and which could not have been sorted via an index.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_slow_queries{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Slow Queries\"}],\"name\":\"MySQL Slow Queries\",\"description\":\"**MySQL Slow Queries**\\n\\nSlow queries are defined as queries being slower than the long_query_time setting. For example, if you have long_query_time set to 3, all queries that take longer than 3 seconds to complete will show on this graph.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"bars\",\"stack\":\"off\",\"fillOpacity\":0.81},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Network",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_bytes_received{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Inbound\"},{\"expr\":\"sum(rate(mysql_global_status_bytes_sent{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Outbound\"}],\"name\":\"MySQL Network Traffic\",\"description\":\"**MySQL Network Traffic**\\n\\nHere we can see how much network traffic is generated by MySQL. Outbound is network traffic sent from MySQL and Inbound is network traffic MySQL has received.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesSI\",\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Commands, Handlers",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"topk(10, rate(mysql_global_status_commands_total{instance=~\\\"$instance\\\"}[5m])>0)\",\"legend\":\"Com_{{command}}\"}],\"name\":\"Top Command Counters\",\"description\":\"**Top Command Counters**\\n\\nThe Com_{{xxx}} statement counter variables indicate the number of times each xxx statement has been executed. There is one status variable for each type of statement. For example, Com_delete and Com_update count [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements, respectively. Com_delete_multi and Com_update_multi are similar but apply to [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements that use multiple-table syntax.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.2,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_handlers_total{instance=~\\\"$instance\\\", handler!~\\\"commit|rollback|savepoint.*|prepare\\\"}[5m])\",\"legend\":\"{{handler}}\"}],\"name\":\"MySQL Handlers\",\"description\":\"**MySQL Handlers**\\n\\nHandler statistics are internal statistics on how MySQL is selecting, updating, inserting, and modifying rows, tables, and indexes.\\n\\nThis is in fact the layer between the Storage Engine and MySQL.\\n\\n* `read_rnd_next` is incremented when the server performs a full table scan and this is a counter you don't really want to see with a high value.\\n* `read_key` is incremented when a read is done with an index.\\n* `read_next` is incremented when the storage engine is asked to 'read the next index entry'. A high value means a lot of index scans are being done.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":3},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_handlers_total{instance=~\\\"$instance\\\", handler=~\\\"commit|rollback|savepoint.*|prepare\\\"}[5m])\",\"legend\":\"{{handler}}\"}],\"name\":\"MySQL Transaction Handlers\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Open Files",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"mysql_global_variables_open_files_limit{instance=~\\\"$instance\\\"}\",\"legend\":\"Open Files Limit\"},{\"expr\":\"mysql_global_status_open_files{instance=~\\\"$instance\\\"}\",\"legend\":\"Open Files\"}],\"name\":\"MySQL Open Files\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Table Openings",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_table_open_cache_hits{instance=~\\\"$instance\\\"}[5m])\\n/\\n(\\nrate(mysql_global_status_table_open_cache_hits{instance=~\\\"$instance\\\"}[5m])\\n+\\nrate(mysql_global_status_table_open_cache_misses{instance=~\\\"$instance\\\"}[5m])\\n)\",\"legend\":\"Table Open Cache Hit Ratio\"}],\"name\":\"Table Open Cache Hit Ratio Mysql 5.6.6+\",\"description\":\"**MySQL Table Open Cache Status**\\n\\nThe recommendation is to set the `table_open_cache_instances` to a loose correlation to virtual CPUs, keeping in mind that more instances means the cache is split more times. If you have a cache set to 500 but it has 10 instances, each cache will only have 50 cached.\\n\\nThe `table_definition_cache` and `table_open_cache` can be left as default as they are auto-sized MySQL 5.6 and above (ie: do not set them to any value).\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percentUnit\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"mysql_global_status_open_tables{instance=~\\\"$instance\\\"}\",\"legend\":\"Open Tables\"},{\"expr\":\"mysql_global_variables_table_open_cache{instance=~\\\"$instance\\\"}\",\"legend\":\"Table Open Cache\"}],\"name\":\"MySQL Open Tables\",\"description\":\"**MySQL Open Tables**\\n\\nThe recommendation is to set the `table_open_cache_instances` to a loose correlation to virtual CPUs, keeping in mind that more instances means the cache is split more times. If you have a cache set to 500 but it has 10 instances, each cache will only have 50 cached.\\n\\nThe `table_definition_cache` and `table_open_cache` can be left as default as they are auto-sized MySQL 5.6 and above (ie: do not set them to any value).\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,19 @@
[
{
"name": "TCP探测",
"tags": "",
"configs": "",
"chart_groups": [
{
"name": "Default chart group",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"max(net_response_result_code) by (target)\",\"legend\":\"UP?\"},{\"expr\":\"max(net_response_response_time) by (target)\",\"refId\":\"C\",\"legend\":\"latency(s)\"}],\"name\":\"Targets\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelValuesToRows\",\"aggrDimension\":\"target\"},\"options\":{\"valueMappings\":[],\"standardOptions\":{}},\"overrides\":[{\"properties\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"UP\",\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"special\":1,\"from\":1},\"result\":{\"text\":\"DOWN\",\"color\":\"#e90f0f\"}}],\"standardOptions\":{}},\"matcher\":{\"value\":\"A\"}}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":4,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,127 @@
[
{
"name": "Oracle - 模板",
"tags": "Telegraf",
"configs": "{\"var\":[{\"name\":\"ident\",\"definition\":\"label_values(oracle_up,ident)\",\"options\":[\"tt-fc-log00.nj\"]},{\"name\":\"instance\",\"definition\":\"label_values(oracle_up{ident=\\\"$ident\\\"},instance)\"}]}",
"chart_groups": [
{
"name": "Activities",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(oracle_activity_execute_count_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}[2m])\"}],\"name\":\"execute count / second\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"decimals\":1}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(oracle_activity_user_commits_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}[2m])\"}],\"name\":\"user commits / second\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(oracle_activity_user_rollbacks_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}[2m])\"}],\"name\":\"user rollbacks / second\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_up{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"status\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":1},\"result\":{\"text\":\"UP\",\"color\":\"#5ea70f\"}},{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"DOWN\",\"color\":\"#f60f0f\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Waits",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_wait_time_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"legend\":\"{{wait_class}}\"}],\"name\":\"Time waited\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Tablespace",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_tablespace_bytes{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}/oracle_tablespace_max_bytes{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"legend\":\"{{tablespace}}\"}],\"name\":\"Used Percent\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percentUnit\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_tablespace_free{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"legend\":\"{{tablespace}}\"}],\"name\":\"Free space\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "IO and TPS",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_io_requests_per_second_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"IO Requests / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_user_transaction_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"TPS\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_io_megabytes_per_second_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}*1024*1024\"}],\"name\":\"IO Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Connections",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sessions_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\",status=\\\"ACTIVE\\\"}\",\"legend\":\"\"}],\"name\":\"Sessions\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Hit Ratio",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_buffer_cache_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Buffer Cache Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_redo_allocation_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Redo Allocation Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_row_cache_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Row Cache Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_library_cache_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Library Cache Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Physical Read Write",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_bytes_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_Bytes_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical Read Write Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_total_bytes_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_Total_Bytes_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical Read Write Total Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_io_requests_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_IO_Requests_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical RW IO Requests / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_total_io_requests_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_Total_IO_Requests_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical RW Total IO Requests / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,19 @@
[
{
"name": "PING探测",
"tags": "",
"configs": "",
"chart_groups": [
{
"name": "Default chart group",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"max(ping_result_code) by (target)\",\"legend\":\"UP?\"},{\"expr\":\"max(ping_percent_packet_loss) by (target)\",\"refId\":\"B\",\"legend\":\"Packet Loss %\"},{\"expr\":\"max(httpresponse_response_time) by (target)\",\"refId\":\"C\",\"legend\":\"latency(s)\"}],\"name\":\"Ping\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelValuesToRows\",\"aggrDimension\":\"target\"},\"options\":{\"valueMappings\":[],\"standardOptions\":{}},\"overrides\":[{\"properties\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"UP\",\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"special\":1,\"from\":1},\"result\":{\"text\":\"DOWN\",\"color\":\"#e90f0f\"}}],\"standardOptions\":{}},\"matcher\":{\"value\":\"A\"}}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":4,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,221 @@
[
{
"name": "RabbitMQ 3.8+",
"tags": "",
"configs": "{\"var\":[{\"definition\":\"label_values(rabbitmq_identity_info, rabbitmq_cluster)\",\"name\":\"rabbitmq_cluster\"}]}",
"chart_groups": [
{
"name": "Overview",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_ready * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Ready messages\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10000},\"result\":{\"color\":\"#4a90e2\"}},{\"type\":\"range\",\"match\":{\"from\":100000},\"result\":{\"color\":\"#f50a0a\"}},{\"type\":\"range\",\"match\":{\"to\":9999},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":7,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_published_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Incoming messages / s\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":50},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":5,\"x\":7,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_channels * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) - sum(rabbitmq_channel_consumers * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Publishers\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_connections * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Connections\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":16,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queues * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Queues\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":20,\"y\":0,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_unacked * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Unacknowledged messages\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":99},\"result\":{\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"from\":100},\"result\":{\"color\":\"#4a90e2\"}},{\"type\":\"range\",\"match\":{\"from\":500},\"result\":{\"color\":\"#d0021b\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":7,\"x\":0,\"y\":1,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_redelivered_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_messages_delivered_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_messages_delivered_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_get_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_get_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Outgoing messages / s\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":50},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":5,\"x\":7,\"y\":1,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_channel_consumers * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Consumers\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":12,\"y\":1,\"i\":\"7\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_channels * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Channels\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":16,\"y\":1,\"i\":\"8\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_build_info * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Nodes\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":null,\"from\":3},\"result\":{\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"from\":8},\"result\":{\"color\":\"#e70909\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":20,\"y\":1,\"i\":\"9\"}}",
"weight": 0
}
]
},
{
"name": "Nodes",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_build_info * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"nodes\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelsOfSeriesToRows\",\"columns\":[\"rabbitmq_cluster\",\"rabbitmq_node\",\"rabbitmq_version\",\"erlang_version\"]},\"options\":{\"standardOptions\":{}},\"overrides\":[{}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":1,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"(rabbitmq_resident_memory_limit_bytes * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) -\\n(rabbitmq_process_resident_memory_bytes * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Memory available before publishers blocked\",\"description\":\"If the value is zero or less, the memory alarm will be triggered and all publishing connections across all cluster nodes will be blocked.\\n\\nThis value can temporarily go negative because the memory alarm is triggered with a slight delay.\\n\\nThe kernel's view of the amount of memory used by the node can differ from what the node itself can observe. This means that this value can be negative for a sustained period of time.\\n\\nBy default nodes use resident set size (RSS) to compute how much memory they use. This strategy can be changed (see the guides below).\\n\\n* [Alarms](https://www.rabbitmq.com/alarms.html)\\n* [Memory Alarms](https://www.rabbitmq.com/memory.html)\\n* [Reasoning About Memory Use](https://www.rabbitmq.com/memory-use.html)\\n* [Blocked Connection Notifications](https://www.rabbitmq.com/connection-blocked.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":1,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_disk_space_available_bytes * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Disk space available before publishers blocked\",\"description\":\"This metric is reported for the partition where the RabbitMQ data directory is stored.\\n\\nIf the value is zero or less, the disk alarm will be triggered and all publishing connections across all cluster nodes will be blocked.\\n\\nThis value can temporarily go negative because the free disk space alarm is triggered with a slight delay.\\n\\n* [Alarms](https://www.rabbitmq.com/alarms.html)\\n* [Disk Space Alarms](https://www.rabbitmq.com/disk-alarms.html)\\n* [Disk Space](https://www.rabbitmq.com/production-checklist.html#resource-limits-disk-space)\\n* [Persistence Configuration](https://www.rabbitmq.com/persistence-conf.html)\\n* [Blocked Connection Notifications](https://www.rabbitmq.com/connection-blocked.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":1,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"(rabbitmq_process_max_fds * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) -\\n(rabbitmq_process_open_fds * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"File descriptors available\",\"description\":\"When this value reaches zero, new connections will not be accepted and disk write operations may fail.\\n\\nClient libraries, peer nodes and CLI tools will not be able to connect when the node runs out of available file descriptors.\\n\\n* [Open File Handles Limit](https://www.rabbitmq.com/production-checklist.html#resource-limits-file-handle-limit)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":8,\"x\":16,\"y\":1,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"(rabbitmq_process_max_tcp_sockets * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) -\\n(rabbitmq_process_open_tcp_sockets * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"TCP sockets available\",\"description\":\"When this value reaches zero, new connections will not be accepted.\\n\\nClient libraries, peer nodes and CLI tools will not be able to connect when the node runs out of available file descriptors.\\n\\n* [Networking and RabbitMQ](https://www.rabbitmq.com/networking.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":8,\"x\":16,\"y\":2,\"i\":\"4\"}}",
"weight": 0
}
]
},
{
"name": "QUEUED MESSAGES",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_ready * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages ready to be delivered to consumers\",\"description\":\"Total number of ready messages ready to be delivered to consumers.\\n\\nAim to keep this value as low as possible. RabbitMQ behaves best when messages are flowing through it. It's OK for publishers to occasionally outpace consumers, but the expectation is that consumers will eventually process all ready messages.\\n\\nIf this metric keeps increasing, your system will eventually run out of memory and/or disk space. Consider using TTL or Queue Length Limit to prevent unbounded message growth.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\\n* [Queue Length Limit](https://www.rabbitmq.com/maxlength.html)\\n* [Time-To-Live and Expiration](https://www.rabbitmq.com/ttl.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_unacked * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages pending consumer acknowledgement\",\"description\":\"The total number of messages that are either in-flight to consumers, currently being processed by consumers or simply waiting for the consumer acknowledgements to be processed by the queue. Until the queue processes the message acknowledgement, the message will remain unacknowledged.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\\n* [Confirms and Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [Consumer Prefetch](https://www.rabbitmq.com/consumer-prefetch.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "INCOMING MESSAGES",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_published_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages published / s\",\"description\":\"The incoming message rate before any routing rules are applied.\\n\\nIf this value is lower than the number of messages published to queues, it may indicate that some messages are delivered to more than one queue.\\n\\nIf this value is higher than the number of messages published to queues, messages cannot be routed and will either be dropped or returned to publishers.\\n\\n* [Publishers](https://www.rabbitmq.com/publishers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_confirmed_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages confirmed to publishers / s\",\"description\":\"The rate of messages confirmed by the broker to publishers. Publishers must opt-in to receive message confirmations.\\n\\nIf this metric is consistently at zero it may suggest that publisher confirms are not used by clients. The safety of published messages is likely to be at risk.\\n\\n* [Publisher Confirms](https://www.rabbitmq.com/confirms.html#publisher-confirms)\\n* [Publisher Confirms and Data Safety](https://www.rabbitmq.com/publishers.html#data-safety)\\n* [When Will Published Messages Be Confirmed by the Broker?](https://www.rabbitmq.com/confirms.html#when-publishes-are-confirmed)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queue_messages_published_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages routed to queues / s\",\"description\":\"The rate of messages received from publishers and successfully routed to the master queue replicas.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\\n* [Publishers](https://www.rabbitmq.com/publishers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":1,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_unconfirmed[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages unconfirmed to publishers / s\",\"description\":\"The rate of messages received from publishers that have publisher confirms enabled and the broker has not confirmed yet.\\n\\n* [Publishers](https://www.rabbitmq.com/publishers.html)\\n* [Confirms and Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [When Will Published Messages Be Confirmed by the Broker?](https://www.rabbitmq.com/confirms.html#when-publishes-are-confirmed)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":1,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_unroutable_dropped_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Unroutable messages dropped / s\",\"description\":\"The rate of messages that cannot be routed and are dropped. \\n\\nAny value above zero means message loss and likely suggests a routing problem on the publisher end.\\n\\n* [Unroutable Message Handling](https://www.rabbitmq.com/publishers.html#unroutable)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_unroutable_returned_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Unroutable messages returned to publishers / s\",\"description\":\"The rate of messages that cannot be routed and are returned back to publishers.\\n\\nSustained values above zero may indicate a routing problem on the publisher end.\\n\\n* [Unroutable Message Handling](https://www.rabbitmq.com/publishers.html#unroutable)\\n* [When Will Published Messages Be Confirmed by the Broker?](https://www.rabbitmq.com/confirms.html#when-publishes-are-confirmed)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":2,\"i\":\"5\"}}",
"weight": 0
}
]
},
{
"name": "OUTGOING MESSAGES",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(\\n (rate(rabbitmq_channel_messages_delivered_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\n (rate(rabbitmq_channel_messages_delivered_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\\n) by(rabbitmq_node)\"}],\"name\":\"Messages delivered / s\",\"description\":\"The rate of messages delivered to consumers. It includes messages that have been redelivered.\\n\\nThis metric does not include messages that have been fetched by consumers using `basic.get` (consumed by polling).\\n\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_redelivered_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages redelivered / s\",\"description\":\"The rate of messages that have been redelivered to consumers. It includes messages that have been requeued automatically and redelivered due to channel exceptions or connection closures.\\n\\nHaving some redeliveries is expected, but if this metric is consistently non-zero, it is worth investigating why.\\n\\n* [Negative Acknowledgement and Requeuing of Deliveries](https://www.rabbitmq.com/confirms.html#consumer-nacks-requeue)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_delivered_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages delivered with manual ack / s\",\"description\":\"The rate of message deliveries to consumers that use manual acknowledgement mode.\\n\\nWhen this mode is used, RabbitMQ waits for consumers to acknowledge messages before more messages can be delivered.\\n\\nThis is the safest way of consuming messages.\\n\\n* [Consumer Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [Consumer Prefetch](https://www.rabbitmq.com/consumer-prefetch.html)\\n* [Consumer Acknowledgement Modes, Prefetch and Throughput](https://www.rabbitmq.com/confirms.html#channel-qos-prefetch-throughput)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":1,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_delivered_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages delivered auto ack / s\",\"description\":\"The rate of message deliveries to consumers that use automatic acknowledgement mode.\\n\\nWhen this mode is used, RabbitMQ does not wait for consumers to acknowledge message deliveries.\\n\\nThis mode is fire-and-forget and does not offer any delivery safety guarantees. It tends to provide higher throughput and it may lead to consumer overload and higher consumer memory usage.\\n\\n* [Consumer Acknowledgement Modes, Prefetch and Throughput](https://www.rabbitmq.com/confirms.html#channel-qos-prefetch-throughput)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":1,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_acked_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages acknowledged / s\",\"description\":\"The rate of message acknowledgements coming from consumers that use manual acknowledgement mode.\\n\\n* [Consumer Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [Consumer Prefetch](https://www.rabbitmq.com/consumer-prefetch.html)\\n* [Consumer Acknowledgement Modes, Prefetch and Throughput](https://www.rabbitmq.com/confirms.html#channel-qos-prefetch-throughput)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_get_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Polling operations with auto ack / s\",\"description\":\"The rate of messages delivered to polling consumers that use automatic acknowledgement mode.\\n\\nThe use of polling consumers is highly inefficient and therefore strongly discouraged.\\n\\n* [Fetching individual messages](https://www.rabbitmq.com/consumers.html#fetching)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":2,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_get_empty_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Polling operations that yield no result / s\",\"description\":\"The rate of polling consumer operations that yield no result.\\n\\nAny value above zero means that RabbitMQ resources are wasted by polling consumers.\\n\\nCompare this metric to the other polling consumer metrics to see the inefficiency rate.\\n\\nThe use of polling consumers is highly inefficient and therefore strongly discouraged.\\n\\n* [Fetching individual messages](https://www.rabbitmq.com/consumers.html#fetching)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":3,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_get_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Polling operations with manual ack / s\",\"description\":\"The rate of messages delivered to polling consumers that use manual acknowledgement mode.\\n\\nThe use of polling consumers is highly inefficient and therefore strongly discouraged.\\n\\n* [Fetching individual messages](https://www.rabbitmq.com/consumers.html#fetching)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":3,\"i\":\"7\"}}",
"weight": 0
}
]
},
{
"name": "QUEUES",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_queues * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Total queues\",\"description\":\"Total number of queue masters per node. \\n\\nThis metric makes it easy to see sub-optimal queue distribution in a cluster.\\n\\n* [Queue Masters, Data Locality](https://www.rabbitmq.com/ha.html#master-migration-data-locality)\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queues_declared_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Queues declared / s\",\"description\":\"The rate of queue declarations performed by clients.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of queue churn or high rates of connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":4,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queues_created_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Queues created / s\",\"description\":\"The rate of new queues created (as opposed to redeclarations).\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of queue churn or high rates of connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":4,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queues_deleted_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Queues deleted / s\",\"description\":\"The rate of queues deleted.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of queue churn or high rates of connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":4,\"x\":20,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "CHANNELS",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_channels * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Total channels\",\"description\":\"Total number of channels on all currently opened connections.\\n\\nIf this metric grows monotonically it is highly likely a channel leak in one of the applications. Confirm channel leaks by using the _Channels opened_ and _Channels closed_ metrics.\\n\\n* [Channel Leak](https://www.rabbitmq.com/channels.html#channel-leaks)\\n* [Channels](https://www.rabbitmq.com/channels.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channels_opened_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Channels opened / s\",\"description\":\"The rate of new channels opened by applications across all connections. Channels are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of channel churn or mass connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [High Channel Churn](https://www.rabbitmq.com/channels.html#high-channel-churn)\\n* [Channels](https://www.rabbitmq.com/channels.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channels_closed_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Channels closed / s\",\"description\":\"The rate of channels closed by applications across all connections. Channels are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of channel churn or mass connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [High Channel Churn](https://www.rabbitmq.com/channels.html#high-channel-churn)\\n* [Channels](https://www.rabbitmq.com/channels.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "CONNECTIONS",
"weight": 7,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_connections_closed_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Connections closed / s\",\"description\":\"The rate of connections closed. Connections are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of connection churn or mass connection recovery.\\n\\n* [Connections](https://www.rabbitmq.com/connections.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_connections * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Total connections\",\"description\":\"Total number of client connections.\\n\\nIf this metric grows monotonically it is highly likely a connection leak in one of the applications. Confirm connection leaks by using the _Connections opened_ and _Connections closed_ metrics.\\n\\n* [Connection Leak](https://www.rabbitmq.com/connections.html#monitoring)\\n* [Connections](https://www.rabbitmq.com/connections.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_connections_opened_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Connections opened / s\",\"description\":\"The rate of new connections opened by clients. Connections are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of connection churn or mass connection recovery.\\n\\n* [Connection Leak](https://www.rabbitmq.com/connections.html#monitoring)\\n* [Connections](https://www.rabbitmq.com/connections.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,77 @@
[
{
"name": "Redis Overview - 模板",
"tags": "Redis Prometheus",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(redis_uptime_in_seconds,instance)\",\"selected\":\"10.206.0.16:6379\"}]}",
"chart_groups": [
{
"name": "Basic Info",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"min(redis_uptime_in_seconds{instance=~\\\"$instance\\\"})\"}],\"name\":\"Redis Uptime\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"humantimeSeconds\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(redis_connected_clients{instance=~\\\"$instance\\\"})\"}],\"name\":\"Connected Clients\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"redis_used_memory{instance=~\\\"$instance\\\"}\"}],\"name\":\"Memory Used\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":128000000},\"result\":{\"color\":\"#079e05\"}},{\"type\":\"range\",\"match\":{\"from\":128000000},\"result\":{\"color\":\"#f10909\"}}],\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":0}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"redis_maxmemory{instance=~\\\"$instance\\\"}\"}],\"name\":\"Max Memory Limit\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"bytesIEC\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Commands",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"rate(redis_total_commands_processed{instance=~\\\"$instance\\\"}[5m])\"}],\"name\":\"Commands Executed / sec\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"irate(redis_keyspace_hits{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"hits\"},{\"expr\":\"irate(redis_keyspace_misses{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"misses\"}],\"name\":\"Hits / Misses per Sec\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"noraml\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"topk(5, irate(redis_cmdstat_calls{instance=~\\\"$instance\\\"} [1m]))\",\"legend\":\"{{command}}\"}],\"name\":\"Top Commands\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Keys",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum (redis_keyspace_keys{instance=~\\\"$instance\\\"}) by (db)\",\"legend\":\"{{db}}\"}],\"name\":\"Total Items per DB\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(redis_expired_keys{instance=~\\\"$instance\\\"}[5m])) by (instance)\",\"legend\":\"expired\"},{\"expr\":\"sum(rate(redis_evicted_keys{instance=~\\\"$instance\\\"}[5m])) by (instance)\",\"legend\":\"evicted\"}],\"name\":\"Expired / Evicted\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(redis_keyspace_keys{instance=~\\\"$instance\\\"}) - sum(redis_keyspace_expires{instance=~\\\"$instance\\\"}) \",\"legend\":\"not expiring\"},{\"expr\":\"sum(redis_keyspace_expires{instance=~\\\"$instance\\\"}) \",\"legend\":\"expiring\"}],\"name\":\"Expiring vs Not-Expiring Keys\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"noraml\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Network",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(redis_total_net_input_bytes{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"input\"},{\"expr\":\"sum(rate(redis_total_net_output_bytes{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"output\"}],\"name\":\"Network I/O\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,63 @@
[
{
"name": "Tomcat - 模板",
"tags": "Categraf",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(tomcat_up, instance)\"}],\"links\":[{\"title\":\"n9e\",\"url\":\"https://n9e.gitee.io/\",\"targetBlank\":true}]}",
"chart_groups": [
{
"name": "connector",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(tomcat_connector_bytes_sent{instance=\\\"$instance\\\"}[1m])\",\"legend\":\"sent\"},{\"expr\":\"rate(tomcat_connector_bytes_received{instance=\\\"$instance\\\"}[1m])\",\"refId\":\"B\",\"legend\":\"received\"}],\"name\":\"Traffic Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(tomcat_connector_request_count{instance=\\\"$instance\\\"}[1m])\",\"legend\":\"tomcat_connector_request_count\"},{\"expr\":\"rate(tomcat_connector_error_count{instance=\\\"$instance\\\"}[1m])\",\"refId\":\"B\",\"legend\":\"tomcat_connector_error_count\"}],\"name\":\"Request count / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_connector_max_threads{instance=\\\"$instance\\\"}\",\"legend\":\"max_threads\"},{\"expr\":\"tomcat_connector_current_thread_count{instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"current_thread_count\"},{\"expr\":\"tomcat_connector_current_threads_busy{instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"current_threads_busy\"}],\"name\":\"Tread\",\"description\":\"max_threads: The maximum number of allowed worker threads.\\ncurrent_thread_count: The number of threads managed by the thread pool\\ncurrent_threads_busy: The number of threads that are in use\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(tomcat_connector_processing_time{instance=\\\"$instance\\\"}[1m])\",\"legend\":\"{{name}}-processing_time\"}],\"name\":\"Processing time\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "mem used",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memory_max{instance=\\\"$instance\\\"}\",\"legend\":\"max\"},{\"expr\":\"tomcat_jvm_memory_total{instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"used\"},{\"expr\":\"tomcat_jvm_memory_free{instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"free\"}],\"name\":\"Mem Used\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "memorypool",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_used{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Used\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_max{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Max\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_committed{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Committed\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_init{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Init\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -23,6 +23,11 @@ class Sender(object):
def send_feishu(cls, payload):
# already done in go code
pass
@classmethod
def send_mm(cls, payload):
# already done in go code
pass
@classmethod
def send_sms(cls, payload):

View File

@ -9,7 +9,14 @@ ClusterName = "Default"
BusiGroupLabelKey = "busigroup"
# sleep x seconds, then start judge engine
EngineDelay = 120
EngineDelay = 60
DisableUsageReport = false
# config | database
ReaderFrom = "config"
ForceUseServerTS = true
[Log]
# log write dir
@ -68,10 +75,12 @@ InsecureSkipVerify = true
Batch = 5
[Alerting]
# timeout settings, unit: ms, default: 30000ms
Timeout=30000
TemplatesDir = "./etc/template"
NotifyConcurrency = 10
# use builtin go code notify
NotifyBuiltinChannels = ["email", "dingtalk", "wecom", "feishu"]
NotifyBuiltinChannels = ["email", "dingtalk", "wecom", "feishu", "mm"]
[Alerting.CallScript]
# built in sending capability in go code
@ -83,7 +92,8 @@ ScriptPath = "./etc/script/notify.py"
Enable = false
# use a plugin via `go build -buildmode=plugin -o notify.so`
PluginPath = "./etc/script/notify.so"
Caller = "n9eCaller"
# The first letter must be capitalized to be exported
Caller = "N9eCaller"
[Alerting.RedisPub]
Enable = false
@ -101,7 +111,7 @@ Headers = ["Content-Type", "application/json", "X-From", "N9E"]
[NoData]
Metric = "target_up"
# unit: second
Interval = 15
Interval = 120
[Ibex]
# callback: ${ibex}/${tplid}/${host}
@ -136,7 +146,7 @@ MaxIdleConns = 50
# table prefix
TablePrefix = ""
# enable auto migrate or not
EnableAutoMigrate = false
# EnableAutoMigrate = false
[Reader]
# prometheus base url
@ -147,23 +157,16 @@ BasicAuthUser = ""
BasicAuthPass = ""
# timeout settings, unit: ms
Timeout = 30000
DialTimeout = 10000
TLSHandshakeTimeout = 30000
ExpectContinueTimeout = 1000
IdleConnTimeout = 90000
# time duration, unit: ms
KeepAlive = 30000
MaxConnsPerHost = 0
MaxIdleConns = 100
MaxIdleConnsPerHost = 10
DialTimeout = 3000
MaxIdleConnsPerHost = 100
[WriterOpt]
# queue channel count
QueueCount = 100
# queue max size
QueueMaxSize = 10000000
QueueMaxSize = 200000
# once pop samples number from queue
QueuePopSize = 2000
# unit: ms
SleepInterval = 50
[[Writers]]
Url = "http://prometheus:9090/api/v1/write"
@ -172,8 +175,8 @@ BasicAuthUser = ""
# Basic auth password
BasicAuthPass = ""
# timeout settings, unit: ms
Timeout = 30000
DialTimeout = 10000
Timeout = 10000
DialTimeout = 3000
TLSHandshakeTimeout = 30000
ExpectContinueTimeout = 1000
IdleConnTimeout = 90000
@ -182,6 +185,12 @@ KeepAlive = 30000
MaxConnsPerHost = 0
MaxIdleConns = 100
MaxIdleConnsPerHost = 100
# [[Writers.WriteRelabels]]
# Action = "replace"
# SourceLabels = ["__address__"]
# Regex = "([^:]+)(?::\\d+)?"
# Replacement = "$1:80"
# TargetLabel = "__address__"
# [[Writers]]
# Url = "http://m3db:7201/api/v1/prom/remote/write"

View File

@ -4,12 +4,21 @@ RunMode = "release"
# # custom i18n dict config
# I18N = "./etc/i18n.json"
# # custom i18n request header key
# I18NHeaderKey = "X-Language"
# metrics descriptions
MetricsYamlFile = "./etc/metrics.yaml"
BuiltinAlertsDir = "./etc/alerts"
BuiltinDashboardsDir = "./etc/dashboards"
# config | api
ClustersFrom = "config"
# using when ClustersFrom = "api"
ClustersFromAPIs = []
[[NotifyChannels]]
Label = "邮箱"
# do not change Key
@ -30,6 +39,11 @@ Label = "飞书机器人"
# do not change Key
Key = "feishu"
[[NotifyChannels]]
Label = "mm bot"
# do not change Key
Key = "mm"
[[ContactKeys]]
Label = "Wecom Robot Token"
# do not change Key
@ -45,6 +59,11 @@ Label = "Feishu Robot Token"
# do not change Key
Key = "feishu_robot_token"
[[ContactKeys]]
Label = "MatterMost Webhook URL"
# do not change Key
Key = "mm_webhook_url"
[Log]
# log write dir
Dir = "logs"
@ -92,6 +111,13 @@ AccessExpired = 1500
RefreshExpired = 10080
RedisKeyPrefix = "/jwt/"
[ProxyAuth]
# if proxy auth enabled, jwt auth is disabled
Enable = false
# username key in http proxy header
HeaderUserNameKey = "X-User-Name"
DefaultRoles = ["Standard"]
[BasicAuth]
user001 = "ccc26da7b9aba533cbb263a36c07dcc5"
@ -121,6 +147,20 @@ Nickname = "cn"
Phone = "mobile"
Email = "mail"
[OIDC]
Enable = false
RedirectURL = "http://n9e.com/callback"
SsoAddr = "http://sso.example.org"
ClientId = ""
ClientSecret = ""
CoverAttributes = true
DefaultRoles = ["Standard"]
[OIDC.Attributes]
Nickname = "nickname"
Phone = "phone_number"
Email = "email"
[Redis]
# address, ip:port
Address = "redis:6379"
@ -145,7 +185,7 @@ MaxIdleConns = 50
# table prefix
TablePrefix = ""
# enable auto migrate or not
EnableAutoMigrate = false
# EnableAutoMigrate = false
[[Clusters]]
# Prometheus cluster name
@ -158,14 +198,7 @@ BasicAuthUser = ""
BasicAuthPass = ""
# timeout settings, unit: ms
Timeout = 30000
DialTimeout = 10000
TLSHandshakeTimeout = 30000
ExpectContinueTimeout = 1000
IdleConnTimeout = 90000
# time duration, unit: ms
KeepAlive = 30000
MaxConnsPerHost = 0
MaxIdleConns = 100
DialTimeout = 3000
MaxIdleConnsPerHost = 100
[Ibex]
@ -174,4 +207,10 @@ Address = "http://ibex:10090"
BasicAuthUser = "ibex"
BasicAuthPass = "ibex"
# unit: ms
Timeout = 3000
Timeout = 3000
[TargetMetrics]
TargetUp = '''max(max_over_time(target_up{ident=~"(%s)"}[%dm])) by (ident)'''
LoadPerCore = '''max(max_over_time(system_load_norm_1{ident=~"(%s)"}[%dm])) by (ident)'''
MemUtil = '''100-max(max_over_time(mem_available_percent{ident=~"(%s)"}[%dm])) by (ident)'''
DiskUtil = '''max(max_over_time(disk_used_percent{ident=~"(%s)", path="/"}[%dm])) by (ident)'''

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,30 @@
[
{
"name": "HTTP地址探测失败",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "http_response_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -1,58 +1,72 @@
[
{
"name": "数据有丢失风险-同步副本数小于3",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "sum(kafka_topic_partition_in_sync_replica) by (topic) < 3",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "消费能力不足-积压消息数超过50条",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "sum(kafka_topic_partition_current_offset{instance=\"$instance\"}) by (topic) - sum(kafka_consumergroup_current_offset{instance=\"$instance\"}) by (topic) ",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]
{
"name": "数据有丢失风险-副本数小于3",
"note": "",
"prod": "",
"algorithm": "",
"algo_params": null,
"delay": 0,
"severity": 2,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "sum(kafka_topic_partition_in_sync_replica) by (topic) < 3",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"notify_max_number": 0,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"service=kafka"
]
},
{
"name": "消费能力不足-延迟超过5分钟",
"note": "",
"prod": "",
"algorithm": "",
"algo_params": null,
"delay": 0,
"severity": 2,
"disabled": 1,
"prom_for_duration": 60,
"prom_ql": "kafka_consumer_lag_millis / 1000 > 300",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"notify_max_number": 0,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"service=kafka"
]
}
]

View File

@ -0,0 +1,243 @@
[
{
"name": "监控对象失联",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "max_over_time(target_up[130s]) == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "机器负载-CPU较高请关注",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "cpu_usage_idle{cpu=\"cpu-total\"} < 25",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "机器负载-内存较高,请关注",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "mem_available_percent < 25",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "硬盘-IO有点繁忙",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "rate(diskio_io_time[1m])/10 > 99",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "硬盘-预计再有4小时写满",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "predict_linear(disk_free[1h], 4*3600) < 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网卡-入向有丢包",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "increase(net_drop_in[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网卡-出向有丢包",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "increase(net_drop_out[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "网络连接-TME_WAIT数量超过2万",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "netstat_tcp_time_wait > 20000",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -0,0 +1,302 @@
[
{
"name": "MysqlInnodbLogWaits",
"note": "MySQL innodb log writes stalling",
"severity": 2,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "rate(mysql_global_status_innodb_log_waits[15m]) > 10",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlInnodbLogWaits"
]
},
{
"name": "MysqlSlaveIoThreadNotRunning",
"note": "MySQL Slave IO thread not running",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) mysql_slave_status_slave_io_running == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveIoThreadNotRunning"
]
},
{
"name": "MysqlSlaveReplicationLag",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) (mysql_slave_status_seconds_behind_master - mysql_slave_status_sql_delay) > 30",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveReplicationLag"
]
},
{
"name": "MysqlSlaveSqlThreadNotRunning",
"note": "MySQL Slave SQL thread not running",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_slave_status_master_server_id > 0 and ON (instance) mysql_slave_status_slave_sql_running == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlaveSqlThreadNotRunning"
]
},
{
"name": "Mysql刚刚有重启请注意",
"note": "MySQL has just been restarted, less than one minute ago",
"severity": 3,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_global_status_uptime < 60",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlRestarted"
]
},
{
"name": "Mysql实例挂了",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "mysql_up == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlDown"
]
},
{
"name": "Mysql打开了很多文件句柄请注意",
"note": "More than 80% of MySQL files open",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_open_files) / avg by (instance)(mysql_global_variables_open_files_limit) * 100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlHighOpenFiles"
]
},
{
"name": "Mysql最近一分钟有慢查询出现",
"note": "MySQL server mysql has some new slow query",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "increase(mysql_global_status_slow_queries[1m]) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlSlowQueries"
]
},
{
"name": "Mysql有超过60%的连接是running状态",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_threads_running) / avg by (instance) (mysql_global_variables_max_connections) * 100 > 60",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlHighThreadsRunning"
]
},
{
"name": "Mysql连接数已超过80%",
"note": "More than 80% of MySQL connections are in use",
"severity": 2,
"disabled": 0,
"prom_for_duration": 120,
"prom_ql": "avg by (instance) (mysql_global_status_threads_connected) / avg by (instance) (mysql_global_variables_max_connections) * 100 > 80",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=MysqlTooManyConnections"
]
}
]

View File

@ -0,0 +1,30 @@
[
{
"name": "网络地址探活失败",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "net_response_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -0,0 +1,30 @@
[
{
"name": "NTP时间偏移太大",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "ntp_offset_ms > 1000 or ntp_offset_ms < -1000",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -0,0 +1,30 @@
[
{
"name": "PING地址探测失败",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "ping_result_code != 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -0,0 +1,62 @@
[
{
"name": "进程监控-有进程数为0某进程可能挂了",
"note": "",
"severity": 1,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "procstat_lookup_count == 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
},
{
"name": "进程监控-进程句柄限制过小",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "procstat_rlimit_num_fds_soft < 2048",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"notify_recovered": 1,
"notify_channels": [
"email",
"dingtalk",
"wecom"
],
"notify_repeat_step": 60,
"callbacks": [],
"runbook_url": "",
"append_tags": []
}
]

View File

@ -0,0 +1,182 @@
[
{
"name": "Redis Ping 延迟高大于100毫秒",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "redis_ping_use_seconds > 0.1",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=HighPingLatency"
]
},
{
"name": "Redis内存使用率较高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "redis_maxmemory > 0 and (redis_used_memory / redis_maxmemory) > 0.85",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisHighMemoryUsage"
]
},
{
"name": "Redis出现拒绝连接",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "(rate(redis_rejected_connections[5m])) > 0",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisRejectedConnHigh"
]
},
{
"name": "Redis刚刚有重启请注意",
"note": "",
"severity": 3,
"disabled": 0,
"prom_for_duration": 0,
"prom_ql": "redis_uptime_in_seconds < 600",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisLowUptime"
]
},
{
"name": "Redis较低的命中率",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "rate(redis_keyspace_hits[5m])\n/\n(rate(redis_keyspace_misses[5m]) + rate(redis_keyspace_hits[5m]))\n< 0.9",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisLowHitRatio"
]
},
{
"name": "Redis驱逐率较高",
"note": "",
"severity": 2,
"disabled": 0,
"prom_for_duration": 60,
"prom_ql": "(sum(rate(redis_evicted_keys[5m])) / sum(redis_keyspace_keys)) > 0.1",
"prom_eval_interval": 15,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": [
"alertname=RedisHighKeysEvictionRatio"
]
}
]

View File

@ -0,0 +1,19 @@
[
{
"name": "HTTP探测",
"tags": "",
"configs": "",
"chart_groups": [
{
"name": "Default chart group",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"max(http_response_result_code) by (target)\",\"legend\":\"UP?\"},{\"expr\":\"max(http_response_response_code) by (target)\",\"refId\":\"B\",\"legend\":\"status code\"},{\"expr\":\"max(http_response_response_time) by (target)\",\"refId\":\"C\",\"legend\":\"latency(s)\"},{\"expr\":\"max(http_response_cert_expire_timestamp) by (target) - time()\",\"refId\":\"D\",\"legend\":\"cert expire\"}],\"name\":\"URL Details\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelValuesToRows\",\"aggrDimension\":\"target\"},\"options\":{\"valueMappings\":[],\"standardOptions\":{}},\"overrides\":[{\"properties\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"UP\",\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"special\":1,\"from\":1},\"result\":{\"text\":\"DOWN\",\"color\":\"#e90f0f\"}}],\"standardOptions\":{}},\"matcher\":{\"value\":\"A\"}},{\"type\":\"special\",\"matcher\":{\"value\":\"D\"},\"properties\":{\"standardOptions\":{\"util\":\"humantimeSeconds\"}}}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":4,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -1,63 +1,362 @@
[
{
"name": "Kafka - 模板",
"tags": "Kafka Prometheus ",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(kafka_brokers, instance)\"},{\"name\":\"job\",\"definition\":\"label_values(kafka_brokers, job)\"}]}",
"chart_groups": [
{
"name": "overview",
"weight": 0,
"charts": [
{
"name": "Kafka - 模板",
"tags": "Kafka Prometheus",
"configs": {
"var": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"count(count by (topic) (kafka_topic_partitions))\"}],\"name\":\"topics\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":50}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":8,\"x\":8,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"kafka_brokers\"}],\"name\":\"brokers\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":50}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":8,\"x\":0,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(kafka_topic_partitions)\"}],\"name\":\"partitions\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{\"value\":50}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
"name": "cluster",
"definition": "label_values(kafka_brokers, cluster)",
"type": "query"
}
]
},
{
"name": "throughput",
"weight": 1,
"charts": [
],
"version": "2.0.0",
"panels": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(kafka_topic_partition_current_offset{instance=\\\"$instance\\\"}[1m])) by (topic)\"}],\"name\":\"Message in per second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
"id": "51502c3a-dd6f-41c7-b8f1-87b88826c96e",
"type": "row",
"name": "overview",
"layout": {
"h": 1,
"w": 24,
"x": 0,
"y": 0,
"i": "51502c3a-dd6f-41c7-b8f1-87b88826c96e",
"isResizable": false
},
"collapsed": true
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(kafka_consumer_lag_millis{instance=\\\"$instance\\\"}) by (consumergroup, topic) \",\"legend\":\"{{consumergroup}} (topic: {{topic}})\"}],\"name\":\"Latency by Consumer Group\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"humantimeMilliseconds\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
"targets": [
{
"refId": "A",
"expr": "kafka_brokers{cluster=\"$cluster\"}"
}
],
"name": "brokers",
"custom": {
"textMode": "value",
"colorMode": "value",
"calc": "lastNotNull",
"colSpan": 1,
"textSize": {
"value": 50
}
},
"options": {
"standardOptions": {}
},
"version": "2.0.0",
"type": "stat",
"layout": {
"h": 3,
"w": 6,
"x": 0,
"y": 1,
"i": "e2c1d271-ec43-4821-aa19-451e856af755",
"isResizable": true
},
"id": "e2c1d271-ec43-4821-aa19-451e856af755"
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(kafka_consumergroup_current_offset{instance=\\\"$instance\\\"}[1m])) by (topic)\"}],\"name\":\"Message consume per second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
"targets": [
{
"refId": "A",
"expr": "count(count by (topic) (kafka_topic_partitions{cluster=\"$cluster\"}))"
}
],
"name": "topics",
"custom": {
"textMode": "value",
"colorMode": "value",
"calc": "lastNotNull",
"colSpan": 1,
"textSize": {
"value": 50
}
},
"options": {
"standardOptions": {}
},
"version": "2.0.0",
"type": "stat",
"layout": {
"h": 3,
"w": 6,
"x": 6,
"y": 1,
"i": "fd3a0b9f-fd67-4360-a94c-869fee7b5b98",
"isResizable": true
},
"id": "fd3a0b9f-fd67-4360-a94c-869fee7b5b98"
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(kafka_topic_partition_current_offset{instance=\\\"$instance\\\"}) by (topic) - sum(kafka_consumergroup_current_offset{instance=\\\"$instance\\\"}) by (topic) \",\"legend\":\"{{consumergroup}} (topic: {{topic}})\"}],\"name\":\"Lag by Consumer Group\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
"targets": [
{
"refId": "A",
"expr": "sum(kafka_topic_partitions{cluster=\"$cluster\"})"
}
],
"name": "partitions",
"custom": {
"textMode": "value",
"colorMode": "value",
"calc": "lastNotNull",
"colSpan": 1,
"textSize": {
"value": 50
}
},
"options": {
"standardOptions": {}
},
"version": "2.0.0",
"type": "stat",
"layout": {
"h": 3,
"w": 6,
"x": 12,
"y": 1,
"i": "e228d857-746b-41b6-8d2d-0152453c46f4",
"isResizable": true
},
"id": "e228d857-746b-41b6-8d2d-0152453c46f4"
},
{
"targets": [
{
"refId": "A",
"expr": "sum(kafka_topic_partition_replicas{cluster=\"$cluster\"})"
}
],
"name": "Replicas",
"custom": {
"textMode": "valueAndName",
"colorMode": "value",
"calc": "lastNotNull",
"colSpan": 1,
"textSize": {}
},
"options": {
"standardOptions": {}
},
"version": "2.0.0",
"type": "stat",
"layout": {
"h": 3,
"w": 6,
"x": 18,
"y": 1,
"i": "85438099-8d6b-4817-b9b9-1d0ed36029cd",
"isResizable": true
},
"id": "85438099-8d6b-4817-b9b9-1d0ed36029cd"
},
{
"id": "0db4aac4-86cf-44cd-950e-6c6a99be8ff4",
"type": "row",
"name": "throughput",
"layout": {
"h": 1,
"w": 24,
"x": 0,
"y": 4,
"i": "0db4aac4-86cf-44cd-950e-6c6a99be8ff4",
"isResizable": false
},
"collapsed": true
},
{
"targets": [
{
"expr": "sum(rate(kafka_topic_partition_current_offset{cluster=\"$cluster\"}[1m])) by (topic)"
}
],
"name": "Messages produced per second",
"options": {
"tooltip": {
"mode": "all",
"sort": "none"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "off"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 7,
"w": 8,
"x": 0,
"y": 5,
"i": "c2ec4036-3081-45cc-b672-024c6df93833",
"isResizable": true
},
"id": "c2ec4036-3081-45cc-b672-024c6df93833"
},
{
"targets": [
{
"expr": "sum(rate(kafka_consumergroup_current_offset{cluster=\"$cluster\"}[1m])) by (topic)"
}
],
"name": "Messages consumed per second",
"options": {
"tooltip": {
"mode": "all",
"sort": "none"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "off"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 7,
"w": 8,
"x": 8,
"y": 5,
"i": "7ad651a6-c12c-4d46-8d01-749fa776faef",
"isResizable": true
},
"id": "7ad651a6-c12c-4d46-8d01-749fa776faef"
},
{
"targets": [
{
"expr": "sum(kafka_consumer_lag_millis{cluster=\"$cluster\"}) by (consumergroup, topic)",
"legend": "{{consumergroup}} (topic: {{topic}})"
}
],
"name": "Latency by Consumer Group",
"options": {
"tooltip": {
"mode": "all",
"sort": "none"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {
"util": "humantimeMilliseconds"
},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "off"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 7,
"w": 8,
"x": 16,
"y": 5,
"i": "855aa8f5-0c51-42d4-b9a4-5460b7cd0f5a",
"isResizable": true
},
"id": "855aa8f5-0c51-42d4-b9a4-5460b7cd0f5a"
},
{
"id": "20166830-7f85-4665-8f39-bf904267af29",
"type": "row",
"name": "patition/replicate",
"layout": {
"h": 1,
"w": 24,
"x": 0,
"y": 12,
"i": "20166830-7f85-4665-8f39-bf904267af29",
"isResizable": false
},
"collapsed": true
},
{
"targets": [
{
"refId": "A",
"expr": "kafka_topic_partitions{cluster=\"$cluster\"}",
"legend": "{{topic}}"
}
],
"name": "Partitions per Topic",
"custom": {
"showHeader": true,
"colorMode": "value",
"calc": "lastNotNull",
"displayMode": "seriesToRows"
},
"options": {
"standardOptions": {}
},
"overrides": [
{}
],
"version": "2.0.0",
"type": "table",
"layout": {
"h": 7,
"w": 12,
"x": 0,
"y": 13,
"i": "8837a52e-c9eb-4afa-acc1-c3a5dac72d3b",
"isResizable": true
},
"id": "8837a52e-c9eb-4afa-acc1-c3a5dac72d3b"
},
{
"targets": [
{
"refId": "A",
"expr": "kafka_topic_partition_under_replicated_partition{cluster=\"$cluster\"}",
"legend": "{{topic}}-{{partition}}"
}
],
"name": "Partitions Under Replicated",
"description": "副本不同步预案\n1. Restart the Zookeeper leader.\n2. Restart the broker\\brokers that are not replicating some of the partitions.",
"custom": {
"showHeader": true,
"colorMode": "value",
"calc": "lastNotNull",
"displayMode": "seriesToRows"
},
"options": {
"standardOptions": {}
},
"overrides": [
{}
],
"version": "2.0.0",
"type": "table",
"layout": {
"h": 7,
"w": 12,
"x": 12,
"y": 13,
"i": "dd615767-dda7-4da6-b37f-0d484553aac6",
"isResizable": true
},
"id": "dd615767-dda7-4da6-b37f-0d484553aac6"
}
]
},
{
"name": "patition/replicate",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"kafka_topic_partitions{instance=\\\"$instance\\\"}\",\"legend\":\"{{topic}}\"}],\"name\":\"Partitions per Topic\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"seriesToRows\"},\"options\":{\"standardOptions\":{}},\"overrides\":[{}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"kafka_topic_partition_under_replicated_partition\",\"legend\":\"{{topic}}-{{partition}}\"}],\"name\":\"Under Replicated\",\"description\":\"副本不同步预案\\n1. Restart the Zookeeper leader.\\n2. Restart the broker\\\\brokers that are not replicating some of the partitions.\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"seriesToRows\"},\"options\":{\"standardOptions\":{}},\"overrides\":[{}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
]
}
]
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,119 @@
[
{
"name": "MySQL Overview - 模板",
"tags": "Prometheus MySQL",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(mysql_global_status_uptime, instance)\"}]}",
"chart_groups": [
{
"name": "Basic Info",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"min(mysql_global_status_uptime{instance=~\\\"$instance\\\"})\"}],\"name\":\"MySQL Uptime\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":1800},\"result\":{\"color\":\"#ec7718\"}},{\"type\":\"range\",\"match\":{\"from\":1800},\"result\":{\"color\":\"#369603\"}}],\"standardOptions\":{\"util\":\"humantimeSeconds\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_queries{instance=~\\\"$instance\\\"}[5m])\"}],\"name\":\"Current QPS\",\"description\":\"mysql_global_status_queries\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":100},\"result\":{\"color\":\"#05a31f\"}},{\"type\":\"range\",\"match\":{\"from\":100},\"result\":{\"color\":\"#ea3939\"}}],\"standardOptions\":{\"decimals\":2}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"avg(mysql_global_variables_innodb_buffer_pool_size{instance=~\\\"$instance\\\"})\"}],\"name\":\"InnoDB Buffer Pool\",\"description\":\"**InnoDB Buffer Pool Size**\\n\\nInnoDB maintains a storage area called the buffer pool for caching data and indexes in memory. Knowing how the InnoDB buffer pool works, and taking advantage of it to keep frequently accessed data in memory, is one of the most important aspects of MySQL tuning. The goal is to keep the working set in memory. In most cases, this should be between 60%-90% of available memory on a dedicated database host, but depends on many factors.\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"bytesIEC\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(increase(mysql_global_status_table_locks_waited{instance=~\\\"$instance\\\"}[5m]))\"}],\"name\":\"Table Locks Waited(5min)\",\"description\":\"**Table Locks**\\n\\nMySQL takes a number of different locks for varying reasons. In this graph we see how many Table level locks MySQL has requested from the storage engine. In the case of InnoDB, many times the locks could actually be row locks as it only takes table level locks in a few specific cases.\\n\\nIt is most useful to compare Locks Immediate and Locks Waited. If Locks waited is rising, it means you have lock contention. Otherwise, Locks Immediate rising and falling is normal activity.\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":1},\"result\":{\"color\":\"#e70d0d\"}},{\"type\":\"range\",\"match\":{\"to\":1},\"result\":{\"color\":\"#53b503\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Connections",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(mysql_global_status_threads_connected{instance=~\\\"$instance\\\"})\",\"legend\":\"Connections\"},{\"expr\":\"sum(mysql_global_status_max_used_connections{instance=~\\\"$instance\\\"})\",\"legend\":\"Max Used Connections\"},{\"expr\":\"sum(mysql_global_variables_max_connections{instance=~\\\"$instance\\\"})\",\"legend\":\"Max Connections\"},{\"expr\":\"sum(rate(mysql_global_status_aborted_connects{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Aborted Connections\"}],\"name\":\"MySQL Connections\",\"description\":\"**Max Connections** \\n\\nMax Connections is the maximum permitted number of simultaneous client connections. By default, this is 151. Increasing this value increases the number of file descriptors that mysqld requires. If the required number of descriptors are not available, the server reduces the value of Max Connections.\\n\\nmysqld actually permits Max Connections + 1 clients to connect. The extra connection is reserved for use by accounts that have the SUPER privilege, such as root.\\n\\nMax Used Connections is the maximum number of connections that have been in use simultaneously since the server started.\\n\\nConnections is the number of connection attempts (successful or not) to the MySQL server.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(mysql_global_status_threads_connected{instance=~\\\"$instance\\\"})\",\"legend\":\"Threads Connected\"},{\"expr\":\"sum(mysql_global_status_threads_running{instance=~\\\"$instance\\\"})\",\"legend\":\"Threads Running\"}],\"name\":\"MySQL Client Thread Activity\",\"description\":\"Threads Connected is the number of open connections, while Threads Running is the number of threads not sleeping.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "Query Performance",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_created_tmp_tables{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Tables\"},{\"expr\":\"sum(rate(mysql_global_status_created_tmp_disk_tables{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Disk Tables\"},{\"expr\":\"sum(rate(mysql_global_status_created_tmp_files{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Created Tmp Files\"}],\"name\":\"MySQL Temporary Objects\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.64,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_select_full_join{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Full Join\"},{\"expr\":\"sum(rate(mysql_global_status_select_full_range_join{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Full Range Join\"},{\"expr\":\"sum(rate(mysql_global_status_select_range{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Range\"},{\"expr\":\"sum(rate(mysql_global_status_select_range_check{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Range Check\"},{\"expr\":\"sum(rate(mysql_global_status_select_scan{ instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Select Scan\"}],\"name\":\"MySQL Select Types\",\"description\":\"**MySQL Select Types**\\n\\nAs with most relational databases, selecting based on indexes is more efficient than scanning an entire table's data. Here we see the counters for selects not done with indexes.\\n\\n* ***Select Scan*** is how many queries caused full table scans, in which all the data in the table had to be read and either discarded or returned.\\n* ***Select Range*** is how many queries used a range scan, which means MySQL scanned all rows in a given range.\\n* ***Select Full Join*** is the number of joins that are not joined on an index, this is usually a huge performance hit.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"stack\":\"off\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.41},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_sort_rows{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Rows\"},{\"expr\":\"sum(rate(mysql_global_status_sort_range{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Range\"},{\"expr\":\"sum(rate(mysql_global_status_sort_merge_passes{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Merge Passes\"},{\"expr\":\"sum(rate(mysql_global_status_sort_scan{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Sort Scan\"}],\"name\":\"MySQL Sorts\",\"description\":\"**MySQL Sorts**\\n\\nDue to a query's structure, order, or other requirements, MySQL sorts the rows before returning them. For example, if a table is ordered 1 to 10 but you want the results reversed, MySQL then has to sort the rows to return 10 to 1.\\n\\nThis graph also shows when sorts had to scan a whole table or a given range of a table in order to return the results and which could not have been sorted via an index.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_slow_queries{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Slow Queries\"}],\"name\":\"MySQL Slow Queries\",\"description\":\"**MySQL Slow Queries**\\n\\nSlow queries are defined as queries being slower than the long_query_time setting. For example, if you have long_query_time set to 3, all queries that take longer than 3 seconds to complete will show on this graph.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"bars\",\"stack\":\"off\",\"fillOpacity\":0.81},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Network",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(mysql_global_status_bytes_received{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Inbound\"},{\"expr\":\"sum(rate(mysql_global_status_bytes_sent{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"Outbound\"}],\"name\":\"MySQL Network Traffic\",\"description\":\"**MySQL Network Traffic**\\n\\nHere we can see how much network traffic is generated by MySQL. Outbound is network traffic sent from MySQL and Inbound is network traffic MySQL has received.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesSI\",\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Commands, Handlers",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"topk(10, rate(mysql_global_status_commands_total{instance=~\\\"$instance\\\"}[5m])>0)\",\"legend\":\"Com_{{command}}\"}],\"name\":\"Top Command Counters\",\"description\":\"**Top Command Counters**\\n\\nThe Com_{{xxx}} statement counter variables indicate the number of times each xxx statement has been executed. There is one status variable for each type of statement. For example, Com_delete and Com_update count [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements, respectively. Com_delete_multi and Com_update_multi are similar but apply to [``DELETE``](https://dev.mysql.com/doc/refman/5.7/en/delete.html) and [``UPDATE``](https://dev.mysql.com/doc/refman/5.7/en/update.html) statements that use multiple-table syntax.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.2,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_handlers_total{instance=~\\\"$instance\\\", handler!~\\\"commit|rollback|savepoint.*|prepare\\\"}[5m])\",\"legend\":\"{{handler}}\"}],\"name\":\"MySQL Handlers\",\"description\":\"**MySQL Handlers**\\n\\nHandler statistics are internal statistics on how MySQL is selecting, updating, inserting, and modifying rows, tables, and indexes.\\n\\nThis is in fact the layer between the Storage Engine and MySQL.\\n\\n* `read_rnd_next` is incremented when the server performs a full table scan and this is a counter you don't really want to see with a high value.\\n* `read_key` is incremented when a read is done with an index.\\n* `read_next` is incremented when the storage engine is asked to 'read the next index entry'. A high value means a lot of index scans are being done.\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":3},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_handlers_total{instance=~\\\"$instance\\\", handler=~\\\"commit|rollback|savepoint.*|prepare\\\"}[5m])\",\"legend\":\"{{handler}}\"}],\"name\":\"MySQL Transaction Handlers\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Open Files",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"mysql_global_variables_open_files_limit{instance=~\\\"$instance\\\"}\",\"legend\":\"Open Files Limit\"},{\"expr\":\"mysql_global_status_open_files{instance=~\\\"$instance\\\"}\",\"legend\":\"Open Files\"}],\"name\":\"MySQL Open Files\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Table Openings",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"rate(mysql_global_status_table_open_cache_hits{instance=~\\\"$instance\\\"}[5m])\\n/\\n(\\nrate(mysql_global_status_table_open_cache_hits{instance=~\\\"$instance\\\"}[5m])\\n+\\nrate(mysql_global_status_table_open_cache_misses{instance=~\\\"$instance\\\"}[5m])\\n)\",\"legend\":\"Table Open Cache Hit Ratio\"}],\"name\":\"Table Open Cache Hit Ratio Mysql 5.6.6+\",\"description\":\"**MySQL Table Open Cache Status**\\n\\nThe recommendation is to set the `table_open_cache_instances` to a loose correlation to virtual CPUs, keeping in mind that more instances means the cache is split more times. If you have a cache set to 500 but it has 10 instances, each cache will only have 50 cached.\\n\\nThe `table_definition_cache` and `table_open_cache` can be left as default as they are auto-sized MySQL 5.6 and above (ie: do not set them to any value).\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percentUnit\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"mysql_global_status_open_tables{instance=~\\\"$instance\\\"}\",\"legend\":\"Open Tables\"},{\"expr\":\"mysql_global_variables_table_open_cache{instance=~\\\"$instance\\\"}\",\"legend\":\"Table Open Cache\"}],\"name\":\"MySQL Open Tables\",\"description\":\"**MySQL Open Tables**\\n\\nThe recommendation is to set the `table_open_cache_instances` to a loose correlation to virtual CPUs, keeping in mind that more instances means the cache is split more times. If you have a cache set to 500 but it has 10 instances, each cache will only have 50 cached.\\n\\nThe `table_definition_cache` and `table_open_cache` can be left as default as they are auto-sized MySQL 5.6 and above (ie: do not set them to any value).\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,232 @@
{
"name": "夜莺大盘",
"tags": "",
"configs": {
"var": [],
"panels": [
{
"targets": [
{
"refId": "A",
"expr": "rate(n9e_server_samples_received_total[1m])"
}
],
"name": "每秒接收的数据点个数",
"options": {
"tooltip": {
"mode": "all",
"sort": "none"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "off"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 0,
"y": 0,
"i": "53fcb9dc-23f9-41e0-bc5e-121eed14c3a4",
"isResizable": true
},
"id": "53fcb9dc-23f9-41e0-bc5e-121eed14c3a4"
},
{
"targets": [
{
"refId": "A",
"expr": "rate(n9e_server_alerts_total[10m])"
}
],
"name": "每秒产生的告警事件个数",
"options": {
"tooltip": {
"mode": "all",
"sort": "none"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "off"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 12,
"y": 0,
"i": "47fc6252-9cc8-4b53-8e27-0c5c59a47269",
"isResizable": true
},
"id": "f70dcb8b-b58b-4ef9-9e48-f230d9e17140"
},
{
"targets": [
{
"refId": "A",
"expr": "n9e_server_alert_queue_size"
}
],
"name": "告警事件内存队列长度",
"options": {
"tooltip": {
"mode": "all",
"sort": "none"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "off"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 0,
"y": 4,
"i": "ad1af16c-de0c-45f4-8875-cea4e85d51d0",
"isResizable": true
},
"id": "caf23e58-d907-42b0-9ed6-722c8c6f3c5f"
},
{
"targets": [
{
"refId": "A",
"expr": "n9e_server_http_request_duration_seconds_sum/n9e_server_http_request_duration_seconds_count"
}
],
"name": "数据接收接口平均响应时间(单位:秒)",
"options": {
"tooltip": {
"mode": "all",
"sort": "desc"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "noraml"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 12,
"y": 4,
"i": "64c3abc2-404c-4462-a82f-c109a21dac91",
"isResizable": true
},
"id": "6b8d2db1-efca-4b9e-b429-57a9d2272bc5"
},
{
"targets": [
{
"refId": "A",
"expr": "n9e_server_sample_queue_size"
}
],
"name": "内存数据队列长度",
"options": {
"tooltip": {
"mode": "all",
"sort": "desc"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "off"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 0,
"y": 8,
"i": "1c7da942-58c2-40dc-b42f-983e4a35b89b",
"isResizable": true
},
"id": "bd41677d-40d3-482e-bb6e-fbd25df46d87"
},
{
"targets": [
{
"refId": "A",
"expr": "avg(n9e_server_forward_duration_seconds_sum/n9e_server_forward_duration_seconds_count)"
}
],
"name": "数据发往TSDB平均耗时单位",
"options": {
"tooltip": {
"mode": "all",
"sort": "desc"
},
"legend": {
"displayMode": "hidden"
},
"standardOptions": {},
"thresholds": {}
},
"custom": {
"drawStyle": "lines",
"lineInterpolation": "smooth",
"fillOpacity": 0.5,
"stack": "noraml"
},
"version": "2.0.0",
"type": "timeseries",
"layout": {
"h": 4,
"w": 12,
"x": 12,
"y": 8,
"i": "eed94a0b-954f-48ac-82e5-a2eada1c8a3d",
"isResizable": true
},
"id": "c8642e72-f384-46a5-8410-1e6be2953c3c"
}
],
"version": "2.0.0"
}
}

View File

@ -0,0 +1,19 @@
[
{
"name": "TCP探测",
"tags": "",
"configs": "",
"chart_groups": [
{
"name": "Default chart group",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"max(net_response_result_code) by (target)\",\"legend\":\"UP?\"},{\"expr\":\"max(net_response_response_time) by (target)\",\"refId\":\"C\",\"legend\":\"latency(s)\"}],\"name\":\"Targets\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelValuesToRows\",\"aggrDimension\":\"target\"},\"options\":{\"valueMappings\":[],\"standardOptions\":{}},\"overrides\":[{\"properties\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"UP\",\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"special\":1,\"from\":1},\"result\":{\"text\":\"DOWN\",\"color\":\"#e90f0f\"}}],\"standardOptions\":{}},\"matcher\":{\"value\":\"A\"}}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":4,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,127 @@
[
{
"name": "Oracle - 模板",
"tags": "Telegraf",
"configs": "{\"var\":[{\"name\":\"ident\",\"definition\":\"label_values(oracle_up,ident)\",\"options\":[\"tt-fc-log00.nj\"]},{\"name\":\"instance\",\"definition\":\"label_values(oracle_up{ident=\\\"$ident\\\"},instance)\"}]}",
"chart_groups": [
{
"name": "Activities",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(oracle_activity_execute_count_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}[2m])\"}],\"name\":\"execute count / second\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"decimals\":1}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(oracle_activity_user_commits_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}[2m])\"}],\"name\":\"user commits / second\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(oracle_activity_user_rollbacks_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}[2m])\"}],\"name\":\"user rollbacks / second\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_up{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"status\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":1},\"result\":{\"text\":\"UP\",\"color\":\"#5ea70f\"}},{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"DOWN\",\"color\":\"#f60f0f\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Waits",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_wait_time_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"legend\":\"{{wait_class}}\"}],\"name\":\"Time waited\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Tablespace",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_tablespace_bytes{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}/oracle_tablespace_max_bytes{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"legend\":\"{{tablespace}}\"}],\"name\":\"Used Percent\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"percentUnit\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_tablespace_free{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"legend\":\"{{tablespace}}\"}],\"name\":\"Free space\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "IO and TPS",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_io_requests_per_second_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"IO Requests / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_user_transaction_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"TPS\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_io_megabytes_per_second_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}*1024*1024\"}],\"name\":\"IO Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Connections",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sessions_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\",status=\\\"ACTIVE\\\"}\",\"legend\":\"\"}],\"name\":\"Sessions\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "Hit Ratio",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_buffer_cache_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Buffer Cache Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_redo_allocation_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Redo Allocation Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_row_cache_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Row Cache Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_library_cache_hit_ratio_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"}],\"name\":\"Library Cache Hit Ratio\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Physical Read Write",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_bytes_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_Bytes_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical Read Write Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_total_bytes_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_Total_Bytes_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical Read Write Total Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_io_requests_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_IO_Requests_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical RW IO Requests / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"oracle_sysmetric_physical_read_total_io_requests_per_sec_value{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\"},{\"expr\":\"oracle_sysmetric_Physical_Write_Total_IO_Requests_Per_Sec{ident=\\\"$ident\\\", instance=\\\"$instance\\\"}\",\"refId\":\"B\"}],\"name\":\"Physical RW Total IO Requests / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,19 @@
[
{
"name": "PING探测",
"tags": "",
"configs": "",
"chart_groups": [
{
"name": "Default chart group",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"max(ping_result_code) by (target)\",\"legend\":\"UP?\"},{\"expr\":\"max(ping_percent_packet_loss) by (target)\",\"refId\":\"B\",\"legend\":\"Packet Loss %\"},{\"expr\":\"max(httpresponse_response_time) by (target)\",\"refId\":\"C\",\"legend\":\"latency(s)\"}],\"name\":\"Ping\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelValuesToRows\",\"aggrDimension\":\"target\"},\"options\":{\"valueMappings\":[],\"standardOptions\":{}},\"overrides\":[{\"properties\":{\"valueMappings\":[{\"type\":\"special\",\"match\":{\"special\":0},\"result\":{\"text\":\"UP\",\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"special\":1,\"from\":1},\"result\":{\"text\":\"DOWN\",\"color\":\"#e90f0f\"}}],\"standardOptions\":{}},\"matcher\":{\"value\":\"A\"}}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":4,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,221 @@
[
{
"name": "RabbitMQ 3.8+",
"tags": "",
"configs": "{\"var\":[{\"definition\":\"label_values(rabbitmq_identity_info, rabbitmq_cluster)\",\"name\":\"rabbitmq_cluster\"}]}",
"chart_groups": [
{
"name": "Overview",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_ready * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Ready messages\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10000},\"result\":{\"color\":\"#4a90e2\"}},{\"type\":\"range\",\"match\":{\"from\":100000},\"result\":{\"color\":\"#f50a0a\"}},{\"type\":\"range\",\"match\":{\"to\":9999},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":7,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_published_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Incoming messages / s\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":50},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":5,\"x\":7,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_channels * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) - sum(rabbitmq_channel_consumers * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Publishers\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_connections * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Connections\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":16,\"y\":0,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queues * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Queues\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":20,\"y\":0,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_unacked * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Unacknowledged messages\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":99},\"result\":{\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"from\":100},\"result\":{\"color\":\"#4a90e2\"}},{\"type\":\"range\",\"match\":{\"from\":500},\"result\":{\"color\":\"#d0021b\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":7,\"x\":0,\"y\":1,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_redelivered_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_messages_delivered_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_messages_delivered_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_get_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\nsum(rate(rabbitmq_channel_get_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Outgoing messages / s\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":50},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":5,\"x\":7,\"y\":1,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_channel_consumers * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Consumers\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":12,\"y\":1,\"i\":\"7\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_channels * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Channels\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"from\":10},\"result\":{\"color\":\"#417505\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":16,\"y\":1,\"i\":\"8\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_build_info * on(instance) group_left(rabbitmq_cluster) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Nodes\",\"custom\":{\"textMode\":\"valueAndName\",\"colorMode\":\"background\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":null,\"from\":3},\"result\":{\"color\":\"#417505\"}},{\"type\":\"range\",\"match\":{\"from\":8},\"result\":{\"color\":\"#e70909\"}}],\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":4,\"x\":20,\"y\":1,\"i\":\"9\"}}",
"weight": 0
}
]
},
{
"name": "Nodes",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_build_info * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"nodes\",\"custom\":{\"showHeader\":true,\"calc\":\"lastNotNull\",\"displayMode\":\"labelsOfSeriesToRows\",\"columns\":[\"rabbitmq_cluster\",\"rabbitmq_node\",\"rabbitmq_version\",\"erlang_version\"]},\"options\":{\"standardOptions\":{}},\"overrides\":[{}],\"version\":\"2.0.0\",\"type\":\"table\",\"layout\":{\"h\":1,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"(rabbitmq_resident_memory_limit_bytes * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) -\\n(rabbitmq_process_resident_memory_bytes * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"Memory available before publishers blocked\",\"description\":\"If the value is zero or less, the memory alarm will be triggered and all publishing connections across all cluster nodes will be blocked.\\n\\nThis value can temporarily go negative because the memory alarm is triggered with a slight delay.\\n\\nThe kernel's view of the amount of memory used by the node can differ from what the node itself can observe. This means that this value can be negative for a sustained period of time.\\n\\nBy default nodes use resident set size (RSS) to compute how much memory they use. This strategy can be changed (see the guides below).\\n\\n* [Alarms](https://www.rabbitmq.com/alarms.html)\\n* [Memory Alarms](https://www.rabbitmq.com/memory.html)\\n* [Reasoning About Memory Use](https://www.rabbitmq.com/memory-use.html)\\n* [Blocked Connection Notifications](https://www.rabbitmq.com/connection-blocked.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":1,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_disk_space_available_bytes * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Disk space available before publishers blocked\",\"description\":\"This metric is reported for the partition where the RabbitMQ data directory is stored.\\n\\nIf the value is zero or less, the disk alarm will be triggered and all publishing connections across all cluster nodes will be blocked.\\n\\nThis value can temporarily go negative because the free disk space alarm is triggered with a slight delay.\\n\\n* [Alarms](https://www.rabbitmq.com/alarms.html)\\n* [Disk Space Alarms](https://www.rabbitmq.com/disk-alarms.html)\\n* [Disk Space](https://www.rabbitmq.com/production-checklist.html#resource-limits-disk-space)\\n* [Persistence Configuration](https://www.rabbitmq.com/persistence-conf.html)\\n* [Blocked Connection Notifications](https://www.rabbitmq.com/connection-blocked.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":1,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"(rabbitmq_process_max_fds * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) -\\n(rabbitmq_process_open_fds * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"File descriptors available\",\"description\":\"When this value reaches zero, new connections will not be accepted and disk write operations may fail.\\n\\nClient libraries, peer nodes and CLI tools will not be able to connect when the node runs out of available file descriptors.\\n\\n* [Open File Handles Limit](https://www.rabbitmq.com/production-checklist.html#resource-limits-file-handle-limit)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":8,\"x\":16,\"y\":1,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"(rabbitmq_process_max_tcp_sockets * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) -\\n(rabbitmq_process_open_tcp_sockets * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\"}],\"name\":\"TCP sockets available\",\"description\":\"When this value reaches zero, new connections will not be accepted.\\n\\nClient libraries, peer nodes and CLI tools will not be able to connect when the node runs out of available file descriptors.\\n\\n* [Networking and RabbitMQ](https://www.rabbitmq.com/networking.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":8,\"x\":16,\"y\":2,\"i\":\"4\"}}",
"weight": 0
}
]
},
{
"name": "QUEUED MESSAGES",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_ready * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages ready to be delivered to consumers\",\"description\":\"Total number of ready messages ready to be delivered to consumers.\\n\\nAim to keep this value as low as possible. RabbitMQ behaves best when messages are flowing through it. It's OK for publishers to occasionally outpace consumers, but the expectation is that consumers will eventually process all ready messages.\\n\\nIf this metric keeps increasing, your system will eventually run out of memory and/or disk space. Consider using TTL or Queue Length Limit to prevent unbounded message growth.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\\n* [Queue Length Limit](https://www.rabbitmq.com/maxlength.html)\\n* [Time-To-Live and Expiration](https://www.rabbitmq.com/ttl.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rabbitmq_queue_messages_unacked * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages pending consumer acknowledgement\",\"description\":\"The total number of messages that are either in-flight to consumers, currently being processed by consumers or simply waiting for the consumer acknowledgements to be processed by the queue. Until the queue processes the message acknowledgement, the message will remain unacknowledged.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\\n* [Confirms and Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [Consumer Prefetch](https://www.rabbitmq.com/consumer-prefetch.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
},
{
"name": "INCOMING MESSAGES",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_published_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages published / s\",\"description\":\"The incoming message rate before any routing rules are applied.\\n\\nIf this value is lower than the number of messages published to queues, it may indicate that some messages are delivered to more than one queue.\\n\\nIf this value is higher than the number of messages published to queues, messages cannot be routed and will either be dropped or returned to publishers.\\n\\n* [Publishers](https://www.rabbitmq.com/publishers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_confirmed_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages confirmed to publishers / s\",\"description\":\"The rate of messages confirmed by the broker to publishers. Publishers must opt-in to receive message confirmations.\\n\\nIf this metric is consistently at zero it may suggest that publisher confirms are not used by clients. The safety of published messages is likely to be at risk.\\n\\n* [Publisher Confirms](https://www.rabbitmq.com/confirms.html#publisher-confirms)\\n* [Publisher Confirms and Data Safety](https://www.rabbitmq.com/publishers.html#data-safety)\\n* [When Will Published Messages Be Confirmed by the Broker?](https://www.rabbitmq.com/confirms.html#when-publishes-are-confirmed)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queue_messages_published_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages routed to queues / s\",\"description\":\"The rate of messages received from publishers and successfully routed to the master queue replicas.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\\n* [Publishers](https://www.rabbitmq.com/publishers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":1,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_unconfirmed[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages unconfirmed to publishers / s\",\"description\":\"The rate of messages received from publishers that have publisher confirms enabled and the broker has not confirmed yet.\\n\\n* [Publishers](https://www.rabbitmq.com/publishers.html)\\n* [Confirms and Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [When Will Published Messages Be Confirmed by the Broker?](https://www.rabbitmq.com/confirms.html#when-publishes-are-confirmed)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":1,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_unroutable_dropped_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Unroutable messages dropped / s\",\"description\":\"The rate of messages that cannot be routed and are dropped. \\n\\nAny value above zero means message loss and likely suggests a routing problem on the publisher end.\\n\\n* [Unroutable Message Handling](https://www.rabbitmq.com/publishers.html#unroutable)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_unroutable_returned_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Unroutable messages returned to publishers / s\",\"description\":\"The rate of messages that cannot be routed and are returned back to publishers.\\n\\nSustained values above zero may indicate a routing problem on the publisher end.\\n\\n* [Unroutable Message Handling](https://www.rabbitmq.com/publishers.html#unroutable)\\n* [When Will Published Messages Be Confirmed by the Broker?](https://www.rabbitmq.com/confirms.html#when-publishes-are-confirmed)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":2,\"i\":\"5\"}}",
"weight": 0
}
]
},
{
"name": "OUTGOING MESSAGES",
"weight": 4,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(\\n (rate(rabbitmq_channel_messages_delivered_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) +\\n (rate(rabbitmq_channel_messages_delivered_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"})\\n) by(rabbitmq_node)\"}],\"name\":\"Messages delivered / s\",\"description\":\"The rate of messages delivered to consumers. It includes messages that have been redelivered.\\n\\nThis metric does not include messages that have been fetched by consumers using `basic.get` (consumed by polling).\\n\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_redelivered_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages redelivered / s\",\"description\":\"The rate of messages that have been redelivered to consumers. It includes messages that have been requeued automatically and redelivered due to channel exceptions or connection closures.\\n\\nHaving some redeliveries is expected, but if this metric is consistently non-zero, it is worth investigating why.\\n\\n* [Negative Acknowledgement and Requeuing of Deliveries](https://www.rabbitmq.com/confirms.html#consumer-nacks-requeue)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_delivered_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages delivered with manual ack / s\",\"description\":\"The rate of message deliveries to consumers that use manual acknowledgement mode.\\n\\nWhen this mode is used, RabbitMQ waits for consumers to acknowledge messages before more messages can be delivered.\\n\\nThis is the safest way of consuming messages.\\n\\n* [Consumer Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [Consumer Prefetch](https://www.rabbitmq.com/consumer-prefetch.html)\\n* [Consumer Acknowledgement Modes, Prefetch and Throughput](https://www.rabbitmq.com/confirms.html#channel-qos-prefetch-throughput)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":1,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_delivered_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages delivered auto ack / s\",\"description\":\"The rate of message deliveries to consumers that use automatic acknowledgement mode.\\n\\nWhen this mode is used, RabbitMQ does not wait for consumers to acknowledge message deliveries.\\n\\nThis mode is fire-and-forget and does not offer any delivery safety guarantees. It tends to provide higher throughput and it may lead to consumer overload and higher consumer memory usage.\\n\\n* [Consumer Acknowledgement Modes, Prefetch and Throughput](https://www.rabbitmq.com/confirms.html#channel-qos-prefetch-throughput)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":1,\"i\":\"3\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_messages_acked_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Messages acknowledged / s\",\"description\":\"The rate of message acknowledgements coming from consumers that use manual acknowledgement mode.\\n\\n* [Consumer Acknowledgements](https://www.rabbitmq.com/confirms.html)\\n* [Consumer Prefetch](https://www.rabbitmq.com/consumer-prefetch.html)\\n* [Consumer Acknowledgement Modes, Prefetch and Throughput](https://www.rabbitmq.com/confirms.html#channel-qos-prefetch-throughput)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":2,\"i\":\"4\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_get_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Polling operations with auto ack / s\",\"description\":\"The rate of messages delivered to polling consumers that use automatic acknowledgement mode.\\n\\nThe use of polling consumers is highly inefficient and therefore strongly discouraged.\\n\\n* [Fetching individual messages](https://www.rabbitmq.com/consumers.html#fetching)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":2,\"i\":\"5\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_get_empty_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Polling operations that yield no result / s\",\"description\":\"The rate of polling consumer operations that yield no result.\\n\\nAny value above zero means that RabbitMQ resources are wasted by polling consumers.\\n\\nCompare this metric to the other polling consumer metrics to see the inefficiency rate.\\n\\nThe use of polling consumers is highly inefficient and therefore strongly discouraged.\\n\\n* [Fetching individual messages](https://www.rabbitmq.com/consumers.html#fetching)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":3,\"i\":\"6\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channel_get_ack_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Polling operations with manual ack / s\",\"description\":\"The rate of messages delivered to polling consumers that use manual acknowledgement mode.\\n\\nThe use of polling consumers is highly inefficient and therefore strongly discouraged.\\n\\n* [Fetching individual messages](https://www.rabbitmq.com/consumers.html#fetching)\\n* [Consumers](https://www.rabbitmq.com/consumers.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":12,\"y\":3,\"i\":\"7\"}}",
"weight": 0
}
]
},
{
"name": "QUEUES",
"weight": 5,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_queues * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Total queues\",\"description\":\"Total number of queue masters per node. \\n\\nThis metric makes it easy to see sub-optimal queue distribution in a cluster.\\n\\n* [Queue Masters, Data Locality](https://www.rabbitmq.com/ha.html#master-migration-data-locality)\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queues_declared_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Queues declared / s\",\"description\":\"The rate of queue declarations performed by clients.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of queue churn or high rates of connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":4,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queues_created_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Queues created / s\",\"description\":\"The rate of new queues created (as opposed to redeclarations).\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of queue churn or high rates of connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":4,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_queues_deleted_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Queues deleted / s\",\"description\":\"The rate of queues deleted.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of queue churn or high rates of connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [Queues](https://www.rabbitmq.com/queues.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":4,\"x\":20,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "CHANNELS",
"weight": 6,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_channels * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Total channels\",\"description\":\"Total number of channels on all currently opened connections.\\n\\nIf this metric grows monotonically it is highly likely a channel leak in one of the applications. Confirm channel leaks by using the _Channels opened_ and _Channels closed_ metrics.\\n\\n* [Channel Leak](https://www.rabbitmq.com/channels.html#channel-leaks)\\n* [Channels](https://www.rabbitmq.com/channels.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channels_opened_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Channels opened / s\",\"description\":\"The rate of new channels opened by applications across all connections. Channels are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of channel churn or mass connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [High Channel Churn](https://www.rabbitmq.com/channels.html#high-channel-churn)\\n* [Channels](https://www.rabbitmq.com/channels.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_channels_closed_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Channels closed / s\",\"description\":\"The rate of channels closed by applications across all connections. Channels are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of channel churn or mass connection recovery. Confirm connection recovery rates by using the _Connections opened_ metric.\\n\\n* [High Channel Churn](https://www.rabbitmq.com/channels.html#high-channel-churn)\\n* [Channels](https://www.rabbitmq.com/channels.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "CONNECTIONS",
"weight": 7,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_connections_closed_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Connections closed / s\",\"description\":\"The rate of connections closed. Connections are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of connection churn or mass connection recovery.\\n\\n* [Connections](https://www.rabbitmq.com/connections.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rabbitmq_connections * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}\"}],\"name\":\"Total connections\",\"description\":\"Total number of client connections.\\n\\nIf this metric grows monotonically it is highly likely a connection leak in one of the applications. Confirm connection leaks by using the _Connections opened_ and _Connections closed_ metrics.\\n\\n* [Connection Leak](https://www.rabbitmq.com/connections.html#monitoring)\\n* [Connections](https://www.rabbitmq.com/connections.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"sum(rate(rabbitmq_connections_opened_total[60s]) * on(instance) group_left(rabbitmq_cluster, rabbitmq_node) rabbitmq_identity_info{rabbitmq_cluster=\\\"$rabbitmq_cluster\\\"}) by(rabbitmq_node)\"}],\"name\":\"Connections opened / s\",\"description\":\"The rate of new connections opened by clients. Connections are expected to be long-lived.\\n\\nLow sustained values above zero are to be expected. High rates may be indicative of connection churn or mass connection recovery.\\n\\n* [Connection Leak](https://www.rabbitmq.com/connections.html#monitoring)\\n* [Connections](https://www.rabbitmq.com/connections.html)\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,77 @@
[
{
"name": "Redis Overview - 模板",
"tags": "Redis Prometheus",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(redis_uptime_in_seconds,instance)\",\"selected\":\"10.206.0.16:6379\"}]}",
"chart_groups": [
{
"name": "Basic Info",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"min(redis_uptime_in_seconds{instance=~\\\"$instance\\\"})\"}],\"name\":\"Redis Uptime\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"humantimeSeconds\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(redis_connected_clients{instance=~\\\"$instance\\\"})\"}],\"name\":\"Connected Clients\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"redis_used_memory{instance=~\\\"$instance\\\"}\"}],\"name\":\"Memory Used\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"valueMappings\":[{\"type\":\"range\",\"match\":{\"to\":128000000},\"result\":{\"color\":\"#079e05\"}},{\"type\":\"range\",\"match\":{\"from\":128000000},\"result\":{\"color\":\"#f10909\"}}],\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":0}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"redis_maxmemory{instance=~\\\"$instance\\\"}\"}],\"name\":\"Max Memory Limit\",\"custom\":{\"textMode\":\"value\",\"colorMode\":\"value\",\"calc\":\"lastNotNull\",\"colSpan\":1,\"textSize\":{}},\"options\":{\"standardOptions\":{\"util\":\"bytesIEC\"}},\"version\":\"2.0.0\",\"type\":\"stat\",\"layout\":{\"h\":1,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "Commands",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"rate(redis_total_commands_processed{instance=~\\\"$instance\\\"}[5m])\"}],\"name\":\"Commands Executed / sec\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"irate(redis_keyspace_hits{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"hits\"},{\"expr\":\"irate(redis_keyspace_misses{instance=~\\\"$instance\\\"}[5m])\",\"legend\":\"misses\"}],\"name\":\"Hits / Misses per Sec\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"noraml\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"topk(5, irate(redis_cmdstat_calls{instance=~\\\"$instance\\\"} [1m]))\",\"legend\":\"{{command}}\"}],\"name\":\"Top Commands\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Keys",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum (redis_keyspace_keys{instance=~\\\"$instance\\\"}) by (db)\",\"legend\":\"{{db}}\"}],\"name\":\"Total Items per DB\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(redis_expired_keys{instance=~\\\"$instance\\\"}[5m])) by (instance)\",\"legend\":\"expired\"},{\"expr\":\"sum(rate(redis_evicted_keys{instance=~\\\"$instance\\\"}[5m])) by (instance)\",\"legend\":\"evicted\"}],\"name\":\"Expired / Evicted\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":8,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"expr\":\"sum(redis_keyspace_keys{instance=~\\\"$instance\\\"}) - sum(redis_keyspace_expires{instance=~\\\"$instance\\\"}) \",\"legend\":\"not expiring\"},{\"expr\":\"sum(redis_keyspace_expires{instance=~\\\"$instance\\\"}) \",\"legend\":\"expiring\"}],\"name\":\"Expiring vs Not-Expiring Keys\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"noraml\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":8,\"x\":16,\"y\":0,\"i\":\"2\"}}",
"weight": 0
}
]
},
{
"name": "Network",
"weight": 3,
"charts": [
{
"configs": "{\"targets\":[{\"expr\":\"sum(rate(redis_total_net_input_bytes{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"input\"},{\"expr\":\"sum(rate(redis_total_net_output_bytes{instance=~\\\"$instance\\\"}[5m]))\",\"legend\":\"output\"}],\"name\":\"Network I/O\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":2},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -0,0 +1,63 @@
[
{
"name": "Tomcat - 模板",
"tags": "Categraf",
"configs": "{\"var\":[{\"name\":\"instance\",\"definition\":\"label_values(tomcat_up, instance)\"}],\"links\":[{\"title\":\"n9e\",\"url\":\"https://n9e.gitee.io/\",\"targetBlank\":true}]}",
"chart_groups": [
{
"name": "connector",
"weight": 0,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(tomcat_connector_bytes_sent{instance=\\\"$instance\\\"}[1m])\",\"legend\":\"sent\"},{\"expr\":\"rate(tomcat_connector_bytes_received{instance=\\\"$instance\\\"}[1m])\",\"refId\":\"B\",\"legend\":\"received\"}],\"name\":\"Traffic Bytes / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\"},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(tomcat_connector_request_count{instance=\\\"$instance\\\"}[1m])\",\"legend\":\"tomcat_connector_request_count\"},{\"expr\":\"rate(tomcat_connector_error_count{instance=\\\"$instance\\\"}[1m])\",\"refId\":\"B\",\"legend\":\"tomcat_connector_error_count\"}],\"name\":\"Request count / Second\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_connector_max_threads{instance=\\\"$instance\\\"}\",\"legend\":\"max_threads\"},{\"expr\":\"tomcat_connector_current_thread_count{instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"current_thread_count\"},{\"expr\":\"tomcat_connector_current_threads_busy{instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"current_threads_busy\"}],\"name\":\"Tread\",\"description\":\"max_threads: The maximum number of allowed worker threads.\\ncurrent_thread_count: The number of threads managed by the thread pool\\ncurrent_threads_busy: The number of threads that are in use\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":0,\"y\":2,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"rate(tomcat_connector_processing_time{instance=\\\"$instance\\\"}[1m])\",\"legend\":\"{{name}}-processing_time\"}],\"name\":\"Processing time\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"none\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":12,\"x\":12,\"y\":2,\"i\":\"3\"}}",
"weight": 0
}
]
},
{
"name": "mem used",
"weight": 1,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memory_max{instance=\\\"$instance\\\"}\",\"legend\":\"max\"},{\"expr\":\"tomcat_jvm_memory_total{instance=\\\"$instance\\\"}\",\"refId\":\"B\",\"legend\":\"used\"},{\"expr\":\"tomcat_jvm_memory_free{instance=\\\"$instance\\\"}\",\"refId\":\"C\",\"legend\":\"free\"}],\"name\":\"Mem Used\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":24,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
}
]
},
{
"name": "memorypool",
"weight": 2,
"charts": [
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_used{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Used\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":0,\"y\":0,\"i\":\"0\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_max{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Max\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":6,\"y\":0,\"i\":\"1\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_committed{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Committed\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":12,\"y\":0,\"i\":\"2\"}}",
"weight": 0
},
{
"configs": "{\"targets\":[{\"refId\":\"A\",\"expr\":\"tomcat_jvm_memorypool_init{instance=\\\"$instance\\\"}\",\"legend\":\"{{name}}\"}],\"name\":\"Init\",\"options\":{\"tooltip\":{\"mode\":\"all\",\"sort\":\"desc\"},\"legend\":{\"displayMode\":\"hidden\"},\"standardOptions\":{\"util\":\"bytesIEC\",\"decimals\":1},\"thresholds\":{}},\"custom\":{\"drawStyle\":\"lines\",\"lineInterpolation\":\"smooth\",\"fillOpacity\":0.5,\"stack\":\"off\"},\"version\":\"2.0.0\",\"type\":\"timeseries\",\"layout\":{\"h\":2,\"w\":6,\"x\":18,\"y\":0,\"i\":\"3\"}}",
"weight": 0
}
]
}
]
}
]

View File

@ -1,131 +1,383 @@
cpu_usage_idle: CPU空闲率单位%
cpu_usage_active: CPU使用率单位%
cpu_usage_system: CPU内核态时间占比单位%
cpu_usage_user: CPU用户态时间占比单位%
cpu_usage_nice: 低优先级用户态CPU时间占比也就是进程nice值被调整为1-19之间的CPU时间。这里注意nice可取值范围是-20到19数值越大优先级反而越低单位%
cpu_usage_iowait: CPU等待I/O的时间占比单位%
cpu_usage_irq: CPU处理硬中断的时间占比单位%
cpu_usage_softirq: CPU处理软中断的时间占比单位%
cpu_usage_steal: 在虚拟机环境下有该指标表示CPU被其他虚拟机争用的时间占比超过20就表示争抢严重单位%
cpu_usage_guest: 通过虚拟化运行其他操作系统的时间也就是运行虚拟机的CPU时间占比单位%
cpu_usage_guest_nice: 以低优先级运行虚拟机的时间占比(单位:%
zh:
cpu_usage_idle: CPU空闲率单位%
cpu_usage_active: CPU使用率单位%
cpu_usage_system: CPU内核态时间占比单位%
cpu_usage_user: CPU用户态时间占比单位%
cpu_usage_nice: 低优先级用户态CPU时间占比也就是进程nice值被调整为1-19之间的CPU时间。这里注意nice可取值范围是-20到19数值越大优先级反而越低单位%
cpu_usage_iowait: CPU等待I/O的时间占比单位%
cpu_usage_irq: CPU处理硬中断的时间占比单位%
cpu_usage_softirq: CPU处理软中断的时间占比单位%
cpu_usage_steal: 在虚拟机环境下有该指标表示CPU被其他虚拟机争用的时间占比超过20就表示争抢严重单位%
cpu_usage_guest: 通过虚拟化运行其他操作系统的时间也就是运行虚拟机的CPU时间占比单位%
cpu_usage_guest_nice: 以低优先级运行虚拟机的时间占比(单位:%
disk_free: 硬盘分区剩余量单位byte
disk_used: 硬盘分区使用量单位byte
disk_used_percent: 硬盘分区使用率(单位:%
disk_total: 硬盘分区总量单位byte
disk_inodes_free: 硬盘分区inode剩余量
disk_inodes_used: 硬盘分区inode使用量
disk_inodes_total: 硬盘分区inode总量
disk_free: 硬盘分区剩余量单位byte
disk_used: 硬盘分区使用量单位byte
disk_used_percent: 硬盘分区使用率(单位:%
disk_total: 硬盘分区总量单位byte
disk_inodes_free: 硬盘分区inode剩余量
disk_inodes_used: 硬盘分区inode使用量
disk_inodes_total: 硬盘分区inode总量
diskio_io_time: 从设备视角来看I/O请求总时间队列中有I/O请求就计数单位毫秒counter类型需要用函数求rate才有使用价值
diskio_iops_in_progress: 已经分配给设备驱动且尚未完成的IO请求不包含在队列中但尚未分配给设备驱动的IO请求gauge类型
diskio_merged_reads: 相邻读请求merge读的次数counter类型
diskio_merged_writes: 相邻写请求merge写的次数counter类型
diskio_read_bytes: 读取的byte数量counter类型需要用函数求rate才有使用价值
diskio_read_time: 读请求总时间单位毫秒counter类型需要用函数求rate才有使用价值
diskio_reads: 读请求次数counter类型需要用函数求rate才有使用价值
diskio_weighted_io_time: 从I/O请求视角来看I/O等待总时间如果同时有多个I/O请求时间会叠加单位毫秒
diskio_write_bytes: 写入的byte数量counter类型需要用函数求rate才有使用价值
diskio_write_time: 写请求总时间单位毫秒counter类型需要用函数求rate才有使用价值
diskio_writes: 写请求次数counter类型需要用函数求rate才有使用价值
diskio_io_time: 从设备视角来看I/O请求总时间队列中有I/O请求就计数单位毫秒counter类型需要用函数求rate才有使用价值
diskio_iops_in_progress: 已经分配给设备驱动且尚未完成的IO请求不包含在队列中但尚未分配给设备驱动的IO请求gauge类型
diskio_merged_reads: 相邻读请求merge读的次数counter类型
diskio_merged_writes: 相邻写请求merge写的次数counter类型
diskio_read_bytes: 读取的byte数量counter类型需要用函数求rate才有使用价值
diskio_read_time: 读请求总时间单位毫秒counter类型需要用函数求rate才有使用价值
diskio_reads: 读请求次数counter类型需要用函数求rate才有使用价值
diskio_weighted_io_time: 从I/O请求视角来看I/O等待总时间如果同时有多个I/O请求时间会叠加单位毫秒
diskio_write_bytes: 写入的byte数量counter类型需要用函数求rate才有使用价值
diskio_write_time: 写请求总时间单位毫秒counter类型需要用函数求rate才有使用价值
diskio_writes: 写请求次数counter类型需要用函数求rate才有使用价值
kernel_boot_time: 内核启动时间
kernel_context_switches: 内核上下文切换次数
kernel_entropy_avail: linux系统内部的熵池
kernel_interrupts: 内核中断次数
kernel_processes_forked: fork的进程数
kernel_boot_time: 内核启动时间
kernel_context_switches: 内核上下文切换次数
kernel_entropy_avail: linux系统内部的熵池
kernel_interrupts: 内核中断次数
kernel_processes_forked: fork的进程数
mem_active: 活跃使用的内存总数(包括cache和buffer内存)
mem_available: 应用程序可用内存数
mem_available_percent: 内存剩余百分比(0~100)
mem_buffered: 用来给文件做缓冲大小
mem_cached: 被高速缓冲存储器cache memory用的内存的大小等于 diskcache minus SwapCache
mem_commit_limit: 根据超额分配比率('vm.overcommit_ratio'这是当前在系统上分配可用的内存总量这个限制只是在模式2('vm.overcommit_memory')时启用
mem_committed_as: 目前在系统上分配的内存量。是所有进程申请的内存的总和
mem_dirty: 等待被写回到磁盘的内存大小
mem_free: 空闲内存数
mem_high_free: 未被使用的高位内存大小
mem_high_total: 高位内存总大小Highmem是指所有内存高于860MB的物理内存,Highmem区域供用户程序使用或用于页面缓存。该区域不是直接映射到内核空间。内核必须使用不同的手法使用该段内存
mem_huge_page_size: 每个大页的大小
mem_huge_pages_free: 池中尚未分配的 HugePages 数量
mem_huge_pages_total: 预留HugePages的总个数
mem_inactive: 空闲的内存数(包括free和avalible的内存)
mem_low_free: 未被使用的低位大小
mem_low_total: 低位内存总大小,低位可以达到高位内存一样的作用,而且它还能够被内核用来记录一些自己的数据结构
mem_mapped: 设备和文件等映射的大小
mem_page_tables: 管理内存分页页面的索引表的大小
mem_shared: 多个进程共享的内存总额
mem_slab: 内核数据结构缓存的大小,可以减少申请和释放内存带来的消耗
mem_sreclaimable: 可收回Slab的大小
mem_sunreclaim: 不可收回Slab的大小SUnreclaim+SReclaimableSlab
mem_swap_cached: 被高速缓冲存储器cache memory用的交换空间的大小已经被交换出来的内存但仍然被存放在swapfile中。用来在需要的时候很快的被替换而不需要再次打开I/O端口
mem_swap_free: 未被使用交换空间的大小
mem_swap_total: 交换空间的总大小
mem_total: 内存总数
mem_used: 已用内存数
mem_used_percent: 已用内存数百分比(0~100)
mem_vmalloc_chunk: 最大的连续未被使用的vmalloc区域
mem_vmalloc_totalL: 可以vmalloc虚拟内存大小
mem_vmalloc_used: vmalloc已使用的虚拟内存大小
mem_write_back: 正在被写回到磁盘的内存大小
mem_write_back_tmp: FUSE用于临时写回缓冲区的内存
mem_active: 活跃使用的内存总数(包括cache和buffer内存)
mem_available: 应用程序可用内存数
mem_available_percent: 内存剩余百分比(0~100)
mem_buffered: 用来给文件做缓冲大小
mem_cached: 被高速缓冲存储器cache memory用的内存的大小等于 diskcache minus SwapCache
mem_commit_limit: 根据超额分配比率('vm.overcommit_ratio'这是当前在系统上分配可用的内存总量这个限制只是在模式2('vm.overcommit_memory')时启用
mem_committed_as: 目前在系统上分配的内存量。是所有进程申请的内存的总和
mem_dirty: 等待被写回到磁盘的内存大小
mem_free: 空闲内存数
mem_high_free: 未被使用的高位内存大小
mem_high_total: 高位内存总大小Highmem是指所有内存高于860MB的物理内存,Highmem区域供用户程序使用或用于页面缓存。该区域不是直接映射到内核空间。内核必须使用不同的手法使用该段内存
mem_huge_page_size: 每个大页的大小
mem_huge_pages_free: 池中尚未分配的 HugePages 数量
mem_huge_pages_total: 预留HugePages的总个数
mem_inactive: 空闲的内存数(包括free和avalible的内存)
mem_low_free: 未被使用的低位大小
mem_low_total: 低位内存总大小,低位可以达到高位内存一样的作用,而且它还能够被内核用来记录一些自己的数据结构
mem_mapped: 设备和文件等映射的大小
mem_page_tables: 管理内存分页页面的索引表的大小
mem_shared: 多个进程共享的内存总额
mem_slab: 内核数据结构缓存的大小,可以减少申请和释放内存带来的消耗
mem_sreclaimable: 可收回Slab的大小
mem_sunreclaim: 不可收回Slab的大小SUnreclaim+SReclaimableSlab
mem_swap_cached: 被高速缓冲存储器cache memory用的交换空间的大小已经被交换出来的内存但仍然被存放在swapfile中。用来在需要的时候很快的被替换而不需要再次打开I/O端口
mem_swap_free: 未被使用交换空间的大小
mem_swap_total: 交换空间的总大小
mem_total: 内存总数
mem_used: 已用内存数
mem_used_percent: 已用内存数百分比(0~100)
mem_vmalloc_chunk: 最大的连续未被使用的vmalloc区域
mem_vmalloc_totalL: 可以vmalloc虚拟内存大小
mem_vmalloc_used: vmalloc已使用的虚拟内存大小
mem_write_back: 正在被写回到磁盘的内存大小
mem_write_back_tmp: FUSE用于临时写回缓冲区的内存
net_bytes_recv: 网卡收包总数(bytes)
net_bytes_sent: 网卡发包总数(bytes)
net_drop_in: 网卡收丢包数量
net_drop_out: 网卡发丢包数量
net_err_in: 网卡收包错误数量
net_err_out: 网卡发包错误数量
net_packets_recv: 网卡收包数量
net_packets_sent: 网卡发包数量
net_bytes_recv: 网卡收包总数(bytes)
net_bytes_sent: 网卡发包总数(bytes)
net_drop_in: 网卡收丢包数量
net_drop_out: 网卡发丢包数量
net_err_in: 网卡收包错误数量
net_err_out: 网卡发包错误数量
net_packets_recv: 网卡收包数量
net_packets_sent: 网卡发包数量
netstat_tcp_established: ESTABLISHED状态的网络链接数
netstat_tcp_fin_wait1: FIN_WAIT1状态的网络链接数
netstat_tcp_fin_wait2: FIN_WAIT2状态的网络链接数
netstat_tcp_last_ack: LAST_ACK状态的网络链接数
netstat_tcp_listen: LISTEN状态的网络链接数
netstat_tcp_syn_recv: SYN_RECV状态的网络链接数
netstat_tcp_syn_sent: SYN_SENT状态的网络链接数
netstat_tcp_time_wait: TIME_WAIT状态的网络链接数
netstat_udp_socket: UDP状态的网络链接数
netstat_tcp_established: ESTABLISHED状态的网络链接数
netstat_tcp_fin_wait1: FIN_WAIT1状态的网络链接数
netstat_tcp_fin_wait2: FIN_WAIT2状态的网络链接数
netstat_tcp_last_ack: LAST_ACK状态的网络链接数
netstat_tcp_listen: LISTEN状态的网络链接数
netstat_tcp_syn_recv: SYN_RECV状态的网络链接数
netstat_tcp_syn_sent: SYN_SENT状态的网络链接数
netstat_tcp_time_wait: TIME_WAIT状态的网络链接数
netstat_udp_socket: UDP状态的网络链接数
processes_blocked: 不可中断的睡眠状态下的进程数('U','D','L')
processes_dead: 回收中的进程数('X')
processes_idle: 挂起的空闲进程数('I')
processes_paging: 分页进程数('P')
processes_running: 运行中的进程数('R')
processes_sleeping: 可中断进程数('S')
processes_stopped: 暂停状态进程数('T')
processes_total: 总进程数
processes_total_threads: 总线程数
processes_unknown: 未知状态进程数
processes_zombies: 僵尸态进程数('Z')
#[ping]
ping_percent_packet_loss: ping数据包丢失百分比(%)
ping_result_code: ping返回码('0','1')
swap_used_percent: Swap空间换出数据量
processes_blocked: 不可中断的睡眠状态下的进程数('U','D','L')
processes_dead: 回收中的进程数('X')
processes_idle: 挂起的空闲进程数('I')
processes_paging: 分页进程数('P')
processes_running: 运行中的进程数('R')
processes_sleeping: 可中断进程数('S')
processes_stopped: 暂停状态进程数('T')
processes_total: 总进程数
processes_total_threads: 总线程数
processes_unknown: 未知状态进程数
processes_zombies: 僵尸态进程数('Z')
system_load1: 1分钟平均load值
system_load5: 5分钟平均load值
system_load15: 15分钟平均load值
system_n_users: 用户数
system_n_cpus: CPU核数
system_uptime: 系统启动时间
swap_used_percent: Swap空间换出数据量
nginx_accepts: 自nginx启动起,与客户端建立过得连接总数
nginx_active: 当前nginx正在处理的活动连接数,等于Reading/Writing/Waiting总和
nginx_handled: 自nginx启动起,处理过的客户端连接总数
nginx_reading: 正在读取HTTP请求头部的连接总数
nginx_requests: 自nginx启动起,处理过的客户端请求总数,由于存在HTTP Krrp-Alive请求,该值会大于handled值
nginx_upstream_check_fall: upstream_check模块检测到后端失败的次数
nginx_upstream_check_rise: upstream_check模块对后端的检测次数
nginx_upstream_check_status_code: 后端upstream的状态,up为1,down为0
nginx_waiting: 开启 keep-alive 的情况下,这个值等于 active (reading+writing), 意思就是 Nginx 已经处理完正在等候下一次请求指令的驻留连接
nginx_writing: 正在向客户端发送响应的连接总数
system_load1: 1分钟平均load值
system_load5: 5分钟平均load值
system_load15: 15分钟平均load值
system_n_users: 用户数
system_n_cpus: CPU核数
system_uptime: 系统启动时间
http_response_content_length: HTTP消息实体的传输长度
http_response_http_response_code: http响应状态码
http_response_response_time: http响应用时
http_response_result_code: url探测结果0为正常否则url无法访问
nginx_accepts: 自nginx启动起,与客户端建立过得连接总数
nginx_active: 当前nginx正在处理的活动连接数,等于Reading/Writing/Waiting总和
nginx_handled: 自nginx启动起,处理过的客户端连接总数
nginx_reading: 正在读取HTTP请求头部的连接总数
nginx_requests: 自nginx启动起,处理过的客户端请求总数,由于存在HTTP Krrp-Alive请求,该值会大于handled值
nginx_upstream_check_fall: upstream_check模块检测到后端失败的次数
nginx_upstream_check_rise: upstream_check模块对后端的检测次数
nginx_upstream_check_status_code: 后端upstream的状态,up为1,down为0
nginx_waiting: 开启 keep-alive 的情况下,这个值等于 active (reading+writing), 意思就是 Nginx 已经处理完正在等候下一次请求指令的驻留连接
nginx_writing: 正在向客户端发送响应的连接总数
http_response_content_length: HTTP消息实体的传输长度
http_response_http_response_code: http响应状态码
http_response_response_time: http响应用时
http_response_result_code: url探测结果0为正常否则url无法访问
# [aws cloudwatch rds]
cloudwatch_aws_rds_bin_log_disk_usage_average: rds 磁盘使用平均值
cloudwatch_aws_rds_bin_log_disk_usage_maximum: rds 磁盘使用量最大值
cloudwatch_aws_rds_bin_log_disk_usage_minimum: rds binlog 磁盘使用量最低
cloudwatch_aws_rds_bin_log_disk_usage_sample_count: rds binlog 磁盘使用情况样本计数
cloudwatch_aws_rds_bin_log_disk_usage_sum: rds binlog 磁盘使用总和
cloudwatch_aws_rds_burst_balance_average: rds 突发余额平均值
cloudwatch_aws_rds_burst_balance_maximum: rds 突发余额最大值
cloudwatch_aws_rds_burst_balance_minimum: rds 突发余额最低
cloudwatch_aws_rds_burst_balance_sample_count: rds 突发平衡样本计数
cloudwatch_aws_rds_burst_balance_sum: rds 突发余额总和
cloudwatch_aws_rds_cpu_utilization_average: rds cpu 利用率平均值
cloudwatch_aws_rds_cpu_utilization_maximum: rds cpu 利用率最大值
cloudwatch_aws_rds_cpu_utilization_minimum: rds cpu 利用率最低
cloudwatch_aws_rds_cpu_utilization_sample_count: rds cpu 利用率样本计数
cloudwatch_aws_rds_cpu_utilization_sum: rds cpu 利用率总和
cloudwatch_aws_rds_database_connections_average: rds 数据库连接平均值
cloudwatch_aws_rds_database_connections_maximum: rds 数据库连接数最大值
cloudwatch_aws_rds_database_connections_minimum: rds 数据库连接最小
cloudwatch_aws_rds_database_connections_sample_count: rds 数据库连接样本数
cloudwatch_aws_rds_database_connections_sum: rds 数据库连接总和
cloudwatch_aws_rds_db_load_average: rds db 平均负载
cloudwatch_aws_rds_db_load_cpu_average: rds db 负载 cpu 平均值
cloudwatch_aws_rds_db_load_cpu_maximum: rds db 负载 cpu 最大值
cloudwatch_aws_rds_db_load_cpu_minimum: rds db 负载 cpu 最小值
cloudwatch_aws_rds_db_load_cpu_sample_count: rds db 加载 CPU 样本数
cloudwatch_aws_rds_db_load_cpu_sum: rds db 加载cpu总和
cloudwatch_aws_rds_db_load_maximum: rds 数据库负载最大值
cloudwatch_aws_rds_db_load_minimum: rds 数据库负载最小值
cloudwatch_aws_rds_db_load_non_cpu_average: rds 加载非 CPU 平均值
cloudwatch_aws_rds_db_load_non_cpu_maximum: rds 加载非 cpu 最大值
cloudwatch_aws_rds_db_load_non_cpu_minimum: rds 加载非 cpu 最小值
cloudwatch_aws_rds_db_load_non_cpu_sample_count: rds 加载非 cpu 样本计数
cloudwatch_aws_rds_db_load_non_cpu_sum: rds 加载非cpu总和
cloudwatch_aws_rds_db_load_sample_count: rds db 加载样本计数
cloudwatch_aws_rds_db_load_sum: rds db 负载总和
cloudwatch_aws_rds_disk_queue_depth_average: rds 磁盘队列深度平均值
cloudwatch_aws_rds_disk_queue_depth_maximum: rds 磁盘队列深度最大值
cloudwatch_aws_rds_disk_queue_depth_minimum: rds 磁盘队列深度最小值
cloudwatch_aws_rds_disk_queue_depth_sample_count: rds 磁盘队列深度样本计数
cloudwatch_aws_rds_disk_queue_depth_sum: rds 磁盘队列深度总和
cloudwatch_aws_rds_ebs_byte_balance__average: rds ebs 字节余额平均值
cloudwatch_aws_rds_ebs_byte_balance__maximum: rds ebs 字节余额最大值
cloudwatch_aws_rds_ebs_byte_balance__minimum: rds ebs 字节余额最低
cloudwatch_aws_rds_ebs_byte_balance__sample_count: rds ebs 字节余额样本数
cloudwatch_aws_rds_ebs_byte_balance__sum: rds ebs 字节余额总和
cloudwatch_aws_rds_ebsio_balance__average: rds ebsio 余额平均值
cloudwatch_aws_rds_ebsio_balance__maximum: rds ebsio 余额最大值
cloudwatch_aws_rds_ebsio_balance__minimum: rds ebsio 余额最低
cloudwatch_aws_rds_ebsio_balance__sample_count: rds ebsio 平衡样本计数
cloudwatch_aws_rds_ebsio_balance__sum: rds ebsio 余额总和
cloudwatch_aws_rds_free_storage_space_average: rds 免费存储空间平均
cloudwatch_aws_rds_free_storage_space_maximum: rds 最大可用存储空间
cloudwatch_aws_rds_free_storage_space_minimum: rds 最低可用存储空间
cloudwatch_aws_rds_free_storage_space_sample_count: rds 可用存储空间样本数
cloudwatch_aws_rds_free_storage_space_sum: rds 免费存储空间总和
cloudwatch_aws_rds_freeable_memory_average: rds 可用内存平均值
cloudwatch_aws_rds_freeable_memory_maximum: rds 最大可用内存
cloudwatch_aws_rds_freeable_memory_minimum: rds 最小可用内存
cloudwatch_aws_rds_freeable_memory_sample_count: rds 可释放内存样本数
cloudwatch_aws_rds_freeable_memory_sum: rds 可释放内存总和
cloudwatch_aws_rds_lvm_read_iops_average: rds lvm 读取 iops 平均值
cloudwatch_aws_rds_lvm_read_iops_maximum: rds lvm 读取 iops 最大值
cloudwatch_aws_rds_lvm_read_iops_minimum: rds lvm 读取 iops 最低
cloudwatch_aws_rds_lvm_read_iops_sample_count: rds lvm 读取 iops 样本计数
cloudwatch_aws_rds_lvm_read_iops_sum: rds lvm 读取 iops 总和
cloudwatch_aws_rds_lvm_write_iops_average: rds lvm 写入 iops 平均值
cloudwatch_aws_rds_lvm_write_iops_maximum: rds lvm 写入 iops 最大值
cloudwatch_aws_rds_lvm_write_iops_minimum: rds lvm 写入 iops 最低
cloudwatch_aws_rds_lvm_write_iops_sample_count: rds lvm 写入 iops 样本计数
cloudwatch_aws_rds_lvm_write_iops_sum: rds lvm 写入 iops 总和
cloudwatch_aws_rds_network_receive_throughput_average: rds 网络接收吞吐量平均
cloudwatch_aws_rds_network_receive_throughput_maximum: rds 网络接收吞吐量最大值
cloudwatch_aws_rds_network_receive_throughput_minimum: rds 网络接收吞吐量最小值
cloudwatch_aws_rds_network_receive_throughput_sample_count: rds 网络接收吞吐量样本计数
cloudwatch_aws_rds_network_receive_throughput_sum: rds 网络接收吞吐量总和
cloudwatch_aws_rds_network_transmit_throughput_average: rds 网络传输吞吐量平均值
cloudwatch_aws_rds_network_transmit_throughput_maximum: rds 网络传输吞吐量最大
cloudwatch_aws_rds_network_transmit_throughput_minimum: rds 网络传输吞吐量最小值
cloudwatch_aws_rds_network_transmit_throughput_sample_count: rds 网络传输吞吐量样本计数
cloudwatch_aws_rds_network_transmit_throughput_sum: rds 网络传输吞吐量总和
cloudwatch_aws_rds_read_iops_average: rds 读取 iops 平均值
cloudwatch_aws_rds_read_iops_maximum: rds 最大读取 iops
cloudwatch_aws_rds_read_iops_minimum: rds 读取 iops 最低
cloudwatch_aws_rds_read_iops_sample_count: rds 读取 iops 样本计数
cloudwatch_aws_rds_read_iops_sum: rds 读取 iops 总和
cloudwatch_aws_rds_read_latency_average: rds 读取延迟平均值
cloudwatch_aws_rds_read_latency_maximum: rds 读取延迟最大值
cloudwatch_aws_rds_read_latency_minimum: rds 最小读取延迟
cloudwatch_aws_rds_read_latency_sample_count: rds 读取延迟样本计数
cloudwatch_aws_rds_read_latency_sum: rds 读取延迟总和
cloudwatch_aws_rds_read_throughput_average: rds 读取吞吐量平均值
cloudwatch_aws_rds_read_throughput_maximum: rds 最大读取吞吐量
cloudwatch_aws_rds_read_throughput_minimum: rds 最小读取吞吐量
cloudwatch_aws_rds_read_throughput_sample_count: rds 读取吞吐量样本计数
cloudwatch_aws_rds_read_throughput_sum: rds 读取吞吐量总和
cloudwatch_aws_rds_swap_usage_average: rds 交换使用平均值
cloudwatch_aws_rds_swap_usage_maximum: rds 交换使用最大值
cloudwatch_aws_rds_swap_usage_minimum: rds 交换使用量最低
cloudwatch_aws_rds_swap_usage_sample_count: rds 交换使用示例计数
cloudwatch_aws_rds_swap_usage_sum: rds 交换使用总和
cloudwatch_aws_rds_write_iops_average: rds 写入 iops 平均值
cloudwatch_aws_rds_write_iops_maximum: rds 写入 iops 最大值
cloudwatch_aws_rds_write_iops_minimum: rds 写入 iops 最低
cloudwatch_aws_rds_write_iops_sample_count: rds 写入 iops 样本计数
cloudwatch_aws_rds_write_iops_sum: rds 写入 iops 总和
cloudwatch_aws_rds_write_latency_average: rds 写入延迟平均值
cloudwatch_aws_rds_write_latency_maximum: rds 最大写入延迟
cloudwatch_aws_rds_write_latency_minimum: rds 写入延迟最小值
cloudwatch_aws_rds_write_latency_sample_count: rds 写入延迟样本计数
cloudwatch_aws_rds_write_latency_sum: rds 写入延迟总和
cloudwatch_aws_rds_write_throughput_average: rds 写入吞吐量平均值
cloudwatch_aws_rds_write_throughput_maximum: rds 最大写入吞吐量
cloudwatch_aws_rds_write_throughput_minimum: rds 写入吞吐量最小值
cloudwatch_aws_rds_write_throughput_sample_count: rds 写入吞吐量样本计数
cloudwatch_aws_rds_write_throughput_sum: rds 写入吞吐量总和
en:
cpu_usage_idle: "CPU idle rate(unit%)"
cpu_usage_active: "CPU usage rate(unit%)"
cpu_usage_system: "CPU kernel state time proportion(unit%)"
cpu_usage_user: "CPU user attitude time proportion(unit%)"
cpu_usage_nice: "The proportion of low priority CPU time, that is, the process NICE value is adjusted to the CPU time between 1-19. Note here that the value range of NICE is -20 to 19, the larger the value, the lower the priority, the lower the priority(unit%)"
cpu_usage_iowait: "CPU waiting for I/O time proportion(unit%)"
cpu_usage_irq: "CPU processing hard interrupt time proportion(unit%)"
cpu_usage_softirq: "CPU processing soft interrupt time proportion(unit%)"
cpu_usage_steal: "In the virtual machine environment, there is this indicator, which means that the CPU is used by other virtual machines for the proportion of time.(unit%)"
cpu_usage_guest: "The time to run other operating systems by virtualization, that is, the proportion of CPU time running the virtual machine(unit%)"
cpu_usage_guest_nice: "The proportion of time to run the virtual machine at low priority(unit%)"
disk_free: "The remaining amount of the hard disk partition (unit: byte)"
disk_used: "Hard disk partitional use (unit: byte)"
disk_used_percent: "Hard disk partitional use rate (unit:%)"
disk_total: "Total amount of hard disk partition (unit: byte)"
disk_inodes_free: "Hard disk partition INODE remaining amount"
disk_inodes_used: "Hard disk partition INODE usage amount"
disk_inodes_total: "The total amount of hard disk partition INODE"
diskio_io_time: "From the perspective of the device perspective, the total time of I/O request, the I/O request in the queue is count (unit: millisecond), the counter type, you need to use the function to find the value"
diskio_iops_in_progress: "IO requests that have been assigned to device -driven and have not yet been completed, not included in the queue but not yet assigned to the device -driven IO request, Gauge type"
diskio_merged_reads: "The number of times of adjacent reading request Merge, the counter type"
diskio_merged_writes: "The number of times the request Merge writes, the counter type"
diskio_read_bytes: "The number of byte reads, the counter type, you need to use the function to find the Rate to use the value"
diskio_read_time: "The total time of reading request (unit: millisecond), the counter type, you need to use the function to find the Rate to have the value of use"
diskio_reads: "Read the number of requests, the counter type, you need to use the function to find the Rate to use the value"
diskio_weighted_io_time: "From the perspective of the I/O request perspective, I/O wait for the total time. If there are multiple I/O requests at the same time, the time will be superimposed (unit: millisecond)"
diskio_write_bytes: "The number of bytes written, the counter type, you need to use the function to find the Rate to use the value"
diskio_write_time: "The total time of the request (unit: millisecond), the counter type, you need to use the function to find the rate to have the value of use"
diskio_writes: "Write the number of requests, the counter type, you need to use the function to find the rate to use value"
kernel_boot_time: "Kernel startup time"
kernel_context_switches: "Number of kernel context switching times"
kernel_entropy_avail: "Entropy pool inside the Linux system"
kernel_interrupts: "Number of kernel interruption"
kernel_processes_forked: "ForK's process number"
mem_active: "The total number of memory (including Cache and BUFFER memory)"
mem_available: "Application can use memory numbers"
mem_available_percent: "Memory remaining percentage (0 ~ 100)"
mem_buffered: "Used to make buffer size for the file"
mem_cached: "The size of the memory used by the cache memory (equal to diskcache minus Swap Cache )"
mem_commit_limit: "According to the over allocation ratio ('vm.overCommit _ Ratio'), this is the current total memory that can be allocated on the system."
mem_committed_as: "Currently allocated on the system. It is the sum of the memory of all process applications"
mem_dirty: "Waiting to be written back to the memory size of the disk"
mem_free: "Senior memory number"
mem_high_free: "Unused high memory size"
mem_high_total: "The total memory size of the high memory (Highmem refers to all the physical memory that is higher than 860 MB of memory, the HighMem area is used for user programs, or for page cache. This area is not directly mapped to the kernel space. The kernels must use different methods to use this section of memory. )"
mem_huge_page_size: "The size of each big page"
mem_huge_pages_free: "The number of Huge Pages in the pool that have not been allocated"
mem_huge_pages_total: "Reserve the total number of Huge Pages"
mem_inactive: "Free memory (including the memory of free and avalible)"
mem_low_free: "Unused low size"
mem_low_total: "The total size of the low memory memory can achieve the same role of high memory, and it can be used by the kernel to record some of its own data structure"
mem_mapped: "The size of the mapping of equipment and files"
mem_page_tables: "The size of the index table of the management of the memory paging page"
mem_shared: "The total memory shared by multiple processes"
mem_slab: "The size of the kernel data structure cache can reduce the consumption of application and release memory"
mem_sreclaimable: "The size of the SLAB can be recovered"
mem_sunreclaim: "The size of the SLAB cannot be recovered(SUnreclaim+SReclaimableSlab)"
mem_swap_cached: "The size of the swap space used by the cache memory (cache memory), the memory that has been swapped out, but is still stored in the swapfile. Used to be quickly replaced when needed without opening the I/O port again"
mem_swap_free: "The size of the switching space is not used"
mem_swap_total: "The total size of the exchange space"
mem_total: "Total memory"
mem_used: "Memory number"
mem_used_percent: "The memory has been used by several percentage (0 ~ 100)"
mem_vmalloc_chunk: "The largest continuous unused vmalloc area"
mem_vmalloc_totalL: "You can vmalloc virtual memory size"
mem_vmalloc_used: "Vmalloc's virtual memory size"
mem_write_back: "The memory size of the disk is being written back to the disk"
mem_write_back_tmp: "Fuse is used to temporarily write back the memory of the buffer area"
net_bytes_recv: "The total number of packaging of the network card (bytes)"
net_bytes_sent: "Total number of network cards (bytes)"
net_drop_in: "The number of packets for network cards"
net_drop_out: "The number of packets issued by the network card"
net_err_in: "The number of incorrect packets of the network card"
net_err_out: "Number of incorrect number of network cards"
net_packets_recv: "Net card collection quantity"
net_packets_sent: "Number of network card issuance"
netstat_tcp_established: "ESTABLISHED status network link number"
netstat_tcp_fin_wait1: "FIN _ WAIT1 status network link number"
netstat_tcp_fin_wait2: "FIN _ WAIT2 status number of network links"
netstat_tcp_last_ack: "LAST_ ACK status number of network links"
netstat_tcp_listen: "Number of network links in Listen status"
netstat_tcp_syn_recv: "SYN _ RECV status number of network links"
netstat_tcp_syn_sent: "SYN _ SENT status number of network links"
netstat_tcp_time_wait: "Time _ WAIT status network link number"
netstat_udp_socket: "Number of network links in UDP status"
processes_blocked: "The number of processes in the unreprudible sleep state('U','D','L')"
processes_dead: "Number of processes in recycling('X')"
processes_idle: "Number of idle processes hanging('I')"
processes_paging: "Number of paging processes('P')"
processes_running: "Number of processes during operation('R')"
processes_sleeping: "Can interrupt the number of processes('S')"
processes_stopped: "Pushing status process number('T')"
processes_total: "Total process number"
processes_total_threads: "Number of threads"
processes_unknown: "Unknown status process number"
processes_zombies: "Number of zombies('Z')"
swap_used_percent: "SWAP space replace the data volume"
system_load1: "1 minute average load value"
system_load5: "5 minutes average load value"
system_load15: "15 minutes average load value"
system_n_users: "User number"
system_n_cpus: "CPU nuclear number"
system_uptime: "System startup time"
nginx_accepts: "Since Nginx started, the total number of connections has been established with the client"
nginx_active: "The current number of activity connections that Nginx is being processed is equal to Reading/Writing/Waiting"
nginx_handled: "Starting from Nginx, the total number of client connections that have been processed"
nginx_reading: "Reading the total number of connections on the http request header"
nginx_requests: "Since nginx is started, the total number of client requests processed, due to the existence of HTTP Krrp - Alive requests, this value will be greater than the handled value"
nginx_upstream_check_fall: "UPStream_CHECK module detects the number of back -end failures"
nginx_upstream_check_rise: "UPSTREAM _ Check module to detect the number of back -end"
nginx_upstream_check_status_code: "The state of the backstream is 1, and the down is 0"
nginx_waiting: "When keep-alive is enabled, this value is equal to active (reading+writing), which means that Nginx has processed the resident connection that is waiting for the next request command"
nginx_writing: "The total number of connections to send a response to the client"
http_response_content_length: "HTTP message entity transmission length"
http_response_http_response_code: "http response status code"
http_response_response_time: "When http ring application"
http_response_result_code: "URL detection result 0 is normal, otherwise the URL cannot be accessed"
# [mysqld_exporter]
mysql_global_status_uptime: The number of seconds that the server has been up.(Gauge)
@ -237,7 +489,7 @@ redis_last_key_groups_scrape_duration_milliseconds: Duration of the last key gro
redis_last_slow_execution_duration_seconds: The amount of time needed for last slow execution, in seconds.
redis_latest_fork_seconds: The amount of time needed for last fork, in seconds.
redis_lazyfree_pending_objects: The number of objects waiting to be freed (as a result of calling UNLINK, or FLUSHDB and FLUSHALL with the ASYNC option).
redis_master_repl_offset: The server's current replication offset.
redis_master_repl_offset: The server's current replication offset.
redis_mem_clients_normal: Memory used by normal clients.(Gauge)
redis_mem_clients_slaves: Memory used by replica clients - Starting Redis 7.0, replica buffers share memory with the replication backlog, so this field can show 0 when replicas don't trigger an increase of memory usage.
redis_mem_fragmentation_bytes: Delta between used_memory_rss and used_memory. Note that when the total fragmentation bytes is low (few megabytes), a high ratio (e.g. 1.5 and above) is not an indication of an issue.
@ -370,8 +622,6 @@ node_load15: cpu load 15m
# MEM
# 内核态
# 用户追踪已从交换区获取但尚未修改的页面的内存
node_memory_SwapCached_bytes: Memory that keeps track of pages that have been fetched from swap but not yet been modified
# 内核用于缓存数据结构供自己使用的内存
node_memory_Slab_bytes: Memory used by the kernel to cache data structures for its own use
# slab中可回收的部分
@ -433,7 +683,7 @@ node_memory_SwapTotal_bytes: Memory information field SwapTotal_bytes
node_memory_SwapFree_bytes: Memory information field SwapFree_bytes
# DISK
node_filesystem_files_free: Filesystem space available to non-root users in byte
node_filesystem_avail_bytes: Filesystem space available to non-root users in byte
node_filesystem_free_bytes: Filesystem free space in bytes
node_filesystem_size_bytes: Filesystem size in bytes
node_filesystem_files_free: Filesystem total free file nodes
@ -479,7 +729,7 @@ kafka_consumer_lag_millis: Current approximation of consumer lag for a ConsumerG
kafka_topic_partition_under_replicated_partition: 1 if Topic/Partition is under Replicated
# [zookeeper_exporter]
zk_znode_count: The total count of znodes stored
zk_znode_count: The total count of znodes stored
zk_ephemerals_count: The number of Ephemerals nodes
zk_watch_count: The number of watchers setup over Zookeeper nodes.
zk_approximate_data_size: Size of data in bytes that a zookeeper server has in its data tree
@ -491,4 +741,4 @@ zk_open_file_descriptor_count: Number of file descriptors that a zookeeper serve
zk_max_file_descriptor_count: Maximum number of file descriptors that a zookeeper server can open
zk_avg_latency: Average time in milliseconds for requests to be processed
zk_min_latency: Maximum time in milliseconds for a request to be processed
zk_max_latency: Minimum time in milliseconds for a request to be processed
zk_max_latency: Minimum time in milliseconds for a request to be processed

View File

@ -24,6 +24,11 @@ class Sender(object):
# already done in go code
pass
@classmethod
def send_mm(cls, payload):
# already done in go code
pass
@classmethod
def send_sms(cls, payload):
users = payload.get('event').get("notify_users_obj")

View File

@ -7,12 +7,6 @@ import (
"github.com/tidwall/gjson"
)
// the caller can be called for alerting notify by complete this interface
type inter interface {
Descript() string
Notify([]byte)
}
// N9E complete
type N9EPlugin struct {
Name string
@ -37,9 +31,16 @@ func (n *N9EPlugin) Notify(bs []byte) {
}
}
func (n *N9EPlugin) NotifyMaintainer(bs []byte) {
fmt.Println("do something... begin")
result := string(bs)
fmt.Println(result)
fmt.Println("do something... end")
}
// will be loaded for alertingCall , The first letter must be capitalized to be exported
var N9eCaller = N9EPlugin{
Name: "n9e",
Description: "演示告警通过动态链接库方式通知",
Name: "N9EPlugin",
Description: "Notify by lib",
BuildAt: time.Now().Local().Format("2006/01/02 15:04:05"),
}

View File

@ -0,0 +1,193 @@
import json
import yaml
'''
将promtheus/vmalert的rule转换为n9e中的rule
支持k8s的rule configmap
'''
rule_file = 'rules.yaml'
def convert_interval(interval):
if interval.endswith('s') or interval.endswith('S'):
return int(interval[:-1])
if interval.endswith('m') or interval.endswith('M'):
return int(interval[:-1]) * 60
if interval.endswith('h') or interval.endswith('H'):
return int(interval[:-1]) * 60 * 60
if interval.endswith('d') or interval.endswith('D'):
return int(interval[:-1]) * 60 * 60 * 24
return int(interval)
def convert_alert(rule, interval):
name = rule['alert']
prom_ql = rule['expr']
if 'for' in rule:
prom_for_duration = convert_interval(rule['for'])
else:
prom_for_duration = 0
prom_eval_interval = convert_interval(interval)
note = ''
if 'annotations' in rule:
for v in rule['annotations'].values():
note = v
break
append_tags = []
severity = 2
if 'labels' in rule:
for k, v in rule['labels'].items():
if k != 'severity':
append_tags.append('{}={}'.format(k, v))
continue
if v == 'critical':
severity = 1
elif v == 'info':
severity = 3
# elif v == 'warning':
# severity = 2
n9e_alert_rule = {
"name": name,
"note": note,
"severity": severity,
"disabled": 0,
"prom_for_duration": prom_for_duration,
"prom_ql": prom_ql,
"prom_eval_interval": prom_eval_interval,
"enable_stime": "00:00",
"enable_etime": "23:59",
"enable_days_of_week": [
"1",
"2",
"3",
"4",
"5",
"6",
"0"
],
"enable_in_bg": 0,
"notify_recovered": 1,
"notify_channels": [],
"notify_repeat_step": 60,
"recover_duration": 0,
"callbacks": [],
"runbook_url": "",
"append_tags": append_tags
}
return n9e_alert_rule
def convert_record(rule, interval):
name = rule['record']
prom_ql = rule['expr']
prom_eval_interval = convert_interval(interval)
note = ''
append_tags = []
if 'labels' in rule:
for k, v in rule['labels'].items():
append_tags.append('{}={}'.format(k, v))
n9e_record_rule = {
"name": name,
"note": note,
"disabled": 0,
"prom_ql": prom_ql,
"prom_eval_interval": prom_eval_interval,
"append_tags": append_tags
}
return n9e_record_rule
'''
example of rule group file
---
groups:
- name: example
rules:
- alert: HighRequestLatency
expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
for: 10m
labels:
severity: page
annotations:
summary: High request latency
'''
def deal_group(group):
"""
parse single prometheus/vmalert rule group
"""
alert_rules = []
record_rules = []
for rule_segment in group['groups']:
if 'interval' in rule_segment:
interval = rule_segment['interval']
else:
interval = '15s'
for rule in rule_segment['rules']:
if 'alert' in rule:
alert_rules.append(convert_alert(rule, interval))
else:
record_rules.append(convert_record(rule, interval))
return alert_rules, record_rules
'''
example of k8s rule configmap
---
apiVersion: v1
kind: ConfigMap
metadata:
name: rulefiles-0
data:
etcdrules.yaml: |
groups:
- name: etcd
rules:
- alert: etcdInsufficientMembers
annotations:
message: 'etcd cluster "{{ $labels.job }}": insufficient members ({{ $value}}).'
expr: sum(up{job=~".*etcd.*"} == bool 1) by (job) < ((count(up{job=~".*etcd.*"})
by (job) + 1) / 2)
for: 3m
labels:
severity: critical
'''
def deal_configmap(rule_configmap):
"""
parse rule configmap from k8s
"""
all_record_rules = []
all_alert_rules = []
for _, rule_group_str in rule_configmap['data'].items():
rule_group = yaml.load(rule_group_str, Loader=yaml.FullLoader)
alert_rules, record_rules = deal_group(rule_group)
all_alert_rules.extend(alert_rules)
all_record_rules.extend(record_rules)
return all_alert_rules, all_record_rules
def main():
with open(rule_file, 'r') as f:
rule_config = yaml.load(f, Loader=yaml.FullLoader)
# 如果文件是k8s中的configmap,使用下面的方法
# alert_rules, record_rules = deal_configmap(rule_config)
alert_rules, record_rules = deal_group(rule_config)
with open("alert-rules.json", 'w') as fw:
json.dump(alert_rules, fw, indent=2, ensure_ascii=False)
with open("record-rules.json", 'w') as fw:
json.dump(record_rules, fw, indent=2, ensure_ascii=False)
if __name__ == '__main__':
main()

View File

@ -9,10 +9,15 @@ ClusterName = "Default"
BusiGroupLabelKey = "busigroup"
# sleep x seconds, then start judge engine
EngineDelay = 120
EngineDelay = 60
DisableUsageReport = false
# config | database
ReaderFrom = "config"
ForceUseServerTS = true
[Log]
# log write dir
Dir = "logs"
@ -70,10 +75,12 @@ InsecureSkipVerify = true
Batch = 5
[Alerting]
# timeout settings, unit: ms, default: 30000ms
Timeout=30000
TemplatesDir = "./etc/template"
NotifyConcurrency = 10
# use builtin go code notify
NotifyBuiltinChannels = ["email", "dingtalk", "wecom", "feishu"]
NotifyBuiltinChannels = ["email", "dingtalk", "wecom", "feishu", "mm"]
[Alerting.CallScript]
# built in sending capability in go code
@ -104,7 +111,7 @@ Headers = ["Content-Type", "application/json", "X-From", "N9E"]
[NoData]
Metric = "target_up"
# unit: second
Interval = 15
Interval = 120
[Ibex]
# callback: ${ibex}/${tplid}/${host}
@ -155,15 +162,8 @@ BasicAuthUser = ""
BasicAuthPass = ""
# timeout settings, unit: ms
Timeout = 30000
DialTimeout = 10000
TLSHandshakeTimeout = 30000
ExpectContinueTimeout = 1000
IdleConnTimeout = 90000
# time duration, unit: ms
KeepAlive = 30000
MaxConnsPerHost = 0
MaxIdleConns = 100
MaxIdleConnsPerHost = 10
DialTimeout = 3000
MaxIdleConnsPerHost = 100
[WriterOpt]
# queue channel count
@ -180,8 +180,9 @@ BasicAuthUser = ""
# Basic auth password
BasicAuthPass = ""
# timeout settings, unit: ms
Timeout = 30000
DialTimeout = 10000
Headers = ["X-From", "n9e"]
Timeout = 10000
DialTimeout = 3000
TLSHandshakeTimeout = 30000
ExpectContinueTimeout = 1000
IdleConnTimeout = 90000
@ -190,6 +191,12 @@ KeepAlive = 30000
MaxConnsPerHost = 0
MaxIdleConns = 100
MaxIdleConnsPerHost = 100
# [[Writers.WriteRelabels]]
# Action = "replace"
# SourceLabels = ["__address__"]
# Regex = "([^:]+)(?::\\d+)?"
# Replacement = "$1:80"
# TargetLabel = "__address__"
# [[Writers]]
# Url = "http://127.0.0.1:7201/api/v1/prom/remote/write"

26
etc/template/README.md Normal file
View File

@ -0,0 +1,26 @@
# 告警消息模版文件
模版中可以使用的变量参考`AlertCurEvent`对象
模版语法如何使用可以参考[html/template](https://pkg.go.dev/html/template)
## 如何在告警模版中添加监控详情url
假设web的地址是http://127.0.0.1:18000/, 实际使用时用web地址替换该地址
在监控模版中添加以下行:
* dingtalk / wecom / feishu
```markdown
[监控详情](http://127.0.0.1:18000/metric/explorer?promql={{ .PromQl | escape }})
```
* mailbody
```html
<tr>
<th>监控详情:</th>
<td>
<a href="http://127.0.0.1:18000/metric/explorer?promql={{ .PromQl | escape }}" target="_blank">点击查看</a>
</td>
</tr>
```

7
etc/template/mm.tpl Normal file
View File

@ -0,0 +1,7 @@
级别状态: S{{.Severity}} {{if .IsRecovered}}Recovered{{else}}Triggered{{end}}
规则名称: {{.RuleName}}{{if .RuleNote}}
规则备注: {{.RuleNote}}{{end}}
监控指标: {{.TagsJSON}}
{{if .IsRecovered}}恢复时间:{{timeformat .LastEvalTime}}{{else}}触发时间: {{timeformat .TriggerTime}}
触发时值: {{.TriggerValue}}{{end}}
发送时间: {{timestamp}}

View File

@ -1,7 +1,9 @@
**级别状态**: {{if .IsRecovered}}<font color="info">S{{.Severity}} Recovered</font>{{else}}<font color="warning">S{{.Severity}} Triggered</font>{{end}}
**规则标题**: {{.RuleName}}{{if .RuleNote}}
**规则备注**: {{.RuleNote}}{{end}}
**监控指标**: {{.TagsJSON}}
{{if .IsRecovered}}**恢复时间**{{timeformat .LastEvalTime}}{{else}}**触发时间**: {{timeformat .TriggerTime}}
**规则备注**: {{.RuleNote}}{{end}}{{if .TargetIdent}}
**监控对象**: {{.TargetIdent}}{{end}}
**监控指标**: {{.TagsJSON}}{{if not .IsRecovered}}
**触发时值**: {{.TriggerValue}}{{end}}
{{if .IsRecovered}}**恢复时间**: {{timeformat .LastEvalTime}}{{else}}**首次触发时间**: {{timeformat .FirstTriggerTime}}{{end}}
{{$time_duration := sub now.Unix .FirstTriggerTime }}{{if .IsRecovered}}{{$time_duration = sub .LastEvalTime .FirstTriggerTime }}{{end}}**持续时长**: {{humanizeDurationInterface $time_duration}}
**发送时间**: {{timestamp}}

View File

@ -4,6 +4,9 @@ RunMode = "release"
# # custom i18n dict config
# I18N = "./etc/i18n.json"
# # custom i18n request header key
# I18NHeaderKey = "X-Language"
# metrics descriptions
MetricsYamlFile = "./etc/metrics.yaml"
@ -36,6 +39,11 @@ Label = "飞书机器人"
# do not change Key
Key = "feishu"
[[NotifyChannels]]
Label = "mm bot"
# do not change Key
Key = "mm"
[[ContactKeys]]
Label = "Wecom Robot Token"
# do not change Key
@ -51,6 +59,11 @@ Label = "Feishu Robot Token"
# do not change Key
Key = "feishu_robot_token"
[[ContactKeys]]
Label = "MatterMost Webhook URL"
# do not change Key
Key = "mm_webhook_url"
[Log]
# log write dir
Dir = "logs"
@ -98,6 +111,13 @@ AccessExpired = 1500
RefreshExpired = 10080
RedisKeyPrefix = "/jwt/"
[ProxyAuth]
# if proxy auth enabled, jwt auth is disabled
Enable = false
# username key in http proxy header
HeaderUserNameKey = "X-User-Name"
DefaultRoles = ["Standard"]
[BasicAuth]
user001 = "ccc26da7b9aba533cbb263a36c07dcc5"
@ -129,6 +149,7 @@ Email = "mail"
[OIDC]
Enable = false
DiaplayName = "OIDC登录"
RedirectURL = "http://n9e.com/callback"
SsoAddr = "http://sso.example.org"
ClientId = ""
@ -141,6 +162,52 @@ Nickname = "nickname"
Phone = "phone_number"
Email = "email"
[CAS]
Enable = false
DiaplayName = "CAS登录"
SsoAddr = "https://cas.example.com/cas/"
RedirectURL = "http://127.0.0.1:18000/callback/cas"
CoverAttributes = false
# cas user default roles
DefaultRoles = ["Standard"]
[CAS.Attributes]
Nickname = "nickname"
Phone = "phone_number"
Email = "email"
[OAuth]
Enable = false
DisplayName = "OAuth2登录"
RedirectURL = "http://127.0.0.1:18000/callback/oauth"
SsoAddr = "https://sso.example.com/oauth2/authorize"
TokenAddr = "https://sso.example.com/oauth2/token"
UserInfoAddr = "https://api.example.com/api/v1/user/info"
ClientId = ""
ClientSecret = ""
CoverAttributes = true
DefaultRoles = ["Standard"]
UserinfoIsArray = false
UserinfoPrefix = "data"
Scopes = ["profile", "email", "phone"]
[OAuth.Attributes]
# Username must be defined
Username = "username"
Nickname = "nickname"
Phone = "phone_number"
Email = "email"
# example
# # nested : UserinfoIsArray=false, UserinfoPrefix="data"
# # {"data":{"username":"123456","nickname":"姓名"},"code":0,"message":"ok"}
# # nested and array : UserinfoIsArray=true, UserinfoPrefix="data"
# # {"data":[{"username":"123456","nickname":"姓名"}],"code":0,"message":"ok"}
# # flat : UserinfoIsArray=false, UserinfoPrefix=""
# # {"username":"123456","nickname":"姓名"}
# # flat and array : UserinfoIsArray=true, UserinfoPrefix=""
# # [{"username":"123456","nickname":"姓名"}]
[Redis]
# address, ip:port or ip1:port,ip2:port for cluster and sentinel(SentinelAddrs)
Address = "127.0.0.1:6379"
@ -184,6 +251,7 @@ BasicAuthPass = ""
Timeout = 30000
DialTimeout = 3000
MaxIdleConnsPerHost = 100
Headers = ["X-From", "n9e"]
[Ibex]
Address = "http://127.0.0.1:10090"
@ -191,4 +259,10 @@ Address = "http://127.0.0.1:10090"
BasicAuthUser = "ibex"
BasicAuthPass = "ibex"
# unit: ms
Timeout = 3000
Timeout = 3000
[TargetMetrics]
TargetUp = '''max(max_over_time(target_up{ident=~"(%s)"}[%dm])) by (ident)'''
LoadPerCore = '''max(max_over_time(system_load_norm_1{ident=~"(%s)"}[%dm])) by (ident)'''
MemUtil = '''100-max(max_over_time(mem_available_percent{ident=~"(%s)"}[%dm])) by (ident)'''
DiskUtil = '''max(max_over_time(disk_used_percent{ident=~"(%s)", path="/"}[%dm])) by (ident)'''

8
fe.sh Executable file
View File

@ -0,0 +1,8 @@
#!/bin/bash
TAG=$(curl -sX GET https://api.github.com/repos/n9e/fe-v5/releases/latest | awk '/tag_name/{print $4;exit}' FS='[""]')
VERSION=$(echo $TAG | sed 's/v//g')
curl -o n9e-fe-${VERSION}.tar.gz -L https://github.com/n9e/fe-v5/releases/download/${TAG}/n9e-fe-${VERSION}.tar.gz
tar zxvf n9e-fe-${VERSION}.tar.gz

76
go.mod
View File

@ -1,16 +1,14 @@
module github.com/didi/nightingale/v5
go 1.14
go 1.18
require (
github.com/coreos/go-oidc v2.2.1+incompatible
github.com/dgrijalva/jwt-go v3.2.0+incompatible
github.com/fatih/camelcase v1.0.0 // indirect
github.com/fatih/structs v1.1.0 // indirect
github.com/gin-contrib/pprof v1.3.0
github.com/gin-gonic/gin v1.7.4
github.com/go-ldap/ldap/v3 v3.4.1
github.com/go-redis/redis/v8 v8.11.3
github.com/go-redis/redis/v9 v9.0.0-rc.1
github.com/gogo/protobuf v1.3.2
github.com/golang-jwt/jwt v3.2.2+incompatible
github.com/golang/protobuf v1.5.2
@ -18,25 +16,73 @@ require (
github.com/google/uuid v1.3.0
github.com/json-iterator/go v1.1.12
github.com/koding/multiconfig v0.0.0-20171124222453-69c27309b2d7
github.com/mailru/easyjson v0.7.7
github.com/mattn/go-isatty v0.0.12
github.com/orcaman/concurrent-map v0.0.0-20210501183033-44dafcb38ecc
github.com/pkg/errors v0.9.1
github.com/pquerna/cachecontrol v0.1.0 // indirect
github.com/prometheus/client_golang v1.11.0
github.com/prometheus/common v0.26.0
github.com/prometheus/client_golang v1.12.2
github.com/prometheus/common v0.32.1
github.com/prometheus/prometheus v2.5.0+incompatible
github.com/tidwall/gjson v1.14.0
github.com/toolkits/pkg v1.2.9
github.com/toolkits/pkg v1.3.1-0.20220824084030-9f9f830a05d5
github.com/urfave/cli/v2 v2.3.0
golang.org/x/net v0.0.0-20210805182204-aaa1db679c0d // indirect
golang.org/x/oauth2 v0.0.0-20200107190931-bf48bf16ab8d
golang.org/x/sys v0.0.0-20210809222454-d867a43fc93e // indirect
google.golang.org/genproto v0.0.0-20211007155348-82e027067bd4 // indirect
google.golang.org/grpc v1.41.0 // indirect
gopkg.in/alexcesaro/quotedprintable.v3 v3.0.0-20150716171945-2caba252f4dc // indirect
golang.org/x/oauth2 v0.0.0-20210514164344-f6687ab2804c
gopkg.in/gomail.v2 v2.0.0-20160411212932-81ebce5c23df
gopkg.in/square/go-jose.v2 v2.6.0 // indirect
gorm.io/driver/mysql v1.1.2
gorm.io/driver/postgres v1.1.1
gorm.io/gorm v1.21.15
)
require (
github.com/Azure/go-ntlmssp v0.0.0-20200615164410-66371956d46c // indirect
github.com/BurntSushi/toml v0.3.1 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/cespare/xxhash/v2 v2.1.2 // indirect
github.com/cpuguy83/go-md2man/v2 v2.0.0-20190314233015-f79a8a8ca69d // indirect
github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f // indirect
github.com/fatih/camelcase v1.0.0 // indirect
github.com/fatih/structs v1.1.0 // indirect
github.com/gin-contrib/sse v0.1.0 // indirect
github.com/go-asn1-ber/asn1-ber v1.5.1 // indirect
github.com/go-playground/locales v0.13.0 // indirect
github.com/go-playground/universal-translator v0.17.0 // indirect
github.com/go-playground/validator/v10 v10.4.1 // indirect
github.com/go-sql-driver/mysql v1.6.0 // indirect
github.com/grpc-ecosystem/grpc-gateway v1.16.0 // indirect
github.com/jackc/chunkreader/v2 v2.0.1 // indirect
github.com/jackc/pgconn v1.10.0 // indirect
github.com/jackc/pgio v1.0.0 // indirect
github.com/jackc/pgpassfile v1.0.0 // indirect
github.com/jackc/pgproto3/v2 v2.1.1 // indirect
github.com/jackc/pgservicefile v0.0.0-20200714003250-2b9c44734f2b // indirect
github.com/jackc/pgtype v1.8.1 // indirect
github.com/jackc/pgx/v4 v4.13.0 // indirect
github.com/jinzhu/inflection v1.0.0 // indirect
github.com/jinzhu/now v1.1.2 // indirect
github.com/josharian/intern v1.0.0 // indirect
github.com/leodido/go-urn v1.2.0 // indirect
github.com/matttproud/golang_protobuf_extensions v1.0.1 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/pquerna/cachecontrol v0.1.0 // indirect
github.com/prometheus/client_model v0.2.0 // indirect
github.com/prometheus/procfs v0.7.3 // indirect
github.com/russross/blackfriday/v2 v2.0.1 // indirect
github.com/shurcooL/sanitized_anchor_name v1.0.0 // indirect
github.com/spaolacci/murmur3 v1.1.0 // indirect
github.com/tidwall/match v1.1.1 // indirect
github.com/tidwall/pretty v1.2.0 // indirect
github.com/ugorji/go/codec v1.1.7 // indirect
go.uber.org/automaxprocs v1.4.0 // indirect
golang.org/x/crypto v0.0.0-20210817164053-32db794688a5 // indirect
golang.org/x/net v0.0.0-20220722155237-a158d28d115b // indirect
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f // indirect
golang.org/x/text v0.3.7 // indirect
google.golang.org/appengine v1.6.6 // indirect
google.golang.org/genproto v0.0.0-20211007155348-82e027067bd4 // indirect
google.golang.org/grpc v1.41.0 // indirect
google.golang.org/protobuf v1.27.1 // indirect
gopkg.in/alexcesaro/quotedprintable.v3 v3.0.0-20150716171945-2caba252f4dc // indirect
gopkg.in/square/go-jose.v2 v2.6.0 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
)

Some files were not shown because too many files have changed in this diff Show More