Commit Graph

728 Commits

Author SHA1 Message Date
Ulric Qin 072ab98fcf use ForwardDuration in goroutine 2022-07-08 12:53:32 +08:00
Ulric Qin 35ef6b9265 duplicate label key checker 2022-07-08 12:02:57 +08:00
Ulric Qin eaa53f2533 check duplicate label key 2022-07-08 11:48:44 +08:00
Ulric Qin 796a7014a1 use goroutine to forward data 2022-07-08 09:48:08 +08:00
Yening Qin 315e0ef903
fix: get clusters by api (#1030) 2022-07-07 12:29:35 +08:00
Ulric Qin 98d5dfff8e add namespace and subsystem prefix for metrics 2022-07-07 12:23:06 +08:00
Ulric Qin 6b4705608b add forward stat 2022-07-07 12:13:45 +08:00
Ulric Qin 5907817cba n9e-server: add http request stat 2022-07-07 10:52:04 +08:00
Ulric Qin aa97ac54d1 register GaugeSampleQueueSize 2022-07-07 10:17:15 +08:00
Ulric Qin 8fe548aba9 rename mapkey alertname to rulename 2022-07-07 10:06:34 +08:00
Tripitakav 18a9288b75
fix mute bug (#1025)
Co-authored-by: tripitakav <chengzhi.shang@longbridge.sg>
2022-07-07 10:05:39 +08:00
ulricqin fe82886f09
report sample queue size (#1027)
* report sample queue size

* report sample channel size
2022-07-07 10:00:08 +08:00
ning 56b61909a3 fix: event service api 2022-07-07 09:44:26 +08:00
ulricqin 2ef541cdd7
refactor recording rule and and field disabled (#1022) 2022-07-06 17:21:14 +08:00
Tripitakav 1304a4630b
Add recording rule (#1015)
* add prometheus recording rules

* fix recording rule sql

* add record rule note

* fix copy error

* add some regx

Co-authored-by: 尚承志 <chengzhi.shang@longbridge.sg>
2022-07-06 15:58:08 +08:00
xtan a9288e376d
feat: persist notify cur number (#1013)
Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-07-05 16:42:20 +08:00
Ulric Qin 2a2a96d9fc add contains funcmap 2022-07-04 20:03:11 +08:00
Henry Chia 90dacd0085
fix typo (#1004)
* Fix spelling error

Fix spelling error
exsits -> exists

* Update router_login.go
2022-06-29 19:08:58 +08:00
ning 540ef68dc8 fix: alert mute add by service 2022-06-29 11:11:12 +08:00
zheng 54cc981956
fix ForDuration (#999) 2022-06-28 16:13:23 +08:00
chenxuan f9af916352
fix alert put api not verify bug (#987) 2022-06-20 11:50:14 +08:00
xtan 90db12b513
Fix:fix target_up nodata judge for prometheus scrape (#986) 2022-06-17 22:44:25 +08:00
Ulric Qin 7d326ef306 use metrics as hash key 2022-06-17 09:56:10 +08:00
Ulric Qin d0b005fb14 code refactor: set createBy when update metric_view 2022-06-16 13:17:58 +08:00
Ulric Qin 63adcc2cd9 bugfix for alert-aggr-views 2022-06-15 14:01:01 +08:00
Ulric Qin 94e1359895 fix handler: NotifyMaxNumber 2022-06-10 17:49:48 +08:00
Ulric Qin 1bcc5b77ec remote write and read: support header 2022-06-10 17:37:33 +08:00
Ulric Qin ae622e0c08 fix 2022-06-10 16:36:47 +08:00
Ulric Qin c951f7d822 support max notify number 2022-06-10 16:26:53 +08:00
Ulric Qin 6a366acc74 modify log level 2022-06-10 15:39:45 +08:00
Ulric Qin a5f7d5e9cf modify log level 2022-06-10 15:15:13 +08:00
Ulric Qin ea2249c30c forward samples in sequence 2022-06-10 14:20:18 +08:00
Ulric Qin a8c60c9f2b alert_aggr_view support modify by admin 2022-06-10 13:55:26 +08:00
xtan 0581e02cf3
Feat:add common template functions (#976)
* Feat: add common template functions

* Feat: change how the added template functions are implemented

Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-06-08 20:40:45 +08:00
Ulric Qin e5c1641b6b code refactor: move struct ReaderOptions to config 2022-06-07 18:00:11 +08:00
Ulric Qin 3899144f8f add header for writer post 2022-06-07 17:56:23 +08:00
ning 0e5aea40e8 Merge branch 'main' of github.com:ccfos/nightingale 2022-06-02 11:07:40 +08:00
ning 1dbfcd3dc8 refactor: service api 2022-06-02 11:07:31 +08:00
Ulric Qin 495632a064 fix alert rule delete by service 2022-06-01 12:58:09 +08:00
Ulric Qin ab5e8c366e code refactor 2022-05-31 14:44:57 +08:00
Ulric Qin ce35e23a0f modify alert rule verify 2022-05-31 13:08:09 +08:00
Ulric Qin c3adcc877a use standalone mode when RedisType is blank 2022-05-30 08:39:35 +08:00
xtan 7f92e921b4
Feat: add support for redis cluster mode and sentinel mode (#965)
* fix go plugin related errors

* Feat: add support for redis cluster mode and sentinel mode

Co-authored-by: tanxiao <tanxiao@asiainfo.com>
2022-05-30 08:36:17 +08:00
caojiaqiang e22a4394f7
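feat: send an alert message to the Maintainer admins when alert processing fails (#955)
* feat: send an alert message to the admins when alert processing fails

* feat: send an alert message to the admins when alert processing fails; build the message by hand instead of using a template

* feat: send an alert message to the admins when alert processing fails; do not use the AlertCurEvent struct

* feat: send an alert message to the admins when alert processing fails; improve log output and text sending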
2022-05-27 19:00:41 +08:00
Yening Qin c040dffb5f
feat: add some service api
* feat: add some service api
2022-05-25 15:14:52 +08:00
Ulric Qin c2f2a7d5e2 use post method to get datasources 2022-05-23 13:31:05 +08:00
Ulric Qin fd29d18312 delete no use code 2022-05-23 13:29:08 +08:00
Ulric Qin 2f724075b2 loop load clusters from api 2022-05-23 13:13:35 +08:00
Ulric Qin 06224e4b20 refactor 2022-05-22 17:03:57 +08:00
Ulric Qin f81888cd8a get prometheus info from api. code skeleton 2022-05-22 16:56:58 +08:00
Ulric Qin 6a7b543ad6 add mutex for prom transport 2022-05-22 12:45:25 +08:00
ulricqin ecc51001c3
New Dashboard and support variables in alert_rule_note (#953)
* change alert rule

* Db connect update (#939)

* update target's cluster field when clustername modified in server.conf

* code refactor

* db connect update

* delete DriverName

Co-authored-by: Ulric Qin <ulric.qin@gmail.com>
Co-authored-by: zhangjiandong <zhang.jiandong@baiso.com>

* update sql struct

* change sql

* add some files for new dashboard

* add new board apis

* fix query data

* add dashboard migrate api

* rule note support template

* add value as data for template

* parse rule note before persist

* use prometheus var names

* fixbug rule note template

* refactor sql

* add logo

* refactor: add some log

* mv package poster to pkg

* add version

* compute user total in usage reporter

* feat: add some service api

Co-authored-by: 710leo <710leo@gmail.com>
Co-authored-by: countingwww <871138993@qq.com>
Co-authored-by: zhangjiandong <zhang.jiandong@baiso.com>
2022-05-20 23:48:49 +08:00
Ulric Qin 2bea8b7c84 add usage report 2022-05-17 19:24:06 +08:00
Ulric Qin dd5ae29f82 delete no use code 2022-05-12 10:58:28 +08:00
Ulric Qin e89760f374 code refactor 2022-05-08 16:04:20 +08:00
Ulric Qin 02dd70480d update target's cluster field when clustername modified in server.conf 2022-05-08 16:02:50 +08:00
Ulric Qin 882952de3e feature: builtin metric_view can be modified by admin 2022-04-27 10:51:12 +08:00
Ulric Qin 279bec6eaa Delete redundant judgment logic 2022-04-24 10:39:24 +08:00
Ulric Qin 614ed283c0 rename MinVersion to TLSMinVersion 2022-04-22 22:25:02 +08:00
Ulric Qin 06672d5ff9 fix user group search 2022-04-22 22:18:58 +08:00
Ulric Qin e0f0e08852 support redis tls 2022-04-22 21:48:56 +08:00
Ulric Qin e00f102703 give default configuration value for QueueCount 2022-04-21 12:29:43 +08:00
Ulric Qin 3921627fa2 Merge branch 'main' of github.com:didi/nightingale 2022-04-21 12:26:44 +08:00
Ulric Qin 7a1a65c31b add queue count control chan number 2022-04-21 12:24:26 +08:00
Curith 5e763f1a8b
use const http status text instead of a variable (#921) 2022-04-21 11:30:25 +08:00
Ulric Qin a0c5f94017 use goroutine to send metrics to backend 2022-04-21 11:07:56 +08:00
zheng 9ba1c2c32d
Improve the DingTalk @-mention handling and allow disabling at (#917)
token_xxx?noat=1
2022-04-19 15:14:18 +08:00
qzh 5732c4403b
perf: merge the targets_up metrics into one ident to reduce resource usage. (#915) 2022-04-15 21:17:16 +08:00
Yening Qin 6033a0a743
fix: err is nil (#914) 2022-04-15 14:35:26 +08:00
zheng e8cfe46381
Sort by alert severity and count (#913) 2022-04-15 14:33:44 +08:00
Ulric Qin e94f807d52 delete no used code 2022-04-15 11:10:16 +08:00
Ulric Qin c15490e756 code refactor 2022-04-14 19:09:25 +08:00
Ulric Qin b25c523528 code refactor, use NotifyBuiltinChannels to control 2022-04-14 18:56:14 +08:00
Ulric Qin 6d27da8ad8 delete no used code 2022-04-14 17:32:51 +08:00
Ulric Qin 1633308000 modify queue size 2022-04-14 17:19:14 +08:00
Ulric Qin 3a97a67c7e third time: code refactor for pr 906. use channel as queue for all the receivers 2022-04-14 12:57:30 +08:00
Ulric Qin 8d6101ec5a second time: code refactor for pr 906. new concurrent-map when init; move lock to WritersType 2022-04-14 12:43:39 +08:00
Ulric Qin e73da37bc0 first time: code refactor for pr 906 2022-04-14 11:11:14 +08:00
qzh 3d587a5762
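perf(opentsdb): dispatch pulled data by ident and switch from a list to a chan to improve consumption efficiency. With multiple prometheus instances, data can also be distributed by consistent hashing on the Ident field in the header. (#906)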
Co-authored-by: zhihao.qu <zhihao.qu@ly.com>
2022-04-14 10:31:36 +08:00
zheng 42a6be95e8
fix dashboard name (#911) 2022-04-13 21:34:25 +08:00
zheng ee8c367933
Fix dashboard directory error (#910) 2022-04-13 18:39:41 +08:00
Lars Lehtonen a20e19922e
src/pkg/ibex: fix dropped error (#907) 2022-04-13 10:43:32 +08:00
Ulric Qin b838cb1c6f return last insert object of metric view 2022-04-08 11:07:30 +08:00
Ulric Qin cb3e371094 parse tags for cur_events 2022-04-07 18:30:11 +08:00
Ulric Qin ac24e8b028 fix: import builtin dashboards 2022-04-07 14:09:14 +08:00
Ulric Qin 30ba544f35 fix order metric_view 2022-04-07 12:01:16 +08:00
Ulric Qin e8c0d6b987 order by cate and name 2022-04-07 11:37:49 +08:00
Ulric Qin 8abb04afde use hostname+pid instead of ip 2022-04-06 10:27:28 +08:00
Ulric Qin f7318cfc5a alter table user to users 2022-04-05 09:12:30 +08:00
Ulric Qin 66bc023e51 bugfix: list builtin alerts and dashboards 2022-04-02 12:21:03 +08:00
Ulric Qin 9e8d9b44b1 fix NotifyRecovered logic 2022-04-01 15:20:52 +08:00
Ulric Qin 9d016212c8 move sender package to common 2022-03-31 15:31:21 +08:00
Ulric Qin a4158c476e mv poster to common package 2022-03-31 15:27:14 +08:00
Ulric Qin 5d17f006f0 check smtp configurations 2022-03-31 12:02:57 +08:00
Ulric Qin 16d303a6fb rename var 2022-03-31 10:40:14 +08:00
Ulric Qin 70e5ac4898 add alert_aggr_view 2022-03-31 10:24:42 +08:00
Yening Qin a67356639b
feat: support OIDC (#893)
* feat: support oidc

* refactor: sso -> oidc

* refactor: add AccessToken

* refactor: change some naming
2022-03-30 11:01:02 +08:00
Lars Lehtonen 7b3cb2eb00
fix router errors (#894) 2022-03-30 10:57:54 +08:00
Ulric Qin b260a20646 give blank method for datadog-agent 2022-03-28 16:41:31 +08:00
Ulric Qin c557e383b6 add metric_view crud method 2022-03-27 19:06:31 +08:00
Ulric Qin 4c22284ca7 add cluster field when import builtin alerts 2022-03-23 14:48:28 +08:00
Ulric Qin 929c970b42 import builtin dashboard 2022-03-23 14:04:55 +08:00
Ulric Qin 496c8d8356 handle alerts builtin 2022-03-23 13:58:45 +08:00
Jeyrce.Lu 18164fdb16
perf: optimize alert plugin call(#886) (#891) 2022-03-22 18:10:35 +08:00
Ulric Qin 3b9e40c5d4 add severity in card 2022-03-22 15:49:59 +08:00
Ulric Qin 6d20b8ef72 fill notify groups of events 2022-03-22 15:36:51 +08:00
Ulric Qin 8bdd35975e AlertCurEventGetByIds 2022-03-22 15:24:25 +08:00
Ulric Qin 9ccdd6c3e7 fix nil pointer 2022-03-22 15:18:45 +08:00
Ulric Qin 30365a2256 code refactor 2022-03-22 15:14:56 +08:00
Ulric Qin cdd4100a30 code refactor 2022-03-22 14:43:30 +08:00
Ulric Qin 2cd9f50357 code refactor 2022-03-22 14:38:56 +08:00
Ulric Qin 106345ff49 add debug log 2022-03-22 14:26:37 +08:00
Ulric Qin 7c8c961aef query alerts card 2022-03-22 14:10:10 +08:00
Ulric Qin e1bd7f0267 verify alert_aggr_view 2022-03-22 11:38:16 +08:00
Ulric Qin 025c5809be add alert_aggr_view crud 2022-03-22 11:19:06 +08:00
Ulric Qin d45fdd50e7 modify sql: add group_name for event 2022-03-21 17:35:10 +08:00
Ulric Qin 4a62339c69 do not math.Round for metric value 2022-03-21 16:43:28 +08:00
Ulric Qin 5a9b8d6bd0 add configuration: BusiGroupLabelKey 2022-03-21 14:13:04 +08:00
Ulric Qin 8ce71de693 code refactor for append labels 2022-03-21 14:04:32 +08:00
Ulric Qin 6d9846f1f5 sync busi_group 2022-03-21 12:06:53 +08:00
Ulric Qin c9be9b0538 add label_value field for busi_group 2022-03-21 11:44:51 +08:00
Jeyrce.Lu 302cebbbec
[#886] Feature: provide a go plugin alert notification method (#887)
* [#886] Feature: provide a go plugin alert notification method

* fix: remove lower-level concurrency
2022-03-20 10:27:17 +08:00
zheng 46c60a32fd
Fix being unable to delete an empty dashboard (#889) 2022-03-17 19:07:24 +08:00
Ulric Qin 1ffdf3d283 bugfix: AdminRole 2022-03-07 18:19:19 +08:00
Ulric Qin 94a49c17f7 persist recovered events 2022-03-03 10:44:06 +08:00
Ulric Qin e515039ad4 use bgrwCheck func to check alert_rule put 2022-03-03 10:25:52 +08:00
Ulric Qin c6356df81f +NotifyBuiltinEnable 2022-03-01 16:27:21 +08:00
Ulric Qin 9c662de129 add smtp log 2022-03-01 13:50:51 +08:00
Ulric Qin caa37b087c use batch send mail 2022-03-01 13:44:46 +08:00
Ulric Qin b63c853889 use smtp.DialAndSend func 2022-03-01 13:27:23 +08:00
Ulric Qin 2ff79c7780 use golang as sender 2022-03-01 11:16:55 +08:00
Ulric Qin 403cb5a6ad not stable version 2022-02-28 23:50:02 +08:00
zheng b43f196d86
Optimization: keep only 5 decimal places (#878)
* Optimization: keep only 5 decimal places

* Improve the decimal rounding method
2022-02-28 18:06:23 +08:00
Ulric Qin 483b353494 Merge branch 'main' of github.com:didi/nightingale 2022-02-26 12:00:25 +08:00
Ulric Qin cddc99981d modify perm of read tasks 2022-02-26 12:00:08 +08:00
zheng 01f1f50880
Reject timestamps more than 5 minutes ahead of the current time (#872) 2022-02-18 19:17:28 +08:00
Ulric Qin 8664c3df37 refactor 2022-02-18 16:29:49 +08:00
eshun f009c43878
add windows support (#867)
* add windows support

* add windows support

* add windows support

Co-authored-by: 78552423@qq.com <chenyz0812>
2022-02-18 15:55:00 +08:00
Ulric Qin f8482601a8 Merge branch 'main' of github.com:didi/nightingale 2022-02-17 19:29:51 +08:00
Ulric Qin 8c4ab88888 return all busi-groups when subscribe 2022-02-17 19:28:35 +08:00
张哲铭 37421dd56a
Fix pg database compatibility: the contacts field JSON format could not be converted (#868) 2022-02-15 17:58:33 +08:00
UlricQin d31fe9cb71 modify user-groups query limit 2022-02-11 13:05:20 +08:00
UlricQin bd762172d4 add space in error log 2022-02-10 17:54:52 +08:00
UlricQin b32a7b3a9e add global callback 2022-02-10 17:32:06 +08:00
UlricQin 3ccc09674e query user-groups 2022-02-10 15:45:38 +08:00
UlricQin 9beef8f36a add last_sent_time for alert_cur_event 2022-01-29 13:46:10 +08:00
UlricQin 2e63993b7f fix 2022-01-29 11:08:32 +08:00
UlricQin b482c7a076 recover_duration done 2022-01-29 11:01:44 +08:00
Ulric Qin 598ae07fc2 add feature: recover_duration 2022-01-26 08:59:30 +08:00
Ulric Qin e5d7612af9 n9e-server support basic auth for Reader 2022-01-21 23:34:25 +08:00