Commit Graph

694 Commits

Author SHA1 Message Date
Ulric Qin ac24e8b028 fix: import builtin dashboards 2022-04-07 14:09:14 +08:00
Ulric Qin 30ba544f35 fix order metric_view 2022-04-07 12:01:16 +08:00
Ulric Qin e8c0d6b987 order by cate and name 2022-04-07 11:37:49 +08:00
Ulric Qin 8abb04afde use hostname+pid instead of ip 2022-04-06 10:27:28 +08:00
Ulric Qin f7318cfc5a alter table user to users 2022-04-05 09:12:30 +08:00
Ulric Qin 66bc023e51 bugfix: list builtin alerts and dashboards 2022-04-02 12:21:03 +08:00
Ulric Qin 9e8d9b44b1 fix NotifyRecovered logic 2022-04-01 15:20:52 +08:00
Ulric Qin 9d016212c8 move sender package to common 2022-03-31 15:31:21 +08:00
Ulric Qin a4158c476e mv poster to common package 2022-03-31 15:27:14 +08:00
Ulric Qin 5d17f006f0 check smtp configurations 2022-03-31 12:02:57 +08:00
Ulric Qin 16d303a6fb rename var 2022-03-31 10:40:14 +08:00
Ulric Qin 70e5ac4898 add alert_aggr_view 2022-03-31 10:24:42 +08:00
Yening Qin a67356639b
feat: support OIDC (#893)
* feat: support oidc

* refactor: sso -> oidc

* refactor: add AccessToken

* refactor: change some naming
2022-03-30 11:01:02 +08:00
Lars Lehtonen 7b3cb2eb00
fix router errors (#894) 2022-03-30 10:57:54 +08:00
Ulric Qin b260a20646 give blank method for datadog-agent 2022-03-28 16:41:31 +08:00
Ulric Qin c557e383b6 add metric_view crud method 2022-03-27 19:06:31 +08:00
Ulric Qin 4c22284ca7 add cluster field when import builtin alerts 2022-03-23 14:48:28 +08:00
Ulric Qin 929c970b42 import builtin dashboard 2022-03-23 14:04:55 +08:00
Ulric Qin 496c8d8356 handle alerts builtin 2022-03-23 13:58:45 +08:00
Jeyrce.Lu 18164fdb16
perf: optimize alert plugin call(#886) (#891) 2022-03-22 18:10:35 +08:00
Ulric Qin 3b9e40c5d4 add severity in card 2022-03-22 15:49:59 +08:00
Ulric Qin 6d20b8ef72 fill notify groups of events 2022-03-22 15:36:51 +08:00
Ulric Qin 8bdd35975e AlertCurEventGetByIds 2022-03-22 15:24:25 +08:00
Ulric Qin 9ccdd6c3e7 fix nil pointer 2022-03-22 15:18:45 +08:00
Ulric Qin 30365a2256 code refactor 2022-03-22 15:14:56 +08:00
Ulric Qin cdd4100a30 code refactor 2022-03-22 14:43:30 +08:00
Ulric Qin 2cd9f50357 code refactor 2022-03-22 14:38:56 +08:00
Ulric Qin 106345ff49 add debug log 2022-03-22 14:26:37 +08:00
Ulric Qin 7c8c961aef query alerts card 2022-03-22 14:10:10 +08:00
Ulric Qin e1bd7f0267 verify alert_aggr_view 2022-03-22 11:38:16 +08:00
Ulric Qin 025c5809be add alert_aggr_view crud 2022-03-22 11:19:06 +08:00
Ulric Qin d45fdd50e7 modify sql: add group_name for event 2022-03-21 17:35:10 +08:00
Ulric Qin 4a62339c69 do not math.Round for metric value 2022-03-21 16:43:28 +08:00
Ulric Qin 5a9b8d6bd0 add configuration: BusiGroupLabelKey 2022-03-21 14:13:04 +08:00
Ulric Qin 8ce71de693 code refactor for append labels 2022-03-21 14:04:32 +08:00
Ulric Qin 6d9846f1f5 sync busi_group 2022-03-21 12:06:53 +08:00
Ulric Qin c9be9b0538 add label_value field for busi_group 2022-03-21 11:44:51 +08:00
Jeyrce.Lu 302cebbbec
[#886] Feature: provide a go plugin method for alert notification (#887)
* [#886] Feature: provide a go plugin method for alert notification

* fix: remove lower-level concurrency
2022-03-20 10:27:17 +08:00
zheng 46c60a32fd
fix: empty dashboards could not be deleted (#889) 2022-03-17 19:07:24 +08:00
Ulric Qin 1ffdf3d283 bugfix: AdminRole 2022-03-07 18:19:19 +08:00
Ulric Qin 94a49c17f7 persist recovered events 2022-03-03 10:44:06 +08:00
Ulric Qin e515039ad4 use bgrwCheck func to check alert_rule put 2022-03-03 10:25:52 +08:00
Ulric Qin c6356df81f +NotifyBuiltinEnable 2022-03-01 16:27:21 +08:00
Ulric Qin 9c662de129 add smtp log 2022-03-01 13:50:51 +08:00
Ulric Qin caa37b087c use batch send mail 2022-03-01 13:44:46 +08:00
Ulric Qin b63c853889 use smtp.DialAndSend func 2022-03-01 13:27:23 +08:00
Ulric Qin 2ff79c7780 use golang as sender 2022-03-01 11:16:55 +08:00
Ulric Qin 403cb5a6ad not stable version 2022-02-28 23:50:02 +08:00
zheng b43f196d86
optimize: keep only 5 decimal places (#878)
* optimize: keep only 5 decimal places

* optimize the method for retaining decimal places
2022-02-28 18:06:23 +08:00
Ulric Qin 483b353494 Merge branch 'main' of github.com:didi/nightingale 2022-02-26 12:00:25 +08:00
Ulric Qin cddc99981d modify perm of read tasks 2022-02-26 12:00:08 +08:00
zheng 01f1f50880
restrict timestamp to at most 5 minutes later than the current time (#872) 2022-02-18 19:17:28 +08:00
Ulric Qin 8664c3df37 refactor 2022-02-18 16:29:49 +08:00
eshun f009c43878
add windows support (#867)
* add windows support

* add windows support

* add windows support

Co-authored-by: 78552423@qq.com <chenyz0812>
2022-02-18 15:55:00 +08:00
Ulric Qin f8482601a8 Merge branch 'main' of github.com:didi/nightingale 2022-02-17 19:29:51 +08:00
Ulric Qin 8c4ab88888 return all busi-groups when subscribe 2022-02-17 19:28:35 +08:00
张哲铭 37421dd56a
pg database compatibility: fix contacts field json format that could not be converted (#868) 2022-02-15 17:58:33 +08:00
UlricQin d31fe9cb71 modify user-groups query limit 2022-02-11 13:05:20 +08:00
UlricQin bd762172d4 add space in error log 2022-02-10 17:54:52 +08:00
UlricQin b32a7b3a9e add global callback 2022-02-10 17:32:06 +08:00
UlricQin 3ccc09674e query user-groups 2022-02-10 15:45:38 +08:00
UlricQin 9beef8f36a add last_sent_time for alert_cur_event 2022-01-29 13:46:10 +08:00
UlricQin 2e63993b7f fix 2022-01-29 11:08:32 +08:00
UlricQin b482c7a076 recover_duration done 2022-01-29 11:01:44 +08:00
Ulric Qin 598ae07fc2 add feature: recover_duration 2022-01-26 08:59:30 +08:00
Ulric Qin e5d7612af9 n9e-server support basic auth for Reader 2022-01-21 23:34:25 +08:00
UlricQin f3924dab5b delete pendings when recoverRule 2022-01-12 13:50:29 +08:00
UlricQin 7f4cb3888f support falcon datamodel 2022-01-11 11:25:03 +08:00
UlricQin 120c2fe52a fix proxy Host header 2022-01-10 20:16:44 +08:00
UlricQin b9c674d662 prometheus proxy add Header Host 2022-01-08 19:40:43 +08:00
Ulric Qin dcee4677ed Merge branch 'main' of github.com:didi/nightingale 2022-01-08 17:52:42 +08:00
Ulric Qin d590f6d5c1 enable_in_bg logic 2022-01-08 17:52:29 +08:00
UlricQin 850a370f9d add targets apis 2022-01-06 11:48:30 +08:00
UlricQin 40e7ede5e3 Merge branch 'main' of github.com:didi/nightingale 2022-01-04 16:47:15 +08:00
UlricQin 9a2257dd1e ldap user default role configuration 2022-01-04 16:47:03 +08:00
Ulric Qin b693e80d75 check basicauth 2021-12-31 12:07:23 +08:00
Ulric Qin e9ce679649 handle python2 encoding 2021-12-31 11:13:57 +08:00
Ulric Qin a56d6b568b refactor log print 2021-12-30 09:37:52 +08:00
Ulric Qin 904d09d91c add datadog deflate encoding 2021-12-29 14:59:05 +08:00
Ulric Qin 3700f7a10b update datadog url 2021-12-29 14:52:22 +08:00
Ulric Qin d57415d23d add datadog receiver 2021-12-28 11:00:48 +08:00
Ulric Qin 06eca94492 add datadogSeries 2021-12-27 13:30:45 +08:00
Ulric Qin 74e4724e66 delete no use code: repeater.go 2021-12-23 22:54:37 +08:00
Ulric Qin 1ea8694769 refactor fireEvent 2021-12-23 22:43:18 +08:00
Ulric Qin 218140066b fix r.rule.NotifyRepeatStep unit 2021-12-23 22:26:53 +08:00
Ulric Qin 837cfab1bd refactor repeater 2021-12-23 22:19:49 +08:00
Ulric Qin 3428b11ea8 configuration for metrics.yaml and templates 2021-12-23 12:53:32 +08:00
Ulric Qin 49176ae240 support grafana-agent 2021-12-16 17:58:49 +08:00
Ulric Qin 8eb4a39e7d fix index out of range 2021-12-16 17:07:27 +08:00
Ulric Qin 0f65a1f5dd add remote write api support 2021-12-16 16:59:51 +08:00
Ulric Qin a71edc4040 extract IamLeader function and fix repeat 2021-12-15 20:52:00 +08:00
Ulric Qin 23b6cf1a68 fix repeat sender 2021-12-15 19:37:55 +08:00
Ulric Qin 0f3bbf6368 use NotifyRepeatNext as TriggerTime when repeat notify 2021-12-15 18:37:48 +08:00
Ulric Qin caa33c29e9 refactor creating busi group 2021-12-13 11:12:49 +08:00
Ulric Qin d5050338f3 use last_eval_time for filter 2021-12-11 18:14:23 +08:00
Ulric Qin 7f0877bf28 add table column: last_eval_time in alert_his_event 2021-12-11 18:07:01 +08:00
Ulric Qin d4c4257517 code refactor for i18n when occur duplicate tagkey 2021-12-11 17:25:45 +08:00
Ulric Qin 61f76afa0d handle duplicate tagkey 2021-12-11 17:23:18 +08:00
Ulric Qin 5634f48725 remove perm of targets 2021-12-10 09:49:11 +08:00
Ulric Qin 964d50b4e7 add perm function in routers 2021-12-10 09:44:06 +08:00
Ulric Qin d2cb48a2ef remove writer name 2021-12-09 23:07:45 +08:00
Ulric Qin 53411dc5d9 add perm 2021-12-09 22:08:22 +08:00
Ulric Qin cab6089a37 add perm control busi-group adding 2021-12-09 22:04:16 +08:00
Ulric Qin 32fea64f3e use configuration file to control AnonymousAccess 2021-12-09 16:59:02 +08:00
Ulric Qin aa2e5f15ee update recover event 2021-12-08 22:31:48 +08:00
Ulric Qin ed5e93f373 modify event url 2021-12-08 21:36:21 +08:00
Ulric Qin 48247ea7fe At least one team have rw permission 2021-12-08 13:18:53 +08:00
Ulric Qin 12a5f335bd get event detail no need login 2021-12-08 10:04:31 +08:00
Ulric Qin 5e19eadd61 add recover_time only when IsRecovered 2021-12-08 00:17:42 +08:00
Ulric Qin 0e88f0074c add recover_time 2021-12-08 00:07:25 +08:00
Ulric Qin 2bfc67686d refactor alert_subscribe.user_group_ids 2021-12-07 19:33:39 +08:00
Ulric Qin 4f8fedbaa0 delete no use code 2021-12-07 13:44:14 +08:00
Ulric Qin b108c9f11a refactor: The business group must retain at least one team 2021-12-06 21:33:36 +08:00
Ulric Qin bef8e8e548 bugfix: handle rule judge 2021-12-06 18:44:56 +08:00
Ulric Qin 88063cd30e bugfix: callback ibex 2021-12-06 18:20:44 +08:00
Ulric Qin a94a602d4f remove jwtAuth in prom api 2021-12-06 15:18:56 +08:00
UlricQin df97166f07 add api: check perm 2021-12-05 20:40:13 +08:00
UlricQin b418dec3ab bugfix: event mute 2021-12-04 12:07:30 +08:00
UlricQin 79401183ca bugfix 2021-12-02 17:37:42 +08:00
UlricQin 270d3b7e5b code refactor 2021-12-02 17:34:54 +08:00
UlricQin 4e3f9914f1 use i18n error when import rules and dashboards 2021-12-02 10:19:10 +08:00
UlricQin dd8e1f2d71 add api: /api/n9e/version 2021-12-01 16:46:37 +08:00
UlricQin 11e7c41908 add EngineDelay 2021-12-01 14:09:08 +08:00
UlricQin 57c2fd9b73 update jwt 2021-12-01 11:40:49 +08:00
UlricQin dc9fe38735 modify args: hours->days 2021-12-01 11:26:44 +08:00
UlricQin 622d4ac165 refactor 2021-12-01 10:14:35 +08:00
Ulric Qin 3090e13be7 verify tpl tags modify 2021-11-30 18:16:09 +08:00
UlricQin f96a36aa43 bugfix 2021-11-30 14:25:02 +08:00
UlricQin 6ad24419ab engine wait 2min 2021-11-30 12:33:37 +08:00
UlricQin 04319a6b41 add /v1/n9e/users 2021-11-30 11:57:55 +08:00
UlricQin 952f6b139d add api: get one alert-subscribe 2021-11-30 11:49:08 +08:00
UlricQin d43067bad4 bugfix 2021-11-29 20:06:45 +08:00
UlricQin c17ade64e1 bugfix 2021-11-29 19:56:36 +08:00
UlricQin 4ddbba1400 bugfix 2021-11-29 15:36:15 +08:00
UlricQin 6e3ad3dd6b version 5.1 2021-11-28 18:57:49 +08:00
qinyening 4e6e70c14d
release v5.0.0-rc1 (#708)
* release v5.0.0-rc1
2021-06-28 00:42:39 +08:00
710leo 18b9fb3ee2 add some log 2021-06-25 11:46:34 +08:00
710leo 02f2554cc1 fix: nodata repeated recovery alerting 2021-06-22 23:11:55 +08:00
stonelgh 07961c9f21
m3db: fix Errorf calls (#703) 2021-06-21 15:06:44 +08:00
wjkxiaowu f770b3cf14
add system env when plugin run (#699)
Co-authored-by: root <root@localhost.localdomain>
2021-06-15 11:13:51 +08:00
yubo 37abf19f0d
add m3db client timeout check (#693) 2021-05-31 15:35:00 +08:00
710leo bbbd7faeb1 bugfix: user and team info cache 2021-05-27 20:55:47 +08:00
710leo a73f2654df bugfix: aggr output and alert 2021-05-27 00:46:21 +08:00
710leo 22f0aee55d add event write perm check 2021-05-25 17:54:09 +08:00
710leo 01420ff1d8 optimize user information filling 2021-05-16 17:42:53 +08:00
710leo c4b5d13348 optimize user information filling 2021-05-16 15:42:30 +08:00
hubo 9cf2d47eef
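agent: add default tags; agent: add regex-based filtering of disk mount types (#683)
* agent: add default tags; agent: add regex-based filtering of disk mount types

* agent: add default tags; agent: add regex-based filtering of disk mount types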

Co-authored-by: huboc <huboc@zbj.com>
2021-05-08 19:17:01 +08:00
Paul Chu a9d6d6f820
support node migration (#680)
* enable promethues summary

* ADD: add a method for node migration

* FIX: node move session commit

* ADD: register the API for migrating nodes

* MOD: fix error handle

Co-authored-by: zhupeiyuan <zhupeiyuan@fenbi.com>
2021-05-07 11:10:05 +08:00
Ulric Qin f70d303942 fix http_response compile error 2021-05-06 17:00:18 +08:00
peng19940915 1112186d1c
add postgresql monitoring (#671)
* add postgresql & remove http_response status_code tag

* add postgresql & remove http_response status_code tag

Co-authored-by: leiyupeng <susu898287771@>
2021-04-27 23:16:07 +08:00
yubo f40332f197
bugfix: add user.Type (#667) 2021-04-26 19:15:33 +08:00
joyexpr 41efc66d25
fix: send mail not work(wrong notifyType and subject) (#660) 2021-04-19 23:57:20 +08:00
710leo d49d40768c organize configuration 2021-04-19 21:28:02 +08:00
710leo c71264ab30 fix send message 2021-04-19 20:10:29 +08:00
710leo bb64a2f1ec support static files 2021-04-16 19:21:02 +08:00
710leo 3f0dfd63d4 support static files 2021-04-15 21:23:59 +08:00
710leo 46f7ec7af9 complete version information 2021-04-15 19:35:25 +08:00
yubo 999c1b4239
bugfix: use InviteMustGet instead of InviteGet (#654)
* add fmt import
2021-04-14 20:57:27 +08:00
yubo f6b2535cdb
bugfix: use InviteMustGet instead of InviteGet (#653) 2021-04-14 12:48:26 +08:00
yubo 5f1c868006
feature: logout when the user is invalidated (#652) 2021-04-13 14:33:21 +08:00
qinyening 59366e4d3a
release v4 (#651)
* init
2021-04-13 11:38:40 +08:00
710leo eed2f073a0 Merge branch 'master' of https://github.com/didi/nightingale 2021-04-09 15:34:06 +08:00
710leo 31a03aa331 alert event modify filling user detail 2021-04-09 15:33:52 +08:00
yubo 71984c72b5
feature: add password changed notify (#647)
* feature: add password changed notify
2021-04-09 11:21:09 +08:00
yubo 72573e32cb
feature: add get self permissions by nodeID (#643) 2021-04-07 13:12:00 +08:00
chixianliangGithub 50f4cc10c4
remove duplicate code (#641) 2021-04-03 16:00:32 +08:00
yubo 1ff6d0a2dc
feature: add [start,end) param for clude, endpointMetric, endpoints api (#639) 2021-03-30 18:10:14 +08:00
yubo 92ac8b09c0
prober plugin use `all` mode as default (#634) 2021-03-26 11:17:31 +08:00
Paul Chu 384e993ca1
enable promethues summary (#630) 2021-03-24 16:08:42 +00:00
yubo c1241fdfbc
bugfix: created_at -> create_at for rdb.user table (#632) 2021-03-24 19:10:01 +08:00
yubo be9d6ac660
use logger.Warning instead of fmt.Printf at loading plugins (#629) 2021-03-23 18:37:03 +08:00
yubo 30b469ddbd
add subject for rdb rst-code/login-code mail (#628) 2021-03-22 17:27:01 +08:00
yubo 111c6fc1bf
feature: support node event notify with webhook (#627)
* feature: support node event notify with webhook
2021-03-19 13:06:41 +08:00
710leo 0cd2761021 Merge branch 'master' of https://github.com/didi/nightingale 2021-03-19 11:12:41 +08:00
710leo 0a7c8988c6 stra add user group detail 2021-03-19 11:12:32 +08:00
UlricQin 7947533182 monapi support new timestamp 2021-03-19 10:48:40 +08:00
710leo 184c39d311 add some audit log 2021-03-18 21:22:50 +08:00
UlricQin d89eaec596 bugfix: GetTeamsNameByIds 2021-03-18 10:03:20 +08:00
yubo 40ce0d75ed
prettify msg (#620) 2021-03-17 11:57:30 +08:00
ning1875 61bd28db31
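log collection field renamed: whether_attache_one_log_line--> whether_attach_one_log_line (#619)
* m3db writetagged should run concurrently, otherwise transfer rpc becomes slow

* go func pointer argument passing issue

* add three k8s-mon dashboard files

* add three k8s-mon dashboard files

* modify three k8s-mon dashboard files

* log collection: attach the last log line to the extra field, in preparation for later alerting

* log collection field renamed: whether_attache_one_log_line--> whether_attach_one_log_line

* log collection: include the log line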
2021-03-15 16:03:02 +08:00
710leo b1426945d4 fix agent proc.cpu.util 2021-03-13 18:21:21 +08:00
ning1875 dec9097ce7
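transfer: print metric info when writing to m3db fails, to help locate the problem (#615)
* m3db writetagged should run concurrently, otherwise transfer rpc becomes slow

* go func pointer argument passing issue

* add three k8s-mon dashboard files

* add three k8s-mon dashboard files

* modify three k8s-mon dashboard files

* transfer: print metric info when writing to m3db fails, to help locate the problem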
2021-03-13 13:21:05 +08:00
ning1875 7bb93e8351
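log collection: attach the last log line to the extra field, in preparation for later alerting (#614)
* m3db writetagged should run concurrently, otherwise transfer rpc becomes slow

* go func pointer argument passing issue

* add three k8s-mon dashboard files

* add three k8s-mon dashboard files

* modify three k8s-mon dashboard files

* log collection: attach the last log line to the extra field, in preparation for later alerting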
2021-03-13 13:19:45 +08:00
alick-liming 7a84223d5b
Aggr lanteness (#611)
* aggr lateness

* default value

* test

* test

Co-authored-by: alickliming <alickliming@didiglobal.com>
2021-03-12 15:10:42 +08:00
yubo 398628870c
bugfix: add prober.plugins Stop() for release resource (#610) 2021-03-11 16:22:55 +08:00
yubo 3e426537c7
add maxSeriesPoints for config.transfer.m3db (#609) 2021-03-10 17:50:38 +08:00
HONG YANG bf1bd3ef5a
“massage” (#603) 2021-03-10 17:36:58 +08:00
yubo b85b1e44ef
bugfix: auth password history size (#607) 2021-03-10 17:35:12 +08:00
yubo ff194c0382
add sample.out for mysql & redist (#605) 2021-03-09 19:10:40 +08:00
stiei13wangluo bd72a773f4
telegraf dns_query plugins (#601)
* dns_query

* dns_query

Co-authored-by: root <root@localhost.localdomain>
2021-03-05 11:54:13 +08:00
yubo 22dc5c909c
feature: add dryrun for collect_rule add/update (#599)
* feature: add dryrun for collect_rule add/update

* ignore sso when it is disable
2021-03-04 17:35:40 +08:00
Feng_Qi acaa88f1a9
add ping/net_response/http_response support (#594)
* fix port check and push debug log

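1. If the service listens on a specific address rather than on 0.0.0.0, the port cannot be detected on 127.0.0.1. Changed so that if detection on 127.0.0.1 fails, the check is retried on the identity address.
2. The http push path lacked debug logs; the debug log is moved into push to fill the gap.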

* Update cron.go

* notify add resource name and note

* Update notify.go

* Update notify.go

Fix a bug where, when name/note is empty and resource contains only one host, the value is cleared by config.Set,
so accessing the index hits index out of range and causes a panic

* add ping, net_response, http_response plugin

Add plugin support for
ping
net_response
http_response

* Update all.go

* add example config yml

* Update notify.go
2021-02-28 07:56:35 +08:00
yubo 005dc47868
fix: https://github.com/didi/nightingale/issues/583 (#590) 2021-02-25 15:37:35 +08:00
yubo 9c1c894e29
feature: support dlopen for prober plugin (#588) 2021-02-23 18:04:03 +08:00
yubo b055bc73c5
add a demo plugin for prober (#586)
* add a demo plugin for prober

* update demo plugin
2021-02-23 11:41:38 +08:00
yubo 322cbf27dc
use testhttp instead of http for ut (#585)
* use testhttp instead of http for ut
* bugfix: add username check
2021-02-22 11:25:02 +08:00
UlricQin 417a13c1be bugfix: judge: redis conn pools 2021-02-07 17:07:00 +08:00
yubo 66c93f472a
update vendor for local_build (#578) 2021-02-03 19:10:19 +08:00
710leo 023b23a0ef fix build monapi 2021-02-03 17:01:54 +08:00
710leo 900896c045 add sync stra log 2021-02-03 16:55:14 +08:00