Unverified · Commit 2c729b5d · Author: xbkaishui · Committer: GitHub

Support group concept in the alarm core (#5615)

Parent 47d5a768
......@@ -337,6 +337,7 @@ The text of each license is the standard Apache 2.0 license.
lz4-java 1.6.0: https://github.com/jpountz/lz4-java, Apache 2.0
snappy-java 1.1.7.3: https://github.com/xerial/snappy-java, Apache 2.0
slf4j-api 1.7.28: http://www.slf4j.org, Apache 2.0
mvel 2.4.8: https://github.com/mvel/mvel, Apache 2.0
========================================================================
MIT licenses
......
......@@ -16,6 +16,8 @@ Define the relation between scope and entity name.
- **Endpoint Relation**: {Source endpoint name} in {Source Service name} to {Dest endpoint name} in {Dest service name}
## Rules
**There are two types of rules: individual rules and composite rules. A composite rule is a combination of individual rules.**
### Individual rules
An alarm rule consists of the following keys:
- **Rule name**. Unique name, shown in the alarm message. Must end with `_rule`.
- **Metrics name**. A.K.A. metrics name in oal script. Only long, double, int types are supported. See
......@@ -41,10 +43,20 @@ Such as in **percentile**, `value1` is threshold of P50, and `-, -, value3, valu
backend deployment env time.
- **Count**. Within the period window, if the number of **value**s over the threshold (by OP) reaches count, an alarm
is sent.
- **Only as condition**. Specifies whether the rule can send notifications or only serves as a condition of a composite rule.
- **Silence period**. After an alarm is triggered at time N, it stays silent during **TN -> TN + period**.
By default, it is the same as **Period**, which means that within a period, the same alarm (same ID in the same
metrics name) is triggered only once.
### Composite rules
**NOTE**. Composite rules only work for alarm rules targeting the same entity level, such as alarm rules of the service level.
For example, `service_percent_rule && service_resp_time_percentile_rule`. You shouldn't compose alarm rules of different entity levels,
such as one alarm rule of the service metrics with another rule of the endpoint metrics.
A composite rule consists of the following keys:
- **Rule name**. Unique name, shown in the alarm message. Must end with `_rule`.
- **Expression**. Specifies how to compose rules; supports `&&`, `||`, and `()`.
- **Message**. Specifies the notification message sent when the rule is triggered.
```yaml
rules:
......@@ -60,6 +72,8 @@ rules:
count: 3
# Number of checks during which the alarm stays silent after being triggered; defaults to the period.
silence-period: 10
# Specifies whether the rule can send notifications or only serves as a condition of a composite rule
only-as-condition: false
service_percent_rule:
metrics-name: service_percent
# [Optional] Default, match all services in this metrics
......@@ -73,6 +87,7 @@ rules:
op: <
period: 10
count: 4
only-as-condition: false
service_resp_time_percentile_rule:
# Metrics value need to be long, double or int
metrics-name: service_percentile
......@@ -83,6 +98,7 @@ rules:
count: 3
silence-period: 5
message: Percentile response time of service {name} alarm in 3 minutes of last 10 minutes, due to more than one condition of p50 > 1000, p75 > 1000, p90 > 1000, p95 > 1000, p99 > 1000
only-as-condition: false
meter_service_status_code_rule:
metrics-name: meter_status_code
exclude-labels:
......@@ -93,8 +109,15 @@ rules:
count: 3
silence-period: 5
message: The request number of entity {name} non-200 status is more than expected.
only-as-condition: false
composite-rules:
comp_rule:
# Both the percent rule and the response time rule must be satisfied
expression: service_percent_rule && service_resp_time_percentile_rule
message: Service {name} successful rate is less than 80% and P50 of response time is over 1000ms
```
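To illustrate how an expression such as `service_percent_rule && service_resp_time_percentile_rule` resolves to a single boolean, here is a minimal Java sketch of an evaluator for rule names combined with `&&`, `||`, and `()`. It is not how this commit evaluates expressions (the commit delegates to the MVEL library); the class `CompositeExpressionSketch` and its methods are hypothetical, and the sketch assumes the usual precedence where `&&` binds tighter than `||`, and that rules missing from the input count as not triggered.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the commit: a small recursive-descent
// evaluator for expressions made of rule names, &&, || and parentheses.
final class CompositeExpressionSketch {
    private final String src;
    private final Map<String, Boolean> triggered;
    private int pos;

    private CompositeExpressionSketch(String src, Map<String, Boolean> triggered) {
        this.src = src;
        this.triggered = triggered;
    }

    /** Evaluates the expression; rule names absent from the map count as not triggered. */
    static boolean eval(String expression, Map<String, Boolean> triggered) {
        CompositeExpressionSketch parser = new CompositeExpressionSketch(expression, triggered);
        boolean result = parser.parseOr();
        parser.skipSpaces();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return result;
    }

    // or := and ( "||" and )*
    private boolean parseOr() {
        boolean value = parseAnd();
        while (consume("||")) {
            boolean right = parseAnd(); // always parse the right side so the cursor advances
            value = value || right;
        }
        return value;
    }

    // and := primary ( "&&" primary )*
    private boolean parseAnd() {
        boolean value = parsePrimary();
        while (consume("&&")) {
            boolean right = parsePrimary();
            value = value && right;
        }
        return value;
    }

    // primary := "(" or ")" | ruleName
    private boolean parsePrimary() {
        if (consume("(")) {
            boolean value = parseOr();
            if (!consume(")")) {
                throw new IllegalArgumentException("Missing ')' at position " + pos);
            }
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < src.length()
            && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Rule name expected at position " + pos);
        }
        return triggered.getOrDefault(src.substring(start, pos), false);
    }

    // Skips whitespace, then consumes the token if it is next in the input.
    private boolean consume(String token) {
        skipSpaces();
        if (src.startsWith(token, pos)) {
            pos += token.length();
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }

    public static void main(String[] args) {
        Map<String, Boolean> state = new HashMap<>();
        state.put("service_percent_rule", true);
        state.put("service_resp_time_percentile_rule", false);
        System.out.println(eval("service_percent_rule && service_resp_time_percentile_rule", state));
        System.out.println(eval("service_percent_rule || service_resp_time_percentile_rule", state));
    }
}
```

The map passed to `eval` plays the role of the per-entity data context: each individual rule name maps to whether that rule fired in the current check.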
### Default alarm rules
We provide a default `alarm-setting.yml` in our distribution for convenience only, which includes the following rules:
1. Service average response time over 1s in last 3 minutes.
......
......@@ -94,6 +94,7 @@
<javaassist.version>3.25.0-GA</javaassist.version>
<vavr.version>0.10.3</vavr.version>
<groovy.version>3.0.3</groovy.version>
<mvel.version>2.4.8.Final</mvel.version>
<zookeeper.image.version>3.5</zookeeper.image.version>
<kafka-clients.version>2.4.1</kafka-clients.version>
......@@ -531,6 +532,11 @@
<artifactId>groovy</artifactId>
<version>${groovy.version}</version>
</dependency>
<dependency>
<groupId>org.mvel</groupId>
<artifactId>mvel2</artifactId>
<version>${mvel.version}</version>
</dependency>
</dependencies>
</dependencyManagement>
</project>
......@@ -42,6 +42,10 @@
<groupId>io.grpc</groupId>
<artifactId>grpc-testing</artifactId>
</dependency>
<dependency>
<groupId>org.mvel</groupId>
<artifactId>mvel2</artifactId>
</dependency>
</dependencies>
<build>
......
......@@ -22,6 +22,8 @@ import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import org.apache.skywalking.oap.server.core.alarm.AlarmCallback;
import org.apache.skywalking.oap.server.core.alarm.AlarmMessage;
import org.joda.time.LocalDateTime;
......@@ -52,10 +54,10 @@ public class AlarmCore {
lastExecuteTime = now;
Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
try {
List<AlarmMessage> alarmMessageList = new ArrayList<>(30);
final List<AlarmMessage> alarmMessageList = new ArrayList<>(30);
LocalDateTime checkTime = LocalDateTime.now();
int minutes = Minutes.minutesBetween(lastExecuteTime, checkTime).getMinutes();
boolean[] hasExecute = new boolean[] {false};
boolean[] hasExecute = new boolean[]{false};
alarmRulesWatcher.getRunningContext().values().forEach(ruleList -> ruleList.forEach(runningRule -> {
if (minutes > 0) {
runningRule.moveTo(checkTime);
......@@ -74,7 +76,12 @@ public class AlarmCore {
}
if (alarmMessageList.size() > 0) {
allCallbacks.forEach(callback -> callback.doAlarm(alarmMessageList));
if (alarmRulesWatcher.getCompositeRules().size() > 0) {
List<AlarmMessage> messages = alarmRulesWatcher.getCompositeRuleEvaluator().evaluate(alarmRulesWatcher.getCompositeRules(), alarmMessageList);
alarmMessageList.addAll(messages);
}
List<AlarmMessage> filteredMessages = alarmMessageList.stream().filter(msg -> !msg.isOnlyAsCondition()).collect(Collectors.toList());
allCallbacks.forEach(callback -> callback.doAlarm(filteredMessages));
}
} catch (Exception e) {
LOGGER.error(e.getMessage(), e);
......
......@@ -51,6 +51,7 @@ public class AlarmRule {
private int count;
private int silencePeriod;
private String message;
private boolean onlyAsCondition;
@Override
public boolean equals(final Object o) {
......
......@@ -28,6 +28,8 @@ import lombok.extern.slf4j.Slf4j;
import org.apache.skywalking.oap.server.configuration.api.ConfigChangeWatcher;
import org.apache.skywalking.oap.server.core.Const;
import org.apache.skywalking.oap.server.core.alarm.AlarmModule;
import org.apache.skywalking.oap.server.core.alarm.provider.expression.Expression;
import org.apache.skywalking.oap.server.core.alarm.provider.expression.ExpressionContext;
import org.apache.skywalking.oap.server.core.alarm.provider.grpc.GRPCAlarmSetting;
import org.apache.skywalking.oap.server.core.alarm.provider.slack.SlackSettings;
import org.apache.skywalking.oap.server.core.alarm.provider.wechat.WechatSettings;
......@@ -46,13 +48,16 @@ public class AlarmRulesWatcher extends ConfigChangeWatcher {
private volatile Map<AlarmRule, RunningRule> alarmRuleRunningRuleMap;
private volatile Rules rules;
private volatile String settingsString;
@Getter
private final CompositeRuleEvaluator compositeRuleEvaluator;
public AlarmRulesWatcher(Rules defaultRules, ModuleProvider provider) {
super(AlarmModule.NAME, provider, "alarm-settings");
this.runningContext = new HashMap<>();
this.alarmRuleRunningRuleMap = new HashMap<>();
this.settingsString = Const.EMPTY_STRING;
Expression expression = new Expression(new ExpressionContext());
this.compositeRuleEvaluator = new CompositeRuleEvaluator(expression);
notify(defaultRules);
}
......@@ -104,6 +109,10 @@ public class AlarmRulesWatcher extends ConfigChangeWatcher {
return this.rules.getRules();
}
public List<CompositeAlarmRule> getCompositeRules() {
return this.rules.getCompositeRules();
}
public List<String> getWebHooks() {
return this.rules.getWebhooks();
}
......
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package org.apache.skywalking.oap.server.core.alarm.provider;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.Setter;
import lombok.ToString;
@Builder
@NoArgsConstructor
@AllArgsConstructor
@Setter
@Getter
@ToString
public class CompositeAlarmRule {
private String alarmRuleName;
private String expression;
private String message;
}
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package org.apache.skywalking.oap.server.core.alarm.provider;
import com.google.common.base.Joiner;
import com.google.common.collect.ImmutableListMultimap;
import com.google.common.collect.Multimaps;
import org.apache.skywalking.oap.server.core.Const;
import org.apache.skywalking.oap.server.core.alarm.AlarmMessage;
import org.apache.skywalking.oap.server.core.alarm.MetaInAlarm;
import org.apache.skywalking.oap.server.core.alarm.provider.expression.Expression;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
/**
* Evaluates composite rules by evaluating their expressions against the triggered alarm messages.
*
* @since 8.2.0
*/
public class CompositeRuleEvaluator {
private Expression expression;
private Map<String, AlarmMessageFormatter> messageFormatterCache;
public CompositeRuleEvaluator(Expression expression) {
this.expression = expression;
this.messageFormatterCache = new ConcurrentHashMap<>();
}
/**
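* Evaluates the composite rules.
*
* @param compositeAlarmRules the composite rules to evaluate
* @param alarmMessages the alarm messages triggered by individual rules
* @return the alarm messages generated by the composite rules whose expressions matched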
*/
public List<AlarmMessage> evaluate(List<CompositeAlarmRule> compositeAlarmRules, List<AlarmMessage> alarmMessages) {
final List<AlarmMessage> compositeRuleMessages = new ArrayList<>();
ImmutableListMultimap<String, AlarmMessage> messageMap = Multimaps.index(alarmMessages, alarmMessage ->
Joiner.on(Const.ID_CONNECTOR).useForNull(Const.EMPTY_STRING).join(alarmMessage.getId0(), alarmMessage.getId1()));
for (CompositeAlarmRule compositeAlarmRule : compositeAlarmRules) {
String expr = compositeAlarmRule.getExpression();
Set<String> dependencyRules = expression.analysisInputs(expr);
Map<String, Object> dataContext = new HashMap<>();
messageMap.asMap().forEach((key, alarmMessageList) -> {
dependencyRules.forEach(ruleName -> dataContext.put(ruleName, false));
alarmMessageList.forEach(alarmMessage -> {
if (dependencyRules.contains(alarmMessage.getRuleName())) {
dataContext.put(alarmMessage.getRuleName(), true);
}
});
Object matched = expression.eval(expr, dataContext);
if (matched instanceof Boolean && (Boolean) matched) {
AlarmMessage headMsg = alarmMessageList.iterator().next();
AlarmMessage message = new AlarmMessage();
message.setOnlyAsCondition(false);
message.setScopeId(headMsg.getScopeId());
message.setScope(headMsg.getScope());
message.setName(headMsg.getName());
message.setId0(headMsg.getId0());
message.setId1(headMsg.getId1());
message.setStartTime(System.currentTimeMillis());
message.setRuleName(compositeAlarmRule.getAlarmRuleName());
String alarmMessage = formatMessage(message, compositeAlarmRule.getMessage(), compositeAlarmRule.getExpression());
message.setAlarmMessage(alarmMessage);
compositeRuleMessages.add(message);
}
});
}
return compositeRuleMessages;
}
/**
* Formats the alarm message using {@link AlarmMessageFormatter}; only the name and id0 meta are supported.
*/
private String formatMessage(AlarmMessage alarmMessage, String message, String metricName) {
return messageFormatterCache.computeIfAbsent(message, AlarmMessageFormatter::new).format(new MetaInAlarm() {
@Override
public String getScope() {
return alarmMessage.getScope();
}
@Override
public int getScopeId() {
return alarmMessage.getScopeId();
}
@Override
public String getName() {
return alarmMessage.getName();
}
@Override
public String getMetricsName() {
return metricName;
}
@Override
public String getId0() {
return alarmMessage.getId0();
}
@Override
public String getId1() {
return alarmMessage.getId1();
}
});
}
}
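The grouping step in `evaluate` above can be sketched with plain collections: messages are grouped per entity, every rule the expression depends on starts as `false`, and only rules that actually fired for that entity flip to `true`. This is a simplified illustration rather than the commit's code. `EntityGroupingSketch` and `Msg` are hypothetical names, and a single `entityId` string stands in for the `id0`/`id1` pair that the real code joins with `Const.ID_CONNECTOR`.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of the commit.
final class EntityGroupingSketch {
    /** Simplified stand-in for AlarmMessage: only the fields needed here. */
    static final class Msg {
        final String entityId;
        final String ruleName;

        Msg(String entityId, String ruleName) {
            this.entityId = entityId;
            this.ruleName = ruleName;
        }
    }

    /**
     * Builds one boolean context per entity. Each context maps every
     * dependency rule name to whether that rule fired for the entity.
     */
    static Map<String, Map<String, Boolean>> contexts(List<Msg> messages, Set<String> dependencyRules) {
        Map<String, Map<String, Boolean>> byEntity = new LinkedHashMap<>();
        for (Msg msg : messages) {
            Map<String, Boolean> context = byEntity.computeIfAbsent(msg.entityId, id -> {
                // A fresh context starts with all dependency rules marked as not triggered.
                Map<String, Boolean> fresh = new HashMap<>();
                dependencyRules.forEach(rule -> fresh.put(rule, false));
                return fresh;
            });
            if (dependencyRules.contains(msg.ruleName)) {
                context.put(msg.ruleName, true);
            }
        }
        return byEntity;
    }
}
```

Building a separate context per entity keeps one entity's triggered rules from leaking into another's evaluation, which is why the documentation asks that composed rules target the same entity level.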
......@@ -36,9 +36,11 @@ public class Rules {
private GRPCAlarmSetting grpchookSetting;
private SlackSettings slacks;
private WechatSettings wecchats;
private List<CompositeAlarmRule> compositeRules;
public Rules() {
this.rules = new ArrayList<>();
this.webhooks = new ArrayList<>();
this.compositeRules = new ArrayList<>();
}
}
......@@ -24,6 +24,7 @@ import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import org.apache.skywalking.oap.server.core.alarm.provider.grpc.GRPCAlarmSetting;
import org.apache.skywalking.oap.server.core.alarm.provider.slack.SlackSettings;
import org.apache.skywalking.oap.server.core.alarm.provider.wechat.WechatSettings;
......@@ -45,97 +46,161 @@ public class RulesReader {
yamlData = yaml.loadAs(io, Map.class);
}
/**
* Read rule config file to {@link Rules}
*/
public Rules readRules() {
Rules rules = new Rules();
if (Objects.nonNull(yamlData)) {
Map rulesData = (Map) yamlData.get("rules");
if (rulesData != null) {
rules.setRules(new ArrayList<>());
rulesData.forEach((k, v) -> {
if (((String) k).endsWith("_rule")) {
AlarmRule alarmRule = new AlarmRule();
alarmRule.setAlarmRuleName((String) k);
Map settings = (Map) v;
Object metricsName = settings.get("metrics-name");
if (metricsName == null) {
throw new IllegalArgumentException("metrics-name can't be null");
}
alarmRule.setMetricsName((String) metricsName);
alarmRule.setIncludeNames((ArrayList) settings.getOrDefault("include-names", new ArrayList(0)));
alarmRule.setExcludeNames((ArrayList) settings.getOrDefault("exclude-names", new ArrayList(0)));
alarmRule.setIncludeNamesRegex((String) settings.getOrDefault("include-names-regex", ""));
alarmRule.setExcludeNamesRegex((String) settings.getOrDefault("exclude-names-regex", ""));
alarmRule.setIncludeLabels(
(ArrayList) settings.getOrDefault("include-labels", new ArrayList(0)));
alarmRule.setExcludeLabels(
(ArrayList) settings.getOrDefault("exclude-labels", new ArrayList(0)));
alarmRule.setIncludeLabelsRegex((String) settings.getOrDefault("include-labels-regex", ""));
alarmRule.setExcludeLabelsRegex((String) settings.getOrDefault("exclude-labels-regex", ""));
alarmRule.setThreshold(settings.get("threshold").toString());
alarmRule.setOp((String) settings.get("op"));
alarmRule.setPeriod((Integer) settings.getOrDefault("period", 1));
alarmRule.setCount((Integer) settings.getOrDefault("count", 1));
// How many times of checks, the alarm keeps silence after alarm triggered, default as same as period.
alarmRule.setSilencePeriod((Integer) settings.getOrDefault("silence-period", alarmRule.getPeriod()));
alarmRule.setMessage(
(String) settings.getOrDefault("message", "Alarm caused by Rule " + alarmRule
readRulesConfig(rules);
readWebHookConfig(rules);
readGrpcConfig(rules);
readSlackConfig(rules);
readWechatConfig(rules);
readCompositeRuleConfig(rules);
}
return rules;
}
/**
* Read rule config into {@link AlarmRule}
*/
private void readRulesConfig(Rules rules) {
Map rulesData = (Map) yamlData.get("rules");
if (rulesData == null) {
return;
}
rules.setRules(new ArrayList<>());
rulesData.forEach((k, v) -> {
if (((String) k).endsWith("_rule")) {
AlarmRule alarmRule = new AlarmRule();
alarmRule.setAlarmRuleName((String) k);
Map settings = (Map) v;
Object metricsName = settings.get("metrics-name");
if (metricsName == null) {
throw new IllegalArgumentException("metrics-name can't be null");
}
alarmRule.setMetricsName((String) metricsName);
alarmRule.setIncludeNames((ArrayList) settings.getOrDefault("include-names", new ArrayList(0)));
alarmRule.setExcludeNames((ArrayList) settings.getOrDefault("exclude-names", new ArrayList(0)));
alarmRule.setIncludeNamesRegex((String) settings.getOrDefault("include-names-regex", ""));
alarmRule.setExcludeNamesRegex((String) settings.getOrDefault("exclude-names-regex", ""));
alarmRule.setIncludeLabels(
(ArrayList) settings.getOrDefault("include-labels", new ArrayList(0)));
alarmRule.setExcludeLabels(
(ArrayList) settings.getOrDefault("exclude-labels", new ArrayList(0)));
alarmRule.setIncludeLabelsRegex((String) settings.getOrDefault("include-labels-regex", ""));
alarmRule.setExcludeLabelsRegex((String) settings.getOrDefault("exclude-labels-regex", ""));
alarmRule.setThreshold(settings.get("threshold").toString());
alarmRule.setOp((String) settings.get("op"));
alarmRule.setPeriod((Integer) settings.getOrDefault("period", 1));
alarmRule.setCount((Integer) settings.getOrDefault("count", 1));
// How many times of checks, the alarm keeps silence after alarm triggered, default as same as period.
alarmRule.setSilencePeriod((Integer) settings.getOrDefault("silence-period", alarmRule.getPeriod()));
alarmRule.setOnlyAsCondition((Boolean) settings.getOrDefault("only-as-condition", false));
alarmRule.setMessage(
(String) settings.getOrDefault("message", "Alarm caused by Rule " + alarmRule
.getAlarmRuleName()));
rules.getRules().add(alarmRule);
}
});
}
List webhooks = (List) yamlData.get("webhooks");
if (webhooks != null) {
rules.setWebhooks(new ArrayList<>());
webhooks.forEach(url -> {
rules.getWebhooks().add((String) url);
});
rules.getRules().add(alarmRule);
}
});
}
Map grpchooks = (Map) yamlData.get("gRPCHook");
if (grpchooks != null) {
GRPCAlarmSetting grpcAlarmSetting = new GRPCAlarmSetting();
Object targetHost = grpchooks.get("target_host");
if (targetHost != null) {
grpcAlarmSetting.setTargetHost((String) targetHost);
}
/**
* Read web hook config
*/
private void readWebHookConfig(Rules rules) {
List webhooks = (List) yamlData.get("webhooks");
if (webhooks != null) {
rules.setWebhooks(new ArrayList<>());
webhooks.forEach(url -> {
rules.getWebhooks().add((String) url);
});
}
}
Object targetPort = grpchooks.get("target_port");
if (targetPort != null) {
grpcAlarmSetting.setTargetPort((Integer) targetPort);
}
/**
* Read grpc hook config into {@link GRPCAlarmSetting}
*/
private void readGrpcConfig(Rules rules) {
Map grpchooks = (Map) yamlData.get("gRPCHook");
if (grpchooks != null) {
GRPCAlarmSetting grpcAlarmSetting = new GRPCAlarmSetting();
Object targetHost = grpchooks.get("target_host");
if (targetHost != null) {
grpcAlarmSetting.setTargetHost((String) targetHost);
}
rules.setGrpchookSetting(grpcAlarmSetting);
Object targetPort = grpchooks.get("target_port");
if (targetPort != null) {
grpcAlarmSetting.setTargetPort((Integer) targetPort);
}
Map slacks = (Map) yamlData.get("slackHooks");
if (slacks != null) {
SlackSettings slackSettings = new SlackSettings();
Object textTemplate = slacks.getOrDefault("textTemplate", "");
slackSettings.setTextTemplate((String) textTemplate);
rules.setGrpchookSetting(grpcAlarmSetting);
}
}
List<String> slackWebhooks = (List<String>) slacks.get("webhooks");
if (slackWebhooks != null) {
slackSettings.getWebhooks().addAll(slackWebhooks);
}
rules.setSlacks(slackSettings);
/**
* Read slack hook config into {@link SlackSettings}
*/
private void readSlackConfig(Rules rules) {
Map slacks = (Map) yamlData.get("slackHooks");
if (slacks != null) {
SlackSettings slackSettings = new SlackSettings();
Object textTemplate = slacks.getOrDefault("textTemplate", "");
slackSettings.setTextTemplate((String) textTemplate);
List<String> slackWebhooks = (List<String>) slacks.get("webhooks");
if (slackWebhooks != null) {
slackSettings.getWebhooks().addAll(slackWebhooks);
}
rules.setSlacks(slackSettings);
}
}
/**
* Read wechat hook config into {@link WechatSettings}
*/
private void readWechatConfig(Rules rules) {
Map wechatConfig = (Map) yamlData.get("wechatHooks");
if (wechatConfig != null) {
WechatSettings wechatSettings = new WechatSettings();
Object textTemplate = wechatConfig.getOrDefault("textTemplate", "");
wechatSettings.setTextTemplate((String) textTemplate);
List<String> wechatWebhooks = (List<String>) wechatConfig.get("webhooks");
if (wechatWebhooks != null) {
wechatSettings.getWebhooks().addAll(wechatWebhooks);
}
rules.setWecchats(wechatSettings);
}
}
Map wechatConfig = (Map) yamlData.get("wechatHooks");
if (wechatConfig != null) {
WechatSettings wechatSettings = new WechatSettings();
Object textTemplate = wechatConfig.getOrDefault("textTemplate", "");
wechatSettings.setTextTemplate((String) textTemplate);
List<String> wechatWebhooks = (List<String>) wechatConfig.get("webhooks");
if (wechatWebhooks != null) {
wechatSettings.getWebhooks().addAll(wechatWebhooks);
/**
* Read composite rule config into {@link CompositeAlarmRule}
*/
private void readCompositeRuleConfig(Rules rules) {
Map compositeRulesData = (Map) yamlData.get("composite-rules");
if (compositeRulesData == null) {
return;
}
compositeRulesData.forEach((k, v) -> {
String ruleName = (String) k;
if (ruleName.endsWith("_rule")) {
Map settings = (Map) v;
CompositeAlarmRule compositeAlarmRule = new CompositeAlarmRule();
compositeAlarmRule.setAlarmRuleName(ruleName);
String expression = (String) settings.get("expression");
if (expression == null) {
throw new IllegalArgumentException("expression can't be null");
}
rules.setWecchats(wechatSettings);
compositeAlarmRule.setExpression(expression);
compositeAlarmRule.setMessage(
(String) settings.getOrDefault("message", "Alarm caused by Rule " + ruleName));
rules.getCompositeRules().add(compositeAlarmRule);
}
}
return rules;
});
}
}
......@@ -24,6 +24,7 @@ import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.regex.Pattern;
......@@ -72,6 +73,7 @@ public class RunningRule {
private final Pattern includeLabelsRegex;
private final Pattern excludeLabelsRegex;
private final AlarmMessageFormatter formatter;
private final boolean onlyAsCondition;
public RunningRule(AlarmRule alarmRule) {
metricsName = alarmRule.getMetricsName();
......@@ -101,6 +103,7 @@ public class RunningRule {
this.excludeLabelsRegex = StringUtil.isNotEmpty(alarmRule.getExcludeLabelsRegex()) ?
Pattern.compile(alarmRule.getExcludeLabelsRegex()) : null;
this.formatter = new AlarmMessageFormatter(alarmRule.getMessage());
this.onlyAsCondition = alarmRule.isOnlyAsCondition();
}
/**
......@@ -221,8 +224,9 @@ public class RunningRule {
List<AlarmMessage> alarmMessageList = new ArrayList<>(30);
windows.forEach((meta, window) -> {
AlarmMessage alarmMessage = window.checkAlarm();
if (alarmMessage != AlarmMessage.NONE) {
Optional<AlarmMessage> alarmMessageOptional = window.checkAlarm();
if (alarmMessageOptional.isPresent()) {
AlarmMessage alarmMessage = alarmMessageOptional.get();
alarmMessage.setScopeId(meta.getScopeId());
alarmMessage.setScope(meta.getScope());
alarmMessage.setName(meta.getName());
......@@ -230,6 +234,7 @@ public class RunningRule {
alarmMessage.setId1(meta.getId1());
alarmMessage.setRuleName(this.ruleName);
alarmMessage.setAlarmMessage(formatter.format(meta));
alarmMessage.setOnlyAsCondition(this.onlyAsCondition);
alarmMessage.setStartTime(System.currentTimeMillis());
alarmMessageList.add(alarmMessage);
}
......@@ -323,7 +328,7 @@ public class RunningRule {
}
}
public AlarmMessage checkAlarm() {
public Optional<AlarmMessage> checkAlarm() {
if (isMatch()) {
/*
* When
......@@ -334,9 +339,7 @@ public class RunningRule {
counter++;
if (counter >= countThreshold && silenceCountdown < 1) {
silenceCountdown = silencePeriod;
// set empty message, but new message
return new AlarmMessage();
return Optional.of(new AlarmMessage());
} else {
silenceCountdown--;
}
......@@ -346,7 +349,7 @@ public class RunningRule {
counter--;
}
}
return AlarmMessage.NONE;
return Optional.empty();
}
private boolean isMatch() {
......
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package org.apache.skywalking.oap.server.core.alarm.provider.expression;
import lombok.extern.slf4j.Slf4j;
import org.mvel2.MVEL;
import org.mvel2.ParserContext;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
/**
* Evaluates basic Java-style expressions, similar to a Groovy script.
* Internally, the expression is first compiled into a parse tree, which is then executed against the given data.
* Compiled expressions are cached for the sake of performance.
*/
@Slf4j
public class Expression {
private final Map<String, Object> expressionCache;
private final ExpressionContext context;
public Expression(ExpressionContext context) {
this.context = context;
this.expressionCache = new ConcurrentHashMap<>();
}
/**
* Eval the given expression using empty data context
*/
public Object eval(String expression) {
return eval(expression, null);
}
/**
* Eval the given expression with data context
*/
public Object eval(String expression, Map<String, Object> vars) {
try {
Object obj = compile(expression, context);
return MVEL.executeExpression(obj, vars);
} catch (Throwable e) {
log.error("eval expression {} error", expression, e);
return null;
}
}
/**
* Compile the given expression to a parseTree
*/
public Object compile(String expression, ExpressionContext pctx) {
return expressionCache.computeIfAbsent(expression, s -> MVEL.compileExpression(expression, pctx.getContext()));
}
/**
* Analysis expression dependencies
*/
public Set<String> analysisInputs(String expression) {
ParserContext pCtx = ParserContext.create();
MVEL.analysisCompile(expression, pCtx);
Map<String, Class> inputsMap = pCtx.getInputs();
return inputsMap.keySet();
}
}
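The caching pattern used by `compile` above (compile once per distinct expression string, reuse afterwards) can be shown in isolation. The sketch below is a generic illustration under stated assumptions: `CompileCacheSketch` is a hypothetical class, and an arbitrary `Function<String, T>` stands in for `MVEL.compileExpression`. A counter is included only to make the "compiled once" behavior observable.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch, not part of the commit.
final class CompileCacheSketch<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> compiler; // stand-in for the real compile step
    private final AtomicInteger compileCount = new AtomicInteger();

    CompileCacheSketch(Function<String, T> compiler) {
        this.compiler = compiler;
    }

    /** Returns the cached compiled form, compiling only on the first request. */
    T get(String expression) {
        return cache.computeIfAbsent(expression, expr -> {
            compileCount.incrementAndGet();
            return compiler.apply(expr);
        });
    }

    /** Number of times the compiler function actually ran. */
    int compileCount() {
        return compileCount.get();
    }
}
```

Note that the cache key is the expression string alone. In the `compile` method above the same holds: the `pctx` argument is not part of the key, so the first context used for a given expression determines the cached result.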
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package org.apache.skywalking.oap.server.core.alarm.provider.expression;
import lombok.Getter;
import org.mvel2.ParserContext;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
/**
* The expression context supports custom functions in expressions.
* For example, in `md5(a) == '111111'`, the md5 function must be registered in the context.
*/
public class ExpressionContext {
@Getter
private ParserContext context;
public ExpressionContext() {
context = new ParserContext();
}
/**
* Register a single method in the context
*/
public void registerFunc(String func, Method method) {
context.addImport(func, method);
}
/**
* Registers all public static methods of the given class in the context
*/
public void registerFunc(Class<?> clz) {
Method[] methods = clz.getDeclaredMethods();
for (Method method : methods) {
int mod = method.getModifiers();
if (Modifier.isStatic(mod) && Modifier.isPublic(mod)) {
registerFunc(method.getName(), method);
}
}
}
}
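The reflection loop in `registerFunc(Class<?>)` above can be tried out without MVEL. The following sketch collects the public static methods of a class into a name-to-method map and invokes them by name. `StaticFunctionRegistrySketch` and `TextFunctions` are hypothetical names introduced for illustration, and the map is a stand-in for `ParserContext.addImport`.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not part of the commit.
final class StaticFunctionRegistrySketch {
    private final Map<String, Method> functions = new HashMap<>();

    /** Registers every public static method declared directly on the class. */
    void register(Class<?> clz) {
        for (Method method : clz.getDeclaredMethods()) {
            int mod = method.getModifiers();
            if (Modifier.isStatic(mod) && Modifier.isPublic(mod)) {
                // Keyed by name only: with overloads, whichever is seen last wins.
                functions.put(method.getName(), method);
            }
        }
    }

    boolean has(String name) {
        return functions.containsKey(name);
    }

    /** Invokes a registered function by name. */
    Object call(String name, Object... args) {
        Method method = functions.get(name);
        if (method == null) {
            throw new IllegalArgumentException("Unknown function: " + name);
        }
        try {
            return method.invoke(null, args); // null receiver because the method is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Failed to call " + name, e);
        }
    }
}

// Hypothetical function holder used only to exercise the registry.
final class TextFunctions {
    public static String upper(String s) {
        return s.toUpperCase();
    }

    static String hidden(String s) { // package-private: skipped by the registry
        return s;
    }

    public String instanceMethod() { // not static: skipped by the registry
        return "";
    }
}
```

Because registration is keyed by method name, overloaded static methods overwrite each other. The same consideration may apply to `registerFunc` above, since it also registers each method under `method.getName()`.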
......@@ -31,7 +31,7 @@ public class AlarmRuleInitTest {
Rules rules = reader.readRules();
List<AlarmRule> ruleList = rules.getRules();
Assert.assertEquals(2, ruleList.size());
Assert.assertEquals(3, ruleList.size());
Assert.assertEquals("85", ruleList.get(1).getThreshold());
Assert.assertEquals("endpoint_percent_rule", ruleList.get(0).getAlarmRuleName());
Assert.assertEquals(0, ruleList.get(0).getIncludeNames().size());
......@@ -46,5 +46,8 @@ public class AlarmRuleInitTest {
Assert.assertEquals(2, rulesWebhooks.size());
Assert.assertEquals("http://127.0.0.1/go-wechat/", rulesWebhooks.get(1));
List<CompositeAlarmRule> compositeRules = rules.getCompositeRules();
Assert.assertEquals(1, compositeRules.size());
Assert.assertEquals("endpoint_percent_more_rule && endpoint_percent_rule", compositeRules.get(0).getExpression());
}
}
......@@ -77,7 +77,7 @@ public class AlarmRulesWatcherTest {
alarmRulesWatcher.notify(new ConfigChangeWatcher.ConfigChangeEvent(new String(chars, 0, length), ConfigChangeWatcher.EventType.MODIFY));
assertEquals(2, alarmRulesWatcher.getRules().size());
assertEquals(3, alarmRulesWatcher.getRules().size());
assertEquals(2, alarmRulesWatcher.getWebHooks().size());
assertNotNull(alarmRulesWatcher.getGrpchookSetting());
assertEquals(9888, alarmRulesWatcher.getGrpchookSetting().getTargetPort());
......
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package org.apache.skywalking.oap.server.core.alarm.provider;
import org.apache.skywalking.oap.server.core.alarm.AlarmMessage;
import org.apache.skywalking.oap.server.core.alarm.provider.expression.Expression;
import org.apache.skywalking.oap.server.core.alarm.provider.expression.ExpressionContext;
import org.junit.Before;
import org.junit.Test;
import java.util.ArrayList;
import java.util.List;
import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;
public class CompositeRuleEvaluatorTest {
private CompositeRuleEvaluator ruleEvaluate;
@Before
public void init() {
Expression expression = new Expression(new ExpressionContext());
ruleEvaluate = new CompositeRuleEvaluator(expression);
}
@Test
public void testEvaluateMessageWithAndOp() {
List<CompositeAlarmRule> compositeAlarmRules = new ArrayList<>();
CompositeAlarmRule compositeAlarmRule = new CompositeAlarmRule("dummy", "a_rule && b_rule", "composite rule {name},{id} triggered!");
compositeAlarmRules.add(compositeAlarmRule);
List<AlarmMessage> alarmMessages = getAlarmMessages();
List<AlarmMessage> compositeMsgs = ruleEvaluate.evaluate(compositeAlarmRules, alarmMessages);
assertThat(compositeMsgs.size(), is(1));
assertThat(compositeMsgs.get(0).getAlarmMessage(), is("composite rule demo service,id0 triggered!"));
assertThat(compositeMsgs.get(0).getRuleName(), is("dummy"));
assertThat(compositeMsgs.get(0).getId0(), is("id0"));
assertThat(compositeMsgs.get(0).getId1(), is("id1"));
assertThat(compositeMsgs.get(0).isOnlyAsCondition(), is(false));
}
@Test
public void testEvaluateMessageWithFormatMessage() {
List<CompositeAlarmRule> compositeAlarmRules = new ArrayList<>();
CompositeAlarmRule compositeAlarmRule = new CompositeAlarmRule("dummy", "a_rule && b_rule", "composite rule {name} triggered!");
compositeAlarmRules.add(compositeAlarmRule);
List<AlarmMessage> alarmMessages = getAlarmMessages();
List<AlarmMessage> compositeMsgs = ruleEvaluate.evaluate(compositeAlarmRules, alarmMessages);
assertThat(compositeMsgs.size(), is(1));
assertThat(compositeMsgs.get(0).getAlarmMessage(), is("composite rule demo service triggered!"));
assertThat(compositeMsgs.get(0).getRuleName(), is("dummy"));
assertThat(compositeMsgs.get(0).getId0(), is("id0"));
assertThat(compositeMsgs.get(0).getId1(), is("id1"));
assertThat(compositeMsgs.get(0).isOnlyAsCondition(), is(false));
}
@Test
public void testEvaluateMessageWithNotExistsRule() {
List<CompositeAlarmRule> compositeAlarmRules = new ArrayList<>();
CompositeAlarmRule compositeAlarmRule = new CompositeAlarmRule("dummy", "a_rule && not_exist_rule", "composite rule triggered!");
compositeAlarmRules.add(compositeAlarmRule);
List<AlarmMessage> alarmMessages = getAlarmMessages();
List<AlarmMessage> compositeMsgs = ruleEvaluate.evaluate(compositeAlarmRules, alarmMessages);
assertThat(compositeMsgs.size(), is(0));
}
@Test
public void testEvaluateMessageWithException() {
List<CompositeAlarmRule> compositeAlarmRules = new ArrayList<>();
CompositeAlarmRule compositeAlarmRule = new CompositeAlarmRule("dummy", "a_rule + b_rule", "composite rule triggered!");
compositeAlarmRules.add(compositeAlarmRule);
List<AlarmMessage> alarmMessages = getAlarmMessages();
List<AlarmMessage> compositeMsgs = ruleEvaluate.evaluate(compositeAlarmRules, alarmMessages);
assertThat(compositeMsgs.size(), is(0));
}
private List<AlarmMessage> getAlarmMessages() {
List<AlarmMessage> alarmMessages = new ArrayList<>();
AlarmMessage alarmMessage = new AlarmMessage();
alarmMessage.setRuleName("a_rule");
alarmMessage.setOnlyAsCondition(true);
alarmMessage.setId0("id0");
alarmMessage.setId1("id1");
alarmMessage.setName("demo service");
alarmMessage.setScope("");
alarmMessage.setScopeId(1);
alarmMessages.add(alarmMessage);
alarmMessage = new AlarmMessage();
alarmMessage.setRuleName("b_rule");
alarmMessage.setOnlyAsCondition(true);
alarmMessage.setId0("id0");
alarmMessage.setId1("id1");
alarmMessage.setName("demo service");
alarmMessage.setScope("");
alarmMessage.setScopeId(1);
alarmMessages.add(alarmMessage);
alarmMessage = new AlarmMessage();
alarmMessage.setRuleName("c_rule");
alarmMessage.setOnlyAsCondition(true);
alarmMessage.setId0("id0");
alarmMessage.setId1("id1");
alarmMessage.setName("demo service");
alarmMessage.setScope("");
alarmMessage.setScopeId(1);
alarmMessages.add(alarmMessage);
return alarmMessages;
}
@Test
public void testEvaluateMessageWithOrOp() {
List<CompositeAlarmRule> compositeAlarmRules = new ArrayList<>();
CompositeAlarmRule compositeAlarmRule = new CompositeAlarmRule("dummy", "a_rule || b_rule", "composite rule triggered!");
compositeAlarmRules.add(compositeAlarmRule);
List<AlarmMessage> alarmMessages = getAlarmMessages();
alarmMessages.remove(0);
List<AlarmMessage> compositeMsgs = ruleEvaluate.evaluate(compositeAlarmRules, alarmMessages);
assertThat(compositeMsgs.size(), is(1));
assertThat(compositeMsgs.get(0).getAlarmMessage(), is("composite rule triggered!"));
assertThat(compositeMsgs.get(0).getRuleName(), is("dummy"));
assertThat(compositeMsgs.get(0).getId0(), is("id0"));
assertThat(compositeMsgs.get(0).getId1(), is("id1"));
assertThat(compositeMsgs.get(0).isOnlyAsCondition(), is(false));
}
@Test
public void testEvaluateMessageWithParenthesisAndOp() {
List<CompositeAlarmRule> compositeAlarmRules = new ArrayList<>();
CompositeAlarmRule compositeAlarmRule = new CompositeAlarmRule("dummy", "(a_rule || b_rule) && c_rule", "composite rule triggered!");
compositeAlarmRules.add(compositeAlarmRule);
List<AlarmMessage> alarmMessages = getAlarmMessages();
alarmMessages.remove(alarmMessages.size() - 1);
List<AlarmMessage> compositeMsgs = ruleEvaluate.evaluate(compositeAlarmRules, alarmMessages);
assertThat(compositeMsgs.size(), is(0));
}
@Test
public void testEvaluateMessageWithParenthesisAndOrOp() {
List<CompositeAlarmRule> compositeAlarmRules = new ArrayList<>();
CompositeAlarmRule compositeAlarmRule = new CompositeAlarmRule("dummy", "(a_rule && b_rule) || c_rule", "composite rule triggered!");
compositeAlarmRules.add(compositeAlarmRule);
List<AlarmMessage> alarmMessages = getAlarmMessages();
List<AlarmMessage> compositeMsgs = ruleEvaluate.evaluate(compositeAlarmRules, alarmMessages);
assertThat(compositeMsgs.size(), is(1));
assertThat(compositeMsgs.get(0).getAlarmMessage(), is("composite rule triggered!"));
assertThat(compositeMsgs.get(0).getRuleName(), is("dummy"));
assertThat(compositeMsgs.get(0).getId0(), is("id0"));
assertThat(compositeMsgs.get(0).getId1(), is("id1"));
assertThat(compositeMsgs.get(0).isOnlyAsCondition(), is(false));
}
}
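The tests above combine rule names with `&&`, `||`, and parentheses, and expect no composite message when a referenced rule has not fired or the expression is invalid. A minimal sketch of that evaluation logic follows, assuming a hand-written recursive-descent parser instead of the MVEL-based `Expression` class this change uses. The class `CompositeExpressionSketch` and its `evaluate` method are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: evaluates expressions such as "(a_rule || b_rule) && c_rule"
// against the set of rule names that fired. Not the MVEL-based implementation.
final class CompositeExpressionSketch {
    private final String src;
    private final Set<String> triggered;
    private int pos;

    private CompositeExpressionSketch(String src, Set<String> triggered) {
        this.src = src;
        this.triggered = triggered;
    }

    /** Returns true when the expression holds for the given set of triggered rule names. */
    static boolean evaluate(String expression, Set<String> triggeredRules) {
        CompositeExpressionSketch parser = new CompositeExpressionSketch(expression, triggeredRules);
        boolean result = parser.parseOr();
        parser.skipSpaces();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return result;
    }

    // or := and ('||' and)*
    private boolean parseOr() {
        boolean value = parseAnd();
        while (consume("||")) {
            boolean right = parseAnd(); // always parse the right side so the cursor advances
            value = value || right;
        }
        return value;
    }

    // and := primary ('&&' primary)*
    private boolean parseAnd() {
        boolean value = parsePrimary();
        while (consume("&&")) {
            boolean right = parsePrimary();
            value = value && right;
        }
        return value;
    }

    // primary := '(' or ')' | ruleName
    private boolean parsePrimary() {
        if (consume("(")) {
            boolean value = parseOr();
            if (!consume(")")) {
                throw new IllegalArgumentException("Missing ')' at position " + pos);
            }
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < src.length()
                && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Rule name expected at position " + pos);
        }
        // A rule that did not fire, or does not exist, counts as false.
        return triggered.contains(src.substring(start, pos));
    }

    private boolean consume(String token) {
        skipSpaces();
        if (src.startsWith(token, pos)) {
            pos += token.length();
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }

    public static void main(String[] args) {
        Set<String> fired = new HashSet<>(Arrays.asList("a_rule", "b_rule"));
        System.out.println(evaluate("a_rule && b_rule", fired));             // true
        System.out.println(evaluate("(a_rule || b_rule) && c_rule", fired)); // false
        System.out.println(evaluate("a_rule && not_exist_rule", fired));     // false
    }
}
```

In this sketch an unsupported operator such as `a_rule + b_rule` raises an `IllegalArgumentException`; the tests above show the real evaluator returning an empty message list for that input.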
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package org.apache.skywalking.oap.server.core.alarm.provider.expression;
import org.junit.Test;
import java.lang.reflect.Method;
import java.util.Arrays;
import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;
public class ExpressionContextTest {
@Test
public void testRegisterFuncWithMethod() throws NoSuchMethodException {
ExpressionContext expressionContext = new ExpressionContext();
Method[] methods = Math.class.getMethods();
Arrays.stream(methods).forEach(method -> {
if (method.getName().equalsIgnoreCase("sqrt")) {
expressionContext.registerFunc("sqrt", method);
}
});
Expression expression = new Expression(expressionContext);
Number number = (Number) expression.eval("sqrt(16)");
assertThat(number, is(4.0));
}
@Test
public void testRegisterFuncWithClazz() throws NoSuchMethodException {
ExpressionContext expressionContext = new ExpressionContext();
expressionContext.registerFunc(Math.class);
Expression expression = new Expression(expressionContext);
Number number = (Number) expression.eval("sqrt(16)");
assertThat(number, is(4.0));
number = (Number) expression.eval("abs(-12)");
assertThat(number, is(12));
}
}
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package org.apache.skywalking.oap.server.core.alarm.provider.expression;
import com.google.common.collect.Sets;
import org.junit.Before;
import org.junit.Test;
import org.mvel2.CompileException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertNull;
import static org.junit.Assert.assertThat;
public class ExpressionTest {
private Expression expression;
@Before
public void init() {
expression = new Expression(new ExpressionContext());
}
@Test
public void testEval() {
String expr = " a && b ";
Map<String, Object> dataMap = new HashMap<>();
dataMap.put("a", Boolean.TRUE);
Object flag = expression.eval(expr, dataMap);
assertNull(flag);
dataMap.put("b", Boolean.TRUE);
flag = expression.eval(expr, dataMap);
assertThat(flag, is(true));
}
@Test
public void testAnalysisInputs() {
String expr = " a && b ";
Set<String> inputs = expression.analysisInputs(expr);
assertThat(inputs.size(), is(2));
assertThat(inputs, is(Sets.newHashSet("a", "b")));
}
@Test
public void testEvalWithEmptyContext() {
String expr = " a && b ";
Object flag = expression.eval(expr);
assertNull(flag);
flag = expression.eval(" 1 > 0");
assertThat(flag, is(true));
}
@Test
public void testCompile() {
String expr = " a && b ";
ExpressionContext context = new ExpressionContext();
Object compiledExpression = expression.compile(expr, context);
assertNotNull(compiledExpression);
Object sameExpression = expression.compile(expr, context);
assertThat(compiledExpression, is(sameExpression));
}
@Test(expected = CompileException.class)
public void testCompileWithException() {
String expr = " a && * b ";
ExpressionContext context = new ExpressionContext();
expression.compile(expr, context);
}
}
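`testCompile` above expects that compiling the same expression twice yields equal results. One plausible way to satisfy that is to cache compiled expressions by their source text. The sketch below assumes such a cache; `CompiledExpressionCacheSketch` and the stand-in compiler function are hypothetical and do not reflect how the `Expression` class in this change is actually written.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: caches compiled expressions keyed by their source string.
final class CompiledExpressionCacheSketch<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> compiler;

    CompiledExpressionCacheSketch(Function<String, T> compiler) {
        this.compiler = compiler;
    }

    /** Compiles the expression on first use and returns the cached result afterwards. */
    T compile(String expression) {
        return cache.computeIfAbsent(expression, compiler);
    }

    public static void main(String[] args) {
        // Stand-in compiler: a real one would return a compiled expression object.
        CompiledExpressionCacheSketch<Object> cache =
            new CompiledExpressionCacheSketch<>(source -> new Object());
        Object first = cache.compile(" a && b ");
        Object second = cache.compile(" a && b ");
        System.out.println(first == second);
    }
}
```

Keying on the raw string means that expressions differing only in whitespace are compiled separately; normalizing the key first would be a design choice outside this sketch.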
......@@ -26,7 +26,9 @@ rules:
count: 3
# How many checks the alarm stays silent for after being triggered; defaults to the same value as period.
silence-period: 10
only-as-condition: false
message: Successful rate of endpoint {name} is lower than 75%
service_percent_rule:
metrics-name: service_percent
# [Optional] Default, match all services in this metrics
......@@ -39,6 +41,27 @@ rules:
op: <
period: 10
count: 4
only-as-condition: false
endpoint_percent_more_rule:
# Metrics value needs to be long, double or int
metrics-name: endpoint_percent
threshold: 60
op: ">"
# The length of time to evaluate the metrics
period: 10
# How many times the metrics must match the condition before the alarm is triggered
count: 3
# How many checks the alarm stays silent for after being triggered; defaults to the same value as period.
silence-period: 10
# Whether this rule only serves as a condition of a composite rule and sends no notification on its own
only-as-condition: false
message: Successful rate of endpoint {name} is higher than 60%
composite-rules:
comp1_rule:
expression: endpoint_percent_more_rule && endpoint_percent_rule
message: xxxxx
webhooks:
- http://127.0.0.1/notify/
......
......@@ -27,9 +27,6 @@ import lombok.Setter;
@Setter
@Getter
public class AlarmMessage {
public static AlarmMessage NONE = new NoAlarm();
private int scopeId;
private String scope;
private String name;
......@@ -38,8 +35,5 @@ public class AlarmMessage {
private String ruleName;
private String alarmMessage;
private long startTime;
private static class NoAlarm extends AlarmMessage {
}
private transient boolean onlyAsCondition;
}
......@@ -167,4 +167,5 @@ zookeeper-3.4.10.jar
kafka-clients-2.4.1.jar
lz4-java-1.6.0.jar
snappy-java-1.1.7.3.jar
zstd-jni-1.4.3-1.jar
\ No newline at end of file
zstd-jni-1.4.3-1.jar
mvel2-2.4.8.Final.jar
......@@ -167,3 +167,4 @@ kafka-clients-2.4.1.jar
lz4-java-1.6.0.jar
snappy-java-1.1.7.3.jar
zstd-jni-1.4.3-1.jar
mvel2-2.4.8.Final.jar