Unverified commit 75b52917, author: W Wing, committer: GitHub

Refine concepts and designs (#6655)

Parent 4af2b610
# Meter Analysis Language
Meter system provides a functional analysis language called MAL(Meter Analysis Language) that lets the user analyze and
aggregate meter data in OAP streaming system. The result of an expression can either be ingested by agent analyzer,
or OC/Prometheus analyzer.
The meter system provides a functional analysis language called MAL (Meter Analysis Language) that lets users analyze and
aggregate meter data in the OAP streaming system. The result of an expression can either be ingested by the agent analyzer,
or the OC/Prometheus analyzer.
## Language data type
In MAL, an expression or sub-expression can evaluate to one of two types:
In MAL, an expression or sub-expression can evaluate to one of the following two types:
- Sample family - a set of samples(metrics) containing a range of metrics whose name is identical.
- Scalar - a simple numeric value. it supports integer/long, floating/double,
- **Sample family**: A set of samples (metrics) containing a range of metrics whose names are identical.
- **Scalar**: A simple numeric value that supports integer/long and floating/double.
## Sample family
A set of samples, which is as the basic unit in MAL. For example:
A set of samples, which acts as the basic unit in MAL. For example:
```
instance_trace_count
```
The above sample family might contains following simples which are provided by external modules, for instance, agent analyzer:
The sample family above may contain the following samples which are provided by external modules, such as the agent analyzer:
```
instance_trace_count{region="us-west",az="az-1"} 100
......@@ -29,12 +29,12 @@ instance_trace_count{region="asia-north",az="az-1"} 33
### Tag filter
MAL support four type operations to filter samples in a sample family:
MAL supports four types of operations to filter samples in a sample family:
- tagEqual: Filter tags that are exactly equal to the provided string.
- tagNotEqual: Filter tags that are not equal to the provided string.
- tagMatch: Filter tags that regex-match the provided string.
- tagNotMatch: Filter labels that do not regex-match the provided string.
- tagEqual: Filter tags exactly equal to the string provided.
- tagNotEqual: Filter tags not equal to the string provided.
- tagMatch: Filter tags that regex-match the string provided.
- tagNotMatch: Filter tags that do not regex-match the string provided.
For example, this filters all instance_trace_count samples for us-west and asia-north region and az-1 az:
......@@ -43,14 +43,14 @@ instance_trace_count.tagMatch("region", "us-west|asia-north").tagEqual("az", "az
```
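The filter chain above can be modelled outside MAL to make its behaviour concrete. The following is a minimal Python sketch, not the OAP implementation: the helper names `tag_equal` and `tag_match` are hypothetical, samples are represented as `(tags, value)` pairs, and it assumes that `tagMatch` requires the whole tag value to match the regular expression.

```python
import re

# Samples taken from the instance_trace_count examples in this document,
# modelled as (tags, value) pairs. This representation is an assumption.
samples = [
    ({"region": "us-west", "az": "az-1"}, 100),
    ({"region": "us-east", "az": "az-3"}, 20),
    ({"region": "asia-north", "az": "az-1"}, 33),
]


def tag_equal(family, key, expected):
    """Keep samples whose tag `key` is exactly `expected` (hypothetical helper)."""
    return [s for s in family if s[0].get(key) == expected]


def tag_match(family, key, pattern):
    """Keep samples whose tag `key` fully matches `pattern` (assumed semantics)."""
    return [s for s in family if re.fullmatch(pattern, s[0].get(key, ""))]


# Mirrors: instance_trace_count.tagMatch("region", "us-west|asia-north").tagEqual("az", "az-1")
filtered = tag_equal(tag_match(samples, "region", "us-west|asia-north"), "az", "az-1")
for tags, value in filtered:
    print(tags, value)
```

In this sketch only the `us-west` and `asia-north` samples in `az-1` remain; the `us-east` sample is dropped by the region filter.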
### Value filter
MAL support six type operations to filter samples in a sample family by value:
MAL supports six types of operations to filter samples in a sample family by value:
- valueEqual: Filter values that are exactly equal to the provided value.
- valueNotEqual: Filter values that are not equal to the provided value.
- valueGreater: Filter values that greater than the provided value.
- valueGreaterEqual: Filter values that greater or equal the provided value.
- valueLess: Filter values that less than the provided value.
- valueLessEqual: Filter values that less or equal the provided value.
- valueEqual: Filter values exactly equal to the value provided.
- valueNotEqual: Filter values not equal to the value provided.
- valueGreater: Filter values greater than the value provided.
- valueGreaterEqual: Filter values greater than or equal to the value provided.
- valueLess: Filter values less than the value provided.
- valueLessEqual: Filter values less than or equal to the value provided.
For example, this filters all instance_trace_count samples for values >= 33:
......@@ -58,17 +58,17 @@ For example, this filters all instance_trace_count samples for values >= 33:
instance_trace_count.valueGreaterEqual(33)
```
### Tag manipulator
MAL provides tag manipulators to change(add/delete/update) tags and their values.
MAL provides tag manipulators to change (i.e. add/delete/update) tags and their values.
#### K8s
MAL supports using the metadata of k8s to manipulate the tags and their values.
This feature requires OAP Server to have the authority to access the K8s's `API Server`.
MAL supports using the metadata of K8s to manipulate the tags and their values.
This feature requires authorizing the OAP Server to access K8s's `API Server`.
##### retagByK8sMeta
`retagByK8sMeta(newLabelName, K8sRetagType, existingLabelName, namespaceLabelName)`. Add a new tag to the sample family based on an existing label's value. Provide several internal converting types, including
`retagByK8sMeta(newLabelName, K8sRetagType, existingLabelName, namespaceLabelName)`. Adds a new tag to the sample family based on the value of an existing label. It provides several internal converting types, including
- K8sRetagType.Pod2Service
Add a tag to the sample by using `service` as the key, `$serviceName.$namespace` as the value, by the given value of the tag key, which represents the name of a pod.
Adds a tag to the sample with `service` as the key and `$serviceName.$namespace` as the value, resolved from the value of the given tag key, which represents the name of a pod.
For example:
```
......@@ -94,7 +94,7 @@ The following binary arithmetic operators are available in MAL:
Binary operators are defined between scalar/scalar, sampleFamily/scalar and sampleFamily/sampleFamily value pairs.
Between two scalars: they evaluate to another scalar that is the result of the operator applied to both scalar operands:
Between two scalars: they evaluate to another scalar that is the result of the operator being applied to both scalar operands:
```
1 + 2
......@@ -120,9 +120,9 @@ instance_trace_count{region="us-east",az="az-3"} 22 // 20 + 2
instance_trace_count{region="asia-north",az="az-1"} 35 // 33 + 2
```
Between two sample families, a binary operator is applied to each sample in the left-hand side sample family and
its matching sample in the right-hand sample family. A new sample family with empty name will be generated.
Only the matched tags will be reserved. Samples for which no matching sample in the right-hand sample family are not in the result.
Between two sample families, a binary operator is applied to each sample in the sample family on the left and
its matching sample in the sample family on the right. A new sample family with an empty name will be generated.
Only the matched tags will be retained. Samples with no matching sample in the sample family on the right will not appear in the result.
Another sample family `instance_trace_analysis_error_count` is
......@@ -137,7 +137,7 @@ Example expression:
instance_trace_analysis_error_count / instance_trace_count
```
This returns a result sample family containing the error rate of trace analysis. The samples with region us-west and az az-3
This returns a resulting sample family containing the error rate of trace analysis. Samples with region us-west and az az-3
have no match and will not show up in the result:
```
......@@ -148,14 +148,14 @@ have no match and will not show up in the result:
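The matching rule for two sample families can be sketched in Python as follows. This is an illustration under stated assumptions rather than the OAP implementation: the function `divide` is hypothetical, two samples are treated as matching only when their tag sets are identical, and the values in `error_count` (including the `eu-central` sample) are made up for the example.

```python
# instance_trace_count samples from the examples in this document.
trace_count = [
    ({"region": "us-west", "az": "az-1"}, 100),
    ({"region": "us-east", "az": "az-3"}, 20),
    ({"region": "asia-north", "az": "az-1"}, 33),
]

# Hypothetical values for instance_trace_analysis_error_count.
error_count = [
    ({"region": "us-west", "az": "az-1"}, 25),
    ({"region": "asia-north", "az": "az-1"}, 11),
    ({"region": "eu-central", "az": "az-2"}, 5),  # no counterpart on the right
]


def divide(left, right):
    """Divide each left sample by the right sample that has identical tags."""
    right_by_tags = {frozenset(tags.items()): value for tags, value in right}
    result = []
    for tags, value in left:
        key = frozenset(tags.items())
        if key in right_by_tags:  # left samples without a match are dropped
            result.append((tags, value / right_by_tags[key]))
    return result


# Mirrors: instance_trace_analysis_error_count / instance_trace_count
error_rate = divide(error_count, trace_count)
for tags, value in error_rate:
    print(tags, round(value, 4))
```

The `eu-central` sample on the left and the `us-east` sample on the right have no partner, so neither contributes to the result.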
### Aggregation Operation
Sample family supports the following aggregation operations that can be used to aggregate the samples of a single sample family,
resulting in a new sample family of fewer samples(even single one) with aggregated values:
resulting in a new sample family having fewer samples (sometimes having just a single sample) with aggregated values:
- sum (calculate sum over dimensions)
- min (select minimum over dimensions)
- max (select maximum over dimensions)
- avg (calculate the average over dimensions)
These operations can be used to aggregate over all label dimensions or preserve distinct dimensions by inputting `by` parameter.
These operations can be used to aggregate over all label dimensions or preserve distinct dimensions by passing the `by` parameter.
```
<aggr-op>(by: <tag1, tag2, ...>)
......@@ -167,7 +167,7 @@ Example expression:
instance_trace_count.sum(by: ['az'])
```
will output a result:
will output the following result:
```
instance_trace_count{az="az-1"} 133 // 100 + 33
......@@ -177,7 +177,7 @@ instance_trace_count{az="az-3"} 20
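The grouping performed by `sum(by: ['az'])` can be approximated with a short Python sketch. The function `sum_by` is a hypothetical stand-in, and the `(tags, value)` sample representation is an assumption; only the input values come from the examples in this document.

```python
from collections import defaultdict

# instance_trace_count samples from the examples in this document.
samples = [
    ({"region": "us-west", "az": "az-1"}, 100),
    ({"region": "us-east", "az": "az-3"}, 20),
    ({"region": "asia-north", "az": "az-1"}, 33),
]


def sum_by(family, by):
    """Sum values per distinct combination of the tags listed in `by`."""
    totals = defaultdict(int)
    for tags, value in family:
        # Only the `by` tags survive; all other tags are aggregated away.
        key = tuple((name, tags.get(name)) for name in by)
        totals[key] += value
    return [(dict(key), total) for key, total in totals.items()]


# Mirrors: instance_trace_count.sum(by: ['az'])
result = sum_by(samples, ["az"])
for tags, value in result:
    print(tags, value)
```

Passing an empty `by` list in this sketch collapses the family into a single sample, which corresponds to aggregating over all dimensions.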
### Function
`Duration` is a textual representation of a time range. The formats accepted are based on the ISO-8601 duration format `PnDTnHnMn.nS`
with days considered to be exactly 24 hours.
where a day is regarded as exactly 24 hours.
Examples:
- "PT20.345S" -- parses as "20.345 seconds"
......@@ -190,33 +190,33 @@ Examples:
- "-P-6H+3M" -- parses as "+6 hours and -3 minutes"
#### increase
`increase(Duration)`. Calculates the increase in the time range.
`increase(Duration)`: Calculates the increase in the time range.
#### rate
`rate(Duration)`. Calculates the per-second average rate of increase of the time range.
`rate(Duration)`: Calculates the per-second average rate of increase in the time range.
#### irate
`irate()`. Calculates the per-second instant rate of increase of the time range.
`irate()`: Calculates the per-second instant rate of increase in the time range.
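The difference between `increase`, `rate` and `irate` can be illustrated with a small Python sketch. It assumes a monotonically increasing counter, a window given as time-ordered `(timestamp_seconds, value)` points, and that `irate` uses only the last two points; the data points are hypothetical, and the actual OAP implementation may handle window boundaries differently.

```python
# Hypothetical counter readings inside one time range: (timestamp in seconds, value).
points = [(0, 100), (30, 130), (60, 220)]


def increase(window):
    """Growth of the counter between the first and last point of the window."""
    return window[-1][1] - window[0][1]


def rate(window):
    """Average per-second growth over the whole window."""
    elapsed = window[-1][0] - window[0][0]
    return increase(window) / elapsed


def irate(window):
    """Per-second growth between the last two points only (assumed semantics)."""
    (t_prev, v_prev), (t_last, v_last) = window[-2], window[-1]
    return (v_last - v_prev) / (t_last - t_prev)


print(increase(points))  # 120
print(rate(points))      # 2.0
print(irate(points))     # 3.0
```

With these points the average rate is lower than the instant rate because most of the growth happens in the final 30 seconds.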
#### tag
`tag({allTags -> })`. Update tags of samples. User can add, drop, rename and update tags.
`tag({allTags -> })`: Updates tags of samples. Users can add, drop, rename and update tags.
#### histogram
`histogram(le: '<the tag name of le>')`. Transforms less based histogram buckets to meter system histogram buckets.
`le` parameter hints the tag name of a bucket.
`histogram(le: '<the tag name of le>')`: Transforms less-based histogram buckets to meter system histogram buckets.
`le` parameter represents the tag name of the bucket.
#### histogram_percentile
`histogram_percentile([<p scalar>])`. Hints meter-system to calculates the p-percentile (0 ≤ p ≤ 100) from the buckets.
`histogram_percentile([<p scalar>])`: Instructs the meter system to calculate the p-percentile (0 ≤ p ≤ 100) from the buckets.
#### time
`time()`. returns the number of seconds since January 1, 1970 UTC.
`time()`: Returns the number of seconds since January 1, 1970 UTC.
## Down Sampling Operation
MAL should instruct meter-system how to do downsampling for metrics. It doesn't only refer to aggregate raw samples to
`minute` level, but also hints data from `minute` to higher levels, for instance, `hour` and `day`.
MAL should instruct the meter system on how to downsample metrics. This covers not only aggregating raw samples to the
`minute` level, but also aggregating data from the `minute` level to higher levels, such as `hour` and `day`.
Down sampling function is called `downsampling` in MAL, it accepts the following types:
The down sampling function is called `downsampling` in MAL, and it accepts the following types:
- AVG
- SUM
......@@ -236,8 +236,7 @@ last_server_state_sync_time_in_seconds.tagEqual('production', 'catalog').downsam
## Metric level function
Metric has three level, service, instance and endpoint. They extract level relevant labels from metric labels, then
hints meter-system which level this metrics should be.
Metrics have three levels: service, instance and endpoint. The level functions extract level-relevant labels from the metric labels, then inform the meter system which level the metric belongs to.
- `service([svc_label1, svc_label2...])` extracts service level labels from the array argument.
- `instance([svc_label1, svc_label2...], [ins_label1, ins_label2...])` extracts service level labels from the first array argument,
......
# Manual instrument SDK
We have manual instrument SDK contributed from the community.
- [Go2Sky](https://github.com/SkyAPM/go2sky). Go SDK follows SkyWalking format.
- [C++](https://github.com/SkyAPM/cpp2sky). C++ SDK follows SkyWalking format.
Our incredible community has contributed to the manual instrument SDK.
- [Go2Sky](https://github.com/SkyAPM/go2sky). Go SDK follows the SkyWalking format.
- [C++](https://github.com/SkyAPM/cpp2sky). C++ SDK follows the SkyWalking format.
## What is SkyWalking formats and propagation protocols?
## What are the SkyWalking format and the propagation protocols?
See these protocols in [protocols document](../protocols/README.md).
## Envoy tracer
......
# Meter System
Meter system is another streaming calculation mode, especially for metrics data. In the [OAL](oal.md), there are clear
[Scope Definitions](scope-definitions.md), including native objects. Meter system is focusing on the data type itself,
and provides more flexible to the end user to define the scope entity.
Meter system is another streaming calculation mode designed for metrics data. In the [OAL](oal.md), there are clear
[Scope Definitions](scope-definitions.md), including definitions for native objects. Meter system is focused on the data type itself,
and provides a more flexible approach to the end user in defining the scope entity.
The meter system is open to different receivers and fetchers in the backend,
follow the [backend setup document](../setup/backend/backend-setup.md) for more details.
see the [backend setup document](../setup/backend/backend-setup.md) for more details.
Every metrics is declared in the meter system should include following attribute
1. **Metrics Name**. An unique name globally, should avoid overlap the OAL variable names.
1. **Function Name**. The function used for this metrics, distributed aggregation, value calculation and down sampling calculation
based on the function implementation. Also, the data structure is determined by the function too, such as function Avg is for Long.
1. **Scope Type**. Unlike inside the OAL, there are plenty of logic scope definitions, in meter system, only type is required.
Type values include service, instance, and endpoint, like we introduced in the Overview.
The values of scope entity name, such as service name, are required when metrics data generated with the metrics data value.
Every metric declared in the meter system should include the following attributes:
1. **Metrics Name**. A globally unique name, which should not overlap with the OAL variable names.
1. **Function Name**. The function used for this metric. Distributed aggregation, value calculation, and down sampling calculation
are all based on the function implementation. The data structure is also determined by the function; for example, the Avg function is for Long.
1. **Scope Type**. Unlike the OAL, which has plenty of logic scope definitions, the meter system requires only the type.
Type values include service, instance, and endpoint, just as we have described in the Overview section.
The values of the scope entity name, such as the service name, are required when metrics data are generated with the metrics data values.
NOTICE, the metrics must be declared in the bootstrap stage, no runtime changed.
NOTE: The metrics must be declared in the bootstrap stage and cannot be changed at runtime.
Meter System supports following binding functions
- **avg**. Calculate the avg value for every entity in the same metrics name.
- **histogram**. Aggregate the counts in the configurable buckets, buckets is configurable but must be assigned in the declaration stage.
- **percentile**. Read [percentile in WIKI](https://en.wikipedia.org/wiki/Percentile). Unlike in the OAL, we provide
50/75/90/95/99 in default, in the meter system function, percentile function accepts several ranks, which should be in
The Meter System supports the following binding functions:
- **avg**. Calculates the avg value for every entity under the same metrics name.
- **histogram**. Aggregates the counts in the configurable buckets. Buckets are configurable but must be assigned in the declaration stage.
- **percentile**. See [percentile in WIKI](https://en.wikipedia.org/wiki/Percentile). Unlike the OAL, we provide
50/75/90/95/99 by default. In the meter system function, the percentile function accepts several ranks, which should be in
the (0, 100) range.