apache
SkyWalking
Commit f9096f50
Authored Apr 06, 2021 by wankai123
Committed Apr 06, 2021 by GitHub
Support k8s monitoring (#6479)
Parent: 96611394
Showing 9 changed files with 755 additions and 7 deletions
CHANGES.md  +2 -0
docs/en/setup/backend/backend-receivers.md  +3 -0
oap-server/analyzer/meter-analyzer/src/main/java/org/apache/skywalking/oap/meter/analyzer/dsl/tagOpt/K8sRetagType.java  +5 -5
oap-server/analyzer/meter-analyzer/src/main/java/org/apache/skywalking/oap/meter/analyzer/dsl/tagOpt/Retag.java  +1 -0
oap-server/analyzer/meter-analyzer/src/test/java/org/apache/skywalking/oap/meter/analyzer/dsl/K8sTagTest.java  +3 -2
oap-server/server-bootstrap/src/main/resources/otel-oc-rules/k8s-cluster.yaml  +89 -0
oap-server/server-bootstrap/src/main/resources/otel-oc-rules/k8s-node.yaml  +74 -0
oap-server/server-bootstrap/src/main/resources/otel-oc-rules/k8s-service.yaml  +66 -0
oap-server/server-bootstrap/src/main/resources/ui-initialized-templates/k8s.yml  +512 -0
CHANGES.md
@@ -68,6 +68,8 @@ Release Notes.
* Optimize the self monitoring grafana dashboard.
* Enhance the export service.
* Add function `retagByK8sMeta` and opt type `K8sRetagType.Pod2Service` in MAL for k8s to relate pods and services.
* Using "service.istio.io/canonical-name" to replace "app" label to resolve Envoy ALS service name.
* Support k8s monitoring.
* Make the flushing metrics operation concurrent.
* Fix ALS K8SServiceRegistry didn't remove the correct entry.
* Using "service.istio.io/canonical-name" to replace "app" label to resolve Envoy ALS service name.
...
docs/en/setup/backend/backend-receivers.md
@@ -132,6 +132,9 @@ to be the identification of the metric data.
|istio-controlplane| Metrics of Istio control panel | otel-oc-rules/istio-controlplane.yaml | Istio Control Panel -> OpenTelemetry Collector --OC format--> SkyWalking OAP Server |
|oap| Metrics of SkyWalking OAP server itself | otel-oc-rules/oap.yaml | SkyWalking OAP Server(SelfObservability) -> OpenTelemetry Collector --OC format--> SkyWalking OAP Server |
|vm| Metrics of VMs | otel-oc-rules/vm.yaml | Prometheus node-exporter(VMs) -> OpenTelemetry Collector --OC format--> SkyWalking OAP Server |
|k8s-cluster| Metrics of K8s cluster | otel-oc-rules/k8s-cluster.yaml | K8s kube-state-metrics -> OpenTelemetry Collector --OC format--> SkyWalking OAP Server |
|k8s-node| Metrics of K8s cluster | otel-oc-rules/k8s-node.yaml | cAdvisor & K8s kube-state-metrics -> OpenTelemetry Collector --OC format--> SkyWalking OAP Server |
|k8s-service| Metrics of K8s cluster | otel-oc-rules/k8s-service.yaml | cAdvisor & K8s kube-state-metrics -> OpenTelemetry Collector --OC format--> SkyWalking OAP Server |
## Meter receiver
...
...
oap-server/analyzer/meter-analyzer/src/main/java/org/apache/skywalking/oap/meter/analyzer/dsl/tagOpt/K8sRetagType.java
@@ -27,7 +27,6 @@ import org.apache.skywalking.oap.meter.analyzer.dsl.Sample;
 import org.apache.skywalking.oap.meter.analyzer.k8s.K8sInfoRegistry;
 public enum K8sRetagType implements Retag {
     Pod2Service {
         @Override
         public Sample[] execute(final Sample[] ss,
...
@@ -39,11 +38,12 @@ public enum K8sRetagType implements Retag {
                 String namespace = sample.getLabels().get(namespaceLabelName);
                 if (!Strings.isNullOrEmpty(podName) && !Strings.isNullOrEmpty(namespace)) {
                     String serviceName = K8sInfoRegistry.getInstance().findServiceName(namespace, podName);
-                    if (!Strings.isNullOrEmpty(serviceName)) {
-                        Map<String, String> labels = Maps.newHashMap(sample.getLabels());
-                        labels.put(newLabelName, serviceName);
-                        return sample.toBuilder().labels(ImmutableMap.copyOf(labels)).build();
+                    if (Strings.isNullOrEmpty(serviceName)) {
+                        serviceName = BLANK;
                     }
+                    Map<String, String> labels = Maps.newHashMap(sample.getLabels());
+                    labels.put(newLabelName, serviceName);
+                    return sample.toBuilder().labels(ImmutableMap.copyOf(labels)).build();
                 }
                 return sample;
             }).toArray(Sample[]::new);
...
oap-server/analyzer/meter-analyzer/src/main/java/org/apache/skywalking/oap/meter/analyzer/dsl/tagOpt/Retag.java
@@ -21,5 +21,6 @@ package org.apache.skywalking.oap.meter.analyzer.dsl.tagOpt;
 import org.apache.skywalking.oap.meter.analyzer.dsl.Sample;
 public interface Retag {
+    String BLANK = "";
     Sample[] execute(Sample[] ss, String newLabelName, String existingLabelName, String namespaceLabelName);
 }
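The effect of the `K8sRetagType` and `Retag` changes can be illustrated with a small, self-contained sketch: once the pod and namespace labels are present, the new label is always written, and it falls back to a blank value when no service is found. The class `RetagSketch`, the `POD_TO_SERVICE` map and its `namespace/pod` key format are hypothetical stand-ins for `K8sInfoRegistry` and are not part of the commit.

```java
import java.util.HashMap;
import java.util.Map;

public class RetagSketch {
    // Same idea as Retag.BLANK in the diff: an empty label value.
    public static final String BLANK = "";

    // Hypothetical stand-in for K8sInfoRegistry, keyed by "namespace/pod".
    static final Map<String, String> POD_TO_SERVICE = new HashMap<>();

    static {
        POD_TO_SERVICE.put("default/my-nginx-1", "nginx-service");
    }

    // Mirrors the post-change flow: if pod and namespace are known,
    // always add the new label, using BLANK when the lookup fails.
    public static Map<String, String> pod2Service(Map<String, String> labels,
                                                  String newLabel,
                                                  String podLabel,
                                                  String namespaceLabel) {
        String pod = labels.get(podLabel);
        String namespace = labels.get(namespaceLabel);
        if (pod == null || pod.isEmpty() || namespace == null || namespace.isEmpty()) {
            return labels; // nothing to look up, labels are returned unchanged
        }
        String service = POD_TO_SERVICE.get(namespace + "/" + pod);
        if (service == null || service.isEmpty()) {
            service = BLANK;
        }
        Map<String, String> copy = new HashMap<>(labels);
        copy.put(newLabel, service);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> known = new HashMap<>();
        known.put("namespace", "default");
        known.put("pod", "my-nginx-1");
        System.out.println(pod2Service(known, "service", "pod", "namespace"));

        Map<String, String> unknown = new HashMap<>();
        unknown.put("namespace", "default");
        unknown.put("pod", "my-nginx-5dc4865748-no-pod");
        System.out.println(pod2Service(unknown, "service", "pod", "namespace"));
    }
}
```

Writing a blank label instead of leaving the label absent gives downstream expressions one consistent value to filter on, which is how the rule files in this commit use `tagNotEqual('service' , '')`.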
oap-server/analyzer/meter-analyzer/src/test/java/org/apache/skywalking/oap/meter/analyzer/dsl/K8sTagTest.java
@@ -28,6 +28,7 @@ import java.util.Collection;
 import java.util.Map;
 import lombok.SneakyThrows;
 import lombok.extern.slf4j.Slf4j;
+import org.apache.skywalking.oap.meter.analyzer.dsl.tagOpt.Retag;
 import org.apache.skywalking.oap.meter.analyzer.k8s.K8sInfoRegistry;
 import org.junit.Before;
 import org.junit.Test;
...
@@ -133,7 +134,7 @@ public class K8sTagTest {
-    .labels(of("namespace", "default", "container", "my-nginx", "cpu", "total", "pod", "my-nginx-5dc4865748-no-pod"))
+    .labels(of("namespace", "default", "container", "my-nginx", "cpu", "total", "pod", "my-nginx-5dc4865748-no-pod", "service", Retag.BLANK))
     .value(2)
     .build(),
...
@@ -175,7 +176,7 @@ public class K8sTagTest {
-    .labels(of("namespace", "default", "container", "my-nginx", "cpu", "total", "pod", "my-nginx-5dc4865748-no-service"))
+    .labels(of("namespace", "default", "container", "my-nginx", "cpu", "total", "pod", "my-nginx-5dc4865748-no-service", "service", Retag.BLANK))
     .value(2)
     .build(),
...
oap-server/server-bootstrap/src/main/resources/otel-oc-rules/k8s-cluster.yaml
new file mode 100644
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This will parse a textual representation of a duration. The formats
# accepted are based on the ISO-8601 duration format {@code PnDTnHnMn.nS}
# with days considered to be exactly 24 hours.
# <p>
# Examples:
# <pre>
# "PT20.345S" -- parses as "20.345 seconds"
# "PT15M" -- parses as "15 minutes" (where a minute is 60 seconds)
# "PT10H" -- parses as "10 hours" (where an hour is 3600 seconds)
# "P2D" -- parses as "2 days" (where a day is 24 hours or 86400 seconds)
# "P2DT3H4M" -- parses as "2 days, 3 hours and 4 minutes"
# "P-6H3M" -- parses as "-6 hours and +3 minutes"
# "-P6H3M" -- parses as "-6 hours and -3 minutes"
# "-P-6H+3M" -- parses as "+6 hours and -3 minutes"
# </pre>
expSuffix: tag({tags -> tags.cluster = 'k8s-cluster::' + tags.cluster}).service(['cluster'])
metricPrefix: k8s_cluster
metricsRules:
  - name: cpu_cores
    exp: (kube_node_status_capacity * 1000).tagEqual('resource' , 'cpu').sum(['cluster'])
  - name: cpu_cores_allocatable
    exp: (kube_node_status_allocatable * 1000).tagEqual('resource' , 'cpu').sum(['cluster'])
  - name: cpu_cores_requests
    exp: (kube_pod_container_resource_requests * 1000).tagEqual('resource' , 'cpu').sum(['cluster'])
  - name: cpu_cores_limits
    exp: (kube_pod_container_resource_limits * 1000).tagEqual('resource' , 'cpu').sum(['cluster'])
  - name: memory_total
    exp: kube_node_status_capacity.tagEqual('resource' , 'memory').sum(['cluster'])
  - name: memory_allocatable
    exp: kube_node_status_allocatable.tagEqual('resource' , 'memory').sum(['cluster'])
  - name: memory_requests
    exp: kube_pod_container_resource_requests.tagEqual('resource' , 'memory').sum(['cluster'])
  - name: memory_limits
    exp: kube_pod_container_resource_limits.tagEqual('resource' , 'memory').sum(['cluster'])
  - name: storage_total
    exp: kube_node_status_capacity.tagEqual('resource' , 'ephemeral_storage').sum(['cluster'])
  - name: storage_allocatable
    exp: kube_node_status_allocatable.tagEqual('resource' , 'ephemeral_storage').sum(['cluster'])
  - name: node_total
    exp: kube_node_info.sum(['cluster'])
  - name: node_status
    exp: kube_node_status_condition.valueEqual(1).tagMatch('status' , 'true|unknown').sum(['cluster' , 'node' ,'condition'])
  - name: namespace_total
    exp: kube_namespace_labels.sum(['cluster'])
  - name: deployment_total
    exp: kube_deployment_labels.sum(['cluster'])
  - name: deployment_status
    exp: kube_deployment_status_condition.valueEqual(1).tagMatch('condition' , 'Available').sum(['cluster' , 'deployment' ,'condition' , 'status']).tag({tags -> tags.remove('condition')})
  - name: deployment_spec_replicas
    exp: kube_deployment_spec_replicas.sum(['cluster' , 'deployment'])
  - name: service_total
    exp: kube_service_info.sum(['cluster'])
  - name: service_pod_status
    exp: kube_pod_status_phase.retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').valueEqual(1).sum(['cluster' , 'service' , 'phase'])
  - name: pod_total
    exp: kube_pod_info.sum(['cluster'])
  - name: pod_status_not_running
    exp: kube_pod_status_phase.valueEqual(1).tagNotMatch('phase' , 'Running').sum(['cluster' , 'pod' , 'phase'])
  - name: container_total
    exp: kube_pod_container_info.sum(['cluster'])
  - name: pod_status_waiting
    exp: kube_pod_container_status_waiting_reason.valueEqual(1).sum(['cluster' , 'pod' , 'container' , 'reason'])
  - name: pod_status_terminated
    exp: kube_pod_container_status_terminated_reason.valueEqual(1).sum(['cluster' , 'pod' , 'container' , 'reason'])
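To make the shape of these rules concrete, here is a rough sketch of what an expression such as `(kube_node_status_capacity * 1000).tagEqual('resource' , 'cpu').sum(['cluster'])` appears to compute: scale each value, keep only samples whose `resource` label is `cpu`, then add the values up per `cluster`. This is plain Java written for illustration, not the MAL engine; `TagEqualSumSketch`, its `Sample` class and the sample data are assumptions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TagEqualSumSketch {
    // Hypothetical, simplified sample: a label map plus a numeric value.
    public static final class Sample {
        final Map<String, String> labels = new HashMap<>();
        final double value;

        public Sample(double value, String... keyValues) {
            this.value = value;
            for (int i = 0; i + 1 < keyValues.length; i += 2) {
                labels.put(keyValues[i], keyValues[i + 1]);
            }
        }
    }

    // (metric * 1000).tagEqual('resource', 'cpu').sum(['cluster'])
    public static Map<String, Double> cpuCoresByCluster(List<Sample> samples) {
        Map<String, Double> totals = new TreeMap<>();
        for (Sample s : samples) {
            if (!"cpu".equals(s.labels.get("resource"))) {
                continue; // tagEqual: drop samples with a different label value
            }
            String cluster = s.labels.getOrDefault("cluster", "");
            totals.merge(cluster, s.value * 1000, Double::sum); // scale, then sum per cluster
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Sample> samples = Arrays.asList(
            new Sample(4, "cluster", "c1", "node", "n1", "resource", "cpu"),
            new Sample(8, "cluster", "c1", "node", "n2", "resource", "cpu"),
            new Sample(16e9, "cluster", "c1", "node", "n1", "resource", "memory"));
        System.out.println(cpuCoresByCluster(samples));
    }
}
```

With two nodes reporting 4 and 8 cores, the sketch yields 12000 for cluster `c1`, which reads as millicores if the source metric is in cores.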
oap-server/server-bootstrap/src/main/resources/otel-oc-rules/k8s-node.yaml
new file mode 100644
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This will parse a textual representation of a duration. The formats
# accepted are based on the ISO-8601 duration format {@code PnDTnHnMn.nS}
# with days considered to be exactly 24 hours.
# <p>
# Examples:
# <pre>
# "PT20.345S" -- parses as "20.345 seconds"
# "PT15M" -- parses as "15 minutes" (where a minute is 60 seconds)
# "PT10H" -- parses as "10 hours" (where an hour is 3600 seconds)
# "P2D" -- parses as "2 days" (where a day is 24 hours or 86400 seconds)
# "P2DT3H4M" -- parses as "2 days, 3 hours and 4 minutes"
# "P-6H3M" -- parses as "-6 hours and +3 minutes"
# "-P6H3M" -- parses as "-6 hours and -3 minutes"
# "-P-6H+3M" -- parses as "+6 hours and -3 minutes"
# </pre>
expSuffix: tag({tags -> tags.cluster = 'k8s-cluster::' + tags.cluster}).instance(['cluster'] , ['node'])
metricPrefix: k8s_node
metricsRules:
  - name: cpu_cores
    exp: (kube_node_status_capacity * 1000).tagEqual('resource' , 'cpu').sum(['cluster' , 'node'])
  - name: cpu_usage
    exp: (container_cpu_usage_seconds_total * 1000).tagEqual('id' , '/').sum(['cluster' , 'node']).rate('PT1M')
  - name: cpu_cores_allocatable
    exp: (kube_node_status_allocatable * 1000).tagEqual('resource' , 'cpu').sum(['cluster' , 'node'])
  - name: cpu_cores_requests
    exp: (kube_pod_container_resource_requests * 1000).tagEqual('resource' , 'cpu').sum(['cluster' , 'node'])
  - name: cpu_cores_limits
    exp: (kube_pod_container_resource_limits * 1000).tagEqual('resource' , 'cpu').sum(['cluster' , 'node'])
  - name: memory_total
    exp: kube_node_status_capacity.tagEqual('resource' , 'memory').sum(['cluster' , 'node'])
  - name: memory_allocatable
    exp: kube_node_status_allocatable.tagEqual('resource' , 'memory').sum(['cluster' , 'node'])
  - name: memory_requests
    exp: kube_pod_container_resource_requests.tagEqual('resource' , 'memory').sum(['cluster' , 'node'])
  - name: memory_limits
    exp: kube_pod_container_resource_limits.tagEqual('resource' , 'memory').sum(['cluster' , 'node'])
  - name: memory_usage
    exp: container_memory_working_set_bytes.tagEqual('id' , '/').sum(['cluster' , 'node'])
  - name: storage_total
    exp: kube_node_status_capacity.tagEqual('resource' , 'ephemeral_storage').sum(['cluster' , 'node'])
  - name: storage_allocatable
    exp: kube_node_status_allocatable.tagEqual('resource' , 'ephemeral_storage').sum(['cluster' , 'node'])
  - name: node_status
    exp: kube_node_status_condition.valueEqual(1).tagMatch('status' , 'true|unknown').sum(['cluster' , 'node' ,'condition'])
  - name: pod_total
    exp: kube_pod_info.sum(['cluster' , 'node'])
  - name: network_receive
    exp: container_network_receive_bytes_total.sum(['cluster' , 'node']).irate()
  - name: network_transmit
    exp: container_network_transmit_bytes_total.sum(['cluster' , 'node']).irate()
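The `cpu_usage` rule passes the ISO-8601 duration `'PT1M'` to `rate`. As a hedged illustration, the sketch below parses that string with `java.time.Duration` and computes a per-second rate of increase between two counter readings taken one window apart. It assumes `rate` means "increase divided by window length in seconds"; the real MAL windowing logic is not shown in this commit, and `RateSketch` and its inputs are hypothetical.

```java
import java.time.Duration;

public class RateSketch {
    // Per-second rate of increase of a counter across one window.
    // Assumption: start and end are readings taken exactly one window apart.
    public static double ratePerSecond(double start, double end, Duration window) {
        return (end - start) / window.getSeconds();
    }

    public static void main(String[] args) {
        Duration window = Duration.parse("PT1M"); // one minute
        System.out.println(window.getSeconds());  // 60

        // CPU time counter scaled by 1000, as in the cpu_usage rule:
        // 120 s -> 150 s of CPU time becomes 120000 -> 150000.
        System.out.println(ratePerSecond(120_000, 150_000, window)); // 500.0
    }
}
```

Under these assumptions, 30 seconds of CPU time consumed over a 60-second window comes out as 500, that is, half a core expressed in millicores.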
oap-server/server-bootstrap/src/main/resources/otel-oc-rules/k8s-service.yaml
new file mode 100644
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This will parse a textual representation of a duration. The formats
# accepted are based on the ISO-8601 duration format {@code PnDTnHnMn.nS}
# with days considered to be exactly 24 hours.
# <p>
# Examples:
# <pre>
# "PT20.345S" -- parses as "20.345 seconds"
# "PT15M" -- parses as "15 minutes" (where a minute is 60 seconds)
# "PT10H" -- parses as "10 hours" (where an hour is 3600 seconds)
# "P2D" -- parses as "2 days" (where a day is 24 hours or 86400 seconds)
# "P2DT3H4M" -- parses as "2 days, 3 hours and 4 minutes"
# "P-6H3M" -- parses as "-6 hours and +3 minutes"
# "-P6H3M" -- parses as "-6 hours and -3 minutes"
# "-P-6H+3M" -- parses as "+6 hours and -3 minutes"
# </pre>
expSuffix: tag({tags -> tags.cluster = 'k8s-cluster::' + tags.cluster}).endpoint(['cluster'] , ['service'])
metricPrefix: k8s_service
metricsRules:
  - name: pod_total
    exp: kube_pod_info.retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').sum(['cluster' , 'service'])
  - name: cpu_cores_requests
    exp: (kube_pod_container_resource_requests * 1000).retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').tagEqual('resource' , 'cpu').sum(['cluster' , 'service'])
  - name: cpu_cores_limits
    exp: (kube_pod_container_resource_limits * 1000).retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').tagEqual('resource' , 'cpu').sum(['cluster' , 'service'])
  - name: memory_requests
    exp: kube_pod_container_resource_requests.retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').tagEqual('resource' , 'memory').sum(['cluster' , 'service'])
  - name: memory_limits
    exp: kube_pod_container_resource_limits.retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').tagEqual('resource' , 'memory').sum(['cluster' , 'service'])
  - name: pod_status
    exp: kube_pod_status_phase.retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').valueEqual(1).sum(['cluster' , 'service' , 'pod' , 'phase'])
  - name: pod_status_waiting
    exp: kube_pod_container_status_waiting_reason.retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').valueEqual(1).sum(['cluster' , 'service' , 'pod' , 'container' , 'reason'])
  - name: pod_status_terminated
    exp: kube_pod_container_status_terminated_reason.retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').valueEqual(1).sum(['cluster' , 'service' , 'pod' , 'container' , 'reason'])
  - name: pod_status_restarts_total
    exp: kube_pod_container_status_restarts_total.retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').sum(['cluster' , 'service' , 'pod'])
  - name: pod_cpu_usage
    exp: (container_cpu_usage_seconds_total * 1000).tagNotEqual('pod' , '').retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').sum(['cluster' , 'service' , 'pod']).rate('PT1M')
  - name: pod_memory_usage
    exp: container_memory_working_set_bytes.retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').sum(['cluster' , 'service' , 'pod'])
  - name: pod_network_receive
    exp: container_network_receive_bytes_total.tagNotEqual('pod' , '').retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').sum(['cluster' , 'service' , 'pod']).irate()
  - name: pod_network_transmit
    exp: container_network_transmit_bytes_total.tagNotEqual('pod' , '').retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').sum(['cluster' , 'service' , 'pod']).irate()
  - name: pod_fs_usage
    exp: container_fs_usage_bytes.tagNotEqual('pod' , '').retagByK8sMeta('service' , K8sRetagType.Pod2Service , 'pod' , 'namespace').tagNotEqual('service' , '').sum(['cluster' , 'service' , 'pod'])
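Every rule in this file follows the same pattern: retag samples with a `service` label, drop the ones where that label is blank, then aggregate. The sketch below walks through the `pod_total` rule on already-retagged label maps. It is an illustration only: `ServicePodTotalSketch` and its helper are hypothetical, each `kube_pod_info` sample is assumed to carry the value 1, and a missing `service` label is treated the same as a blank one.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ServicePodTotalSketch {
    // Builds a label map from alternating key/value arguments.
    public static Map<String, String> labels(String... keyValues) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            m.put(keyValues[i], keyValues[i + 1]);
        }
        return m;
    }

    // tagNotEqual('service', '') followed by sum(['cluster', 'service']),
    // assuming each sample contributes a value of 1.
    public static Map<String, Integer> podTotal(List<Map<String, String>> retagged) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map<String, String> sample : retagged) {
            String service = sample.getOrDefault("service", "");
            if (service.isEmpty()) {
                continue; // pod does not belong to any known service
            }
            String key = sample.get("cluster") + "/" + service;
            totals.merge(key, 1, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Map<String, String>> retagged = Arrays.asList(
            labels("cluster", "c1", "pod", "nginx-1", "service", "nginx"),
            labels("cluster", "c1", "pod", "nginx-2", "service", "nginx"),
            labels("cluster", "c1", "pod", "orphan-1", "service", ""));
        System.out.println(podTotal(retagged));
    }
}
```

The pod with a blank `service` label is excluded, so only pods that resolve to a service are counted.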
oap-server/server-bootstrap/src/main/resources/ui-initialized-templates/k8s.yml
new file mode 100644
This diff is collapsed.