diff --git "a/Documentation/A-Tune\347\224\250\346\210\267\346\214\207\345\215\227.md" "b/Documentation/A-Tune\347\224\250\346\210\267\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..710266c4b48fdd4c90878df9a665f4b8e12a244a --- /dev/null +++ "b/Documentation/A-Tune\347\224\250\346\210\267\346\214\207\345\215\227.md" @@ -0,0 +1,1389 @@ +# A-Tune用户指南 + +## 法律申明 + +**版权所有 © 2020 华为技术有限公司。** + +您对“本文档”的复制,使用,修改及分发受知识共享\(Creative Commons\)署名—相同方式共享4.0国际公共许可协议\(以下简称“CC BY-SA 4.0”\)的约束。为了方便用户理解,您可以通过访问[https://creativecommons.org/licenses/by-sa/4.0/](https://creativecommons.org/licenses/by-sa/4.0/) 了解CC BY-SA 4.0的概要 \(但不是替代\)。CC BY-SA 4.0的完整协议内容您可以访问如下网址获取:[https://creativecommons.org/licenses/by-sa/4.0/legalcode](https://creativecommons.org/licenses/by-sa/4.0/legalcode)。 + +**商标声明** + +A-Tune和其他华为商标均为华为技术有限公司的商标。本文档提及的其他所有商标或注册商标,由各自的所有人拥有。 + +**免责声明** + +本文档仅作为使用指导,除非适用法强制规定或者双方有明确书面约定, 华为技术有限公司对本文档中的所有陈述、信息和建议不做任何明示或默示的声明或保证,包括但不限于不侵权,时效性或满足特定目的的担保。 + +## 前言 + +### 概述 + +本文档介绍openEuler系统性能自优化软件A-Tune的安装部署和使用方法,以指导用户快速了解并使用A-Tune。 + +### 读者对象 + +本文档适用于使用openEuler系统并希望了解和使用A-Tune的社区开发者、开源爱好者以及相关合作伙伴。使用人员需要具备基本的Linux操作系统知识。 + +## 1 认识A-Tune + +## 1.1 简介 + +操作系统作为衔接应用和硬件的基础软件,如何调整系统和应用配置,充分发挥软硬件能力,从而使业务性能达到最优,对用户至关重要。然而,运行在操作系统上的业务类型成百上千,应用形态千差万别,对资源的要求各不相同。当前硬件和基础软件组成的应用环境涉及高达7000多个配置对象,随着业务复杂度和调优对象的增加,调优所需的时间成本呈指数级增长,导致调优效率急剧下降,调优成为了一项极其复杂的工程,给用户带来巨大挑战。 +其次,操作系统作为基础设施软件,提供了大量的软硬件管理能力,每种能力适用场景不尽相同,并非对所有的应用场景都通用有益,因此,不同的场景需要开启或关闭不同的能力,组合使用系统提供的各种能力,才能发挥应用程序的最佳性能; +另外,实际业务场景成千上万,计算、网络、存储等硬件配置也层出不穷,实验室无法遍历穷举所有的应用和业务场景,以及不同的硬件组合。 +为了应对上述挑战,openEuler推出了A-Tune。 + +A-Tune是一款基于AI开发的系统性能优化的基础软件,它利用人工智能技术,对业务场景建立精准的系统画像,感知并推理出业务特征,进而做出智能决策,匹配并推荐最佳的系统参数配置组合,使业务处于最佳运行状态。 + +![](figures/zh-cn_image_0215192422.png) + +## 1.2 架构 + +A-Tune核心技术架构如下图,主要包括智能决策、系统画像和交互系统三层。 + +- 智能决策层:包含感知和决策两个子系统,分别承担对应用的智能感知和对系统的调优决策。 +- 系统画像层:主要包括标注和学习系统,标注系统用于业务模型的聚类,学习系统用于业务模型的学习和分类。 +- 交互系统层:用于各类系统资源的监控和配置,调优策略执行在本层进行。 + 
+![](figures/zh-cn_image_0215591510.png) + +## 1.3 支持特性与业务模型 + +### 支持特性 + +A-Tune支持的主要特性、特性成熟度以及使用建议请参见[表1](#table1919220557576)。 + +**表 1** 特性成熟度 + + + + + + + + + + + + + + + + + + + + +

+| 特性 | 成熟度 | 使用建议 |
+| --- | --- | --- |
+| 七大类11款应用负载类型自动优化 | 已测试 | 试用 |
+| 自定义负载类型和业务模型 | 已测试 | 试用 |
+| 参数自调优 | 已测试 | 试用 |

+
+ +### 支持业务模型 + +根据应用的负载特征,A-Tune将业务分为七大类,各类型的负载特征和A-Tune支持的应用请参见[表2](#table2819164611311)。 + +**表 2** 支持的业务类型和应用 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+| 业务类型(workload_type) | 类型说明 | 负载特征 | 支持的应用 |
+| --- | --- | --- | --- |
+| default | 默认类型 | CPU、内存带宽、网络、IO各维度资源使用率都不高 | N/A |
+| webserver | https应用 | CPU使用率高 | Nginx |
+| big_database | 数据库 | 关系型数据库:读:CPU、内存带宽、网络使用率高;写:IO使用率高<br>非关系型数据库:CPU、IO使用率高 | MongoDB、MySQL、PostgreSQL、MariaDB |
+| big_data | 大数据 | CPU、IO使用率较高 | Hadoop、Spark |
+| in-memory_computing | 内存密集型应用 | CPU、内存带宽使用率高 | SPECjbb2015 |
+| in-memory_database | 计算+网络密集型应用 | CPU单核使用率高,多实例下网络使用率高 | Redis |
+| single_computer_intensive_jobs | 计算密集型应用 | CPU单核使用率高,部分子项内存带宽使用率高 | SPECCPU2006 |
+| communication | 网络密集型应用 | CPU、网络使用率高 | Dubbo |
+| idle | 系统idle | 系统处于空闲状态,无任何应用运行 | N/A |

+
+ +## 2 安装与部署 + +本章介绍如何安装和部署A-Tune。 + +## 2.1 软硬件要求 + +### 硬件要求 + +- 鲲鹏920处理器 + +### 软件要求 + +- 操作系统:openEuler 1.0 + +## 2.2 环境准备 + +安装openEuler系统,安装方法参考《openEuler 1.0 安装指南》。 + +## 2.3 安装A-Tune + +本章介绍A-Tune的安装模式和安装方法。 + +### 2.3.1 安装模式介绍 + +A-Tune支持单机模式和分布式模式安装: + +- 单机模式 + + client和server安装到同一台机器上。 + +- 分布式模式 + + client和server分别安装在不同的机器上。 + + +![](figures/zh-cn_image_0214540005.png) + +### 2.3.2 安装操作 + +安装A-Tune的操作步骤如下: + +1. 挂载openEuler的iso文件。 + + ``` + # mount openEuler-1.0-aarch64-dvd.iso /mnt + ``` + +2. 配置本地yum源。 + + ``` + # vim /etc/yum.repos.d/local.repo + ``` + + 配置内容如下所示: + + ``` + [local] + name=local + baseurl=file:///mnt + gpgcheck=0 + enabled=1 + ``` + +3. 安装A-Tune服务端。 + + >![](public_sys-resources/icon-note.gif) **说明:** + >本步骤会同时安装服务端和客户端软件包,对于单机部署模式,请跳过**步骤4**。 + + ``` + # yum install atune -y + ``` + +4. 安装A-Tune客户端。 + + ``` + # yum install atune-client -y + ``` + +5. 验证是否安装成功。 + + ``` + # rpm -qa | grep atune + atune-client-xxx + atune-db-xxx + atune-xxx + ``` + + 有如上回显信息表示安装成功。 + + +## 2.4 部署A-Tune + +本章介绍A-Tune的配置部署。 + +### 2.4.1 配置介绍 + +A-Tune配置文件/etc/atuned/atuned.cnf的配置项说明如下: + +- A-Tune服务启动配置 + + 可根据需要进行修改。 + + - address:系统grpc服务的侦听地址,默认为127.0.0.1,若为多机部署,需进行修改。 + - port:系统grpc服务的侦听端口,范围为0\~65535未使用的端口。 + - rest\_port:系统restservice的侦听端口, 范围为0\~65535未使用的端口。 + - sample\_num:系统执行analysis流程时采集样本的数量。 + +- system信息 + + system为系统执行相关的优化需要用到的参数信息,必须根据系统实际情况进行修改。 + + - disk:执行analysis流程时需要采集的对应磁盘的信息或执行磁盘相关优化时需要指定的磁盘。 + - network:执行analysis时需要采集的对应的网卡的信息或执行网卡相关优化时需要指定的网卡。 + - user:执行ulimit相关优化时用到的用户名。目前只支持root用户。 + - tls:开启A-Tune的gRPC和http服务SSL/TLS证书校验,默认不开启。开启TLS后atune-adm命令在使用前需要设置以下环境变量方可与服务端进行通讯: + - export ATUNE\_TLS=yes + - export ATUNE\_CLICERT=<客户端证书路径\> + + - tlsservercertfile:gRPC服务端证书路径。 + - tlsserverkeyfile:gRPC服务端秘钥路径。 + - tlshttpcertfile:http服务端证书路径。 + - tlshttpkeyfile:http服务端秘钥路径。 + - tlshttpcacertfile:http服务端CA证书路径。 + +- 日志信息 + + 根据情况修改日志的路径和级别,默认的日志信息在/var/log/messages中。 + +- monitor信息 + + 为系统启动时默认采集的系统的硬件信息。 + + +### 配置示例 + +``` 
+#################################### server ############################### +# atuned config +[server] +# the address the grpc server binds to, default is 127.0.0.1 +address = 127.0.0.1 + +# the atuned grpc listening port, default is 60001 +# the port can be set between 0 and 65535 and must not be in use +port = 60001 + +# the rest service listening port, default is 8383 +# the port can be set between 0 and 65535 and must not be in use +rest_port = 8383 + +# the number of samples collected when the analysis command is run +# default is 20 +sample_num = 20 + +# Enable gRPC and http server authentication SSL/TLS +# default is false +# tls = true +# tlsservercertfile = /etc/atuned/server.pem +# tlsserverkeyfile = /etc/atuned/server.key +# tlshttpcertfile = /etc/atuned/http/server.pem +# tlshttpkeyfile = /etc/atuned/http/server.key +# tlshttpcacertfile = /etc/atuned/http/cacert.pem + +#################################### log ############################### +# Either "debug", "info", "warn", "error", "critical", default is "info" +level = info + +#################################### monitor ############################### +[monitor] +# With the module and format of the MPI, the format is {module}_{purpose} +# The module is either "mem", "net", "cpu", "storage" +# The purpose is "topo" +module = mem_topo, cpu_topo + +#################################### system ############################### +# you can add arbitrary key-value here, just like key = value +# you can use the key in the profile +[system] +# the disk to be analyzed +disk = sda + +# the network interface to be analyzed +network = enp189s0f0 + +user = root +``` + +## 2.5 启动A-Tune + +A-Tune安装完成后,需要启动A-Tune服务才能使用。 + +- 启动atuned服务: + + ``` + $ systemctl start atuned + ``` + + +- 查询atuned服务状态: + + ``` + $ systemctl status atuned + ``` + + 若回显为如下,则服务启动成功。 + + ![](figures/zh-cn_image_0214540398.png) + + +## 3 使用方法 + +用户可以通过调用A-Tune提供的命令行接口使用A-Tune提供的功能。本章介绍A-Tune命令行接口的功能和使用方式。 + +## 3.1 查询负载类型 + +## list + +### 功能描述 + 
+查询系统当前支持的workload\_type和对应的profile,以及当前处于active状态的profile。 + +### 命令格式 + +**atune-adm list** + +### 使用示例 + +``` +$ atune-adm list + +Support WorkloadTypes: ++-----------------------------------+------------------------+-----------+ +| WorkloadType | ProfileName | Active | ++===================================+========================+===========+ +| default | default | true | ++-----------------------------------+------------------------+-----------+ +| webserver | ssl_webserver | false | ++-----------------------------------+------------------------+-----------+ +| big_database | database | false | ++-----------------------------------+------------------------+-----------+ +| big_data | big_data | false | ++-----------------------------------+------------------------+-----------+ +| in-memory_computing | in-memory_computing | false | ++-----------------------------------+------------------------+-----------+ +| in-memory_database | in-memory_database | false | ++-----------------------------------+------------------------+-----------+ +| single_computer_intensive_jobs | compute-intensive | false | ++-----------------------------------+------------------------+-----------+ +| communication | rpc_communication | false | ++-----------------------------------+------------------------+-----------+ +| idle | default | false | ++-----------------------------------+------------------------+-----------+ + +``` + +>![](public_sys-resources/icon-note.gif) **说明:** +>Active为true表示当前激活的profile,示例表示当前激活的是default类型对应的profile。 + +## 3.2 自定义负载类型 + +除了系统已定义的负载类型,A-Tune也支持用户定义新的workload\_type及对应profile,并允许更新或删除自定义的workload\_type。 + +用户也可以将“使用方法 \> 自定义模型”中用户训练的自定义模型添加到A-Tune中。 + +## define + +### 功能描述 + +添加用户自定义的workload\_type,及对应的profile优化项。 + +### 命令格式 + +**atune-adm define** + +### 使用示例 + +新增一个workload type,workload type的名称为test\_type,profile name的名称为test\_name,优化项的配置文件为example.conf。 + +``` +$ atune-adm define test_type test_name ./example.conf +``` + +example.conf 
可以参考如下方式书写(以下各优化项非必填,仅供参考),也可通过**atune-adm info**查看已有的profile是如何书写的。 + +``` +[main] +# list its parent profile +[tip] +# the recommended optimization, which should be performed manually +[check] +# check the environment +[affinity.irq] +# to change the affinity of irqs +[affinity.task] +# to change the affinity of tasks +[bios] +# to change the bios config +[bootloader.grub2] +# to change the grub2 config +[kernel_config] +# to change the kernel config +[script] +# the script extension of cpi +[sysctl] +# to change the /proc/sys/* config +[sysfs] +# to change the /sys/* config +[systemctl] +# to change the system service config +[ulimit] +# to change the resources limit of user +``` + +## update + +### 功能描述 + +将workload\_type原来的优化项更新为new.conf中的内容。 + +### 命令格式 + +**atune-adm update** + +### 使用示例 + +更新负载类型为test\_type,优化项名称为test\_name的优化项为new.conf。 + +``` +$ atune-adm update test_type test_name ./new.conf +``` + +## undefine + +### 功能描述 + +删除用户自定义的workload\_type。 + +### 命令格式 + +**atune-adm undefine** + +### 使用示例 + +删除自定义的负载类型test\_type。 + +``` +$ atune-adm undefine test_type +``` + +## 3.3 自定义模型 + +A-Tune支持用户训练新的workload\_type。训练方法非常简单,用户只要通过collection和train两条命令,即可完成新模型的训练。 + +## collection + +### 功能描述 + +采集业务运行时系统的全局资源使用情况以及OS的各项状态信息,并将收集的结果保存到csv格式的输出文件中,作为模型训练的输入数据集。 + +>![](public_sys-resources/icon-note.gif) **说明:** +>本命令依赖采样工具perf,mpstat,vmstat,iostat,sar。CPU型号目前仅支持鲲鹏920,可通过dmidecode -t processor检查CPU型号。 + +### 命令格式 + +**atune-adm collection** + +### 参数说明 + +- OPTIONS + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+| 参数 | 描述 |
+| --- | --- |
+| --filename, -f | 生成的用于训练的csv文件名:name-时间戳.csv |
+| --output_path, -o | 生成的csv文件的存放路径,需提供绝对路径 |
+| --disk, -b | 业务运行时实际使用的磁盘,如/dev/sda |
+| --network, -n | 业务运行时使用的网络接口,如eth0 |
+| --workload_type, -t | 标记业务的负载类型,作为后续训练的标签 |
+| --duration, -d | 业务运行时采集数据的时间,单位秒,默认采集时间1200秒 |
+| --interval, -i | 采集数据的时间间隔,单位秒,默认采集间隔5秒 |

+
+ + +### 使用示例 + +``` +$ atune-adm collection --filename name --interval 5 --duration 1200 --output_path /data --disk sda --network eth0 --workload_type test_type +``` + +## train + +### 功能描述 + +使用采集的数据进行模型的训练。训练时至少采集两种workload\_type的数据,否则会报错。 + +### 命令格式 + +**atune-adm train** + +### 参数说明 + +- OPTIONS + + + + + + + + + + + + + + +

+| 参数 | 描述 |
+| --- | --- |
+| --data_path, -d | 存放模型训练所需的csv文件的目录 |
+| --output_file, -o | 训练生成的新模型 |

+
+ + +### 使用示例 + +使用data目录下的csv文件作为训练输入,生成的新模型new-model.m存放在model目录下。 + +``` +$ atune-adm train --data_path ./data --output_file ./model/new-model.m +``` + +## 3.4 分析负载类型并自优化 + + +## analysis + +### 功能描述 + +采集系统的实时统计数据进行负载类型识别,并进行自动优化。 + +### 命令格式 + +**atune-adm analysis** \[OPTIONS\] + +### 参数说明 + + + + + + + + + + +

+| 参数 | 描述 |
+| --- | --- |
+| --model, -m | 用户自训练产生的新模型 |

+
+ +### 使用示例 + +- 使用默认的模型进行分类识别 + + ``` + $ atune-adm analysis + ``` + +- 使用自训练的模型进行识别 + + ``` + $ atune-adm analysis --model ./model/new-model.m + ``` + + +## 3.5 查询profile + +## info + +### 功能描述 + +查看workload\_type对应的profile内容。 + +### 命令格式 + +**atune-adm info** <_WORKLOAD\_TYPE\>_ + +### 使用示例 + +查看webserver的profile内容: + +``` +$ atune-adm info webserver + +*** ssl_webserver: + +# +# webserver tuned configuration +# +[main] +#TODO CONFIG + +[kernel_config] +#TODO CONFIG + +[bios] +#TODO CONFIG + +[sysfs] +#TODO CONFIG + +[sysctl] +fs.file-max=6553600 +fs.suid_dumpable = 1 +fs.aio-max-nr = 1048576 +kernel.shmmax = 68719476736 +kernel.shmall = 4294967296 +kernel.shmmni = 4096 +kernel.sem = 250 32000 100 128 +net.ipv4.tcp_tw_reuse = 1 +net.ipv4.tcp_syncookies = 1 +net.ipv4.ip_local_port_range = 1024 65500 +net.ipv4.tcp_max_tw_buckets = 5000 +net.core.somaxconn = 65535 +net.core.netdev_max_backlog = 262144 +net.ipv4.tcp_max_orphans = 262144 +net.ipv4.tcp_max_syn_backlog = 262144 +net.ipv4.tcp_timestamps = 0 +net.ipv4.tcp_synack_retries = 1 +net.ipv4.tcp_syn_retries = 1 +net.ipv4.tcp_fin_timeout = 1 +net.ipv4.tcp_keepalive_time = 60 +net.ipv4.tcp_mem = 362619 483495 725238 +net.ipv4.tcp_rmem = 4096 87380 6291456 +net.ipv4.tcp_wmem = 4096 16384 4194304 +net.core.wmem_default = 8388608 +net.core.rmem_default = 8388608 +net.core.rmem_max = 16777216 +net.core.wmem_max = 16777216 + +[systemctl] +sysmonitor=stop +irqbalance=stop + +[bootloader.grub2] +selinux=0 +iommu.passthrough=1 + +[tip] +bind your master process to the CPU near the network = affinity +bind your network interrupt to the CPU that has this network = affinity +relogin into the system to enable limits setting = OS + +[script] +openssl_hpre = 0 +prefetch = off + +[ulimit] +{user}.hard.nofile = 102400 +{user}.soft.nofile = 102400 + +[affinity.task] +#TODO CONFIG + +[affinity.irq] +#TODO CONFIG + +[check] +#TODO CONFIG + +``` + +## 3.6 设置profile + +## profile + +### 功能描述 + 
+手动激活workload\_type对应的profile,使得workload\_type处于active状态。 + +### 命令格式 + +**atune-adm profile** + +### 参数说明 + +WORKLOAD\_TYPE 支持的类型参考list命令查询结果。 + +### 使用示例 + +激活webserver对应的profile配置。 + +``` +$ atune-adm profile webserver +``` + +## 3.7 回滚profile + +## rollback + +### 功能描述 + +回退当前的配置到系统的初始配置。 + +### 命令格式 + +**atune-adm rollback** + +### 使用示例 + +``` +$ atune-adm rollback +``` + +## 3.8 更新数据库 + +## upgrade + +### 功能描述 + +更新系统的数据库。 + +### 命令格式 + +**atune-adm upgrade** + +### 使用示例 + +数据库更新为new\_sqlite.db。 + +``` +$ atune-adm upgrade ./new_sqlite.db +``` + +## 3.9 系统信息查询 + +## check + +### 功能描述 + +检查系统当前的cpu、bios、os、网卡等信息。 + +### 命令格式 + +**atune-adm check** + +### 使用示例 + +``` +$ atune-adm check + cpu information: + cpu:0 version: Kunpeng 920-6426 speed: 2600000000 HZ cores: 64 + cpu:1 version: Kunpeng 920-6426 speed: 2600000000 HZ cores: 64 + system information: + DMIBIOSVersion: 0.59 + OSRelease: 4.19.36-vhulk1906.3.0.h356.eulerosv2r8.aarch64 + network information: + name: eth0 product: HNS GE/10GE/25GE RDMA Network Controller + name: eth1 product: HNS GE/10GE/25GE Network Controller + name: eth2 product: HNS GE/10GE/25GE RDMA Network Controller + name: eth3 product: HNS GE/10GE/25GE Network Controller + name: eth4 product: HNS GE/10GE/25GE RDMA Network Controller + name: eth5 product: HNS GE/10GE/25GE Network Controller + name: eth6 product: HNS GE/10GE/25GE RDMA Network Controller + name: eth7 product: HNS GE/10GE/25GE Network Controller + name: docker0 product: +``` + +## 3.10 参数自调优 + +A-Tune提供了最佳配置的自动搜索能力,免去人工反复做参数调整、性能评价的调优过程,极大地提升最优配置的搜寻效率。 + +## tuning + +### 功能描述 + +使用指定的项目文件对参数进行动态空间的搜索,找到当前环境配置下的最优解。 + +### 命令格式 + +**atune-adm tuning** \[OPTIONS\] + +>![](public_sys-resources/icon-note.gif) **说明:** +>在运行命令前,需要满足如下条件: +>1. 编辑好服务端yaml配置文件,且需要服务端管理员将该配置文件放到服务端的/etc/atuned/tuning/目录下。 +>2. 编辑好客户端yaml配置文件并放在客户端任一目录。 + +### 参数说明 + +- PROJECT\_YAML +客户端yaml配置文件 + +- OPTIONS + +**表 1** + + + + + + + + + + + + + +

+| 参数 | 描述 |
+| --- | --- |
+| --restore, -r | 恢复tuning优化前的初始配置 |
+| --project, -p | 指定需要恢复的yaml文件中的项目名称 |

+
+ +### 配置说明 + +**服务端yaml文件配置说明** + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+| 配置名称 | 配置说明 | 参数类型 | 取值范围 |
+| --- | --- | --- | --- |
+| project | 项目名称 | 字符串 | - |
+| startworkload | 待调服务的启动脚本 | 字符串 | - |
+| stopworkload | 待调服务的停止脚本 | 字符串 | - |
+| maxiterations | 最大调优迭代次数,用于限制客户端的迭代次数 | 整型 | >=10 |
+| object | 需要调节的参数项及信息 | - | - |

+
+ +**表 1** object项配置说明 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+| 配置名称 | 配置说明 | 参数类型 | 取值范围 |
+| --- | --- | --- | --- |
+| name | 待调参数名称 | 字符串 | - |
+| desc | 待调参数描述 | 字符串 | - |
+| get | 查询参数值的脚本 | - | - |
+| set | 设置参数值的脚本 | - | - |
+| needrestart | 参数生效是否需要重启业务 | 枚举 | "true", "false" |
+| type | 参数的类型,目前支持discrete, continuous两种类型,对应离散型、连续型参数 | 枚举 | "discrete", "continuous" |
+| dtype | type为discrete类型时的参数值类型,目前支持int和string两种 | 枚举 | int, string |
+| scope | 参数设置范围,dtype为int时使用 | 整型 | 用户自定义,取值在该参数的合法范围 |
+| step | 参数值步长,dtype为int时使用 | 整型 | 用户自定义 |
+| items | 参数值在选定范围之外的枚举值,dtype为int时使用 | 整型 | 用户自定义,取值在该参数的合法范围 |
+| options | 参数值的枚举范围,dtype为string时使用 | 字符串 | 用户自定义,取值在该参数的合法范围 |
+| ref | 参数的推荐初始值 | 整型或字符串 | 用户自定义,取值在该参数的合法范围 |

+
+ +**客户端yaml文件配置说明** + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+| 配置名称 | 配置说明 | 参数类型 | 取值范围 |
+| --- | --- | --- | --- |
+| project | 项目名称,需要与服务端对应配置文件中的project匹配 | 字符串 | - |
+| iterations | 调优迭代次数 | 整型 | >=10 |
+| benchmark | 性能测试脚本 | - | - |
+| evaluations | 性能测试评估指标 | - | - |

+
+ +**表 2** evaluations项配置说明 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+| 配置名称 | 配置说明 | 参数类型 | 取值范围 |
+| --- | --- | --- | --- |
+| name | 评价指标名称 | 字符串 | - |
+| get | 获取性能评估结果的脚本 | - | - |
+| type | 评估结果的正负类型,positive代表最小化对应性能值,negative代表最大化对应性能 | 枚举 | "positive","negative" |
+| weight | 该指标的权重百分比,0-100 | 整型 | 0-100 |
+| threshold | 该指标的最低性能要求 | 整型 | 用户指定 |

+
+ +### 配置示例 + +服务端yaml文件配置样例: + +``` +project: "example" +maxiterations: 10 +startworkload: "" +stopworkload: "" +object : + - + name : "vm.swappiness" + info : + desc : "the vm.swappiness" + get : "sysctl -a | grep vm.swappiness" + set : "sysctl -w vm.swappiness=$value" + needrestart: "false" + type : "continuous" + scope : + - 0 + - 10 + ref : 1 + - + name : "irqbalance" + info : + desc : "system irqbalance" + get : "systemctl status irqbalance" + set : "systemctl $value sysmonitor;systemctl $value irqbalance" + needrestart: "false" + type : "discrete" + options: + - "start" + - "stop" + dtype : "string" + ref : "start" + - + name : "net.tcp_min_tso_segs" + info : + desc : "the minimum tso number" + get : "cat /proc/sys/net/ipv4/tcp_min_tso_segs" + set : "echo $value > /proc/sys/net/ipv4/tcp_min_tso_segs" + needrestart: "false" + type : "continuous" + scope: + - 1 + - 16 + ref : 2 + - + name : "prefetcher" + info : + desc : "" + get : "cat /sys/class/misc/prefetch/policy" + set : "echo $value > /sys/class/misc/prefetch/policy" + needrestart: "false" + type : "discrete" + options: + - "0" + - "15" + dtype : "string" + ref : "15" + - + name : "kernel.sched_min_granularity_ns" + info : + desc : "Minimal preemption granularity for CPU-bound tasks" + get : "sysctl kernel.sched_min_granularity_ns" + set : "sysctl -w kernel.sched_min_granularity_ns=$value" + needrestart: "false" + type : "continuous" + scope: + - 5000000 + - 50000000 + ref : 10000000 + - + name : "kernel.sched_latency_ns" + info : + desc : "" + get : "sysctl kernel.sched_latency_ns" + set : "sysctl -w kernel.sched_latency_ns=$value" + needrestart: "false" + type : "continuous" + scope: + - 10000000 + - 100000000 + ref : 16000000 + +``` + +客户端yaml文件配置样例: + +``` +project: "example" +iterations : 10 +benchmark : "sh /home/Benchmarks/mysql/tunning_mysql.sh" +evaluations : + - + name: "tps" + info: + get: "echo -e '$out' |grep 'transactions:' |awk '{print $3}' | cut -c 2-" + type: "negative" + weight: 100 + 
threshold: 100 +``` + +### 使用示例 + +- 进行tuning调优 + +``` +$ atune-adm tuning example-client.yaml +``` + +- 恢复tuning调优前的初始配置 + +``` +$ atune-adm tuning --restore --project example +``` + +## 4 附录 + +### 术语和缩略语 + +**表 1** 术语表 + + + + + + + + + + + + + +

+| 术语 | 含义 |
+| --- | --- |
+| workload_type | 负载类型,用于标记具有相同特征的一类业务 |
+| profile | 优化项集合,最佳的参数配置 |

+
+ diff --git a/Documentation/UserGuide.md b/Documentation/UserGuide.md new file mode 100644 index 0000000000000000000000000000000000000000..1fcac609a606bb0272ec6938c7aecce6458a5cc0 --- /dev/null +++ b/Documentation/UserGuide.md @@ -0,0 +1,1405 @@ +# A-Tune User Guide + +## Legal Statement + +**Copyright © Huawei Technologies Co., Ltd. 2020. All rights reserved.** + +Your replication, use, modification, and distribution of this document are governed by the Creative Commons License Attribution-ShareAlike 4.0 International Public License \(CC BY-SA 4.0\). You can visit [https://creativecommons.org/licenses/by-sa/4.0/](https://creativecommons.org/licenses/by-sa/4.0/) to view a human-readable summary of \(and not a substitute for\) CC BY-SA 4.0. For the complete CC BY-SA 4.0, visit [https://creativecommons.org/licenses/by-sa/4.0/legalcode](https://creativecommons.org/licenses/by-sa/4.0/legalcode). + +**Trademarks and Permissions** + +A-Tune and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders. + +**Disclaimer** + +This document is used only as a guide. Unless otherwise specified by applicable laws or agreed by both parties in written form, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, including but not limited to non-infringement, timeliness, and specific purposes. + +## Preface + +## Overview + +This document describes how to install and use A-Tune, which is a performance self-optimization software for openEuler. + +## Intended Audience + +This document is intended for developers, open-source enthusiasts, and partners who use the openEuler system and want to know and use A-Tune. You need to have basic knowledge of the Linux OS. 
+ +## 1 Getting to Know A-Tune + + +## 1.1 Introduction + +An operating system \(OS\) is the basic software that connects applications and hardware. Users need to adjust OS and application configurations so that software and hardware capabilities are fully used and services run at optimal performance. However, numerous workload types and widely varied applications run on the OS, each with different resource requirements. An application environment composed of hardware and basic software currently involves more than 7000 configuration objects. As service complexity and the number of optimization objects grow, the time cost of optimization increases exponentially and optimization efficiency drops sharply, which makes tuning a complex engineering task and a major challenge for users. + +Second, as infrastructure software, the OS provides a large number of software and hardware management capabilities. Each capability suits different scenarios, and none benefits every scenario. Different capabilities therefore need to be enabled, disabled, and combined in different scenarios to obtain the best application performance. + +In addition, there are thousands of real service scenarios, and hardware configurations for computing, network, and storage keep multiplying. A lab cannot enumerate every application, service scenario, and hardware combination. + +To address these challenges, openEuler provides A-Tune. + +A-Tune is AI-based software that optimizes system performance. It uses AI technologies to build precise system profiles for service scenarios, perceive and infer service characteristics, make intelligent decisions, and match and recommend the optimal combination of system parameter settings so that services run in their optimal state. 
+ +![](figures/en-us_image_0215192422.png) + +## 1.2 Architecture + +The following figure shows the A-Tune core technical architecture, which consists of three layers: intelligent decision-making, system profile, and interaction system. + +- Intelligent decision-making layer: consists of the awareness and decision-making subsystems, which implement intelligent awareness of applications and system optimization decision-making, respectively. +- System profile layer: consists of the labeling and learning subsystems. The labeling subsystem is used to cluster service models, and the learning subsystem is used to learn and classify service models. +- Interaction system layer: monitors and configures various system resources and executes optimization policies. + +![](figures/en-us_image_0215591510.png) + +## 1.3 Supported Features and Service Models + +### Supported Features + +[Table 1](#table1919220557576) describes the main features supported by A-Tune, their maturity, and usage suggestions. + +**Table 1** Feature maturity + + + + + + + + + + + + + + + + + + + + +

+| Feature | Maturity | Usage Suggestion |
+| --- | --- | --- |
+| Auto optimization of 11 applications in seven workload types | Tested | Pilot |
+| User-defined workload types and service models | Tested | Pilot |
+| Automatic parameter optimization | Tested | Pilot |

+
+ +### Supported Service Models + +Based on the workload characteristics of applications, A-Tune classifies services into seven types. For details about the workload characteristics of each type and the applications supported by A-Tune, see [Table 2](#table2819164611311). + +**Table 2** Supported workload types and applications + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+| Workload Type | Description | Workload Characteristic | Supported Application |
+| --- | --- | --- | --- |
+| default | Default type | The usage of CPU, memory bandwidth, network, and I/O resources is low. | N/A |
+| webserver | HTTPS application | The CPU usage is high. | Nginx |
+| big_database | Database | Relational database: Read: The usage of CPU, memory bandwidth, and network is high. Write: The usage of I/O is high.<br>Non-relational database: The usage of CPU and I/O is high. | MongoDB, MySQL, PostgreSQL, and MariaDB |
+| big_data | Big data | The usage of CPU and I/O is high. | Hadoop and Spark |
+| in-memory_computing | Memory-intensive application | The usage of CPU and memory bandwidth is high. | SPECjbb2015 |
+| in-memory_database | Computing- and network-intensive application | The usage of a single-core CPU is high, and the network usage is high in multi-instance scenarios. | Redis |
+| single_computer_intensive_jobs | Computing-intensive application | The usage of a single-core CPU is high, and the usage of memory bandwidth of some subitems is high. | SPECCPU2006 |
+| communication | Network-intensive application | The usage of CPU and network is high. | Dubbo |
+| idle | System in idle state | The system is in idle state and no applications are running. | N/A |

+
+ +## 2 Installation and Deployment + +This chapter describes how to install and deploy A-Tune. + + +## 2.1 Software and Hardware Requirements + +### Hardware Requirement + +- Huawei Kunpeng 920 processor + +### Software Requirement + +- OS: openEuler 1.0 + +## 2.2 Environment Preparation + +Install the openEuler OS. For details, see _openEuler 1.0 Installation Guide_. + +## 2.3 A-Tune Installation + +This section describes the installation modes and the installation procedure of A-Tune. + + +### 2.3.1 Installation Modes +A-Tune can be installed in single-node or distributed mode. + +- Single-node mode + + The client and server are installed on the same system. + +- Distributed mode + + The client and server are installed on different systems. + + +![](figures/en-us_image_0214540005.png) + +### 2.3.2 Installation Procedure + +To install A-Tune, perform the following steps: + +1. Mount the openEuler ISO file. + + ``` + # mount openEuler-1.0-aarch64-dvd.iso /mnt + ``` + +2. Configure the local yum repository. + + ``` + # vim /etc/yum.repos.d/local.repo + ``` + + Add the following content to the file: + + ``` + [local] + name=local + baseurl=file:///mnt + gpgcheck=0 + enabled=1 + ``` + +3. Install the A-Tune server. + + >![](public_sys-resources/icon-note.gif) **NOTE:** + >This step installs both the server and client software packages. For single-node deployment, skip **Step 4**. + + ``` + # yum install atune -y + ``` + +4. Install the A-Tune client. + + ``` + # yum install atune-client -y + ``` + +5. Check whether the installation is successful. + + ``` + # rpm -qa | grep atune + atune-client-xxx + atune-db-xxx + atune-xxx + ``` + + If the preceding information is displayed, the installation is successful. + + +## 2.4 A-Tune Deployment + +This section describes how to configure and deploy A-Tune. 
+ + +### 2.4.1 Overview + +The configuration items in the A-Tune configuration file **/etc/atuned/atuned.cnf** are described as follows: + +- A-Tune service startup configuration + + You can modify the parameter values as required. + + - **address**: Listening IP address of the gRPC server. The default value is **127.0.0.1**. Modify the value for distributed deployment. + - **port**: Listening port of the gRPC server. The value ranges from 0 to 65535 and must be a port that is not in use. + - **rest\_port**: Listening port of the system REST service. The value ranges from 0 to 65535 and must be a port that is not in use. + - **sample\_num**: Number of samples collected when the system executes the analysis process. + +- System information + + The system section holds the parameters that A-Tune needs when it performs optimization. You must modify them to match the actual system. + + - **disk**: Disk whose information is collected during the analysis process, or the disk to be used for disk-related optimization. + - **network**: NIC whose information is collected during the analysis process, or the NIC to be used for NIC-related optimization. + - **user**: User name used for ulimit optimization. Currently, only the user **root** is supported. + - **tls**: SSL/TLS certificate verification for the gRPC and HTTP services of A-Tune. This is disabled by default. After TLS is enabled, you need to set the following environment variables before running the **atune-adm** command so that it can communicate with the server: + - export ATUNE\_TLS=yes + - export ATUNE\_CLICERT=<client certificate path\> + + - tlsservercertfile: gRPC server certificate path. + - tlsserverkeyfile: gRPC server key path. + - tlshttpcertfile: HTTP server certificate path. + - tlshttpkeyfile: HTTP server key path. + - tlshttpcacertfile: CA certificate path of the HTTP server. + +- Log information + + Change the log path and level as required. By default, the log information is stored in **/var/log/messages**. 
+ +- Monitor information + + The hardware information that is collected by default when the system is started. + + +### Example + +``` +#################################### server ############################### +# atuned config +[server] +# the address the grpc server binds to, default is 127.0.0.1 +address = 127.0.0.1 + +# the atuned grpc listening port, default is 60001 +# the port can be set between 0 and 65535 and must not be in use +port = 60001 + +# the rest service listening port, default is 8383 +# the port can be set between 0 and 65535 and must not be in use +rest_port = 8383 + +# the number of samples collected when the analysis command is run +# default is 20 +sample_num = 20 + +# Enable gRPC and http server authentication SSL/TLS +# default is false +# tls = true +# tlsservercertfile = /etc/atuned/server.pem +# tlsserverkeyfile = /etc/atuned/server.key +# tlshttpcertfile = /etc/atuned/http/server.pem +# tlshttpkeyfile = /etc/atuned/http/server.key +# tlshttpcacertfile = /etc/atuned/http/cacert.pem + +#################################### log ############################### +# Either "debug", "info", "warn", "error", "critical", default is "info" +level = info + +#################################### monitor ############################### +[monitor] +# With the module and format of the MPI, the format is {module}_{purpose} +# The module is either "mem", "net", "cpu", "storage" +# The purpose is "topo" +module = mem_topo, cpu_topo + +#################################### system ############################### +# you can add arbitrary key-value here, just like key = value +# you can use the key in the profile +[system] +# the disk to be analyzed +disk = sda + +# the network interface to be analyzed +network = enp189s0f0 + +user = root +``` + +## 2.5 Starting A-Tune + +After A-Tune is installed, you need to start the A-Tune service before using it. + +- Start the atuned service. 
+ + ``` + $ systemctl start atuned + ``` + + +- To query the status of the atuned service, run the following command: + + ``` + $ systemctl status atuned + ``` + + If the following information is displayed, the service is started successfully: + + ![](figures/en-us_image_0214540398.png) + + +## 3 Application Scenarios + +You can invoke the command line interface \(CLI\) provided by A-Tune to use A-Tune functions. This chapter describes the functions and usage of the A-Tune CLI. + + +## 3.1 Querying Workload Types + + +## list + +### Function + +Query the supported workload types, profiles, and the values of Active. + +### Format + +**atune-adm list** + +### Example + +``` +$ atune-adm list + +Support WorkloadTypes: ++-----------------------------------+------------------------+-----------+ +| WorkloadType | ProfileName | Active | ++===================================+========================+===========+ +| default | default | true | ++-----------------------------------+------------------------+-----------+ +| webserver | ssl_webserver | false | ++-----------------------------------+------------------------+-----------+ +| big_database | database | false | ++-----------------------------------+------------------------+-----------+ +| big_data | big_data | false | ++-----------------------------------+------------------------+-----------+ +| in-memory_computing | in-memory_computing | false | ++-----------------------------------+------------------------+-----------+ +| in-memory_database | in-memory_database | false | ++-----------------------------------+------------------------+-----------+ +| single_computer_intensive_jobs | compute-intensive | false | ++-----------------------------------+------------------------+-----------+ +| communication | rpc_communication | false | ++-----------------------------------+------------------------+-----------+ +| idle | default | false | ++-----------------------------------+------------------------+-----------+ + +``` + 
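The table printed by **atune-adm list** can also be read by a script, for example to find out which profile is currently active. The following Python snippet is a minimal sketch, not part of A-Tune: the function name `parse_list_output` and the shortened `SAMPLE` text are assumptions made for illustration, and the parsing relies only on the table layout shown in the preceding output.

```python
def parse_list_output(text):
    """Parse the table printed by `atune-adm list` into a list of dicts.

    Hypothetical helper: it assumes the three-column layout shown in this guide.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Data rows start with "|"; border rows start with "+" and are skipped.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header row and anything that does not have three columns.
        if len(cells) != 3 or cells[0] == "WorkloadType":
            continue
        workload_type, profile_name, active = cells
        rows.append({
            "workload_type": workload_type,
            "profile_name": profile_name,
            "active": active == "true",
        })
    return rows


# Shortened copy of the sample output above (assumption: same column layout).
SAMPLE = """
Support WorkloadTypes:
+----------------+-----------------+--------+
| WorkloadType   | ProfileName     | Active |
+================+=================+========+
| default        | default         | true   |
+----------------+-----------------+--------+
| webserver      | ssl_webserver   | false  |
+----------------+-----------------+--------+
| idle           | default         | false  |
+----------------+-----------------+--------+
"""

for row in parse_list_output(SAMPLE):
    if row["active"]:
        print(row["workload_type"], "->", row["profile_name"])
```

In practice the text would come from running the command itself, for example by capturing its standard output, instead of from a hard-coded sample.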
+>![](public_sys-resources/icon-note.gif) **NOTE:**
+>If the value of Active is **true**, the profile is activated. In the preceding **atune-adm list** output, the profile of the default workload type is activated.
+
+## 3.2 User-defined Workload Types
+
+In addition to the workload types defined in the system, A-Tune also supports user-defined workload types and corresponding profiles, and you can update or delete these workload types.
+
+You can also add the user-defined model to A-Tune. For details about how to train a model, see [User-defined Model](user-defined-model.md).
+
+
+## define
+
+### Function
+
+Add a user-defined workload type and the corresponding profile optimization item.
+
+### Format
+
+**atune-adm define** <WORKLOAD\_TYPE\> <PROFILE\_NAME\> <PROFILE\_PATH\>
+
+### Example
+
+Add a workload type. Set the workload type to **test\_type**, the profile name to **test\_name**, and the configuration file of the optimization items to **example.conf**.
+
+```
+$ atune-adm define test_type test_name ./example.conf
+```
+
+The **example.conf** file can be written as follows \(the following optimization items are optional and are for reference only\). You can also run the **atune-adm info** command to view how an existing profile is written.
+
+```
+[main]
+# list its parent profile
+[tip]
+# the recommended optimization, which should be performed manually
+[check]
+# check the environment
+[affinity.irq]
+# to change the affinity of irqs
+[affinity.task]
+# to change the affinity of tasks
+[bios]
+# to change the bios config
+[bootloader.grub2]
+# to change the grub2 config
+[kernel_config]
+# to change the kernel config
+[script]
+# the script extension of cpi
+[sysctl]
+# to change the /proc/sys/* config
+[sysfs]
+# to change the /sys/* config
+[systemctl]
+# to change the system service config
+[ulimit]
+# to change the resource limits of the user
+```
+
+## update
+
+### Function
+
+Update an optimization item of a workload type to the content in the **new.conf** file.
+
+### Format
+
+**atune-adm update** <WORKLOAD\_TYPE\> <PROFILE\_NAME\> <PROFILE\_FILE\>
+
+### Example
+
+Update the optimization items of the profile **test\_name** of the workload type **test\_type** to the content of **new.conf**.
+
+```
+$ atune-adm update test_type test_name ./new.conf
+```
+
+## undefine
+
+### Function
+
+Delete a user-defined workload type.
+
+### Format
+
+**atune-adm undefine** <WORKLOAD\_TYPE\>
+
+### Example
+
+Delete the **test\_type** workload type.
+
+```
+$ atune-adm undefine test_type
+```
+
+## 3.3 User-defined Model
+
+You can train a new workload type model by running the **collection** and **train** commands.
+
+
+## collection
+
+### Function
+
+Collect the global resource usage and OS status information during service running, and save the collected information to a CSV output file as the input dataset for model training.
+
+>![](public_sys-resources/icon-note.gif) **NOTE:**
+>This command depends on sampling tools such as perf, mpstat, vmstat, iostat, and sar. Currently, only the Kunpeng 920 CPU is supported. You can run the **dmidecode -t processor** command to check the CPU model.
+
+### Format
+
+**atune-adm collection** <OPTIONS\>
+
+### Parameter Description
+
+- OPTIONS
+
+ <table><thead align="left"><tr id="row1238151113583"><th class="cellrowborder" valign="top" width="23.87%" id="mcps1.1.3.1.1">

Parameter

+

Description

+

--filename, -f

+

Name of the generated CSV file used for training: name-timestamp.csv

+

--output_path, -o

+

Path for storing the generated CSV file. The absolute path is required.

+

--disk, -b

+

Disk used during service running, for example, /dev/sda.

+

--network, -n

+

Network interface (NIC) used during service running, for example, eth0.

+

--workload_type, -t

+

Workload type, which is used as a label for subsequent training.

+

--duration, -d

+

Data collection time during service running, in seconds. The default collection time is 1200 seconds.

+

--interval, -i

+

Interval for collecting data, in seconds. The default interval is 5 seconds.

+
+ + +### Example + +``` +$ atune-adm collection --filename name --interval 5 --duration 1200 --output_path /home –-disk sda --network eth0 --workload_type test_type +``` + +## train + +### Function + +Use the collected data to train the model. Collect data of at least two workload types during training. Otherwise, an error is reported. + +### Format + +**atune-adm train** + +### Parameter Description + +- OPTIONS + + + + + + + + + + + + + +

Parameter

+

Description

+

--data_path, -d

+

Path for storing CSV files required for model training.

+

--output_file, -o

+

Output file of the model generated through training.

+
+ + +### Example + +Use the CSV file in the **data** directory as the training input. The generated model **new-model.m** is stored in the **model** directory. + +``` +$ atune-adm train --data_path ./data –output_file ./model/new-model.m +``` + +## 3.4 Workload Type Analysis and Auto Optimization + + +## analysis + +### Function + +Collect real-time statistics from the system to identify and automatically optimize workload types. + +### Format + +**atune-adm analysis** \[OPTIONS\] + +### Parameter Description + + + + + + + + + + +

Parameter

+

Description

+

--model, -m

+

Model generated by user-defined training

+
+ +### Example + +- Use the default model for classification and identification. + + ``` + $ atune-adm analysis + ``` + +- Use the user-defined training model for recognition. + + ``` + $ atune-adm analysis --model ./model/new-model.m + ``` + + +## 3.5 Querying Profiles + + +## info + +### Function + +View the profile content of a workload type. + +### Format + +**atune-adm info** <_WORKLOAD\_TYPE\>_ + +### Example + +View the profile content of webserver. + +``` +$ atune-adm info webserver + +*** ssl_webserver: + +# +# webserver tuned configuration +# +[main] +#TODO CONFIG + +[kernel_config] +#TODO CONFIG + +[bios] +#TODO CONFIG + +[sysfs] +#TODO CONFIG + +[sysctl] +fs.file-max=6553600 +fs.suid_dumpable = 1 +fs.aio-max-nr = 1048576 +kernel.shmmax = 68719476736 +kernel.shmall = 4294967296 +kernel.shmmni = 4096 +kernel.sem = 250 32000 100 128 +net.ipv4.tcp_tw_reuse = 1 +net.ipv4.tcp_syncookies = 1 +net.ipv4.ip_local_port_range = 1024 65500 +net.ipv4.tcp_max_tw_buckets = 5000 +net.core.somaxconn = 65535 +net.core.netdev_max_backlog = 262144 +net.ipv4.tcp_max_orphans = 262144 +net.ipv4.tcp_max_syn_backlog = 262144 +net.ipv4.tcp_timestamps = 0 +net.ipv4.tcp_synack_retries = 1 +net.ipv4.tcp_syn_retries = 1 +net.ipv4.tcp_fin_timeout = 1 +net.ipv4.tcp_keepalive_time = 60 +net.ipv4.tcp_mem = 362619 483495 725238 +net.ipv4.tcp_rmem = 4096 87380 6291456 +net.ipv4.tcp_wmem = 4096 16384 4194304 +net.core.wmem_default = 8388608 +net.core.rmem_default = 8388608 +net.core.rmem_max = 16777216 +net.core.wmem_max = 16777216 + +[systemctl] +sysmonitor=stop +irqbalance=stop + +[bootloader.grub2] +selinux=0 +iommu.passthrough=1 + +[tip] +bind your master process to the CPU near the network = affinity +bind your network interrupt to the CPU that has this network = affinity +relogin into the system to enable limits setting = OS + +[script] +openssl_hpre = 0 +prefetch = off + +[ulimit] +{user}.hard.nofile = 102400 +{user}.soft.nofile = 102400 + +[affinity.task] +#TODO CONFIG + 
+[affinity.irq] +#TODO CONFIG + +[check] +#TODO CONFIG + +``` + +## 3.6 Setting Profiles + + +## profile + +### Function + +Manually activate a profile of a workload type. + +### Format + +**atune-adm profile** + +### Parameter Description + +You can run the **list** command to query the supported workload types. + +### Example + +Activate the profile configuration of webserver. + +``` +$ atune-adm profile webserver +``` + +## 3.7 Rolling Back Profiles + + +## rollback + +### Function + +Roll back the current configuration to the initial configuration of the system. + +### Format + +**atune-adm rollback** + +### Example + +``` +$ atune-adm rollback +``` + +## 3.8 Updating Database + + +## upgrade + +### Function + +Update the system database. + +### Format + +**atune-adm upgrade** + +### Example + +The database is updated to **new\_sqlite.db**. + +``` +$ atune-adm upgrade ./new_sqlite.db +``` + +## 3.9 Querying System Information + + +## check + +### Function + +Check the CPU, BIOS, OS, and NIC information. 
+ +### Format + +**atune-adm check** + +### Example + +``` +$ atune-adm check + cpu information: + cpu:0 version: Kunpeng 920-6426 speed: 2600000000 HZ cores: 64 + cpu:1 version: Kunpeng 920-6426 speed: 2600000000 HZ cores: 64 + system information: + DMIBIOSVersion: 0.59 + OSRelease: 4.19.36-vhulk1906.3.0.h356.eulerosv2r8.aarch64 + network information: + name: eth0 product: HNS GE/10GE/25GE RDMA Network Controller + name: eth1 product: HNS GE/10GE/25GE Network Controller + name: eth2 product: HNS GE/10GE/25GE RDMA Network Controller + name: eth3 product: HNS GE/10GE/25GE Network Controller + name: eth4 product: HNS GE/10GE/25GE RDMA Network Controller + name: eth5 product: HNS GE/10GE/25GE Network Controller + name: eth6 product: HNS GE/10GE/25GE RDMA Network Controller + name: eth7 product: HNS GE/10GE/25GE Network Controller + name: docker0 product: +``` + +## 3.10 Automatic Parameter Optimization + +A-Tune provides the automatic search capability for optimal configurations, eliminating the need for repeated manual parameter adjustment and performance evaluation. This greatly improves the search efficiency of optimal configurations. + + +## tuning + +### Function + +Use the specified project file to search the dynamic space for parameters to find the optimal solution under the current environment configuration. + +### Format + +**atune-adm tuning** \[OPTIONS\] + +>![](public_sys-resources/icon-note.gif) **NOTE:** +>Before running the command, ensure that the following conditions are met: +>1. The YAML configuration file of the server has been edited and placed in the **/etc/atuned/tuning/** directory on the server by the server administrator. +>2. The YAML configuration file of the client has been edited and placed in any directory on the client. + +### Parameter Description + +- PROJECT\_YAML + +YAML configuration file of the client. + +- OPTIONS + + + + + + + + + + + + + + +

Parameter

+

Description

+

--restore, -r

+

Restore the initial configuration that was in place before tuning.

+

--project value, -p value

+

Project name defined in the YAML file.

+
+ +### Configuration Description + +The configuration items of a YAML file on a server are as follows: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Name

+

Description

+

Type

+

Value Range

+

project

+

Project name.

+

Character string

+

-

+

startworkload

+

Startup script of the service to be optimized.

+

Character string

+

-

+

stopworkload

+

Script for stopping the service to be optimized.

+

Character string

+

-

+

maxiterations

+

Maximum number of optimization iterations, which is used to limit the number of iterations on the client.

+

Integer

+

≥ 10

+

object

+

Parameters to be optimized and related information.

+

-

+

-

+
+ +**Table 1** Description of object configuration item + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Name

+

Description

+

Type

+

Value Range

+

name

+

Parameter to be optimized.

+

Character string

+

-

+

desc

+

Description of the parameter to be optimized.

+

Character string

+

-

+

get

+

Script for querying parameter values.

+

-

+

-

+

set

+

Script for setting parameter values.

+

-

+

-

+

needrestart

+

Specifies whether to restart the service for the parameter to take effect.

+

Enumeration

+

"true", "false"

+

type

+

Parameter type. Currently, the discrete and continuous types are supported.

+

Enumeration

+

"discrete", "continuous"

+

dtype

+

Parameter value type when type is set to discrete. Currently, int and string are supported.

+

Enumeration

+

int, string

+

scope

+

Parameter value range, which is used when dtype is set to int.

+

Integer

+

The value is user-defined and must be within the valid range of this parameter.

+

step

+

Parameter value step, which is used when dtype is set to int.

+

Integer

+

This value is user-defined.

+

items

+

Enumerated values of the parameter that fall outside the range defined by scope. This is used when dtype is set to int.

+

Integer

+

The value is user-defined and must be within the valid range of this parameter.

+

options

+

Enumerated values that the parameter can take. This is used when dtype is set to string.

+

Character string

+

The value is user-defined and must be within the valid range of this parameter.

+

ref

+

Recommended initial value of the parameter.

+

Integer or character string

+

The value is user-defined and must be within the valid range of this parameter.

+
+ +The configuration items of a YAML file on a client are as follows: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Name

+

Description

+

Type

+

Value Range

+

project

+

Project name, which must be the same as that in the configuration file on the server.

+

Character string

+

-

+

iterations

+

Number of optimization iterations.

+

Integer

+

≥ 10

+

benchmark

+

Performance test script.

+

-

+

-

+

evaluations

+

Performance test evaluation index.

+

-

+

-

+
+ +**Table 2** Description of evaluations configuration item + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Name

+

Description

+

Type

+

Value Range

+

name

+

Evaluation index name.

+

Character string

+

-

+

get

+

Script for obtaining performance evaluation results.

+

-

+

-

+

type

+

Specifies whether the evaluation result is of the positive or negative type. The value positive indicates that the performance value is to be minimized, and the value negative indicates that the performance value is to be maximized (for example, a throughput index such as tps uses negative).

+

Enumeration

+

"positive","negative"

+

weight

+

Weight of the index. The value ranges from 0 to 100.

+

Integer

+

0-100

+

threshold

+

Minimum performance requirement of the index.

+

Integer

+

User-specified

+
+ +### Configuration Example + +The following is an example of the YAML file configuration on a server: + +``` +project: "example" +maxiterations: 10 +startworkload: "" +stopworkload: "" +object : + - + name : "vm.swappiness" + info : + desc : "the vm.swappiness" + get : "sysctl -a | grep vm.swappiness" + set : "sysctl -w vm.swappiness=$value" + needrestart: "false" + type : "continuous" + scope : + - 0 + - 10 + ref : 1 + - + name : "irqbalance" + info : + desc : "system irqbalance" + get : "systemctl status irqbalance" + set : "systemctl $value sysmonitor;systemctl $value irqbalance" + needrestart: "false" + type : "discrete" + options: + - "start" + - "stop" + dtype : "string" + ref : "start" + - + name : "net.tcp_min_tso_segs" + info : + desc : "the minimum tso number" + get : "cat /proc/sys/net/ipv4/tcp_min_tso_segs" + set : "echo $value > /proc/sys/net/ipv4/tcp_min_tso_segs" + needrestart: "false" + type : "continuous" + scope: + - 1 + - 16 + ref : 2 + - + name : "prefetcher" + info : + desc : "" + get : "cat /sys/class/misc/prefetch/policy" + set : "echo $value > /sys/class/misc/prefetch/policy" + needrestart: "false" + type : "discrete" + options: + - "0" + - "15" + dtype : "string" + ref : "15" + - + name : "kernel.sched_min_granularity_ns" + info : + desc : "Minimal preemption granularity for CPU-bound tasks" + get : "sysctl kernel.sched_min_granularity_ns" + set : "sysctl -w kernel.sched_min_granularity_ns=$value" + needrestart: "false" + type : "continuous" + scope: + - 5000000 + - 50000000 + ref : 10000000 + - + name : "kernel.sched_latency_ns" + info : + desc : "" + get : "sysctl kernel.sched_latency_ns" + set : "sysctl -w kernel.sched_latency_ns=$value" + needrestart: "false" + type : "continuous" + scope: + - 10000000 + - 100000000 + ref : 16000000 +``` + +The following is an example of the YAML file configuration on a client: + +``` +project: "example" +iterations : 10 +benchmark : "sh /home/Benchmarks/mysql/tunning_mysql.sh" +evaluations : + - + 
+    name: "tps"
+    info:
+        get: "echo -e '$out' |grep 'transactions:' |awk '{print $3}' | cut -c 2-"
+        type: "negative"
+        weight: 100
+        threshold: 100
+```
+
+### Example
+
+- Tuning
+
+```
+$ atune-adm tuning example-client.yaml
+```
+
+- Restore the initial configuration before tuning
+
+```
+$ atune-adm tuning --restore --project example
+```
+
+## 4 Appendixes
+
+
+### Acronyms and Abbreviations
+
+**Table 1** Terminology
+
+<table><thead align="left"><tr id="row1557914376178"><th class="cellrowborder" valign="top" width="50%" id="mcps1.2.3.1.1">

Term

+

Description

+

workload_type

+

Workload type, which is used to identify a type of service with the same characteristics.

+

profile

+

Set of optimization items and optimal parameter configuration.

+
+ diff --git "a/Documentation/UserGuide/A-Tune\347\224\250\346\210\267\346\214\207\345\215\227.md" "b/Documentation/UserGuide/A-Tune\347\224\250\346\210\267\346\214\207\345\215\227.md" index 3e0ecbce20a72e6b921be36aaa192b072fbb46d5..710266c4b48fdd4c90878df9a665f4b8e12a244a 100644 --- "a/Documentation/UserGuide/A-Tune\347\224\250\346\210\267\346\214\207\345\215\227.md" +++ "b/Documentation/UserGuide/A-Tune\347\224\250\346\210\267\346\214\207\345\215\227.md" @@ -810,7 +810,7 @@ prefetch = off ### 命令格式 -**atune-adm profile **_<_WORKLOAD\_TYPE_\>_ +**atune-adm profile** ### 参数说明 diff --git a/Documentation/UserGuide/UserGuide.md b/Documentation/UserGuide/UserGuide.md index 99664dbd608c579b6232b02ea02d1f9d868ff44a..1fcac609a606bb0272ec6938c7aecce6458a5cc0 100644 --- a/Documentation/UserGuide/UserGuide.md +++ b/Documentation/UserGuide/UserGuide.md @@ -822,7 +822,7 @@ Manually activate a profile of a workload type. ### Format -**atune-adm profile **_<_WORKLOAD\_TYPE_\>_ +**atune-adm profile** ### Parameter Description diff --git a/README-zh.md b/README-zh.md index 8da60291ad036c1141f9f58b7cda10876c7cb3ef..dbb782ec3a7dfcedfbec7858b6998b76281c8fbb 100644 --- a/README-zh.md +++ b/README-zh.md @@ -87,6 +87,7 @@ git clone https://gitee.com/openeuler/A-Tune.git atune #### 4、编译 ```bash cd atune +export GO111MODULE=off make ``` @@ -100,8 +101,9 @@ make install ### 1、管理atuned服务 -#### 启动atuned服务 +#### 加载并启动atuned服务 ```bash +systemctl daemon-reload systemctl start atuned ``` diff --git a/README.md b/README.md index 87e44834231107a0acf47761d0ff9724806584a7..04c20a0d701d3fc6c6babb5eb813876da4a6eb21 100644 --- a/README.md +++ b/README.md @@ -86,6 +86,7 @@ git clone https://gitee.com/openeuler/A-Tune.git atune #### 4. Compile. ```bash cd atune +export GO111MODULE=off make ``` @@ -99,8 +100,9 @@ II. Quick Guide ### 1. Manage the atuned service. -#### Start the atuned service. +#### Load and start the atuned service. ```bash +systemctl daemon-reload systemctl start atuned ```