Commit 8df3cf82, author: barrierye

combine hdfs and afs monitor

Parent f819ccc3
...@@ -15,9 +15,8 @@ Currently, the following types of Monitors are supported: ...@@ -15,9 +15,8 @@ Currently, the following types of Monitors are supported:
| Monitor Type | Description | Specific options | | Monitor Type | Description | Specific options |
| :----------: | :----------------------------------------------------------: | :----------------------------------------------------------: | | :----------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
| general | Without authentication, you can directly access the download file by `wget` (such as FTP and BOS which do not need authentication) | `general_host` General remote host. | | general | Without authentication, you can directly access the download file by `wget` (such as FTP and BOS which do not need authentication) | `general_host` General remote host. |
| hdfs | The remote is HDFS, and relevant commands are executed through HDFS binary | `hdfs_bin` Path of HDFS binary file. | | hdfs | The remote is HDFS or AFS, and relevant commands are executed through Hadoop-client | `hadoop_bin` Path of Hadoop binary file.<br/>`fs_name` Hadoop fs_name. Not used if set in Hadoop-client.<br/>`fs_ugi` Hadoop fs_ugi, Not used if set in Hadoop-client. |
| ftp | The remote is FTP, and relevant commands are executed through `ftplib`(Using this monitor, you need to install `ftplib` with command `pip install ftplib`) | `ftp_host` FTP remote host.<br>`ftp_port` FTP remote port.<br>`ftp_username` FTP username. Not used if anonymous access.<br>`ftp_password` FTP password. Not used if anonymous access. | | ftp | The remote is FTP, and relevant commands are executed through `ftplib`(Using this monitor, you need to install `ftplib` with command `pip install ftplib`) | `ftp_host` FTP remote host.<br>`ftp_port` FTP remote port.<br>`ftp_username` FTP username. Not used if anonymous access.<br>`ftp_password` FTP password. Not used if anonymous access. |
| afs | The remote is AFS, and relevant commands are executed through Hadoop-client | `hadoop_bin` Path of Hadoop binary file.<br>`hadoop_host` AFS host. Not used if set in Hadoop-client.<br>`hadoop_ugi` AFS ugi, Not used if set in Hadoop-client. |
| Monitor Shared options | Description | Default | | Monitor Shared options | Description | Default |
| :--------------------: | :----------------------------------------------------------: | :----------------------------------: | | :--------------------: | :----------------------------------------------------------: | :----------------------------------: |
...@@ -83,8 +82,8 @@ exe = fluid.Executor(place) ...@@ -83,8 +82,8 @@ exe = fluid.Executor(place)
exe.run(fluid.default_startup_program()) exe.run(fluid.default_startup_program())
def push_to_hdfs(local_file_path, remote_path): def push_to_hdfs(local_file_path, remote_path):
hdfs_bin = '/hadoop-3.1.2/bin/hdfs' hdfs_bin = '/hadoop-3.1.2/bin/hadoop'
os.system('{} dfs -put -f {} {}'.format( os.system('{} fs -put -f {} {}'.format(
hdfs_bin, local_file_path, remote_path)) hdfs_bin, local_file_path, remote_path))
name = "uci_housing" name = "uci_housing"
...@@ -120,7 +119,7 @@ for pass_id in range(30): ...@@ -120,7 +119,7 @@ for pass_id in range(30):
The files on HDFS are as follows: The files on HDFS are as follows:
```bash ```bash
# hdfs dfs -ls / # hadoop fs -ls /
Found 2 items Found 2 items
-rw-r--r-- 1 root supergroup 0 2020-04-02 02:54 /donefile -rw-r--r-- 1 root supergroup 0 2020-04-02 02:54 /donefile
-rw-r--r-- 1 root supergroup 2101 2020-04-02 02:54 /uci_housing.tar.gz -rw-r--r-- 1 root supergroup 2101 2020-04-02 02:54 /uci_housing.tar.gz
...@@ -151,11 +150,11 @@ Use the following command to execute the HDFSMonitor: ...@@ -151,11 +150,11 @@ Use the following command to execute the HDFSMonitor:
```shell ```shell
python -m paddle_serving_server.monitor \ python -m paddle_serving_server.monitor \
--type='hdfs' --hdfs_bin='/hadoop-3.1.2/bin/hdfs' --remote_path='/' \ --type='hdfs' --hadoop_bin='/hadoop-3.1.2/bin/hadoop' \
--remote_model_name='uci_housing.tar.gz' --remote_donefile_name='donefile' \ --remote_path='/' --remote_model_name='uci_housing.tar.gz' \
--local_path='.' --local_model_name='uci_housing_model' \ --remote_donefile_name='donefile' --local_path='.' \
--local_timestamp_file='fluid_time_file' --local_tmp_path='_tmp' \ --local_model_name='uci_housing_model' --local_timestamp_file='fluid_time_file' \
--unpacked_filename='uci_housing_model' --debug --local_tmp_path='_tmp' --unpacked_filename='uci_housing_model' --debug
``` ```
The above code monitors the remote timestamp file `/donefile` of the remote HDFS address `/` every 10 seconds by polling. When the remote timestamp file changes, the remote model is considered to have been updated. Pull the remote packaging model `/uci_housing.tar.gz` to the local temporary path `./_tmp/uci_housing.tar.gz`. After unpacking to get the model file `./_tmp/uci_housing_model`, update the local model `./uci_housing_model` and the model timestamp file `./uci_housing_model/fluid_time_file` of Paddle Serving. The above code monitors the remote timestamp file `/donefile` of the remote HDFS address `/` every 10 seconds by polling. When the remote timestamp file changes, the remote model is considered to have been updated. Pull the remote packaging model `/uci_housing.tar.gz` to the local temporary path `./_tmp/uci_housing.tar.gz`. After unpacking to get the model file `./_tmp/uci_housing_model`, update the local model `./uci_housing_model` and the model timestamp file `./uci_housing_model/fluid_time_file` of Paddle Serving.
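The polling behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in `monitor.py`: the names `poll_donefile`, `get_remote_stamp`, `on_update`, and `max_rounds` are hypothetical, and the remote side is simulated with an in-memory iterator instead of a real Hadoop call.

```python
import time


def poll_donefile(get_remote_stamp, on_update, interval=10,
                  max_rounds=None, sleep=time.sleep):
    """Call on_update(stamp) whenever the remote donefile stamp changes.

    get_remote_stamp is a hypothetical callable that returns a comparable
    stamp for the donefile, or None if the donefile does not exist yet.
    """
    last_stamp = None
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        stamp = get_remote_stamp()
        # A new, non-empty stamp means the remote model was updated.
        if stamp is not None and stamp != last_stamp:
            on_update(stamp)  # e.g. pull, unpack, and replace the local model
            last_stamp = stamp
        sleep(interval)
        rounds += 1


# Simulated remote: the stamp changes on the third poll.
stamps = iter(["t1", "t1", "t2"])
updates = []
poll_donefile(lambda: next(stamps), updates.append, interval=0, max_rounds=3)
print(updates)  # ['t1', 't2']
```

In a real deployment `on_update` would do the pull, unpack, and copy steps listed in the paragraph above, and `max_rounds` would be left as `None` so the loop runs indefinitely.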
...@@ -163,32 +162,34 @@ The above code monitors the remote timestamp file `/donefile` of the remote HDFS ...@@ -163,32 +162,34 @@ The above code monitors the remote timestamp file `/donefile` of the remote HDFS
The expected output is as follows: The expected output is as follows:
```shell ```shell
2020-04-02 08:38 INFO [monitor.py:85] _hdfs_bin: /hadoop-3.1.2/bin/hdfs 2020-04-02 10:12 INFO [monitor.py:85] _hadoop_bin: /hadoop-3.1.2/bin/hadoop
2020-04-02 08:38 INFO [monitor.py:244] HDFS prefix cmd: /hadoop-3.1.2/bin/hdfs dfs 2020-04-02 10:12 INFO [monitor.py:85] _fs_name:
2020-04-02 08:38 INFO [monitor.py:85] _remote_path: / 2020-04-02 10:12 INFO [monitor.py:85] _fs_ugi:
2020-04-02 08:38 INFO [monitor.py:85] _remote_model_name: uci_housing.tar.gz 2020-04-02 10:12 INFO [monitor.py:209] AFS prefix cmd: /hadoop-3.1.2/bin/hadoop fs
2020-04-02 08:38 INFO [monitor.py:85] _remote_donefile_name: donefile 2020-04-02 10:12 INFO [monitor.py:85] _remote_path: /
2020-04-02 08:38 INFO [monitor.py:85] _local_model_name: uci_housing_model 2020-04-02 10:12 INFO [monitor.py:85] _remote_model_name: uci_housing.tar.gz
2020-04-02 08:38 INFO [monitor.py:85] _local_path: . 2020-04-02 10:12 INFO [monitor.py:85] _remote_donefile_name: donefile
2020-04-02 08:38 INFO [monitor.py:85] _local_timestamp_file: fluid_time_file 2020-04-02 10:12 INFO [monitor.py:85] _local_model_name: uci_housing_model
2020-04-02 08:38 INFO [monitor.py:85] _local_tmp_path: _tmp 2020-04-02 10:12 INFO [monitor.py:85] _local_path: .
2020-04-02 08:38 INFO [monitor.py:85] _interval: 10 2020-04-02 10:12 INFO [monitor.py:85] _local_timestamp_file: fluid_time_file
2020-04-02 08:38 DEBUG [monitor.py:249] check cmd: /hadoop-3.1.2/bin/hdfs dfs -stat "%Y" /donefile 2020-04-02 10:12 INFO [monitor.py:85] _local_tmp_path: _tmp
2020-04-02 08:38 DEBUG [monitor.py:251] resp: 1585816693193 2020-04-02 10:12 INFO [monitor.py:85] _interval: 10
2020-04-02 08:38 INFO [monitor.py:138] doneilfe(donefile) changed. 2020-04-02 10:12 DEBUG [monitor.py:214] check cmd: /hadoop-3.1.2/bin/hadoop fs -ls /donefile 2>/dev/null
2020-04-02 08:38 DEBUG [monitor.py:261] pull cmd: /hadoop-3.1.2/bin/hdfs dfs -get -f /uci_housing.tar.gz _tmp 2020-04-02 10:12 DEBUG [monitor.py:216] resp: -rw-r--r-- 1 root supergroup 0 2020-04-02 10:11 /donefile
2020-04-02 08:38 INFO [monitor.py:144] pull remote model(uci_housing.tar.gz). 2020-04-02 10:12 INFO [monitor.py:138] doneilfe(donefile) changed.
2020-04-02 08:38 INFO [monitor.py:98] unpack remote file(uci_housing.tar.gz). 2020-04-02 10:12 DEBUG [monitor.py:233] pull cmd: /hadoop-3.1.2/bin/hadoop fs -get /uci_housing.tar.gz _tmp/uci_housing.tar.gz 2>/dev/null
2020-04-02 08:38 DEBUG [monitor.py:108] remove packed file(uci_housing.tar.gz). 2020-04-02 10:12 INFO [monitor.py:144] pull remote model(uci_housing.tar.gz).
2020-04-02 08:38 INFO [monitor.py:110] using unpacked filename: uci_housing_model. 2020-04-02 10:12 INFO [monitor.py:98] unpack remote file(uci_housing.tar.gz).
2020-04-02 08:38 DEBUG [monitor.py:175] update model cmd: cp -r _tmp/uci_housing_model/* ./uci_housing_model 2020-04-02 10:12 DEBUG [monitor.py:108] remove packed file(uci_housing.tar.gz).
2020-04-02 08:38 INFO [monitor.py:152] update local model(uci_housing_model). 2020-04-02 10:12 INFO [monitor.py:110] using unpacked filename: uci_housing_model.
2020-04-02 08:38 DEBUG [monitor.py:184] update timestamp cmd: touch ./uci_housing_model/fluid_time_file 2020-04-02 10:12 DEBUG [monitor.py:175] update model cmd: cp -r _tmp/uci_housing_model/* ./uci_housing_model
2020-04-02 08:38 INFO [monitor.py:157] update model timestamp(fluid_time_file). 2020-04-02 10:12 INFO [monitor.py:152] update local model(uci_housing_model).
2020-04-02 08:38 INFO [monitor.py:161] sleep 10s. 2020-04-02 10:12 DEBUG [monitor.py:184] update timestamp cmd: touch ./uci_housing_model/fluid_time_file
2020-04-02 08:38 DEBUG [monitor.py:249] check cmd: /hadoop-3.1.2/bin/hdfs dfs -stat "%Y" /donefile 2020-04-02 10:12 INFO [monitor.py:157] update model timestamp(fluid_time_file).
2020-04-02 08:38 DEBUG [monitor.py:251] resp: 1585816693193 2020-04-02 10:12 INFO [monitor.py:161] sleep 10s.
2020-04-02 08:38 INFO [monitor.py:161] sleep 10s. 2020-04-02 10:12 DEBUG [monitor.py:214] check cmd: /hadoop-3.1.2/bin/hadoop fs -ls /donefile 2>/dev/null
2020-04-02 10:12 DEBUG [monitor.py:216] resp: -rw-r--r-- 1 root supergroup 0 2020-04-02 10:11 /donefile
2020-04-02 10:12 INFO [monitor.py:161] sleep 10s.
``` ```
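In the new log the check command is `hadoop fs -ls`, so the response is a whole listing line rather than a numeric timestamp. The diff does not show how the monitor interprets that response. As an assumption-laden sketch, the modification time could be extracted like this (`parse_ls_line` is a hypothetical helper, not part of `monitor.py`):

```python
def parse_ls_line(line):
    """Parse one entry line of `hadoop fs -ls` output.

    Expected layout (taken from the log above):
    permissions, replication, owner, group, size, date, time, path.
    Returns None for lines that do not match, such as "Found 2 items".
    """
    fields = line.split()
    if len(fields) < 8:
        return None
    return {
        "size": int(fields[4]),
        "mtime": "{} {}".format(fields[5], fields[6]),
        # Re-join the tail in case the path itself contains spaces.
        "path": " ".join(fields[7:]),
    }


resp = "-rw-r--r-- 1 root supergroup 0 2020-04-02 10:11 /donefile"
info = parse_ls_line(resp)
print(info["mtime"], info["path"])  # 2020-04-02 10:11 /donefile
```

Note that the listing only has minute resolution, so comparing `mtime` values would miss two updates made within the same minute.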
......
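...@@ -15,9 +15,8 @@ Paddle Serving provides an automatic monitoring script; after the model at the remote address is updated, it will ...@@ -15,9 +15,8 @@ Paddle Serving provides an automatic monitoring script; after the model at the remote address is updated, it will
| Monitor type | Description | Specific options | | Monitor type | Description | Specific options |
| :---------: | :----------------------------------------------------------: | :----------------------------------------------------------: | | :---------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
| general | The remote requires no authentication, and files can be downloaded directly with `wget` (for example, FTP or BOS without authentication) | `general_host` General remote host | | general | The remote requires no authentication, and files can be downloaded directly with `wget` (for example, FTP or BOS without authentication) | `general_host` General remote host |
| hdfs | The remote is HDFS, and relevant commands are executed through the HDFS binary | `hdfs_bin` Path of the HDFS binary | | hdfs/afs | The remote is HDFS or AFS, and relevant commands are executed through Hadoop-Client | `hadoop_bin` Path of the Hadoop binary<br/>`fs_name` Hadoop fs_name, empty by default<br/>`fs_ugi` Hadoop fs_ugi, empty by default |
| ftp | The remote is FTP, accessed through `ftplib` (to use this Monitor, you may need to run `pip install ftplib` to install `ftplib`) | `ftp_host` FTP host<br>`ftp_port` FTP port<br>`ftp_username` FTP username, empty by default<br>`ftp_password` FTP password, empty by default | | ftp | The remote is FTP, accessed through `ftplib` (to use this Monitor, you may need to run `pip install ftplib` to install `ftplib`) | `ftp_host` FTP host<br>`ftp_port` FTP port<br>`ftp_username` FTP username, empty by default<br>`ftp_password` FTP password, empty by default |
| afs | The remote is AFS, and relevant commands are executed through Hadoop-client | `hadoop_bin` Path of the Hadoop binary<br>`hadoop_host` AFS host, empty by default<br>`hadoop_ugi` AFS ugi, empty by default |
| Monitor shared options | Description | Default | | Monitor shared options | Description | Default |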
| :--------------------: | :----------------------------------------------------------: | :--------------------: | | :--------------------: | :----------------------------------------------------------: | :--------------------: |
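...@@ -33,9 +32,9 @@ Paddle Serving provides an automatic monitoring script; after the model at the remote address is updated, it will ...@@ -33,9 +32,9 @@ Paddle Serving provides an automatic monitoring script; after the model at the remote address is updated, it will
| `unpacked_filename` | Monitor supports remote models packed with tarfile. If the remote model is packed, set this option to tell Monitor the name of the unpacked file. | `None` | | `unpacked_filename` | Monitor supports remote models packed with tarfile. If the remote model is packed, set this option to tell Monitor the name of the unpacked file. | `None` |
| `debug` | If the `--debug` option is added, more detailed intermediate information is printed. | Not added by default | | `debug` | If the `--debug` option is added, more detailed intermediate information is printed. | Not added by default |
The following HDFSMonitor example demonstrates the model hot-loading feature of Paddle Serving. The following HadoopMonitor example demonstrates the model hot-loading feature of Paddle Serving.
## HDFSMonitor example ## HadoopMonitor example
In this example, the model is produced in `product_path` and uploaded to HDFS, and server-side model hot loading is simulated in `server_path`: In this example, the model is produced in `product_path` and uploaded to HDFS, and server-side model hot loading is simulated in `server_path`: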
...@@ -83,8 +82,8 @@ exe = fluid.Executor(place) ...@@ -83,8 +82,8 @@ exe = fluid.Executor(place)
exe.run(fluid.default_startup_program()) exe.run(fluid.default_startup_program())
def push_to_hdfs(local_file_path, remote_path): def push_to_hdfs(local_file_path, remote_path):
hdfs_bin = '/hadoop-3.1.2/bin/hdfs' hdfs_bin = '/hadoop-3.1.2/bin/hadoop'
os.system('{} dfs -put -f {} {}'.format( os.system('{} fs -put -f {} {}'.format(
hdfs_bin, local_file_path, remote_path)) hdfs_bin, local_file_path, remote_path))
name = "uci_housing" name = "uci_housing"
...@@ -120,7 +119,7 @@ for pass_id in range(30): ...@@ -120,7 +119,7 @@ for pass_id in range(30):
The files on HDFS are as follows: The files on HDFS are as follows:
```bash ```bash
# hdfs dfs -ls / # hadoop fs -ls /
Found 2 items Found 2 items
-rw-r--r-- 1 root supergroup 0 2020-04-02 02:54 /donefile -rw-r--r-- 1 root supergroup 0 2020-04-02 02:54 /donefile
-rw-r--r-- 1 root supergroup 2101 2020-04-02 02:54 /uci_housing.tar.gz -rw-r--r-- 1 root supergroup 2101 2020-04-02 02:54 /uci_housing.tar.gz
...@@ -151,11 +150,11 @@ python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --po ...@@ -151,11 +150,11 @@ python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --po
```shell ```shell
python -m paddle_serving_server.monitor \ python -m paddle_serving_server.monitor \
--type='hdfs' --hdfs_bin='/hadoop-3.1.2/bin/hdfs' --remote_path='/' \ --type='hdfs' --hadoop_bin='/hadoop-3.1.2/bin/hadoop' \
--remote_model_name='uci_housing.tar.gz' --remote_donefile_name='donefile' \ --remote_path='/' --remote_model_name='uci_housing.tar.gz' \
--local_path='.' --local_model_name='uci_housing_model' \ --remote_donefile_name='donefile' --local_path='.' \
--local_timestamp_file='fluid_time_file' --local_tmp_path='_tmp' \ --local_model_name='uci_housing_model' --local_timestamp_file='fluid_time_file' \
--unpacked_filename='uci_housing_model' --debug --local_tmp_path='_tmp' --unpacked_filename='uci_housing_model' --debug
``` ```
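The above code monitors, by polling, the timestamp file `/donefile` at the remote HDFS address `/`. When the timestamp changes, the remote model is considered updated: the remote packed model `/uci_housing.tar.gz` is pulled to the local temporary path `./_tmp/uci_housing.tar.gz`, unpacked into the model file `./_tmp/uci_housing_model`, and then the local model `./uci_housing_model` and the Paddle Serving timestamp file `./uci_housing_model/fluid_time_file` are updated. The above code monitors, by polling, the timestamp file `/donefile` at the remote HDFS address `/`. When the timestamp changes, the remote model is considered updated: the remote packed model `/uci_housing.tar.gz` is pulled to the local temporary path `./_tmp/uci_housing.tar.gz`, unpacked into the model file `./_tmp/uci_housing_model`, and then the local model `./uci_housing_model` and the Paddle Serving timestamp file `./uci_housing_model/fluid_time_file` are updated.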
...@@ -163,32 +162,34 @@ python -m paddle_serving_server.monitor \ ...@@ -163,32 +162,34 @@ python -m paddle_serving_server.monitor \
The expected output is as follows: The expected output is as follows:
```shell ```shell
2020-04-02 08:38 INFO [monitor.py:85] _hdfs_bin: /hadoop-3.1.2/bin/hdfs 2020-04-02 10:12 INFO [monitor.py:85] _hadoop_bin: /hadoop-3.1.2/bin/hadoop
2020-04-02 08:38 INFO [monitor.py:244] HDFS prefix cmd: /hadoop-3.1.2/bin/hdfs dfs 2020-04-02 10:12 INFO [monitor.py:85] _fs_name:
2020-04-02 08:38 INFO [monitor.py:85] _remote_path: / 2020-04-02 10:12 INFO [monitor.py:85] _fs_ugi:
2020-04-02 08:38 INFO [monitor.py:85] _remote_model_name: uci_housing.tar.gz 2020-04-02 10:12 INFO [monitor.py:209] AFS prefix cmd: /hadoop-3.1.2/bin/hadoop fs
2020-04-02 08:38 INFO [monitor.py:85] _remote_donefile_name: donefile 2020-04-02 10:12 INFO [monitor.py:85] _remote_path: /
2020-04-02 08:38 INFO [monitor.py:85] _local_model_name: uci_housing_model 2020-04-02 10:12 INFO [monitor.py:85] _remote_model_name: uci_housing.tar.gz
2020-04-02 08:38 INFO [monitor.py:85] _local_path: . 2020-04-02 10:12 INFO [monitor.py:85] _remote_donefile_name: donefile
2020-04-02 08:38 INFO [monitor.py:85] _local_timestamp_file: fluid_time_file 2020-04-02 10:12 INFO [monitor.py:85] _local_model_name: uci_housing_model
2020-04-02 08:38 INFO [monitor.py:85] _local_tmp_path: _tmp 2020-04-02 10:12 INFO [monitor.py:85] _local_path: .
2020-04-02 08:38 INFO [monitor.py:85] _interval: 10 2020-04-02 10:12 INFO [monitor.py:85] _local_timestamp_file: fluid_time_file
2020-04-02 08:38 DEBUG [monitor.py:249] check cmd: /hadoop-3.1.2/bin/hdfs dfs -stat "%Y" /donefile 2020-04-02 10:12 INFO [monitor.py:85] _local_tmp_path: _tmp
2020-04-02 08:38 DEBUG [monitor.py:251] resp: 1585816693193 2020-04-02 10:12 INFO [monitor.py:85] _interval: 10
2020-04-02 08:38 INFO [monitor.py:138] doneilfe(donefile) changed. 2020-04-02 10:12 DEBUG [monitor.py:214] check cmd: /hadoop-3.1.2/bin/hadoop fs -ls /donefile 2>/dev/null
2020-04-02 08:38 DEBUG [monitor.py:261] pull cmd: /hadoop-3.1.2/bin/hdfs dfs -get -f /uci_housing.tar.gz _tmp 2020-04-02 10:12 DEBUG [monitor.py:216] resp: -rw-r--r-- 1 root supergroup 0 2020-04-02 10:11 /donefile
2020-04-02 08:38 INFO [monitor.py:144] pull remote model(uci_housing.tar.gz). 2020-04-02 10:12 INFO [monitor.py:138] doneilfe(donefile) changed.
2020-04-02 08:38 INFO [monitor.py:98] unpack remote file(uci_housing.tar.gz). 2020-04-02 10:12 DEBUG [monitor.py:233] pull cmd: /hadoop-3.1.2/bin/hadoop fs -get /uci_housing.tar.gz _tmp/uci_housing.tar.gz 2>/dev/null
2020-04-02 08:38 DEBUG [monitor.py:108] remove packed file(uci_housing.tar.gz). 2020-04-02 10:12 INFO [monitor.py:144] pull remote model(uci_housing.tar.gz).
2020-04-02 08:38 INFO [monitor.py:110] using unpacked filename: uci_housing_model. 2020-04-02 10:12 INFO [monitor.py:98] unpack remote file(uci_housing.tar.gz).
2020-04-02 08:38 DEBUG [monitor.py:175] update model cmd: cp -r _tmp/uci_housing_model/* ./uci_housing_model 2020-04-02 10:12 DEBUG [monitor.py:108] remove packed file(uci_housing.tar.gz).
2020-04-02 08:38 INFO [monitor.py:152] update local model(uci_housing_model). 2020-04-02 10:12 INFO [monitor.py:110] using unpacked filename: uci_housing_model.
2020-04-02 08:38 DEBUG [monitor.py:184] update timestamp cmd: touch ./uci_housing_model/fluid_time_file 2020-04-02 10:12 DEBUG [monitor.py:175] update model cmd: cp -r _tmp/uci_housing_model/* ./uci_housing_model
2020-04-02 08:38 INFO [monitor.py:157] update model timestamp(fluid_time_file). 2020-04-02 10:12 INFO [monitor.py:152] update local model(uci_housing_model).
2020-04-02 08:38 INFO [monitor.py:161] sleep 10s. 2020-04-02 10:12 DEBUG [monitor.py:184] update timestamp cmd: touch ./uci_housing_model/fluid_time_file
2020-04-02 08:38 DEBUG [monitor.py:249] check cmd: /hadoop-3.1.2/bin/hdfs dfs -stat "%Y" /donefile 2020-04-02 10:12 INFO [monitor.py:157] update model timestamp(fluid_time_file).
2020-04-02 08:38 DEBUG [monitor.py:251] resp: 1585816693193 2020-04-02 10:12 INFO [monitor.py:161] sleep 10s.
2020-04-02 08:38 INFO [monitor.py:161] sleep 10s. 2020-04-02 10:12 DEBUG [monitor.py:214] check cmd: /hadoop-3.1.2/bin/hadoop fs -ls /donefile 2>/dev/null
2020-04-02 10:12 DEBUG [monitor.py:216] resp: -rw-r--r-- 1 root supergroup 0 2020-04-02 10:11 /donefile
2020-04-02 10:12 INFO [monitor.py:161] sleep 10s.
``` ```
#### View the Server log #### View the Server log
......
...@@ -186,24 +186,21 @@ class Monitor(object): ...@@ -186,24 +186,21 @@ class Monitor(object):
raise Exception('update local donefile failed.') raise Exception('update local donefile failed.')
class AFSMonitor(Monitor): class HadoopMonitor(Monitor):
''' AFS Monitor(by hadoop-client). ''' ''' Monitor HDFS or AFS by Hadoop-client. '''
def __init__(self, def __init__(self, hadoop_bin, fs_name='', fs_ugi='', interval=10):
hadoop_bin, super(HadoopMonitor, self).__init__(interval)
hadoop_host=None,
hadoop_ugi=None,
interval=10):
super(AFSMonitor, self).__init__(interval)
self._hadoop_bin = hadoop_bin self._hadoop_bin = hadoop_bin
self._hadoop_host = hadoop_host self._fs_name = fs_name
self._hadoop_ugi = hadoop_ugi self._fs_ugi = fs_ugi
self._print_params(['_hadoop_bin', '_hadoop_host', '_hadoop_ugi']) self._print_params(['_hadoop_bin', '_fs_name', '_fs_ugi'])
self._cmd_prefix = '{} fs '.format(self._hadoop_bin) self._cmd_prefix = '{} fs '.format(self._hadoop_bin)
if self._hadoop_host and self._hadoop_ugi: if self._fs_name:
self._cmd_prefix += '-D fs.default.name={} -D hadoop.job.ugi={} '.format( self._cmd_prefix += '-D fs.default.name={} '.format(self._fs_name)
self._hadoop_host, self._hadoop_ugi) if self._fs_ugi:
_LOGGER.info('AFS prefix cmd: {}'.format(self._cmd_prefix)) self._cmd_prefix += '-D hadoop.job.ugi={} '.format(self._fs_ugi)
_LOGGER.info('Hadoop prefix cmd: {}'.format(self._cmd_prefix))
def _exist_remote_file(self, path, filename, local_tmp_path): def _exist_remote_file(self, path, filename, local_tmp_path):
remote_filepath = os.path.join(path, filename) remote_filepath = os.path.join(path, filename)
...@@ -233,37 +230,6 @@ class AFSMonitor(Monitor): ...@@ -233,37 +230,6 @@ class AFSMonitor(Monitor):
self._check_param_help('remote_model_name', dirname))) self._check_param_help('remote_model_name', dirname)))
class HDFSMonitor(Monitor):
''' HDFS Monitor. '''
def __init__(self, hdfs_bin, interval=10):
super(HDFSMonitor, self).__init__(interval)
self._hdfs_bin = hdfs_bin
self._print_params(['_hdfs_bin'])
self._prefix_cmd = '{} dfs '.format(self._hdfs_bin)
_LOGGER.info('HDFS prefix cmd: {}'.format(self._prefix_cmd))
def _exist_remote_file(self, path, filename, local_tmp_path):
remote_filepath = os.path.join(path, filename)
cmd = '{} -stat "%Y" {}'.format(self._prefix_cmd, remote_filepath)
_LOGGER.debug('check cmd: {}'.format(cmd))
[status, timestamp] = commands.getstatusoutput(cmd)
_LOGGER.debug('resp: {}'.format(timestamp))
if status == 0:
return [True, timestamp]
else:
return [False, None]
def _pull_remote_dir(self, remote_path, dirname, local_tmp_path):
remote_dirpath = os.path.join(remote_path, dirname)
cmd = '{} -get -f {} {}'.format(self._prefix_cmd, remote_dirpath,
local_tmp_path)
_LOGGER.debug('pull cmd: {}'.format(cmd))
if os.system(cmd) != 0:
raise Exception('pull remote dir failed. {}'.format(
self._check_param_help('remote_model_name', dirname)))
class FTPMonitor(Monitor): class FTPMonitor(Monitor):
''' FTP Monitor. ''' ''' FTP Monitor. '''
...@@ -438,8 +404,6 @@ def parse_args(): ...@@ -438,8 +404,6 @@ def parse_args():
parser.set_defaults(debug=False) parser.set_defaults(debug=False)
# general monitor # general monitor
parser.add_argument("--general_host", type=str, help="General remote host") parser.add_argument("--general_host", type=str, help="General remote host")
# hdfs monitor
parser.add_argument("--hdfs_bin", type=str, help="Path of HDFS binary file")
# ftp monitor # ftp monitor
parser.add_argument("--ftp_host", type=str, help="FTP remote host") parser.add_argument("--ftp_host", type=str, help="FTP remote host")
parser.add_argument("--ftp_port", type=int, help="FTP remote port") parser.add_argument("--ftp_port", type=int, help="FTP remote port")
...@@ -453,19 +417,19 @@ def parse_args(): ...@@ -453,19 +417,19 @@ def parse_args():
type=str, type=str,
default='', default='',
help="FTP password. Not used if anonymous access") help="FTP password. Not used if anonymous access")
# afs monitor # afs/hdfs monitor
parser.add_argument( parser.add_argument(
"--hadoop_bin", type=str, help="Path of Hadoop binary file") "--hadoop_bin", type=str, help="Path of Hadoop binary file")
parser.add_argument( parser.add_argument(
"--hadoop_host", "--fs_name",
type=str, type=str,
default=None, default='',
help="AFS host. Not used if set in Hadoop-client.") help="AFS/HDFS fs_name. Not used if set in Hadoop-client.")
parser.add_argument( parser.add_argument(
"--hadoop_ugi", "--fs_ugi",
type=str, type=str,
default=None, default='',
help="AFS ugi, Not used if set in Hadoop-client") help="AFS/HDFS fs_ugi, Not used if set in Hadoop-client")
return parser.parse_args() return parser.parse_args()
...@@ -485,16 +449,11 @@ def get_monitor(mtype): ...@@ -485,16 +449,11 @@ def get_monitor(mtype):
username=args.ftp_username, username=args.ftp_username,
password=args.ftp_password, password=args.ftp_password,
interval=args.interval) interval=args.interval)
elif mtype == 'hdfs':
return HDFSMonitor(args.hdfs_bin, interval=args.interval)
elif mtype == 'general': elif mtype == 'general':
return GeneralMonitor(args.general_host, interval=args.interval) return GeneralMonitor(args.general_host, interval=args.interval)
elif mtype == 'afs': elif mtype == 'afs' or mtype == 'hdfs':
return AFSMonitor( return HadoopMonitor(
args.hadoop_bin, args.hadoop_bin, args.fs_name, args.fs_ugi, interval=args.interval)
args.hadoop_host,
args.hadoop_ugi,
interval=args.interval)
else: else:
raise Exception('unsupport type.') raise Exception('unsupport type.')
......
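To make the effect of the new `fs_name` and `fs_ugi` options concrete, the prefix-building logic from the `HadoopMonitor` constructor in the diff can be pulled out into a standalone function. The function name `build_cmd_prefix` and the sample values `afs://example.host:9902` and `user,passwd` are placeholders chosen for illustration, not values from the commit.

```python
def build_cmd_prefix(hadoop_bin, fs_name='', fs_ugi=''):
    """Mirror the prefix logic of HadoopMonitor.__init__ shown in the diff.

    Each -D option is added only when the matching argument is non-empty,
    so either one can be left to the Hadoop-client configuration.
    """
    prefix = '{} fs '.format(hadoop_bin)
    if fs_name:
        prefix += '-D fs.default.name={} '.format(fs_name)
    if fs_ugi:
        prefix += '-D hadoop.job.ugi={} '.format(fs_ugi)
    return prefix


# Plain HDFS, everything configured in the Hadoop client:
print(repr(build_cmd_prefix('/hadoop-3.1.2/bin/hadoop')))
# '/hadoop-3.1.2/bin/hadoop fs '

# Hypothetical AFS endpoint and credentials passed on the command line:
print(repr(build_cmd_prefix('hadoop',
                            fs_name='afs://example.host:9902',
                            fs_ugi='user,passwd')))
# 'hadoop fs -D fs.default.name=afs://example.host:9902 -D hadoop.job.ugi=user,passwd '
```

Compared with the removed `AFSMonitor`, which added both `-D` options only when host and ugi were set together, the two options are now independent.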