# ERNIE-GEN

[ERNIE-GEN](https://arxiv.org/pdf/2001.11314.pdf) is a multi-flow language generation framework for both pre-training and fine-tuning.
Only the finetuning strategy is covered in this section.

## Finetune

We use the abstractive summarization task CNN/DailyMail to illustrate the usage of ERNIE-GEN. You can download the preprocessed finetuning data from [here](https://ernie-github.cdn.bcebos.com/data-cnndm.tar.gz).

To start finetuning ERNIE-GEN, run:

```shell
python3 -m paddle.distributed.launch \
    --log_dir ./log  \
    ./demo/seq2seq/finetune_seq2seq_dygraph.py \
    --from_pretrained ernie-gen-base-en \
    --data_dir ./data/cnndm \
    --save_dir ./model_cnndm \
    --label_smooth 0.1 \
    --use_random_noice \
    --noise_prob 0.7 \
    --predict_output_dir ./pred \
    --max_steps $((287113*30/64))
```

Note that you need more than 2 GPUs to run the finetuning.
During multi-GPU finetuning, `max_steps` is used as the stopping criterion rather than `epoch` to prevent deadlock.
We simply calculate `max_steps` as `EPOCH * NUM_TRAIN_EXAMPLE / TOTAL_BATCH`.
This demo script saves the finetuned model to `--save_dir`, runs multi-GPU prediction every `--eval_steps` steps, and saves the prediction results to `--predict_output_dir`.
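
As a worked example of the `max_steps` formula, the sketch below plugs in the numbers from the finetuning command. Reading `30` as the epoch count and `64` as the total batch size across all GPUs is an assumption based on the formula; the variable names are illustrative and are not flags of the script.

```shell
# Illustrative variables; values mirror the expression passed to --max_steps.
EPOCH=30                   # assumed number of passes over the training set
NUM_TRAIN_EXAMPLE=287113   # number of training examples used in the command
TOTAL_BATCH=64             # assumed total batch size summed over all GPUs

# Shell arithmetic is integer-only, so the result is truncated.
MAX_STEPS=$((EPOCH * NUM_TRAIN_EXAMPLE / TOTAL_BATCH))
echo "$MAX_STEPS"   # prints 134584
```

If you change the number of GPUs or the per-GPU batch size, recompute the value before passing it to `--max_steps`.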


### Evaluation

During finetuning, a series of prediction files is generated.
First, sort and join all files from one step (step 60000 in this example) with:

```shell
sort -t$'\t' -k1n ./pred/pred.step60000.* | awk -F"\t" '{print $2}' > final_prediction
```
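
To see what the sort-and-join step does, here is a small self-contained sketch on made-up data. It assumes, as the command implies, that each prediction file holds tab-separated lines with a numeric example id in column 1 and the predicted text in column 2. The file contents are hypothetical, and `"$(printf '\t')"` is used as a portable stand-in for `$'\t'`.

```shell
# Build two hypothetical prediction shards in a scratch directory.
tmp=$(mktemp -d)
printf '2\tthird summary\n0\tfirst summary\n' > "$tmp/pred.step60000.0"
printf '1\tsecond summary\n' > "$tmp/pred.step60000.1"

# Sort all shards numerically by the id column, then keep only the text column.
sort -t"$(printf '\t')" -k1n "$tmp"/pred.step60000.* \
    | awk -F"\t" '{print $2}' > "$tmp/final_prediction"

cat "$tmp/final_prediction"
```

The resulting file lists the predictions in id order, one per line, regardless of which shard each came from.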

Then use `./eval_cnndm/cnndm_eval.sh` to calculate all metrics
(`pyrouge` is required to evaluate CNN/DailyMail).

```shell
sh cnndm_eval.sh final_prediction ./data/cnndm/dev.summary
```


### Inference

To run beam search decoding once you have a finetuned model, try:

```shell

cat one_column_source_text | python3 demo/seq2seq/decode.py \
    --from_pretrained ./ernie_gen_large \
    --save_dir ./model_cnndm \
    --bsz 8
```
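
The name `one_column_source_text` suggests a plain-text file with one source document per line; that reading is an assumption, not something this README states. Under that assumption, a minimal sketch for building such a file from a hypothetical tab-separated file `articles.tsv` (source text in column 1, reference summary in column 2) could look like this:

```shell
# Hypothetical input: one article per line, tab-separated from its reference summary.
tmp=$(mktemp -d)
printf 'first article text\tfirst reference summary\n' > "$tmp/articles.tsv"
printf 'second article text\tsecond reference summary\n' >> "$tmp/articles.tsv"

# Keep only the source column so each line holds exactly one document.
cut -f1 "$tmp/articles.tsv" > "$tmp/one_column_source_text"

cat "$tmp/one_column_source_text"
```

The resulting file can then be piped into the decoding command in place of `one_column_source_text`.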