1. 09 Dec, 2019 2 commits
  2. 08 Dec, 2019 1 commit
  3. 06 Dec, 2019 4 commits
    • B
      cherry-pick MKL-DNN NHWC FWD support fix (#21593) · 1f598dfa
      Committed by bingyanghuang
      1f598dfa
    • A
      f83254d6
    • e228e707
    • Z
      CHERRY_PICK: Better TensorRT support (#20858) (#21578) · 0a4002f5
      Committed by Zhaolong Xing
      * Fix TensorRT detection bug
      
      1. Add a new search path for TensorRT in tensorrt.cmake
      2. Add clearer debug messages
      3. Fix TensorRT version detection
      
      In the official NVIDIA Docker image, TensorRT headers are located in
      `/usr/include/x86_64-linux-gnu` and TensorRT libraries are located
      in `/usr/lib/x86_64-linux-gnu`, so using `-DTENSORRT_ROOT` fails
      to detect TensorRT.
      
      No debug or warning message tells the developer when TensorRT
      detection fails.
      
      In later versions of TensorRT (e.g. v6), `NV_TENSORRT_MAJOR` is
      defined in `NvInferVersion.h` instead of `NvInfer.h`, so add a
      compatibility fix.
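
The header fallback described above can be sketched as follows. This is a Python illustration of the logic only, not the CMake code from the commit; the function name `detect_tensorrt_major` and the exact regular expression are assumptions made for this example.

```python
import re
from pathlib import Path
from typing import Optional

# Matches a line such as: #define NV_TENSORRT_MAJOR 6
_MAJOR_RE = re.compile(r"^\s*#\s*define\s+NV_TENSORRT_MAJOR\s+(\d+)", re.MULTILINE)


def detect_tensorrt_major(include_dir: str) -> Optional[int]:
    """Return the NV_TENSORRT_MAJOR value found under include_dir, or None.

    Hypothetical helper: newer releases are assumed to define the macro in
    NvInferVersion.h, older ones in NvInfer.h, so both are tried in that order.
    """
    for header in ("NvInferVersion.h", "NvInfer.h"):
        path = Path(include_dir) / header
        if not path.is_file():
            continue
        match = _MAJOR_RE.search(path.read_text(errors="ignore"))
        if match:
            return int(match.group(1))
    # Nothing found: the caller should warn instead of failing silently.
    return None
```

Returning `None` rather than raising lets the caller print the kind of warning the commit adds when detection fails.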
      
      * Fix TensorRT variables in CMake
      
      1. Replace `${TENSORRT_ROOT}/include` with `${TENSORRT_INCLUDE_DIR}`
      2. Replace `${TENSORRT_ROOT}/lib` with `${TENSORRT_LIBRARY}`
      
      Manually typed paths may point to the wrong TensorRT location. Use the
      paths detected by the system instead.
      
      * Fix TensorRT library path
      
      1. Add new variable - `${TENSORRT_LIBRARY_DIR}`
      2. Fix TensorRT library path
      
      inference_lib.cmake and setup.py.in need the directory that contains the
      TensorRT library rather than the library file itself, so add a new
      variable for it.
      
      * Add more general search rule for TensorRT
      
      Let the system detect the architecture instead of assigning it manually,
      so replace `x86_64-linux-gnu` with `${CMAKE_LIBRARY_ARCHITECTURE}`.
      
      * Add more general search rule for TensorRT
      
      Remove duplicate search rules for TensorRT libraries. Use
      `${TENSORRT_LIBRARY_DIR}` to get the full path of libnvinfer.so.
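
The search rules above, and the difference between the library file and its directory, can be illustrated with a small Python sketch. It is not the CMake implementation: the helper names and the candidate order (`lib`, `lib64`, `lib/<arch>`) are assumptions for illustration, and `arch` stands in for the value of `${CMAKE_LIBRARY_ARCHITECTURE}`.

```python
import os
from typing import List, Optional, Tuple


def candidate_lib_dirs(root: str, arch: str) -> List[str]:
    """Directories to probe under root; the order is an assumption."""
    return [
        os.path.join(root, "lib"),
        os.path.join(root, "lib64"),
        os.path.join(root, "lib", arch),  # e.g. lib/x86_64-linux-gnu
    ]


def find_library(root: str, arch: str,
                 name: str = "libnvinfer.so") -> Optional[Tuple[str, str]]:
    """Return (library_file, library_dir) for the first match, or None."""
    for lib_dir in candidate_lib_dirs(root, arch):
        lib_file = os.path.join(lib_dir, name)
        if os.path.isfile(lib_file):
            # library_file plays the role of TENSORRT_LIBRARY,
            # library_dir the role of TENSORRT_LIBRARY_DIR.
            return lib_file, lib_dir
    return None
```

Passing the architecture in as a parameter mirrors the commit's move away from a hard-coded `x86_64-linux-gnu`.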
      
      test=release/1.6
      0a4002f5
  4. 05 Dec, 2019 5 commits
  5. 04 Dec, 2019 6 commits
  6. 03 Dec, 2019 11 commits
  7. 02 Dec, 2019 5 commits
  8. 30 Nov, 2019 1 commit
  9. 29 Nov, 2019 3 commits
  10. 28 Nov, 2019 1 commit
    • X
      cherry-pick1.6 fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21339) · 072eb5b6
      Committed by xujiaqi01
      * fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21052)
      
      * fix cache table bug
      * add save_paddle_inference_model
      * fix hdfs util bug
      * test=develop
      
      * fix several sparse table issues (#20686)
      
      * no longer need to define the embedding layers of all slots (without omitting any) in each program. make trainer_param repeated in ps.proto.
      * add find_distributed_lookup_table_grads instead of hard-coding GRAD
      * support embedding stop gradient. push sparse failed before this fix.
      * fix fill sparse: skip slots which do not have an embedding. before this fix, each slot's embedding in a sparse table had to be used in all training programs.
      * fix pull sparse: skip slots which do not have an embedding.
      * fix collect feasign label info: skip slots which do not have an embedding.
      * support multiple sparse tables in one or more training programs: each program can pull/push only its own related sparse tables instead of all sparse tables.
      * test=develop
      
      * add copy table (#21086)
      
      * copy some feasigns and corresponding embeddings from one sparse table to another
      * copy all feasigns and corresponding embeddings from one sparse table to another
      * copy all dense params from one table to another
      * copy some local vars to other local vars
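
The "copy some" and "copy all" feasign cases listed above can be pictured with a minimal Python sketch. It models a sparse table as a plain dict from feasign to embedding vector; this is an assumption for illustration only and does not reflect the real table implementation or any Paddle API. The name `copy_feasigns` is hypothetical.

```python
from typing import Dict, Iterable, List, Optional

# Toy model: feasign id -> embedding vector (an assumption, not the real table).
SparseTable = Dict[int, List[float]]


def copy_feasigns(src: SparseTable, dst: SparseTable,
                  feasigns: Optional[Iterable[int]] = None) -> int:
    """Copy embeddings from src to dst and return how many were copied.

    feasigns=None copies every entry; otherwise only the listed feasigns
    that actually exist in src are copied.
    """
    keys = list(src.keys()) if feasigns is None else list(feasigns)
    copied = 0
    for key in keys:
        if key in src:
            dst[key] = list(src[key])  # copy the vector, do not alias it
            copied += 1
    return copied
```

Copying the vector rather than sharing the reference keeps later updates to one table from leaking into the other.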
      
      * fix fs_client_param bug (#21212)
      
      * fix fs_client_param bug; users can set this config through fleet_desc_file or fleet config
      * test=develop
      
      * fix fleet util bug (#21254)
      
      * fix fleet util bug in save paddle inference model
      * test=develop
      072eb5b6
  11. 26 Nov, 2019 1 commit