diff --git a/doc/faq.md b/doc/faq.md index 9cda8a02ebe90453387af5a9f57dfaa23d4889b0..786cb1644bfbcc7bcd4642235ff940ac9a89ba2a 100644 --- a/doc/faq.md +++ b/doc/faq.md @@ -17,6 +17,7 @@ OpenPose - Frequently Asked Question (FAQ) 11. [CUDA_cublas_device_LIBRARY Not Found](#cuda_cublas_device_library-not-found) 12. [CMake-GUI Error While Getting Default Caffe](#cmake-gui-error-while-getting-default-caffe) 13. [Libgomp Out of Memory Error](#libgomp-out-of-memory-error) + 14. [Runtime Error with Turing GPU (Tesla T4) or Volta GPU](#runtime-error-with-turing-gpu-tesla-t4-or-volta-gpu) 2. [Speed Performance Issues](#speed-performance-issues) 1. [Speed Up, Memory Reduction, and Benchmark](#speed-up-memory-reduction-and-benchmark) 2. [How to Measure the Latency Time?](#how-to-measure-the-latency-time) @@ -39,7 +40,7 @@ OpenPose - Frequently Asked Question (FAQ) #### Out of Memory Error **Q: Out of memory error** - I get an error similar to: `Check failed: error == cudaSuccess (2 vs. 0) out of memory`. -**A**: Most probably cuDNN is not installed/enabled, the default Caffe model uses >12 GB of GPU memory, cuDNN reduces it to ~2 GB for BODY_25 and ~1.5 GB for COCO. +**A**: Most probably cuDNN is not installed/enabled, the default Caffe model uses >12 GB of GPU memory, cuDNN reduces it to ~2.2 GB for BODY_25 (default) and ~1.5 GB for COCO (`--model_pose COCO`). Note that you still need at least about 2.2 GB free for the default OpenPose to run. I.e., GPUs with only 2 GB will not fit the default OpenPose, and you will have to either switch to the `COCO` model (slower and less accurate), or reduce the `--net_resolution` (faster speed but also lower accuracy). @@ -162,6 +163,14 @@ git submodle update +#### Runtime Error with Turing GPU (Tesla T4) or Volta GPU +**Q**: When I start OpenPose, I receive a runtime error for new GPU architectures. 
+ +**A**: To solve this problem, 1) make sure you are using CUDA 10 or higher, and 2) change line 7 in `{OPENPOSE_PATH}/3rdparty/caffe/cmake/Cuda.cmake`, from `set(Caffe_known_gpu_archs "30 35 50 52 60 61")` to `set(Caffe_known_gpu_archs "30 35 50 52 60 61 75")`. + + + + ### Speed Performance Issues #### Speed Up, Memory Reduction, and Benchmark diff --git a/doc/installation.md b/doc/installation.md index fb9a3c2af9fbb46f8a750e6c1df2af294a8a2bbf..914e04424f3d1c8c724521c6e93872bec9591bc6 100644 --- a/doc/installation.md +++ b/doc/installation.md @@ -54,9 +54,14 @@ We add links to some community-based work based on OpenPose. Note: We do not sup - [ROS example](https://github.com/firephinx/openpose_ros) (based on a very old OpenPose version). For questions and more details, read and post ONLY on [issue thread #51](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/51). - Docker Images. For questions and more details, read and post ONLY on [issue thread #347](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/347). - - Dockerfile working with CUDA 10: [link 1](https://github.com/ExSidius/openpose-docker/blob/master/Dockerfile) and [link 2](https://cloud.docker.com/repository/docker/exsidius/openpose/general). - - [Dockerfile - OpenPose v1.4.0, OpenCV, CUDA 8, CuDNN 6, Python2.7](https://gist.github.com/moiseevigor/11c02c694fc0c22fccd59521793aeaa6). - - [Dockerfile - OpenPose v1.2.1](https://gist.github.com/sberryman/6770363f02336af82cb175a83b79de33). + - Dockerfile working also with CUDA 10: + - [Link 1](https://github.com/esemeniuc/openpose-docker), it claims to also include Python support. Read and post ONLY on [issue thread #1102](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/1102). + - [Link 2](https://github.com/ExSidius/openpose-docker/blob/master/Dockerfile). + - [Link 3](https://cloud.docker.com/repository/docker/exsidius/openpose/general). 
+ - Dockerfile working only with CUDA 8: + - [Dockerfile - OpenPose v1.4.0, OpenCV, CUDA 8, CuDNN 5, Python2.7](https://github.com/tlkh/openpose). Read and post ONLY on [issue thread #1102](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/1102). + - [Dockerfile - OpenPose v1.4.0, OpenCV, CUDA 8, CuDNN 6, Python2.7](https://gist.github.com/moiseevigor/11c02c694fc0c22fccd59521793aeaa6). + - [Dockerfile - OpenPose v1.2.1](https://gist.github.com/sberryman/6770363f02336af82cb175a83b79de33). - [Google Colab helper script](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/949#issue-387855863): Script to install OpenPose on Google Colab. Really useful when access to a computer powerful enough to run OpenPose is not possible, so one possible way to use OpenPose is to build it on a GPU-enabled Colab runtime and then run the programs there. For questions and more details, read and post ONLY on [issue thread #949](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/949). @@ -168,7 +173,25 @@ make -j`nproc` ``` #### Windows -In order to build the project, open the Visual Studio solution (Windows), called `build/OpenPose.sln`. Then, set the configuration from `Debug` to `Release` and press the green triangle icon (alternatively press F5). +In order to build the project, select and run only one of the 2 following alternatives. + +1. **CMake-GUI alternative (recommended)**: Open the Visual Studio solution (Windows), called `build/OpenPose.sln`. Then, set the configuration from `Debug` to `Release` and press the green triangle icon (alternatively press F5). + +2. Command-line build alternative (not recommended). NOTE: The command line alternative is not officially supported, but it was added in [GitHub issue #1198](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/1198). For any questions or bug report about this command-line version, comment in that GitHub issue. + 1. 
Run "MSVS 2017 Developer Command Console" + ``` + openpose\mkdir build + cd build + cmake .. -G "Visual Studio 15 2017 Win64" -T v140 + cmake --build . --config Release + copy x64\Release\* bin\ + ``` + 2. If you want a clean build + ``` + cmake --build . --config Release --target clean + cmake --build . --config Release + copy x64\Release\* bin\ + ``` **VERY IMPORTANT NOTE**: In order to use OpenPose outside Visual Studio, and assuming you have not unchecked the `BUILD_BIN_FOLDER` flag in CMake, copy all DLLs from `{build_directory}/bin` into the folder where the generated `openpose.dll` and `*.exe` demos are, e.g., `{build_directory}x64/Release` for the 64-bit release version. diff --git a/doc/modules/3d_reconstruction_module.md b/doc/modules/3d_reconstruction_module.md index c839eff1346cbeb0f277491c19b4076d29989c02..bc7dd78c2a1658c7418c8df9bfb9ac3517e1622f 100644 --- a/doc/modules/3d_reconstruction_module.md +++ b/doc/modules/3d_reconstruction_module.md @@ -23,13 +23,13 @@ This module performs 3-D keypoint (body, face, and hand) reconstruction and rend ## Installation -Check [doc/installation.md#3d-reconstruction-module](./installation.md#3d-reconstruction-module) for installation steps. +Check [doc/installation.md#3d-reconstruction-module](../installation.md#3d-reconstruction-module) for installation steps. ## Non Linear Optimization -In order to increase the 3-D reconstruction accuracy, OpenPose optionally performs non-linear optimization if Ceres solver support is enabled (only available in Ubuntu for now). To enable it, check [doc/installation.md#3d-reconstruction-module](./installation.md#3d-reconstruction-module) for more details. +In order to increase the 3-D reconstruction accuracy, OpenPose optionally performs non-linear optimization if Ceres solver support is enabled (only available in Ubuntu for now). To enable it, check [doc/installation.md#3d-reconstruction-module](../installation.md#3d-reconstruction-module) for more details. 
diff --git a/doc/release_notes.md b/doc/release_notes.md index 2d31499f849e2cbcd1672cba9f3c44f9671a0be2..75f1d98d8958353b1b0a1ed2850f6721afdd9cf8 100644 --- a/doc/release_notes.md +++ b/doc/release_notes.md @@ -369,10 +369,14 @@ OpenPose Library - Release Notes 1. Main improvements: 1. Highly improved 3D triangulation for >3 cameras by fixing some small bugs. 2. Added community-based support for Nvidia NVCaffe. + 3. Increased accuracy very slightly for CUDA version (about 0.01%) by adapting the threshold in `process()` in `bodyPartConnectorBase.cu` to `defaultNmsThreshold`. This also removes any possibility of future bugs in that function when using a default NMS threshold higher than 0.15 (which was the hard-coded value used previously). + 4. Increased mAP but reduced mAR (both about 0.01%) as well as reduction of false positives. Step 1: removed legs where only knee/ankle/feet are found. Step 2: If no person is found in an image, `removePeopleBelowThresholds` is re-run with `maximizePositives = true`. + 5. The maximum number of people is no longer limited by the maximum number of peaks. However, the number of body part candidates for a specific keypoint (e.g., nose) is still limited to the maximum number of peaks. 2. Functions or parameters renamed: 1. `--3d_min_views` default value (-1) no longer means that all camera views are required. Instead, it will be equal to max(2, min(4, #cameras-1)). This should provide a good trade-off between recall and precission. 3. Main bugs fixed: 1. Windows: Added back support for OpenGL and Spinnaker, as well as DLLs for debug compilation. + 2. `06_face_from_image.cpp` and `07_hand_from_image.cpp` working again, they stopped working in version 1.5.0 with the GPU image resize for the GUI. 4. 
Changes/additions that affect the compatibility with the OpenPose Unity Plugin: diff --git a/include/openpose/net/bodyPartConnectorBase.hpp b/include/openpose/net/bodyPartConnectorBase.hpp index 0050b27d28de16c35c3e43436eacac7306a6e46d..d863088a0829537ebd7ed5bc569fbdd6b63706b9 100644 --- a/include/openpose/net/bodyPartConnectorBase.hpp +++ b/include/openpose/net/bodyPartConnectorBase.hpp @@ -18,10 +18,10 @@ namespace op void connectBodyPartsGpu( Array& poseKeypoints, Array& poseScores, const T* const heatMapGpuPtr, const T* const peaksPtr, const PoseModel poseModel, const Point& heatMapSize, const int maxPeaks, const T interMinAboveThreshold, - const T interThreshold, const int minSubsetCnt, const T minSubsetScore, const T scaleFactor = 1.f, - const bool maximizePositives = false, Array pairScoresCpu = Array{}, T* pairScoresGpuPtr = nullptr, - const unsigned int* const bodyPartPairsGpuPtr = nullptr, const unsigned int* const mapIdxGpuPtr = nullptr, - const T* const peaksGpuPtr = nullptr); + const T interThreshold, const int minSubsetCnt, const T minSubsetScore, const T scaleFactor, + const bool maximizePositives, Array pairScoresCpu, T* pairScoresGpuPtr, + const unsigned int* const bodyPartPairsGpuPtr, const unsigned int* const mapIdxGpuPtr, + const T* const peaksGpuPtr, const T defaultNmsThreshold); template void connectBodyPartsOcl( @@ -41,17 +41,17 @@ namespace op const unsigned int numberBodyPartPairs, const Array& precomputedPAFs = Array()); template - void removePeopleBelowThresholds(std::vector& validSubsetIndexes, int& numberPeople, - const std::vector, T>>& subsets, - const unsigned int numberBodyParts, const int minSubsetCnt, - const T minSubsetScore, const int maxPeaks, const bool maximizePositives); + void removePeopleBelowThresholdsAndFillFaces( + std::vector& validSubsetIndexes, int& numberPeople, + std::vector, T>>& subsets, const unsigned int numberBodyParts, + const int minSubsetCnt, const T minSubsetScore, const bool maximizePositives, const T* 
const peaksPtr); template - void peopleVectorToPeopleArray(Array& poseKeypoints, Array& poseScores, const T scaleFactor, - const std::vector, T>>& subsets, - const std::vector& validSubsetIndexes, const T* const peaksPtr, - const int numberPeople, const unsigned int numberBodyParts, - const unsigned int numberBodyPartPairs); + void peopleVectorToPeopleArray( + Array& poseKeypoints, Array& poseScores, const T scaleFactor, + const std::vector, T>>& subsets, const std::vector& validSubsetIndexes, + const T* const peaksPtr, const int numberPeople, const unsigned int numberBodyParts, + const unsigned int numberBodyPartPairs); template std::vector> pafPtrIntoVector( diff --git a/include/openpose/net/bodyPartConnectorCaffe.hpp b/include/openpose/net/bodyPartConnectorCaffe.hpp index 4562ee2968673302a1b7687121474d48a798f0f2..ac1d5d9f1b3c32d18df8888109fb29c60582c2d9 100644 --- a/include/openpose/net/bodyPartConnectorCaffe.hpp +++ b/include/openpose/net/bodyPartConnectorCaffe.hpp @@ -25,6 +25,8 @@ namespace op void setMaximizePositives(const bool maximizePositives); + void setDefaultNmsThreshold(const T defaultNmsThreshold); + void setInterMinAboveThreshold(const T interMinAboveThreshold); void setInterThreshold(const T interThreshold); @@ -56,6 +58,7 @@ namespace op private: PoseModel mPoseModel; bool mMaximizePositives; + T mDefaultNmsThreshold; T mInterMinAboveThreshold; T mInterThreshold; int mMinSubsetCnt; diff --git a/include/openpose/pose/poseParametersRender.hpp b/include/openpose/pose/poseParametersRender.hpp index 2fd42f6fd73e10ba5816a497faf6411174e3a510..ddd61e7fbe08a6b88f2cd15504e2d1ba5d0e4104 100644 --- a/include/openpose/pose/poseParametersRender.hpp +++ b/include/openpose/pose/poseParametersRender.hpp @@ -210,10 +210,12 @@ namespace op 1.f,1.f,1.f,1.f,1.f,1.f, \ 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, \ 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 
0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, \ - 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.45f,0.45f,0.45f, \ + 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \ 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \ 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \ 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f + // First 0.45f row: + // 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.45f,0.45f,0.45f, #define POSE_BODY_135_COLORS_RENDER_GPU \ 255.f, 0.f, 85.f, \ 170.f, 0.f, 255.f, \ diff --git a/include/openpose/utilities/keypoint.hpp b/include/openpose/utilities/keypoint.hpp index bbc5272404e4ef99590256f07b74f35ca951eefa..1f21fb20378145307cbfa51e8f43cf32ab5b1c49 100644 --- a/include/openpose/utilities/keypoint.hpp +++ b/include/openpose/utilities/keypoint.hpp @@ -21,12 +21,15 @@ namespace op void scaleKeypoints2d(Array& keypoints, const T scaleX, const T scaleY, const T offsetX, const T offsetY); template - void renderKeypointsCpu(Array& frameArray, const Array& keypoints, const std::vector& pairs, - const std::vector colors, const T thicknessCircleRatio, - const T thicknessLineRatioWRTCircle, const std::vector& poseScales, const T threshold); + void renderKeypointsCpu( + Array& frameArray, const Array& keypoints, const std::vector& pairs, + const std::vector colors, const T thicknessCircleRatio, const T thicknessLineRatioWRTCircle, + const std::vector& poseScales, const T threshold); template - Rectangle getKeypointsRectangle(const Array& keypoints, const int person, const T threshold); + Rectangle getKeypointsRectangle( + const Array& keypoints, const int person, const T threshold, const 
int firstIndex = 0, + const int lastIndex = -1); template T getAverageScore(const Array& keypoints, const int person); @@ -44,8 +47,9 @@ namespace op T getDistanceAverage(const Array& keypoints, const int personA, const int personB, const T threshold); template - T getDistanceAverage(const Array& keypointsA, const int personA, const Array& keypointsB, const int personB, - const T threshold); + T getDistanceAverage( + const Array& keypointsA, const int personA, const Array& keypointsB, const int personB, + const T threshold); /** * Creates and Array with a specific person. diff --git a/include/openpose/wrapper/wrapperAuxiliary.hpp b/include/openpose/wrapper/wrapperAuxiliary.hpp index 2fed1d1728d99038a063c1f33690f68bf7b9369e..3ec50335dfdce2f5cfff741ef4f0223e67b59513 100644 --- a/include/openpose/wrapper/wrapperAuxiliary.hpp +++ b/include/openpose/wrapper/wrapperAuxiliary.hpp @@ -267,7 +267,7 @@ namespace op // Input cvMat to OpenPose input & output format // Note: resize on GPU reduces accuracy about 0.1% bool resizeOnCpu = true; - // const auto resizeOnCpu = (numberGpuThreads < 3); + // const auto resizeOnCpu = (wrapperStructPose.poseMode != PoseMode::Enabled); if (resizeOnCpu) { const auto gpuResize = false; @@ -277,7 +277,8 @@ namespace op } // Note: We realized that somehow doing it on GPU for any number of GPUs does speedup the whole OP resizeOnCpu = false; - addCvMatToOpOutputInCpu = addCvMatToOpOutput && (resizeOnCpu || !renderOutputGpu); + addCvMatToOpOutputInCpu = addCvMatToOpOutput + && (resizeOnCpu || !renderOutputGpu || wrapperStructPose.poseMode != PoseMode::Enabled); if (addCvMatToOpOutputInCpu) { const auto gpuResize = false; @@ -618,7 +619,7 @@ namespace op { const auto gpuResize = true; opOutputToCvMats.emplace_back(std::make_shared(gpuResize)); - poseExtractorsWs[i].emplace_back( + poseExtractorsWs.at(i).emplace_back( std::make_shared>(opOutputToCvMats.back())); // Assign shared parameters opOutputToCvMats.back()->setSharedParameters( diff --git 
a/scripts/ubuntu/Makefile.example b/scripts/ubuntu/Makefile.example index 3cacc833ef28a0d2c437d63efedffacfd667d01e..4f596eca183bb120f84cb40d51737da6ec60cbf3 100644 --- a/scripts/ubuntu/Makefile.example +++ b/scripts/ubuntu/Makefile.example @@ -33,7 +33,7 @@ LIBRARY_NAME := $(PROJECT) LIB_BUILD_DIR := $(BUILD_DIR)/lib STATIC_NAME := $(LIB_BUILD_DIR)/lib$(LIBRARY_NAME).a DYNAMIC_VERSION_MAJOR := 1 -DYNAMIC_VERSION_MINOR := 4 +DYNAMIC_VERSION_MINOR := 5 DYNAMIC_VERSION_REVISION := 0 DYNAMIC_NAME_SHORT := lib$(LIBRARY_NAME).so #DYNAMIC_SONAME_SHORT := $(DYNAMIC_NAME_SHORT).$(DYNAMIC_VERSION_MAJOR) diff --git a/src/openpose/net/bodyPartConnectorBase.cpp b/src/openpose/net/bodyPartConnectorBase.cpp index 9d9167ae6c7b7bcd1a5375cad12c108cc1890f5b..491b5b31d03b34dc58afe4394ea851b4a28fbbc3 100644 --- a/src/openpose/net/bodyPartConnectorBase.cpp +++ b/src/openpose/net/bodyPartConnectorBase.cpp @@ -1,15 +1,16 @@ #include #include #include +#include #include #include namespace op { template - inline T getScoreAB(const int i, const int j, const T* const candidateAPtr, const T* const candidateBPtr, - const T* const mapX, const T* const mapY, const Point& heatMapSize, - const T interThreshold, const T interMinAboveThreshold) + inline T getScoreAB( + const int i, const int j, const T* const candidateAPtr, const T* const candidateBPtr, const T* const mapX, + const T* const mapY, const Point& heatMapSize, const T interThreshold, const T interMinAboveThreshold) { try { @@ -57,6 +58,27 @@ namespace op } } + template + void getKeypointCounter( + int& personCounter, const std::vector, T>>& peopleVector, + const unsigned int index, const int indexFirst, const int indexLast, const int minimum) + { + try + { + // Count keypoints + auto keypointCounter = 0; + for (auto i = indexFirst ; i < indexLast ; i++) + keypointCounter += (peopleVector[index].first.at(i) > 0); + // If enough keypoints --> subtract them and keep them at least as big as minimum + if (keypointCounter > minimum) + 
personCounter += minimum-keypointCounter; // personCounter = non-considered keypoints + minimum + } + catch (const std::exception& e) + { + error(e.what(), __LINE__, __FUNCTION__, __FILE__); + } + } + template std::vector, T>> createPeopleVector( const T* const heatMapPtr, const T* const peaksPtr, const PoseModel poseModel, const Point& heatMapSize, @@ -211,8 +233,9 @@ namespace op for (auto j = 1; j <= numberPeaksB; j++) { // Initial PAF - auto scoreAB = getScoreAB(i, j, candidateAPtr, candidateBPtr, mapX, mapY, - heatMapSize, interThreshold, interMinAboveThreshold); + auto scoreAB = getScoreAB( + i, j, candidateAPtr, candidateBPtr, mapX, mapY, heatMapSize, interThreshold, + interMinAboveThreshold); // E.g., neck-nose connection. If possible PAF between neck i, nose j --> add // parts score + connection score @@ -263,9 +286,8 @@ namespace op const auto indexB = std::get<2>(aBConnection); if (!occurA[indexA-1] && !occurB[indexB-1]) { - abConnections.emplace_back(std::make_tuple(bodyPartA*peaksOffset + indexA*3 + 2, - bodyPartB*peaksOffset + indexB*3 + 2, - score)); + abConnections.emplace_back(std::make_tuple( + bodyPartA*peaksOffset+indexA*3+2, bodyPartB*peaksOffset+indexB*3+2, score)); counter++; if (counter==minAB) break; @@ -298,8 +320,8 @@ namespace op // Add ears connections (in case person is looking to opposite direction to camera) // Note: This has some issues: // - It does not prevent repeating the same keypoint in different people - // - Assuming I have nose,eye,ear as 1 person subset, and whole arm as another one, it will not - // merge them both + // - Assuming I have nose,eye,ear as 1 person subset, and whole arm as another one, it + // will not merge them both else if ( (numberBodyParts == 18 && (pairIndex==17 || pairIndex==18)) || ((numberBodyParts == 19 || (numberBodyParts == 25) @@ -622,49 +644,139 @@ namespace op } template - void removePeopleBelowThresholds( + void getRoiDiameterAndBounds( + Rectangle& roi, int& diameter, int& indexFirstNon0, 
int& indexLastNon0, + const std::vector& personVector, const T* const peaksPtr, + const int indexInit, const int indexEnd) + { + try + { + roi = Rectangle{0,0,0,0}; + for (auto index = 0u ; index < personVector.size()-1 ; index++) + { + const auto x = peaksPtr[personVector[index]-2]; + const auto y = peaksPtr[personVector[index]-1]; + const auto score = peaksPtr[personVector[index]]; + if (roi.x > x) + roi.x = x; + if (roi.y > y) + roi.y = y; + } + } + catch (const std::exception& e) + { + error(e.what(), __LINE__, __FUNCTION__, __FILE__); + } + } + + template + void removePeopleBelowThresholdsAndFillFaces( std::vector& validSubsetIndexes, int& numberPeople, - const std::vector, T>>& peopleVector, const unsigned int numberBodyParts, - const int minSubsetCnt, const T minSubsetScore, const int maxPeaks, const bool maximizePositives) + std::vector, T>>& peopleVector, const unsigned int numberBodyParts, + const int minSubsetCnt, const T minSubsetScore, const bool maximizePositives, const T* const peaksPtr) + // const int minSubsetCnt, const T minSubsetScore, const int maxPeaks, const bool maximizePositives) { try { // Delete people below the following thresholds: // a) minSubsetCnt: removed if less than minSubsetCnt body parts // b) minSubsetScore: removed if global score smaller than this - // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds + // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds -> Not required numberPeople = 0; validSubsetIndexes.clear(); - validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); + // validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); // maxPeaks is not required + validSubsetIndexes.reserve(peopleVector.size()); + // Face valid sets + std::vector faceValidSubsetIndexes; + faceValidSubsetIndexes.reserve(peopleVector.size()); + // Face invalid sets + std::vector faceInvalidSubsetIndexes; + 
faceInvalidSubsetIndexes.reserve(peopleVector.size()); + // For each person candidate for (auto index = 0u ; index < peopleVector.size() ; index++) { auto personCounter = peopleVector[index].first.back(); + // Analog for hand/face keypoints + if (numberBodyParts >= 135) + { + // No consider face keypoints for personCounter + const auto currentCounter = personCounter; + getKeypointCounter(personCounter, peopleVector, index, 65, 135, 1); + const auto newCounter = personCounter; + if (personCounter == 0) + { + faceInvalidSubsetIndexes.emplace_back(index); + continue; + } + // If body is still valid and facial points were removed, then add to valid faces + else if (currentCounter != newCounter) + faceValidSubsetIndexes.emplace_back(index); + // No consider right hand keypoints for personCounter + getKeypointCounter(personCounter, peopleVector, index, 45, 65, 1); + // No consider left hand keypoints for personCounter + getKeypointCounter(personCounter, peopleVector, index, 25, 45, 1); + } // Foot keypoints do not affect personCounter (too many false positives, // same foot usually appears as both left and right keypoints) // Pros: Removed tons of false positives // Cons: Standalone leg will never be recorded + // Solution: No consider foot keypoints for that if (!maximizePositives && (numberBodyParts == 25 || numberBodyParts > 70)) { - // No consider foot keypoints for that - for (auto i = 19 ; i < 25 ; i++) - personCounter -= (peopleVector[index].first.at(i) > 0); - // No consider hand keypoints for that - if (numberBodyParts > 70) - for (auto i = 25 ; i < 65 ; i++) - personCounter -= (peopleVector[index].first.at(i) > 0); + const auto currentCounter = personCounter; + getKeypointCounter(personCounter, peopleVector, index, 19, 25, 0); + const auto newCounter = personCounter; + // Problem: Same leg/foot keypoints are considered for both left and right keypoints. 
+ // Solution: Remove legs that are duplicated and that do not have upper torso + // Result: Slight increase in COCO mAP and decrease in mAR + reducing a lot false positives! + if (newCounter != currentCounter && newCounter <= 4) + continue; } + // Add only valid people const auto personScore = peopleVector[index].second; if (personCounter >= minSubsetCnt && (personScore/personCounter) >= minSubsetScore) { numberPeople++; validSubsetIndexes.emplace_back(index); - if (numberPeople == maxPeaks) - break; + // // This is not required, it is OK if there are more people. No more GPU memory used. + // if (numberPeople == maxPeaks) + // break; } + // Sanity check else if ((personCounter < 1 && numberBodyParts != 25 && numberBodyParts < 70) || personCounter < 0) error("Bad personCounter (" + std::to_string(personCounter) + "). Bug in this" " function if this happens.", __LINE__, __FUNCTION__, __FILE__); } +// // Random standalone facial keypoints --> Merge into a more complete face +// if (numberPeople > 0 && faceInvalidSubsetIndexes.size() > 0) +// { +// for (auto faceId = 0u ; faceId < faceInvalidSubsetIndexes.size() ; faceId++) +// { +// // Get ROI +// Rectangle roi; +// int diameter; +// int indexFirstNon0; +// int indexLastNon0; +// const auto index = faceValidSubsetIndexes[faceId]; +// getRoiDiameterAndBounds( +// roi, diameter, indexFirstNon0, indexLastNon0, peopleVector[index].first, peaksPtr, 65, 135); +// // const auto personCounter = peopleVector[index].first.back(); +// // const auto x = peaksPtr[peopleVector[index].first[part]-2]; +// // const auto y = peaksPtr[peopleVector[index].first[part]-1]; +// // const auto score = peaksPtr[peopleVector[index].first[part]]; +// } +// } + // If no people found --> Repeat with maximizePositives = true + // Result: Increased COCO mAP because we catch more foot-only images + if (numberPeople == 0 && !maximizePositives) + { + removePeopleBelowThresholdsAndFillFaces( + validSubsetIndexes, numberPeople, peopleVector, 
numberBodyParts, minSubsetCnt, minSubsetScore, + true, peaksPtr); + // // Debugging + // if (numberPeople > 0) + // log("Found " + std::to_string(numberPeople) + " people in second iteration"); + } } catch (const std::exception& e) { @@ -673,30 +785,35 @@ namespace op } template - void peopleVectorToPeopleArray(Array& poseKeypoints, Array& poseScores, const T scaleFactor, - const std::vector, T>>& peopleVector, - const std::vector& validSubsetIndexes, const T* const peaksPtr, - const int numberPeople, const unsigned int numberBodyParts, - const unsigned int numberBodyPartPairs) + void peopleVectorToPeopleArray( + Array& poseKeypoints, Array& poseScores, const T scaleFactor, + const std::vector, T>>& peopleVector, const std::vector& validSubsetIndexes, + const T* const peaksPtr, const int numberPeople, const unsigned int numberBodyParts, + const unsigned int numberBodyPartPairs) { try { + // Allocate memory (initialized to 0) if (numberPeople > 0) { // Initialized to 0 for non-found keypoints in people poseKeypoints.reset({numberPeople, (int)numberBodyParts, 3}, 0.f); poseScores.reset(numberPeople); } + // No people --> Empty Arrays else { poseKeypoints.reset(); poseScores.reset(); } + // Fill people keypoints const auto oneOverNumberBodyPartsAndPAFs = 1/T(numberBodyParts + numberBodyPartPairs); + // For each person for (auto person = 0u ; person < validSubsetIndexes.size() ; person++) { const auto& personPair = peopleVector[validSubsetIndexes[person]]; const auto& personVector = personPair.first; + // For each body part for (auto bodyPart = 0u; bodyPart < numberBodyParts; bodyPart++) { const auto baseOffset = (person*numberBodyParts + bodyPart) * 3; @@ -1109,11 +1226,11 @@ namespace op // } template - void connectBodyPartsCpu(Array& poseKeypoints, Array& poseScores, const T* const heatMapPtr, - const T* const peaksPtr, const PoseModel poseModel, const Point& heatMapSize, - const int maxPeaks, const T interMinAboveThreshold, const T interThreshold, - const int 
minSubsetCnt, const T minSubsetScore, const T scaleFactor, - const bool maximizePositives) + void connectBodyPartsCpu( + Array& poseKeypoints, Array& poseScores, const T* const heatMapPtr, const T* const peaksPtr, + const PoseModel poseModel, const Point& heatMapSize, const int maxPeaks, const T interMinAboveThreshold, + const T interThreshold, const int minSubsetCnt, const T minSubsetScore, const T scaleFactor, + const bool maximizePositives) { try { @@ -1124,29 +1241,27 @@ namespace op if (numberBodyParts == 0) error("Invalid value of numberBodyParts, it must be positive, not " + std::to_string(numberBodyParts), __LINE__, __FUNCTION__, __FILE__); - // std::vector, double>> refers to: // - std::vector: [body parts locations, #body parts found] // - double: person subset score - const auto peopleVector = createPeopleVector( + auto peopleVector = createPeopleVector( heatMapPtr, peaksPtr, poseModel, heatMapSize, maxPeaks, interThreshold, interMinAboveThreshold, bodyPartPairs, numberBodyParts, numberBodyPartPairs); - // Delete people below the following thresholds: // a) minSubsetCnt: removed if less than minSubsetCnt body parts // b) minSubsetScore: removed if global score smaller than this // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds int numberPeople; std::vector validSubsetIndexes; - validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); - removePeopleBelowThresholds( + // validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); + validSubsetIndexes.reserve(peopleVector.size()); + removePeopleBelowThresholdsAndFillFaces( validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, minSubsetScore, - maxPeaks, maximizePositives); - + maximizePositives, peaksPtr); // Fill and return poseKeypoints - peopleVectorToPeopleArray(poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, - peaksPtr, numberPeople, numberBodyParts, numberBodyPartPairs); - + 
peopleVectorToPeopleArray( + poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, peaksPtr, numberPeople, + numberBodyParts, numberBodyPartPairs); // Experimental code if (poseModel == PoseModel::BODY_25D) error("BODY_25D is an experimental branch which is not usable.", __LINE__, __FUNCTION__, __FILE__); @@ -1185,16 +1300,16 @@ namespace op const unsigned int numberBodyParts, const unsigned int numberBodyPartPairs, const Array& precomputedPAFs); - template OP_API void removePeopleBelowThresholds( + template OP_API void removePeopleBelowThresholdsAndFillFaces( std::vector& validSubsetIndexes, int& numberPeople, - const std::vector, float>>& peopleVector, - const unsigned int numberBodyParts, - const int minSubsetCnt, const float minSubsetScore, const int maxPeaks, const bool maximizePositives); - template OP_API void removePeopleBelowThresholds( + std::vector, float>>& peopleVector, + const unsigned int numberBodyParts, const int minSubsetCnt, const float minSubsetScore, + const bool maximizePositives, const float* const peaksPtr); + template OP_API void removePeopleBelowThresholdsAndFillFaces( std::vector& validSubsetIndexes, int& numberPeople, - const std::vector, double>>& peopleVector, - const unsigned int numberBodyParts, - const int minSubsetCnt, const double minSubsetScore, const int maxPeaks, const bool maximizePositives); + std::vector, double>>& peopleVector, + const unsigned int numberBodyParts, const int minSubsetCnt, const double minSubsetScore, + const bool maximizePositives, const double* const peaksPtr); template OP_API void peopleVectorToPeopleArray( Array& poseKeypoints, Array& poseScores, const float scaleFactor, diff --git a/src/openpose/net/bodyPartConnectorBase.cu b/src/openpose/net/bodyPartConnectorBase.cu index 4e3d172b8b8d7e824d90585cea7385ab67c29e32..02d7b6650fa11f7d0c5c19953324d6bc11095cc3 100644 --- a/src/openpose/net/bodyPartConnectorBase.cu +++ b/src/openpose/net/bodyPartConnectorBase.cu @@ -14,7 +14,7 @@ namespace 
op template inline __device__ T process( const T* bodyPartA, const T* bodyPartB, const T* mapX, const T* mapY, const int heatmapWidth, - const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold) + const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold, const T defaultNmsThreshold) { const auto vectorAToBX = bodyPartB[0] - bodyPartA[0]; const auto vectorAToBY = bodyPartB[1] - bodyPartA[1]; @@ -59,7 +59,7 @@ namespace op const auto l2Dist = sqrtf(vectorAToBX*vectorAToBX + vectorAToBY*vectorAToBY); const auto threshold = sqrtf(heatmapWidth*heatmapHeight)/150; // 3.3 for 368x656, 6.6 for 2x resolution if (l2Dist < threshold) - return T(0.15); + return T(defaultNmsThreshold+1e-6); // The 1e-6 offset is needed because the threshold comparison is a strict greater-than } } return -1; @@ -69,7 +69,8 @@ namespace op // __global__ void pafScoreKernelOld( // T* pairScoresPtr, const T* const heatMapPtr, const T* const peaksPtr, const unsigned int* const bodyPartPairsPtr, // const unsigned int* const mapIdxPtr, const unsigned int maxPeaks, const int numberBodyPartPairs, - // const int heatmapWidth, const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold) + // const int heatmapWidth, const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold, + // const T defaultNmsThreshold) // { // const auto pairIndex = (blockIdx.x * blockDim.x) + threadIdx.x; // const auto peakA = (blockIdx.y * blockDim.y) + threadIdx.y; @@ -96,7 +97,7 @@ namespace op // const T* const mapY = heatMapPtr + mapIdxY*heatmapWidth*heatmapHeight; // pairScoresPtr[outputIndex] = process( // bodyPartA, bodyPartB, mapX, mapY, heatmapWidth, heatmapHeight, interThreshold, - // interMinAboveThreshold); + // interMinAboveThreshold, defaultNmsThreshold); // } // else // pairScoresPtr[outputIndex] = -1; @@ -107,7 +108,8 @@ namespace op __global__ void pafScoreKernel( T* pairScoresPtr, const T* const heatMapPtr, const T* const peaksPtr, const unsigned int* 
const bodyPartPairsPtr, const unsigned int* const mapIdxPtr, const unsigned int maxPeaks, const int numberBodyPartPairs, - const int heatmapWidth, const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold) + const int heatmapWidth, const int heatmapHeight, const T interThreshold, const T interMinAboveThreshold, + const T defaultNmsThreshold) { const auto peakB = (blockIdx.x * blockDim.x) + threadIdx.x; const auto peakA = (blockIdx.y * blockDim.y) + threadIdx.y; @@ -135,191 +137,21 @@ namespace op const T* const mapY = heatMapPtr + mapIdxY*heatmapWidth*heatmapHeight; pairScoresPtr[outputIndex] = process( bodyPartA, bodyPartB, mapX, mapY, heatmapWidth, heatmapHeight, interThreshold, - interMinAboveThreshold); + interMinAboveThreshold, defaultNmsThreshold); } else pairScoresPtr[outputIndex] = -1; } } - // template - // std::vector, T>> pafVectorIntoPeopleVectorOld( - // const std::vector>& pairConnections, const T* const peaksPtr, - // const int maxPeaks, const std::vector& bodyPartPairs, const unsigned int numberBodyParts) - // { - // try - // { - // // std::vector, double>> refers to: - // // - std::vector: [body parts locations, #body parts found] - // // - double: person subset score - // std::vector, T>> peopleVector; - // const auto vectorSize = numberBodyParts+1; - // const auto peaksOffset = (maxPeaks+1); - // // Save which body parts have been already assigned - // std::vector personAssigned(numberBodyParts*maxPeaks, -1); - // // Iterate over each PAF pair connection detected - // // E.g., neck1-nose2, neck5-Lshoulder0, etc. 
- // for (const auto& pairConnection : pairConnections) - // { - // // Read pairConnection - // // // Total score - only required for previous sort - // // const auto totalScore = std::get<0>(pairConnection); - // const auto pafScore = std::get<1>(pairConnection); - // const auto pairIndex = std::get<2>(pairConnection); - // const auto indexA = std::get<3>(pairConnection); - // const auto indexB = std::get<4>(pairConnection); - // // Derived data - // const auto bodyPartA = bodyPartPairs[2*pairIndex]; - // const auto bodyPartB = bodyPartPairs[2*pairIndex+1]; - - // const auto indexScoreA = (bodyPartA*peaksOffset + indexA)*3 + 2; - // const auto indexScoreB = (bodyPartB*peaksOffset + indexB)*3 + 2; - // // -1 because indexA and indexB are 1-based - // auto& aAssigned = personAssigned[bodyPartA*maxPeaks+indexA-1]; - // auto& bAssigned = personAssigned[bodyPartB*maxPeaks+indexB-1]; - // // Debugging - // #ifdef DEBUG - // if (indexA-1 > peaksOffset || indexA <= 0) - // error("Something is wrong: " + std::to_string(indexA) - // + " vs. " + std::to_string(peaksOffset) + ". Contact us.", - // __LINE__, __FUNCTION__, __FILE__); - // if (indexB-1 > peaksOffset || indexB <= 0) - // error("Something is wrong: " + std::to_string(indexB) - // + " vs. " + std::to_string(peaksOffset) + ". Contact us.", - // __LINE__, __FUNCTION__, __FILE__); - // #endif - - // // Different cases: - // // 1. A & B not assigned yet: Create new person - // // 2. A assigned but not B: Add B to person with A (if no another B there) - // // 3. B assigned but not A: Add A to person with B (if no another A there) - // // 4. A & B already assigned to same person (circular/redundant PAF): Update person score - // // 5. A & B already assigned to different people: Merge people if keypoint intersection is null - // // 1. 
A & B not assigned yet: Create new person - // if (aAssigned < 0 && bAssigned < 0) - // { - // // Keypoint indexes - // std::vector rowVector(vectorSize, 0); - // rowVector[bodyPartA] = indexScoreA; - // rowVector[bodyPartB] = indexScoreB; - // // Number keypoints - // rowVector.back() = 2; - // // Score - // const auto personScore = peaksPtr[indexScoreA] + peaksPtr[indexScoreB] + pafScore; - // // Set associated personAssigned as assigned - // aAssigned = (int)peopleVector.size(); - // bAssigned = aAssigned; - // // Create new personVector - // peopleVector.emplace_back(std::make_pair(rowVector, personScore)); - // } - // // 2. A assigned but not B: Add B to person with A (if no another B there) - // // or - // // 3. B assigned but not A: Add A to person with B (if no another A there) - // else if ((aAssigned >= 0 && bAssigned < 0) - // || (aAssigned < 0 && bAssigned >= 0)) - // { - // // Assign person1 to one where xAssigned >= 0 - // const auto assigned1 = (aAssigned >= 0 ? aAssigned : bAssigned); - // auto& assigned2 = (aAssigned >= 0 ? bAssigned : aAssigned); - // const auto bodyPart2 = (aAssigned >= 0 ? bodyPartB : bodyPartA); - // const auto indexScore2 = (aAssigned >= 0 ? indexScoreB : indexScoreA); - // // Person index - // auto& personVector = peopleVector[assigned1]; - // // Debugging - // #ifdef DEBUG - // const auto bodyPart1 = (aAssigned >= 0 ? bodyPartA : bodyPartB); - // const auto indexScore1 = (aAssigned >= 0 ? indexScoreA : indexScoreB); - // const auto index1 = (aAssigned >= 0 ? indexA : indexB); - // if ((unsigned int)personVector.first.at(bodyPart1) != indexScore1) - // error("Something is wrong: " - // + std::to_string((personVector.first[bodyPart1]-2)/3-bodyPart1*peaksOffset) - // + " vs. " + std::to_string((indexScore1-2)/3-bodyPart1*peaksOffset) + " vs. " - // + std::to_string(index1) + ". 
Contact us.", - // __LINE__, __FUNCTION__, __FILE__); - // #endif - // // If person with 1 does not have a 2 yet - // if (personVector.first[bodyPart2] == 0) - // { - // // Update keypoint indexes - // personVector.first[bodyPart2] = indexScore2; - // // Update number keypoints - // personVector.first.back()++; - // // Update score - // personVector.second += peaksPtr[indexScore2] + pafScore; - // // Set associated personAssigned as assigned - // assigned2 = assigned1; - // } - // // Otherwise, ignore this B because the previous one came from a higher PAF-confident score - // } - // // 4. A & B already assigned to same person (circular/redundant PAF): Update person score - // else if (aAssigned >=0 && bAssigned >=0 && aAssigned == bAssigned) - // peopleVector[aAssigned].second += pafScore; - // // 5. A & B already assigned to different people: Merge people if keypoint intersection is null - // // I.e., that the keypoints in person A and B do not overlap - // else if (aAssigned >=0 && bAssigned >=0 && aAssigned != bAssigned) - // { - // // Assign person1 to the one with lowest index for 2 reasons: - // // 1. Speed up: Removing an element from std::vector is cheaper for latest elements - // // 2. Avoid harder index update: Updated elements in person1ssigned would depend on - // // whether person1 > person2 or not: element = aAssigned - (person2 > person1 ? 1 : 0) - // const auto assigned1 = (aAssigned < bAssigned ? aAssigned : bAssigned); - // const auto assigned2 = (aAssigned < bAssigned ? bAssigned : aAssigned); - // auto& person1 = peopleVector[assigned1].first; - // const auto& person2 = peopleVector[assigned2].first; - // // Check if complementary - // // Defining found keypoint indexes in personA as kA, and analogously kB - // // Complementary if and only if kA intersection kB = empty. 
I.e., no common keypoints - // bool complementary = true; - // for (auto part = 0u ; part < numberBodyParts ; part++) - // { - // if (person1[part] > 0 && person2[part] > 0) - // { - // complementary = false; - // break; - // } - // } - // // If complementary, merge both people into 1 - // if (complementary) - // { - // // Update keypoint indexes - // for (auto part = 0u ; part < numberBodyParts ; part++) - // if (person1[part] == 0) - // person1[part] = person2[part]; - // // Update number keypoints - // person1.back() += person2.back(); - // // Update score - // peopleVector[assigned1].second += peopleVector[assigned2].second + pafScore; - // // Erase the non-merged person - // peopleVector.erase(peopleVector.begin()+assigned2); - // // Update associated personAssigned (person indexes have changed) - // for (auto& element : personAssigned) - // { - // if (element == assigned2) - // element = assigned1; - // else if (element > assigned2) - // element--; - // } - // } - // } - // } - // // Return result - // return peopleVector; - // } - // catch (const std::exception& e) - // { - // error(e.what(), __LINE__, __FUNCTION__, __FILE__); - // return {}; - // } - // } - template - void connectBodyPartsGpu(Array& poseKeypoints, Array& poseScores, const T* const heatMapGpuPtr, - const T* const peaksPtr, const PoseModel poseModel, const Point& heatMapSize, - const int maxPeaks, const T interMinAboveThreshold, const T interThreshold, - const int minSubsetCnt, const T minSubsetScore, const T scaleFactor, - const bool maximizePositives, Array pairScoresCpu, T* pairScoresGpuPtr, - const unsigned int* const bodyPartPairsGpuPtr, const unsigned int* const mapIdxGpuPtr, - const T* const peaksGpuPtr) + void connectBodyPartsGpu( + Array& poseKeypoints, Array& poseScores, const T* const heatMapGpuPtr, const T* const peaksPtr, + const PoseModel poseModel, const Point& heatMapSize, const int maxPeaks, const T interMinAboveThreshold, + const T interThreshold, const int minSubsetCnt, 
const T minSubsetScore, const T scaleFactor, + const bool maximizePositives, Array pairScoresCpu, T* pairScoresGpuPtr, + const unsigned int* const bodyPartPairsGpuPtr, const unsigned int* const mapIdxGpuPtr, + const T* const peaksGpuPtr, const T defaultNmsThreshold) { try { @@ -352,27 +184,10 @@ namespace op // pafScoreKernelOld<<>>( // pairScoresGpuPtr, heatMapGpuPtr, peaksGpuPtr, bodyPartPairsGpuPtr, mapIdxGpuPtr, // maxPeaks, (int)numberBodyPartPairs, heatMapSize.x, heatMapSize.y, interThreshold, - // interMinAboveThreshold); + // interMinAboveThreshold, defaultNmsThreshold); // // pairScoresCpu <-- pairScoresGpu // cudaMemcpy(pairScoresCpu.getPtr(), pairScoresGpuPtr, totalComputations * sizeof(T), // cudaMemcpyDeviceToHost); - // // Get pair connections and their scores - // const auto pairConnections = pafPtrIntoVector( - // pairScoresCpu, peaksPtr, maxPeaks, bodyPartPairs, numberBodyPartPairs); - // const auto peopleVector = pafVectorIntoPeopleVectorOld( - // pairConnections, peaksPtr, maxPeaks, bodyPartPairs, numberBodyParts); - // // Delete people below the following thresholds: - // // a) minSubsetCnt: removed if less than minSubsetCnt body parts - // // b) minSubsetScore: removed if global score smaller than this - // // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds - // int numberPeople; - // std::vector validSubsetIndexes; - // validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); - // removePeopleBelowThresholds(validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, - // minSubsetScore, maxPeaks, maximizePositives); - // // Fill and return poseKeypoints - // peopleVectorToPeopleArray(poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, - // peaksPtr, numberPeople, numberBodyParts, numberBodyPartPairs); // OP_PROFILE_END(timeNormalize1, 1e3, REPS); // Efficient code @@ -386,14 +201,16 @@ namespace op pafScoreKernel<<>>( pairScoresGpuPtr, heatMapGpuPtr, 
peaksGpuPtr, bodyPartPairsGpuPtr, mapIdxGpuPtr, maxPeaks, (int)numberBodyPartPairs, heatMapSize.x, heatMapSize.y, interThreshold, - interMinAboveThreshold); + interMinAboveThreshold, defaultNmsThreshold); // pairScoresCpu <-- pairScoresGpu cudaMemcpy(pairScoresCpu.getPtr(), pairScoresGpuPtr, totalComputations * sizeof(T), cudaMemcpyDeviceToHost); + // OP_PROFILE_END(timeNormalize2, 1e3, REPS); + // Get pair connections and their scores const auto pairConnections = pafPtrIntoVector( pairScoresCpu, peaksPtr, maxPeaks, bodyPartPairs, numberBodyPartPairs); - const auto peopleVector = pafVectorIntoPeopleVector( + auto peopleVector = pafVectorIntoPeopleVector( pairConnections, peaksPtr, maxPeaks, bodyPartPairs, numberBodyParts); // // Old code: Get pair connections and their scores // // std::vector, double>> refers to: @@ -409,13 +226,15 @@ namespace op // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds int numberPeople; std::vector validSubsetIndexes; - validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); - removePeopleBelowThresholds(validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, - minSubsetScore, maxPeaks, maximizePositives); + // validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); + validSubsetIndexes.reserve(peopleVector.size()); + removePeopleBelowThresholdsAndFillFaces( + validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, minSubsetScore, + maximizePositives, peaksPtr); // Fill and return poseKeypoints - peopleVectorToPeopleArray(poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, - peaksPtr, numberPeople, numberBodyParts, numberBodyPartPairs); - // OP_PROFILE_END(timeNormalize2, 1e3, REPS); + peopleVectorToPeopleArray( + poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, peaksPtr, numberPeople, + numberBodyParts, numberBodyPartPairs); // // Profiling verbose // log(" BPC(ori)=" + 
std::to_string(timeNormalize1) + "ms"); @@ -436,12 +255,12 @@ namespace op const float interMinAboveThreshold, const float interThreshold, const int minSubsetCnt, const float minSubsetScore, const float scaleFactor, const bool maximizePositives, Array pairScoresCpu, float* pairScoresGpuPtr, const unsigned int* const bodyPartPairsGpuPtr, - const unsigned int* const mapIdxGpuPtr, const float* const peaksGpuPtr); + const unsigned int* const mapIdxGpuPtr, const float* const peaksGpuPtr, const float defaultNmsThreshold); template void connectBodyPartsGpu( Array& poseKeypoints, Array& poseScores, const double* const heatMapGpuPtr, const double* const peaksPtr, const PoseModel poseModel, const Point& heatMapSize, const int maxPeaks, const double interMinAboveThreshold, const double interThreshold, const int minSubsetCnt, const double minSubsetScore, const double scaleFactor, const bool maximizePositives, Array pairScoresCpu, double* pairScoresGpuPtr, const unsigned int* const bodyPartPairsGpuPtr, - const unsigned int* const mapIdxGpuPtr, const double* const peaksGpuPtr); + const unsigned int* const mapIdxGpuPtr, const double* const peaksGpuPtr, const double defaultNmsThreshold); } diff --git a/src/openpose/net/bodyPartConnectorBaseCL.cpp b/src/openpose/net/bodyPartConnectorBaseCL.cpp index 5515c33c422e427501d7a17d26dcde1298ad4218..cff5d21876309c7d79d4fccd579c141540abfe44 100644 --- a/src/openpose/net/bodyPartConnectorBaseCL.cpp +++ b/src/openpose/net/bodyPartConnectorBaseCL.cpp @@ -156,26 +156,24 @@ namespace op pairScoresGpuPtrBuffer, heatMapGpuPtrBuffer, peaksGpuPtrBuffer, bodyPartPairsGpuPtrBuffer, mapIdxGpuPtrBuffer, maxPeaks, (int)numberBodyPartPairs, heatMapSize.x, heatMapSize.y, interThreshold, interMinAboveThreshold); - OpenCL::getInstance(gpuID)->getQueue().enqueueReadBuffer(pairScoresGpuPtrBuffer, CL_TRUE, 0, - totalComputations * sizeof(T), pairScoresCpu.getPtr()); + OpenCL::getInstance(gpuID)->getQueue().enqueueReadBuffer( + pairScoresGpuPtrBuffer, CL_TRUE, 0, 
totalComputations * sizeof(T), pairScoresCpu.getPtr()); // New code // Get pair connections and their scores const auto pairConnections = pafPtrIntoVector( pairScoresCpu, peaksPtr, maxPeaks, bodyPartPairs, numberBodyPartPairs); - const auto peopleVector = pafVectorIntoPeopleVector( + auto peopleVector = pafVectorIntoPeopleVector( pairConnections, peaksPtr, maxPeaks, bodyPartPairs, numberBodyParts); - - // // Old code - // // Get pair connections and their scores - // // std::vector, double>> refers to: - // // - std::vector: [body parts locations, #body parts found] - // // - double: person subset score - // const T* const tNullptr = nullptr; - // const auto peopleVector = createPeopleVector( - // tNullptr, peaksPtr, poseModel, heatMapSize, maxPeaks, interThreshold, interMinAboveThreshold, - // bodyPartPairs, numberBodyParts, numberBodyPartPairs, pairScoresCpu); - + // // Old code + // // Get pair connections and their scores + // // std::vector, double>> refers to: + // // - std::vector: [body parts locations, #body parts found] + // // - double: person subset score + // const T* const tNullptr = nullptr; + // const auto peopleVector = createPeopleVector( + // tNullptr, peaksPtr, poseModel, heatMapSize, maxPeaks, interThreshold, interMinAboveThreshold, + // bodyPartPairs, numberBodyParts, numberBodyPartPairs, pairScoresCpu); // Delete people below the following thresholds: // a) minSubsetCnt: removed if less than minSubsetCnt body parts // b) minSubsetScore: removed if global score smaller than this @@ -183,15 +181,13 @@ namespace op int numberPeople; std::vector validSubsetIndexes; validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); - removePeopleBelowThresholds(validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, - minSubsetScore, maxPeaks, maximizePositives); - + removePeopleBelowThresholdsAndFillFaces( + validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, minSubsetScore, + 
maximizePositives, peaksPtr); // Fill and return poseKeypoints - peopleVectorToPeopleArray(poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, - peaksPtr, numberPeople, numberBodyParts, numberBodyPartPairs); - - // // Sanity check - // cudaCheck(__LINE__, __FUNCTION__, __FILE__); + peopleVectorToPeopleArray( + poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, peaksPtr, numberPeople, + numberBodyParts, numberBodyPartPairs); #else UNUSED(poseKeypoints); UNUSED(poseScores); diff --git a/src/openpose/net/bodyPartConnectorCaffe.cpp b/src/openpose/net/bodyPartConnectorCaffe.cpp index a8644f707f041147ed7a50d377ebb1098c4ed1a8..5eb2abb1cd8b4d17aefa9e5e9369d795fbaf3c23 100644 --- a/src/openpose/net/bodyPartConnectorCaffe.cpp +++ b/src/openpose/net/bodyPartConnectorCaffe.cpp @@ -108,6 +108,19 @@ namespace op } } + template + void BodyPartConnectorCaffe::setDefaultNmsThreshold(const T defaultNmsThreshold) + { + try + { + mDefaultNmsThreshold = {defaultNmsThreshold}; + } + catch (const std::exception& e) + { + error(e.what(), __LINE__, __FUNCTION__, __FILE__); + } + } + template void BodyPartConnectorCaffe::setInterMinAboveThreshold(const T interMinAboveThreshold) { @@ -300,8 +313,8 @@ namespace op } template - void BodyPartConnectorCaffe::Forward_gpu(const std::vector*>& bottom, Array& poseKeypoints, - Array& poseScores) + void BodyPartConnectorCaffe::Forward_gpu( + const std::vector*>& bottom, Array& poseKeypoints, Array& poseScores) { try { @@ -354,12 +367,12 @@ namespace op } // Run body part connector - connectBodyPartsGpu(poseKeypoints, poseScores, heatMapsGpuPtr, peaksPtr, mPoseModel, - Point{heatMapsBlob->shape(3), heatMapsBlob->shape(2)}, - maxPeaks, mInterMinAboveThreshold, mInterThreshold, - mMinSubsetCnt, mMinSubsetScore, mScaleNetToOutput, mMaximizePositives, - mFinalOutputCpu, pFinalOutputGpuPtr, pBodyPartPairsGpuPtr, pMapIdxGpuPtr, - peaksGpuPtr); + connectBodyPartsGpu( + poseKeypoints, poseScores, heatMapsGpuPtr, 
peaksPtr, mPoseModel, + Point{heatMapsBlob->shape(3), heatMapsBlob->shape(2)}, maxPeaks, mInterMinAboveThreshold, + mInterThreshold, mMinSubsetCnt, mMinSubsetScore, mScaleNetToOutput, mMaximizePositives, + mFinalOutputCpu, pFinalOutputGpuPtr, pBodyPartPairsGpuPtr, pMapIdxGpuPtr, peaksGpuPtr, + mDefaultNmsThreshold); #else UNUSED(bottom); UNUSED(poseKeypoints); diff --git a/src/openpose/pose/poseExtractorCaffe.cpp b/src/openpose/pose/poseExtractorCaffe.cpp index 8968144d1e97a4bc7f7c16f174e094727d2022ab..a7e3865b62010414c00af85338ef4b8be178aaae 100644 --- a/src/openpose/pose/poseExtractorCaffe.cpp +++ b/src/openpose/pose/poseExtractorCaffe.cpp @@ -317,6 +317,7 @@ namespace op // OP_CUDA_PROFILE_END(timeNormalize3, 1e3, REPS); // OP_CUDA_PROFILE_INIT(REPS); spBodyPartConnectorCaffe->setScaleNetToOutput(mScaleNetToOutput); + spBodyPartConnectorCaffe->setDefaultNmsThreshold((float)get(PoseProperty::NMSThreshold)); spBodyPartConnectorCaffe->setInterMinAboveThreshold( (float)get(PoseProperty::ConnectInterMinAboveThreshold)); spBodyPartConnectorCaffe->setInterThreshold((float)get(PoseProperty::ConnectInterThreshold)); diff --git a/src/openpose/utilities/keypoint.cpp b/src/openpose/utilities/keypoint.cpp index 267d78eb2b83fe6d16f060b769d168b2b2b1de0f..f9201aedc07f734b7884b7e951038d6e60db9eb5 100644 --- a/src/openpose/utilities/keypoint.cpp +++ b/src/openpose/utilities/keypoint.cpp @@ -174,9 +174,10 @@ namespace op const double offsetY); template - void renderKeypointsCpu(Array& frameArray, const Array& keypoints, const std::vector& pairs, - const std::vector colors, const T thicknessCircleRatio, - const T thicknessLineRatioWRTCircle, const std::vector& poseScales, const T threshold) + void renderKeypointsCpu( + Array& frameArray, const Array& keypoints, const std::vector& pairs, + const std::vector colors, const T thicknessCircleRatio, const T thicknessLineRatioWRTCircle, + const std::vector& poseScales, const T threshold) { try { @@ -209,8 +210,9 @@ namespace op const auto 
personRectangle = getKeypointsRectangle(keypoints, person, thresholdRectangle); if (personRectangle.area() > 0) { - const auto ratioAreas = fastMin(T(1), fastMax(personRectangle.width/(T)width, - personRectangle.height/(T)height)); + const auto ratioAreas = fastMin( + T(1), fastMax( + personRectangle.width/(T)width, personRectangle.height/(T)height)); // Size-dependent variables const auto thicknessRatio = fastMax( positiveIntRound(std::sqrt(area)* thicknessCircleRatio * ratioAreas), 2); @@ -283,21 +285,32 @@ namespace op const std::vector& poseScales, const double threshold); template - Rectangle getKeypointsRectangle(const Array& keypoints, const int person, const T threshold) + Rectangle getKeypointsRectangle( + const Array& keypoints, const int person, const T threshold, const int firstIndex, const int lastIndex) { try { + // Params const auto numberKeypoints = keypoints.getSize(1); - // Sanity check + const auto lastIndexClean = (lastIndex < 0 ? numberKeypoints : lastIndex); + // Sanity checks if (numberKeypoints < 1) error("Number body parts must be > 0.", __LINE__, __FUNCTION__, __FILE__); + if (lastIndexClean > numberKeypoints) + error("The value of `lastIndex` must be less than or equal to `numberKeypoints`. Currently: " + + std::to_string(lastIndexClean) + " vs. " + std::to_string(numberKeypoints), + __LINE__, __FUNCTION__, __FILE__); + if (firstIndex > lastIndexClean) + error("The value of `firstIndex` must be less than or equal to `lastIndex`. Currently: " + + std::to_string(firstIndex) + " vs. 
" + std::to_string(lastIndexClean), + __LINE__, __FUNCTION__, __FILE__); // Define keypointPtr const auto keypointPtr = keypoints.getConstPtr() + person * keypoints.getSize(1) * keypoints.getSize(2); T minX = std::numeric_limits::max(); T maxX = std::numeric_limits::lowest(); T minY = minX; T maxY = maxX; - for (auto part = 0 ; part < numberKeypoints ; part++) + for (auto part = firstIndex ; part < lastIndexClean ; part++) { const auto score = keypointPtr[3*part + 2]; if (score > threshold) @@ -328,9 +341,11 @@ namespace op } } template OP_API Rectangle getKeypointsRectangle( - const Array& keypoints, const int person, const float threshold); + const Array& keypoints, const int person, const float threshold, const int firstIndex, + const int lastIndex); template OP_API Rectangle getKeypointsRectangle( - const Array& keypoints, const int person, const double threshold); + const Array& keypoints, const int person, const double threshold, const int firstIndex, + const int lastIndex); template T getAverageScore(const Array& keypoints, const int person)