Slightly improved mAP and reduced false positives, but reduced mAR

3c9441ae · Gines Hidalgo · 80fc1144
16 changed files
--- a/doc/faq.md
+++ b/doc/faq.md
@@ -17,6 +17,7 @@ OpenPose - Frequently Asked Question (FAQ)
        11. [CUDA_cublas_device_LIBRARY Not Found](#cuda_cublas_device_library-not-found)
        12. [CMake-GUI Error While Getting Default Caffe](#cmake-gui-error-while-getting-default-caffe)
        13. [Libgomp Out of Memory Error](#libgomp-out-of-memory-error)
+        14. [Runtime Error with Turing GPU (Tesla T4) or Volta GPU](#runtime-error-with-turing-gpu-tesla-t4-or-volta-gpu)
    2. [Speed Performance Issues](#speed-performance-issues)
        1. [Speed Up, Memory Reduction, and Benchmark](#speed-up-memory-reduction-and-benchmark)
        2. [How to Measure the Latency Time?](#how-to-measure-the-latency-time)
@@ -39,7 +40,7 @@ OpenPose - Frequently Asked Question (FAQ)
 #### Out of Memory Error
 **Q: Out of memory error** - I get an error similar to: `Check failed: error == cudaSuccess (2 vs. 0)  out of memory`.

-**A**: Most probably cuDNN is not installed/enabled, the default Caffe model uses >12 GB of GPU memory, cuDNN reduces it to ~2 GB for BODY_25 and ~1.5 GB for COCO.
+**A**: Most likely cuDNN is not installed or enabled. Without cuDNN, the default Caffe model uses >12 GB of GPU memory; cuDNN reduces this to ~2.2 GB for BODY_25 (default) and ~1.5 GB for COCO (`--model_pose COCO`). Note that the default OpenPose still needs about 2.2 GB of free GPU memory, so GPUs with only 2 GB cannot run it. In that case, either switch to the `COCO` model (slower and less accurate) or reduce `--net_resolution` (faster but less accurate).



@@ -162,6 +163,14 @@ git submodle update



+#### Runtime Error with Turing GPU (Tesla T4) or Volta GPU
+**Q**: When I start OpenPose, I receive a runtime error for new GPU architectures.
+
+**A**: To solve this problem, 1) make sure you are using CUDA 10 or higher, and 2) change line 7 in `{OPENPOSE_PATH}/3rdparty/caffe/cmake/Cuda.cmake`, from `set(Caffe_known_gpu_archs "30 35 50 52 60 61")` to `set(Caffe_known_gpu_archs "30 35 50 52 60 61 75")`.
+
+
+
+

 ### Speed Performance Issues
 #### Speed Up, Memory Reduction, and Benchmark

--- a/doc/installation.md
+++ b/doc/installation.md
@@ -54,9 +54,14 @@ We add links to some community-based work based on OpenPose. Note: We do not sup
 - [ROS example](https://github.com/firephinx/openpose_ros) (based on a very old OpenPose version). For questions and more details, read and post ONLY on [issue thread #51](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/51).

 - Docker Images. For questions and more details, read and post ONLY on [issue thread #347](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/347).
-    - Dockerfile working with CUDA 10: [link 1](https://github.com/ExSidius/openpose-docker/blob/master/Dockerfile) and [link 2](https://cloud.docker.com/repository/docker/exsidius/openpose/general).
-    - [Dockerfile - OpenPose v1.4.0, OpenCV, CUDA 8, CuDNN 6, Python2.7](https://gist.github.com/moiseevigor/11c02c694fc0c22fccd59521793aeaa6).
-    - [Dockerfile - OpenPose v1.2.1](https://gist.github.com/sberryman/6770363f02336af82cb175a83b79de33).
+    - Dockerfile working also with CUDA 10:
+        - [Link 1](https://github.com/esemeniuc/openpose-docker), which claims to also include Python support. Read and post ONLY on [issue thread #1102](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/1102).
+        - [Link 2](https://github.com/ExSidius/openpose-docker/blob/master/Dockerfile).
+        - [Link 3](https://cloud.docker.com/repository/docker/exsidius/openpose/general).
+    - Dockerfile working only with CUDA 8:
+        - [Dockerfile - OpenPose v1.4.0, OpenCV, CUDA 8, CuDNN 5, Python2.7](https://github.com/tlkh/openpose). Read and post ONLY on [issue thread #1102](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/1102).
+        - [Dockerfile - OpenPose v1.4.0, OpenCV, CUDA 8, CuDNN 6, Python2.7](https://gist.github.com/moiseevigor/11c02c694fc0c22fccd59521793aeaa6).
+        - [Dockerfile - OpenPose v1.2.1](https://gist.github.com/sberryman/6770363f02336af82cb175a83b79de33).

 - [Google Colab helper script](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/949#issue-387855863): Script to install OpenPose on Google Colab. Really useful when access to a computer powerful enough to run OpenPose is not possible, so one possible way to use OpenPose is to build it on a GPU-enabled Colab runtime and then run the programs there. For questions and more details, read and post ONLY on [issue thread #949](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/949).

@@ -168,7 +173,25 @@ make -j`nproc`
 ```

 #### Windows
-In order to build the project, open the Visual Studio solution (Windows), called `build/OpenPose.sln`. Then, set the configuration from `Debug` to `Release` and press the green triangle icon (alternatively press <kbd>F5</kbd>).
+In order to build the project, select and run only one of the 2 following alternatives.
+
+1. **CMake-GUI alternative (recommended)**: Open the Visual Studio solution (Windows), called `build/OpenPose.sln`. Then, set the configuration from `Debug` to `Release` and press the green triangle icon (alternatively press <kbd>F5</kbd>).
+
+2. **Command-line build alternative (not recommended)**: NOTE: The command-line alternative is not officially supported, but it was added in [GitHub issue #1198](https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/1198). For any questions or bug reports about this command-line version, comment in that GitHub issue.
+    1. Run "MSVS 2017 Developer Command Console"
+    ```
+    mkdir build
+    cd build
+    cmake .. -G "Visual Studio 15 2017 Win64" -T v140
+    cmake --build . --config Release
+    copy x64\Release\*  bin\
+    ```
+    2. If you want a clean build:
+    ```
+    cmake --build . --config Release --clean-first
+    copy x64\Release\*  bin\
+    ```

 **VERY IMPORTANT NOTE**: In order to use OpenPose outside Visual Studio, and assuming you have not unchecked the `BUILD_BIN_FOLDER` flag in CMake, copy all DLLs from `{build_directory}/bin` into the folder where the generated `openpose.dll` and `*.exe` demos are, e.g., `{build_directory}x64/Release` for the 64-bit release version.


--- a/doc/modules/3d_reconstruction_module.md
+++ b/doc/modules/3d_reconstruction_module.md
@@ -23,13 +23,13 @@ This module performs 3-D keypoint (body, face, and hand) reconstruction and rend


 ## Installation
-Check [doc/installation.md#3d-reconstruction-module](./installation.md#3d-reconstruction-module) for installation steps.
+Check [doc/installation.md#3d-reconstruction-module](../installation.md#3d-reconstruction-module) for installation steps.




 ## Non Linear Optimization
-In order to increase the 3-D reconstruction accuracy, OpenPose optionally performs non-linear optimization if Ceres solver support is enabled (only available in Ubuntu for now). To enable it, check [doc/installation.md#3d-reconstruction-module](./installation.md#3d-reconstruction-module) for more details.
+In order to increase the 3-D reconstruction accuracy, OpenPose optionally performs non-linear optimization if Ceres solver support is enabled (only available in Ubuntu for now). To enable it, check [doc/installation.md#3d-reconstruction-module](../installation.md#3d-reconstruction-module) for more details.




--- a/doc/release_notes.md
+++ b/doc/release_notes.md
@@ -369,10 +369,14 @@ OpenPose Library - Release Notes
 1. Main improvements:
    1. Highly improved 3D triangulation for >3 cameras by fixing some small bugs.
    2. Added community-based support for Nvidia NVCaffe.
+    3. Increased accuracy very slightly for the CUDA version (about 0.01%) by adapting the threshold in `process()` in `bodyPartConnectorBase.cu` to `defaultNmsThreshold`. This also removes any possibility of future bugs in that function when using a default NMS threshold higher than 0.15 (which was the hard-coded value used previously).
+    4. Increased mAP but reduced mAR (both by about 0.01%), and reduced false positives. Step 1: removed legs where only knee/ankle/feet are found. Step 2: if no people are found in an image, `removePeopleBelowThresholds` is re-run with `maximizePositives = true`.
+    5. The maximum number of people is no longer limited by the maximum number of peaks. However, the number of body part candidates for a specific keypoint (e.g., nose) is still limited to the maximum number of peaks.
 2. Functions or parameters renamed:
    1. `--3d_min_views` default value (-1) no longer means that all camera views are required. Instead, it will be equal to max(2, min(4, #cameras-1)). This should provide a good trade-off between recall and precision.
 3. Main bugs fixed:
    1. Windows: Added back support for OpenGL and Spinnaker, as well as DLLs for debug compilation.
+    2. `06_face_from_image.cpp` and `07_hand_from_image.cpp` work again. They stopped working in version 1.5.0 with the GPU image resize for the GUI.
 4. Changes/additions that affect the compatibility with the OpenPose Unity Plugin:



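The new `--3d_min_views` default quoted in the release notes above, max(2, min(4, #cameras-1)), can be sketched as a small function. This is only an illustration of the formula: the name `defaultMinViews` is hypothetical and is not part of the OpenPose API, and the precondition on the camera count is an assumption.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not an OpenPose function): default minimum number of
// camera views, following max(2, min(4, #cameras-1)) from the release notes.
int defaultMinViews(const int numberCameras)
{
    // Assumption for this sketch: at least one camera is present
    assert(numberCameras >= 1);
    return std::max(2, std::min(4, numberCameras - 1));
}
```

Under this formula, 2 or 3 cameras require 2 views, 4 cameras require 3, and 5 or more cameras require 4.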
--- a/include/openpose/net/bodyPartConnectorBase.hpp
+++ b/include/openpose/net/bodyPartConnectorBase.hpp
@@ -18,10 +18,10 @@ namespace op
    void connectBodyPartsGpu(
        Array<T>& poseKeypoints, Array<T>& poseScores, const T* const heatMapGpuPtr, const T* const peaksPtr,
        const PoseModel poseModel, const Point<int>& heatMapSize, const int maxPeaks, const T interMinAboveThreshold,
-        const T interThreshold, const int minSubsetCnt, const T minSubsetScore, const T scaleFactor = 1.f,
-        const bool maximizePositives = false, Array<T> pairScoresCpu = Array<T>{}, T* pairScoresGpuPtr = nullptr,
-        const unsigned int* const bodyPartPairsGpuPtr = nullptr, const unsigned int* const mapIdxGpuPtr = nullptr,
-        const T* const peaksGpuPtr = nullptr);
+        const T interThreshold, const int minSubsetCnt, const T minSubsetScore, const T scaleFactor,
+        const bool maximizePositives, Array<T> pairScoresCpu, T* pairScoresGpuPtr,
+        const unsigned int* const bodyPartPairsGpuPtr, const unsigned int* const mapIdxGpuPtr,
+        const T* const peaksGpuPtr, const T defaultNmsThreshold);

    template <typename T>
    void connectBodyPartsOcl(
@@ -41,17 +41,17 @@ namespace op
        const unsigned int numberBodyPartPairs, const Array<T>& precomputedPAFs = Array<T>());

    template <typename T>
-    void removePeopleBelowThresholds(std::vector<int>& validSubsetIndexes, int& numberPeople,
-                                            const std::vector<std::pair<std::vector<int>, T>>& subsets,
-                                            const unsigned int numberBodyParts, const int minSubsetCnt,
-                                            const T minSubsetScore, const int maxPeaks, const bool maximizePositives);
+    void removePeopleBelowThresholdsAndFillFaces(
+        std::vector<int>& validSubsetIndexes, int& numberPeople,
+        std::vector<std::pair<std::vector<int>, T>>& subsets, const unsigned int numberBodyParts,
+        const int minSubsetCnt, const T minSubsetScore, const bool maximizePositives, const T* const peaksPtr);

    template <typename T>
-    void peopleVectorToPeopleArray(Array<T>& poseKeypoints, Array<T>& poseScores, const T scaleFactor,
-                                          const std::vector<std::pair<std::vector<int>, T>>& subsets,
-                                          const std::vector<int>& validSubsetIndexes, const T* const peaksPtr,
-                                          const int numberPeople, const unsigned int numberBodyParts,
-                                          const unsigned int numberBodyPartPairs);
+    void peopleVectorToPeopleArray(
+        Array<T>& poseKeypoints, Array<T>& poseScores, const T scaleFactor,
+        const std::vector<std::pair<std::vector<int>, T>>& subsets, const std::vector<int>& validSubsetIndexes,
+        const T* const peaksPtr, const int numberPeople, const unsigned int numberBodyParts,
+        const unsigned int numberBodyPartPairs);

    template <typename T>
    std::vector<std::tuple<T, T, int, int, int>> pafPtrIntoVector(

--- a/include/openpose/net/bodyPartConnectorCaffe.hpp
+++ b/include/openpose/net/bodyPartConnectorCaffe.hpp
@@ -25,6 +25,8 @@ namespace op

        void setMaximizePositives(const bool maximizePositives);

+        void setDefaultNmsThreshold(const T defaultNmsThreshold);
+
        void setInterMinAboveThreshold(const T interMinAboveThreshold);

        void setInterThreshold(const T interThreshold);
@@ -56,6 +58,7 @@ namespace op
    private:
        PoseModel mPoseModel;
        bool mMaximizePositives;
+        T mDefaultNmsThreshold;
        T mInterMinAboveThreshold;
        T mInterThreshold;
        int mMinSubsetCnt;

--- a/include/openpose/pose/poseParametersRender.hpp
+++ b/include/openpose/pose/poseParametersRender.hpp
@@ -210,10 +210,12 @@ namespace op
        1.f,1.f,1.f,1.f,1.f,1.f, \
        0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, \
        0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, \
-        0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.45f,0.45f,0.45f, \
+        0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \
        0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \
        0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \
        0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f
+        // First 0.45f row:
+        // 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.45f,0.45f,0.45f,
    #define POSE_BODY_135_COLORS_RENDER_GPU \
        255.f,     0.f,    85.f, \
        170.f,     0.f,   255.f, \

--- a/include/openpose/utilities/keypoint.hpp
+++ b/include/openpose/utilities/keypoint.hpp
@@ -21,12 +21,15 @@ namespace op
    void scaleKeypoints2d(Array<T>& keypoints, const T scaleX, const T scaleY, const T offsetX, const T offsetY);

    template <typename T>
-    void renderKeypointsCpu(Array<T>& frameArray, const Array<T>& keypoints, const std::vector<unsigned int>& pairs,
-                            const std::vector<T> colors, const T thicknessCircleRatio,
-                            const T thicknessLineRatioWRTCircle, const std::vector<T>& poseScales, const T threshold);
+    void renderKeypointsCpu(
+        Array<T>& frameArray, const Array<T>& keypoints, const std::vector<unsigned int>& pairs,
+        const std::vector<T> colors, const T thicknessCircleRatio, const T thicknessLineRatioWRTCircle,
+        const std::vector<T>& poseScales, const T threshold);

    template <typename T>
-    Rectangle<T> getKeypointsRectangle(const Array<T>& keypoints, const int person, const T threshold);
+    Rectangle<T> getKeypointsRectangle(
+        const Array<T>& keypoints, const int person, const T threshold, const int firstIndex = 0,
+        const int lastIndex = -1);

    template <typename T>
    T getAverageScore(const Array<T>& keypoints, const int person);
@@ -44,8 +47,9 @@ namespace op
    T getDistanceAverage(const Array<T>& keypoints, const int personA, const int personB, const T threshold);

    template <typename T>
-    T getDistanceAverage(const Array<T>& keypointsA, const int personA, const Array<T>& keypointsB, const int personB,
-                         const T threshold);
+    T getDistanceAverage(
+        const Array<T>& keypointsA, const int personA, const Array<T>& keypointsB, const int personB,
+        const T threshold);

    /**
     * Creates and Array<T> with a specific person.

--- a/include/openpose/wrapper/wrapperAuxiliary.hpp
+++ b/include/openpose/wrapper/wrapperAuxiliary.hpp
@@ -267,7 +267,7 @@ namespace op
                // Input cvMat to OpenPose input & output format
                // Note: resize on GPU reduces accuracy about 0.1%
                bool resizeOnCpu = true;
-                // const auto resizeOnCpu = (numberGpuThreads < 3);
+                // const auto resizeOnCpu = (wrapperStructPose.poseMode != PoseMode::Enabled);
                if (resizeOnCpu)
                {
                    const auto gpuResize = false;
@@ -277,7 +277,8 @@ namespace op
                }
                // Note: We realized that somehow doing it on GPU for any number of GPUs does speedup the whole OP
                resizeOnCpu = false;
-                addCvMatToOpOutputInCpu = addCvMatToOpOutput && (resizeOnCpu || !renderOutputGpu);
+                addCvMatToOpOutputInCpu = addCvMatToOpOutput
+                    && (resizeOnCpu || !renderOutputGpu || wrapperStructPose.poseMode != PoseMode::Enabled);
                if (addCvMatToOpOutputInCpu)
                {
                    const auto gpuResize = false;
@@ -618,7 +619,7 @@ namespace op
                    {
                        const auto gpuResize = true;
                        opOutputToCvMats.emplace_back(std::make_shared<OpOutputToCvMat>(gpuResize));
-                        poseExtractorsWs[i].emplace_back(
+                        poseExtractorsWs.at(i).emplace_back(
                            std::make_shared<WOpOutputToCvMat<TDatumsSP>>(opOutputToCvMats.back()));
                        // Assign shared parameters
                        opOutputToCvMats.back()->setSharedParameters(

--- a/scripts/ubuntu/Makefile.example
+++ b/scripts/ubuntu/Makefile.example
@@ -33,7 +33,7 @@ LIBRARY_NAME := $(PROJECT)
 LIB_BUILD_DIR := $(BUILD_DIR)/lib
 STATIC_NAME := $(LIB_BUILD_DIR)/lib$(LIBRARY_NAME).a
 DYNAMIC_VERSION_MAJOR 		:= 1
-DYNAMIC_VERSION_MINOR 		:= 4
+DYNAMIC_VERSION_MINOR 		:= 5
 DYNAMIC_VERSION_REVISION 	:= 0
 DYNAMIC_NAME_SHORT := lib$(LIBRARY_NAME).so
 #DYNAMIC_SONAME_SHORT := $(DYNAMIC_NAME_SHORT).$(DYNAMIC_VERSION_MAJOR)

--- a/src/openpose/net/bodyPartConnectorBase.cpp
+++ b/src/openpose/net/bodyPartConnectorBase.cpp
 #include <set>
 #include <openpose/utilities/check.hpp>
 #include <openpose/utilities/fastMath.hpp>
+#include <openpose/utilities/keypoint.hpp>
 #include <openpose/pose/poseParameters.hpp>
 #include <openpose/net/bodyPartConnectorBase.hpp>

 namespace op
 {
    template <typename T>
-    inline T getScoreAB(const int i, const int j, const T* const candidateAPtr, const T* const candidateBPtr,
-                        const T* const mapX, const T* const mapY, const Point<int>& heatMapSize,
-                        const T interThreshold, const T interMinAboveThreshold)
+    inline T getScoreAB(
+        const int i, const int j, const T* const candidateAPtr, const T* const candidateBPtr, const T* const mapX,
+        const T* const mapY, const Point<int>& heatMapSize, const T interThreshold, const T interMinAboveThreshold)
    {
        try
        {
@@ -57,6 +58,27 @@ namespace op
        }
    }

+    template <typename T>
+    void getKeypointCounter(
+        int& personCounter, const std::vector<std::pair<std::vector<int>, T>>& peopleVector,
+        const unsigned int index, const int indexFirst, const int indexLast, const int minimum)
+    {
+        try
+        {
+            // Count keypoints
+            auto keypointCounter = 0;
+            for (auto i = indexFirst ; i < indexLast ; i++)
+                keypointCounter += (peopleVector[index].first.at(i) > 0);
+            // If more than minimum keypoints were found --> count them only as minimum in personCounter
+            if (keypointCounter > minimum)
+                personCounter += minimum-keypointCounter; // personCounter = non-considered keypoints + minimum
+        }
+        catch (const std::exception& e)
+        {
+            error(e.what(), __LINE__, __FUNCTION__, __FILE__);
+        }
+    }
+
    template <typename T>
    std::vector<std::pair<std::vector<int>, T>> createPeopleVector(
        const T* const heatMapPtr, const T* const peaksPtr, const PoseModel poseModel, const Point<int>& heatMapSize,
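The counter adjustment performed by `getKeypointCounter` in the hunk above can be illustrated in isolation. The sketch below is a simplified stand-in, not the OpenPose code: `adjustCounter` is a hypothetical name, the person is reduced to a plain `std::vector<int>` (0 meaning "part not found"), and the adjusted counter is returned instead of being updated through a reference.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified version of the logic in getKeypointCounter.
// partIndexes holds one entry per body part (0 = part not found).
int adjustCounter(const int personCounter, const std::vector<int>& partIndexes,
                  const int indexFirst, const int indexLast, const int minimum)
{
    assert(0 <= indexFirst && indexFirst <= indexLast
           && indexLast <= static_cast<int>(partIndexes.size()));
    // Count the keypoints found in [indexFirst, indexLast)
    int keypointCounter = 0;
    for (int i = indexFirst; i < indexLast; i++)
        keypointCounter += (partIndexes[i] > 0);
    // If more than `minimum` were found, they count only as `minimum`
    if (keypointCounter > minimum)
        return personCounter + minimum - keypointCounter;
    return personCounter;
}
```

For example, with `minimum = 1`, a group in which 3 keypoints were found contributes only 1 to the counter, while a group with 0 or 1 keypoints leaves it unchanged.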
@@ -211,8 +233,9 @@ namespace op
                            for (auto j = 1; j <= numberPeaksB; j++)
                            {
                                // Initial PAF
-                                auto scoreAB = getScoreAB(i, j, candidateAPtr, candidateBPtr, mapX, mapY,
-                                                          heatMapSize, interThreshold, interMinAboveThreshold);
+                                auto scoreAB = getScoreAB(
+                                    i, j, candidateAPtr, candidateBPtr, mapX, mapY, heatMapSize, interThreshold,
+                                    interMinAboveThreshold);

                                // E.g., neck-nose connection. If possible PAF between neck i, nose j --> add
                                // parts score + connection score
@@ -263,9 +286,8 @@ namespace op
                            const auto indexB = std::get<2>(aBConnection);
                            if (!occurA[indexA-1] && !occurB[indexB-1])
                            {
-                                abConnections.emplace_back(std::make_tuple(bodyPartA*peaksOffset + indexA*3 + 2,
-                                                                           bodyPartB*peaksOffset + indexB*3 + 2,
-                                                                           score));
+                                abConnections.emplace_back(std::make_tuple(
+                                    bodyPartA*peaksOffset+indexA*3+2, bodyPartB*peaksOffset+indexB*3+2, score));
                                counter++;
                                if (counter==minAB)
                                    break;
@@ -298,8 +320,8 @@ namespace op
                        // Add ears connections (in case person is looking to opposite direction to camera)
                        // Note: This has some issues:
                        //     - It does not prevent repeating the same keypoint in different people
-                        //     - Assuming I have nose,eye,ear as 1 person subset, and whole arm as another one, it will not
-                        //       merge them both
+                        //     - Assuming I have nose,eye,ear as 1 person subset, and whole arm as another one, it
+                        //       will not merge them both
                        else if (
                            (numberBodyParts == 18 && (pairIndex==17 || pairIndex==18))
                            || ((numberBodyParts == 19 || (numberBodyParts == 25)
@@ -622,49 +644,139 @@ namespace op
    }

    template <typename T>
-    void removePeopleBelowThresholds(
+    void getRoiDiameterAndBounds(
+        Rectangle<int>& roi, int& diameter, int& indexFirstNon0, int& indexLastNon0,
+        const std::vector<int>& personVector, const T* const peaksPtr,
+        const int indexInit, const int indexEnd)
+    {
+        try
+        {
+            roi = Rectangle<int>{0,0,0,0};
+            for (auto index = 0u ; index < personVector.size()-1 ; index++)
+            {
+                const auto x = peaksPtr[personVector[index]-2];
+                const auto y = peaksPtr[personVector[index]-1];
+                const auto score = peaksPtr[personVector[index]];
+                if (roi.x > x)
+                    roi.x = x;
+                if (roi.y > y)
+                    roi.y = y;
+            }
+        }
+        catch (const std::exception& e)
+        {
+            error(e.what(), __LINE__, __FUNCTION__, __FILE__);
+        }
+    }
+
+    template <typename T>
+    void removePeopleBelowThresholdsAndFillFaces(
        std::vector<int>& validSubsetIndexes, int& numberPeople,
-        const std::vector<std::pair<std::vector<int>, T>>& peopleVector, const unsigned int numberBodyParts,
-        const int minSubsetCnt, const T minSubsetScore, const int maxPeaks, const bool maximizePositives)
+        std::vector<std::pair<std::vector<int>, T>>& peopleVector, const unsigned int numberBodyParts,
+        const int minSubsetCnt, const T minSubsetScore, const bool maximizePositives, const T* const peaksPtr)
+        // const int minSubsetCnt, const T minSubsetScore, const int maxPeaks, const bool maximizePositives)
    {
        try
        {
            // Delete people below the following thresholds:
                // a) minSubsetCnt: removed if less than minSubsetCnt body parts
                // b) minSubsetScore: removed if global score smaller than this
-                // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds
+                // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds -> Not required
            numberPeople = 0;
            validSubsetIndexes.clear();
-            validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size()));
+            // validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size())); // maxPeaks is not required
+            validSubsetIndexes.reserve(peopleVector.size());
+            // Face valid sets
+            std::vector<int> faceValidSubsetIndexes;
+            faceValidSubsetIndexes.reserve(peopleVector.size());
+            // Face invalid sets
+            std::vector<int> faceInvalidSubsetIndexes;
+            faceInvalidSubsetIndexes.reserve(peopleVector.size());
+            // For each person candidate
            for (auto index = 0u ; index < peopleVector.size() ; index++)
            {
                auto personCounter = peopleVector[index].first.back();
+                // Analog for hand/face keypoints
+                if (numberBodyParts >= 135)
+                {
+                    // Do not consider face keypoints for personCounter
+                    const auto currentCounter = personCounter;
+                    getKeypointCounter(personCounter, peopleVector, index, 65, 135, 1);
+                    const auto newCounter = personCounter;
+                    if (personCounter == 0)
+                    {
+                        faceInvalidSubsetIndexes.emplace_back(index);
+                        continue;
+                    }
+                    // If body is still valid and facial points were removed, then add to valid faces
+                    else if (currentCounter != newCounter)
+                        faceValidSubsetIndexes.emplace_back(index);
+                    // Do not consider right hand keypoints for personCounter
+                    getKeypointCounter(personCounter, peopleVector, index, 45, 65, 1);
+                    // Do not consider left hand keypoints for personCounter
+                    getKeypointCounter(personCounter, peopleVector, index, 25, 45, 1);
+                }
                // Foot keypoints do not affect personCounter (too many false positives,
                // same foot usually appears as both left and right keypoints)
                // Pros: Removed tons of false positives
                // Cons: Standalone leg will never be recorded
+                // Solution: Do not consider foot keypoints for personCounter
                if (!maximizePositives && (numberBodyParts == 25 || numberBodyParts > 70))
                {
-                    // No consider foot keypoints for that
-                    for (auto i = 19 ; i < 25 ; i++)
-                        personCounter -= (peopleVector[index].first.at(i) > 0);
-                    // No consider hand keypoints for that
-                    if (numberBodyParts > 70)
-                        for (auto i = 25 ; i < 65 ; i++)
-                            personCounter -= (peopleVector[index].first.at(i) > 0);
+                    const auto currentCounter = personCounter;
+                    getKeypointCounter(personCounter, peopleVector, index, 19, 25, 0);
+                    const auto newCounter = personCounter;
+                    // Problem: Same leg/foot keypoints are considered for both left and right keypoints.
+                    // Solution: Remove legs that are duplicated and that do not have upper torso
+                    // Result: Slight increase in COCO mAP and decrease in mAR, plus far fewer false positives
+                    if (newCounter != currentCounter && newCounter <= 4)
+                        continue;
                }
+                // Add only valid people
                const auto personScore = peopleVector[index].second;
                if (personCounter >= minSubsetCnt && (personScore/personCounter) >= minSubsetScore)
                {
                    numberPeople++;
                    validSubsetIndexes.emplace_back(index);
-                    if (numberPeople == maxPeaks)
-                        break;
+                    // // This is not required, it is OK if there are more people. No more GPU memory used.
+                    // if (numberPeople == maxPeaks)
+                    //     break;
                }
+                // Sanity check
                else if ((personCounter < 1 && numberBodyParts != 25 && numberBodyParts < 70) || personCounter < 0)
                    error("Bad personCounter (" + std::to_string(personCounter) + "). Bug in this"
                          " function if this happens.", __LINE__, __FUNCTION__, __FILE__);
            }
+//             // Random standalone facial keypoints --> Merge into a more complete face
+//             if (numberPeople > 0 && faceInvalidSubsetIndexes.size() > 0)
+//             {
+//                 for (auto faceId = 0u ; faceId < faceInvalidSubsetIndexes.size() ; faceId++)
+//                 {
+//                     // Get ROI
+//                     Rectangle<int> roi;
+//                     int diameter;
+//                     int indexFirstNon0;
+//                     int indexLastNon0;
+//                     const auto index = faceValidSubsetIndexes[faceId];
+//                     getRoiDiameterAndBounds(
+//                         roi, diameter, indexFirstNon0, indexLastNon0, peopleVector[index].first, peaksPtr, 65, 135);
+//                     // const auto personCounter = peopleVector[index].first.back();
+//                     // const auto x = peaksPtr[peopleVector[index].first[part]-2];
+//                     // const auto y = peaksPtr[peopleVector[index].first[part]-1];
+//                     // const auto score = peaksPtr[peopleVector[index].first[part]];
+//                 }
+//             }
+            // If no people found --> Repeat with maximizePositives = true
+            // Result: Increased COCO mAP because we catch more foot-only images
+            if (numberPeople == 0 && !maximizePositives)
+            {
+                removePeopleBelowThresholdsAndFillFaces(
+                    validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, minSubsetScore,
+                    true, peaksPtr);
+                // // Debugging
+                // if (numberPeople > 0)
+                //     log("Found " + std::to_string(numberPeople) + " people in second iteration");
+            }
        }
        catch (const std::exception& e)
        {
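The "repeat with `maximizePositives = true` if nobody was found" step added to `removePeopleBelowThresholdsAndFillFaces` in the hunk above can be sketched on its own. This is a heavily simplified illustration under stated assumptions: `Candidate` and `filterCandidates` are hypothetical names, the score threshold and the duplicated-leg rule are omitted, and the strict pass simply ignores foot keypoints.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified candidate (not an OpenPose type): total keypoints
// found and how many of those are foot keypoints.
struct Candidate
{
    int keypoints;
    int footKeypoints;
};

// Returns the indexes of the candidates kept. In strict mode foot keypoints
// are not counted; in maximizePositives mode every keypoint counts.
std::vector<int> filterCandidates(const std::vector<Candidate>& candidates,
                                  const int minKeypoints, const bool maximizePositives)
{
    assert(minKeypoints >= 0);
    std::vector<int> valid;
    valid.reserve(candidates.size());
    for (int index = 0; index < static_cast<int>(candidates.size()); index++)
    {
        const auto& candidate = candidates[index];
        const int counter = maximizePositives
            ? candidate.keypoints : candidate.keypoints - candidate.footKeypoints;
        if (counter >= minKeypoints)
            valid.push_back(index);
    }
    // Nobody found in strict mode --> repeat once in maximizePositives mode
    if (valid.empty() && !maximizePositives)
        return filterCandidates(candidates, minKeypoints, true);
    return valid;
}
```

The recursion runs at most once, because the second call passes `maximizePositives = true`. When the strict pass keeps at least one candidate, foot-only candidates stay excluded, which matches the trade-off described in the commit title.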
@@ -673,30 +785,35 @@ namespace op
    }

    template <typename T>
-    void peopleVectorToPeopleArray(Array<T>& poseKeypoints, Array<T>& poseScores, const T scaleFactor,
-                                   const std::vector<std::pair<std::vector<int>, T>>& peopleVector,
-                                   const std::vector<int>& validSubsetIndexes, const T* const peaksPtr,
-                                   const int numberPeople, const unsigned int numberBodyParts,
-                                   const unsigned int numberBodyPartPairs)
+    void peopleVectorToPeopleArray(
+        Array<T>& poseKeypoints, Array<T>& poseScores, const T scaleFactor,
+        const std::vector<std::pair<std::vector<int>, T>>& peopleVector, const std::vector<int>& validSubsetIndexes,
+        const T* const peaksPtr, const int numberPeople, const unsigned int numberBodyParts,
+        const unsigned int numberBodyPartPairs)
    {
        try
        {
+            // Allocate memory (initialized to 0)
            if (numberPeople > 0)
            {
                // Initialized to 0 for non-found keypoints in people
                poseKeypoints.reset({numberPeople, (int)numberBodyParts, 3}, 0.f);
                poseScores.reset(numberPeople);
            }
+            // No people --> Empty Arrays
            else
            {
                poseKeypoints.reset();
                poseScores.reset();
            }
+            // Fill people keypoints
            const auto oneOverNumberBodyPartsAndPAFs = 1/T(numberBodyParts + numberBodyPartPairs);
+            // For each person
            for (auto person = 0u ; person < validSubsetIndexes.size() ; person++)
            {
                const auto& personPair = peopleVector[validSubsetIndexes[person]];
                const auto& personVector = personPair.first;
+                // For each body part
                for (auto bodyPart = 0u; bodyPart < numberBodyParts; bodyPart++)
                {
                    const auto baseOffset = (person*numberBodyParts + bodyPart) * 3;
@@ -1109,11 +1226,11 @@ namespace op
 //     }

    template <typename T>
-    void connectBodyPartsCpu(Array<T>& poseKeypoints, Array<T>& poseScores, const T* const heatMapPtr,
-                             const T* const peaksPtr, const PoseModel poseModel, const Point<int>& heatMapSize,
-                             const int maxPeaks, const T interMinAboveThreshold, const T interThreshold,
-                             const int minSubsetCnt, const T minSubsetScore, const T scaleFactor,
-                             const bool maximizePositives)
+    void connectBodyPartsCpu(
+        Array<T>& poseKeypoints, Array<T>& poseScores, const T* const heatMapPtr, const T* const peaksPtr,
+        const PoseModel poseModel, const Point<int>& heatMapSize, const int maxPeaks, const T interMinAboveThreshold,
+        const T interThreshold, const int minSubsetCnt, const T minSubsetScore, const T scaleFactor,
+        const bool maximizePositives)
    {
        try
        {
@@ -1124,29 +1241,27 @@ namespace op
            if (numberBodyParts == 0)
                error("Invalid value of numberBodyParts, it must be positive, not " + std::to_string(numberBodyParts),
                      __LINE__, __FUNCTION__, __FILE__);
-
            // std::vector<std::pair<std::vector<int>, double>> refers to:
            //     - std::vector<int>: [body parts locations, #body parts found]
            //     - double: person subset score
-            const auto peopleVector = createPeopleVector(
+            auto peopleVector = createPeopleVector(
                heatMapPtr, peaksPtr, poseModel, heatMapSize, maxPeaks, interThreshold, interMinAboveThreshold,
                bodyPartPairs, numberBodyParts, numberBodyPartPairs);
-
            // Delete people below the following thresholds:
                // a) minSubsetCnt: removed if less than minSubsetCnt body parts
                // b) minSubsetScore: removed if global score smaller than this
                // c) maxPeaks (POSE_MAX_PEOPLE): keep first maxPeaks people above thresholds
            int numberPeople;
            std::vector<int> validSubsetIndexes;
-            validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size()));
-            removePeopleBelowThresholds(
+            // validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size()));
+            validSubsetIndexes.reserve(peopleVector.size());
+            removePeopleBelowThresholdsAndFillFaces(
                validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, minSubsetScore,
-                maxPeaks, maximizePositives);
-
+                maximizePositives, peaksPtr);
            // Fill and return poseKeypoints
-            peopleVectorToPeopleArray(poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes,
-                                      peaksPtr, numberPeople, numberBodyParts, numberBodyPartPairs);
-
+            peopleVectorToPeopleArray(
+                poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, peaksPtr, numberPeople,
+                numberBodyParts, numberBodyPartPairs);
            // Experimental code
            if (poseModel == PoseModel::BODY_25D)
                error("BODY_25D is an experimental branch which is not usable.", __LINE__, __FUNCTION__, __FILE__);
@@ -1185,16 +1300,16 @@ namespace op
        const unsigned int numberBodyParts, const unsigned int numberBodyPartPairs,
        const Array<double>& precomputedPAFs);

-    template OP_API void removePeopleBelowThresholds(
+    template OP_API void removePeopleBelowThresholdsAndFillFaces(
        std::vector<int>& validSubsetIndexes, int& numberPeople,
-        const std::vector<std::pair<std::vector<int>, float>>& peopleVector,
-        const unsigned int numberBodyParts,
-        const int minSubsetCnt, const float minSubsetScore, const int maxPeaks, const bool maximizePositives);
-    template OP_API void removePeopleBelowThresholds(
+        std::vector<std::pair<std::vector<int>, float>>& peopleVector,
+        const unsigned int numberBodyParts, const int minSubsetCnt, const float minSubsetScore,
+        const bool maximizePositives, const float* const peaksPtr);
+    template OP_API void removePeopleBelowThresholdsAndFillFaces(
        std::vector<int>& validSubsetIndexes, int& numberPeople,
-        const std::vector<std::pair<std::vector<int>, double>>& peopleVector,
-        const unsigned int numberBodyParts,
-        const int minSubsetCnt, const double minSubsetScore, const int maxPeaks, const bool maximizePositives);
+        std::vector<std::pair<std::vector<int>, double>>& peopleVector,
+        const unsigned int numberBodyParts, const int minSubsetCnt, const double minSubsetScore,
+        const bool maximizePositives, const double* const peaksPtr);

    template OP_API void peopleVectorToPeopleArray(
        Array<float>& poseKeypoints, Array<float>& poseScores, const float scaleFactor,

--- a/src/openpose/net/bodyPartConnectorBase.cu
+++ b/src/openpose/net/bodyPartConnectorBase.cu
--- a/src/openpose/net/bodyPartConnectorBaseCL.cpp
+++ b/src/openpose/net/bodyPartConnectorBaseCL.cpp
@@ -156,26 +156,24 @@ namespace op
                    pairScoresGpuPtrBuffer, heatMapGpuPtrBuffer, peaksGpuPtrBuffer, bodyPartPairsGpuPtrBuffer, mapIdxGpuPtrBuffer,
                    maxPeaks, (int)numberBodyPartPairs, heatMapSize.x, heatMapSize.y, interThreshold,
                    interMinAboveThreshold);
-                OpenCL::getInstance(gpuID)->getQueue().enqueueReadBuffer(pairScoresGpuPtrBuffer, CL_TRUE, 0,
-                                                                          totalComputations * sizeof(T), pairScoresCpu.getPtr());
+                OpenCL::getInstance(gpuID)->getQueue().enqueueReadBuffer(
+                    pairScoresGpuPtrBuffer, CL_TRUE, 0, totalComputations * sizeof(T), pairScoresCpu.getPtr());

                // New code
                // Get pair connections and their scores
                const auto pairConnections = pafPtrIntoVector(
                    pairScoresCpu, peaksPtr, maxPeaks, bodyPartPairs, numberBodyPartPairs);
-                const auto peopleVector = pafVectorIntoPeopleVector(
+                auto peopleVector = pafVectorIntoPeopleVector(
                    pairConnections, peaksPtr, maxPeaks, bodyPartPairs, numberBodyParts);
-
-               // // Old code
-               // // Get pair connections and their scores
-               // // std::vector<std::pair<std::vector<int>, double>> refers to:
-               // //     - std::vector<int>: [body parts locations, #body parts found]
-               // //     - double: person subset score
-               // const T* const tNullptr = nullptr;
-               // const auto peopleVector = createPeopleVector(
-               //     tNullptr, peaksPtr, poseModel, heatMapSize, maxPeaks, interThreshold, interMinAboveThreshold,
-               //     bodyPartPairs, numberBodyParts, numberBodyPartPairs, pairScoresCpu);
-
+                // // Old code
+                // // Get pair connections and their scores
+                // // std::vector<std::pair<std::vector<int>, double>> refers to:
+                // //     - std::vector<int>: [body parts locations, #body parts found]
+                // //     - double: person subset score
+                // const T* const tNullptr = nullptr;
+                // const auto peopleVector = createPeopleVector(
+                //     tNullptr, peaksPtr, poseModel, heatMapSize, maxPeaks, interThreshold, interMinAboveThreshold,
+                //     bodyPartPairs, numberBodyParts, numberBodyPartPairs, pairScoresCpu);
                // Delete people below the following thresholds:
                    // a) minSubsetCnt: removed if less than minSubsetCnt body parts
                    // b) minSubsetScore: removed if global score smaller than this
@@ -183,15 +181,13 @@ namespace op
                int numberPeople;
                std::vector<int> validSubsetIndexes;
                validSubsetIndexes.reserve(fastMin((size_t)maxPeaks, peopleVector.size()));
-                removePeopleBelowThresholds(validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt,
-                                            minSubsetScore, maxPeaks, maximizePositives);
-
+                removePeopleBelowThresholdsAndFillFaces(
+                    validSubsetIndexes, numberPeople, peopleVector, numberBodyParts, minSubsetCnt, minSubsetScore,
+                    maximizePositives, peaksPtr);
                // Fill and return poseKeypoints
-                peopleVectorToPeopleArray(poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes,
-                                          peaksPtr, numberPeople, numberBodyParts, numberBodyPartPairs);
-
-               // // Sanity check
-               // cudaCheck(__LINE__, __FUNCTION__, __FILE__);
+                peopleVectorToPeopleArray(
+                    poseKeypoints, poseScores, scaleFactor, peopleVector, validSubsetIndexes, peaksPtr, numberPeople,
+                    numberBodyParts, numberBodyPartPairs);
            #else
                UNUSED(poseKeypoints);
                UNUSED(poseScores);

--- a/src/openpose/net/bodyPartConnectorCaffe.cpp
+++ b/src/openpose/net/bodyPartConnectorCaffe.cpp
@@ -108,6 +108,19 @@ namespace op
        }
    }

+    template <typename T>
+    void BodyPartConnectorCaffe<T>::setDefaultNmsThreshold(const T defaultNmsThreshold)
+    {
+        try
+        {
+            mDefaultNmsThreshold = {defaultNmsThreshold};
+        }
+        catch (const std::exception& e)
+        {
+            error(e.what(), __LINE__, __FUNCTION__, __FILE__);
+        }
+    }
+
    template <typename T>
    void BodyPartConnectorCaffe<T>::setInterMinAboveThreshold(const T interMinAboveThreshold)
    {
@@ -300,8 +313,8 @@ namespace op
    }

    template <typename T>
-    void BodyPartConnectorCaffe<T>::Forward_gpu(const std::vector<ArrayCpuGpu<T>*>& bottom, Array<T>& poseKeypoints,
-                                                Array<T>& poseScores)
+    void BodyPartConnectorCaffe<T>::Forward_gpu(
+        const std::vector<ArrayCpuGpu<T>*>& bottom, Array<T>& poseKeypoints, Array<T>& poseScores)
    {
        try
        {
@@ -354,12 +367,12 @@ namespace op
                }

                // Run body part connector
-                connectBodyPartsGpu(poseKeypoints, poseScores, heatMapsGpuPtr, peaksPtr, mPoseModel,
-                                    Point<int>{heatMapsBlob->shape(3), heatMapsBlob->shape(2)},
-                                    maxPeaks, mInterMinAboveThreshold, mInterThreshold,
-                                    mMinSubsetCnt, mMinSubsetScore, mScaleNetToOutput, mMaximizePositives,
-                                    mFinalOutputCpu, pFinalOutputGpuPtr, pBodyPartPairsGpuPtr, pMapIdxGpuPtr,
-                                    peaksGpuPtr);
+                connectBodyPartsGpu(
+                    poseKeypoints, poseScores, heatMapsGpuPtr, peaksPtr, mPoseModel,
+                    Point<int>{heatMapsBlob->shape(3), heatMapsBlob->shape(2)}, maxPeaks, mInterMinAboveThreshold,
+                    mInterThreshold, mMinSubsetCnt, mMinSubsetScore, mScaleNetToOutput, mMaximizePositives,
+                    mFinalOutputCpu, pFinalOutputGpuPtr, pBodyPartPairsGpuPtr, pMapIdxGpuPtr, peaksGpuPtr,
+                    mDefaultNmsThreshold);
            #else
                UNUSED(bottom);
                UNUSED(poseKeypoints);

--- a/src/openpose/pose/poseExtractorCaffe.cpp
+++ b/src/openpose/pose/poseExtractorCaffe.cpp
@@ -317,6 +317,7 @@ namespace op
                // OP_CUDA_PROFILE_END(timeNormalize3, 1e3, REPS);
                // OP_CUDA_PROFILE_INIT(REPS);
                spBodyPartConnectorCaffe->setScaleNetToOutput(mScaleNetToOutput);
+                spBodyPartConnectorCaffe->setDefaultNmsThreshold((float)get(PoseProperty::NMSThreshold));
                spBodyPartConnectorCaffe->setInterMinAboveThreshold(
                    (float)get(PoseProperty::ConnectInterMinAboveThreshold));
                spBodyPartConnectorCaffe->setInterThreshold((float)get(PoseProperty::ConnectInterThreshold));

--- a/src/openpose/utilities/keypoint.cpp
+++ b/src/openpose/utilities/keypoint.cpp
@@ -174,9 +174,10 @@ namespace op
        const double offsetY);

    template <typename T>
-    void renderKeypointsCpu(Array<T>& frameArray, const Array<T>& keypoints, const std::vector<unsigned int>& pairs,
-                            const std::vector<T> colors, const T thicknessCircleRatio,
-                            const T thicknessLineRatioWRTCircle, const std::vector<T>& poseScales, const T threshold)
+    void renderKeypointsCpu(
+        Array<T>& frameArray, const Array<T>& keypoints, const std::vector<unsigned int>& pairs,
+        const std::vector<T> colors, const T thicknessCircleRatio, const T thicknessLineRatioWRTCircle,
+        const std::vector<T>& poseScales, const T threshold)
    {
        try
        {
@@ -209,8 +210,9 @@ namespace op
                    const auto personRectangle = getKeypointsRectangle(keypoints, person, thresholdRectangle);
                    if (personRectangle.area() > 0)
                    {
-                        const auto ratioAreas = fastMin(T(1), fastMax(personRectangle.width/(T)width,
-                                                                     personRectangle.height/(T)height));
+                        const auto ratioAreas = fastMin(
+                            T(1), fastMax(
+                                personRectangle.width/(T)width, personRectangle.height/(T)height));
                        // Size-dependent variables
                        const auto thicknessRatio = fastMax(
                            positiveIntRound(std::sqrt(area)* thicknessCircleRatio * ratioAreas), 2);
@@ -283,21 +285,32 @@ namespace op
        const std::vector<double>& poseScales, const double threshold);

    template <typename T>
-    Rectangle<T> getKeypointsRectangle(const Array<T>& keypoints, const int person, const T threshold)
+    Rectangle<T> getKeypointsRectangle(
+        const Array<T>& keypoints, const int person, const T threshold, const int firstIndex, const int lastIndex)
    {
        try
        {
+            // Params
            const auto numberKeypoints = keypoints.getSize(1);
-            // Sanity check
+            const auto lastIndexClean = (lastIndex < 0 ? numberKeypoints : lastIndex);
+            // Sanity checks
            if (numberKeypoints < 1)
                error("Number body parts must be > 0.", __LINE__, __FUNCTION__, __FILE__);
+            if (lastIndexClean > numberKeypoints)
+                error("The value of `lastIndex` must be less than or equal to `numberKeypoints`. Currently: "
+                    + std::to_string(lastIndexClean) + " vs. " + std::to_string(numberKeypoints),
+                    __LINE__, __FUNCTION__, __FILE__);
+            if (firstIndex > lastIndexClean)
+                error("The value of `firstIndex` must be less than or equal to `lastIndex`. Currently: "
+                    + std::to_string(firstIndex) + " vs. " + std::to_string(lastIndexClean),
+                    __LINE__, __FUNCTION__, __FILE__);
            // Define keypointPtr
            const auto keypointPtr = keypoints.getConstPtr() + person * keypoints.getSize(1) * keypoints.getSize(2);
            T minX = std::numeric_limits<T>::max();
            T maxX = std::numeric_limits<T>::lowest();
            T minY = minX;
            T maxY = maxX;
-            for (auto part = 0 ; part < numberKeypoints ; part++)
+            for (auto part = firstIndex ; part < lastIndexClean ; part++)
            {
                const auto score = keypointPtr[3*part + 2];
                if (score > threshold)
@@ -328,9 +341,11 @@ namespace op
        }
    }
    template OP_API Rectangle<float> getKeypointsRectangle(
-        const Array<float>& keypoints, const int person, const float threshold);
+        const Array<float>& keypoints, const int person, const float threshold, const int firstIndex,
+        const int lastIndex);
    template OP_API Rectangle<double> getKeypointsRectangle(
-        const Array<double>& keypoints, const int person, const double threshold);
+        const Array<double>& keypoints, const int person, const double threshold, const int firstIndex,
+        const int lastIndex);

    template <typename T>
    T getAverageScore(const Array<T>& keypoints, const int person)