Merge pull request #20978 from cudawarped:videocapture_read_raw_enchancement

Add capacity to VideoCapture to return the extraData from FFmpeg when required

* Update rawMode to append any extra data received during the initial negotiation of an RTSP stream or during the parsing of an MPEG4 file header.

  For h264[5] RTSP streams this ensures that the parameter sets, if available, are always returned on the first call to grab()/read(). This has two purposes:
  1) To ensure the parameter sets are available even if they are not transmitted in band. This is common for Axis IP cameras.
  2) To allow callers of VideoCapture::grab()[read()] to split the raw stream over multiple files by prepending the parameter sets to any new files.

  For (1) there is no alternative. For (2), if the parameter sets were provided in band, it would be possible to parse the raw bit stream and search for the parameter sets; however, that would be a lot of work when that information is already provided by FFmpeg. For MPEG4 files this information is only supplied in the header and is required for decoding.

  Two properties are also required to enable the raw encoded bitstream to be written to multiple files:
  1) an indicator as to whether the last frame was a key frame or not; each new file needs to start at a key frame to avoid storing unusable frame diffs,
  2) the length in bytes of the parameter sets contained in the last frame; required to split the parameter sets from the frame without having to parse the stream.

  Any call to VideoCapture::get(CAP_PROP_LF_PARAM_SET_LEN) returning a number greater than zero indicates the presence of a parameter set at the beginning of the raw bitstream.
* Adjust test data to account for extraData.
* Address warning.
* Change added property names and remove parameter set start code check.
* Output extra data on calls to retrieve instead of appending to the first packet.
* Reverted old test case and added new one to evaluate new functionality.
* Add missing definition.
* Remove flag from legacy api.
* Add property to determine if returning extra data is supported. Always allow extra data to be returned on calls to cap.retrieve(). Update test case.
* Update condition which indicates CAP_PROP_CODEC_EXTRADATA_INDEX is not supported in test case.
* Include compatibility for windows dll if not updated. Enforce existing return status convention.
* Fix return error and missing test constraints.
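
The file-splitting use case described in the commit message can be sketched without OpenCV. This is a minimal illustration, not code from the PR: `RawPacket`, `splitIntoSegments` and `minPackets` are hypothetical names. In real use, the packet bytes would presumably come from `cap.read()` in raw mode, the key-frame flag from `cap.get(CAP_PROP_LRF_HAS_KEY_FRAME)`, and `extraData` from `cap.retrieve(data, <index>)`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical stand-in for one raw encoded packet plus its key-frame flag.
struct RawPacket {
    Bytes data;
    bool isKeyFrame;
};

// Splits a raw stream into in-memory segments (stand-ins for files).
// A segment is closed once it holds at least `minPackets` packets, but the
// next one only starts on a key frame and always begins with `extraData`.
std::vector<Bytes> splitIntoSegments(const std::vector<RawPacket>& packets,
                                     const Bytes& extraData,
                                     std::size_t minPackets)
{
    assert(minPackets > 0);
    std::vector<Bytes> segments;
    std::size_t packetsInSegment = 0;
    for (const RawPacket& p : packets) {
        const bool needFirst = segments.empty();
        const bool rollOver = !needFirst && packetsInSegment >= minPackets;
        if ((needFirst || rollOver) && p.isKeyFrame) {
            segments.push_back(extraData);  // parameter sets go first
            packetsInSegment = 0;
        }
        if (segments.empty())
            continue;  // drop frame diffs that precede the first key frame
        Bytes& out = segments.back();
        out.insert(out.end(), p.data.begin(), p.data.end());
        ++packetsInSegment;
    }
    return segments;
}
```

Because a new segment is only opened on a key frame, a segment may run longer than `minPackets` when key frames are sparse; that trade-off keeps every segment decodable on its own.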

97c6ec6d · cudawarped · GitHub · 1fcf7ba5 · 97c6ec6d · 97c6ec6d
4 changed files
--- a/modules/videoio/include/opencv2/videoio.hpp
+++ b/modules/videoio/include/opencv2/videoio.hpp
@@ -200,6 +200,8 @@ enum VideoCaptureProperties {
       CAP_PROP_AUDIO_TOTAL_CHANNELS = 64, //!< (read-only) Number of audio channels in the selected audio stream (mono, stereo, etc)
       CAP_PROP_AUDIO_TOTAL_STREAMS = 65, //!< (read-only) Number of audio streams.
       CAP_PROP_AUDIO_SYNCHRONIZE = 66, //!< (open, read) Enables audio synchronization.
+       CAP_PROP_LRF_HAS_KEY_FRAME = 67, //!< FFmpeg back-end only - Indicates whether the Last Raw Frame (LRF), output from VideoCapture::read() when VideoCapture is initialized with VideoCapture::open(CAP_FFMPEG, {CAP_PROP_FORMAT, -1}) or VideoCapture::set(CAP_PROP_FORMAT,-1) is called before the first call to VideoCapture::read(), contains encoded data for a key frame.
+       CAP_PROP_CODEC_EXTRADATA_INDEX = 68, //!< Positive index indicates that returning extra data is supported by the video back end.  This can be retrieved as cap.retrieve(data, <returned index>).  E.g. When reading from a h264 encoded RTSP stream, the FFmpeg backend could return the SPS and/or PPS if available (if sent in reply to a DESCRIBE request), from calls to cap.retrieve(data, <returned index>).
 #ifndef CV_DOXYGEN
       CV__CAP_PROP_LATEST
 #endif
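
The doc comment for CAP_PROP_CODEC_EXTRADATA_INDEX mentions SPS and/or PPS being returned for H.264 RTSP streams. A caller who wants to check what the returned buffer contains could scan it as below. This is a sketch under an assumption the diff does not confirm: that the buffer uses Annex B start codes (`00 00 01`). Extra data from an MP4 header is typically in a different (avcC) layout, and H.265 uses a different NAL header, so neither is handled. `h264NalTypes` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the nal_unit_type of every H.264 NAL unit that follows an
// Annex B start code (00 00 01, optionally preceded by an extra 00).
// In H.264, type 7 is an SPS and type 8 is a PPS.
std::vector<int> h264NalTypes(const std::uint8_t* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    std::vector<int> types;
    for (std::size_t i = 0; i + 3 < size; ++i) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            types.push_back(data[i + 3] & 0x1F);  // low 5 bits of the NAL header
            i += 2;  // step over the start code; the loop increment lands on the header
        }
    }
    return types;
}
```

With a `cv::Mat` filled by `cap.retrieve(data, <returned index>)`, the call would presumably be `h264NalTypes(data.data, data.total())`.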

--- a/modules/videoio/src/cap_ffmpeg.cpp
+++ b/modules/videoio/src/cap_ffmpeg.cpp
@@ -87,7 +87,7 @@ public:
    {
        return ffmpegCapture ? icvGrabFrame_FFMPEG_p(ffmpegCapture)!=0 : false;
    }
-    virtual bool retrieveFrame(int, cv::OutputArray frame) CV_OVERRIDE
+    virtual bool retrieveFrame(int flag, cv::OutputArray frame) CV_OVERRIDE
    {
        unsigned char* data = 0;
        int step=0, width=0, height=0, cn=0;
@@ -102,8 +102,14 @@ public:
            }
        }

-        if (!icvRetrieveFrame_FFMPEG_p(ffmpegCapture, &data, &step, &width, &height, &cn))
-            return false;
+        if (flag == 0) {
+            if (!icvRetrieveFrame_FFMPEG_p(ffmpegCapture, &data, &step, &width, &height, &cn))
+                return false;
+        }
+        else {
+            if (!ffmpegCapture->retrieveFrame(flag, &data, &step, &width, &height, &cn))
+                return false;
+        }

        cv::Mat tmp(height, width, CV_MAKETYPE(CV_8U, cn), data, step);
        this->rotateFrame(tmp);

--- a/modules/videoio/src/cap_ffmpeg_impl.hpp
+++ b/modules/videoio/src/cap_ffmpeg_impl.hpp
@@ -481,7 +481,7 @@ struct CvCapture_FFMPEG
    double getProperty(int) const;
    bool setProperty(int, double);
    bool grabFrame();
-    bool retrieveFrame(int, unsigned char** data, int* step, int* width, int* height, int* cn);
+    bool retrieveFrame(int flag, unsigned char** data, int* step, int* width, int* height, int* cn);
    bool retrieveHWFrame(cv::OutputArray output);
    void rotateFrame(cv::Mat &mat) const;

@@ -547,6 +547,7 @@ struct CvCapture_FFMPEG
    VideoAccelerationType va_type;
    int hw_device;
    int use_opencl;
+    int extraDataIdx;
 };

 void CvCapture_FFMPEG::init()
@@ -590,6 +591,7 @@ void CvCapture_FFMPEG::init()
    va_type = cv::VIDEO_ACCELERATION_NONE;  // TODO OpenCV 5.0: change to _ANY?
    hw_device = -1;
    use_opencl = 0;
+    extraDataIdx = 1;
 }


@@ -1408,20 +1410,28 @@ bool CvCapture_FFMPEG::grabFrame()
    return valid;
 }

-bool CvCapture_FFMPEG::retrieveFrame(int, unsigned char** data, int* step, int* width, int* height, int* cn)
+bool CvCapture_FFMPEG::retrieveFrame(int flag, unsigned char** data, int* step, int* width, int* height, int* cn)
 {
    if (!video_st)
        return false;

-    if (rawMode)
+    if (rawMode || flag == extraDataIdx)
    {
-        AVPacket& p = bsfc ? packet_filtered : packet;
-        *data = p.data;
-        *step = p.size;
-        *width = p.size;
+        bool ret = true;
+        if (flag == 0) {
+            AVPacket& p = bsfc ? packet_filtered : packet;
+            *data = p.data;
+            *step = p.size;
+            ret = p.data != NULL;
+        }
+        else if (flag == extraDataIdx) {
+            *data = ic->streams[video_stream]->codec->extradata;
+            *step = ic->streams[video_stream]->codec->extradata_size;
+        }
+        *width = *step;
        *height = 1;
        *cn = 1;
-        return p.data != NULL;
+        return  ret;
    }

    AVFrame* sw_picture = picture;
@@ -1586,6 +1596,12 @@ double CvCapture_FFMPEG::getProperty( int property_id ) const
        if (rawMode)
            return -1;
        break;
+    case CAP_PROP_LRF_HAS_KEY_FRAME: {
+        const AVPacket& p = bsfc ? packet_filtered : packet;
+        return ((p.flags & AV_PKT_FLAG_KEY) != 0) ? 1 : 0;
+    }
+    case CAP_PROP_CODEC_EXTRADATA_INDEX:
+            return extraDataIdx;
    case CAP_PROP_BITRATE:
        return static_cast<double>(get_bitrate());
    case CAP_PROP_ORIENTATION_META:
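
The dispatch added to CvCapture_FFMPEG::retrieveFrame() in the hunk above can be modelled in isolation. This is a simplified sketch, not the PR's code: `CaptureModel`, `RawView` and `retrieveRaw` are hypothetical names, the decoded-frame path is left out (the sketch returns false there), and plain byte vectors stand in for the FFmpeg packet and codec extradata.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the capture state used by the raw path.
struct CaptureModel {
    bool rawMode = false;
    int extraDataIdx = 1;                 // same default as init() in the diff
    std::vector<std::uint8_t> packet;     // last grabbed encoded packet
    std::vector<std::uint8_t> extraData;  // codec extradata
};

// A 1 x N single-channel byte row, which is how the raw path reports its data.
struct RawView {
    const std::uint8_t* data = nullptr;
    int step = 0, width = 0, height = 0, cn = 0;
};

// Fills `out` from the extradata (flag == extraDataIdx, any mode) or from the
// last packet (flag == 0, raw mode only). Other combinations are not modelled.
bool retrieveRaw(const CaptureModel& cap, int flag, RawView& out)
{
    assert(cap.extraDataIdx > 0);
    const std::vector<std::uint8_t>* src = nullptr;
    if (flag == cap.extraDataIdx)
        src = &cap.extraData;
    else if (cap.rawMode && flag == 0)
        src = &cap.packet;
    else
        return false;  // simplification: decoded-frame path omitted
    out.data = src->empty() ? nullptr : src->data();
    out.step = static_cast<int>(src->size());
    out.width = out.step;
    out.height = 1;
    out.cn = 1;
    // As in the diff, only the packet path reports failure on a null pointer.
    return flag == cap.extraDataIdx || out.data != nullptr;
}
```

The point the model makes is that the extra data is reachable even when raw mode is off, which matches the "Always allow extra data to be returned on calls to cap.retrieve()" item in the commit message.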

--- a/modules/videoio/test/test_ffmpeg.cpp
+++ b/modules/videoio/test/test_ffmpeg.cpp
@@ -476,6 +476,27 @@ static void ffmpeg_check_read_raw(VideoCapture& cap)
    EXPECT_EQ((size_t)37118, data.total());
 }

+TEST(videoio_ffmpeg, ffmpeg_check_extra_data)
+{
+    if (!videoio_registry::hasBackend(CAP_FFMPEG))
+        throw SkipTestException("FFmpeg backend was not found");
+
+    string video_file = findDataFile("video/big_buck_bunny.mp4");
+    VideoCapture cap;
+    EXPECT_NO_THROW(cap.open(video_file, CAP_FFMPEG));
+    ASSERT_TRUE(cap.isOpened()) << "Can't open the video";
+    const int codecExtradataIdx = (int)cap.get(CAP_PROP_CODEC_EXTRADATA_INDEX);
+#ifdef _WIN32  // handle old FFmpeg backend
+    if (codecExtradataIdx <= 0)
+        throw SkipTestException("Codec extra data is not supported by backend or video stream");
+#endif
+    Mat data;
+    ASSERT_TRUE(cap.retrieve(data, codecExtradataIdx));
+    EXPECT_EQ(CV_8UC1, data.type()) << "CV_8UC1 != " << typeToString(data.type());
+    EXPECT_TRUE(data.rows == 1 || data.cols == 1) << data.size;
+    EXPECT_EQ((size_t)45, data.total());
+}
+
 TEST(videoio_ffmpeg, open_with_property)
 {
    if (!videoio_registry::hasBackend(CAP_FFMPEG))