OpenCV 4.10.0-dev
Open Source Computer Vision
cv::cuda::BufferPool Class Reference

BufferPool for use with CUDA streams.

#include <opencv2/core/cuda.hpp>


Public Member Functions

 BufferPool (Stream &stream)
 Gets the BufferPool for the given stream.
 
Ptr< GpuMat::Allocator > getAllocator () const
 Returns the allocator associated with the stream.
 
GpuMat getBuffer (int rows, int cols, int type)
 Allocates a new GpuMat of given size and type.
 
GpuMat getBuffer (Size size, int type)
 Allocates a new GpuMat of given size and type.
 

Detailed Description

BufferPool for use with CUDA streams.

BufferPool uses the Stream's allocator to create new buffers for GpuMat objects. It is only useful when enabled with setBufferPoolUsage.

void setBufferPoolUsage(bool on)
Note
setBufferPoolUsage must be called before any Stream declaration.

Users may specify a custom allocator for a Stream and may implement their own stream-based functions that use the same underlying GPU memory management.

If no custom allocator is specified, BufferPool uses StackAllocator by default. StackAllocator allocates a chunk of GPU device memory up front, and when a GpuMat is later requested from the pool, it is given part of that pre-allocated memory. This strategy reduces the number of calls to memory allocation APIs such as cudaMalloc or cudaMallocPitch.
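
The idea of pre-allocating one chunk and handing out pieces of it can be sketched in plain C++ on host memory. This is a toy model only, not OpenCV's StackAllocator: ToyStack, request, used and backendCalls are hypothetical names, and a std::vector stands in for the device memory that a real implementation would obtain from the CUDA runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy host-memory model of a stack ("bump") allocator.
// NOT OpenCV's StackAllocator; all names here are hypothetical.
class ToyStack {
public:
    // One up-front allocation of `capacity` bytes (stands in for a single
    // cudaMalloc call in a real GPU allocator).
    explicit ToyStack(std::size_t capacity)
        : storage_(capacity), top_(0), backendCalls_(1) {}

    // Hand out `bytes` from the pre-allocated chunk without touching the
    // backend allocator again. Returns nullptr if the request does not fit.
    unsigned char* request(std::size_t bytes) {
        if (bytes > storage_.size() - top_) return nullptr;
        unsigned char* p = storage_.data() + top_;
        top_ += bytes;
        return p;
    }

    std::size_t used() const { return top_; }

    // Number of backend allocations performed so far (always 1 here).
    int backendCalls() const { return backendCalls_; }

private:
    std::vector<unsigned char> storage_;
    std::size_t top_;
    int backendCalls_;
};
```

In this model, any number of successful request calls costs a single backend allocation, which is the saving the paragraph above describes.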

Below is an example that utilizes BufferPool with StackAllocator:

#include <opencv2/opencv.hpp>
#include <opencv2/cudaimgproc.hpp> // declares cv::cuda::cvtColor
using namespace cv;
using namespace cv::cuda;
int main()
{
    setBufferPoolUsage(true); // Tell OpenCV that we are going to utilize BufferPool
    setBufferPoolConfig(getDevice(), 1024 * 1024 * 64, 2); // Allocate 64 MB, 2 stacks (default is 10 MB, 5 stacks)
    Stream stream1, stream2; // Each stream uses 1 stack
    BufferPool pool1(stream1), pool2(stream2);
    GpuMat d_src1 = pool1.getBuffer(4096, 4096, CV_8UC1); // 16MB
    GpuMat d_dst1 = pool1.getBuffer(4096, 4096, CV_8UC3); // 48MB, pool1 is now full
    GpuMat d_src2 = pool2.getBuffer(1024, 1024, CV_8UC1); // 1MB
    GpuMat d_dst2 = pool2.getBuffer(1024, 1024, CV_8UC3); // 3MB
    cvtColor(d_src1, d_dst1, cv::COLOR_GRAY2BGR, 0, stream1);
    cvtColor(d_src2, d_dst2, cv::COLOR_GRAY2BGR, 0, stream2);
}
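
The sizes quoted in the comments of the BufferPool example can be checked with a little arithmetic. The sketch below assumes that rows are not padded for alignment, which a real GPU allocator may well do; bufferBytes, MiB and kStackSize are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Unpadded payload size of a rows x cols buffer (hypothetical helper).
// CV_8UC1 corresponds to 1 channel of 1 byte, CV_8UC3 to 3 channels of 1 byte.
constexpr std::size_t bufferBytes(std::size_t rows, std::size_t cols,
                                  std::size_t channels,
                                  std::size_t bytesPerChannel = 1) {
    return rows * cols * channels * bytesPerChannel;
}

constexpr std::size_t MiB = 1024 * 1024;
constexpr std::size_t kStackSize = 64 * MiB; // stack size used in the example

// 4096 x 4096 single-channel 8-bit image: 16 MiB.
static_assert(bufferBytes(4096, 4096, 1) == 16 * MiB, "d_src1 size");
// 4096 x 4096 three-channel 8-bit image: 48 MiB.
static_assert(bufferBytes(4096, 4096, 3) == 48 * MiB, "d_dst1 size");
// Together they consume the whole 64 MiB stack, so pool1 is full.
static_assert(bufferBytes(4096, 4096, 1) + bufferBytes(4096, 4096, 3) == kStackSize,
              "pool1 is exactly full");
```

The two buffers requested from pool2 add up to only 4 MiB under the same assumption, so that stack still has room.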

If we allocate another GpuMat on pool1 in the above example, it will be carried out by the DefaultAllocator since the stack for pool1 is full.

GpuMat d_add1 = pool1.getBuffer(1024, 1024, CV_8UC1); // Stack for pool1 is full, memory is allocated with DefaultAllocator

If a third stream is declared in the above example, allocating with getBuffer within that stream will also be carried out by the DefaultAllocator because we've run out of stacks.

Stream stream3; // Only 2 stacks were allocated, we've run out of stacks
BufferPool pool3(stream3);
GpuMat d_src3 = pool3.getBuffer(1024, 1024, CV_8UC1); // Memory is allocated with DefaultAllocator
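
Both fallback cases described here (a full stack, and no stack left for a new stream) can be modelled in a few lines of plain C++. This is a bookkeeping sketch under simplifying assumptions, not OpenCV's implementation: ToyPool, Source, addStream and request are hypothetical names, memory is only counted rather than allocated, and releases are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Which allocator would serve a request in this toy model.
enum class Source { Stack, Default };

// Hypothetical model of a pool that owns `stackCount` stacks of
// `stackSize` bytes each. Only byte counts are tracked.
class ToyPool {
public:
    ToyPool(std::size_t stackSize, int stackCount)
        : stackSize_(stackSize), freeStacks_(stackCount) {}

    // Register a new stream and return its id. The stream receives a
    // stack only if one is still available.
    std::size_t addStream() {
        const bool hasStack = freeStacks_ > 0;
        if (hasStack) --freeStacks_;
        hasStack_.push_back(hasStack);
        used_.push_back(0);
        return hasStack_.size() - 1;
    }

    // Report which allocator would serve `bytes` for the given stream.
    Source request(std::size_t stream, std::size_t bytes) {
        if (hasStack_[stream] && bytes <= stackSize_ - used_[stream]) {
            used_[stream] += bytes;
            return Source::Stack;
        }
        return Source::Default; // stack full, or the stream has no stack
    }

private:
    std::size_t stackSize_;
    int freeStacks_;
    std::vector<bool> hasStack_;
    std::vector<std::size_t> used_;
};
```

With a 64 MiB stack size and two stacks, a third registered stream is always routed to Source::Default in this model, and so is any request on the first stream once its 64 MiB are used up.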
Warning
When utilizing StackAllocator, deallocation order is important.

Just like a stack, deallocation must be done in LIFO order. Below is an example of erroneous usage that violates the LIFO rule. If OpenCV is compiled in Debug mode, this sample code will trigger a CV_Assert error.

int main()
{
    setBufferPoolUsage(true); // Tell OpenCV that we are going to utilize BufferPool
    Stream stream; // A default size (10 MB) stack is allocated to this stream
    BufferPool pool(stream);
    GpuMat mat1 = pool.getBuffer(1024, 1024, CV_8UC1); // Allocate mat1 (1MB)
    GpuMat mat2 = pool.getBuffer(1024, 1024, CV_8UC1); // Allocate mat2 (1MB)
    mat1.release(); // erroneous usage : mat2 must be deallocated before mat1
}
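
Why the release order matters can be seen in a toy stack that only accepts the release of its topmost block. This is a sketch of the LIFO rule itself, not of OpenCV's internal check: ToyLifoStack, allocate and release are hypothetical names, offsets are used as handles, and block sizes are assumed to be non-zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tracker that enforces LIFO release order.
class ToyLifoStack {
public:
    // Push a block of `bytes` (assumed non-zero) and return its offset,
    // which serves as a handle in this model.
    std::size_t allocate(std::size_t bytes) {
        const std::size_t offset = top_;
        sizes_.push_back(bytes);
        top_ += bytes;
        return offset;
    }

    // A release succeeds only if `offset` identifies the most recently
    // allocated block that is still live; otherwise nothing changes.
    bool release(std::size_t offset) {
        if (sizes_.empty() || offset + sizes_.back() != top_) return false;
        top_ -= sizes_.back();
        sizes_.pop_back();
        return true;
    }

private:
    std::vector<std::size_t> sizes_; // sizes of live blocks, bottom to top
    std::size_t top_ = 0;            // offset of the first free byte
};
```

Allocating two blocks and releasing the first one before the second is rejected by this model, which mirrors the erroneous mat1.release() call in the sample.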

Since C++ local variables are destroyed in the reverse order of construction, the code sample below satisfies the LIFO rule. Local GpuMat objects are destroyed at the end of each loop iteration, and the corresponding memory is automatically returned to the pool for later use.

int main()
{
    setBufferPoolUsage(true); // Tell OpenCV that we are going to utilize BufferPool
    setBufferPoolConfig(getDevice(), 1024 * 1024 * 64, 2); // Allocate 64 MB, 2 stacks (default is 10 MB, 5 stacks)
    Stream stream1, stream2; // Each stream uses 1 stack
    BufferPool pool1(stream1), pool2(stream2);
    for (int i = 0; i < 10; i++)
    {
        GpuMat d_src1 = pool1.getBuffer(4096, 4096, CV_8UC1); // 16MB
        GpuMat d_dst1 = pool1.getBuffer(4096, 4096, CV_8UC3); // 48MB, pool1 is now full
        GpuMat d_src2 = pool2.getBuffer(1024, 1024, CV_8UC1); // 1MB
        GpuMat d_dst2 = pool2.getBuffer(1024, 1024, CV_8UC3); // 3MB
        d_src1.setTo(Scalar(i), stream1);
        d_src2.setTo(Scalar(i), stream2);
        cvtColor(d_src1, d_dst1, cv::COLOR_GRAY2BGR, 0, stream1);
        cvtColor(d_src2, d_dst2, cv::COLOR_GRAY2BGR, 0, stream2);
        // The order of destruction of the local variables is:
        // d_dst2 => d_src2 => d_dst1 => d_src1
        // LIFO rule is satisfied, this code runs without error
    }
}
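
The destruction order stated in the comments follows from the C++ rule that automatic objects are destroyed in reverse order of construction. A small stand-alone sketch, independent of OpenCV, makes this observable; Tracked and destructionOrder are hypothetical names, and Tracked merely stands in for a pooled GpuMat.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Appends its name to a shared log when destroyed (hypothetical stand-in
// for a GpuMat that returns its memory to the pool on destruction).
struct Tracked {
    std::string name;
    std::vector<std::string>* log;

    Tracked(std::string n, std::vector<std::string>* l)
        : name(std::move(n)), log(l) {}
    ~Tracked() { log->push_back(name); }

    // Non-copyable so that each object logs exactly once.
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
};

// Declares four locals in the same order as the loop body of the sample
// and returns the order in which they were destroyed.
std::vector<std::string> destructionOrder() {
    std::vector<std::string> log;
    {
        Tracked d_src1("d_src1", &log);
        Tracked d_dst1("d_dst1", &log);
        Tracked d_src2("d_src2", &log);
        Tracked d_dst2("d_dst2", &log);
    } // all four are destroyed here, last declared first
    return log;
}
```

Calling destructionOrder() yields d_dst2, d_src2, d_dst1, d_src1, the same sequence as in the comments of the sample.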

Constructor & Destructor Documentation

◆ BufferPool()

cv::cuda::BufferPool::BufferPool ( Stream & stream)
explicit
Python:
cv.cuda.BufferPool(stream) -> <cuda_BufferPool object>

Gets the BufferPool for the given stream.

Member Function Documentation

◆ getAllocator()

Ptr< GpuMat::Allocator > cv::cuda::BufferPool::getAllocator ( ) const
inline
Python:
cv.cuda.BufferPool.getAllocator() -> retval

Returns the allocator associated with the stream.

◆ getBuffer() [1/2]

GpuMat cv::cuda::BufferPool::getBuffer ( int rows,
int cols,
int type )
Python:
cv.cuda.BufferPool.getBuffer(rows, cols, type) -> retval
cv.cuda.BufferPool.getBuffer(size, type) -> retval

Allocates a new GpuMat of given size and type.

◆ getBuffer() [2/2]

GpuMat cv::cuda::BufferPool::getBuffer ( Size size,
int type )
inline
Python:
cv.cuda.BufferPool.getBuffer(rows, cols, type) -> retval
cv.cuda.BufferPool.getBuffer(size, type) -> retval

Allocates a new GpuMat of given size and type.

