OpenCV
5.0.0alpha
Open Source Computer Vision
Topics | |
Private implementation helpers | |
"Universal intrinsics" is a set of types and functions intended to simplify vectorization of code across different platforms. Several SIMD extensions on different architectures are currently supported. 128-bit registers of various types are implemented for a wide range of architectures, including x86 (SSE/SSE2/SSE4.2), ARM (NEON), PowerPC (VSX), and MIPS (MSA). 256-bit registers are supported on x86 (AVX2), and 512-bit registers are supported on x86 (AVX512). If no SIMD extension is available during compilation, a fallback C++ implementation of the intrinsics is chosen; the code works as expected, although it may be slower.
There are several types representing packed-value vector registers; each type is implemented as a structure based on one SIMD register.
The exact bit length (and number of values) of the listed types is deduced at compile time and depends on the SIMD capabilities of the architecture selected when the library was compiled. Every type contains an nlanes enumeration that gives the exact number of values it holds.
If the exact bit length of a type matters, specific fixed-length register types can be used.
There are several types representing 128-bit registers.
There are several types representing 256-bit registers.
There are several types representing 512-bit registers.
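The relation between register width and lane count can be sketched in plain C++. This is only an illustration: `lane_count` and the `kWidth*` constants are hypothetical names, not OpenCV identifiers, and the widths simply mirror the documented `cv::simd128_width`, `cv::simd256_width`, and `cv::simd512_width` values.

```cpp
#include <cstddef>
#include <cstdint>

// Register widths in bytes, mirroring the documented enum values
// cv::simd128_width, cv::simd256_width and cv::simd512_width.
constexpr std::size_t kWidth128 = 16;
constexpr std::size_t kWidth256 = 32;
constexpr std::size_t kWidth512 = 64;

// Hypothetical helper, not an OpenCV function: how many lanes of type T
// fit into a register that is width_bytes wide.
template <typename T>
constexpr std::size_t lane_count(std::size_t width_bytes) {
    return width_bytes / sizeof(T);
}

// The results line up with the lane counts in the fixed-length type names.
static_assert(lane_count<float>(kWidth128) == 4, "v_float32x4");
static_assert(lane_count<std::int16_t>(kWidth256) == 16, "v_int16x16");
static_assert(lane_count<std::uint8_t>(kWidth512) == 64, "v_uint8x64");
```

The same division appears in the return types of the load functions, for example `simd128_width/sizeof(_Tp)`.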
These operations set the contents of a register explicitly or load it from a memory block, and save the contents of a register to a memory block.
Variable-size register load operations return a result of the maximum available size, depending on the chosen platform capabilities.
There are also fixed-size register load/store operations.
For 128-bit registers
For 256-bit registers (check the CV_SIMD256 preprocessor definition)
For 512-bit registers (check the CV_SIMD512 preprocessor definition)
Store to memory operations are similar across different platform capabilities: v_store, v_store_aligned, v_store_high, v_store_low
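A typical use of load and store operations is a loop that processes full registers and then finishes the leftover elements one by one. The following is a scalar sketch of that pattern, not OpenCV code: `Reg`, `load`, `store`, `store_low`, `store_high`, and `scale` are hypothetical stand-ins that only model what the documented operations are described as doing for a four-lane float register.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar stand-in for a 128-bit register of four floats (like v_float32x4).
constexpr std::size_t kLanes = 4;
using Reg = std::array<float, kLanes>;

// Model of an unaligned load: copy kLanes consecutive values from memory.
Reg load(const float* ptr) {
    assert(ptr != nullptr);
    Reg r{};
    for (std::size_t i = 0; i < kLanes; ++i) r[i] = ptr[i];
    return r;
}

// Model of a full store: write all lanes back to memory.
void store(float* ptr, const Reg& r) {
    assert(ptr != nullptr);
    for (std::size_t i = 0; i < kLanes; ++i) ptr[i] = r[i];
}

// Models of storing only the lower or the higher half of the register.
void store_low(float* ptr, const Reg& r) {
    for (std::size_t i = 0; i < kLanes / 2; ++i) ptr[i] = r[i];
}
void store_high(float* ptr, const Reg& r) {
    for (std::size_t i = 0; i < kLanes / 2; ++i) ptr[i] = r[kLanes / 2 + i];
}

// Multiply a buffer by k: full registers first, then a scalar tail.
std::vector<float> scale(const std::vector<float>& src, float k) {
    std::vector<float> dst(src.size());
    std::size_t i = 0;
    for (; i + kLanes <= src.size(); i += kLanes) {
        Reg r = load(&src[i]);
        for (float& lane : r) lane *= k;
        store(&dst[i], r);
    }
    for (; i < src.size(); ++i) dst[i] = src[i] * k;  // leftover elements
    return dst;
}
```

With six input elements, the first four go through one register-sized step and the remaining two through the scalar tail.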
These operations reorder or recombine elements in one or multiple vectors.
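To make the lane movement concrete, here is a scalar sketch of four reordering operations on a four-lane register. It is an illustration under assumptions: the function names are hypothetical, and the lane orders follow the brief descriptions of `v_zip`, `v_combine_low`, `v_combine_high`, and `v_reverse` rather than any particular backend.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kN = 4;     // lanes per register in this sketch
using Reg = std::array<int, kN>;  // scalar stand-in for v_reg<int, 4>

// Model of v_zip: interleave the lanes of a0 and a1 across b0 and b1.
void zip(const Reg& a0, const Reg& a1, Reg& b0, Reg& b1) {
    for (std::size_t i = 0; i < kN / 2; ++i) {
        b0[2 * i] = a0[i];
        b0[2 * i + 1] = a1[i];
        b1[2 * i] = a0[kN / 2 + i];
        b1[2 * i + 1] = a1[kN / 2 + i];
    }
}

// Model of v_combine_low: first half of a followed by first half of b.
Reg combine_low(const Reg& a, const Reg& b) {
    Reg r{};
    for (std::size_t i = 0; i < kN / 2; ++i) {
        r[i] = a[i];
        r[kN / 2 + i] = b[i];
    }
    return r;
}

// Model of v_combine_high: last half of a followed by last half of b.
Reg combine_high(const Reg& a, const Reg& b) {
    Reg r{};
    for (std::size_t i = 0; i < kN / 2; ++i) {
        r[i] = a[kN / 2 + i];
        r[kN / 2 + i] = b[kN / 2 + i];
    }
    return r;
}

// Model of v_reverse: lanes in reverse order.
Reg reverse_lanes(const Reg& a) {
    Reg r{};
    for (std::size_t i = 0; i < kN; ++i) r[i] = a[kN - 1 - i];
    return r;
}

// Usage example for zip.
void example() {
    const Reg a = {{1, 2, 3, 4}};
    const Reg b = {{5, 6, 7, 8}};
    Reg lo{}, hi{};
    zip(a, b, lo, hi);
    const Reg zip_lo = {{1, 5, 2, 6}};
    const Reg zip_hi = {{3, 7, 4, 8}};
    assert(lo == zip_lo && hi == zip_hi);
}
```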
Element-wise binary and unary operations.
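The difference between saturating and wrapping arithmetic is easiest to see on 8-bit lanes. The sketch below is a scalar model with hypothetical names (`add_saturate`, `add_wrap`). It assumes that plain addition clamps 8-bit results, which is what the description of `v_add_wrap` as "Add values without saturation" suggests; it is not the library implementation.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kN = 16;
using U8x16 = std::array<std::uint8_t, kN>;  // stand-in for v_uint8x16

// Saturating add: each lane is clamped to the largest uint8 value.
U8x16 add_saturate(const U8x16& a, const U8x16& b) {
    U8x16 r{};
    for (std::size_t i = 0; i < kN; ++i) {
        const int sum = static_cast<int>(a[i]) + static_cast<int>(b[i]);
        r[i] = static_cast<std::uint8_t>(std::min(sum, 255));
    }
    return r;
}

// Wrapping add: each lane keeps only the low 8 bits of the sum.
U8x16 add_wrap(const U8x16& a, const U8x16& b) {
    U8x16 r{};
    for (std::size_t i = 0; i < kN; ++i) {
        r[i] = static_cast<std::uint8_t>(a[i] + b[i]);
    }
    return r;
}

// Usage example: 200 + 100 does not fit into 8 bits.
void example() {
    U8x16 a, b;
    a.fill(200);
    b.fill(100);
    assert(add_saturate(a, b)[0] == 255);  // 300 clamps to 255
    assert(add_wrap(a, b)[0] == 44);       // 300 modulo 256
}
```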
Most of these operations return only one value.
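As a rough illustration of operations that collapse a vector into a single value, the following scalar sketch models sum and max reductions together with the sign-based mask checks. The names are hypothetical; the behavior follows the brief descriptions of `v_reduce_sum`, `v_reduce_max`, `v_signmask`, `v_check_all`, and `v_check_any`, with bit i of the mask assumed to correspond to lane i.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kN = 4;
using Reg = std::array<int, kN>;  // scalar stand-in for v_int32x4

// Model of v_reduce_sum: add up all lanes.
int reduce_sum(const Reg& a) {
    int s = 0;
    for (int v : a) s += v;
    return s;
}

// Model of v_reduce_max: largest lane value.
int reduce_max(const Reg& a) {
    int m = a[0];
    for (int v : a) {
        if (v > m) m = v;
    }
    return m;
}

// Model of v_signmask: bit i is set when lane i is negative.
int signmask(const Reg& a) {
    int mask = 0;
    for (std::size_t i = 0; i < kN; ++i) {
        if (a[i] < 0) mask |= 1 << i;
    }
    return mask;
}

// Models of v_check_all / v_check_any: are all / any lanes below zero?
bool check_all(const Reg& a) { return signmask(a) == (1 << kN) - 1; }
bool check_any(const Reg& a) { return signmask(a) != 0; }

// Usage example with two negative and two positive lanes.
void example() {
    const Reg r = {{-1, -1, 1, 1}};
    assert(reduce_sum(r) == 0);
    assert(reduce_max(r) == 1);
    assert(signmask(r) == 3);  // binary 0011: lanes 0 and 1 are negative
    assert(!check_all(r));
    assert(check_any(r));
}
```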
Different type conversions and casts:
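Two common conversions are narrowing with saturation and widening. The sketch below models them in scalar C++ for signed 16-bit and 8-bit lanes. The names `pack`, `expand`, and `saturate_s8` are hypothetical, and the lane order (first input in the lower half, second input in the upper half) is an assumption based on how pack and expand operations are usually described.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using S16x8 = std::array<std::int16_t, 8>;  // stand-in for v_int16x8
using S8x16 = std::array<std::int8_t, 16>;  // stand-in for v_int8x16

// Clamp a value to the int8 range.
std::int8_t saturate_s8(int v) {
    return static_cast<std::int8_t>(std::max(-128, std::min(127, v)));
}

// Model of a saturating pack: two 16-bit vectors become one 8-bit vector.
S8x16 pack(const S16x8& a, const S16x8& b) {
    S8x16 r{};
    for (std::size_t i = 0; i < 8; ++i) {
        r[i] = saturate_s8(a[i]);
        r[8 + i] = saturate_s8(b[i]);
    }
    return r;
}

// Model of expand: one 8-bit vector becomes two 16-bit vectors.
void expand(const S8x16& a, S16x8& lo, S16x8& hi) {
    for (std::size_t i = 0; i < 8; ++i) {
        lo[i] = a[i];
        hi[i] = a[8 + i];
    }
}

// Usage example: out-of-range values are clamped by pack.
void example() {
    const S16x8 a = {{300, -300, 5, -5, 127, -128, 0, 1}};
    const S16x8 b = {{0, 0, 0, 0, 0, 0, 0, 0}};
    const S8x16 p = pack(a, b);
    assert(p[0] == 127 && p[1] == -128 && p[2] == 5);
    S16x8 lo{}, hi{};
    expand(p, lo, hi);
    assert(lo[0] == 127 && lo[1] == -128 && hi == b);
}
```

Packing followed by expanding returns the original values only when they already fit into the narrower type.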
In these operations vectors represent matrix rows/columns: v_dotprod, v_dotprod_fast, v_dotprod_expand, v_dotprod_expand_fast, v_matmul, v_transpose4x4
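A scalar sketch may help clarify two of these operations. It assumes that `v_dotprod` multiplies corresponding lanes and sums adjacent pairs into lanes of twice the width (consistent with the `w_type, n/2` return type), and that `v_matmul` computes a linear combination of four vectors weighted by the lanes of `v`. The function names below are hypothetical models, not the library code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using S16x8 = std::array<std::int16_t, 8>;  // stand-in for v_int16x8
using S32x4 = std::array<std::int32_t, 4>;  // stand-in for v_int32x4
using F32x4 = std::array<float, 4>;         // stand-in for v_float32x4

// Model of a pairwise dot product: lane i of the result is
// a[2i]*b[2i] + a[2i+1]*b[2i+1], computed in 32 bits.
// Overflow of the 32-bit sum is not handled in this sketch.
S32x4 dotprod(const S16x8& a, const S16x8& b) {
    S32x4 r{};
    for (std::size_t i = 0; i < 4; ++i) {
        r[i] = static_cast<std::int32_t>(a[2 * i]) * b[2 * i] +
               static_cast<std::int32_t>(a[2 * i + 1]) * b[2 * i + 1];
    }
    return r;
}

// Model of a 4x4 matrix multiply: v[0]*m0 + v[1]*m1 + v[2]*m2 + v[3]*m3.
F32x4 matmul(const F32x4& v, const F32x4& m0, const F32x4& m1,
             const F32x4& m2, const F32x4& m3) {
    F32x4 r{};
    for (std::size_t i = 0; i < 4; ++i) {
        r[i] = v[0] * m0[i] + v[1] * m1[i] + v[2] * m2[i] + v[3] * m3[i];
    }
    return r;
}

// Usage example.
void example() {
    const S16x8 a = {{1, 2, 3, 4, 5, 6, 7, 8}};
    const S16x8 b = {{2, 2, 2, 2, 2, 2, 2, 2}};
    const S32x4 expected = {{6, 14, 22, 30}};
    assert(dotprod(a, b) == expected);

    // Multiplying by the identity matrix returns v unchanged.
    const F32x4 v = {{1.0f, 2.0f, 3.0f, 4.0f}};
    const F32x4 e0 = {{1.0f, 0.0f, 0.0f, 0.0f}};
    const F32x4 e1 = {{0.0f, 1.0f, 0.0f, 0.0f}};
    const F32x4 e2 = {{0.0f, 0.0f, 1.0f, 0.0f}};
    const F32x4 e3 = {{0.0f, 0.0f, 0.0f, 1.0f}};
    assert(matmul(v, e0, e1, e2, e3) == v);
}
```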
Most operations are implemented only for a subset of the available types; the following tables show which operations apply to which types.
Regular integers:
Operations\Types | uint 8 | int 8 | uint 16 | int 16 | uint 32 | int 32 |
---|---|---|---|---|---|---|
load, store | x | x | x | x | x | x |
interleave | x | x | x | x | x | x |
expand | x | x | x | x | x | x |
expand_low | x | x | x | x | x | x |
expand_high | x | x | x | x | x | x |
expand_q | x | x | | | | |
add, sub | x | x | x | x | x | x |
add_wrap, sub_wrap | x | x | x | x | | |
mul_wrap | x | x | x | x | | |
mul | x | x | x | x | x | x |
mul_expand | x | x | x | x | x | |
compare | x | x | x | x | x | x |
shift | | | x | x | x | x |
dotprod | | | | x | | x |
dotprod_fast | | | | x | | x |
dotprod_expand | x | x | x | x | | x |
dotprod_expand_fast | x | x | x | x | | x |
logical | x | x | x | x | x | x |
min, max | x | x | x | x | x | x |
absdiff | x | x | x | x | x | x |
absdiffs | | x | | x | | |
reduce | x | x | x | x | x | x |
mask | x | x | x | x | x | x |
pack | x | x | x | x | x | x |
pack_u | x | | x | | | |
pack_b | x | | | | | |
unpack | x | x | x | x | x | x |
extract | x | x | x | x | x | x |
rotate (lanes) | x | x | x | x | x | x |
cvt_flt32 | | | | | | x |
cvt_flt64 | | | | | | x |
transpose4x4 | | | | | x | x |
reverse | x | x | x | x | x | x |
extract_n | x | x | x | x | x | x |
broadcast_element | | | | | x | x |
Big integers:
Operations\Types | uint 64 | int 64 |
---|---|---|
load, store | x | x |
add, sub | x | x |
shift | x | x |
logical | x | x |
reverse | x | x |
extract | x | x |
rotate (lanes) | x | x |
cvt_flt64 | | x |
extract_n | x | x |
Floating point:
Operations\Types | float 32 | float 64 |
---|---|---|
load, store | x | x |
interleave | x | |
add, sub | x | x |
mul | x | x |
div | x | x |
compare | x | x |
min, max | x | x |
absdiff | x | x |
reduce | x | |
mask | x | x |
unpack | x | x |
cvt_flt32 | | x |
cvt_flt64 | x | |
sqrt, abs | x | x |
float math | x | x |
transpose4x4 | x | |
extract | x | x |
rotate (lanes) | x | x |
reverse | x | x |
extract_n | x | x |
broadcast_element | x | |
exp | x | x |
log | x | x |
sin, cos | x | x |
Classes | |
struct | cv::v_reg< _Tp, n > |
Macros | |
#define | OPENCV_HAL_HAVE_PACK_STORE_BFLOAT16 1 |
#define | OPENCV_HAL_MATH_HAVE_EXP 1 |
Typedefs | |
typedef v_float32x16 | simd512::v_float32 |
Maximum available vector register capacity 32-bit floating point values (single precision) | |
typedef v_reg< float, 16 > | cv::v_float32x16 |
Sixteen 32-bit floating point values (single precision) | |
typedef v_reg< float, 4 > | cv::v_float32x4 |
Four 32-bit floating point values (single precision) | |
typedef v_reg< float, 8 > | cv::v_float32x8 |
Eight 32-bit floating point values (single precision) | |
typedef v_float64x8 | simd512::v_float64 |
Maximum available vector register capacity 64-bit floating point values (double precision) | |
typedef v_reg< double, 2 > | cv::v_float64x2 |
Two 64-bit floating point values (double precision) | |
typedef v_reg< double, 4 > | cv::v_float64x4 |
Four 64-bit floating point values (double precision) | |
typedef v_reg< double, 8 > | cv::v_float64x8 |
Eight 64-bit floating point values (double precision) | |
typedef v_int16x32 | simd512::v_int16 |
Maximum available vector register capacity 16-bit signed integer values. | |
typedef v_reg< short, 16 > | cv::v_int16x16 |
Sixteen 16-bit signed integer values. | |
typedef v_reg< short, 32 > | cv::v_int16x32 |
Thirty two 16-bit signed integer values. | |
typedef v_reg< short, 8 > | cv::v_int16x8 |
Eight 16-bit signed integer values. | |
typedef v_int32x16 | simd512::v_int32 |
Maximum available vector register capacity 32-bit signed integer values. | |
typedef v_reg< int, 16 > | cv::v_int32x16 |
Sixteen 32-bit signed integer values. | |
typedef v_reg< int, 4 > | cv::v_int32x4 |
Four 32-bit signed integer values. | |
typedef v_reg< int, 8 > | cv::v_int32x8 |
Eight 32-bit signed integer values. | |
typedef v_int64x8 | simd512::v_int64 |
Maximum available vector register capacity 64-bit signed integer values. | |
typedef v_reg< int64, 2 > | cv::v_int64x2 |
Two 64-bit signed integer values. | |
typedef v_reg< int64, 4 > | cv::v_int64x4 |
Four 64-bit signed integer values. | |
typedef v_reg< int64, 8 > | cv::v_int64x8 |
Eight 64-bit signed integer values. | |
typedef v_int8x64 | simd512::v_int8 |
Maximum available vector register capacity 8-bit signed integer values. | |
typedef v_reg< schar, 16 > | cv::v_int8x16 |
Sixteen 8-bit signed integer values. | |
typedef v_reg< schar, 32 > | cv::v_int8x32 |
Thirty two 8-bit signed integer values. | |
typedef v_reg< schar, 64 > | cv::v_int8x64 |
Sixty four 8-bit signed integer values. | |
typedef v_uint16x32 | simd512::v_uint16 |
Maximum available vector register capacity 16-bit unsigned integer values. | |
typedef v_reg< ushort, 16 > | cv::v_uint16x16 |
Sixteen 16-bit unsigned integer values. | |
typedef v_reg< ushort, 32 > | cv::v_uint16x32 |
Thirty two 16-bit unsigned integer values. | |
typedef v_reg< ushort, 8 > | cv::v_uint16x8 |
Eight 16-bit unsigned integer values. | |
typedef v_uint32x16 | simd512::v_uint32 |
Maximum available vector register capacity 32-bit unsigned integer values. | |
typedef v_reg< unsigned, 16 > | cv::v_uint32x16 |
Sixteen 32-bit unsigned integer values. | |
typedef v_reg< unsigned, 4 > | cv::v_uint32x4 |
Four 32-bit unsigned integer values. | |
typedef v_reg< unsigned, 8 > | cv::v_uint32x8 |
Eight 32-bit unsigned integer values. | |
typedef v_uint64x8 | simd512::v_uint64 |
Maximum available vector register capacity 64-bit unsigned integer values. | |
typedef v_reg< uint64, 2 > | cv::v_uint64x2 |
Two 64-bit unsigned integer values. | |
typedef v_reg< uint64, 4 > | cv::v_uint64x4 |
Four 64-bit unsigned integer values. | |
typedef v_reg< uint64, 8 > | cv::v_uint64x8 |
Eight 64-bit unsigned integer values. | |
typedef v_uint8x64 | simd512::v_uint8 |
Maximum available vector register capacity 8-bit unsigned integer values. | |
typedef v_reg< uchar, 16 > | cv::v_uint8x16 |
Sixteen 8-bit unsigned integer values. | |
typedef v_reg< uchar, 32 > | cv::v_uint8x32 |
Thirty two 8-bit unsigned integer values. | |
typedef v_reg< uchar, 64 > | cv::v_uint8x64 |
Sixty four 8-bit unsigned integer values. | |
Enumerations | |
enum | { cv::simd128_width = 16 , cv::simd256_width = 32 , cv::simd512_width = 64 , cv::simdmax_width = simd512_width } |
Functions | |
void | cv::v256_cleanup () |
template<typename _Tp > | |
v_reg< _Tp, simd256_width/sizeof(_Tp)> | cv::v256_load (const _Tp *ptr) |
Load 256-bit length register contents from memory. | |
template<typename _Tp > | |
v_reg< _Tp, simd256_width/sizeof(_Tp)> | cv::v256_load_aligned (const _Tp *ptr) |
Load register contents from memory (aligned) | |
template<typename _Tp > | |
v_reg< typename V_TypeTraits< _Tp >::w_type, simd256_width/sizeof(typename V_TypeTraits< _Tp >::w_type)> | cv::v256_load_expand (const _Tp *ptr) |
Load register contents from memory with double expand. | |
v_reg< float, simd256_width/sizeof(float)> | cv::v256_load_expand (const hfloat *ptr) |
template<typename _Tp > | |
v_reg< typename V_TypeTraits< _Tp >::q_type, simd256_width/sizeof(typename V_TypeTraits< _Tp >::q_type)> | cv::v256_load_expand_q (const _Tp *ptr) |
Load register contents from memory with quad expand. | |
template<typename _Tp > | |
v_reg< _Tp, simd256_width/sizeof(_Tp)> | cv::v256_load_halves (const _Tp *loptr, const _Tp *hiptr) |
Load register contents from two memory blocks. | |
template<typename _Tp > | |
v_reg< _Tp, simd256_width/sizeof(_Tp)> | cv::v256_load_low (const _Tp *ptr) |
Load 128-bits of data to lower part (high part is undefined). | |
void | cv::v512_cleanup () |
template<typename _Tp > | |
v_reg< _Tp, simd512_width/sizeof(_Tp)> | cv::v512_load (const _Tp *ptr) |
Load 512-bit length register contents from memory. | |
template<typename _Tp > | |
v_reg< _Tp, simd512_width/sizeof(_Tp)> | cv::v512_load_aligned (const _Tp *ptr) |
Load register contents from memory (aligned) | |
template<typename _Tp > | |
v_reg< typename V_TypeTraits< _Tp >::w_type, simd512_width/sizeof(typename V_TypeTraits< _Tp >::w_type)> | cv::v512_load_expand (const _Tp *ptr) |
Load register contents from memory with double expand. | |
v_reg< float, simd512_width/sizeof(float)> | cv::v512_load_expand (const hfloat *ptr) |
template<typename _Tp > | |
v_reg< typename V_TypeTraits< _Tp >::q_type, simd512_width/sizeof(typename V_TypeTraits< _Tp >::q_type)> | cv::v512_load_expand_q (const _Tp *ptr) |
Load register contents from memory with quad expand. | |
template<typename _Tp > | |
v_reg< _Tp, simd512_width/sizeof(_Tp)> | cv::v512_load_halves (const _Tp *loptr, const _Tp *hiptr) |
Load register contents from two memory blocks. | |
template<typename _Tp > | |
v_reg< _Tp, simd512_width/sizeof(_Tp)> | cv::v512_load_low (const _Tp *ptr) |
Load 256-bits of data to lower part (high part is undefined). | |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::abs_type, n > | cv::v_abs (const v_reg< _Tp, n > &a) |
Absolute value of elements. | |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::abs_type, n > | cv::v_absdiff (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Absolute difference. | |
template<int n> | |
v_reg< double, n > | cv::v_absdiff (const v_reg< double, n > &a, const v_reg< double, n > &b) |
template<int n> | |
v_reg< float, n > | cv::v_absdiff (const v_reg< float, n > &a, const v_reg< float, n > &b) |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_absdiffs (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Saturating absolute difference. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_add (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Add values. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_add_wrap (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Add values without saturation. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_and (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Bitwise AND. | |
template<int i, typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_broadcast_element (const v_reg< _Tp, n > &a) |
Broadcast i-th element of vector. | |
template<int n> | |
v_reg< int, n *2 > | cv::v_ceil (const v_reg< double, n > &a) |
template<int n> | |
v_reg< int, n > | cv::v_ceil (const v_reg< float, n > &a) |
Ceil elements. | |
template<typename _Tp , int n> | |
bool | cv::v_check_all (const v_reg< _Tp, n > &a) |
Check if all packed values are less than zero. | |
template<typename _Tp , int n> | |
bool | cv::v_check_any (const v_reg< _Tp, n > &a) |
Check if any of packed values is less than zero. | |
void | cv::v_cleanup () |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_combine_high (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Combine vector from last elements of two vectors. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_combine_low (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Combine vector from first elements of two vectors. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_cos (const v_reg< _Tp, n > &a) |
Cosine \( cos(x) \) of elements. | |
template<int n> | |
v_reg< float, n *2 > | cv::v_cvt_f32 (const v_reg< double, n > &a) |
Convert lower half to float. | |
template<int n> | |
v_reg< float, n *2 > | cv::v_cvt_f32 (const v_reg< double, n > &a, const v_reg< double, n > &b) |
Convert to float. | |
template<int n> | |
v_reg< float, n > | cv::v_cvt_f32 (const v_reg< int, n > &a) |
Convert to float. | |
template<int n> | |
v_reg< double,(n/2)> | cv::v_cvt_f64 (const v_reg< float, n > &a) |
Convert lower half to double. | |
template<int n> | |
v_reg< double, n/2 > | cv::v_cvt_f64 (const v_reg< int, n > &a) |
Convert lower half to double. | |
template<int n> | |
v_reg< double, n > | cv::v_cvt_f64 (const v_reg< int64, n > &a) |
Convert to double. | |
template<int n> | |
v_reg< double,(n/2)> | cv::v_cvt_f64_high (const v_reg< float, n > &a) |
Convert to double high part of vector. | |
template<int n> | |
v_reg< double,(n/2)> | cv::v_cvt_f64_high (const v_reg< int, n > &a) |
Convert to double high part of vector. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_div (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Divide values. | |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > | cv::v_dotprod (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Dot product of elements. | |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > | cv::v_dotprod (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &c) |
Dot product of elements. | |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > | cv::v_dotprod_expand (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Dot product of elements and expand. | |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > | cv::v_dotprod_expand (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > &c) |
Dot product of elements. | |
template<int n> | |
v_reg< double, n/2 > | cv::v_dotprod_expand (const v_reg< int, n > &a, const v_reg< int, n > &b) |
template<int n> | |
v_reg< double, n/2 > | cv::v_dotprod_expand (const v_reg< int, n > &a, const v_reg< int, n > &b, const v_reg< double, n/2 > &c) |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > | cv::v_dotprod_expand_fast (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Fast Dot product of elements and expand. | |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > | cv::v_dotprod_expand_fast (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > &c) |
Fast Dot product of elements. | |
template<int n> | |
v_reg< double, n/2 > | cv::v_dotprod_expand_fast (const v_reg< int, n > &a, const v_reg< int, n > &b) |
template<int n> | |
v_reg< double, n/2 > | cv::v_dotprod_expand_fast (const v_reg< int, n > &a, const v_reg< int, n > &b, const v_reg< double, n/2 > &c) |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > | cv::v_dotprod_fast (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Fast Dot product of elements. | |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > | cv::v_dotprod_fast (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &c) |
Fast Dot product of elements. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_eq (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Equal comparison. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_erf (const v_reg< _Tp, n > &a) |
Error function. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_exp (const v_reg< _Tp, n > &a) |
Exponential \( e^x \) of elements. | |
template<typename _Tp , int n> | |
void | cv::v_expand (const v_reg< _Tp, n > &a, v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &b0, v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &b1) |
Expand values to the wider pack type. | |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > | cv::v_expand_high (const v_reg< _Tp, n > &a) |
Expand higher values to the wider pack type. | |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > | cv::v_expand_low (const v_reg< _Tp, n > &a) |
Expand lower values to the wider pack type. | |
template<int s, typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_extract (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Vector extract. | |
template<int s, typename _Tp , int n> | |
_Tp | cv::v_extract_n (const v_reg< _Tp, n > &v) |
Vector extract. | |
template<int n> | |
v_reg< int, n *2 > | cv::v_floor (const v_reg< double, n > &a) |
template<int n> | |
v_reg< int, n > | cv::v_floor (const v_reg< float, n > &a) |
Floor elements. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_fma (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< _Tp, n > &c) |
Multiply and add. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_ge (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Greater-than or equal comparison. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_gt (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Greater-than comparison. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_interleave_pairs (const v_reg< _Tp, n > &vec) |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_interleave_quads (const v_reg< _Tp, n > &vec) |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_invsqrt (const v_reg< _Tp, n > &a) |
Inverse square root. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_le (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Less-than or equal comparison. | |
template<typename _Tp > | |
v_reg< _Tp, simd128_width/sizeof(_Tp)> | cv::v_load (const _Tp *ptr) |
Load register contents from memory. | |
template<typename _Tp > | |
v_reg< _Tp, simd128_width/sizeof(_Tp)> | cv::v_load_aligned (const _Tp *ptr) |
Load register contents from memory (aligned) | |
template<typename _Tp , int n> | |
void | cv::v_load_deinterleave (const _Tp *ptr, v_reg< _Tp, n > &a, v_reg< _Tp, n > &b) |
Load and deinterleave (2 channels) | |
template<typename _Tp , int n> | |
void | cv::v_load_deinterleave (const _Tp *ptr, v_reg< _Tp, n > &a, v_reg< _Tp, n > &b, v_reg< _Tp, n > &c) |
Load and deinterleave (3 channels) | |
template<typename _Tp , int n> | |
void | cv::v_load_deinterleave (const _Tp *ptr, v_reg< _Tp, n > &a, v_reg< _Tp, n > &b, v_reg< _Tp, n > &c, v_reg< _Tp, n > &d) |
Load and deinterleave (4 channels) | |
template<typename _Tp > | |
v_reg< typename V_TypeTraits< _Tp >::w_type, simd128_width/sizeof(typename V_TypeTraits< _Tp >::w_type)> | cv::v_load_expand (const _Tp *ptr) |
Load register contents from memory with double expand. | |
v_reg< float, simd128_width/sizeof(float)> | cv::v_load_expand (const hfloat *ptr) |
template<typename _Tp > | |
v_reg< typename V_TypeTraits< _Tp >::q_type, simd128_width/sizeof(typename V_TypeTraits< _Tp >::q_type)> | cv::v_load_expand_q (const _Tp *ptr) |
Load register contents from memory with quad expand. | |
template<typename _Tp > | |
v_reg< _Tp, simd128_width/sizeof(_Tp)> | cv::v_load_halves (const _Tp *loptr, const _Tp *hiptr) |
Load register contents from two memory blocks. | |
template<typename _Tp > | |
v_reg< _Tp, simd128_width/sizeof(_Tp)> | cv::v_load_low (const _Tp *ptr) |
Load 64-bits of data to lower part (high part is undefined). | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_log (const v_reg< _Tp, n > &a) |
Natural logarithm \( \log(x) \) of elements. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_lt (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Less-than comparison. | |
template<typename _Tp > | |
v_reg< _Tp, simd128_width/sizeof(_Tp)> | cv::v_lut (const _Tp *tab, const int *idx) |
template<int n> | |
v_reg< double, n/2 > | cv::v_lut (const double *tab, const v_reg< int, n > &idx) |
template<int n> | |
v_reg< float, n > | cv::v_lut (const float *tab, const v_reg< int, n > &idx) |
template<int n> | |
v_reg< int, n > | cv::v_lut (const int *tab, const v_reg< int, n > &idx) |
template<int n> | |
v_reg< unsigned, n > | cv::v_lut (const unsigned *tab, const v_reg< int, n > &idx) |
template<int n> | |
void | cv::v_lut_deinterleave (const double *tab, const v_reg< int, n *2 > &idx, v_reg< double, n > &x, v_reg< double, n > &y) |
template<int n> | |
void | cv::v_lut_deinterleave (const float *tab, const v_reg< int, n > &idx, v_reg< float, n > &x, v_reg< float, n > &y) |
template<typename _Tp > | |
v_reg< _Tp, simd128_width/sizeof(_Tp)> | cv::v_lut_pairs (const _Tp *tab, const int *idx) |
template<typename _Tp > | |
v_reg< _Tp, simd128_width/sizeof(_Tp)> | cv::v_lut_quads (const _Tp *tab, const int *idx) |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_magnitude (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Magnitude. | |
template<int n> | |
v_reg< float, n > | cv::v_matmul (const v_reg< float, n > &v, const v_reg< float, n > &a, const v_reg< float, n > &b, const v_reg< float, n > &c, const v_reg< float, n > &d) |
Matrix multiplication. | |
template<int n> | |
v_reg< float, n > | cv::v_matmuladd (const v_reg< float, n > &v, const v_reg< float, n > &a, const v_reg< float, n > &b, const v_reg< float, n > &c, const v_reg< float, n > &d) |
Matrix multiplication and add. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_max (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Choose max values for each pair. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_min (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Choose min values for each pair. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_mul (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Multiply values. | |
template<typename _Tp , int n> | |
void | cv::v_mul_expand (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &c, v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &d) |
Multiply and expand. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_mul_hi (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Multiply and extract high part. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_mul_wrap (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Multiply values without saturation. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_muladd (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< _Tp, n > &c) |
A synonym for v_fma. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_ne (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Not equal comparison. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_not (const v_reg< _Tp, n > &a) |
Bitwise NOT. | |
template<int n> | |
v_reg< double, n > | cv::v_not_nan (const v_reg< double, n > &a) |
template<int n> | |
v_reg< float, n > | cv::v_not_nan (const v_reg< float, n > &a) |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_or (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Bitwise OR. | |
template<int n> | |
void | cv::v_pack_store (hfloat *ptr, const v_reg< float, n > &v) |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_pack_triplets (const v_reg< _Tp, n > &vec) |
template<typename _Tp , int n> | |
v_reg< typename V_TypeTraits< _Tp >::abs_type, n > | cv::v_popcount (const v_reg< _Tp, n > &a) |
Count the 1 bits in the vector lanes and return result as corresponding unsigned type. | |
template<typename _Tp , int n> | |
void | cv::v_recombine (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, v_reg< _Tp, n > &low, v_reg< _Tp, n > &high) |
Combine two vectors from lower and higher parts of two other vectors. | |
template<typename _Tp , int n> | |
_Tp | cv::v_reduce_max (const v_reg< _Tp, n > &a) |
Find one max value. | |
template<typename _Tp , int n> | |
_Tp | cv::v_reduce_min (const v_reg< _Tp, n > &a) |
Find one min value. | |
template<typename _Tp , int n> | |
V_TypeTraits< typename V_TypeTraits< _Tp >::abs_type >::sum_type | cv::v_reduce_sad (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Sum absolute differences of values. | |
template<typename _Tp , int n> | |
V_TypeTraits< _Tp >::sum_type | cv::v_reduce_sum (const v_reg< _Tp, n > &a) |
Sum packed values. | |
template<int n> | |
v_reg< float, n > | cv::v_reduce_sum4 (const v_reg< float, n > &a, const v_reg< float, n > &b, const v_reg< float, n > &c, const v_reg< float, n > &d) |
Sums all elements of each input vector, returns the vector of sums. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_reverse (const v_reg< _Tp, n > &a) |
Vector reverse order. | |
template<int imm, typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_rotate_left (const v_reg< _Tp, n > &a) |
Shift elements left within the vector. | |
template<int imm, typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_rotate_left (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
template<int imm, typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_rotate_right (const v_reg< _Tp, n > &a) |
Shift elements right within the vector. | |
template<int imm, typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_rotate_right (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
template<int n> | |
v_reg< int, n *2 > | cv::v_round (const v_reg< double, n > &a) |
template<int n> | |
v_reg< int, n *2 > | cv::v_round (const v_reg< double, n > &a, const v_reg< double, n > &b) |
template<int n> | |
v_reg< int, n > | cv::v_round (const v_reg< float, n > &a) |
Round elements. | |
template<typename _Tp , int n> | |
int | cv::v_scan_forward (const v_reg< _Tp, n > &a) |
Get first negative lane index. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_select (const v_reg< _Tp, n > &mask, const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Per-element select (blend operation) | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_shl (const v_reg< _Tp, n > &a, int imm) |
Bitwise shift left. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_shr (const v_reg< _Tp, n > &a, int imm) |
Bitwise shift right. | |
template<typename _Tp , int n> | |
int | cv::v_signmask (const v_reg< _Tp, n > &a) |
Get negative values mask. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_sin (const v_reg< _Tp, n > &a) |
Sine \( sin(x) \) of elements. | |
template<typename _Tp , int n> | |
void | cv::v_sincos (const v_reg< _Tp, n > &x, v_reg< _Tp, n > &s, v_reg< _Tp, n > &c) |
Compute sine \( sin(x) \) and cosine \( cos(x) \) of elements at the same time. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_sqr_magnitude (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Square of the magnitude. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_sqrt (const v_reg< _Tp, n > &a) |
Square root of elements. | |
template<typename _Tp , int n> | |
void | cv::v_store (_Tp *ptr, const v_reg< _Tp, n > &a) |
Store data to memory. | |
template<typename _Tp , int n> | |
void | cv::v_store (_Tp *ptr, const v_reg< _Tp, n > &a, hal::StoreMode) |
template<typename _Tp , int n> | |
void | cv::v_store_aligned (_Tp *ptr, const v_reg< _Tp, n > &a) |
Store data to memory (aligned) | |
template<typename _Tp , int n> | |
void | cv::v_store_aligned (_Tp *ptr, const v_reg< _Tp, n > &a, hal::StoreMode) |
template<typename _Tp , int n> | |
void | cv::v_store_aligned_nocache (_Tp *ptr, const v_reg< _Tp, n > &a) |
template<typename _Tp , int n> | |
void | cv::v_store_high (_Tp *ptr, const v_reg< _Tp, n > &a) |
Store data to memory (higher half) | |
template<typename _Tp , int n> | |
void | cv::v_store_interleave (_Tp *ptr, const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< _Tp, n > &c, const v_reg< _Tp, n > &d, hal::StoreMode=hal::STORE_UNALIGNED) |
Interleave and store (4 channels) | |
template<typename _Tp , int n> | |
void | cv::v_store_interleave (_Tp *ptr, const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< _Tp, n > &c, hal::StoreMode=hal::STORE_UNALIGNED) |
Interleave and store (3 channels) | |
template<typename _Tp , int n> | |
void | cv::v_store_interleave (_Tp *ptr, const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, hal::StoreMode=hal::STORE_UNALIGNED) |
Interleave and store (2 channels) | |
template<typename _Tp , int n> | |
void | cv::v_store_low (_Tp *ptr, const v_reg< _Tp, n > &a) |
Store data to memory (lower half) | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_sub (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Subtract values. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_sub_wrap (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Subtract values without saturation. | |
template<typename _Tp , int n> | |
void | cv::v_transpose4x4 (v_reg< _Tp, n > &a0, const v_reg< _Tp, n > &a1, const v_reg< _Tp, n > &a2, const v_reg< _Tp, n > &a3, v_reg< _Tp, n > &b0, v_reg< _Tp, n > &b1, v_reg< _Tp, n > &b2, v_reg< _Tp, n > &b3) |
Transpose 4x4 matrix. | |
template<int n> | |
v_reg< int, n *2 > | cv::v_trunc (const v_reg< double, n > &a) |
template<int n> | |
v_reg< int, n > | cv::v_trunc (const v_reg< float, n > &a) |
Truncate elements. | |
template<typename _Tp , int n> | |
v_reg< _Tp, n > | cv::v_xor (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b) |
Bitwise XOR. | |
template<typename _Tp , int n> | |
void | cv::v_zip (const v_reg< _Tp, n > &a0, const v_reg< _Tp, n > &a1, v_reg< _Tp, n > &b0, v_reg< _Tp, n > &b1) |
Interleave two vectors. | |
Variables | |
static const unsigned char | cv::popCountTable [] |
Reinterpret | |
Convert vector to different type without modifying underlying data. | |
template<typename _Tp0 , int n0> | |
v_reg< uchar, n0 *sizeof(_Tp0)/sizeof(uchar)> | cv::v_reinterpret_as_u8 (const v_reg< _Tp0, n0 > &a) |
template<typename _Tp0 , int n0> | |
v_reg< schar, n0 *sizeof(_Tp0)/sizeof(schar)> | cv::v_reinterpret_as_s8 (const v_reg< _Tp0, n0 > &a) |
template<typename _Tp0 , int n0> | |
v_reg< ushort, n0 *sizeof(_Tp0)/sizeof(ushort)> | cv::v_reinterpret_as_u16 (const v_reg< _Tp0, n0 > &a) |
template<typename _Tp0 , int n0> | |
v_reg< short, n0 *sizeof(_Tp0)/sizeof(short)> | cv::v_reinterpret_as_s16 (const v_reg< _Tp0, n0 > &a) |
template<typename _Tp0 , int n0> | |
v_reg< unsigned, n0 *sizeof(_Tp0)/sizeof(unsigned)> | cv::v_reinterpret_as_u32 (const v_reg< _Tp0, n0 > &a) |
template<typename _Tp0 , int n0> | |
v_reg< int, n0 *sizeof(_Tp0)/sizeof(int)> | cv::v_reinterpret_as_s32 (const v_reg< _Tp0, n0 > &a) |
template<typename _Tp0 , int n0> | |
v_reg< float, n0 *sizeof(_Tp0)/sizeof(float)> | cv::v_reinterpret_as_f32 (const v_reg< _Tp0, n0 > &a) |
template<typename _Tp0 , int n0> | |
v_reg< double, n0 *sizeof(_Tp0)/sizeof(double)> | cv::v_reinterpret_as_f64 (const v_reg< _Tp0, n0 > &a) |
template<typename _Tp0 , int n0> | |
v_reg< uint64, n0 *sizeof(_Tp0)/sizeof(uint64)> | cv::v_reinterpret_as_u64 (const v_reg< _Tp0, n0 > &a) |
template<typename _Tp0 , int n0> | |
v_reg< int64, n0 *sizeof(_Tp0)/sizeof(int64)> | cv::v_reinterpret_as_s64 (const v_reg< _Tp0, n0 > &a) |
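As a rough scalar illustration of reinterpretation (the bits stay the same, only the lane type changes), the sketch below copies the bytes of four float lanes into four 32-bit unsigned lanes. It uses only standard C++ and assumes IEEE 754 floats; the helper name `reinterpret_f32_as_u32` is hypothetical and not part of OpenCV.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical scalar model: view the bytes of 4 float lanes as 4 uint32 lanes.
// The total size (16 bytes) is unchanged, mirroring n0*sizeof(_Tp0)/sizeof(new type).
inline std::array<std::uint32_t, 4> reinterpret_f32_as_u32(const std::array<float, 4>& a)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "lane sizes must match");
    std::array<std::uint32_t, 4> r{};
    std::memcpy(r.data(), a.data(), sizeof(r)); // bit-for-bit copy, no value conversion
    return r;
}
```

With IEEE 754 floats, the lane holding 1.0f becomes 0x3F800000 rather than the integer 1.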
Left shift | |
Shift left | |
template<int shift, int n> | |
v_reg< ushort, n > | cv::v_shl (const v_reg< ushort, n > &a) |
template<int shift, int n> | |
v_reg< short, n > | cv::v_shl (const v_reg< short, n > &a) |
template<int shift, int n> | |
v_reg< unsigned, n > | cv::v_shl (const v_reg< unsigned, n > &a) |
template<int shift, int n> | |
v_reg< int, n > | cv::v_shl (const v_reg< int, n > &a) |
template<int shift, int n> | |
v_reg< uint64, n > | cv::v_shl (const v_reg< uint64, n > &a) |
template<int shift, int n> | |
v_reg< int64, n > | cv::v_shl (const v_reg< int64, n > &a) |
Right shift | |
Shift right | |
template<int shift, int n> | |
v_reg< ushort, n > | cv::v_shr (const v_reg< ushort, n > &a) |
template<int shift, int n> | |
v_reg< short, n > | cv::v_shr (const v_reg< short, n > &a) |
template<int shift, int n> | |
v_reg< unsigned, n > | cv::v_shr (const v_reg< unsigned, n > &a) |
template<int shift, int n> | |
v_reg< int, n > | cv::v_shr (const v_reg< int, n > &a) |
template<int shift, int n> | |
v_reg< uint64, n > | cv::v_shr (const v_reg< uint64, n > &a) |
template<int shift, int n> | |
v_reg< int64, n > | cv::v_shr (const v_reg< int64, n > &a) |
Rounding shift | |
Rounding shift right | |
template<int shift, int n> | |
v_reg< ushort, n > | cv::v_rshr (const v_reg< ushort, n > &a) |
template<int shift, int n> | |
v_reg< short, n > | cv::v_rshr (const v_reg< short, n > &a) |
template<int shift, int n> | |
v_reg< unsigned, n > | cv::v_rshr (const v_reg< unsigned, n > &a) |
template<int shift, int n> | |
v_reg< int, n > | cv::v_rshr (const v_reg< int, n > &a) |
template<int shift, int n> | |
v_reg< uint64, n > | cv::v_rshr (const v_reg< uint64, n > &a) |
template<int shift, int n> | |
v_reg< int64, n > | cv::v_rshr (const v_reg< int64, n > &a) |
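The difference between a plain and a rounding right shift can be sketched per lane in standard C++. This is a scalar model under the assumption that rounding means adding half of 2^shift before shifting; the names `shr_u16` and `rshr_u16` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Plain shift right: the low bits are simply discarded.
template<int shift>
inline std::uint16_t shr_u16(std::uint16_t a)
{
    return static_cast<std::uint16_t>(a >> shift);
}

// Rounding shift right: add half of 2^shift before shifting,
// so the result is rounded to the nearest integer.
template<int shift>
inline std::uint16_t rshr_u16(std::uint16_t a)
{
    static_assert(shift > 0 && shift < 16, "shift must be in 1..15");
    const std::uint32_t half = 1u << (shift - 1);
    return static_cast<std::uint16_t>((a + half) >> shift);
}
```

For example, 7 shifted right by 2 gives 1 with the plain shift and 2 with the rounding shift.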
Pack | |
Pack values from two vectors into one. The returned vector type has twice as many elements as the input vector type. The variant with the u suffix also converts to the corresponding unsigned type.
| |
template<int n> | |
v_reg< uchar, 2 *n > | cv::v_pack (const v_reg< ushort, n > &a, const v_reg< ushort, n > &b) |
template<int n> | |
v_reg< schar, 2 *n > | cv::v_pack (const v_reg< short, n > &a, const v_reg< short, n > &b) |
template<int n> | |
v_reg< ushort, 2 *n > | cv::v_pack (const v_reg< unsigned, n > &a, const v_reg< unsigned, n > &b) |
template<int n> | |
v_reg< short, 2 *n > | cv::v_pack (const v_reg< int, n > &a, const v_reg< int, n > &b) |
template<int n> | |
v_reg< unsigned, 2 *n > | cv::v_pack (const v_reg< uint64, n > &a, const v_reg< uint64, n > &b) |
template<int n> | |
v_reg< int, 2 *n > | cv::v_pack (const v_reg< int64, n > &a, const v_reg< int64, n > &b) |
template<int n> | |
v_reg< uchar, 2 *n > | cv::v_pack_u (const v_reg< short, n > &a, const v_reg< short, n > &b) |
template<int n> | |
v_reg< ushort, 2 *n > | cv::v_pack_u (const v_reg< int, n > &a, const v_reg< int, n > &b) |
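A minimal scalar sketch of packing, assuming that narrowing saturates each value to the range of the destination type. The 4-lane inputs and the names `clamp_int`, `pack_s16` and `pack_u_s16` are hypothetical and chosen only to keep the example short.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Clamp v to [lo, hi]; written out by hand to stay within C++11.
inline int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

// Hypothetical scalar model: two 4-lane int16 vectors become one 8-lane int8
// vector, assuming each value is saturated to the int8 range.
inline std::array<std::int8_t, 8> pack_s16(const std::array<std::int16_t, 4>& a,
                                           const std::array<std::int16_t, 4>& b)
{
    std::array<std::int8_t, 8> r{};
    for (int i = 0; i < 4; ++i)
    {
        r[i]     = static_cast<std::int8_t>(clamp_int(a[i], -128, 127)); // lanes 0..3 come from a
        r[i + 4] = static_cast<std::int8_t>(clamp_int(b[i], -128, 127)); // lanes 4..7 come from b
    }
    return r;
}

// Same idea for the "u" variant: signed 16-bit input, unsigned 8-bit output.
inline std::array<std::uint8_t, 8> pack_u_s16(const std::array<std::int16_t, 4>& a,
                                              const std::array<std::int16_t, 4>& b)
{
    std::array<std::uint8_t, 8> r{};
    for (int i = 0; i < 4; ++i)
    {
        r[i]     = static_cast<std::uint8_t>(clamp_int(a[i], 0, 255));
        r[i + 4] = static_cast<std::uint8_t>(clamp_int(b[i], 0, 255));
    }
    return r;
}
```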
Pack with rounding shift | |
Pack values from two vectors into one with a rounding shift. Values from the input vectors are shifted right, with rounding, by the number of bits given by the shift template parameter, converted to the narrower type, and returned in the result vector. The variant with the u suffix converts to an unsigned type.
| |
template<int shift, int n> | |
v_reg< uchar, 2 *n > | cv::v_rshr_pack (const v_reg< ushort, n > &a, const v_reg< ushort, n > &b) |
template<int shift, int n> | |
v_reg< schar, 2 *n > | cv::v_rshr_pack (const v_reg< short, n > &a, const v_reg< short, n > &b) |
template<int shift, int n> | |
v_reg< ushort, 2 *n > | cv::v_rshr_pack (const v_reg< unsigned, n > &a, const v_reg< unsigned, n > &b) |
template<int shift, int n> | |
v_reg< short, 2 *n > | cv::v_rshr_pack (const v_reg< int, n > &a, const v_reg< int, n > &b) |
template<int shift, int n> | |
v_reg< unsigned, 2 *n > | cv::v_rshr_pack (const v_reg< uint64, n > &a, const v_reg< uint64, n > &b) |
template<int shift, int n> | |
v_reg< int, 2 *n > | cv::v_rshr_pack (const v_reg< int64, n > &a, const v_reg< int64, n > &b) |
template<int shift, int n> | |
v_reg< uchar, 2 *n > | cv::v_rshr_pack_u (const v_reg< short, n > &a, const v_reg< short, n > &b) |
template<int shift, int n> | |
v_reg< ushort, 2 *n > | cv::v_rshr_pack_u (const v_reg< int, n > &a, const v_reg< int, n > &b) |
Pack and store | |
Store values from the input vector into memory with packing. Values are converted to the narrower type and stored into memory. The variant with the u suffix converts to the corresponding unsigned type.
| |
template<int n> | |
void | cv::v_pack_store (uchar *ptr, const v_reg< ushort, n > &a) |
template<int n> | |
void | cv::v_pack_store (schar *ptr, const v_reg< short, n > &a) |
template<int n> | |
void | cv::v_pack_store (ushort *ptr, const v_reg< unsigned, n > &a) |
template<int n> | |
void | cv::v_pack_store (short *ptr, const v_reg< int, n > &a) |
template<int n> | |
void | cv::v_pack_store (unsigned *ptr, const v_reg< uint64, n > &a) |
template<int n> | |
void | cv::v_pack_store (int *ptr, const v_reg< int64, n > &a) |
template<int n> | |
void | cv::v_pack_u_store (uchar *ptr, const v_reg< short, n > &a) |
template<int n> | |
void | cv::v_pack_u_store (ushort *ptr, const v_reg< int, n > &a) |
Pack and store with rounding shift | |
Store values from the input vector into memory with packing and a rounding shift. Values are shifted right, with rounding, by the number of bits given by the shift template parameter, converted to the narrower type, and stored into memory. The variant with the u suffix converts to an unsigned type.
| |
template<int shift, int n> | |
void | cv::v_rshr_pack_store (uchar *ptr, const v_reg< ushort, n > &a) |
template<int shift, int n> | |
void | cv::v_rshr_pack_store (schar *ptr, const v_reg< short, n > &a) |
template<int shift, int n> | |
void | cv::v_rshr_pack_store (ushort *ptr, const v_reg< unsigned, n > &a) |
template<int shift, int n> | |
void | cv::v_rshr_pack_store (short *ptr, const v_reg< int, n > &a) |
template<int shift, int n> | |
void | cv::v_rshr_pack_store (unsigned *ptr, const v_reg< uint64, n > &a) |
template<int shift, int n> | |
void | cv::v_rshr_pack_store (int *ptr, const v_reg< int64, n > &a) |
template<int shift, int n> | |
void | cv::v_rshr_pack_u_store (uchar *ptr, const v_reg< short, n > &a) |
template<int shift, int n> | |
void | cv::v_rshr_pack_u_store (ushort *ptr, const v_reg< int, n > &a) |
Pack boolean values | |
Pack boolean values from multiple vectors into one unsigned 8-bit integer vector.
| |
template<int n> | |
v_reg< uchar, 2 *n > | cv::v_pack_b (const v_reg< ushort, n > &a, const v_reg< ushort, n > &b) |
For 16-bit boolean values | |
template<int n> | |
v_reg< uchar, 4 *n > | cv::v_pack_b (const v_reg< unsigned, n > &a, const v_reg< unsigned, n > &b, const v_reg< unsigned, n > &c, const v_reg< unsigned, n > &d) |
template<int n> | |
v_reg< uchar, 8 *n > | cv::v_pack_b (const v_reg< uint64, n > &a, const v_reg< uint64, n > &b, const v_reg< uint64, n > &c, const v_reg< uint64, n > &d, const v_reg< uint64, n > &e, const v_reg< uint64, n > &f, const v_reg< uint64, n > &g, const v_reg< uint64, n > &h) |
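For the two-vector (16-bit) case, boolean packing can be sketched per lane as below. The sketch assumes that boolean lanes are either all ones (true) or all zeros (false), as comparison results usually are; the 4-lane inputs and the name `pack_b_u16` are hypothetical. The 32-bit and 64-bit overloads listed above take four and eight input vectors respectively.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model for the two-vector (16-bit) case. Keeping the low
// byte turns an all-ones lane (0xFFFF) into 0xFF and leaves 0 unchanged.
inline std::array<std::uint8_t, 8> pack_b_u16(const std::array<std::uint16_t, 4>& a,
                                              const std::array<std::uint16_t, 4>& b)
{
    std::array<std::uint8_t, 8> r{};
    for (int i = 0; i < 4; ++i)
    {
        r[i]     = static_cast<std::uint8_t>(a[i] & 0xFFu); // lanes 0..3 come from a
        r[i + 4] = static_cast<std::uint8_t>(b[i] & 0xFFu); // lanes 4..7 come from b
    }
    return r;
}
```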
#define OPENCV_HAL_HAVE_PACK_STORE_BFLOAT16 1 |
#include <opencv2/core/hal/intrin_cpp.hpp>
#define OPENCV_HAL_MATH_HAVE_EXP 1 |
#include <opencv2/core/hal/intrin_cpp.hpp>
typedef v_float32x16 simd512::v_float32 |
#include <opencv2/core/hal/intrin.hpp>
Maximum available vector register capacity 32-bit floating point values (single precision)
typedef v_reg<float, 16> cv::v_float32x16 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sixteen 32-bit floating point values (single precision)
typedef v_reg<float, 4> cv::v_float32x4 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Four 32-bit floating point values (single precision)
typedef v_reg<float, 8> cv::v_float32x8 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Eight 32-bit floating point values (single precision)
typedef v_float64x8 simd512::v_float64 |
#include <opencv2/core/hal/intrin.hpp>
Maximum available vector register capacity 64-bit floating point values (double precision)
typedef v_reg<double, 2> cv::v_float64x2 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Two 64-bit floating point values (double precision)
typedef v_reg<double, 4> cv::v_float64x4 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Four 64-bit floating point values (double precision)
typedef v_reg<double, 8> cv::v_float64x8 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Eight 64-bit floating point values (double precision)
typedef v_int16x32 simd512::v_int16 |
#include <opencv2/core/hal/intrin.hpp>
Maximum available vector register capacity 16-bit signed integer values.
typedef v_reg<short, 16> cv::v_int16x16 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sixteen 16-bit signed integer values.
typedef v_reg<short, 32> cv::v_int16x32 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Thirty-two 16-bit signed integer values.
typedef v_reg<short, 8> cv::v_int16x8 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Eight 16-bit signed integer values.
typedef v_int32x16 simd512::v_int32 |
#include <opencv2/core/hal/intrin.hpp>
Maximum available vector register capacity 32-bit signed integer values.
typedef v_reg<int, 16> cv::v_int32x16 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sixteen 32-bit signed integer values.
typedef v_reg<int, 4> cv::v_int32x4 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Four 32-bit signed integer values.
typedef v_reg<int, 8> cv::v_int32x8 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Eight 32-bit signed integer values.
typedef v_int64x8 simd512::v_int64 |
#include <opencv2/core/hal/intrin.hpp>
Maximum available vector register capacity 64-bit signed integer values.
typedef v_reg<int64, 2> cv::v_int64x2 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Two 64-bit signed integer values.
typedef v_reg<int64, 4> cv::v_int64x4 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Four 64-bit signed integer values.
typedef v_reg<int64, 8> cv::v_int64x8 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Eight 64-bit signed integer values.
typedef v_int8x64 simd512::v_int8 |
#include <opencv2/core/hal/intrin.hpp>
Maximum available vector register capacity 8-bit signed integer values.
typedef v_reg<schar, 16> cv::v_int8x16 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sixteen 8-bit signed integer values.
typedef v_reg<schar, 32> cv::v_int8x32 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Thirty-two 8-bit signed integer values.
typedef v_reg<schar, 64> cv::v_int8x64 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sixty-four 8-bit signed integer values.
typedef v_uint16x32 simd512::v_uint16 |
#include <opencv2/core/hal/intrin.hpp>
Maximum available vector register capacity 16-bit unsigned integer values.
typedef v_reg<ushort, 16> cv::v_uint16x16 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sixteen 16-bit unsigned integer values.
typedef v_reg<ushort, 32> cv::v_uint16x32 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Thirty-two 16-bit unsigned integer values.
typedef v_reg<ushort, 8> cv::v_uint16x8 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Eight 16-bit unsigned integer values.
typedef v_uint32x16 simd512::v_uint32 |
#include <opencv2/core/hal/intrin.hpp>
Maximum available vector register capacity 32-bit unsigned integer values.
typedef v_reg<unsigned, 16> cv::v_uint32x16 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sixteen 32-bit unsigned integer values.
typedef v_reg<unsigned, 4> cv::v_uint32x4 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Four 32-bit unsigned integer values.
typedef v_reg<unsigned, 8> cv::v_uint32x8 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Eight 32-bit unsigned integer values.
typedef v_uint64x8 simd512::v_uint64 |
#include <opencv2/core/hal/intrin.hpp>
Maximum available vector register capacity 64-bit unsigned integer values.
typedef v_reg<uint64, 2> cv::v_uint64x2 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Two 64-bit unsigned integer values.
typedef v_reg<uint64, 4> cv::v_uint64x4 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Four 64-bit unsigned integer values.
typedef v_reg<uint64, 8> cv::v_uint64x8 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Eight 64-bit unsigned integer values.
typedef v_uint8x64 simd512::v_uint8 |
#include <opencv2/core/hal/intrin.hpp>
Maximum available vector register capacity 8-bit unsigned integer values.
typedef v_reg<uchar, 16> cv::v_uint8x16 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sixteen 8-bit unsigned integer values.
typedef v_reg<uchar, 32> cv::v_uint8x32 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Thirty-two 8-bit unsigned integer values.
typedef v_reg<uchar, 64> cv::v_uint8x64 |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sixty-four 8-bit unsigned integer values.
anonymous enum |
#include <opencv2/core/hal/intrin_cpp.hpp>
Enumerator | |
---|---|
simd128_width | |
simd256_width | |
simd512_width | |
simdmax_width |
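The lane counts in the fixed-size typedefs above follow from the register width divided by the lane size. The sketch below works this out in standard C++; the byte widths 16, 32 and 64 are assumed from the 128-, 256- and 512-bit register sizes, and the names `width128_bytes`, `width256_bytes`, `width512_bytes` and `lane_count` are hypothetical, not the OpenCV enumerators.

```cpp
#include <cassert>
#include <cstdint>

// Assumed register widths in bytes for 128-, 256- and 512-bit registers.
// These names are illustrative; they are not the OpenCV enumerators.
enum { width128_bytes = 16, width256_bytes = 32, width512_bytes = 64 };

// Number of lanes of type T that fit into a register of the given byte width.
template<typename T>
constexpr int lane_count(int width_bytes)
{
    return width_bytes / static_cast<int>(sizeof(T));
}

// These agree with the fixed-size typedefs listed above
// (assuming 4-byte float and 8-byte double).
static_assert(lane_count<float>(width128_bytes) == 4, "v_float32x4");
static_assert(lane_count<double>(width256_bytes) == 4, "v_float64x4");
static_assert(lane_count<std::uint8_t>(width512_bytes) == 64, "v_uint8x64");
```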
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load 256-bit length register contents from memory.
ptr | pointer to memory block with data |
Note: Alignment requirement: if CV_STRONG_ALIGNMENT=1 then the passed pointer must be aligned (sizeof(lane type) should be enough). Do not cast pointer types without a runtime check for pointer alignment (like uchar* => int*).
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from memory (aligned)
Similar to cv::v256_load, but the source memory block should be aligned (to a 32-byte boundary for SIMD256, a 64-byte boundary for SIMD512, etc.).
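Whether a pointer meets such an alignment requirement can be checked at run time before an aligned load is chosen. The helper below is a minimal sketch in standard C++; the name `is_aligned` is hypothetical and not part of OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if ptr is aligned to `alignment` bytes.
// `alignment` must be a power of two (16, 32, 64, ...).
inline bool is_aligned(const void* ptr, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}
```

A caller could use the aligned load when `is_aligned(ptr, 32)` holds and fall back to the unaligned load otherwise.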
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from memory with double expand.
Same as cv::v256_load, but result pack type will be 2x wider than memory type.
For 8-, 16-, 32-bit integer source types.
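A scalar sketch of a load with double expand for the 256-bit case: only half a register's worth of bytes is read, and each value is widened to twice its size. The 8-bit to 16-bit unsigned case and the name `load_expand_u8` are chosen here for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of a 256-bit load with double expand:
// 16 bytes are read from memory and each becomes one 16-bit lane.
inline std::array<std::uint16_t, 16> load_expand_u8(const std::uint8_t* ptr)
{
    assert(ptr != nullptr);
    std::array<std::uint16_t, 16> r{};
    for (int i = 0; i < 16; ++i)
        r[i] = ptr[i]; // zero-extend 8 bits to 16 bits; the value is unchanged
    return r;
}
```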
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from memory with quad expand.
Same as cv::v256_load_expand, but result type is 4 times wider than source.
For 8-bit integer source types.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from two memory blocks.
loptr | memory block containing data for first half (0..n/2) |
hiptr | memory block containing data for second half (n/2..n) |
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load 128 bits of data to the lower part (the high part is undefined).
ptr | memory block containing data for first half (0..n/2) |
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load 512-bit length register contents from memory.
ptr | pointer to memory block with data |
Note: Alignment requirement: if CV_STRONG_ALIGNMENT=1 then the passed pointer must be aligned (sizeof(lane type) should be enough). Do not cast pointer types without a runtime check for pointer alignment (like uchar* => int*).
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from memory (aligned)
Similar to cv::v512_load, but the source memory block should be aligned (to a 64-byte boundary for SIMD512, etc.).
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from memory with double expand.
Same as cv::v512_load, but result pack type will be 2x wider than memory type.
For 8-, 16-, 32-bit integer source types.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from memory with quad expand.
Same as cv::v512_load_expand, but result type is 4 times wider than source.
For 8-bit integer source types.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from two memory blocks.
loptr | memory block containing data for first half (0..n/2) |
hiptr | memory block containing data for second half (n/2..n) |
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load 256 bits of data to the lower part (the high part is undefined).
ptr | memory block containing data for first half (0..n/2) |
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Absolute value of elements.
Only for floating point types.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Absolute difference.
Returns \( |a - b| \) converted to corresponding unsigned type. Example:
For 8-, 16-, 32-bit integer source types.
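The reason for returning an unsigned type can be seen in a per-lane sketch: the absolute difference of two signed 8-bit values can reach 255, which does not fit in a signed 8-bit lane. The function name `absdiff_s8` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of |a - b| for one signed 8-bit lane.
inline std::uint8_t absdiff_s8(std::int8_t a, std::int8_t b)
{
    const int d = static_cast<int>(a) - static_cast<int>(b); // widen so the difference cannot overflow
    return static_cast<std::uint8_t>(d < 0 ? -d : d);          // |a - b| is always in 0..255
}
```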
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
For 64-bit floating point values
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
For 32-bit floating point values
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Saturating absolute difference.
Returns \( saturate(|a - b|) \). For 8- and 16-bit signed integer source types.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Add values without saturation.
For 8- and 16-bit integer values.
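The difference between wrapping and saturating addition is easiest to see on one unsigned 8-bit lane. Both helpers below are scalar sketches in standard C++ with hypothetical names; the saturating version is included only for contrast.

```cpp
#include <cassert>
#include <cstdint>

// Wrapping add: the result is reduced modulo 256.
inline std::uint8_t add_wrap_u8(std::uint8_t a, std::uint8_t b)
{
    return static_cast<std::uint8_t>(a + b);
}

// Saturating add, shown for contrast: the result is clamped to 255.
inline std::uint8_t add_sat_u8(std::uint8_t a, std::uint8_t b)
{
    const int s = a + b;
    return static_cast<std::uint8_t>(s > 255 ? 255 : s);
}
```

For 200 + 100 the wrapping add yields 44, while the saturating add yields 255.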
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Broadcast i-th element of vector.
Scheme:
Restriction: 0 <= i < nlanes. Supported types: 32-bit integers and floats (s32/u32/f32).
#include <opencv2/core/hal/intrin_cpp.hpp>
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
#include <opencv2/core/hal/intrin_cpp.hpp>
Ceil elements.
Ceil each value. Input type is float vector ==> output type is int vector.
#include <opencv2/core/hal/intrin_cpp.hpp>
Check if all packed values are less than zero.
Unsigned values will be cast to signed: uchar 254 => char -2.
#include <opencv2/core/hal/intrin_cpp.hpp>
Check if any of packed values is less than zero.
Unsigned values will be cast to signed: uchar 254 => char -2.
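For unsigned lanes, "less than zero" after the cast to signed amounts to testing the most significant bit. The two helpers below are a scalar sketch under that assumption; the 16-lane width and the names `check_all_u8` and `check_any_u8` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar models. A lane counts as "less than zero" when its most
// significant bit is set, which is what reading uchar 254 as char -2 amounts to.
inline bool check_all_u8(const std::array<std::uint8_t, 16>& a)
{
    for (std::uint8_t v : a)
        if ((v & 0x80u) == 0)
            return false; // one non-negative lane is enough to fail
    return true;
}

inline bool check_any_u8(const std::array<std::uint8_t, 16>& a)
{
    for (std::uint8_t v : a)
        if ((v & 0x80u) != 0)
            return true; // one negative lane is enough to succeed
    return false;
}
```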
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Combine vector from last elements of two vectors.
Scheme:
For all types except 64-bit.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Combine vector from first elements of two vectors.
Scheme:
For all types except 64-bit.
#include <opencv2/core/hal/intrin_cpp.hpp>
Cosine \( cos(x) \) of elements.
Only for floating point types. The core implementation is the same as v_sincos.
#include <opencv2/core/hal/intrin_cpp.hpp>
Convert lower half to float.
Supported input type is cv::v_float64.
#include <opencv2/core/hal/intrin_cpp.hpp>
Convert lower half to double.
Supported input type is cv::v_float32.
#include <opencv2/core/hal/intrin_cpp.hpp>
Convert lower half to double.
Supported input type is cv::v_int32.
#include <opencv2/core/hal/intrin_cpp.hpp>
Convert the high part of the vector to double.
Supported input type is cv::v_float32.
#include <opencv2/core/hal/intrin_cpp.hpp>
Convert the high part of the vector to double.
Supported input type is cv::v_int32.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Dot product of elements.
Multiply values in two registers and sum adjacent result pairs.
Scheme:
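A scalar sketch of "multiply and sum adjacent result pairs", assuming 16-bit signed inputs and 32-bit results so that eight input lanes produce four output lanes. The name `dotprod_s16` and the fixed lane count are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model: r[i] = a[2i]*b[2i] + a[2i+1]*b[2i+1].
// Products are formed in 32 bits, so the output has half as many, wider lanes.
inline std::array<std::int32_t, 4> dotprod_s16(const std::array<std::int16_t, 8>& a,
                                               const std::array<std::int16_t, 8>& b)
{
    std::array<std::int32_t, 4> r{};
    for (int i = 0; i < 4; ++i)
    {
        r[i] = static_cast<std::int32_t>(a[2 * i]) * b[2 * i]
             + static_cast<std::int32_t>(a[2 * i + 1]) * b[2 * i + 1];
    }
    return r;
}
```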
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Dot product of elements.
Same as cv::v_dotprod, but add a third element to the sum of adjacent pairs. Scheme:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Dot product of elements and expand.
Multiply values in two registers and expand the sum of adjacent result pairs.
Scheme:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Dot product of elements.
Same as cv::v_dotprod_expand, but add a third element to the sum of adjacent pairs. Scheme:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Fast Dot product of elements and expand.
Multiply values in two registers and expand the sum of adjacent result pairs.
Same as cv::v_dotprod_expand, but on some platforms it may sum the result pairs in a different order. Use this intrinsic when only the sum across all lanes matters; it should yield better performance on the affected platforms.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Fast Dot product of elements.
Same as cv::v_dotprod_expand_fast, but add a third element to the sum of adjacent pairs.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Fast Dot product of elements.
Same as cv::v_dotprod, but on some platforms it may sum the result pairs in a different order. Use this intrinsic when only the sum across all lanes matters; it should yield better performance on the affected platforms.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Fast Dot product of elements.
Same as cv::v_dotprod_fast, but add a third element to the sum of adjacent pairs.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Equal comparison.
#include <opencv2/core/hal/intrin_cpp.hpp>
Exponential \( e^x \) of elements.
Only for floating point types. Core implementation steps:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Expand values to the wider pack type.
Copy contents of register to two registers with 2x wider pack type. Scheme:
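A scalar sketch of expansion, assuming the lower half of the input goes to the first output and the upper half to the second. The 8-bit to 16-bit unsigned case and the name `expand_u8` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model: 16 lanes of 8 bits become two vectors of
// 8 lanes of 16 bits each; values are preserved, only the lane width grows.
inline void expand_u8(const std::array<std::uint8_t, 16>& a,
                      std::array<std::uint16_t, 8>& lo,
                      std::array<std::uint16_t, 8>& hi)
{
    for (int i = 0; i < 8; ++i)
    {
        lo[i] = a[i];     // input lanes 0..7
        hi[i] = a[i + 8]; // input lanes 8..15
    }
}
```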
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Expand higher values to the wider pack type.
Same as cv::v_expand_low, but expand higher half of the vector instead.
Scheme:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Expand lower values to the wider pack type.
Same as cv::v_expand, but return lower half of the vector.
Scheme:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Vector extract.
Scheme:
Restriction: 0 <= shift < nlanes
Usage:
For all types.
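A scalar sketch of the two-vector extract, under the assumption that the result takes lanes shift..n-1 of the first vector followed by lanes 0..shift-1 of the second. The 4-lane int vectors and the name `extract_lanes` are hypothetical.

```cpp
#include <array>
#include <cassert>

// Hypothetical scalar model: treat {a, b} as one 8-lane sequence and
// return the 4 lanes starting at position s.
template<int s>
inline std::array<int, 4> extract_lanes(const std::array<int, 4>& a,
                                        const std::array<int, 4>& b)
{
    static_assert(s >= 0 && s < 4, "restriction: 0 <= shift < nlanes");
    std::array<int, 4> r{};
    for (int i = 0; i < 4; ++i)
        r[i] = (i + s < 4) ? a[i + s] : b[i + s - 4];
    return r;
}
```

With a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, a shift of 1 yields {2, 3, 4, 5}.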
#include <opencv2/core/hal/intrin_cpp.hpp>
Vector extract.
Scheme: Return the s-th element of v. Restriction: 0 <= s < nlanes
Usage:
For all types.
#include <opencv2/core/hal/intrin_cpp.hpp>
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
#include <opencv2/core/hal/intrin_cpp.hpp>
Floor elements.
Floor each value. Input type is float vector ==> output type is int vector.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Multiply and add.
Returns \( a*b + c \). For floating point types and signed 32-bit int only.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Greater-than or equal comparison.
For all types except 64-bit integer values.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Greater-than comparison.
For all types except 64-bit integer values.
#include <opencv2/core/hal/intrin_cpp.hpp>
Inverse square root.
Returns \( 1/sqrt(a) \). For floating point types only.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Less-than or equal comparison.
For all types except 64-bit integer values.
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from memory.
ptr | pointer to memory block with data |
Note: Alignment requirement: if CV_STRONG_ALIGNMENT=1 then the passed pointer must be aligned (sizeof(lane type) should be enough). Do not cast pointer types without a runtime check for pointer alignment (like uchar* => int*).
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from memory (aligned)
Similar to cv::v_load, but the source memory block should be aligned (to a 16-byte boundary for SIMD128, a 32-byte boundary for SIMD256, etc.).
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load and deinterleave (2 channels)
Load data from memory, deinterleave it, and store it to 2 registers. Scheme:
For all types except 64-bit.
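A scalar sketch of two-channel deinterleaving: even-indexed elements in memory go to the first register and odd-indexed elements to the second. The 16-lane 8-bit case and the name `load_deinterleave2` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model: memory holds a0 b0 a1 b1 ... (32 bytes in total);
// the a's and b's are separated into two 16-lane vectors.
inline void load_deinterleave2(const std::uint8_t* ptr,
                               std::array<std::uint8_t, 16>& a,
                               std::array<std::uint8_t, 16>& b)
{
    assert(ptr != nullptr);
    for (int i = 0; i < 16; ++i)
    {
        a[i] = ptr[2 * i];     // first channel
        b[i] = ptr[2 * i + 1]; // second channel
    }
}
```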
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load and deinterleave (3 channels)
Load data from memory, deinterleave it, and store it to 3 registers. Scheme:
For all types except 64-bit.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load and deinterleave (4 channels)
Load data from memory, deinterleave it, and store it to 4 registers. Scheme:
For all types except 64-bit.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from memory with double expand.
Same as cv::v_load, but result pack type will be 2x wider than memory type.
For 8-, 16-, 32-bit integer source types.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from memory with quad expand.
Same as cv::v_load_expand, but result type is 4 times wider than source.
For 8-bit integer source types.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load register contents from two memory blocks.
loptr | memory block containing data for first half (0..n/2) |
hiptr | memory block containing data for second half (n/2..n) |
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Load 64 bits of data to the lower part (the high part is undefined).
ptr | memory block containing data for first half (0..n/2) |
#include <opencv2/core/hal/intrin_cpp.hpp>
Natural logarithm \( \log(x) \) of elements.
Only for floating point types. Core implementation steps:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Less-than comparison.
For all types except 64-bit integer values.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Magnitude.
Returns \( sqrt(a^2 + b^2) \). For floating point types only.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Matrix multiplication.
Scheme:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Matrix multiplication and add.
Scheme:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Choose max values for each pair.
Scheme:
For all types except 64-bit integer.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Choose min values for each pair.
Scheme:
For all types except 64-bit integer.
v_reg< _Tp, n > cv::v_mul (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
#include <opencv2/core/hal/intrin_cpp.hpp>
Multiply values.
For 16- and 32-bit integer types and floating types.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Multiply and expand.
Multiply the values of two registers and store the results in two registers with a wider pack type. Scheme:
Example:
Implemented only for 16- and unsigned 32-bit source types (v_int16x8, v_uint16x8, v_uint32x4).
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Multiply and extract high part.
Multiply the values of two registers and store the high part of the results. Implemented only for 16-bit source types (v_int16x8, v_uint16x8). Returns \( a*b >> 16 \).
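The expression a*b >> 16 can be sketched per lane: the full 32-bit product is formed first and only its upper 16 bits are kept. The names `mul_hi_s16` and `mul_hi_u16` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model for one signed 16-bit lane.
// Right shift of a negative product is assumed to be arithmetic
// (true on mainstream compilers, guaranteed since C++20).
inline std::int16_t mul_hi_s16(std::int16_t a, std::int16_t b)
{
    const std::int32_t p = static_cast<std::int32_t>(a) * b; // full 32-bit product
    return static_cast<std::int16_t>(p >> 16);                // keep the upper 16 bits
}

// Same idea for one unsigned 16-bit lane.
inline std::uint16_t mul_hi_u16(std::uint16_t a, std::uint16_t b)
{
    const std::uint32_t p = static_cast<std::uint32_t>(a) * b;
    return static_cast<std::uint16_t>(p >> 16);
}
```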
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Multiply values without saturation.
For 8- and 16-bit integer values.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
A synonym for v_fma.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Not equal comparison.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. For 64-bit boolean values
Scheme:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. For 32-bit boolean values
Scheme:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
For 16-bit boolean values.
Scheme:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Count the 1 bits in the vector lanes and return result as corresponding unsigned type.
Scheme:
For all integer types.
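A scalar sketch of a per-lane bit count for signed 8-bit lanes, returning the counts in unsigned lanes of the same width. The 16-lane width and the name `popcount_s8` are hypothetical, and the bit-clearing loop is just one simple way to count bits.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model: count the set bits of every lane independently.
inline std::array<std::uint8_t, 16> popcount_s8(const std::array<std::int8_t, 16>& a)
{
    std::array<std::uint8_t, 16> r{};
    for (int i = 0; i < 16; ++i)
    {
        std::uint8_t v = static_cast<std::uint8_t>(a[i]); // look at the raw bits
        std::uint8_t c = 0;
        while (v != 0)
        {
            v = static_cast<std::uint8_t>(v & (v - 1)); // clear the lowest set bit
            ++c;
        }
        r[i] = c;
    }
    return r;
}
```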
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Combine two vectors from lower and higher parts of two other vectors.
#include <opencv2/core/hal/intrin_cpp.hpp>
Find one max value.
Scheme:
For all types except 64-bit integer and 64-bit floating point types.
#include <opencv2/core/hal/intrin_cpp.hpp>
Find one min value.
Scheme:
For all types except 64-bit integer and 64-bit floating point types.
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sum absolute differences of values.
Scheme:
For all types except 64-bit types.
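A scalar sketch of the sum of absolute differences over all lanes, for 16 unsigned 8-bit lanes. The accumulator is wider than a lane so that the total cannot overflow; the name `reduce_sad_u8` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model: sum of |a[i] - b[i]| over all lanes.
inline unsigned reduce_sad_u8(const std::array<std::uint8_t, 16>& a,
                              const std::array<std::uint8_t, 16>& b)
{
    unsigned sum = 0;
    for (int i = 0; i < 16; ++i)
        sum += static_cast<unsigned>(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
    return sum;
}
```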
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sum packed values.
Scheme:
|
inline |
#include <opencv2/core/hal/intrin_cpp.hpp>
Sums all elements of each input vector, returns the vector of sums.
Scheme:
#include <opencv2/core/hal/intrin_cpp.hpp>
Vector reverse order.
Reverses the order of the vector lanes. Scheme:
For all types.
#include <opencv2/core/hal/intrin_cpp.hpp>
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
#include <opencv2/core/hal/intrin_cpp.hpp>
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
#include <opencv2/core/hal/intrin_cpp.hpp>
Round elements.
Rounds each value. The input is a float vector and the output is an int vector.
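A scalar sketch of the float-to-int rounding conversion. It uses `std::lrint`, which rounds according to the current floating-point rounding mode; treating that as the tie-breaking rule is an assumption of this sketch, and `round_lanes` is a hypothetical name.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Scalar model: round every float lane to the nearest integer.
// round_lanes is a hypothetical helper name.
template <std::size_t N>
std::array<std::int32_t, N> round_lanes(const std::array<float, N>& v) {
    std::array<std::int32_t, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        // std::lrint honours the current rounding mode for halfway cases.
        out[i] = static_cast<std::int32_t>(std::lrint(v[i]));
    }
    return out;
}
```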
#include <opencv2/core/hal/intrin_cpp.hpp>
Get first negative lane index.
Returns the index of the first negative lane. The result is undefined if no lane is negative. Example:
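A scalar sketch of the lane scan, for illustration only. `first_negative_lane` is a hypothetical name, and the `-1` sentinel for "no negative lane" is a choice of this sketch, since the documented result is undefined in that case.

```cpp
#include <array>
#include <cstddef>

// Scalar model: index of the first lane holding a negative value.
// first_negative_lane is a hypothetical helper name.
template <typename T, std::size_t N>
int first_negative_lane(const std::array<T, N>& v) {
    for (std::size_t i = 0; i < N; ++i) {
        if (v[i] < 0) {
            return static_cast<int>(i);
        }
    }
    return -1;  // sketch-only sentinel; the documented result is undefined here
}
```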
#include <opencv2/core/hal/intrin_cpp.hpp>
Per-element select (blend operation)
Return value will be built by combining values a and b using the following scheme: result[i] = mask[i] ? a[i] : b[i];
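The blend rule above can be sketched in scalar form. This sketch assumes each mask lane is either zero or all ones and treats any non-zero mask lane as true; `select_model` is a hypothetical name.

```cpp
#include <array>
#include <cstddef>

// Scalar model: result[i] = mask[i] ? a[i] : b[i].
// select_model is a hypothetical helper name.
template <typename T, std::size_t N>
std::array<T, N> select_model(const std::array<T, N>& mask,
                              const std::array<T, N>& a,
                              const std::array<T, N>& b) {
    std::array<T, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = (mask[i] != 0) ? a[i] : b[i];
    }
    return out;
}
```

A mask of this form is typically the result of a per-lane comparison, which lets a branch inside a loop be replaced by one compare followed by one select.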
#include <opencv2/core/hal/intrin_cpp.hpp>
Bitwise shift left.
For 16-, 32- and 64-bit integer values.
#include <opencv2/core/hal/intrin_cpp.hpp>
Bitwise shift right.
For 16-, 32- and 64-bit integer values.
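A scalar sketch of the per-lane left and right shifts. It is restricted to unsigned lanes to keep the right shift portable in standard C++; the names `shl_lanes` and `shr_lanes` are hypothetical, and the requirement that the shift amount be smaller than the lane width is an assumption of the sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model: shift every unsigned lane left by n bits,
// discarding bits shifted out of the lane.
template <typename T, std::size_t N>
std::array<T, N> shl_lanes(const std::array<T, N>& v, unsigned n) {
    assert(n < 8 * sizeof(T));  // sketch-only precondition
    std::array<T, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = static_cast<T>(static_cast<std::uint64_t>(v[i]) << n);
    }
    return out;
}

// Scalar model: shift every unsigned lane right by n bits (zero fill).
template <typename T, std::size_t N>
std::array<T, N> shr_lanes(const std::array<T, N>& v, unsigned n) {
    assert(n < 8 * sizeof(T));  // sketch-only precondition
    std::array<T, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = static_cast<T>(v[i] >> n);
    }
    return out;
}
```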
#include <opencv2/core/hal/intrin_cpp.hpp>
Get negative values mask.
Returns a bit mask in which bit i is set to 1 if the packed value at index i is negative. Example:
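A scalar sketch of the sign mask for signed integer lanes, with lane 0 mapped to the least significant bit. `signmask_model` is a hypothetical name; floating-point lanes (where the sign bit of negative zero would also count) are outside the scope of this sketch.

```cpp
#include <array>
#include <cstddef>

// Scalar model: bit i of the result is 1 when lane i is negative.
// signmask_model is a hypothetical helper name.
template <typename T, std::size_t N>
unsigned signmask_model(const std::array<T, N>& v) {
    static_assert(N <= 32, "the mask must fit in 32 bits");
    unsigned mask = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (v[i] < 0) {
            mask |= 1u << i;
        }
    }
    return mask;
}
```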
#include <opencv2/core/hal/intrin_cpp.hpp>
Sine \( \sin(x) \) of elements.
Only for floating point types. The core implementation is the same as for v_sincos.
#include <opencv2/core/hal/intrin_cpp.hpp>
Compute sine \( \sin(x) \) and cosine \( \cos(x) \) of elements at the same time.
Only for floating point types. Core implementation steps:
#include <opencv2/core/hal/intrin_cpp.hpp>
Square of the magnitude.
Returns \( a^2 + b^2 \). For floating point types only.
#include <opencv2/core/hal/intrin_cpp.hpp>
Store data to memory.
Store register contents to memory. Scheme:
Pointer can be unaligned.
#include <opencv2/core/hal/intrin_cpp.hpp>
Store data to memory (aligned)
Store register contents to memory. Scheme:
The pointer should be aligned to a 16-byte boundary.
#include <opencv2/core/hal/intrin_cpp.hpp>
Store data to memory (higher half)
Store higher half of register contents to memory. Scheme:
#include <opencv2/core/hal/intrin_cpp.hpp>
Interleave and store (4 channels)
Interleave and store data from 4 registers to memory. Scheme:
For all types except 64-bit.
#include <opencv2/core/hal/intrin_cpp.hpp>
Interleave and store (3 channels)
Interleave and store data from 3 registers to memory. Scheme:
For all types except 64-bit.
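A scalar sketch of the 3-channel interleaved store, for example writing separate R, G and B planes as packed pixels. `store_interleave3_model` is a hypothetical name, and the sketch assumes the destination buffer holds at least 3 * N elements.

```cpp
#include <array>
#include <cstddef>

// Scalar model: write a[0] b[0] c[0] a[1] b[1] c[1] ... to dst.
// store_interleave3_model is a hypothetical helper name.
template <typename T, std::size_t N>
void store_interleave3_model(T* dst,
                             const std::array<T, N>& a,
                             const std::array<T, N>& b,
                             const std::array<T, N>& c) {
    for (std::size_t i = 0; i < N; ++i) {
        dst[3 * i + 0] = a[i];
        dst[3 * i + 1] = b[i];
        dst[3 * i + 2] = c[i];
    }
}
```

The 2-channel and 4-channel variants follow the same pattern with a stride of 2 or 4 elements per lane.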
#include <opencv2/core/hal/intrin_cpp.hpp>
Interleave and store (2 channels)
Interleave and store data from 2 registers to memory. Scheme:
For all types except 64-bit.
#include <opencv2/core/hal/intrin_cpp.hpp>
Store data to memory (lower half)
Store lower half of register contents to memory. Scheme:
#include <opencv2/core/hal/intrin_cpp.hpp>
Subtract values without saturation.
For 8- and 16-bit integer values.
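To show what "without saturation" means here, this scalar sketch contrasts wrapping subtraction with a saturating one for unsigned 8-bit lanes. Both helper names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model: subtraction modulo 256 (no saturation).
template <std::size_t N>
std::array<std::uint8_t, N> sub_wrap_model(const std::array<std::uint8_t, N>& a,
                                           const std::array<std::uint8_t, N>& b) {
    std::array<std::uint8_t, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = static_cast<std::uint8_t>(a[i] - b[i]);  // wraps around
    }
    return out;
}

// For comparison: saturating subtraction clamps negative results to 0.
template <std::size_t N>
std::array<std::uint8_t, N> sub_saturate_model(const std::array<std::uint8_t, N>& a,
                                               const std::array<std::uint8_t, N>& b) {
    std::array<std::uint8_t, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = (a[i] > b[i]) ? static_cast<std::uint8_t>(a[i] - b[i]) : 0;
    }
    return out;
}
```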
#include <opencv2/core/hal/intrin_cpp.hpp>
Transpose 4x4 matrix.
Scheme:
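A scalar sketch of the 4x4 transpose, assuming the matrix is passed as four row vectors of four lanes each and returned as four transposed rows. The type aliases and the name `transpose4x4_model` are hypothetical.

```cpp
#include <array>
#include <cstddef>

using Row4 = std::array<float, 4>;
using Mat4 = std::array<Row4, 4>;

// Scalar model: out[c][r] = m[r][c] for a 4x4 matrix held as 4 rows.
// transpose4x4_model is a hypothetical helper name.
inline Mat4 transpose4x4_model(const Mat4& m) {
    Mat4 out{};
    for (std::size_t r = 0; r < 4; ++r) {
        for (std::size_t c = 0; c < 4; ++c) {
            out[c][r] = m[r][c];
        }
    }
    return out;
}
```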
#include <opencv2/core/hal/intrin_cpp.hpp>
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
#include <opencv2/core/hal/intrin_cpp.hpp>
Truncate elements.
Truncates each value. The input is a float vector and the output is an int vector.
#include <opencv2/core/hal/intrin_cpp.hpp>
Interleave two vectors.
Scheme:
For all types except 64-bit.
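A scalar sketch of the two-vector interleave. It assumes the operation yields two output vectors: the first interleaves the lower halves of the inputs and the second interleaves the upper halves. `zip_model` is a hypothetical name.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Scalar model: interleave a and b into two output vectors.
//   first  = { a0, b0, a1, b1, ... }  (from the lower halves)
//   second = { ...continuing with the upper halves... }
// zip_model is a hypothetical helper name.
template <typename T, std::size_t N>
std::pair<std::array<T, N>, std::array<T, N>>
zip_model(const std::array<T, N>& a, const std::array<T, N>& b) {
    static_assert(N % 2 == 0, "lane count must be even");
    std::array<T, N> lo{};
    std::array<T, N> hi{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        lo[2 * i] = a[i];
        lo[2 * i + 1] = b[i];
        hi[2 * i] = a[i + N / 2];
        hi[2 * i + 1] = b[i + N / 2];
    }
    return {lo, hi};
}
```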
#include <opencv2/core/hal/intrin_cpp.hpp>