OpenCV  5.0.0alpha
Open Source Computer Vision
cv::dnn::LSTMLayer Class Reference (abstract)

LSTM recurrent layer. More...

#include <opencv2/dnn/all_layers.hpp>

Public Member Functions

int inputNameToIndex (String inputName) CV_OVERRIDE
 Returns the index of the input blob in the input array.
 
int outputNameToIndex (const String &outputName) CV_OVERRIDE
 Returns index of output blob in output array.
 
virtual void setOutShape (const MatShape &outTailShape=MatShape())=0
 Specifies shape of output blob which will be [[T], N] + outTailShape.
 
virtual void setProduceCellOutput (bool produce=false)=0
 If this flag is set to true then layer will produce \( c_t \) as second output.
 
virtual void setUseTimstampsDim (bool use=true)=0
 Specifies whether the first dimension of the input blob is interpreted as the timestamp dimension or as the sample dimension.
 
virtual void setWeights (const Mat &Wh, const Mat &Wx, const Mat &b)=0
 Set trained weights for LSTM layer.
 
- Public Member Functions inherited from cv::dnn::Layer
 Layer ()
 
 Layer (const LayerParams &params)
 Initializes only name, type and blobs fields.
 
virtual ~Layer ()
 
virtual bool alwaysSupportInplace () const
 
virtual std::ostream & dump (std::ostream &strm, int indent, bool comma) const
 
virtual std::ostream & dumpAttrs (std::ostream &strm, int indent) const
 
virtual bool dynamicOutputShapes () const
 
virtual void finalize (const std::vector< Mat * > &input, std::vector< Mat > &output)
 Computes and sets internal parameters according to inputs, outputs and blobs.
 
std::vector< Mat > finalize (const std::vector< Mat > &inputs)
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 
void finalize (const std::vector< Mat > &inputs, std::vector< Mat > &outputs)
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 
virtual void finalize (InputArrayOfArrays inputs, OutputArrayOfArrays outputs)
 Computes and sets internal parameters according to inputs, outputs and blobs.
 
virtual void forward (InputArrayOfArrays inputs, OutputArrayOfArrays outputs, OutputArrayOfArrays internals)
 Given the input blobs, computes the output blobs.
 
virtual void forward (std::vector< Mat * > &input, std::vector< Mat > &output, std::vector< Mat > &internals)
 Given the input blobs, computes the output blobs.
 
void forward_fallback (InputArrayOfArrays inputs, OutputArrayOfArrays outputs, OutputArrayOfArrays internals)
 Given the input blobs, computes the output blobs.
 
virtual int64 getFLOPS (const std::vector< MatShape > &inputs, const std::vector< MatShape > &outputs) const
 
virtual bool getMemoryShapes (const std::vector< MatShape > &inputs, const int requiredOutputs, std::vector< MatShape > &outputs, std::vector< MatShape > &internals) const
 
virtual void getScaleShift (Mat &scale, Mat &shift) const
 Returns parameters of layers with channel-wise multiplication and addition.
 
virtual void getScaleZeropoint (float &scale, int &zeropoint) const
 Returns scale and zeropoint of layers.
 
virtual void getTypes (const std::vector< MatType > &inputs, const int requiredOutputs, const int requiredInternals, std::vector< MatType > &outputs, std::vector< MatType > &internals) const
 
virtual Ptr< BackendNode > initCann (const std::vector< Ptr< BackendWrapper > > &inputs, const std::vector< Ptr< BackendWrapper > > &outputs, const std::vector< Ptr< BackendNode > > &nodes)
 Returns a CANN backend node.
 
virtual Ptr< BackendNode > initCUDA (void *context, const std::vector< Ptr< BackendWrapper > > &inputs, const std::vector< Ptr< BackendWrapper > > &outputs)
 Returns a CUDA backend node.
 
virtual Ptr< BackendNode > initNgraph (const std::vector< Ptr< BackendWrapper > > &inputs, const std::vector< Ptr< BackendNode > > &nodes)
 
virtual Ptr< BackendNode > initTimVX (void *timVxInfo, const std::vector< Ptr< BackendWrapper > > &inputsWrapper, const std::vector< Ptr< BackendWrapper > > &outputsWrapper, bool isLast)
 Returns a TimVX backend node.
 
virtual Ptr< BackendNode > initVkCom (const std::vector< Ptr< BackendWrapper > > &inputs, std::vector< Ptr< BackendWrapper > > &outputs)
 
virtual Ptr< BackendNode > initWebnn (const std::vector< Ptr< BackendWrapper > > &inputs, const std::vector< Ptr< BackendNode > > &nodes)
 
void run (const std::vector< Mat > &inputs, std::vector< Mat > &outputs, std::vector< Mat > &internals)
 Allocates layer and computes output.
 
virtual bool setActivation (const Ptr< ActivationLayer > &layer)
 Tries to attach to the layer the subsequent activation layer, i.e. do the layer fusion in a partial case.
 
void setParamsFrom (const LayerParams &params)
 Initializes only name, type and blobs fields.
 
virtual std::vector< Ptr< Graph > > * subgraphs () const
 
virtual bool supportBackend (int backendId)
 Asks the layer whether it supports a specific backend for computations.
 
virtual bool tryFuse (Ptr< Layer > &top)
 Tries to fuse the current layer with the next one.
 
virtual void unsetAttached ()
 "Detaches" all the layers, attached to particular layer.
 
virtual bool updateMemoryShapes (const std::vector< MatShape > &inputs)
 
- Public Member Functions inherited from cv::Algorithm
 Algorithm ()
 
virtual ~Algorithm ()
 
virtual void clear ()
 Clears the algorithm state.
 
virtual bool empty () const
 Returns true if the Algorithm is empty (e.g. at the very beginning or after an unsuccessful read).
 
virtual String getDefaultName () const
 
virtual void read (const FileNode &fn)
 Reads algorithm parameters from a file storage.
 
virtual void save (const String &filename) const
 
virtual void write (FileStorage &fs) const
 Stores algorithm parameters in a file storage.
 
void write (FileStorage &fs, const String &name) const
 

Static Public Member Functions

static Ptr< LSTMLayer > create (const LayerParams &params)
 
- Static Public Member Functions inherited from cv::Algorithm
template<typename _Tp >
static Ptr< _Tp > load (const String &filename, const String &objname=String())
 Loads algorithm from the file.
 
template<typename _Tp >
static Ptr< _Tp > loadFromString (const String &strModel, const String &objname=String())
 Loads algorithm from a String.
 
template<typename _Tp >
static Ptr< _Tp > read (const FileNode &fn)
 Reads algorithm from the file node.
 

Additional Inherited Members

- Public Attributes inherited from cv::dnn::Layer
std::vector< Mat > blobs
 Learned parameters must be stored here so that they can be read via Net::getParam().
 
std::vector< Arg > inputs
 
String name
 Name of the layer instance, can be used for logging or other internal purposes.
 
void * netimpl
 
std::vector< Arg > outputs
 
int preferableTarget
 Preferred target for layer forwarding.
 
String type
 Type name which was used for creating layer by layer factory.
 
- Protected Member Functions inherited from cv::Algorithm
void writeFormat (FileStorage &fs) const
 

Detailed Description

LSTM recurrent layer.

Member Function Documentation

◆ create()

static Ptr< LSTMLayer > cv::dnn::LSTMLayer::create ( const LayerParams & params)
static

Creates an instance of the LSTM layer.

◆ inputNameToIndex()

int cv::dnn::LSTMLayer::inputNameToIndex ( String inputName)
virtual

Returns the index of the input blob in the input array.

Parameters
inputName    label of input blob

Each layer input and output can be labeled so that it is easy to identify, using the "<layer_name>[.output_name]" notation. This method maps the label of an input blob to its index in the input vector.
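The label notation can be illustrated with a small standalone sketch. The helper name splitBlobLabel is hypothetical and not part of OpenCV, and the sketch assumes that the layer name itself contains no dot; it shows only how a label splits into its two parts, not how LSTMLayer maps names to indices.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper (not an OpenCV function): splits a label written as
// "<layer_name>[.output_name]" into (layer name, output name).
// Assumption: the layer name contains no '.', so the first dot is the separator.
// The output name is empty when the optional ".output_name" suffix is absent.
std::pair<std::string, std::string> splitBlobLabel(const std::string& label)
{
    assert(!label.empty());
    const std::string::size_type dot = label.find('.');
    if (dot == std::string::npos)
        return std::make_pair(label, std::string());
    return std::make_pair(label.substr(0, dot), label.substr(dot + 1));
}
```

For example, splitBlobLabel("lstm1.c") yields the pair ("lstm1", "c"), while splitBlobLabel("lstm1") yields ("lstm1", "").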

Reimplemented from cv::dnn::Layer.

◆ outputNameToIndex()

int cv::dnn::LSTMLayer::outputNameToIndex ( const String & outputName)
virtual

Returns index of output blob in output array.

See also
inputNameToIndex()

Reimplemented from cv::dnn::Layer.

◆ setOutShape()

virtual void cv::dnn::LSTMLayer::setOutShape ( const MatShape & outTailShape = MatShape())
pure virtual

Specifies shape of output blob which will be [[T], N] + outTailShape.

If this parameter is empty or unset, then outTailShape = [Wh.size(0)] is used, where Wh is the parameter passed to setWeights().
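The shape rule [[T], N] + outTailShape can be sketched in plain C++ without any OpenCV types. The function lstmOutputShape and its parameters are hypothetical names used for illustration; whSize0 stands for the value the text calls Wh.size(0), and the useTimestampDim flag is assumed to decide whether T is present.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not an OpenCV function): builds [[T], N] + outTailShape.
// T is prepended only when the first input dimension is treated as timestamps.
// whSize0 is the fallback tail used when outTailShape is empty.
std::vector<int> lstmOutputShape(bool useTimestampDim, int T, int N,
                                 const std::vector<int>& outTailShape,
                                 int whSize0)
{
    assert(N > 0 && whSize0 > 0);
    std::vector<int> shape;
    if (useTimestampDim) {
        assert(T > 0);
        shape.push_back(T);
    }
    shape.push_back(N);
    if (outTailShape.empty()) {
        // Default described in the text: outTailShape = [Wh.size(0)].
        shape.push_back(whSize0);
    } else {
        shape.insert(shape.end(), outTailShape.begin(), outTailShape.end());
    }
    return shape;
}
```

With T = 5, N = 2, an empty outTailShape and whSize0 = 8, the sketch returns [5, 2, 8]; with the timestamp dimension disabled and outTailShape = [3, 4], it returns [2, 3, 4].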

◆ setProduceCellOutput()

virtual void cv::dnn::LSTMLayer::setProduceCellOutput ( bool produce = false)
pure virtual

If this flag is set to true then layer will produce \( c_t \) as second output.

Deprecated
Use flag produce_cell_output in LayerParams.

Shape of the second output is the same as first output.

◆ setUseTimstampsDim()

virtual void cv::dnn::LSTMLayer::setUseTimstampsDim ( bool use = true)
pure virtual

Specifies whether the first dimension of the input blob is interpreted as the timestamp dimension or as the sample dimension.

Deprecated
Use flag use_timestamp_dim in LayerParams.

If flag is set to true then shape of input blob will be interpreted as [T, N, [data dims]] where T specifies number of timestamps, N is number of independent streams. In this case each forward() call will iterate through T timestamps and update layer's state T times.

If flag is set to false then shape of input blob will be interpreted as [N, [data dims]]. In this case each forward() call will make one iteration and produce one timestamp with shape [N, [out dims]].
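The two interpretations of the input shape can be sketched as follows. The struct LstmInputLayout and the function interpretInputShape are hypothetical names, not OpenCV API, and the sketch assumes the blob has at least one data dimension after the leading T and N dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view of an input blob shape (not an OpenCV type).
struct LstmInputLayout {
    int T;         // number of timestamps iterated by one forward() call
    int N;         // number of independent streams
    int dataSize;  // product of the remaining [data dims]
};

// Interprets shape as [T, N, [data dims]] when useTimestampDim is true,
// and as [N, [data dims]] (a single timestamp) otherwise.
LstmInputLayout interpretInputShape(const std::vector<int>& shape,
                                    bool useTimestampDim)
{
    const std::size_t lead = useTimestampDim ? 2 : 1;
    assert(shape.size() > lead);  // assumption: at least one data dimension
    LstmInputLayout layout;
    layout.T = useTimestampDim ? shape[0] : 1;
    layout.N = useTimestampDim ? shape[1] : shape[0];
    layout.dataSize = 1;
    for (std::size_t i = lead; i < shape.size(); ++i)
        layout.dataSize *= shape[i];
    return layout;
}
```

The same shape [7, 3, 4, 5] therefore reads as 7 timestamps of 3 streams with 20 values each when the flag is true, and as a single timestamp of 7 streams with 60 values each when it is false.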

◆ setWeights()

virtual void cv::dnn::LSTMLayer::setWeights ( const Mat & Wh,
const Mat & Wx,
const Mat & b )
pure virtual

Set trained weights for LSTM layer.

Deprecated
Use LayerParams::blobs instead.

LSTM behavior on each step is defined by current input, previous output, previous cell state and learned weights.

Let \(x_t\) be the current input, \(h_t\) the current output, and \(c_t\) the current cell state. Then the current output and the current cell state are computed as follows:

\begin{eqnarray*} h_t &= o_t \odot tanh(c_t), \\ c_t &= f_t \odot c_{t-1} + i_t \odot g_t, \\ \end{eqnarray*}

where \(\odot\) is the per-element multiply operation and \(i_t, f_t, o_t, g_t\) are internal gates that are computed using learned weights.

Gates are computed as follows:

\begin{eqnarray*} i_t &= sigmoid&(W_{xi} x_t + W_{hi} h_{t-1} + b_i), \\ f_t &= sigmoid&(W_{xf} x_t + W_{hf} h_{t-1} + b_f), \\ o_t &= sigmoid&(W_{xo} x_t + W_{ho} h_{t-1} + b_o), \\ g_t &= tanh &(W_{xg} x_t + W_{hg} h_{t-1} + b_g), \\ \end{eqnarray*}

where \(W_{x?}\), \(W_{h?}\) and \(b_{?}\) are learned weights represented as matrices: \(W_{x?} \in R^{N_h \times N_x}\), \(W_{h?} \in R^{N_h \times N_h}\), \(b_? \in R^{N_h}\).

For simplicity and performance purposes we use \( W_x = [W_{xi}; W_{xf}; W_{xo}; W_{xg}] \) (i.e. \(W_x\) is the vertical concatenation of \( W_{x?} \)), \( W_x \in R^{4N_h \times N_x} \). The same holds for \( W_h = [W_{hi}; W_{hf}; W_{ho}; W_{hg}], W_h \in R^{4N_h \times N_h} \) and for \( b = [b_i; b_f; b_o; b_g]\), \(b \in R^{4N_h} \).
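With the stacked weights, the four gate pre-activations follow from a single affine map. The symbols \(z_t\) and \(z_i, z_f, z_o, z_g\) are introduced here only for illustration; each \(z_? \in R^{N_h}\) is one block of \(N_h\) consecutive rows of \(z_t \in R^{4N_h}\):

\begin{eqnarray*} z_t &= W_x x_t + W_h h_{t-1} + b = [z_i; z_f; z_o; z_g], \\ i_t &= sigmoid(z_i), \\ f_t &= sigmoid(z_f), \\ o_t &= sigmoid(z_o), \\ g_t &= tanh(z_g). \\ \end{eqnarray*}

For example, the first block is \(z_i = W_{xi} x_t + W_{hi} h_{t-1} + b_i\), which is the argument of the sigmoid in the gate equation for \(i_t\).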

Parameters
Wh    is the matrix defining how the previous output is transformed to internal gates (i.e. \( W_h \) in the notation above)
Wx    is the matrix defining how the current input is transformed to internal gates (i.e. \( W_x \) in the notation above)
b    is the bias vector (i.e. \( b \) in the notation above)
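The equations above can be sketched as one time step for a single stream in plain C++. This is a minimal illustration under stated assumptions, not the OpenCV implementation: the names LstmState and lstmStep are hypothetical, the matrices are stored row-major in flat vectors, and the row blocks are ordered i, f, o, g as in the concatenation described above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical state type: output h_t and cell state c_t, each of length N_h.
struct LstmState {
    std::vector<double> h;
    std::vector<double> c;
};

static double sigmoid(double v) { return 1.0 / (1.0 + std::exp(-v)); }

// One LSTM step for a single stream.
// Wx: row-major 4*Nh x Nx, Wh: row-major 4*Nh x Nh, b: 4*Nh entries.
// Assumed row-block order: i, f, o, g.
LstmState lstmStep(const std::vector<double>& Wx,
                   const std::vector<double>& Wh,
                   const std::vector<double>& b,
                   const std::vector<double>& x,
                   const LstmState& prev)
{
    const std::size_t Nx = x.size();
    const std::size_t Nh = prev.h.size();
    assert(prev.c.size() == Nh);
    assert(Wx.size() == 4 * Nh * Nx);
    assert(Wh.size() == 4 * Nh * Nh);
    assert(b.size() == 4 * Nh);

    // z = Wx * x_t + Wh * h_{t-1} + b: stacked gate pre-activations.
    std::vector<double> z(4 * Nh);
    for (std::size_t r = 0; r < 4 * Nh; ++r) {
        double acc = b[r];
        for (std::size_t k = 0; k < Nx; ++k) acc += Wx[r * Nx + k] * x[k];
        for (std::size_t k = 0; k < Nh; ++k) acc += Wh[r * Nh + k] * prev.h[k];
        z[r] = acc;
    }

    LstmState next;
    next.h.resize(Nh);
    next.c.resize(Nh);
    for (std::size_t j = 0; j < Nh; ++j) {
        const double i = sigmoid(z[j]);
        const double f = sigmoid(z[Nh + j]);
        const double o = sigmoid(z[2 * Nh + j]);
        const double g = std::tanh(z[3 * Nh + j]);
        next.c[j] = f * prev.c[j] + i * g;     // c_t = f_t * c_{t-1} + i_t * g_t
        next.h[j] = o * std::tanh(next.c[j]);  // h_t = o_t * tanh(c_t)
    }
    return next;
}
```

Calling lstmStep repeatedly, feeding each returned state back in as prev, mirrors the iteration over T timestamps described for the timestamp dimension. Computing all four gates from one stacked product is the reason the text gives for concatenating the per-gate weights.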

The documentation for this class was generated from the following file: