A Depthwise Separable Network for Action Recognition
Abstract
Action recognition is an important but challenging task in computer vision. The 3D convolutional neural network is a mainstream method for action recognition because it can extract temporal and spatial information from video simultaneously. However, the 3D convolutional neural network has a serious drawback: its parameter count is very large. Depthwise convolution is a form of group convolution that can effectively reduce the number of parameters in the convolution kernels, and it has been widely applied in 2D convolutional neural networks. We therefore propose introducing depthwise convolution into the 3D convolutional neural network. We choose 3D ResNet as our baseline model and construct our model by replacing the 3D convolution kernels in the baseline with depthwise convolutions. We name the proposed model the depthwise separable network (DSN). We conducted experiments on the UCF101 and HMDB51 datasets. The experimental results show that, by introducing depthwise convolution, DSN not only reduces the number of parameters relative to the baseline but also moderately improves accuracy.
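The parameter saving can be illustrated with a small weight count. The sketch below is not the authors' implementation. It assumes bias-free layers, a 3x3x3 kernel, 64 input and output channels, and the usual depthwise-plus-pointwise (1x1x1) factorization; the abstract does not confirm any of these choices, and the function names are hypothetical.

```python
def conv3d_params(c_in, c_out, kernel=(3, 3, 3), groups=1):
    """Weight count of a bias-free 3D convolution with the given grouping."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    kt, kh, kw = kernel
    # Each output channel only sees c_in // groups input channels.
    return (c_in // groups) * c_out * kt * kh * kw


def depthwise_separable3d_params(c_in, c_out, kernel=(3, 3, 3)):
    """Depthwise 3D conv (groups == c_in) followed by a 1x1x1 pointwise conv.

    Assumption: the pointwise step is the standard way to mix channels
    after a depthwise convolution; the abstract does not describe it.
    """
    depthwise = conv3d_params(c_in, c_in, kernel, groups=c_in)
    pointwise = conv3d_params(c_in, c_out, kernel=(1, 1, 1))
    return depthwise + pointwise


# Illustrative layer size only, not taken from the paper.
standard = conv3d_params(64, 64)
separable = depthwise_separable3d_params(64, 64)
print(f"standard 3D conv:   {standard} weights")
print(f"depthwise separable: {separable} weights")
```

For this assumed layer, the standard convolution has 64 x 64 x 27 = 110592 weights, while the factorized version has 64 x 27 + 64 x 64 = 5824, roughly a 19-fold reduction. The actual saving in DSN depends on which layers of 3D ResNet are replaced.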
Keywords
Action recognition, 3D convolution, Depthwise convolution
DOI
10.12783/dtcse/cisnrc2019/33352