Deep learning has many applications, and video classification is one of them. If you search for video classification, you will find work based on 2D CNNs. Here I would like to share my experience of developing a video classifier with a 3D CNN architecture.

While a 2D CNN takes a single video frame as input, a 3D CNN takes a stack of consecutive frames. Because it sees how the scene changes over time, it can capture motion that a single frame cannot show, which tends to make it more accurate on videos. For instance, a 3D CNN can predict whether a person is walking or running by looking at several consecutive frames of the video.
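
To make the difference in input concrete, here is a minimal sketch in plain Python that computes the output size of a convolution for a single frame versus a clip of frames. The helper `conv_output_shape` and the sizes (112x112 frames, 16-frame clips, 3x3 and 3x3x3 kernels) are my own assumptions for illustration, not the values of any particular model.

```python
def conv_output_shape(input_shape, kernel_shape, stride=1, padding=0):
    """Output size of a convolution along each convolved axis (no dilation).

    Hypothetical helper for illustration; it only does the shape arithmetic,
    it does not run a convolution.
    """
    if len(input_shape) != len(kernel_shape):
        raise ValueError("input and kernel must have the same number of axes")
    return tuple(
        (size + 2 * padding - k) // stride + 1
        for size, k in zip(input_shape, kernel_shape)
    )


# Assumed sizes, chosen only for the example.
frame_shape = (112, 112)       # (height, width) of one frame
clip_shape = (16, 112, 112)    # (frames, height, width) of one clip

# A 2D kernel slides over height and width of a single frame.
out_2d = conv_output_shape(frame_shape, (3, 3))

# A 3D kernel also slides along the time axis, so each output value
# mixes information from 3 consecutive frames.
out_3d = conv_output_shape(clip_shape, (3, 3, 3))

print("2D conv on one frame:", out_2d)  # (110, 110)
print("3D conv on a clip:   ", out_3d)  # (14, 110, 110)
```

The extra leading axis in the 3D case is time: the kernel covers several frames at once, which is what lets the network respond to motion rather than to a single still image.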

I used…


