Sometimes it's a DSP or FPGA with blocks/instructions designed to implement comm...

Sometimes it's a DSP or FPGA with blocks/instructions designed to implement common matrix transformations.

In this particular case a vendor on that site claims:

> 3. There are 5.9MB SRAM can be used for convolutional neural network acceleration, so, it is possible to run small model like tiny-yolo v2,MobileNet, as you see in face detection routine video.