
Coral TPU and TensorFlow Lite application note

This application note describes how to get the Coral TPU M.2 and mPCIe AI accelerator cards working on the Ten64 under Debian, both natively and inside a VM (using PCIe passthrough/VFIO).

The same instructions (excluding the gasket driver install) should also apply to the Coral USB Accelerator; however, the PCIe-based accelerators can operate faster and without thermal throttling.

Usage under VMs with VFIO/passthrough

The Coral PCIe accelerators will work under VFIO passthrough, but hosts running earlier kernel versions (<5.4) may not work, as the host kernel needs to perform PCIe quirk fixups.

If the Coral card fails to pass through, you will need the "PCI: Move Apex Edge TPU class quirk to fix BAR assignment" patch.

Driver and Software Installation

The instructions for software installation are nearly the same as Coral's own instructions; however, you may encounter issues getting the PCIe driver (gasket) installed from the Coral repository, due to linux-headers dependencies that cannot be met on arm64.

If you are running a recent kernel (5.4 or later) you may already have the gasket and apex drivers - these are currently in drivers/staging/gasket in the Linux kernel. We don't recommend using the staging version of the driver in kernels prior to 5.7, in part due to the PCIe quirk handling issue mentioned above.

  1. Install the kernel headers for your kernel and DKMS:

    sudo apt-get install dkms linux-headers-4.19.0-10-arm64 build-essential

    (Note: You need to choose the correct linux-headers package for your running kernel)

  2. Download the gasket-dkms package:

    apt-get download gasket-dkms

  3. Extract the gasket source, add it to DKMS and install. (The .deb cannot simply be installed with apt because of the unsatisfiable dependencies mentioned above, so extract it and register the module with DKMS by hand.)

    ar x gasket-dkms_1.0-13_all.deb
    mkdir gasket && tar -xf data.tar.xz -C gasket
    cd gasket
    sudo cp -r usr/src/gasket-1.0 /usr/src
    sudo dkms add gasket/1.0
    sudo dkms build gasket/1.0
    sudo dkms install gasket/1.0
    
  4. Check that the gasket and apex drivers load and that the /dev/apex_0 device exists.

    sudo modprobe gasket
    sudo dmesg | grep gasket
    [    4.676912] gasket: loading out-of-tree module taints kernel.
    [    4.737324] gasket: module verification failed: signature and/or required key missing - tainting kernel
    sudo dmesg | grep apex
    [    5.229682] apex 0000:00:05.0: enabling device (0000 -> 0002)
    
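This check can also be scripted, e.g. for a health check before launching an inference workload. A minimal Python sketch, assuming the default /dev/apex_0 device node created by the apex driver:

```python
import os

def apex_ready(dev="/dev/apex_0"):
    """Return True if the Edge TPU device node exists and is read/write accessible."""
    return os.path.exists(dev) and os.access(dev, os.R_OK | os.W_OK)

if __name__ == "__main__":
    print("Edge TPU ready" if apex_ready() else "no accessible /dev/apex_0 - check dmesg")
```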
  5. (Optional) give your user account permissions to access the apex device (reboot required to take effect):

    sudo sh -c "echo 'SUBSYSTEM==\"apex\", MODE=\"0660\", GROUP=\"apex\"' >> /etc/udev/rules.d/65-apex.rules"
    sudo groupadd apex
    sudo adduser $USER apex
    
  6. Install the edgetpu libraries:

    sudo apt-get install libedgetpu1-std
    
  7. Install TensorFlow Lite

    See the official TensorFlow Lite install page for the download URLs.

    You will need python3 and python3-pip, as well as NumPy and Pillow (PIL), if you don't already have them installed:

    sudo apt-get install python3 python3-pip python3-numpy python3-pil
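With the runtime and libraries installed, a model can be pointed at the Edge TPU by attaching the libedgetpu delegate when building the TFLite interpreter. A minimal sketch, assuming the tflite_runtime package from the install page above; model_edgetpu.tflite is a placeholder for any Edge TPU-compiled model:

```python
import numpy as np
import tflite_runtime.interpreter as tflite

# Attach the Edge TPU delegate (provided by the libedgetpu1-std package);
# without the delegate, the same interpreter API runs the model on the CPU.
interpreter = tflite.Interpreter(
    model_path="model_edgetpu.tflite",  # placeholder: any *_edgetpu.tflite model
    experimental_delegates=[tflite.load_delegate("libedgetpu.so.1")],
)
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Run one inference on a dummy input of the correct shape and dtype.
interpreter.set_tensor(inp["index"], np.zeros(inp["shape"], dtype=inp["dtype"]))
interpreter.invoke()
print("top class:", int(np.argmax(interpreter.get_tensor(out["index"]))))
```

This only runs on a host with an accessible /dev/apex_0 device, so it is not something you can test on an arbitrary machine; the classification demo in the next step wraps the same API.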
    
  8. Run the Coral example/demo

    This follows Coral's getting started guide.

    mkdir coral && cd coral
    git clone https://github.com/google-coral/tflite.git
    cd tflite/python/examples/classification
    bash install_requirements.sh
    python3 classify_image.py \
        --model models/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
        --labels models/inat_bird_labels.txt \
        --input images/parrot.jpg
    
    ----INFERENCE TIME----
    Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
    12.6ms
    2.5ms
    2.4ms
    2.4ms
    2.4ms
    -------RESULTS--------
    Ara macao (Scarlet Macaw): 0.77734
    

    You can also run the classification model without the TPU to compare (by specifying a model file not compatible with the TPU):

    $ python3 classify_image.py \
        --model models/mobilenet_v2_1.0_224_inat_bird_quant.tflite \
        --labels models/inat_bird_labels.txt \
        --input images/parrot.jpg
    
    ----INFERENCE TIME----
    Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
    140.4ms
    138.9ms
    139.1ms
    139.3ms
    139.3ms
    -------RESULTS--------
    Ara macao (Scarlet Macaw): 0.77734
    

So the TPU has given us a roughly 58x speedup (≈139ms per inference on the CPU versus ≈2.4ms on the TPU) - not bad!
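That figure can be checked from the steady-state timings printed by the two runs above (excluding the first iteration, which includes loading the model):

```python
# Steady-state inference times (ms) reported by the two runs above;
# the first iteration is excluded because it includes model loading.
tpu_ms = [2.5, 2.4, 2.4, 2.4]
cpu_ms = [138.9, 139.1, 139.3, 139.3]

tpu_avg = sum(tpu_ms) / len(tpu_ms)
cpu_avg = sum(cpu_ms) / len(cpu_ms)
print(f"TPU {tpu_avg:.2f} ms, CPU {cpu_avg:.2f} ms -> {cpu_avg / tpu_avg:.1f}x speedup")
```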