Caffe Deep Learning Framework – 64-bit NVIDIA Jetson TX1

Back in February, we installed Caffe on the TX1. At the time, the TX1 was running a 32-bit version of L4T 23.1. With the advent of the 64-bit L4T 24.2, this seems like a good time to compare the performance of the two. The TX1 can now recognize an image in about 8 ms! For the install and test, Looky Here:

Background

As you recall, Caffe is a deep learning framework developed with cleanliness, readability, and speed in mind. It was created by Yangqing Jia during his PhD at UC Berkeley, and is in active development by the Berkeley Vision and Learning Center (BVLC) and by community contributors.

The L4T 23.1 Operating System release was a 64-bit kernel supporting a 32-bit user space. For the L4T 24.2 release, both the kernel and the user space are 64-bit.

Caffe Installation

A script in the JetsonHacks GitHub repository installs the dependencies for Caffe, downloads the source files, configures the build system, compiles Caffe, and then runs a suite of tests. Passing the tests indicates that Caffe is installed correctly.

This installation demonstration is for an NVIDIA Jetson TX1 running L4T 24.2, an Ubuntu 16.04 variant. L4T 24.2 was installed using JetPack 2.3, which also installs OpenCV4Tegra, CUDA 8.0, and cuDNN 5.1.

Before starting the installation, you may want to set the CPU and GPU clocks to maximum by running the script:

$ sudo ./jetson_clocks.sh

The script is in the home directory, and is also included in the installCaffeJTX1 repository for convenience.

In order to install Caffe:

$ git clone https://github.com/jetsonhacks/installCaffeJTX1.git
$ cd installCaffeJTX1
$ ./installCaffe.sh

Installation should not require intervention. In the video, installing the dependencies and compiling took about 10 minutes, and running the unit tests took about 45 minutes. While not strictly necessary, running the unit tests confirms that the installation is correct.

Test Results

At the end of the video, there are a couple of timed tests which can be compared with the Jetson TK1, and the previous installation:

Jetson TK1 vs. Jetson TX1 Caffe GPU Example Comparison (10 iterations, times in milliseconds)
Machine	Average FWD (ms)	Average BACK (ms)	Average FWD-BACK (ms)
Jetson TK1 (32-bit OS)	234	243	478
Jetson TX1 (32-bit OS)	179	144	324
Jetson TX1 with cuDNN support (32-bit OS)	103	117	224
Jetson TX1 (64-bit OS)	110	122	233
Jetson TX1 with cuDNN support (64-bit OS)	80	119	200
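The text does not list the command behind these timings. One plausible way to gather comparable numbers is Caffe's built-in `caffe time` benchmark run against the AlexNet deploy model that ships in the Caffe source tree. The sketch below assumes that Caffe was built in `~/caffe` (override with the hypothetical `CAFFE_ROOT` variable) and that AlexNet is the model of interest; both are assumptions, not details confirmed by the post.

```shell
#!/bin/sh
# Assumption: Caffe was cloned and built in ~/caffe. Adjust CAFFE_ROOT if not.
CAFFE_ROOT="${CAFFE_ROOT:-$HOME/caffe}"

# run_caffe_timing [ITERATIONS]
# Runs Caffe's layer-by-layer timing benchmark on GPU 0 and prints the
# average forward, backward, and forward-backward times.
run_caffe_timing() {
    iterations="${1:-50}"
    if [ ! -x "$CAFFE_ROOT/build/tools/caffe" ]; then
        echo "caffe binary not found under $CAFFE_ROOT; set CAFFE_ROOT" >&2
        return 1
    fi
    (
        # Model path is relative to the Caffe source tree.
        cd "$CAFFE_ROOT" &&
        build/tools/caffe time \
            --model=models/bvlc_alexnet/deploy.prototxt \
            --gpu=0 \
            --iterations="$iterations"
    )
}

# Usage (on the Jetson, after the build has finished):
#   run_caffe_timing 50
```

Whether cuDNN is used is decided at build time (the `USE_CUDNN` setting in Caffe's `Makefile.config`), so comparing the "with cuDNN" and plain rows means timing two separate builds.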

There is a clear performance improvement between the 32-bit and 64-bit releases: without cuDNN the average forward pass drops from 179 ms to 110 ms, and with cuDNN from 103 ms to 80 ms. Two factors contribute. One is the change from a 32-bit to a 64-bit operating system. The other is the improvement of the deep learning libraries, CUDA and cuDNN, between the releases. Considering that the tests run on the exact same hardware, the performance boost is impressive. Using cuDNN provides a large gain in the forward pass tests, while the backward pass changes far less (122 ms vs. 119 ms on the 64-bit release).
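To put numbers on the comparison, the relative reduction in forward-pass time can be computed directly from the table. The helper below is only an illustrative sketch; the function name `time_reduction_pct` is made up for this example, and the inputs are the table values.

```shell
#!/bin/sh
# time_reduction_pct OLD_MS NEW_MS
# Prints how much shorter NEW_MS is than OLD_MS, as a percentage.
time_reduction_pct() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (old - new) / old * 100 }'
}

time_reduction_pct 179 110   # forward, no cuDNN, 32-bit -> 64-bit: 38.5
time_reduction_pct 103 80    # forward, cuDNN, 32-bit -> 64-bit:    22.3
time_reduction_pct 110 80    # forward, 64-bit, plain -> cuDNN:     27.3
time_reduction_pct 122 119   # backward, 64-bit, plain -> cuDNN:    2.5
```

The last two lines show why the cuDNN gain is described as a forward-pass effect: on the 64-bit release the forward pass is about 27% shorter with cuDNN, while the backward pass barely moves.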

The tests run 50 iterations of the recognition pipeline, and each iteration analyzes 10 different crops of the input image, so take the ‘Average Forward pass’ time and divide by 10 to get the time per recognition result. For the 64-bit version with cuDNN, 80 ms divided by 10 means that an image recognition takes about 8 ms.
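The per-recognition arithmetic can be written as a small shell helper, which also gives the equivalent throughput. This is a sketch under the assumption stated in the text (10 crops per forward pass); the function names are illustrative only.

```shell
#!/bin/sh
# per_image_ms FORWARD_MS CROPS
# Average forward-pass time divided by the number of crops in one pass.
per_image_ms() {
    awk -v fwd="$1" -v crops="$2" 'BEGIN { printf "%.1f\n", fwd / crops }'
}

# images_per_second FORWARD_MS CROPS
# Crops processed per second at that forward-pass time.
images_per_second() {
    awk -v fwd="$1" -v crops="$2" 'BEGIN { printf "%.1f\n", 1000 * crops / fwd }'
}

per_image_ms 80 10        # 64-bit with cuDNN: 8.0 ms per recognition
images_per_second 80 10   # 125.0 recognitions per second
per_image_ms 103 10       # 32-bit with cuDNN: 10.3 ms per recognition
images_per_second 103 10  # 97.1 recognitions per second
```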

NVCaffe

It is worth mentioning that NVCaffe is a special branch of Caffe used on the TX1 which includes support for FP16. The above tests use FP32. In many cases, FP32 and FP16 give very similar results, and FP16 is faster. For example, running the same Average Forward Pass test with FP16 finishes in about 60 ms, which works out to about 6 ms per image recognition!

Conclusion

Deep learning is in its infancy, and as people explore its potential, the Jetson TX1 seems well positioned to take the lessons learned and deploy them in the embedded computing ecosystem. Several different deep learning platforms are being developed, and the improvement in Caffe on the Jetson Dev Kits over the last couple of years is quite impressive.

Notes

The installation in this video was done directly after flashing L4T 24.2 on to the Jetson TX1 with CUDA 8.0, cuDNN r5.1 and OpenCV4Tegra. Git was then installed:

$ sudo apt-get install git

The latest Caffe commit used in the video is: 80f44100e19fd371ff55beb3ec2ad5919fb6ac43

The post Caffe Deep Learning Framework – 64-bit NVIDIA Jetson TX1 appeared first on JetsonHacks.
