“The Embedded Vision Summit is bigger than ever,” I wrote in last year’s show report. The same held true this year. With over 40 exhibitors and 1,000 registrations, the show again grew 30–40%. Besides attracting more people, there was also more to see: the exhibition ran for two days instead of one, and there were two days filled with three parallel tracks of talks. There were even additional workshops on a third day.
The show’s growth reflects the state of the computer vision industry. You may not realize it, but you’re probably already using computer vision techniques many times a day. Your phone’s camera automatically focuses on faces in view. Apps like Snapchat or Face Swap carefully analyze the captured images using computer vision and manipulate them in unique ways. On the streets, security cameras run sophisticated motion detection algorithms that only start recording when you walk by. Cars on the road behind you automatically slow down when they get too close. Image-based algorithms decide whether to unlock your phone based on your fingerprint. The food you’ll eat today has likely been inspected using cameras, and the jar it comes in has a barcode on the side that was scanned during checkout at the store. The mail you’ll receive was routed using cameras that read the address on the envelope. Computer vision is already widespread, and it’s becoming cheaper, more powerful, and more prevalent every day. And that’s exactly what the show was all about. Just as the Internet changed our lives over the past two decades, computer vision will change them significantly in the next two.
The show also attracted more media attention than in previous years. Max Maxfield from Embedded.com wrote about Day 1 and Day 2 at the show. Stephan Ohr at the EE Times wrote a nice article “Automotive Safety Rules Embedded Vision Summit,” highlighting the fact that automotive was a major topic at the show.
There were three keynotes. Google’s Jeff Dean spoke about how deep learning is used at Google. Interestingly, it’s no longer just about reaching the best detection rates: some of the emphasis for CNNs has shifted toward making them more compute-efficient. Larry Matthies from NASA’s Jet Propulsion Laboratory gave an interesting talk about computer vision for autonomous land, sea, and air vehicles, like the rover JPL put on Mars. His talk was much less about object detection and classification and much more about how such vehicles navigate their surroundings. Larry showed that techniques like Structure from Motion then become much more important. Jeff Bier, founder of the Embedded Vision Alliance, argued that computer vision will become ubiquitous and invisible, and a huge creator of value, both for suppliers and for those who leverage the technology.
We gave a talk in the business track on automotive: “Computer Vision in Cars: Status, Challenges, and Trends”. Just as cars replaced horse carriages in the 1920s, electronics will start to replace human drivers in the 2020s. We started off with some of the benefits of self-driving cars: they save lives, save time, and save money. For car manufacturers, though, the change will at first be gradual. With each new model year, they’re adopting increasingly sophisticated advanced driver assistance systems (ADAS) that aid the driver instead of taking full control. We gave an overview of the state of these ADAS today and a glimpse into the future.
Final versions of all Summit presentations, including ours, are now available for download from the Alliance website as a single ZIP file. Registered users can log in and download the file (77 MB). Not registered yet? Go ahead and register; it’s free.
Trends from the show
One thing that became clear to me is that the industry is maturing somewhat. We still have a long way to go, and there are still many, many opportunities for computer vision to enable new products and features. Of the top 20 semiconductor companies, six were exhibiting, and many more were attending. The semiconductor vendors have certainly taken notice of this emerging industry.
Another example of the industry maturing was the company located next to us: Samasource. Their sole business is providing image-labeling services, with a team of over 600 people on staff to do so. Many computer vision algorithms need to be trained using known, “ground truth” data. Samasource has made a business out of converting raw video sequences into valuable training data in which the objects in the images are properly identified and annotated by their staff.
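To make “ground truth” concrete, here’s a minimal sketch of what a per-frame annotation record might look like. The field names and format here are purely illustrative assumptions, not Samasource’s actual data schema:

```python
import json

# Hypothetical ground-truth record: one video frame with the objects a
# human annotator identified, each with a label and a bounding box.
# (Illustrative format only -- not Samasource's actual schema.)
frame_annotation = {
    "frame": 1042,
    "objects": [
        {"label": "pedestrian", "bbox": [412, 160, 58, 171]},  # x, y, w, h
        {"label": "car",        "bbox": [20, 210, 240, 130]},
    ],
}

# Annotations like these are typically serialized (e.g. to JSON) and
# paired with the raw video to train and evaluate detection algorithms.
print(json.dumps(frame_annotation))
```

A detector trained on thousands of such annotated frames can then be scored by comparing its output boxes against these human-drawn ones.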
Perhaps another sign of the industry maturing: I didn’t see any radically new sensor technologies, new algorithms, or even really new applications (the Vision Tank session being the exception). It was the same for the Internet in the early days though. One of the first things I did on the web was to visit the White House’s simple website. I certainly didn’t envision things like Facebook, Maps or Wikipedia at the time. Computer vision is similar in that the technology is basically there, and now it’s a matter of making it faster, cheaper, more accurate, and using it to enable lots of new applications.
Just like the previous two years, CNNs were a big topic again. At the same time, I didn’t see many shipping products that include them yet. Google’s Dean highlighted in his talk that, in production at Google today, CNNs drive speech recognition and image search in Google Photos. He also showed that hundreds of new projects are underway that plan to use CNNs, so further adoption is imminent. Last year I wrote in my EV Summit show report that “The focus is simply on reaching the highest detection rates, not on finding the right trade-offs between accuracy, power and silicon area consumption.” This has changed a bit; several talks addressed this trade-off.
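One way this accuracy-versus-compute trade-off shows up in practice is in restructured network layers. As a rough back-of-the-envelope sketch (my own illustration, not from any Summit talk), here’s how replacing a standard 3×3 convolution with a depthwise-separable one, a common efficiency technique, cuts the multiply-accumulate (MAC) count:

```python
# Back-of-the-envelope MAC counts for two CNN layer types.

def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates for a standard k x k convolution layer."""
    return h * w * c_in * c_out * k * k

def separable_macs(h, w, c_in, c_out, k):
    """MACs for a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

# Example: a 112x112 feature map, 64 input / 128 output channels, 3x3 kernel.
std = conv_macs(112, 112, 64, 128, 3)
sep = separable_macs(112, 112, 64, 128, 3)
print(f"standard:  {std:,} MACs")
print(f"separable: {sep:,} MACs  ({std / sep:.1f}x fewer)")
```

The separable version needs roughly 8x fewer operations for this layer shape, usually at a small cost in accuracy, which is exactly the kind of trade-off that matters for power- and area-constrained embedded silicon.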
Our presence at the show
Our booth was right across from the bar, which opened daily at 5pm. We showed several vision demonstrations running on our silicon. One demo was developed together with VISCODA, an algorithms partner of videantis. VISCODA’s Structure from Motion algorithm allows the capture of 3D information using a standard, single 2D camera. The application ran on silicon that includes the videantis v-MP4280HDX processor; processing happens in real time at HD resolutions and consumes very little power. We also showed different object detectors, such as pedestrian detection for surveillance or automotive applications, and face detection, as well as our optimized vision library, which accelerates a wide range of vision algorithms and provides a 1000x power reduction compared to CPUs or GPUs.
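VISCODA’s implementation isn’t public, but to give a feel for the geometry behind Structure from Motion, here’s a minimal sketch of one of its core steps: linear (DLT) triangulation, which recovers a 3D point from a feature matched across two views of a single moving camera. All camera matrices and point values below are made-up example numbers:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: recover a 3D point from its pixel
    projections x1, x2 in two views with 3x4 projection matrices P1, P2."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous 3D point is the right singular vector of A
    # belonging to the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

def project(P, X):
    """Project a 3D point X through a 3x4 camera matrix P to pixels."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two example camera poses: identity, and a one-unit sideways translation
# (as if the camera moved between two video frames).
K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])  # intrinsics
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

# Project a known 3D point into both views, then triangulate it back.
X_true = np.array([0.5, 0.2, 4.0])
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
print(X_est)  # recovers approximately [0.5, 0.2, 4.0]
```

A full Structure from Motion pipeline additionally has to find and match the features, estimate the camera motion itself, and refine everything jointly; doing all of that in real time at HD resolutions is what makes an efficient processor essential.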
All in all it was a great three days of learning about computer vision and speaking to our customers and industry colleagues again. Embedded vision is gaining speed, and we’re proud of our highly efficient vision-processing architecture being such a key enabler for bringing these new technologies to high-volume markets. I would also like to thank the EVA organization again for making sure everything ran smoothly and for driving this industry forward.