Interview with Computer Vision expert Adrian Rosebrock

Published October 12, 2015

Adrian Rosebrock is an entrepreneur and computer vision expert with a Ph.D. who launched two successful image search engines: ID My Pill (an iPhone app and API that identifies prescription pills with a snap of a smartphone’s camera) and Chic Engine (a fashion search engine for the iPhone). He has spent the past eight years in the startup world building computer vision, machine learning, and image search engine systems.


Can you tell us a little about yourself and what you do?
I’ve spent my entire adult life studying computer vision. I have a PhD in computer science with a focus in computer vision and machine learning. I’ve consulted with the National Cancer Institute to develop methods to automatically analyze breast histology images for cancer risk factors. I also blog about computer vision, image processing, and machine learning.
Last year I published a book, Practical Python and OpenCV, your guaranteed quick-start guide to learning the fundamentals of computer vision and image processing in only a single weekend.
And just two weeks ago I launched a Kickstarter to fund my new computer vision course, PyImageSearch Gurus. It’s safe to say that I know a thing or two about computer vision. And I’m here to pass that knowledge on to you.
There are many definitions for Computer Vision, so can you tell us exactly what Computer Vision is?
At the most basic level, computer vision involves teaching a computer how to “see”. It encompasses a wide variety of sub-fields including acquiring, processing, and analyzing images. But at the very core, computer vision can be boiled down to understanding and interpreting the contents of an image.
For humans, it’s very simple to understand the contents of an image. We see a picture of a dog and we know it’s a dog. We see a picture of a cat and our brain is able to make the connection that we are seeing a cat in the photo.
But for a computer it’s not that easy. All a computer “sees” is a matrix of pixels (i.e. the Red, Green, and Blue pixel intensities) of an image. A computer has no idea how to take these pixel intensities and derive any semantic meaning from the image.
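To make the “matrix of pixels” idea concrete, here is a minimal sketch in plain Python. The tiny 2x3 image and the helper names `shape` and `to_grayscale` are made up for illustration; a real project would typically load an image with a library such as OpenCV instead of typing pixels by hand.

```python
# A tiny 2x3 "image": each pixel is a (Red, Green, Blue) tuple of 0-255 intensities.
# The values are hand-picked for illustration, not read from a real file.
image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],        # red, green, blue
    [(255, 255, 255), (0, 0, 0), (128, 128, 128)],  # white, black, gray
]


def shape(img):
    """Return (rows, columns, channels) of a nested-list image."""
    return len(img), len(img[0]), len(img[0][0])


def to_grayscale(img):
    """Collapse each pixel to one intensity by averaging its three channels."""
    return [[sum(pixel) // 3 for pixel in row] for row in img]


print(shape(image))         # (2, 3, 3)
print(to_grayscale(image))  # [[85, 85, 85], [255, 0, 128]]
```

Nothing in these numbers says “dog” or “cat”; that gap between raw intensities and meaning is what the rest of the field tries to close.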
So while computer vision covers a wide variety of topics, the end goal of computer vision is to facilitate interpreting the contents of an image.
Can you share an interesting example of computer vision?
Of course. This is probably one of my favorite questions.
About six months ago I was traveling back and forth between Maryland and Connecticut almost twice a month for work. This trip was roughly a 5-hour drive once you factored in traffic.
Anyway, as I was driving back up to Connecticut after an extremely long and exhausting three days of non-stop work, I wasn’t paying attention to how fast I was going, and apparently I should have been.
About three weeks later I received a speeding ticket in the mail from the Maryland State Police informing me that I was going 21 MPH over the posted speed limit. Whoops.
While I was not happy about writing out a check to pay the traffic violation, I couldn’t help but smile to myself.
You see, the police had set up a hidden camera and used computer vision and image processing algorithms to detect and recognize the license plate on my car, which ultimately led to the Maryland government looking up my address and mailing me a speeding ticket.
This is just one example of computer vision in the real world. Computer vision is everywhere, whether you realize it or not. We use computer vision to recognize faces and license plates and to detect broken bones in X-ray images or tumors in MRI scans. It’s an extremely exciting field to be a part of.
How are computer vision and machine learning related?
While it’s not entirely true, I’ve always said that computer vision wouldn’t be very exciting without machine learning. Sure, we would still have basic image processing techniques, such as Instagram filters or counting the number of cellular structures in a histology image; but to do anything really exciting we need machine learning.
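As an illustration of the “basic image processing” end of that spectrum, the sketch below counts separate blobs in a binary mask using a breadth-first flood fill, with no machine learning involved. The mask and the `count_blobs` helper are hypothetical; it assumes the image has already been thresholded so that 1 marks a structure and 0 marks background, and it treats pixels as connected only when they touch horizontally or vertically.

```python
from collections import deque


def count_blobs(mask):
    """Count 4-connected groups of nonzero pixels in a 2D list."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Found an unvisited foreground pixel: it starts a new blob.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit the up/down/left/right neighbors.
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Hand-made mask with three separate structures.
mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]

print(count_blobs(mask))  # 3
```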
For example, we use Principal Component Analysis and machine learning models to recognize faces. We use clustering and vector quantization to build large scale image search engines and classification systems. And let’s not forget about Deep Learning and Convolutional Neural Networks. While these learning methods are based on interpreting and classifying the contents of an image, they are still heavily rooted in machine learning.
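The vector quantization step mentioned above can be sketched roughly as follows: each feature vector extracted from an image is assigned to its nearest “codeword”, and the image is then summarized by a histogram of codeword counts. This is only a toy illustration. The codebook here is hand-picked, whereas in practice it would usually be learned by clustering (for example with k-means), and the 2D descriptors and the function names `nearest_codeword` and `quantize` are assumptions made for the example.

```python
def nearest_codeword(vector, codebook):
    """Return the index of the codeword with the smallest squared Euclidean distance."""
    def squared_distance(index):
        return sum((a - b) ** 2 for a, b in zip(vector, codebook[index]))
    return min(range(len(codebook)), key=squared_distance)


def quantize(descriptors, codebook):
    """Build a histogram counting how many descriptors map to each codeword."""
    histogram = [0] * len(codebook)
    for descriptor in descriptors:
        histogram[nearest_codeword(descriptor, codebook)] += 1
    return histogram


# Hypothetical codebook (normally the cluster centers found by k-means).
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

# Hypothetical feature vectors extracted from one image.
descriptors = [(1.0, 1.0), (9.0, 11.0), (0.5, 9.0), (8.0, 9.5), (-1.0, 0.5)]

print(quantize(descriptors, codebook))  # [2, 2, 1]
```

Two images can then be compared by comparing their histograms, which is far cheaper than comparing every raw feature vector.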
Without machine learning, computer vision would be a substantially smaller field.
How much mathematics and programming does one need to know to start learning computer vision?
I’ve been asked this question a lot before. And for some reason people think that you need to be a math whiz to get started in computer vision; that’s simply not true.
If you understand what a matrix is, and that an image is just a matrix of pixel values, then that’s realistically enough to get started.
More advanced mathematical knowledge of Linear Algebra and Calculus (especially for machine learning) is good to have, but if you’re just getting started, don’t get hung up on not being a math expert.
As for programming experience, the biggest hurdle is learning OpenCV, the de facto standard library for computer vision and image processing. Most popular programming languages have bindings for the OpenCV library (including C++, Python, Java, .NET, etc.), so if you already consider yourself a good programmer in a certain language, chances are that there are OpenCV bindings for it.
And if you’re looking to get started learning the OpenCV library, be sure to check out my book!
You started a Kickstarter project to turn anyone into a computer vision and OpenCV guru. Can you tell us more about your Kickstarter project?
Sure, I would be happy to.
PyImageSearch Gurus is:
– An actionable, real-world 6-8 month course on OpenCV and computer vision.
– A community of like-minded developers, programmers, and students who are eager to get started or level-up their OpenCV skills.
– An IPython Notebook development environment in the cloud. This means that you don’t have to waste time getting your development environment set up and configured. You simply launch a browser, point it at the PyImageSearch Gurus website, and start working through the lessons.
Plus, at the end of the course you’ll receive a Certificate of Completion which you can include on your resume or CV.
The reason I started PyImageSearch Gurus is that I was looking for the next logical step after Practical Python and OpenCV. There is only so much that you can teach inside of a book. But having a 6-8 month course allows you to cover more topics and study them in more depth. Computer vision is such an exciting field that I didn’t think it was right to restrict it to only a book or two!
If you’re interested in learning more about the PyImageSearch Gurus course, just click here.
Your book Practical Python and OpenCV sounds awesome. How can your book help a complete beginner in computer vision get started quickly?
Great question, thanks for asking.
Computer vision doesn’t have to be as hard as people make it out to be. If you pick up a textbook on computer vision you’ll instantly be slammed over the head with complex equations and theoretical reviews.
But it doesn’t have to be that way.
My book, Practical Python and OpenCV, takes a very real-world, hands-on approach to teaching computer vision using the OpenCV library.
Inside my book you’ll learn all about face detection, object tracking in video, and recognizing handwritten digits. Even if you’re a complete beginner you’ll learn how to solve these problems using OpenCV and Python.
What do you think is the future of computer vision?
Without a doubt, I think this is the start of a very special time in computer vision. Our computer vision algorithms have gotten powerful enough that we can understand the contents of an image with reasonable accuracy. We also have access to cheap cloud computing. And we’re finally starting to see more and more computer vision startup companies. My prediction is that over the next 10-20 years we will see a substantial number of computer vision apps and technologies launched. Some will fail, others will succeed, but they will no doubt push computer vision farther (and faster) than it’s ever been before.
If you could give one piece of advice to someone starting out in computer vision, what would you say?
I would say to stop thinking about getting started, and just get started. Stop worrying about whether you have enough mathematical background; you will be absolutely amazed at how far you can get and what you can build right this second.
There is never a better time than now to get involved in computer vision.
And if you want to jump-start your education, I would suggest checking out my book, Practical Python and OpenCV, or my PyImageSearch Gurus computer vision course.