Hire the best Image/Object Recognition professionals

Check out Image/Object Recognition professionals with the skills you need for your next job.

Clients rate Image/Object Recognition professionals
4.8/5
based on 4,083 client reviews
Trung-Khanh L.
$25/hr
  • Image/Object Recognition
  • C
  • C++
  • Embedded System
  • HTML
  • Analog Electronics
  • Circuit Design
  • Digital Electronics
  • Digital Signal Processing
  • Electronic Design
  • Integrated Circuit
  • PCB Design
  • Image Processing
  • Eagle
  • Computer Vision
  • OpenCV

Big Dolphin Co., Ltd www.bigdolphin.com.vn Analog and digital circuit design. PCB design. Software and hardware development with MCU and FPGA: AVR MCU's, Microchip PIC MCU's, Altera KIT's, C/C++, Visual Basic, VHDL/Verilog HDL.

Amirsina T.
$125/hr
  • Image/Object Recognition
  • Machine Learning
  • Deep Learning
  • Computer Vision
  • Natural Language Processing
  • TensorFlow
  • PyTorch
  • Healthcare & Medical
  • Python
  • Neural Network
  • Deep Learning Modeling
  • Artificial Intelligence
  • Machine Learning Model
  • Data Science

Providing the following services in the following areas in the realm of AI, Machine Learning and Data Science: - Strategy building - Validation of strategy - Model development For strategy, I provide insight for exploring what’s possible with data and aim to create a plan. This includes but is not limited to the data collection method, model development, and objectives. The validation step is necessary to validate the identified strategy. It's easy to suggest that strategy, but implementation can take months. So I validate the strategy to make sure we are on the right path. For model development, I provide insight into what can be done...

Gaurav G.
$70/hr
  • Image/Object Recognition
  • C++
  • Robotics
  • Mathematics
  • Python
  • Computer Vision
  • Artificial Intelligence
  • Robot Operating System

I am a robotics engineer with over 6 years of experience in working with autonomous robots of all shapes and sizes - indoor and outdoor mobile robots, arm manipulators, humanoids, and drones. I started out with a masters in robotics from IIT Kanpur on walking patterns for humanoid robots, and followed up my work in a research role at the University of Heidelberg. Subsequently I worked in the autonomous mobile robot industry and architected the ROS based software stack for intra-logistic operations in industrial plants. In early 2019, I co-founded Black Coffee Robotics. We are a team of roboticists providing development and consultancy in...

Alfin N.
$35/hr
  • Image/Object Recognition
  • Mobile App Development
  • Kotlin
  • Python
  • Java
  • Firebase
  • Android
  • Machine Learning
  • Computer Vision
  • Keras
  • TensorFlow
  • PyTorch
  • Data Science
  • Machine Learning Model

Hi, I’m a data scientist who has worked on several projects related to artificial intelligence (AI) and computer vision. I specialize in counting small objects for scientific purposes, such as microbe movement tracking, sperm trajectory tracking, etc. I have strong knowledge of YOLO, Mask RCNN, and CNNs, and have built many projects using those algorithms. I am also currently working on a GAN architecture for image super-resolution, and I have published a paper about it. Other than that, I also have experience making mobile apps, as I usually build my AI projects for mobile apps, which means that I can create an end-to-end mobile...

Ozgur S.
$100/hr
  • Image/Object Recognition
  • iOS Development
  • App Development
  • UIKit
  • Core ML
  • iPadOS
  • Swift
  • Objective-C
  • Computer Vision
  • ARKit
  • iOS
  • Python
  • In-App Purchases
  • User Authentication
  • Social Media Account Integration
  • Core Data

I developed more than ten indie iOS apps to the AppStore. I have a good portfolio of apps that uses UIKit, AutoLayout, SwiftUI, Alamofire, Core ML, ARKit, CoreData, Realm, Git etc. I am self-motivated to develop iOS apps. I have good knowledge of iOS Design Guidelines and App Store Review processes. I can take your idea to the product level fast. I'm also experienced on training neural networks with Keras to classify images.

Anton K.
$100/hr
  • Image/Object Recognition
  • TensorFlow
  • Python
  • Deep Learning
  • Computer Vision
  • Neural Network
  • Data Science
  • Machine Learning
  • Data Analysis
  • Data Mining
  • Artificial Intelligence
  • Deep Neural Network
  • Artificial Neural Network
  • Django
  • JavaScript
  • SQL

Top-notch expert in a broad range of AI-related areas, including the development of Machine Learning Applications, Intelligent Agents/Bots, Automation of Intelligent Systems, and more. If you have a project and need consultation, research, a presentation, development of a prototype, or an End-to-End application, then you should contact me. I'm working highly agile, transparent and structured, sharing with you the whole process and progress in close cooperation. Independent of the size of your project, we will work out a solution. I am well interconnected with companies and other freelancers and have my own team. If the project size...

Tyler M.
$200/hr
  • Image/Object Recognition
  • Python
  • Deep Learning
  • Computer Vision
  • Machine Learning
  • Artificial Neural Network
  • Convolutional Neural Network
  • Quantitative Finance
  • Stock Option Agreement
  • Quantitative Analysis
  • Data Science
  • Financial Analysis
  • Microsoft Excel
  • Artificial Intelligence
  • Automation
  • Data Visualization

Very experienced in Python, across a wide interest of fields. Especially automating tasks, trend analysis, excel manipulations, data visualization, core machine learning (regressions, clustering, classification, etc), along with a healthy amount of deep learning. Heavy focus in finance, backtesting strategies, and automated trading. Also, a decent amount of computer vision, and other visually related subjects - like automated art generation, NFTs, GANs, etc. Have a lot of experience with AWS for data backups, deployment, automation, and utilizing multiple cloud computers at once. AWS EC2, S3, Amplify, DynamoDB, Lambda, etc. API...

Alejo G.
$45/hr
  • Image/Object Recognition
  • C++
  • Image Processing
  • CUDA
  • FFmpeg
  • OpenCV
  • Deep Learning
  • Video Stream
  • Raspberry Pi
  • GPS
  • Microsoft Visual C++
  • GStreamer
  • C
  • Python
  • Computer Vision
  • STM32

I am a C++ and Python programmer with large experience in image and video processing. I have been working for a long time with the OpenCV, GStreamer and Deepstream libraries and using GPU hardware with CUDA and other HPC libraries. My work experience includes professional video players & editors, transcoders, object detection and tracking, Deep Learning systems (YOLO, Pytorch, Tensorflow, TensorRT), AI hardware like NVIDIA Jetson TX2, Jetson Xavier NX, Jetson NANO and Raspberry Pi devices. I like to choose projects that interest me and that match my skills and give me a challenge or bring something new.

Shehryar M.
$40/hr
  • Image/Object Recognition
  • Artificial Intelligence
  • Time Series Analysis
  • Tesseract OCR
  • Keras
  • TensorFlow
  • Statistics
  • Python
  • Recommendation System
  • Artificial Neural Network
  • Natural Language Processing
  • PyTorch
  • OpenCV
  • Computer Vision
  • pandas
  • Neural Network

Are you looking for a machine learning engineer with extensive experience in building end-to-end ML systems backed with a strong theoretical understanding of state-of-the-art AI technologies? If yes, then I can help you. Here are some ML systems I've built. *Computer Vision* 1. Built custom OCR systems to extract structured information from images of different documents such as invoices, receipts, legal documents and forms (have also extensive experience with AWS Textract and Google's Intelligent Document Processor) 2. Automated wildlife surveys by using object detection systems such as YOLO and RCNN-based architectures and tracking...

Richard A.
$149/hr
  • Image/Object Recognition
  • Python
  • Deep Learning
  • Deep Neural Network
  • Machine Learning
  • Jupyter
  • JavaScript
  • HTML
  • Computer Vision
  • TensorFlow
  • Data Science
  • PyTorch
  • Keras
  • Neural Network
  • Natural Language Processing

I am a deep learning expert in the fields of computer vision and natural language processing. I am an EXPERT-VETTED (top 1%), full-time freelancer with $700K+ earnings on Upwork. I regularly create — and immensely enjoy creating — custom neural networks for my client's specialized problem domains! —————————— You are here because: —————————— - You have a small to mid-sized startup - Your startup is pushing the boundaries of what is possible, and you require deep learning techniques to make it happen - You have tried pre-built cloud ML/DL solutions and none work for your problem domain - You know that there is a massive gap between the...

Michael F.
$130/hr
  • Image/Object Recognition
  • Image Processing
  • Artificial Intelligence
  • Python
  • Software Development
  • Data Sourcing
  • Predictive Analytics
  • Software Architecture & Design
  • Researcher
  • Data Analysis
  • IT Consultation
  • Data Science Consultation
  • Data Science
  • Deep Learning
  • Machine Learning
  • Computer Vision
  • Model Tuning

Wondering if your idea is possible? Let's discuss it. Wondering if your idea can be turned into software? Let's explore it. I've been a consultant for new products/platforms, architecture design, programming, modeling, and so much more. Formally, I’ve worked as a researcher and developer alike in creating end-to-end software solutions. I have published research in the realm of formal models, machine learning, and data science. My passion projects include algorithmic trading bots, computer vision/facial recognition, ecology modeling, and Natural language processing(NLP). I hold a Bachelor of Science degree in Computer Science,...

Vladimir F.
$50/hr
  • Image/Object Recognition
  • Python
  • Scrapy
  • ETL
  • TensorFlow
  • pandas
  • SciPy
  • Machine Learning
  • Keras
  • Convolutional Neural Network
  • NumPy
  • OpenCV
  • STM32
  • C
  • Research & Development
  • Computer Vision

I'm a developer with more than 15 years of experience in big projects based on different stacks. I also have extensive experience as a team leader and project curator. My approach to development is to build good, fast, safe, fault-tolerant, scalable systems based on different architectures. I have worked with customers all over the world, across different businesses. If you need a system for your business, I can develop it. I also have extensive experience in extracting data from the web and other sources and transforming it into the needed format. I prefer to work with Python, using Scrapy, Beautiful Soup, Selenium, Splash and many other tools. If needed - R. You have data or where...

Maksym G.
$49/hr
  • Image/Object Recognition
  • C++
  • C
  • Python
  • Computer Vision
  • OpenCV
  • Algorithms
  • Mathematics
  • Multithreaded Programming
  • Artificial Intelligence

Hello, I'm a senior C++ developer with 10+ years of professional experience. In my career I have worked on successful AAA game development, mobile multiuser software development, and embedded software for set-top boxes, and I have searched for a good trading strategy for a dozen markets using machine learning on a cluster of forty high-end servers. I once participated in the Google AI Challenge (Planet Wars) and wrote a bot in three days that took 41st place out of 4500+ participants from around the world. My rating on freelancer.com is 5/5. Now I'm looking for a challenging job opportunity. My key skills: - C/C++, Python - computer vision with OpenCV library -...

Maximilian U.
$65/hr
  • Image/Object Recognition
  • Machine Learning
  • Computer Vision
  • Natural Language Processing
  • Neural Network
  • Supervised Learning
  • Data Science Consultation
  • Deep Learning
  • Artificial Intelligence
  • Cloud Computing
  • Bioinformatics

I am a business-minded data scientist with an entrepreneurial spirit. My experience is in hands-on product development and implementation of data-driven products in the Natural Language Processing and Computer Vision domain using state of the art machine learning algorithms and technologies. I have experience working in an international environment and leading cross-cultural teams, and the ability to communicate fluently in German and English with team members, customers, and stakeholders. My skillset includes traditional Machine Learning, Deep Learning, Data Pipelines, Cloud Computing and Data Acquisition Strategy. My tools of choice is...

Lobna M.
$48/hr
  • Image/Object Recognition
  • C++
  • Computer Vision
  • Python
  • OpenCV
  • Image Processing
  • Augmented Reality
  • PySpark
  • Databricks
  • WebGL
  • Azure Machine Learning
  • Microsoft Azure
  • Deep Learning
  • Data Science

My husband and I are working as a team using our diverse skills to work on projects related to Computer Vision, Image Processing, Drone Software Development, Data Analysis and Augmented Reality. We also have experience in deploying AI/Computer Vision algorithms into Mobile (Native Platforms iOS & Android) and Web (Using Web Assembly & WebGL). We have a strong background in C++ and OpenCV. We have done many projects related to face detection and filters and deploy it into web.

Mikhail M.
$40/hr
  • Image/Object Recognition
  • Data Modeling
  • Chemistry
  • Data Visualization
  • Machine Learning
  • R
  • Statistics
  • Python
  • MATLAB
  • Artificial Intelligence
  • Organic Chemistry
  • Neural Network
  • Algorithm Development
  • Big Data
  • Computer Vision
  • Recommendation System

I am a data scientist with a long-term experience. My main objectives are both theoretical and experimental data treatment tasks involving statistics, data analysis / modeling / visualization, and machine learning including natural language processing. My main fields of experience are: - Python (Numpy, SciPy, Scikit-Learn, Jupyter, Keras, Word2Vec, etc). - R (RStudio, Shiny). - Matlab. - Rapidminer. - IBM Watson. - Web development (HTML, JavaScript, Flask, Amazon and other cloud solutions). - IoT (Arduino, Raspberry Pi, ESP8266, etc). Please take a look at my work history for comments from other clients. Thank you in advance for your time...

Mark S.
$150/hr
  • Image/Object Recognition
  • Computer Vision
  • OpenCV
  • Python
  • Machine Learning
  • Data Science
  • Geospatial Data
  • Data Analysis
  • Digital Signal Processing
  • Algorithm Development
  • Image Processing
  • NumPy
  • SciPy
  • Tesseract OCR
  • Pattern Recognition

Algorithm developer, research scientist, and technical leader with experience in image and signal processing, data analysis, optimization, pattern recognition, and machine learning for startup companies in a variety of applications and industries. Most development performed in Python with Numpy, Scipy, OpenCV, and sklearn libraries. Generally work with early stage prototype algorithm development (getting it working, not optimizing for execution speed).

Ievgen G.
$125/hr
  • Image/Object Recognition
  • Computer Vision
  • C++
  • Image Processing
  • Digital Signal Processing
  • Machine Learning
  • Deep Learning
  • Augmented Reality
  • Python
  • Pattern Recognition
  • Data Science
  • Data Science Consultation
  • Deep Neural Network
  • Feature Extraction
  • Artificial Intelligence

I’m a tech entrepreneur with passion for computer vision and AI/ML. After completion of my Ph.D. in image processing I have founded the It-Jim company. As a result, I have successfully built a strong team of CV/ML experts. Most of them have grown under my direct supervision. Currently I’m actively involved in strategy and business development, technical leadership and management of multiple R&D and software engineering groups. We are delivering R&D and consulting services in computer vision, pattern recognition, machine learning, augmented and mixed reality. Some interesting facts about It-Jim: ✅ 20+ R&D engineers in CV/ML ✅ 10+ PhDs in...

Muhammad H.
$40/hr
  • Image/Object Recognition
  • Machine Learning
  • Computer Vision
  • Keras
  • OpenCV
  • Deep Learning
  • Data Science
  • TensorFlow
  • iOS
  • Android
  • Mobile App Development
  • Web Application
  • PyTorch
  • OCR Algorithm
  • Pattern Recognition

I’m an Electrical Engineer having 5+ years of practical experience in Computer Vision. I have extensive expertise in the design, development and deployment of industrial-scale machine vision systems. Work Experience: • 2D/3D computer vision based Motion Analysis and Tracking. • Object Classification, Localization and Segmentation. • Face detection and recognition. • Body posture estimation. • Optical Character Recognition. • Foreground and Background extraction. • Person Re-Identification. • Video or action classification. • Point cloud based action recognition and other applications. • Content based image retrieval. • Generative...

Ahmed A.
$125/hr
  • Image/Object Recognition
  • Machine Learning
  • Python
  • Deep Learning
  • Computer Vision
  • Keras
  • scikit-learn
  • Natural Language Processing
  • Data Science
  • Data Science Consultation
  • Data Visualization
  • Amazon EC2

IN 𝐓𝐎𝐏 27 Data Scientists on Upwork 𝐓𝐎𝐏 𝐑𝐀𝐓𝐄𝐃 PLUS Data Solutions Expert 𝟱+ 𝘆𝗲𝗮𝗿𝘀 of experience in the data industry, building small to large scale data solution pipelines for countless industries. 💡 Custom & Interactive 𝐃𝐀𝐒𝐇𝐁𝐎𝐀𝐑𝐃𝐒 and 𝐖𝐄𝐁 𝐀𝐏𝐏𝐒 💡 Efficient and Powerful machine learning models 💡 Data Analysis and Story Building 💡 ML empowered 𝐌𝐚𝐫𝐤𝐞𝐭 𝐀𝐧𝐚𝐥𝐲𝐭𝐢𝐜𝐬 💡 𝐍𝐋𝐏, 𝐂𝐕, 𝐃𝐞𝐞𝐩 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠 Implantations 💡 ETL 𝐃𝐚𝐭𝐚 𝐏𝐢𝐩𝐞𝐥𝐢𝐧𝐞 Creation 💡 𝗗𝗔𝗧𝗔 𝗖𝗟𝗘𝗔𝗡𝗜𝗡𝗚 & 𝗠𝗔𝗡𝗔𝗚𝗘𝗠𝗘𝗡𝗧 I work with the following tools and technologies: Python, Sci-Kit Learn, Django,...

Mladen F.
$70/hr
  • Image/Object Recognition
  • Deep Learning
  • Computer Vision
  • Natural Language Processing
  • Artificial Neural Network
  • Artificial Intelligence
  • Machine Learning
  • Consultant
  • Python
  • Python Numpy
  • Python Scikit-Learn
  • PyTorch
  • TensorFlow
  • Linux
  • Docker
  • Data Science

I have six years of professional work experience in developing computer vision and natural language processing solutions. I turn academic cutting-edge research ideas into products. Besides solving problems in the industry, I also teach the Advanced Machine Learning Methods class for the local university college. My primary focus has been on developing solutions based on current research articles. I use Python, Pytorch, and Tensorflow in close cooperation with software engineers, producing quality code. I also have experience in consulting and mentoring. I have developed novel solutions for advanced Deep Learning products deployed in the...

Memoona T.
$45/hr
  • Image/Object Recognition
  • Machine Learning
  • Deep Learning
  • Python
  • Data Science
  • TensorFlow
  • Keras
  • Researcher
  • C++
  • GitHub
  • Scripting
  • Convolutional Neural Network
  • Artificial Intelligence
  • Deep Neural Network
  • Computer Vision
  • Model Optimization

Welcome to my profile! I am Memoona Tahira, and I am a Machine Learning Engineer. I transitioned to freelance work from academia, with a MS in Computer Science and published research. I love to work with clients who have specific deliverables for their machine and deep learning tasks, preferably to build an MVP. My interest lies in: * Digital Image Processing * Deep Learning for Computer Vision (Image and Video), * Deep Learning with Speech and Audio * Time Series analysis for financial data For all my projects, my focus is on cleaning and preparing data for a data-centric approach, and then creating custom models that are ready for...

Faisal E.
$50/hr
  • Image/Object Recognition
  • Python
  • Machine Learning
  • Image Processing
  • Computer Vision
  • OpenCV
  • PyTorch
  • TensorFlow
  • Keras
  • Deep Learning
  • Data Analysis

I am an expert Machine Learning and AI Engineer. My purpose is to propose solutions to my clients for their problems and guide them throughout the journey to minimize the cost and maximize the output. I can understand AI problems, recommend multiple solutions and develop the whole solution from scratch to reach the Proof of Concept, MVP, and Production stage. I have more than five years of experience in ML. I have developed multiple AI systems, and millions of users have used these systems. 3 Products that I was part of have raised seed fundings totaling 5.5 Million Dollars. Artificial Intelligence skills: ⠀✓...

Paul Jim C.
$10/hr
  • Image/Object Recognition
  • Artificial Intelligence
  • Data Processing
  • Medical Imaging
  • Data Segmentation
  • Computer Vision
  • Machine Learning
  • Data Annotation
  • Data Labeling
  • Virtual Assistant
  • Data Entry
  • Data Extraction
  • Data Cleansing
  • Image Processing
  • Adobe Photoshop
  • Photo Editing

A Top Rated data annotation specialist. I have years of experience in this field and have done 5000+ hours of Data Annotation projects in Upwork. I know how to deliver the top quality. If you are looking for someone to perform data annotation at high accuracy and speed, please feel free to contact me. I have excellent expertise in using: Computer Vision Annotation Tool (CVAT) COCO Annotation Tool Labelbox LabelMe LabelImg VGG Image Annotator ITK-Snap 3D Slicer and other web-based annotation tools. I will be able to work on a part-time or full-time basis depending on your needs. Please let me know if you would like to discuss this further....

Himanshu J.
$50/hr
  • Image/Object Recognition
  • Data Science
  • Machine Learning
  • TensorFlow
  • Computer Vision
  • Natural Language Processing
  • Artificial Neural Network
  • Deep Learning
  • Convolutional Neural Network
  • Data Analysis
  • Data Cleansing
  • Data Science Consultation
  • Data Scraping
  • Unsupervised Learning
  • Neural Network

I am a PROBLEM SOLVER Your problem deals with DATA, I am your guy Your problem is related to AI, I am your guy You need an end-to-end solution for your IDEA, I AM YOUR GUY! Programming Languages: › Python › R › Java › Javascript › C › C++ Frameworks: › Tensorflow › Keras, Flask › Django › PySpark › AngularJS › ReactJS › Node.JS › Jquery Technologies: › Web Development › Machine Learning › Deep Learning › Data Science Technical Skills: › Data Acquisition › Data Pre-processing › Data Analysis and Interpretation › Data Modeling Data Visualization: › Tableau › RAW › Matplotlib › Seaborn › Ggplot › Tensorflow dashboard Other...

Sairaj B.
$50/hr
  • Image/Object Recognition
  • Mobile App Development
  • Computer Vision
  • Image Processing
  • Machine Learning
  • Deep Learning
  • MATLAB
  • Android Studio
  • iOS Development
  • Flutter
  • Machine Learning Model
  • OpenCV
  • C++
  • Swift
  • Kotlin
  • Android

An Entrepreneur, Computer Vision Researcher, and Mobile App developer. A hands-on learner that is persistently inspired by innovation with a niche for product development. Strong team member with outstanding organizational and time management skills, capable of working well under pressure in fast-paced environments.

László B.
$90/hr
  • Image/Object Recognition
  • Raspberry Pi
  • Python
  • Rapid Prototyping
  • Machine Learning
  • TensorFlow
  • Artificial Neural Network
  • Computer Vision
  • OpenCV
  • Home Automation
  • Keras
  • PyTorch
  • PyCharm
  • Deep Learning
  • Convolutional Neural Network

Hi, My name is Laszlo Benke. I graduated as an Electrical Engineer M.Sc. and Software Developer. I am an expert in AI/Machine learning (object detection) and IoT/Edge Computing (Raspberry Pi and Jetson Nano, Deepstream, Triton/TAO). I prefer long-term projects, but you can hire me for shorter ones also as consultant. Strength: - Python Machine learning projects (Tensorflow, Keras) - especially with computer vision, object detection (YOLO, Mobilenet SSD, OpenPose), shape recognition (dlib) and action recognition (X3D, mmaction etc.) - Embedded image processing solutions (Edge computing): computer vision on Jetson Nano with Deepstream6.1...

How it works

1. Post a job (it’s free)

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.

Trusted by 5M+ businesses

How Image Recognition Works

Interpreting the visual world is one of those things that’s so easy for humans that we’re hardly even conscious we’re doing it. When we see something, whether it’s a car, a tree, or our grandma, we don’t (usually) have to consciously study it before we can tell what it is. For a computer, however, identifying a human being at all (as opposed to a dog, a chair, or a clock), let alone your grandmother, represents an amazingly difficult problem.

And the stakes for solving that problem are extremely high. Image recognition, and computer vision more broadly, is integral to a number of emerging technologies, from high-profile advances like driverless cars and facial recognition software to more prosaic but no less important developments, like building smart factories that can spot defects and irregularities on the assembly line, or developing software to allow insurance companies to process and categorize photographs of claims automatically.

We’re going to explore the challenge of image recognition and how data scientists are using a special type of neural network to address it.

Learning to see is hard (and expensive)

A good way to think about this problem is as applying metadata to unstructured data. In our article on content-based recommendations, we looked at some of the challenges of categorizing and searching content in cases where that metadata is sparse or nonexistent. Hiring human experts to manually tag libraries of movies and music may be a daunting task, but it’s an impossible one when it comes to challenges like teaching the navigation system in a driverless car to distinguish pedestrians crossing the road from other vehicles, or tagging, categorizing, and filtering the millions of user-uploaded pictures and videos that appear daily on social media.

One way to solve this would be through neural networks. While in theory we could use conventional neural networks to analyze images, in practice this turns out to be prohibitively expensive from a computational perspective. For instance, a conventional neural network attempting to process even a relatively small image (let’s say 30×30 pixels) would still require 900 inputs and more than half a million parameters. While that might be manageable for a reasonably powerful machine, once the images become larger (say 500×500 pixels), the number of inputs and parameters required increases to truly absurd levels.
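
To make those numbers concrete, here is a rough sketch of the arithmetic. The article doesn't specify an architecture, so the single hidden layer of 600 units (and the grayscale input) is purely an assumption chosen for illustration; other layer sizes would give different totals.

```python
def dense_parameter_count(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units per layer, inputs first.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        # Every input connects to every unit, plus one bias per unit.
        total += fan_in * fan_out + fan_out
    return total


# Assumed architecture: grayscale pixels -> 600 hidden units -> 1 output.
small = dense_parameter_count([30 * 30, 600, 1])
large = dense_parameter_count([500 * 500, 600, 1])

print(small)  # 541201
print(large)  # 150001201
```

Under these assumptions, growing the image from 30×30 to 500×500 pixels pushes the parameter count from roughly half a million to roughly 150 million.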

What’s more, applying neural networks to image recognition can lead to another problem: overfitting. Simply put, overfitting is what happens when a model tailors itself too closely to the data it’s been trained on. Not only does this generally lead to added parameters (and thus, further computational expense), it actually results in a loss in general performance when it’s exposed to new data.

The solution? Convolution!

Fortunately, a relatively straightforward change to the way a neural network is structured can make even large images more manageable. The result is what we call convolutional neural networks (also called CNNs or ConvNets).

One of the advantages of neural networks is their general applicability, but as we’ve seen when dealing with images, this advantage turns into a liability. CNNs make a conscious tradeoff: By designing a network specifically to handle images, we sacrifice some generalizability for a much more feasible solution.

Specifically, CNNs take advantage of the fact that, in any given image, proximity is strongly correlated with similarity. That is, two pixels that are near one another in a given image are more likely to be related than two pixels that are further apart. However, in a typical neural network, every pixel gets connected to every single neuron. In this case, the added computational load actually makes our network less rather than more accurate.

Convolution solves this by simply killing a lot of these less important connections. In more technical terms, CNNs make image processing computationally manageable by filtering connections by proximity. Rather than connecting every input to every neuron in a given layer, CNNs intentionally restrict connections so that any one neuron only accepts inputs from a small subsection of the layer before it (like, say, 3×3 or 5×5 pixels). Thus, each neuron is only responsible for processing a certain part of an image. (Incidentally, this is more or less how the individual cortical neurons in your brain work: Each neuron responds to only a small part of your overall visual field.)
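
As a small illustration of what "filtering connections by proximity" means, the sketch below lists which input pixels feed a single neuron. The function name `receptive_field` and the 30×30 image size are assumptions made up for this example, not part of any library.

```python
def receptive_field(row, col, size=3):
    """Return the (row, col) input pixels feeding the neuron whose
    window starts at (row, col). Hypothetical helper for illustration."""
    return [(row + dr, col + dc) for dr in range(size) for dc in range(size)]


image_height, image_width = 30, 30

# Fully connected: each neuron sees every pixel in the image.
dense_inputs_per_neuron = image_height * image_width

# Convolutional: each neuron sees only a 3x3 patch.
local_inputs_per_neuron = len(receptive_field(0, 0, size=3))

print(dense_inputs_per_neuron)  # 900
print(local_inputs_per_neuron)  # 9
print(receptive_field(2, 5))    # pixels from (2, 5) through (4, 7)
```

With a 3×3 window, each neuron handles 9 inputs instead of 900, which is where most of the savings come from.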

Inside a convolutional neural network

But how does this filtering work? The secret is in the addition of two new types of layers: convolutional and pooling layers. We’ll break the process down below, using the example of a network designed to do just one thing: determine whether a picture contains a grandma or not.

The first step is the convolution layer, which actually consists of several steps in itself:

  1. First, we’ll break down a picture of grandma into a series of overlapping 3×3 pixel tiles.
  2. Next, we’ll run each of these tiles through a simple, single-layer neural network, using the same weights for every tile. This will turn our collection of tiles into an array. Because we kept each of the images small (in this case, 3×3), the neural network required to process them stays small and manageable.
  3. Then, we’ll take those output values and arrange them in an array that numerically represents the content of each area of our photograph, with the axes representing height, width, and color channels. So in our case, we’d have a 3×3×3 representation for each tile. (If we were talking about videos of grandma, we’d throw in a fourth dimension for time.)
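
The steps above can be sketched in a few lines of plain Python. This is a simplified illustration, not a production implementation: it assumes a single grayscale channel rather than three color channels, a stride of 1, no padding, and a hand-picked kernel (a real network would learn its weights during training). Like most deep learning libraries, it slides the kernel without flipping it.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (both lists of lists) with stride 1
    and no padding, applying the same weights to every tile."""
    k = len(kernel)
    out_height = len(image) - k + 1
    out_width = len(image[0]) - k + 1
    output = []
    for r in range(out_height):
        row = []
        for c in range(out_width):
            total = 0.0
            for dr in range(k):
                for dc in range(k):
                    total += image[r + dr][c + dc] * kernel[dr][dc]
            row.append(total)
        output.append(row)
    return output


# Toy 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1] for _ in range(3)]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[3.0, 3.0, 0.0], [3.0, 3.0, 0.0], [3.0, 3.0, 0.0]]
```

The output is largest where a tile straddles the edge and zero where the tile is uniformly bright, so the array records where in the picture that feature appears.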

Then comes the pooling layer, which takes these three- (or four-) dimensional arrays and applies a downsampling function along the spatial dimensions. The result is a pooled array that keeps only the most important parts of the image and discards the rest, which both reduces the computations we’ll need to do and helps avoid the problem of overfitting.
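
The article doesn't name a specific downsampling function; max pooling is one common choice, and the sketch below assumes it, using non-overlapping 2×2 windows. The input values are made up for illustration.

```python
def max_pool2d(feature_map, size=2):
    """Downsample by keeping the largest value in each non-overlapping
    size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

A 4×4 array shrinks to 2×2, so the layers that follow have a quarter as many values to process.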

Lastly, we’ll take our downsampled array and use it as the input for a regular, fully connected neural network. Since we’ve dramatically reduced the size of the input using convolution and pooling, we should now have something a normal network can handle while still preserving the most important parts of the data. The output of this final step will represent how confident the system is that we have a picture of a grandma.
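
As a rough sketch of this final step, the code below flattens a pooled array and passes it through a single fully connected output unit with a sigmoid, which squashes the score into a value between 0 and 1. The weights and bias are invented for illustration (a trained network would learn them), and a real classifier would usually have one or more hidden layers before the output.

```python
import math


def sigmoid(x):
    """Map any real-valued score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_confidence(features, weights, bias):
    """Flatten a 2D feature array and compute one output unit."""
    flat = [value for row in features for value in row]
    score = sum(w * v for w, v in zip(weights, flat)) + bias
    return sigmoid(score)


pooled = [[4, 2], [2, 8]]           # example downsampled array
weights = [0.5, -0.25, 0.1, 0.05]   # made-up weights, one per input value
bias = -2.0                         # made-up bias

confidence = dense_confidence(pooled, weights, bias)
print(round(confidence, 3))  # 0.525
```

Here the network would be about 52% confident that the picture shows a grandma, which with these arbitrary weights is little better than a coin flip.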

Note that this is a simplified explanation of how a convolutional neural network works. In real life, the process is (excuse the pun) more convoluted, involving multiple convolutional, pooling, and hidden layers. Additionally, real CNNs typically involve hundreds or thousands of labels, rather than just one.

Implementing convolutional neural networks

Building a Convolutional Neural Network from scratch can be a time-consuming and expensive undertaking. That said, a number of APIs have recently been developed that aim to allow organizations to glean insights from images without requiring in-house computer vision or machine learning expertise.

  • Google Cloud Vision is Google’s visual recognition API, based on the open-source TensorFlow framework and using a REST API. It detects individual objects and faces and contains a pretty comprehensive set of labels. It also comes with a few bells and whistles, including OCR and integration with Google Image Search to find related entities and similar images from the web.
  • IBM Watson Visual Recognition, part of the Watson Developer Cloud, comes with a large set of built-in classes, but is really built for training custom classes based on images you supply. Like Google Cloud Vision, it also supports a number of nifty features, including OCR and NSFW detection.
  • Clarifai is an upstart image recognition service that also uses a REST API. One interesting aspect is that it comes with a number of modules that help tailor its algorithm to particular subjects, like weddings, travel, and food.

While the above APIs may be suitable for some general applications, for specific tasks you might still be better off building a custom solution. Luckily, there are a number of libraries available that make the lives of data scientists and developers a little easier by handling the computational and optimization aspects, allowing them to focus on training models. Many of these libraries, including TensorFlow, DeepLearning4J, Torch, and Theano, have been used successfully in a wide variety of applications.
