Hire the best Image/Object Recognition professionals

Check out Image/Object Recognition professionals with the skills you need for your next job.
Clients rate Image/Object Recognition professionals
Rating is 4.8 out of 5.
based on 4,083 client reviews
  • $70 hourly
    ►About me ○ 7 years of providing software development services for autonomous robots ○ Helped take 5 major robotics products from idea stage to market (Raising $5m+) ○ Full stack robotics developer - SLAM, Perception (AI), Motion Planning, controls, ROS, ROS2, Simulations (Gazebo, Unity, Webots), devOps, UI dashboards ►Key services ○ Autonomous robot prototyping: From idea stage to market launch (manipulators, and wheeled/tracked robots) ○ Simulations: Custom and advanced process simulations for robotics application (Gazebo/Unity3D) ○ Specialized problem solving: A specific R&D solution for your robotics applications ►ROS packages ○ SLAM - SLAM Toolbox, Gmapping, Hector SLAM, google cartographer ○ State estimation - robot_localization, AMCL ○ Stacks - nav2, move_base, moveit2!, moveit! ○ Web-UI - rosbridge, roslibjs ► Past projects on autonomous robots ○ Forklifts weighing tonnes of kilos for pallet movement across factories. ○ Cleaning robot for deployment at airports and hotels. ○ Warehouse sorting system with 100+ robots with coordinated movement. ○ Perception integrated arm robot for assisting veterans- object pick place, door opening, other manipulation. ○ ROS2 architecture for fleet of autonomous boats
    Image/Object Recognition
    Simulation Software
    Unity
    Deep Learning
    OpenCV
    Computer Vision
    Robotics
    Robot Operating System
    C++
    Artificial Intelligence
    Python
  • $120 hourly
    As an Expert-Vetted AI expert, I bring a wealth of experience and knowledge to any project I undertake. With expertise in a wide range of AI-related areas, including the development of Machine Learning Applications, Intelligent Agents/Bots, Automation of Intelligent Systems, and more, I can provide comprehensive services to meet your needs. My approach is highly agile, transparent, and structured, and I work closely with my clients to ensure that they are fully informed about the progress of their projects. Whether you require consultation, research, presentation development, prototype development, or End-to-End application development, I have the skills and expertise to deliver outstanding results. Beyond my freelance services I am the founder of POLAR FREQUENCY, an AI agency that specializes in tackling projects of any size. We compose a team specifically tailored to meet the needs of that project while keeping the project within the budget, making it both efficient and effective. If you're looking for an AI expert who can provide top-notch expertise, professionalism, and dedication to your project with the potential to scale your project up then look no further. Contact me today to learn more about how I can help take your project to the next level.
    Image/Object Recognition
    Artificial Intelligence
    Data Mining
    Data Analysis
    Deep Neural Network
    Artificial Neural Network
    Computer Vision
    TensorFlow
    Data Science
    Neural Network
    ChatGPT
    GPT-3
    Machine Learning
    Deep Learning
    Python
    Natural Language Processing
  • $50 hourly
    M.Sc. in Computer Science with a mention in image processing and pattern recognition. I'm a computer vision specialist with over 10 years of hands-on experience across a variety of computer vision applications. I have a strong background in maths and computer science, research skills in computer vision, and senior-level programming experience in C/C++, MATLAB and Python. In recent years I have focused on developing computer vision solutions for real-world problems involving extensive research and groundbreaking solutions. I have over 10 years of experience working with OpenCV, TensorFlow and DeepStream, and porting image processing solutions to mobile platforms for real-time processing. I'm very interested in researching and developing new, challenging computer vision tasks, including pattern recognition, machine learning and image analysis. I am seeking opportunities to develop and maintain complete computer vision applications, whether standalone, back ends for smart websites, or lightweight solutions for mobile applications.
    Image/Object Recognition
    Python
    Deep Neural Network
    Artificial Neural Network
    Video Processing
    Deep Learning
    Java
    Android App Development
    C
    MATLAB
    Image Processing
    C++
    Machine Learning
    OpenCV
    Computer Vision
  • $100 hourly
    - 15+ years of experience in computer vision and machine learning - worked at a research lab and for industry - freelance for international clients I've worked on projects involving: object detection, human and animal pose estimation, multiple objects tracking, deep-learning, multi-camera calibration and synchronization, audio and video data acquisition, drone footage processing, camera stabilization and more. Current stack: Python (numpy, pandas, opencv, keras, pytorch), AWS, Google Cloud, GPU, machine learning APIs, Docker. Past (passive) stack: C++ (STL, Boost, CMake), C, MATLAB.
    Image/Object Recognition
    Git
    Artificial Intelligence
    FFmpeg
    Video Processing
    Linux
    Machine Learning
    Neural Network
    Keras
    Python Scikit-Learn
    pandas
    Deep Learning
    OpenCV
    Python
    Computer Vision
  • $35 hourly
    Advanced C#/Unity 3D developer specialising in interactive application development, Augmented Reality and Virtual Reality. Strong Image processing and 3D Graphics skills. Years of experience in the interactive and augmented reality field. Intermediate Three.JS/A-frame developer with lots of WebAR experience including 8th Wall and Zappar Creator of the AdVis Interactive Projection Software. Designed and Developed many custom projects using interactive technologies such as motion detection, augmented reality, virtual reality (Oculus Rift), etc. Lots of experience with sensors such as Leap Motion, Kinect, Kinect v2 and Azure Kinect, Panasonic D-Imager and various IMUs. Clients have included McDonalds, Bentley, Cartier, Audi, Aston Martin, GSK and Pepsi (Please contact for Portfolio).
    Image/Object Recognition
    Computer Vision
    Graphic Design
    Mobile App Development
    Three.js
    Augmented Reality
    Unity
    Android
    Image Processing
    iOS Development
    AR Application
    Virtual Reality
    Microsoft Kinect Development
    OpenCV
    C#
    .NET Framework
  • $25 hourly
    I am a broad-minded Machine Learning Engineer with more than 5 years of development experience with various machine learning projects including Computer Vision, Deep Learning, Reinforcement Learning. I have over 10 years of university teaching experience in machine learning and neural networks, so I implement systematic and methodological approaches. I have successfully accomplished more than 100 projects in such industries as scientific research, meteorologic centers, industries, smart houses, and modeling. I have cooperated with companies, startups, and personal clients from Ukraine, Russia, Germany, France, and The USA. Most of the tasks I solved were primarily of scientific value and required a quick orientation and immersion in the peculiarities of scientific direction. I am always focused on the maximum result for the customer. Please see some of my cases and my client's feedback: 🏆 I made a research and realized the project concerning the application of neural networks in the technological process at “NTI UrFU”: “The execution is excellent. Everything is beautifully, diligently done. The codes work as they should and fulfill the task of setting up and running the neural network. Done on time, with no delays. If there are any questions from me or Ivan, they are resolved easily. The performer is polite. I am satisfied with the work. I recommend him! Thank you very much for your help! I wish to cooperate in my other projects.”. - Sergey K., a designer-technologist of “NTI of UrFU” 🏆 I analyzed data in order to find patterns in a time-series study on the Russian federal reserve data: “A punctual and communicative performer. Checks all the details with the customer for a better job performance and takes them into account. All the stages of work are performed on time! I was pleased.”. - Konstantin A., a researcher from Moscow Economic School. ♦️ Participation in the development of a human thermal image recognition system as part of the smart house project. 
♦️ Meteorological project - forecasting system development of precipitation based on space images and Eumetsat data. 🎯 When approaching a project, I always follow two basic rules - responsibility and scrupulous accuracy. Your project will be completed on time and will meet the highest standards. I am responsible, respond to the messages promptly, and I am available to discuss the project as soon as possible. I will be glad to provide the following services (including, but not limited to): 🔹 Selecting appropriate data sets 🔹 Picking appropriate data representation methods 🔹 Identifying differences in data distribution that affects model performance 🔹 Verifying data quality 🔹 Transforming and converting data science prototypes 🔹 Performing statistical analysis 🔹 Designing ML systems 🔹 Researching and implementing ML algorithms and tools 🔹 Running machine learning tests 🔹 Using results to improve models 🔹 Training and retraining systems when needed 🔹 Developing machine learning apps according to client requirements 🔹 Artificial intelligence Machine Learning Skills: - Сoding Skills in Python (Scripting, PyCharm, Jupyter, Google Colab) - Python Data Analytics Libraries (Scikit-Learn, Pandas, NumPy, SciPy) - Python Data Visualization Libraries (Matplotlib, Seaborn, Bokeh) - Deep learning libraries and toolboxes (TensorFlow, Keras, PyTorch) Additional Skills: - Experience with Data scraping tasks and libraries - Strong math background - Experience with Google Cloud Engine - Google Script advanced user - *nix systems advanced user 🤝 Let's start working on your project right now! Contact me to discuss details.
    Image/Object Recognition
    PyTorch
    Computer Vision
    TensorFlow
    Reinforcement Learning
    Deep Learning
    Statistics
    Business Intelligence
    Machine Learning
    Data Visualization
    Data Mining
    Google Sheets
    Data Analytics
    Data Science
    Data Analysis
    Python
  • $150 hourly
    Very experienced in Python across a wide range of fields, especially automating tasks, trend analysis, Excel manipulation, data visualization, and core machine learning (regression, clustering, classification, etc.), along with a healthy amount of deep learning. Heavy focus on finance, backtesting strategies, and automated trading. Also a decent amount of computer vision and other visually related subjects, such as automated art generation, NFTs, GANs, etc. I have a lot of experience with AWS for data backups, deployment, automation, and using multiple cloud computers at once: AWS EC2, S3, Amplify, DynamoDB, Lambda, etc. API knowledge to pull data and work with third-party software.
    Image/Object Recognition
    Artificial Intelligence
    Quantitative Analysis
    Microsoft Excel
    Stock Option Agreement
    Automation
    Financial Analysis
    Data Visualization
    Quantitative Finance
    Artificial Neural Network
    Convolutional Neural Network
    Machine Learning
    Data Science
    Computer Vision
    Deep Learning
    Python
  • $130 hourly
    Wondering if your idea is possible? Let's discuss it. Wondering if your idea can be turned into software? Let's explore it. I've been a consultant for new products/platforms, architecture design, programming, modeling, and so much more. Formally, I’ve worked as a researcher and developer alike in creating end-to-end software solutions. I have published research in the realm of formal models, machine learning, and data science. My passion projects include algorithmic trading bots, computer vision/facial recognition, ecology modeling, and Natural language processing(NLP). I hold a Bachelor of Science degree in Computer Science, graduating Summa Cum Laude with an emphasis in data science & Artificial Intelligence(AI). Finally, I’m the founder of “Data Bindu LLC” which seeks to push the boundaries of what is possible with data. Past Experience : Identified a bimodal distribution underlying file transfer error causing 7x (700%) resource expenditure than baseline. Created a recommendation engine which improved user-to-user connectivity through predictive analytics. Created an automated solution which alleviated ≈90% of a given business workflow. Created ecology models to predict cyanotoxins from algae blooms in freshwater environments. Visit my profile for more successful projects. PUBLICATIONS Agnew, W., Fischer, M., Foster, I. and Chard, K., 2016, November. An ensemble-based recommendation engine for scientific data transfers. In 2016 Seventh International Workshop on Data-Intensive Computing in the Clouds (DataCloud) (pp. 9-16). IEEE. Fischer, M., Riley, D., 2016, April. Using Data Mining in Combination with Machine Learning to Enhance Crowdsourcing of a Formal Model of Biodiesel Production. In 49th Annual Midwest Instruction and Computing Symposium (MICS 2016).
    Image/Object Recognition
    Large Language Model
    OpenAI Codex
    IT Consultation
    Data Analysis
    Software Architecture & Design
    Data Sourcing
    Software Development
    Artificial Intelligence
    Image Processing
    Data Science
    Data Science Consultation
    Computer Vision
    Deep Learning
    Machine Learning
    Model Tuning
  • $80 hourly
    I have over ten years of experience in software and hardware architecture, development, and testing. I am a generalist -- I enjoy researching and creating new tools/algorithms to solve problems that lack existing solutions. My more specific skills are: 1) *performance bottlenecks* -- I squeeze every last drop of performance out of hardware, including multi-core/vector CPU hardware and GPU accelerators. I have deep knowledge in this area and can optimize at all stack levels from algorithms to machine code to hardware pipelines. 2) *deep neural networks* -- I know neural network algorithms and optimizations. I specialize in image and video inference. 3) *security architecture* -- the first step to a secure architecture is knowing your adversaries and what they are capable of. A secure architecture must be designed like watertight plumbing -- a leak anywhere results in catastrophic failure once an adversary finds it. I can help you define a watertight security architecture for your product(s) and infrastructure, preferably using military-grade asymmetric-key encryption technology. 4) *computer networks and IT infrastructure* -- I have several years of experience managing compute, storage, and device infrastructure and know the Internet Protocol (IP) stack. My goal is to be your "easy button" and produce a solution that we're both happy with. First, I will make sure I understand your problem space before working on a solution, respecting your time by asking only key questions. Then, as I start working, I will provide updates/metrics/demos to ensure that I deliver what you want. Finally, I will produce code/documentation/artifacts optimized for readability and maintainability.
    Image/Object Recognition
    JavaScript
    Software Architecture & Design
    C
    Automation
    Assembly Language
    Computer Vision
    Performance Optimization
    Linux
    Deep Neural Network
    OpenCL
    CUDA
    TensorFlow
    SQL
    Python
    C++
  • $50 hourly
    "Programming" ... defines my life. I have been into programming since my teens. Always ready to explore new technologies. Willing to engage in some serious work. I have been working in Machine Learning, Deep Learning and Data Science for the past seven years. Besides that, I also have vast experience in network programming, CUDA programming, Android app development and web development. Clients range from startups to Fortune 500 companies across e-commerce, finance, IoT, audio, medicine, healthcare, real estate and time series domains. Technical domains of my past projects: Android development: ✓ Java, Kotlin programming languages Web: J2EE, Angular.js, React.js, Node.js Databases: Firebase, MySQL, CoreData, SQLite, CloudKit, JSON, XML, MongoDB, DynamoDB Linear regression, Logistic regression, PCA, Backpropagation, Autoencoders, Generative adversarial network, Reinforcement learning, Dropout regularization, K-means, KNN, SVM, Gaussian mixture model, EKF, UKF, Particle filter, SLAM, Stereo matching, Point cloud, Structure from motion, Path planning, Object detection, Classification, Localization, Segmentation, YOLO, OpenCV * Natural Language Processing (NLP) -- intent, question answering, e-commerce search, conversations. DialogFlow, wit.ai -- abstractive and extractive text summarization, query-based summarization. * Time series analysis, prediction -- medical signals, stock/currency markets. * Deep reinforcement learning -- decision-making/scheduling problems. * Cloud (AWS, Azure, GCE) based network training and deployment. Hands-on expertise with a range of deep learning tools: * TensorFlow, Keras, TensorRT, PyTorch, TensorFlow Serving, Caffe, Deeplearning4j, Kubernetes, TensorFlow Lite and Theano expert using CUDA backend. * Distributed network training (across CPUs/GPUs). * Building GPU-accelerated workflows with TensorFlow and Kubernetes * Data processing: pandas, dask, h5py, Spark. * Speech, Audio analysis -- speech to text, noise removal, source separation. 
DeepSpeech, DeepVoice, Alexa. -- text to speech: PocketSphinx, Kaldi toolkit * Chatbots -- Mycroft open-source AI chatbot * Computer Vision, Object detection, Segmentation -- locating, segmenting, bounding objects in medical/natural images. SSD, YOLO, R-CNN. Proficient in building/training cutting-edge, novel neural architectures across various verticals. Familiar with the latest research involving DNN, CNN, RNN / LSTM, seq2seq, GAN, attention models. Natural Language Processing with Transformers: Hugging Face Transformers RoBERTa, XLNet, BERT, GPT-3, Transformer-XL, BERT with TensorRT, Custom models Biomedical NLP: BioBERT, SciSpacy, BioELMO, NegBio Question-Answering: SQuAD, HotpotQA, MovieQA, Biomedical question answering Embeddings: Universal Sentence Encoder, ELMo, BERT, Flair, GloVe, FastText Cloud services: Google Cloud, AWS, IBM Cloud, Digital Ocean, Heroku, Paperspace and others If you are searching for someone who is passionate about programming and research, you are on the right page.
    Image/Object Recognition
    Automatic Speech Recognition
    Android App Development
    Amazon Rekognition Video
    AWS Lambda
    Deep Neural Network
    Machine Learning
    Chatbot
    PyTorch
    Computer Vision
    Deep Learning
    Natural Language Processing
    C++
    Java
    Python
    TensorFlow
  • $225 hourly
    Machine learning scientist/engineer with hands-on expertise in deep learning, computer vision, time series, data analysis, computational science, mathematical modeling, computer simulations, algorithm development, and software development. ML -- computer vision: segmentation, classification, object detection, e.g., R-CNN, U-Net, YOLO. ML -- time-series: recurrent neural networks, convolutional neural networks, e.g., LSTM, GRU, CNN. ML -- general: classification, regression, prediction. Development of end-to-end ML pipelines: pre-processing, model development, post-processing, model evaluation, hyper-parameter tuning, model deployment. Applications: Medical imaging and devices, biofluids, fluid dynamics, combustion, renewable energy, computational engineering. Technical/non-technical writing: Patents, proposals, white papers, peer-reviewed papers, presentations. Education: PhD -- Applied Math (Cornell), Postdoctoral Fellow -- Chemical Engineering (MIT). Programming: Python, C++, Java, MATLAB. Machine Learning: Keras, PyMC3, Scikit-learn, TensorFlow, Weka. Computer Vision: OpenCV, Pillow, SimpleITK. Data Science: Matplotlib, NumPy, Pandas, SciPy, Seaborn. CAD/CFD: Ansys, Gmsh, OpenCascade, OpenFOAM, Overture, Rhino3D, TGrid. Misc: Amazon/Google cloud, DICOM, Git, Jira, UNIX tools/admin.
    Image/Object Recognition
    DICOM
    Object Detection
    Computational Fluid Dynamics
    Image Processing
    Algorithm Development
    Mathematics
    SciPy
    Python Scikit-Learn
    Python
    Computer Vision
    TensorFlow
    Convolutional Neural Network
    Machine Learning
    Deep Learning
  • $35 hourly
    I am a computer vision engineer with a focus on artificial intelligence and strong knowledge of machine learning frameworks and tools such as PyTorch and TensorFlow. My key expertise includes: 1. Computer Vision 2. Machine Learning 3. Artificial Intelligence 4. Data Science. I have extensive knowledge of YOLO, Mask R-CNN, and CNNs, and have built several projects using these algorithms. I specialize in counting small objects for scientific purposes, such as tracking the movement of microbes and sperm trajectories. Additionally, I am currently working on a GAN architecture for image super-resolution, and I have already published a paper on it. Feel free to contact me. I look forward to helping you finish your projects.
    Image/Object Recognition
    Mobile App Development
    PyTorch
    Machine Learning
    Firebase
    Python
    Java
    Kotlin
    Keras
    Machine Learning Model
    TensorFlow
    Data Science
    Computer Vision
    Android
  • $25 hourly
    Big Dolphin Co., Ltd www.bigdolphin.com.vn Analog and digital circuit design. PCB design. Software and hardware development with MCU and FPGA: AVR MCU's, Microchip PIC MCU's, Altera KIT's, C/C++, Visual Basic, VHDL/Verilog HDL.
    Image/Object Recognition
    OpenCV
    Computer Vision
    Eagle
    Image Processing
    PCB Design
    Integrated Circuit
    Electronic Design
    Digital Signal Processing
    Digital Electronics
    Circuit Design
    Analog Electronics
    HTML
    Embedded System
    C++
    C
  • $20 hourly
    I am a data scientist with 3 years of professional experience. I am really interested in machine learning, deep learning, natural language processing and computer vision, and I enjoy solving challenging tasks. I have experience working with databases and complex data, implementing models from scientific articles (for example, LaneNet), and preparing models for production using Docker containers and Flask. My skills: ✔️ Python ✔️ PyTorch ✔️ Scikit-learn ✔️ Matplotlib ✔️ NumPy ✔️ Pandas ✔️ PySpark ✔️ Transformers (from Hugging Face)
    Image/Object Recognition
    Regex Writing
    Flask
    PySpark
    Matplotlib
    Neural Network
    Python
    Data Science
    Natural Language Processing
    pandas
    NumPy
    Machine Learning
    PyTorch
    Computer Vision
    Python Scikit-Learn
  • $65 hourly
    👋 Hello there, Glad to make your acquaintance. I am a Python developer with a passion for data science and a focus on the study and analysis of data to uncover hidden insights. I have a proven engineering and professional business background expertise in data processing, data mining, data analysis, and data visualization. I am a TensorFlow certified machine learning developer and can help with your investigation and needs in this area. Do reach out to me to start a conversation on how we can leverage the power of machine learning and artificial intelligence to achieve quantifiable business improvements. List of areas I can help: ★ Python coding/scripting/web scraping ★ Raspberry PI and Arduino projects ★ Data Mining, Data Analysis, Data Modeling, Data Visualization, ETL ★ Time Series Analysis, Computer Vision, Natural Language Processing, Deep Learning ★ ML Project Plan Setup ★ ML Data Collection and Preparation ★ ML Model Exploration ★ ML Model Tuning and Refinement ★ ML Test and Evaluate ★ Crafting a compelling business case and narrative grounded solidly on data
    Image/Object Recognition
    Data Analysis
    Algorithm Development
    Data Modeling
    Time Series Analysis
    Data Extraction
    OCR Algorithm
    Data Science
    Machine Learning
    Data Mining
    Data Entry
    Data Scraping
    TensorFlow
    Python
    Computer Vision
  • $200 hourly
    Driven engineer with laid-back intensity who thrives on breaking boundaries. Expert-Vetted, top 1% on Upwork. Built a large-scale computer vision product for Apple. Expert in end-to-end models and multi-modal transformers as well as LLMs. Well-versed in prompt engineering. Deployed computer vision products used by governments and large businesses around the world. Organize a 3k+ strong community of people passionate about deep learning. Led automation of an entire financial department at Hewlett Packard Enterprise. Strong engineering skills as well as a knack for research and mathematics. I love staying on top of the latest research and pushing the boundaries of what's possible. Above all, a human who loves other humans.
    Image/Object Recognition
    Hugging Face
    Artificial Intelligence
    Google Cloud Platform
    Image Processing
    TensorFlow Stack
    Linux
    Data Science
    OpenCV
    Machine Learning
    Python
    TensorFlow
    Neural Network
    Computer Vision
    Deep Learning
    PyTorch
  • $25 hourly
    As a Computer Science graduate, I am a seasoned Computer Vision, Machine Learning, and Deep Learning expert, boasting over 3 years of experience in the field. My extensive expertise encompasses a wide range of areas including: - Chatbot Development (with experience with ChatGPT) - Object Classification, Segmentation (Semantic and Instance), and Detection - Facial Recognition - Edge Detection - Image Restoration - Feature Matching - Autoencoders and Decoders - Generative Adversarial Networks (GANs) and Conditional GANs My proficiency in Computer Vision has been applied to diverse practical domains, such as: - Medical Imaging - Video Analytics - Computer Vision solutions for Manufacturing, Construction Sites, and Retail Stores - Optical Character Recognition (OCR) In terms of technical skills, I am well-versed in Python, TensorFlow, PyTorch, and Jupyter Notebook. I have also developed two comprehensive courses utilizing the YOLO framework, which feature Flask-based applications for use cases ranging from traffic monitoring to aimbot implementation for CS: GO.
    Image/Object Recognition
    Bootstrap
    HTML
    Flask
    CSS
    TensorFlow
    PyTorch
    Machine Learning Model
    Keras
    Computer Vision
    Python
    Data Science
    Neural Network
    C++
    Machine Learning
    Deep Learning
  • $35 hourly
    I am a Top-rated-PLUS, seasoned machine learning engineer and data scientist specializing in machine learning algorithms/frameworks, computer vision, mediapipe, data modelling and production pipelines. I have expertise and experience in the following areas: computer vision, Natural Language Processing and text analysis, data science automation tasks, python programming, recommendation systems, reinforcement learning, image segmentation, advanced linear modeling, data extraction via API, random forest algorithm classification models and more. *Machine Learning in production certified specialist. *Mediapipe framework integration *Cloud architect AWS(in progress) *BSc. Electrical and electronics engineering
    Image/Object Recognition
    Image Processing
    Flask
    Artificial Intelligence
    Deep Neural Network
    Artificial Neural Network
    Machine Learning
    PyTorch
    Model Optimization
    Supervised Learning
    Python
    Keras
    OpenCV
    Computer Vision
    TensorFlow
    Deep Learning
  • $125 hourly
    Looking for a great Return On Investment? If your project requires expertise in creating efficient, profit-driven machine learning systems that succeed on delivery, performance and commercial viability metrics, let's have a chat. So, why me? ✅ Proven track record including, 👨‍💻13 years professional experience as a Senior Software Engineer 🤖 Building AI systems since 2017 👨‍🎓 Always keeping my skills up to date including LangChain, Generative AI, LLMs and ChatGPT integrations 👁 Computer Vision startup founder 🤵 Serial entrepreneur 🥇My Approach At the onset of a project, I'll guide you to the necessary ingredients for success. This isn't just ticking boxes; it involves a profound understanding of your domain. The initial focus will be on establishing a robust data collection process. From there, we'll look at fitting a ML system to your needs, this can include rapid prototyping of more than 20 algorithms at once. From this point, the best solution will be chosen and a path forward will be presented for sign off and implementation. 🎁If you're unsure if your project will benefit from AI, let's chat with a 30 minute rundown and evaluation. I provide invaluable advice at a fixed cost via my consulting hours to understand if you have everything you need to turn your idea, data and demand into AI insights. 🧙‍♂️Tech Skills I specialise across the backend tech stack needed for ML projects on the bleeding edge. 
Including: 🐍Python's ML ecosystem - GCP, Azure, AWS (specialisation in Azure) - Llama/2, OpenAI GPT3.5/4, LangChain, StableDiffusion and Midjourney - Tensorflow, TFLite, Pytorch, Keras, OpenCV and custom CV development - Relational databases and high throughput message streaming systems - Postgres, Mysql, MS SQL Server, MQTT, Kafka, RabbitMQ, OracleDBMS - FaaS / serverless + ETL development - Distributed systems design including microservice architecture - Containerization including Docker, Kubernetes - System administration including self-hosted, containerized and bare metal - RHEL / CentOS, Azure Container Instances 👮‍♂️Trust, but verify High security or vulnerable intellectual property, all of my contracts are covered with a separate NDA under New Zealand jurisdiction with strong IP protection laws. Previously I've been entrusted with US govt EAR / ITAR restricted data, developed in high security credit card PCI DSS environments and high value finance, medical and fraud domains. 🇳🇿🥝Native English / Kiwi I work European and US timezones!
    Image/Object Recognition
    Stable Diffusion
    Generative AI
    Large Language Model
    MySQL
    Microsoft Azure
    Apache Kafka
    ETL
    OpenCV
    Machine Learning
    Deep Learning
    PyTorch
    Python
    TensorFlow
    Computer Vision
    ChatGPT
  • $49 hourly
    Hello, I'm a senior C++ developer with 10+ years of professional experience. In my career I have worked in successful AAA game development, mobile multiuser software development, and embedded software for set-top boxes, and searched for a good trading strategy for a dozen markets using machine learning on a cluster of forty high-end servers. I once participated in the Google AI Challenge (Planet Wars) and wrote a bot in three days which took 41st place out of 4500+ participants from around the world. My rating on freelancer.com is 5/5. Now I'm looking for a challenging job opportunity. My key skills: - C/C++, Python - computer vision with OpenCV library - multithreading, vectorization (SSE/SSE2/AVX) - memory cache optimization, speed-up hacks - machine learning, genetic algorithms, data mining - computer graphics using SDL/OpenGL - Windows/Linux/MacOS My other passion is electronics and circuitry (see portfolio). Maksym
    Image/Object Recognition
    Artificial Intelligence
    Multithreaded Programming
    Mathematics
    Algorithms
    OpenCV
    Computer Vision
    Python
    C
    C++
  • $40 hourly
    I am a machine learning engineer with extensive experience in building end-to-end ML systems backed with a strong understanding of latest AI technologies Here are some ML systems I've built. **Computer Vision** 1. Built custom OCR systems for intelligent document processing of invoices, receipts, legal documents and forms (have also extensive experience with AWS Textract and Google's Intelligent Document Processor) 2. Automated wildlife surveys by using object detection systems such as YOLO and RCNN-based architectures and tracking algorithms including ByteTrack and DeepSort to automatically process imagery captured from drones 3. Built custom models for video summarization by extracting key frames **Generative AI** 1. Extensive experience with diffusion models including StableDiffusion, Dreambooth, ControlNet and MidJourney 2. Extensive experience with prompt engineering of generative models including ChatGPT and Stable Diffusion 3. Virtual try-on systems for fashion stores **Natural Language Processing** 1. Built GPT-based systems that can answer questions from given documents. Experience using Pinecone and vector databases for this. I have worked with ChatGPT, GPT3 and GPT4. 2. Extensive prompt engineering experience for language models including GPT, OPT, Bloom, Llama 2. Built a system to automatically extract relevant structured information ("entities") from raw emails using Spacy. 3. Built text summarization systems to summarize large documents and translation systems **Time Series Forecasting** 1. Built systems based on ARIMA, SARIMA etc to forecast freezer temperatures 2. Used LSTM networks for forecasting disease outbreak **MLOps** 1. I have built several end-to-end production level ML pipelines 2. Extensive experience with AWS including AWS EC2 instances, SageMaker --- Some of the deep learning architectures I have worked extensively with are: 1. Feedforward, convolutional neural networks 2. RNNs, LSTMs, GRUs 3. 
Transformers, attention-based networks 4. GANs, flow-based models, diffusion models 5. Graph neural networks --- I also have a graduate degree in computer science and an undergraduate degree in electrical engineering. In addition, I have a research background in reinforcement learning. My research (entitled 'Inverse Constrained Reinforcement Learning') was published at the International Conference on Machine Learning (ICML), which is one of the top conferences in ML. My work focused on inferring Markovian constraints in environments from expert trajectories. It extended prior work to high dimensions.
    Image/Object Recognition
    Statistics
    Time Series Analysis
    Artificial Intelligence
    Natural Language Processing
    Tesseract OCR
    Neural Network
    Recommendation System
    pandas
    Keras
    PyTorch
    OpenCV
    TensorFlow
    Computer Vision
    Artificial Neural Network
    Python
  • $150 hourly
    Algorithm developer, research scientist, and technical leader with experience in image and signal processing, data analysis, optimization, pattern recognition, and machine learning for startup companies in a variety of applications and industries. Most development performed in Python with Numpy, Scipy, OpenCV, and sklearn libraries. Generally work with early stage prototype algorithm development (getting it working, not optimizing for execution speed).
    Image/Object Recognition
    Pattern Recognition
    Digital Signal Processing
    Geospatial Data
    Digital Mapping
    Image Processing
    Algorithm Development
    Data Analysis
    Image Analysis
    Computer Vision
    Tesseract OCR
    NumPy
    Machine Learning
    OpenCV
    Data Science
    Python
    SciPy
  • $50 hourly
    Expert-Vetted (Top 1% of Upwork talent) 🏆🏆🏆 🎓 NLP, ML, LLM and AI expert 💬 Custom chatbots using OpenAI, LangChain and vector databases, with LLMs like ChatGPT, GPT-4, Llama and Falcon 📊 Sentiment analysis, text classification, text generation, text summarization, topic modelling, and data clustering I am a lead AI/ML engineer with more than 6 years of experience in traditional ML, deep learning, advanced NLP, and generative AI like ChatGPT, GPT-4, Llama and Falcon. I have strong experience in executing custom NLP solutions and integrating them into business workflows. If you're working with any sort of data for your project, I'm here to help! Whether you have raw and unprocessed data that needs cleaning or you need help scraping and annotating new data, I've got you covered. As an AI professional with a specialization in NLP, I've worked with various models, including GPT-3, ChatGPT/GPT-4, GPT-NeoX, and GPT-J, and have experience in applying state-of-the-art NLP techniques to projects. If you need help training a deep learning model, I can help you experiment with cutting-edge models such as T5, BERT, M2M, FLAN-T5 and RoBERTa to achieve the best possible performance. I can train and fine-tune open-source LLMs like Llama, mpt7b and Falcon using efficient techniques like QLoRA. I'm well-versed in working with transformer-based models and can help you apply fine-tuning and transfer learning to get the most out of your data. If you have text data, I can help with text classification, natural language understanding, and natural language generation. If you're looking for help with semantic search or similarity search, I can create high-quality embeddings and develop a state-of-the-art semantic search system using sentence transformers. With my expertise in natural language processing, I can help you achieve accurate and efficient search results that meet your needs. 
If you're looking for a chatbot or conversational AI solution, I can help you develop a solution using Chatgpt, langchain and vector databases like pinecone. In addition to NLP, I'm experienced in working with sequential data, time series forecasting, and PyTorch code debugging. I have already completed more than 20 jobs on Upwork, and my clients have always been satisfied with my work. So, if you're looking for an AI professional who can help with anything remotely related to GPT3, GPT-NeoX, or GPT-J or any other LLM or any other NLP/ML task don't hesitate to reach out to me. I'll be more than happy to assist you in achieving success with your project.
    Image/Object Recognition
    Llama 2
    LLM Prompt Engineering
    Large Language Model
    Flask
    SQL
    Artificial Intelligence
    BERT
    Python
    TensorFlow
    PyTorch
    GPT-4
    ChatGPT
    GPT-3
    Machine Learning
    Natural Language Processing
  • $50 hourly
    Sr. Data Scientist with extensive experience in the field of Data Science, Machine Learning and Deep Learning • 10+ years of hands-on experience with Python, R, SQL, Tableau, Big Data, ERP Systems, AWS and Data Science • Analyzed data using Python, R and SQL by querying structured and unstructured databases • Proficient in designing machine learning models with Python programming language • Extensive experience working and developing in UNIX environments such as Ubuntu • Theoretical and practical understanding of Linear Algebra, Statistics, Neural Networks and Algorithm Development • Working experience with Machine Learning, Deep Learning and Artificial Intelligence Programming Languages: › Python › R › Java › Javascript › C › C++ Frameworks: › Tensorflow › Keras, Flask › Django › PySpark › AngularJS › ReactJS › Node.JS › Jquery Technologies: › Web Development › Machine Learning › Deep Learning › Data Science Technical Skills: › Data Acquisition › Data Pre-processing › Data Analysis and Interpretation › Data Modeling Data Visualization: › Tableau › RAW › Matplotlib › Seaborn › Ggplot › Tensorflow dashboard Other Tools /Clouds: › Spark › Selenium Web driver › MongoDB, SQL › Express › AWS › Google Cloud › Azure
    Image/Object Recognition
    Data Analysis
    Data Cleaning
    Data Scraping
    Artificial Neural Network
    Convolutional Neural Network
    Machine Learning
    Data Science
    Unsupervised Learning
    Neural Network
    Data Science Consultation
    Natural Language Processing
    Deep Learning
    Computer Vision
    TensorFlow
  • $30 hourly
    I'm an Artificial Intelligence M.Sc. student at Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany. I have good hands-on experience with several deep learning and big data frameworks and have used them in the projects mentioned below. Tools Summary: -Programming Languages: Python, C, C++, MATLAB -Deep Learning: Tensorflow, Tensorflow Compression, Pytorch, Torchserve, Torchvision, SkLearn, NLTK, Huggingface -Data Science: Pandas, Hadoop, Spark, PySpark, SparkML -Parallel Programming: OpenCL, CUDA, MPI, OpenMP, pThreads -Other Libraries and Tools: Numpy, OpenCV, Docker, MMaction2, Github 1) Implementation of several CV and image processing algorithms such as Harris corner detection, SIFT and other traditional feature extraction techniques. Tools Used: Python, OpenCV 2) Abstractive text summarization using an RNN and LSTM encoder-decoder architecture, and other NLP projects using pretrained transformers. Tools Used: Python, Numpy, Tensorflow, NLTK, Pandas, Huggingface 3) Credit card fraud detection using XGBoost Tools Used: Python, Numpy, SkLearn 4) Image compression using Variational Autoencoders and GANs, which achieved results comparable to traditional SOTA techniques such as BPG Tools Used: Python, Numpy, Tensorflow, Tensorflow Compression 5) Several basic image classification projects using transfer learning and CNNs Tools Used: Python, Numpy, Tensorflow 6) My graduation project is a smart surveillance system that relies on SOTA models such as vision transformers to achieve good performance in action detection and recognition tasks. Tools Used: Pytorch, Torchserve, Torchvision, Docker, MMaction2, Github 7) Experience with popular big data frameworks such as Hadoop and Spark in cluster and distributed environments, and hands-on experience with the Map-Reduce paradigm 8) Big data analytics: Analyzed very large e-commerce datasets (approx. 50 GB) to produce useful insights and implemented a machine learning model using XGBoost to predict customer behavior. 
Tools Used: Hadoop, Spark, PySpark, SparkML 9) Implementation of multiple image processing algorithms including Sobel edge detection and 2D convolution Tools Used: C++, OpenCL, CUDA 10) Implementation of several distributed algorithms such as Map-Reduce. Tools Used: C, C++, MPI, OpenMP and pThreads 11) Implementation of three different process scheduling algorithms. Tools Used: C, Linux Scripting 12) Implementation of several Pacman game-playing agents such as MiniMax and other variations based on adversarial search techniques. 13) Implementation of several AI algorithms such as CSP, A* and other basic and advanced search algorithms Tools Used: Python, Numpy
    Image/Object Recognition
    OCR Algorithm
    Engineering Tutoring
    Artificial Intelligence
    Natural Language Processing
    Computer Vision
    Python
    TensorFlow
    Data Science
    Machine Learning
    Deep Learning
    PyTorch
  • $60 hourly
    Medical Image Processing: Registration, Segmentation and Modeling of Anatomical Structures. Machine learning and data science. I am a PhD in Biomedical Engineering, University of Calgary, Canada. I have plenty of experience in C and C++ languages. I have also used python and Unix shell scripts. I can also program for VBA and Excel.
    Image/Object Recognition
    Computer Vision
    Machine Learning
    OpenCV
    Image Processing
    Data Science
    Artificial Intelligence
    Microsoft Excel
    Medical Imaging
    C++
    Excel
    Excel VBA
    Visualization Toolkit
    Insight Toolkit
    Python
  • $35 hourly
    🏆 Google Certified TensorFlow Developer 🏆 AWS Certified Machine Learning - Specialty Engineer 🏆 AWS Certified Data Analytics - Specialty Engineer 5+ years of comprehensive industry experience in computer vision, Natural Language Processing (NLP), Predictive Modelling and forecasting. ➤ Generative AI Models 📍 OpenAI ( GPT - 3/4, ChatGPT, Embeddings ) 📍 Stable Diffusion - LoRA, DreamBooth 📍 Large Language Models (LLMs) - BLOOM, LLaMA, Llama2, Falcon ➤ Generative AI Frameworks 📍 LangChain 📍 Chainlit 📍 Pinecone - Vector database ➤ ML Frameworks 📍 TensorFlow 📍 PyTorch 📍 Huggingface 📍 Keras 📍 Scikit-learn 📍 Spark ML 📍 NVIDIA DeepStream SDK Development ➤ DevOps 📍CI/CD 📍Git, Git Action 📍AWS - CodeCommit, CodeBuild, CodeDeploy, CodePipeline, CodeStar ➤ Cloud Skills 📍 AWS - SageMaker, Comprehend, Translate, Textract, Polly, Forecast, Personalize, Rekognition, Transcribe, IoT Core, IoT Greengrass 📍 GCP - Vertex AI, Text-to-Speech 📍 Azure - Azure ML ➤ Sample work Applications include but are not limited to: 📍 Sales forecasting 📍 Recommendation engines 📍 Image classification 📍 Object segmentation 📍 Face recognition 📍 Object detection & object tracking 📍 Stable Diffusion Generative AI 📍 Augmented Reality 📍 Emotion analysis 📍 Video analytics and surveillance 📍 Text analysis and chatbot development 📍 Image caption generation 📍 Similar Image search engine 📍 Fine-tuning large language models (LLMs) 📍 ChatGPT API
    Image/Object Recognition
    Artificial Intelligence
    Amazon Redshift
    AWS Glue
    Image Processing
    Google Cloud Platform
    Amazon Web Services
    Python
    Amazon SageMaker
    Computer Vision
    TensorFlow
    Machine Learning
    Google AutoML
    PyTorch
    Natural Language Processing
    Deep Learning

How it works

 

1. Post a job (it’s free)

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.

Trusted by 5M+ businesses

How Image Recognition Works

Interpreting the visual world is one of those things that’s so easy for humans we’re hardly even conscious we’re doing it. When we see something, whether it’s a car, a tree, or our grandma, we don’t (usually) have to consciously study it before we can tell what it is. For a computer, however, identifying a human being at all (as opposed to a dog or a chair or a clock, let alone your grandmother) represents an amazingly difficult problem.

And the stakes for solving that problem are extremely high. Image recognition, and computer vision more broadly, is integral to a number of emerging technologies, from high-profile advances like driverless cars and facial recognition software to more prosaic but no less important developments, like building smart factories that can spot defects and irregularities on the assembly line, or developing software to allow insurance companies to process and categorize photographs of claims automatically.

We’re going to explore the challenge of image recognition and how data scientists are using a special type of neural network to address it.

Learning to see is hard (and expensive)

A good way to think about this problem is as one of applying metadata to unstructured data. In our article on content-based recommendations, we looked at some of the challenges of categorizing and searching content in cases where that metadata is sparse or nonexistent. Hiring human experts to manually tag libraries of movies and music may be a daunting task, but it’s an impossible one when it comes to challenges like teaching the navigation system in a driverless car to distinguish pedestrians crossing the road from other vehicles, or tagging, categorizing, and filtering the millions of user-uploaded pictures and videos that appear daily on social media.

One way to solve this would be through neural networks. While in theory we could use conventional neural networks to analyze images, in practice this turns out to be prohibitively expensive from a computational perspective. For instance, a conventional neural network attempting to process even a relatively small image (let’s say 30×30 pixels) would still require 900 inputs and more than half a million parameters. While that might be manageable for a reasonably powerful machine, once the images become larger (say 500×500 pixels), the number of inputs and parameters required increases to truly absurd levels.
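To make the arithmetic concrete, here is a rough sketch of how those numbers can arise. It assumes a grayscale image (one input per pixel) and a single fully connected hidden layer of 600 neurons feeding one output; the hidden-layer size is an illustrative assumption, not a figure from this article.

```python
def dense_parameter_count(num_inputs, hidden_units, num_outputs=1):
    """Count weights and biases in a network with one fully connected hidden layer."""
    hidden_layer = num_inputs * hidden_units + hidden_units    # weights + biases
    output_layer = hidden_units * num_outputs + num_outputs    # weights + biases
    return hidden_layer + output_layer

# Assumption for illustration only: 600 hidden neurons, grayscale pixels.
small_image_params = dense_parameter_count(30 * 30, hidden_units=600)
large_image_params = dense_parameter_count(500 * 500, hidden_units=600)

print(small_image_params)   # 541201
print(large_image_params)   # 150001201
```

Under these assumptions, going from a 30×30 to a 500×500 image takes the network from roughly half a million parameters to roughly 150 million, without adding a single extra layer.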

What’s more, applying neural networks to image recognition can lead to another problem: overfitting. Simply put, overfitting is what happens when a model tailors itself too closely to the data it’s been trained on. Not only does this generally lead to added parameters (and thus, further computational expense), it also makes the model perform worse when it’s exposed to new data.

The solution? Convolution!

Fortunately, a relatively straightforward change to the way a neural network is structured can make even large images more manageable. The result is what we call convolutional neural networks (also called CNNs or ConvNets).

One of the advantages of neural networks is their general applicability, but as we’ve seen when dealing with images, this advantage turns into a liability. CNNs make a conscious tradeoff: By designing a network specifically to handle images, we sacrifice some generalizability for a much more feasible solution.

Specifically, CNNs take advantage of the fact that, in any given image, proximity is strongly correlated with similarity. That is, two pixels that are near one another in a given image are more likely to be related than two pixels that are further apart. However, in a typical neural network, every pixel gets connected to every single neuron. In this case, the added computational load actually makes our network less rather than more accurate.

Convolution solves this by simply killing a lot of these less important connections. In more technical terms, CNNs make image processing computationally manageable by filtering connections by proximity. Rather than connecting every input to every neuron in a given layer, CNNs intentionally restrict connections so that any one neuron only accepts inputs from a small subsection of the layer before it (like, say, 3×3 or 5×5 pixels). Thus, each neuron is only responsible for processing a certain part of an image. (Incidentally, this is more or less how the individual cortical neurons in your brain work: Each neuron responds to only a small part of your overall visual field.)
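As a rough illustration of how much this restriction saves, the sketch below counts connections for a 30×30 image in two cases: a layer with one neuron per pixel where every neuron sees every pixel, and a layer where each neuron sees only a small square patch. The layer sizes and function names are assumptions chosen for the example.

```python
def fully_connected_links(height, width):
    """Every neuron (one per pixel, by assumption) connects to every pixel."""
    pixels = height * width
    return pixels * pixels

def local_links(height, width, patch):
    """Each neuron connects only to one patch x patch window that fits inside the image."""
    positions = (height - patch + 1) * (width - patch + 1)
    return positions * patch * patch

print(fully_connected_links(30, 30))   # 810000
print(local_links(30, 30, 3))          # 7056
print(local_links(30, 30, 5))          # 16900
```

With 3×3 patches, the locally connected layer needs about 1% of the connections of the fully connected one.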

Inside a convolutional neural network

But how does this filtering work? The secret is in the addition of two new types of layers: convolutional and pooling layers. We’ll break the process down below, using the example of a network designed to do just one thing: determine whether a picture contains a grandma or not.

The first step is the convolution layer, which actually consists of several steps in itself:

  1. First, we’ll break down a picture of grandma into a series of overlapping 3×3 pixel tiles.
  2. Next, we’ll run each of these tiles through a simple, single-layer neural network, using the same weights for every tile. This turns our collection of tiles into a collection of output values. Because we kept each tile small (in this case, 3×3), the neural network required to process it stays small and manageable.
  3. Then, we’ll take those output values and arrange them in an array that numerically represents the content of each area of our photograph, with the axes representing height, width, and color channels. So in our case, we’d have a 3×3×3 representation for each tile. (If we were talking about videos of grandma, we’d throw in a fourth dimension for time.)
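The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: it assumes a single-channel (grayscale) image, a stride of one pixel, no padding, and a hand-picked 3×3 weight grid that responds to vertical edges. A real network would learn its weights during training. Like most deep learning libraries, it slides the weights over the image without flipping them.

```python
def convolve2d(image, kernel):
    """Slide one shared kernel over every overlapping tile of the image."""
    k = len(kernel)
    out_height = len(image) - k + 1
    out_width = len(image[0]) - k + 1
    output = []
    for row in range(out_height):
        output_row = []
        for col in range(out_width):
            # Weighted sum of one k x k tile, using the same weights for every tile.
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[row + i][col + j] * kernel[i][j]
            output_row.append(total)
        output.append(output_row)
    return output

# Toy 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 0, 1, 1] for _ in range(5)]

# Hypothetical, hand-picked weights that respond to a dark-to-bright vertical edge.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = convolve2d(image, kernel)
print(feature_map)   # [[0, 3, 3], [0, 3, 3], [0, 3, 3]]
```

The output is zero where the tile is uniformly dark and positive where the tile straddles the edge, which is the sense in which the array "numerically represents the content of each area".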

Then comes the pooling layer, which takes these three- (or four-) dimensional arrays and applies a downsampling function along the spatial dimensions. The result is a pooled array that keeps the most important parts of the image and discards the rest, which both reduces the computation we’ll need to do and helps avoid the problem of overfitting.
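One common choice of downsampling function is max pooling, which keeps only the largest value in each small window. The sketch below assumes 2×2 windows that do not overlap and a single-channel input; the article does not specify which pooling function is used, so treat this as one possible instance.

```python
def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for row in range(0, len(feature_map) - size + 1, size):
        pooled_row = []
        for col in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[row + i][col + j]
                      for i in range(size)
                      for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled

# Made-up 4x4 feature map for illustration.
feature_map = [[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 1, 5, 6],
               [2, 2, 7, 8]]

print(max_pool(feature_map))   # [[4, 2], [2, 8]]
```

The 4×4 input shrinks to 2×2, so the next layer has a quarter as many values to process.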

Lastly, we’ll take our downsampled array and use it as the input for a regular, fully connected neural network. Since we’ve dramatically reduced the size of the input using convolution and pooling, we should now have something a normal network can handle while still preserving the most important parts of the data. The output of this final step will represent how confident the system is that we have a picture of a grandma.
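As a minimal sketch of this last step, the code below flattens a small pooled array, feeds it through a single fully connected output neuron, and squashes the result into a value between 0 and 1 with a sigmoid. The weights and bias are made-up placeholders; in a trained network they would be learned, and there would usually be one or more hidden layers in between.

```python
import math

def flatten(array_2d):
    """Turn a 2-D array into a flat list of inputs."""
    return [value for row in array_2d for value in row]

def grandma_confidence(pooled, weights, bias):
    """Weighted sum of all inputs, squashed to the range (0, 1) by a sigmoid."""
    inputs = flatten(pooled)
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values for illustration only; real weights come from training.
pooled = [[4, 2], [2, 8]]
weights = [0.1, -0.2, 0.05, 0.3]
bias = -1.0

confidence = grandma_confidence(pooled, weights, bias)
print(round(confidence, 3))   # 0.818
```

An output near 1 would mean the network is confident the picture shows a grandma, and an output near 0 would mean it is confident it does not.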

Note that this is a simplified explanation of how a convolutional neural network works. In real life, the process is (excuse the pun) more convoluted, involving multiple convolutional, pooling, and hidden layers. Additionally, real CNNs typically involve hundreds or thousands of labels, rather than just one.

Implementing convolutional neural networks

Building a convolutional neural network from scratch can be a time-consuming and expensive undertaking. That said, a number of APIs have been developed that aim to allow organizations to glean insights from images without requiring in-house computer vision or machine learning expertise.

  • Google Cloud Vision is Google’s visual recognition API, based on the open-source TensorFlow framework and using a REST API. It detects individual objects and faces and contains a pretty comprehensive set of labels. It also comes with a few bells and whistles, including OCR and integration with Google Image Search to find related entities and similar images from the web.
  • IBM Watson Visual Recognition, part of the Watson Developer Cloud, comes with a large set of built-in classes, but is really built for training custom classes based on images you supply. Like Google Cloud Vision, it also supports a number of nifty features, including OCR and NSFW detection.
  • Clarifai is an upstart image recognition service that also uses a REST API. One interesting aspect is that it comes with a number of modules that help tailor its algorithm to particular subjects, like weddings, travel, and food.

While the above APIs may be suitable for some general applications, for specific tasks you might still be better off building a custom solution. Luckily, there are a number of libraries available that make the lives of data scientists and developers a little easier by handling the computational and optimization aspects, allowing them to focus on training models. Many of these libraries, including TensorFlow, DeepLearning4J, Torch, and Theano, have been used successfully in a wide variety of applications.
