Hire the best Hadoop Developers & Programmers in India

Check out Hadoop Developers & Programmers in India with the skills you need for your next job.
Clients rate Hadoop developers & programmers 4.8 out of 5, based on 102 client reviews.
  • $35 hourly
I have 18+ years of experience in software development across the telecom, banking, and healthcare domains. My primary skill set covers Big Data ecosystems (Apache Spark, Hive, MapReduce, Cassandra), Scala, core Java, Python, and C++. I am well versed in designing and implementing Big Data solutions, ETL and data pipelines, and serverless, event-driven architectures on Google Cloud Platform (GCP) and Cloudera Hadoop 5.x. I like working with organizations to develop sustainable, scalable, modern data-oriented software systems; a minimal sketch of a GCP batch pipeline in this style follows the skill list below.
- Keen eye for the scalability and sustainability of a solution
- Can quickly produce maintainable, well-structured object-oriented designs
- Highly experienced in working seamlessly with remote teams
- Aptitude for understanding business requirements and solving the root cause of a problem
- Quick to learn new technologies
Sound experience with the following technology stacks:
Big Data: Apache Spark, Spark Streaming, HDFS, Hadoop MapReduce, Hive, Apache Kafka, Cassandra, Google Cloud Platform (Dataproc, Cloud Storage, Cloud Functions, Datastore, Pub/Sub), Cloudera Hadoop 5.x
Languages: Scala (with the Akka and Play frameworks), Python, Java, C++, C
Build tools: sbt, Maven
Databases: Postgres, Oracle, MongoDB/Cosmos DB, Cassandra, Hive
GCP services: GCS, Dataproc, Cloud Functions, Pub/Sub, Datastore, BigQuery
AWS services: S3, EC2, Auto Scaling groups, EMR, S3 Java APIs, Redshift
Azure services: Blob Storage, VMs, VM scale sets, Blob Java APIs, Synapse
Other tools/technologies: Docker, Terraform
Input and storage formats worked with: CSV, XML, JSON, MongoDB, Parquet, ORC
    Featured Skill Hadoop
    C++
    Java
    Apache Spark
    Scala
    Apache Hadoop
    Python
    Apache Cassandra
    Oracle PLSQL
    Apache Hive
    Cloudera
    Google Cloud Platform
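A minimal sketch of the kind of GCP batch pipeline described in this profile, assuming PySpark on Dataproc; the bucket paths and column names are hypothetical placeholders:

```python
# Read raw CSV from Cloud Storage, deduplicate, derive a date column, and
# write partitioned Parquet back to a curated bucket.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()

raw = (spark.read
       .option("header", "true")
       .option("inferSchema", "true")
       .csv("gs://example-raw-bucket/events/*.csv"))    # hypothetical path

cleaned = (raw
           .dropDuplicates(["event_id"])                 # assumed key column
           .withColumn("event_date", F.to_date("event_ts")))

(cleaned.write
 .mode("overwrite")
 .partitionBy("event_date")
 .parquet("gs://example-curated-bucket/events/"))        # hypothetical path

spark.stop()
```

On Dataproc, a script like this would typically be submitted with `gcloud dataproc jobs submit pyspark`.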
  • $40 hourly
A highly skilled data engineer with diverse experience in the following areas:
✅ Data analysis and ETL solution expertise
✅ Snowflake DB expertise (developer)
✅ dbt setup, administration, and development on both dbt Cloud and dbt Core
✅ Azure Data Factory
✅ SharePoint and OneDrive integration using the Microsoft Graph API
✅ Airflow workflow/DAG development (a minimal DAG sketch follows the skill list below)
✅ Matillion ETL
✅ Talend ETL expert: integration, Java routines, data quality
✅ Salesforce integration
✅ Google Cloud Platform: Cloud Functions, Cloud Run, Dataproc, Pub/Sub, BigQuery
✅ AWS: S3, Lambda, EC2, Redshift
✅ Cloud migration: working with bulk data and generic code
✅ Python automation and API integration
✅ SQL reporting
✅ Data quality analysis and data governance solution architecture design
✅ Data validation using Great Expectations (Python tool)
P.S. Available to work US EST hours on demand.
I have good exposure to data integration, migration, transformation, cleansing, warehouse design, SQL, functions, and procedures.
- Databases: Snowflake, Oracle, PostgreSQL, BigQuery
- ETL tools: Azure Data Factory, Matillion, Talend Data Fabric with Java
- DB languages and tools: SQL, SnowSQL, dbt (Data Build Tool)
- Workflow management tool: Airflow
- Scripting language: Python
- Python frameworks: pandas, Spark, Great Expectations
- Cloud ecosystems: AWS, GCP
    Featured Skill Hadoop
    PySpark
    Microsoft Azure
    dbt
    Apache Hadoop
    Google Cloud Platform
    ETL
    Talend Data Integration
    Snowflake
    AWS Lambda
    API Integration
    JavaScript
    Apache Spark
    Amazon Web Services
    Python
    Apache Airflow
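A minimal sketch of the Airflow DAG development mentioned above, assuming Airflow 2.4+ (for the `schedule` argument); the task logic and names are placeholders:

```python
# A daily extract -> load chain built from two PythonOperators.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pulling rows from the source system")   # placeholder logic

def load():
    print("loading rows into the warehouse")       # placeholder logic

with DAG(
    dag_id="daily_etl_example",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task
```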
  • $60 hourly
An experienced professional with more than 10 years of work experience in data-focused cloud architecture on platforms such as AWS, Azure, and GCP.
- Architecting distributed database clusters and data pipelines for Big Data analytics and data warehousing, using tech stacks that include but are not limited to Redshift, Spark, Kinesis, Trino/PrestoDB, Athena, Glue, Hadoop, Hive, and S3 data lakes (a minimal Athena query sketch follows the skill list below)
- Python, Bash, and SQL scripting for database management and automation
- Architecting your next enterprise-level software solution
- Linux server administration: setup and maintenance of services on cloud and on-premise servers
- Creating scripts to automate tasks, web scraping, and so on; proficient in scripting with Python, Bash, and PowerShell
- Expert in deploying Presto/Trino via Docker/Kubernetes and on the cloud
Professional certifications:
AWS Certified Data Analytics - Specialty
AWS Certified Solutions Architect - Associate
Google Associate Cloud Engineer
Microsoft Azure Fundamentals
Microsoft Azure Data Fundamentals
Starburst Certified Practitioner
    Featured Skill Hadoop
    Amazon Web Services
    Apache Hadoop
    Big Data
    AWS Glue
    Amazon Athena
    Database Design
    Amazon Redshift
    PySpark
    AWS CloudFormation
    Amazon RDS
    AWS Lambda
    Data Migration
    ETL
    SQL
    ETL Pipeline
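A minimal boto3 sketch of querying an S3 data lake through Athena, one of the services listed above; the database, table, and bucket names are hypothetical:

```python
# Start an Athena query, poll until it reaches a terminal state, then print rows.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

run = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) FROM events GROUP BY status",
    QueryExecutionContext={"Database": "analytics"},              # assumed database
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = run["QueryExecutionId"]

while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)
    status = state["QueryExecution"]["Status"]["State"]
    if status in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if status == "SUCCEEDED":
    for row in athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])
```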
  • $50 hourly
    Hi, I'm Rajesh, a Senior SaaS Developer & Data Engineer with expertise in Python, Java, Scala, and cloud technologies (GCP, AWS, Azure AI). I’ve built and scaled AI-powered applications, developed RAG-based chatbots, and designed large-scale data pipelines. As a Founding Engineer at Labrador AI, I led backend architecture, payment integrations, and DevOps. I’m passionate about solving complex problems, mentoring, and scaling businesses with AI-driven solutions.
    Featured Skill Hadoop
    ETL Pipeline
    Data Science
    Database Architecture
    Kubernetes
    MySQL
    Apache Kafka
    Django
    Akka-HTTP
    Angular
    Scala
    Apache Hadoop
    Python
    MapReduce
    Java
  • $25 hourly
Hello, I'm Aditya Johar, a data scientist and full-stack developer with 9+ years of experience delivering innovative, tech-driven solutions. I focus on identifying areas where technology can reduce manual tasks, streamline workflows, and optimize resources. By implementing smart automation solutions tailored to your specific needs, I can help your business cut costs, improve efficiency, and free up valuable time for more strategic, growth-focused initiatives.
TOP SOLUTIONS DEVELOPED
✅ Custom software using Python (Django, Flask, FastAPI), MERN/MEAN/MEVN stacks
✅ Interactive data visualization dashboards: Power BI, Tableau, ETL, etc.
✅ Intelligent Document Processing (IDP), RAG, LLMs, ChatGPT APIs
✅ NLP: Sentiment Analysis, Text Summarization, Chatbots, and Language Translation
✅ COMPUTER VISION: Image and Video Classification, Object Detection, Face Recognition, Medical Image Analysis
✅ RECOMMENDATION SYSTEMS: Product Recommendations (e.g., e-commerce), Content Recommendations (e.g., streaming services), Personalized Marketing
✅ PREDICTIVE ANALYTICS: Sales and Demand Forecasting, Customer Churn Prediction, Stock Price Prediction, Equipment Maintenance Prediction
✅ E-COMMERCE OPTIMIZATION: Dynamic Pricing, Inventory Management, Customer Lifetime Value Prediction
✅ TIME SERIES ANALYSIS: Financial Market Analysis, Energy Consumption Forecasting, Weather Forecasting
✅ SPEECH RECOGNITION: Virtual Call Center Agents, Voice Assistants (e.g., Siri, Alexa)
✅ AI IN FINANCE: Credit Scoring, Algorithmic Trading, Fraud Prevention
✅ AI IN HR: Candidate Screening, Employee Performance Analysis, Workforce Planning
✅ CONVERSATIONAL AI: Customer Support Chatbots, Virtual Shopping Assistants, Voice Interfaces
✅ AI IN EDUCATION: Personalized Learning Paths, Educational Chatbots, Plagiarism Detection
✅ AI IN MARKETING: Customer Segmentation, Content Personalization, A/B Testing
✅ SUPPLY CHAIN OPTIMIZATION: Demand Forecasting, Inventory Optimization, Route Planning
...and many more use cases that we can discuss when we connect.
Ready to turn these possibilities into realities? I'm just a click away! Simply click the "Invite to Job" or "Hire Now" button in the top right corner of your screen.
    Featured Skill Hadoop
    Django
    Apache Airflow
    Apache Hadoop
    Terraform
    PySpark
    Apache Kafka
    Flask
    BigQuery
    BERT
    Apache Spark
    Python Scikit-Learn
    pandas
    Python
    TensorFlow
    Data Science
  • $33 hourly
👋 Hi, I'm Debjyoti, and together, we're your dedicated AI & Data Science team. We're a specialized, collaborative group of senior data scientists and ML engineers with 12+ years of industry experience. Our team has designed, built, and deployed AI-driven solutions across sectors including finance, telecom, SaaS, and major tech companies. Think of us as your extended AI department: we integrate seamlessly, deliver impactful results, and ensure your AI solutions run effectively in production.
🚀 What We Offer
- Generative AI & LLMs: fine-tuned GPT and LLaMA models, retrieval-augmented chatbots, and advanced prompt engineering.
- Time series & forecasting: real-time forecasting pipelines (TinyTimeMixer, Prophet, ESRNN), anomaly detection, and actionable analytics dashboards.
- Computer vision: real-time object detection, OCR, automated labeling, and robust vision models optimized with PyTorch and TensorRT.
- Applied NLP: customized text classification, multilingual embeddings, content summarization, and optimized search and ranking systems.
- MLOps & deployment: end-to-end CI/CD for machine learning, robust APIs using FastAPI or Flask, containerization (Docker/Kubernetes), and deployment on AWS, GCP, or Azure.
🛠 Our Expertise & Certifications
- Programming: Python, Scala, SQL, Bash
- ML frameworks: PyTorch, TensorFlow/Keras, Hugging Face, Apache Spark
- Infrastructure: Docker, Kubernetes, Ray, Airflow, LangChain/LangGraph
- Data management: PostgreSQL, BigQuery, Snowflake, Delta Lake, Elasticsearch, Neo4j
- Cloud platforms: AWS (SageMaker, EKS), GCP (Vertex AI), Azure ML
- Certifications: IBM Machine Learning Specialist (Professional & Advanced), Deep Learning with TensorFlow
🏆 Notable Achievements
- Achieved an 83% reduction in inference latency by optimizing a multilingual retrieval-augmented (RAG) chatbot using quantized LLaMA-2 models.
- Developed and maintained a mission-critical, fault-tolerant forecasting service processing 20+ TB/day, ensuring continuous uptime for a Fortune 50 bank.
- Designed an AI-powered diagnostic tool for mainframe systems (z/OS) that reduced incident resolution time from 3 hours to under 10 minutes.
- Presented at international conferences (PyData, IEEE Big Data) and filed multiple patent disclosures in generative AI safety.
🤝 Why Partner With Us?
- Flexible scalability: a team that can scale effort quickly without the overhead of a traditional agency.
- Enterprise experience: a proven track record with sensitive data, complex regulatory environments, and rigorous security protocols.
- Clear communication: regular updates, detailed progress reports, concise documentation, and engaging demonstrations.
- Reliability & quality: rigorous coding standards, thorough documentation, comprehensive testing, and seamless CI/CD pipelines.
- Commitment: focused on building lasting relationships through trust, transparency, and exceptional results.
📅 Our Availability
Team availability: 30-40 hours/week, with extensive timezone coverage for seamless collaboration (US/EU/APAC-friendly hours). Open to exploratory discussions and tailored solution-planning sessions; let's schedule a call!
Ready to transform your data into actionable insights? Reach out today, and let's map out a tailored strategy to achieve your AI goals.
    Featured Skill Hadoop
    CUDA
    SQL
    Apache Hadoop
    Apache Spark MLlib
    Apache Spark
    Keras
    Time Series Forecasting
    Time Series Analysis
    LLM Prompt Engineering
    Natural Language Processing
    Computer Vision
    OpenCV
    TensorFlow
    PyTorch
    Python
  • $40 hourly
I am a senior data engineer with 9 years of experience in data engineering with Python, Spark, Databricks, ETL pipelines, and Azure and AWS services. I develop PySpark scripts on Azure Databricks and store the data in ADLS, and I have created data pipelines that read streaming data from MongoDB and build Neo4j graphs from that stream. I am well versed in designing and modeling databases using Neo4j and MongoDB, and I am seeking a challenging opportunity in a dynamic organization that can enhance my personal and professional growth while enabling me to make valuable contributions toward the company's objectives. A minimal structured-streaming sketch follows the skill list below.
• Utilizing Azure Databricks to develop PySpark scripts and store data in ADLS.
• Developing producers and consumers for stream-based data using Azure Event Hubs.
• Designing and modeling databases using Neo4j and MongoDB.
• Creating data pipelines for reading streaming data from MongoDB.
• Creating Neo4j graphs based on stream-based data.
• Visualizing data for supply-demand analysis using Power BI.
• Developing data pipelines on Azure that integrate Spark notebooks.
• Developing ADF pipelines for a multi-environment, multi-tenant application.
• Utilizing ADLS and Blob Storage to store and retrieve data.
• Proficient in Spark, HDFS, Hive, Python, PySpark, Kafka, SQL, Databricks, and Azure and AWS technologies.
• Utilizing AWS EMR clusters to run Hadoop ecosystem components such as HDFS, Spark, and Hive.
• Experienced in using AWS DynamoDB for data storage and ElastiCache for caching.
• Involved in data migration projects moving data from SQL and Oracle databases to AWS S3 or Azure storage.
• Skilled in designing and deploying dynamically scalable, fault-tolerant, highly available applications on the AWS cloud.
• Executed transformations using Spark and MapReduce, loaded data into HDFS, and used Sqoop to extract data from SQL databases into HDFS.
• Proficient with Azure Data Factory, Azure Data Lake, Azure Databricks, Python, Spark, and PySpark.
• Implemented a cognitive model for telecom data using NLP and a Kafka cluster.
• Competent in big data processing using Hadoop, MapReduce, and HDFS.
    Featured Skill Hadoop
    Microsoft Azure SQL Database
    SQL
    MongoDB
    Data Engineering
    Microsoft Azure
    Apache Kafka
    Apache Hadoop
    AWS Glue
    PySpark
    Databricks Platform
    Hive Technology
    Apache Spark
    Azure Cosmos DB
    Apache Hive
    Python
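A minimal Spark Structured Streaming sketch of the streaming-to-ADLS pattern this profile describes, reading from a Kafka-compatible endpoint (Azure Event Hubs also exposes one); the broker, topic, and storage URIs are placeholders:

```python
# Read a message stream and append it to ADLS Gen2 as Parquet, with a
# checkpoint directory so the query can recover after restarts.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("stream-to-adls").getOrCreate()

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
          .option("subscribe", "telemetry")                   # hypothetical topic
          .load()
          .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)"))

query = (events.writeStream
         .format("parquet")
         .option("path", "abfss://data@examplelake.dfs.core.windows.net/telemetry/")
         .option("checkpointLocation",
                 "abfss://data@examplelake.dfs.core.windows.net/_checkpoints/telemetry/")
         .start())

query.awaitTermination()
```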
  • $30 hourly
Highly experienced in the IT sector, with lead roles (10+ years in IT after a Master of Computer Applications (MCA)).
Strengths:
* Excellent code developer
* Solution leader
* Enthusiastic about detecting and solving problems
* Proactive in advising the client and committed to my word
* Driven to the core for speed, optimization, bug cleaning, and project scalability
* With me, your company gains creativity and an extra edge over the competition
General experience:
* 7 years of experience as a software developer
* 3 years of experience as a senior developer
* 2 years of experience as a team leader
Skills: Java, PHP, Angular, Vue, React, WordPress, Laravel, Hadoop
* Master of Computer Applications
* Full-stack knowledge of the industry
Companies and projects:
* Samsung - Team Lead
* RBS (Royal Bank of Scotland) - Team Lead
* NCR Corporation - Team Lead
* Accenture - Developer
* Honda Insurance - Senior Developer
    Featured Skill Hadoop
    Oracle Database
    Agile Software Development
    Apache Spark
    Apache Hive
    Hibernate
    MongoDB
    Scrum
    Apache Hadoop
    J2EE
    Machine Learning
    Git
    Apache Struts 2
    Web Service
    Apache Kafka
    Spring Framework
  • $15 hourly
With more than 8 years of experience, I have expertise in building highly scalable applications; my areas of expertise are Java/J2EE, artificial intelligence, Big Data, and IoT. I have experience using AWS Lambda functions to process records from Amazon DynamoDB Streams (see the minimal handler sketch after the skill list below), and good experience with multi-tenant SaaS applications handling high-volume traffic. On the front end, I have proven hands-on experience with JavaScript, AngularJS, ReactJS, CSS, HTML, and Bootstrap for responsive design. My competency with microservices and the Lagom framework, along with Kafka, Scala, and NoSQL databases, has helped me win high-scale enterprise applications. I have served many domains, including medical, telecom, multimedia, health and fitness, logistics, and e-commerce. My client base ranges from start-ups and emerging companies to established, mature organizations with a wide range of technological needs, and I have been serving clients across the globe. Quality is the key to everything I build, and my expertise in building scalable systems helps clients leverage future opportunities without worry.
What you get:
• 7+ years of expertise in delivering high-scale applications
• Great quality
• High availability and trust
• Ease and comfort of communication
• Smart solutions with plug-and-play features for future integrations
• Scalable solutions, so you need not worry when expanding or scaling
• Well-defined processes and documentation
• Full safety of the source code and its ownership
• A reliable partner
    Featured Skill Hadoop
    Big Data
    Apache Hadoop
    MERN Stack
    BigQuery
    Data Engineering
    Data Warehousing & ETL Software
    J2EE
    Apache Kafka
    Node.js
    MongoDB
    Redis
    React
    Spring Boot
    Microservice
    Spring Framework
    Java
    HTML5
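A minimal sketch of the AWS Lambda plus DynamoDB Streams processing mentioned above; the handler just inspects each change record, following the documented stream event shape, and the attribute names are hypothetical:

```python
# Lambda handler invoked with a batch of DynamoDB Stream records.
def lambda_handler(event, context):
    records = event.get("Records", [])
    for record in records:
        action = record["eventName"]           # INSERT, MODIFY, or REMOVE
        keys = record["dynamodb"]["Keys"]
        print(f"{action} on item with keys {keys}")

        # NewImage is present for INSERT/MODIFY when the stream view includes it.
        new_image = record["dynamodb"].get("NewImage")
        if new_image:
            print("new item state:", new_image)

    return {"processed": len(records)}
```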
  • $50 hourly
8+ years of experience in architecting, designing, and developing software for large, scalable distributed systems and web applications. In past roles, I was responsible for end-to-end feature development for Paytm Mall (e-commerce), Paytm Smart Retail (B2B), and Paytm for Business (merchant platform). I am currently building an in-house analytics platform for Flipkart, as Adobe Analytics no longer scales at Flipkart's volume.
Languages: Java, Scala, Python, JS
Technologies: Spring, Spring Boot, Apache Flink, Spark, Django, Node.js, Express, Flask
Data: Hibernate, Hadoop, Hive, HBase, Druid, MySQL, SQLite, PostgreSQL, Elasticsearch, Redis, SQLAlchemy
Others: Kafka, RabbitMQ, Jenkins, Kibana, Nginx, Gunicorn, Celery, Supervisor, Datadog, JIRA, Git, CI/CD, TDD
    Featured Skill Hadoop
    Amazon Web Services
    Google Cloud Platform
    Java
    Big Data
    Apache Hive
    Apache Hadoop
    Apache Spark
    Apache HBase
    Apache Flink
    Apache Kafka
    Django
    Elasticsearch
    JavaScript
    Python
    SQL
  • $40 hourly
I design AI systems that read your data, reason over it, and act—from retrieval-augmented chatbots to multi-agent automations and generative media pipelines. GPT-4o • Llama-3 • LangChain • CrewAI • FastAPI. Show me your use case today and I'll ship a working proof of concept in 72 hours. Ex-startup founder with 15+ years in software, the last four laser-focused on large-language-model and computer-vision stacks.
PROJECT HIGHLIGHTS
• AI SQL Tutor (live SaaS) – natural language ➜ optimized SQL ➜ chart + explanation (LangChain agents, vector RAG, multi-tenant)
• Meeting Summarizer – Zoom/Meet recording ➜ Whisper transcription ➜ speaker diarization ➜ GPT JSON of tasks & decisions
• RAG-in-a-Box – local Llama-3-8B, ChromaDB, FastAPI chat over PDFs (runs on a MacBook); a minimal retrieval sketch follows the skill list below
• CrewAI + n8n Blog Factory – research and writing agents that auto-publish SEO posts to Notion
• AI Product Explainer Video – 3-D animation with Pika Labs and ElevenLabs voice-over for e-commerce demos
WHAT I DELIVER
– Retrieval-augmented chatbots and search assistants grounded in your private docs
– Multi-agent workflows (CrewAI/LangGraph) that research, decide, and trigger actions
– Generative image and video pipelines for marketing and personalization
– Voice and meeting intelligence: transcription, diarization, summaries, action items
– Robust back ends: FastAPI/Flask services, SQL analytics, schedulers, CI/CD
STACK AT A GLANCE
LLMs: GPT-4o, Claude-3, Llama-3, custom LoRA • LangChain • LangGraph • CrewAI • OpenAI Functions
RAG: ChromaDB, Weaviate, Faiss, Elasticsearch • Gen media: Stable Diffusion XL, ControlNet, Runway Gen-3 • Speech: Whisper, pyannote-audio • Automation: n8n, Make.com, Celery, APScheduler • Infra: Docker, AWS (Lambda, S3, Transcribe), Postgres, DuckDB
ENGAGEMENT STYLE
• Rapid prototypes—most clients see a running demo in one week
• Production rigor—tests, logging, IaC, hand-off docs
• Clear communication—Loom walkthroughs, milestone billing, responsive Slack/Teams
Need a quick POC or a full production rollout? Click "Invite" and let's talk.
    Featured Skill Hadoop
    Apache Hadoop
    Django
    Sqoop
    Oracle Performance Tuning
    Database Modeling
    PySpark
    Flask
    Machine Learning
    Python
    NLTK
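A minimal sketch of the retrieval half of the RAG-in-a-Box project above, using ChromaDB; the collection name and documents are illustrative, and a real pipeline would chunk PDFs and pass the retrieved passages to an LLM such as Llama 3:

```python
# Index a few text chunks, then pull back the most relevant one for a question.
import chromadb

client = chromadb.Client()   # in-memory; PersistentClient(path=...) keeps data on disk
collection = client.create_collection(name="docs")

collection.add(
    ids=["chunk-1", "chunk-2"],
    documents=[
        "Invoices must be submitted within 30 days of delivery.",
        "Refund requests are handled by the billing team.",
    ],
)

results = collection.query(
    query_texts=["How long do I have to submit an invoice?"],
    n_results=1,
)
print(results["documents"][0][0])   # best-matching chunk to ground the LLM's answer
```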
  • $35 hourly
A seasoned data engineer with over 11 years of experience building sophisticated, reliable ETL applications using Big Data and cloud stacks (Azure and AWS). TOP RATED PLUS on Upwork, having collaborated with over 20 clients across more than 2,000 hours.
🏆 Expert in creating robust, scalable, and cost-effective solutions using Big Data technologies for the past 9 years.
🏆 The main areas of expertise are:
📍 Big Data - Apache Spark, Spark Streaming, Hadoop, Kafka, Kafka Streams, Trino, HDFS, Hive, Solr, Airflow, Sqoop, NiFi, Flink
📍 AWS cloud services - AWS S3, AWS EC2, AWS Glue, AWS Redshift, AWS SQS, AWS RDS, AWS EMR
📍 Azure cloud services - Azure Data Factory, Azure Databricks, Azure HDInsight, Azure SQL
📍 Google cloud services - GCP Dataproc
📍 Search engine - Apache Solr
📍 NoSQL - HBase, Cassandra, MongoDB
📍 Platform - data warehousing, data lakes
📍 Visualization - Power BI
📍 Distributions - Cloudera
📍 DevOps - Jenkins
📍 Accelerators - data quality, data curation, data catalog
    Featured Skill Hadoop
    SQL
    AWS Glue
    PySpark
    Apache Cassandra
    ETL Pipeline
    Apache Hive
    Apache NiFi
    Apache Kafka
    Big Data
    Apache Hadoop
    Scala
    Apache Spark
  • $100 hourly
I'm an experienced DevOps engineer with many years of experience implementing automation, CI/CD pipelines, and cloud infrastructure for various organizations. I have a strong background in leveraging tools like Jenkins, Git, Docker, Kubernetes, and AWS, and I am also proficient in architecting AI/ML and data pipelines that process data at any scale without an explosion in infrastructure costs. I am always eager to learn new technologies and stay up to date with industry best practices to provide the most innovative solutions for my clients. I do my best to meet deadlines and expectations, and I aim to deliver the best solution rather than just a working one.
My skills:
- Kubernetes/OpenShift
- Docker
- AWS
- Terraform
- Ansible
- Linux
- Kafka
- Python
- Kubeflow
- Databases (MongoDB, Postgres, MySQL, Redshift)
- Lambda
I look forward to working with you.
    Featured Skill Hadoop
    Bash Programming
    Linux System Administration
    Apache Hadoop
    Network File System Implementation
    Docker Compose
    Amazon Web Services
    Kubernetes
    VPN
    Docker
    Google Cloud Platform
  • $100 hourly
An AI and cloud data engineer with over 15 years of practical experience in the banking and networking domains. Well versed in defining requirements, designing solutions, and building them at enterprise grade. A passionate programmer and quick troubleshooter with a strong grasp of Java, Python, Big Data technologies, data engineering and analysis, and cloud computing.
    Featured Skill Hadoop
    Apache Beam
    Apache Flink
    Apache Spark
    Data Science
    Microsoft Power BI
    Data Mining
    Apache Hadoop
    ETL
    Python
    Data Extraction
  • $40 hourly
✮✮✮✮✮ 5-Star Reviews ✮✮✮✮✮
✅ Upwork's Top Rated Plus expert
✅ 5+ years of research experience
✅ 3+ years of industry experience
✅ 2+ years of teaching experience
Hi folks, I am Dr. Jenish Dhanani, Ph.D. in Computer Science and Engineering. I am an expert in generative AI, ML, data mining, GPT-4, and natural language processing (NLP) for solving real-life problems. I also hold expertise in AI agent and agentic workflow development, enabling the creation of advanced, autonomous systems for a wide range of applications. I have good experience in prompt engineering for GPT-4, GPT-J (and other GPTs), and the Jurassic-1 Jumbo model, which I believe is crucial in domain-specific or special applications. I have previously developed GPT-3-based textual entity extraction, article writing, paraphrasing, essay writing, summarization, and more through prompt engineering. I also have great experience with web application development using Python, Django, Flask, and many other web development technologies.
➤ Key skills:
✔ Python, Django, Flask, DRF, Selenium, MongoDB, MySQL, Postgres, etc.
✔ PyTorch, Keras, TensorFlow, TensorBoard, Jupyter, R
✔ NLP, text mining and analytics, text embedding
✔ TF-IDF, LSA, LDA, Word2Vec, Doc2Vec, BERT, FastText, GloVe, etc.
✔ Machine learning and deep learning
✔ Neural networks, deep neural networks, support vector machines, random forests, decision trees, etc.
✔ Computer vision and image processing
✔ Hadoop, Spark, MapReduce, incremental MapReduce
✔ Community detection: CDlib (Louvain, SLPA)
✔ Scikit-learn and Spark MLlib
✔ Paperspace
✨ I have extensive experience developing projects for classification, clustering, NLP techniques, text summarization, topic modeling, sentiment analysis, recommendation systems, and more, using both traditional and advanced machine learning and deep learning techniques (a minimal sentiment-classifier sketch follows the skill list below).
✨ I hold good expertise in scaling and designing distributed solutions on platforms like MapReduce and Spark.
✨ I have published 17+ research articles on AI, ML, and text analytics in reputed international conferences and journals.
✨ I have proposed, developed, and implemented automatic sentiment analysis of Amazon product reviews and a legal document recommendation system using distributed frameworks and text-embedding approaches.
✨ I also specialize in AI automation and agent workflow design, building intelligent systems that automate tasks like customer support, lead handling, and back-end processes. Using GPT-4 and other LLMs, I develop prompt-engineered agents integrated with APIs and tools to deliver accurate, autonomous solutions that boost efficiency and business outcomes.
✨ I have delivered expert lectures and hands-on sessions on topics such as big stream data mining, Hadoop, Pig, Hive, Flume, Hadoop and MapReduce programming, and sentiment analysis.
✨ As a scientific research professional, I love building long-term relationships with clients. I am punctual, keep deadlines, and deliver good results.
I look forward to hearing from you.
Best regards,
Dr. Jenish Dhanani
    Featured Skill Hadoop
    AI Agent Development
    Data Mining
    Django
    Document Analysis
    Apache Hadoop
    Artificial Intelligence
    Sentiment Analysis
    Flask
    Machine Learning
    Data Science
    Word Embedding
    Recommendation System
    Apache Spark MLlib
    Apache Spark
    Natural Language Processing
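A minimal scikit-learn sketch of the classic TF-IDF-plus-classifier approach to sentiment analysis listed above; the tiny inline dataset is illustrative only:

```python
# Train a TF-IDF + logistic regression pipeline on toy review sentiment data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great product, works perfectly",
         "terrible quality, waste of money",
         "absolutely love it",
         "broke after one day, very disappointed"]
labels = [1, 0, 1, 0]   # 1 = positive, 0 = negative

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["love the build quality"]))   # expected: [1]
```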
  • $35 hourly
5+ years of experience in Big Data technologies such as Spark, Hadoop, Hive, Sqoop, ADF, and Databricks. 5+ years of experience with the ELK stack (Elasticsearch, Logstash, and Kibana); a minimal Elasticsearch query sketch follows the skill list below. Microsoft Azure Certified Data Engineer. Elasticsearch and Kibana certified. MongoDB Certified Developer.
    Featured Skill Hadoop
    Microsoft Azure
    Databricks Platform
    Apache Spark
    PySpark
    MongoDB
    Logstash
    Elasticsearch
    Grok Framework
    ELK Stack
    Apache Hadoop
    Hive
    Bash
    SQL
    Kibana
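A minimal Elasticsearch sketch (Python client 8.x) of the index-and-search work this profile describes; the host, index, and fields are placeholders:

```python
# Index one log document, refresh the index, then run a full-text match query.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")   # assumed local cluster

es.index(index="app-logs", document={
    "level": "ERROR",
    "message": "payment service timeout",
})
es.indices.refresh(index="app-logs")   # make the document searchable immediately

hits = es.search(index="app-logs", query={"match": {"message": "timeout"}})
for hit in hits["hits"]["hits"]:
    print(hit["_source"])
```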
  • $39 hourly
Google Cloud-certified Professional Data Engineer.
Experience: 3 years
Education: IIT Kharagpur, BTech + MTech dual degree (2016-2021)
Strengths: data engineering, data migration, data governance, data pipelines, databases, GCP, Hadoop, Python
    Featured Skill Hadoop
    Google Dataflow
    BigQuery
    Apache Airflow
    Apache Hadoop
    PySpark
    Google Cloud Platform
    Solution Architecture
    Data Engineering
  • $34 hourly
A passionate Big Data cloud administrator with vast experience in Big Data and cloud technologies. Flexible about learning, exploring, and implementing new technologies. Self-motivated, and a believer in sharing knowledge and experience.
PROFESSIONAL FORTE
Overall 6.5 years of experience as a Big Data and cloud administrator in production environments.
* Working experience installing, configuring, managing, and upgrading RStudio (Posit) products.
* Working experience architecting Posit infrastructure on cloud and on-prem servers.
* Working experience administering and maintaining Linux and Windows operating systems.
* Working experience in Databricks administration.
* Working experience deploying infrastructure using Terraform.
* Excellent understanding of, and working experience with, Azure and AWS cloud services.
* Working experience managing and deploying Databricks lakehouse environments.
    Featured Skill Hadoop
    R Shiny
    R Hadoop
    Apache Hadoop
    Terraform
    Azure DevOps Server
    RStudio
    Databricks Platform
    Google Cloud Platform
    Amazon Web Services
    Microsoft Azure
    System Administration
  • $40 hourly
Throughout my dynamic 8-year tenure in the technology sector, I have carefully developed a versatile skill set with expertise in AWS, Kubernetes, Jenkins, the ELK stack, Docker, CI/CD, version control, Linux servers, Windows servers, Azure, Helm charts, and Terraform. My proficiency in containerization and orchestration is evident in the successful deployment of diverse services and Big Data tools such as HBase, Spark, Hadoop, and Hive using microservices and Kubernetes frameworks. I excel at developing automated CI/CD pipelines with tools like AWS CodeDeploy, GitLab, and Jenkins. My capabilities extend to cloud migration and management, executing seamless transitions for both Linux and Windows environments. Proficient in infrastructure automation, I use Ansible for streamlined provisioning and management, and my proven track record includes infrastructure upgrades, ensuring smooth OS version transitions on production servers. In project deployments, I specialize in the web hosting domain, deploying applications such as WordPress, Magento, Node.js, React, and Python-based applications.
    Featured Skill Hadoop
    Scripting
    Ansible
    Amazon ECS
    AWS CodePipeline
    Apache Hadoop
    Microservice
    ELK Stack
    WordPress
    Windows Server
    Linux
    GitLab
    Jenkins
    Docker
    Kubernetes
    DevOps
  • $35 hourly
I am a certified AWS data enthusiast with around 7 years of experience in data engineering, currently looking for freelance opportunities. I possess a strong skill set in Big Data processing, PySpark, Hive, Fivetran integrations, OLAP, Snowflake, SQL, MongoDB, Python, data modeling, data integration, data warehousing, ETL, and data visualization. My role as an AWS data engineer has honed my skills in using AWS services like Glue, Step Functions, and related technologies for scalable data pipelines (a minimal Glue job-trigger sketch follows the skill list below). Recognized for delivering outstanding performance and contributing to project success, I have consistently met deadlines and achieved significant milestones, helping clients make data-driven decisions. My problem-solving abilities and sense of responsibility allow me to work efficiently and effectively to meet project requirements. I am committed to continuous learning and always eager to adopt new technologies, enabling me to provide advanced analytics and data insights. As a freelancer, I am adaptable in managing clients and directing programs to ensure successful outcomes.
    Featured Skill Hadoop
    Amazon S3
    Amazon EC2
    Microsoft Power BI
    Tableau
    Management Skills
    Amazon
    Data Mining
    Unix Shell
    Git
    Apache Hadoop
    AWS Glue
    Amazon Web Services
    PySpark
    Python
    SQL
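A minimal boto3 sketch of starting and monitoring an AWS Glue job, one of the services named above; the job name and argument are hypothetical:

```python
# Kick off a Glue job run, then poll until Glue reports a terminal state.
import time
import boto3

glue = boto3.client("glue", region_name="us-east-1")

run = glue.start_job_run(
    JobName="nightly-etl",                                   # hypothetical job
    Arguments={"--input_path": "s3://example-bucket/raw/"},
)
run_id = run["JobRunId"]

while True:
    status = glue.get_job_run(JobName="nightly-etl", RunId=run_id)
    state = status["JobRun"]["JobRunState"]
    if state in ("SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"):
        break
    time.sleep(30)

print("Glue run finished with state:", state)
```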
  • $33 hourly
    With over 5 years of professional experience in software development and quantitative analysis, I bring a robust background in data structures, algorithms, and statistics. My work has spanned high-impact roles at leading organizations like Bank of America and Squarepoint Capital, where I developed scalable software solutions and gained deep insights into financial engineering. Armed with a Master's in Financial Engineering from the National University of Singapore and a Bachelor's in Chemical Engineering from IIT Roorkee, I am skilled in creating innovative algorithms for financial analytics, statistical modeling, and developing software systems that drive business outcomes. I am proficient in multiple programming languages and tools, with a focus on Python, C++, and SQL. My technical expertise, combined with a track record of solving complex problems in both finance and engineering, makes me an ideal collaborator for projects in software engineering, data analytics, and financial technology.
    Featured Skill Hadoop
    pandas
    PostgreSQL
    Apache Impala
    Apache Hadoop
    Parquet
    Asynchronous I/O
    Multithreaded Programming
    Python
    Data Mining
    Data Analysis
    Beta Testing
    Alpha Testing
    ETL Pipeline
    ETL
    Data Extraction
  • $100 hourly
Hi folks, I am a technically oriented guy with ~10 years of industry experience whose passion is the intersection of search, big data, and machine learning, which extends my capability as an Elasticsearch, Logstash, and Kibana consultant: end-to-end ELK administration and architecture, plus text analytics. My core competency lies in end-to-end development and management of Big Data projects. I am also associated with an elite organization doing extensive R&D in data science and data analytics, which keeps me updated with the latest technology and professional at work. Like most fellows, I have worked on and managed projects involving ELK, RESTful services, NoSQL databases, Hadoop, Hortonworks, Spark, Flume, Pig, Hive, R programming, AWS, DigitalOcean, Azure, PHP, Java, MySQL, and data visualization. I can grasp things quickly and implement new technology as and when required. It's my pleasure to work with you and give my best so that we end up in a win-win situation. I also have teams with expertise in web design and development, Linux system administration, Ruby on Rails, SharePoint development, etc. In short, depending on what you need to launch your product successfully, I can:
* Architect and design
* Code
* Mentor
* Train
~Prashant
    Featured Skill Hadoop
    AWS CodePipeline
    NoSQL Database
    Logstash
    ELK Stack
    PHP
    Kibana
    Spring Boot
    AWS Lambda
    AWS CodeDeploy
    Elasticsearch
    Microservice
    Amazon EC2
    Apache Hadoop
    Java
    Linux System Administration
    NGINX
  • $45 hourly
As a highly experienced data engineer with over 10 years of expertise in the field, I have built a strong foundation in designing and implementing scalable, reliable, and efficient data solutions for a wide range of clients. I specialize in developing complex data architectures that leverage the latest technologies, including AWS, Azure, Spark, GCP, SQL, Python, and other big data stacks. My experience includes designing and implementing large-scale data warehouses, data lakes, and ETL pipelines, as well as systems that process and transform data in real time. I am also well versed in distributed computing and data modeling, having worked extensively with Hadoop, Spark, and NoSQL databases. As a team leader, I have managed and mentored cross-functional teams of data engineers, data scientists, and data analysts, providing guidance and support to ensure the delivery of high-quality, data-driven solutions that meet business objectives. If you are looking for a highly skilled data engineer with a proven track record of delivering scalable, reliable, and efficient data solutions, please do not hesitate to contact me. I am confident that I have the skills, experience, and expertise to meet your data needs and exceed your expectations.
    Featured Skill Hadoop
    Snowflake
    ETL
    PySpark
    MongoDB
    Unix Shell
    Data Migration
    Scala
    Microsoft Azure
    Amazon Web Services
    SQL
    Apache Hadoop
    Cloudera
    Apache Spark
  • $35 hourly
I have more than 17 years of software development experience, including a product management role. During this time, I have designed, architected, and developed multiple JEE applications across a variety of domains, from GIS and finance to IoT. I have extensive experience with a number of Spring-based open-source frameworks, and I have developed applications for big data and real-time streaming data processing using the Kafka/Spark stack (a minimal Kafka producer sketch follows the skill list below). I have also built a web GIS product for managing georeferenced maps.
    Featured Skill Hadoop
    JDBC
    Java
    Spring Framework
    NoSQL Database
    Big Data
    Apache Spark
    Apache Hadoop
    Apache Kafka
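A minimal kafka-python sketch of the streaming ingestion side of the Kafka/Spark stack described above; the broker address and topic are placeholders:

```python
# Publish a JSON event to a Kafka topic for a downstream Spark consumer.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",                       # assumed broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

producer.send("sensor-readings", {"sensor_id": 42, "temp_c": 21.5})
producer.flush()   # block until buffered messages are actually delivered
producer.close()
```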
  • $40 hourly
I am Aliabbas Bhojani, a data engineer with profound knowledge and experience in the core functionality of data engineering, Big Data processing, and cloud data architecture. I completed my Bachelor of Engineering with a specialisation in Computer Engineering, which has helped me target complex data problems, and I have proven my expertise by proposing high-performance cloud data architectures that help businesses scale. I'm very familiar with a wide variety of web platforms and infrastructure, so don't be afraid to run something by me for things like Apache Spark, Apache NiFi, Kafka, Apache Accumulo, Apache HBase, ZooKeeper, REST APIs, Java, Python, Scala, and JavaScript. I can work on your on-prem or cloud-deployed solution, whether that means setting up Kubernetes, Docker, or VMs on Azure, Amazon Web Services (AWS), or Google Cloud Platform (GCP).
A wide spectrum of offerings:
- Data engineering core values
- Data-driven business intelligence
- Automated real-time data pipelines
- Advanced machine-learning-based data analytics
- Relational and non-relational data modelling
- Cloud-native data products
- Big Data handling with Apache Spark and Apache NiFi
- Open-source data tools usage and mindset
- AWS cloud data architecture and engineering
- Azure cloud data architecture and engineering
- GCP cloud data architecture and engineering
- Scaling data pipelines with Kubernetes and Docker
- Zero-downtime data pipelines using a cloud-agnostic approach
Feel free to reach out with any inquiries or project discussions.
Aliabbas Bhojani
    Featured Skill Hadoop
    Snowflake
    Cloud Architecture
    Data Lake
    Apache Accumulo
    ETL
    DevOps
    Machine Learning
    PySpark
    Apache NiFi
    Apache Spark
    Python
    Java
    SQL
    Data Engineering
    Apache Hadoop
  • $50 hourly
With around 13 years of IT experience on data-driven applications, I excel at building robust data foundations for both structured and unstructured data from diverse sources. I also possess expertise in efficiently migrating data lakes and pipelines from on-premise to cloud environments. My skills include designing and developing scalable ETL/ELT pipelines using technologies such as Spark, Kafka, PySpark, Hadoop, Hive, dbt, and Python, leveraging cloud services like AWS, Snowflake, dbt Cloud, Airbyte, BigQuery, and Metabase, along with a good understanding of containerization frameworks like Kubernetes and Docker.
    Featured Skill Hadoop
    Apache Airflow
    Apache Hive
    Databricks Platform
    Apache Spark
    Python
    Apache Hadoop
    PySpark
    Snowflake
    Amazon S3
    dbt
    Database
    Oracle PLSQL
    Unix Shell

How hiring on Upwork works

1. Post a job

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.