Hire the best Apache Spark Engineers in Pune, IN

Check out Apache Spark Engineers in Pune, IN with the skills you need for your next job.
Clients rate Apache Spark Engineers 4.7/5, based on 283 client reviews.
  • $35 hourly
    ✨ Seasoned software professional with 20+ years of experience in end-to-end software development, including 8+ years specializing in Big Data technologies and cloud-based solutions. Proven expertise in building scalable, high-performance data platforms using Apache Spark, Hadoop, Hive, and Cassandra, and programming in Scala, Python, Java, and C++.
    ✨ I focus on designing robust, enterprise-grade Big Data and Data Engineering architectures in both on-prem and cloud environments (GCP, AWS, and Azure). My role involves solution architecture, technical leadership, and hands-on development of critical components.
    ✨ I am passionate about leveraging my experience to build cutting-edge data and AI solutions. Open to senior technical roles, consulting opportunities, and innovative startup environments.
    🔹 Keen eye for the scalability and sustainability of a solution
    🔹 Can quickly produce maintainable, well-structured object-oriented designs
    🔹 Highly experienced in working effectively with remote teams
    🔹 Aptitude for recognizing business requirements and solving the root cause of a problem
    🔹 Can quickly learn new technologies
    🔹 Transparency, dedication, quality, and satisfaction guaranteed
    Sound experience in the following technology stacks:
    ✨ Big Data: Apache Spark, Spark Streaming, HDFS, Hadoop MR, Hive, Apache Kafka, Cassandra, Google Cloud Platform (Dataproc, Cloud Storage, Cloud Functions, Datastore/Firestore, Pub/Sub), Cloudera Hadoop 5.x
    ✨ Languages: Scala, Python, Java, C++, C; Scala with the Akka and Play frameworks
    ✨ Build Tools: sbt, Maven
    ✨ Databases: Postgres, Oracle, MongoDB/CosmosDB
    ✨ GCP Services: GCS, Dataproc, Cloud Functions, Pub/Sub, Datastore, BigQuery
    ✨ AWS Services: S3, EC2, Auto Scaling groups, EMR, S3 Java APIs, Redshift, MongoDB
    ✨ Azure Services: Blob Storage, VMs, VM scale sets, Blob Java APIs, Synapse, CosmosDB
    ✨ Other Tools/Technologies: Kubernetes, Docker, Terraform
    Worked with many input and storage formats: CSV, XML, JSON, MongoDB, Parquet, ORC
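    As a minimal, framework-free illustration of the kind of format handling this profile describes (converting between record formats such as JSON and CSV), here is a hedged sketch; the field names ("id", "name") are hypothetical, and real pipelines would use Spark readers and writers rather than the standard library:

```python
import csv
import io
import json

def json_records_to_csv(json_lines):
    """Convert newline-delimited JSON records to CSV text.

    A minimal sketch of format conversion; the field names
    ("id", "name") are illustrative, not from any real schema.
    """
    records = [json.loads(line) for line in json_lines if line.strip()]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "name"])
    writer.writeheader()
    for rec in records:
        # Keep only the known columns; missing keys become empty cells
        writer.writerow({k: rec.get(k) for k in ["id", "name"]})
    return buf.getvalue()

print(json_records_to_csv(['{"id": 1, "name": "a"}', '{"id": 2, "name": "b"}']))
```

    In Spark, the same conversion would typically be a `spark.read.json(...)` followed by `df.write.csv(...)` or `df.write.parquet(...)`.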
    Featured Skill Apache Spark
    C++
    Java
    Scala
    Apache Hadoop
    Python
    Apache Cassandra
    Oracle PLSQL
    Apache Hive
    Cloudera
    Google Cloud Platform
  • $25 hourly
    Hands-on data architect and senior data engineer with 10+ years of experience designing and building end-to-end, high-velocity, high-volume real-time and batch data platforms from scratch on public clouds and on-premises. Built and worked on petabyte-scale data platforms at top companies. Open-source contributor to data technologies and products such as Airbyte. Love working on database internals. I also have experience designing MLOps platforms, and have worked with telemetry, payments, video, sports, e-commerce, affiliate marketing, and log data.
    Skill Set:
    Big Data Technologies: Spark, Kafka, Flink, Presto, Dremio, Hudi
    Data Warehouses: Snowflake, Druid, ClickHouse, Redshift, SingleStore (MemSQL)
    Databases: MySQL, Postgres, Cassandra, DynamoDB
    Programming Languages: Golang, Python, Rust, Scala, Java
    Visualization: Tableau, Apache Superset, Zoomdata
    Data Technologies: Airbyte, Fivetran, Dagster, Airflow, NiFi, Kubeflow, Elasticsearch, OpenSearch
    Ops: Kubernetes, Docker
    Cloud: AWS, GCP
    Featured Skill Apache Spark
    Apache Druid
    ClickHouse
    Machine Learning
    Data Lake
    Streaming Platform
    PostgreSQL
    Golang
    Big Data
    Amazon Web Services
    Snowflake
    Data Engineering
    Apache Kafka
    Apache Cassandra
    Scala
  • $40 hourly
    Senior Data Engineer with 9 years of experience in data engineering with Python, Spark, Databricks, ETL pipelines, and Azure and AWS services. I develop PySpark scripts and store data in ADLS using Azure Databricks. I have also created data pipelines that read streaming data from MongoDB and built Neo4j graphs from that stream-based data, and I am well-versed in designing and modeling databases using Neo4j and MongoDB. I am seeking a challenging opportunity in a dynamic organization that can enhance my personal and professional growth while enabling me to make valuable contributions toward the company's objectives.
    • Utilizing Azure Databricks to develop PySpark scripts and store data in ADLS.
    • Developing producers and consumers for stream-based data using Azure Event Hubs.
    • Designing and modeling databases using Neo4j and MongoDB.
    • Creating data pipelines for reading streaming data from MongoDB.
    • Creating Neo4j graphs based on stream-based data.
    • Visualizing data for supply-demand analysis using Power BI.
    • Developing data pipelines on Azure to integrate Spark notebooks.
    • Developing ADF pipelines for a multi-environment, multi-tenant application.
    • Utilizing ADLS and Blob Storage to store and retrieve data.
    • Proficient in Spark, HDFS, Hive, Python, PySpark, Kafka, SQL, Databricks, and Azure and AWS technologies.
    • Utilizing AWS EMR clusters to run Hadoop-ecosystem components such as HDFS, Spark, and Hive.
    • Experienced in using AWS DynamoDB for data storage and ElastiCache for caching.
    • Involved in data migration projects moving data from SQL and Oracle databases to AWS S3 or Azure storage.
    • Skilled in designing and deploying dynamically scalable, fault-tolerant, highly available applications on the AWS cloud.
    • Executed transformations using Spark and MapReduce, loaded data into HDFS, and used Sqoop to extract data from SQL databases into HDFS.
    • Proficient in working with Azure Data Factory, Azure Data Lake, Azure Databricks, Python, Spark, and PySpark.
    • Implemented a cognitive model for telecom data using NLP and a Kafka cluster.
    • Competent in big data processing using Hadoop, MapReduce, and HDFS.
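    The producer/consumer hand-off at the heart of the stream-based pipelines described above can be sketched, framework-free, with a thread-safe queue. This is only an illustration of the pattern; real pipelines would use Event Hubs or Kafka client libraries, and the events, sentinel, and transformation here are all illustrative:

```python
import queue
import threading

def run_stream(events):
    """Tiny producer/consumer sketch of a streaming hand-off.

    The `None` sentinel and the upper-casing "transformation" are
    stand-ins; a real consumer would deserialize, enrich, and persist.
    """
    q = queue.Queue()
    results = []

    def producer():
        for e in events:
            q.put(e)
        q.put(None)  # sentinel: signal end of stream

    def consumer():
        while True:
            e = q.get()
            if e is None:
                break
            results.append(e.upper())  # stand-in transformation

    t_prod = threading.Thread(target=producer)
    t_cons = threading.Thread(target=consumer)
    t_prod.start(); t_cons.start()
    t_prod.join(); t_cons.join()
    return results

print(run_stream(["reading", "mongo", "stream"]))
```

    The sentinel-based shutdown is the simplest way to end a consumer cleanly; managed brokers replace it with consumer-group semantics and checkpointing.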
    Featured Skill Apache Spark
    Microsoft Azure SQL Database
    SQL
    MongoDB
    Data Engineering
    Microsoft Azure
    Apache Kafka
    Apache Hadoop
    AWS Glue
    PySpark
    Databricks Platform
    Hive Technology
    Azure Cosmos DB
    Apache Hive
    Python
  • $50 hourly
    Hands-on experience developing Analytics, Machine Learning, Data Science, Big Data, and AWS solutions.
    Featured Skill Apache Spark
    Apache Cordova
    Cloud Services
    Analytics
    PySpark
    Data Science
    Python
    Machine Learning
  • $75 hourly
    Certified TOGAF 9 Enterprise Architect with over 18 years of IT service experience, specializing in solution architecture, innovation, consulting, and leading diverse projects. My extensive background in IT services has honed my skills in consulting, architecture, and software development. I am now focused on leveraging these skills in AI, Machine Learning, Data Lakes, and Analytics, seeking opportunities that challenge me to continue learning and applying cutting-edge technologies in real-world applications.
    Recent Projects and Specializations:
    Artificial Intelligence & Machine Learning: Developed several generative AI projects, including a solution for manufacturing operators that provides real-time fixes based on user-generated prompts and descriptions; an AI-driven healthcare lab assistant that suggests diagnostic tests based on user inputs; and advanced ML algorithms for monitoring pH levels in sugar production, crucial for maintaining quality control over product consistency. Implemented an ML model for HVAC systems that predicts power-consumption spikes and potential breakdowns, enhancing maintenance efficiency and energy management.
    Data Science & Big Data: Expertise in handling large-scale data environments from terabytes to petabytes, developing actionable insights across multiple domains including Retail, Finance, Manufacturing, IoT, and Healthcare. Proficient in Apache Hadoop, Spark, Cloudera CDH, Hortonworks, and MapR, and in real-time data processing with Apache Hive and Elasticsearch.
    Cloud Architecting & Data Lakes: Skilled in designing and implementing robust cloud solutions and data lakes that streamline data accessibility and analysis, supporting high-level decision-making processes.
    Business Intelligence & Analytics: Experienced in integrating BI tools and technologies such as Splunk, Tableau, and OBIEE to transform raw data into valuable business insights.
    Industry Expertise: Telecom, Retail, Banking & Financial Services, Utilities, Education
    Featured Skill Apache Spark
    Apache Superset
    Amazon Web Services
    CI/CD Platform
    Google Cloud Platform
    Cloud Computing
    Cloud Migration
    Microsoft Azure
    Cloud Security
    Data Privacy
    Data Management
    Data Ingestion
  • $60 hourly
    With over 14 years in data engineering and a decade of experience with public cloud platforms (AWS, Azure, GCP), I have evolved into a seasoned professional in Machine Learning and Artificial Intelligence. My focus lies in designing, deploying, and managing scalable AI solutions that drive business transformation.
    Core Competencies:
    Machine Learning & AI Development: Expertise in developing and deploying ML models using Python (scikit-learn, TensorFlow, PyTorch) for applications such as predictive analytics, natural language processing, and computer vision.
    MLOps & Model Deployment: Proficient in implementing MLOps practices, including CI/CD pipelines, model versioning, and automated monitoring using tools like MLflow, DVC, and Kubernetes.
    Agentic AI & LLM Integration: Skilled in building autonomous AI agents leveraging Large Language Models (LLMs) to perform complex, multi-step tasks with minimal human intervention, enhancing operational efficiency and decision-making processes.
    Cloud-Based Data Solutions: Experienced in architecting and managing data pipelines and AI workflows on cloud platforms, ensuring scalability, reliability, and security.
    Data Engineering Foundations: Strong background in data modeling, ETL processes, and database management, facilitating seamless integration between data engineering and AI workflows.
    I am passionate about harnessing the power of AI to solve real-world problems and drive innovation. My combined experience in data engineering and AI positions me to deliver end-to-end solutions that are both robust and scalable.
    Featured Skill Apache Spark
    LangChain
    Pinecone
    Snowflake
    Elasticsearch
    AWS Development
    Azure Service Fabric
    Azure Machine Learning
    Python
    Microsoft Azure
    Data Lake
    Databricks Platform
    Data Engineering
    Azure Cosmos DB
    SQL
  • $45 hourly
    As a highly experienced Data Engineer with over 10 years of expertise in the field, I have built a strong foundation in designing and implementing scalable, reliable, and efficient data solutions for a wide range of clients. I specialize in developing complex data architectures that leverage the latest technologies, including AWS, Azure, Spark, GCP, SQL, Python, and other big data stacks. My extensive experience includes designing and implementing large-scale data warehouses, data lakes, and ETL pipelines, as well as data processing systems that process and transform data in real time. I am also well-versed in distributed computing and data modeling, having worked extensively with Hadoop, Spark, and NoSQL databases. As a team leader, I have successfully managed and mentored cross-functional teams of data engineers, data scientists, and data analysts, providing guidance and support to ensure the delivery of high-quality data-driven solutions that meet business objectives. If you are looking for a highly skilled Data Engineer with a proven track record of delivering scalable, reliable, and efficient data solutions, please do not hesitate to contact me. I am confident that I have the skills, experience, and expertise to meet your data needs and exceed your expectations.
    Featured Skill Apache Spark
    Snowflake
    ETL
    PySpark
    MongoDB
    Unix Shell
    Data Migration
    Scala
    Microsoft Azure
    Amazon Web Services
    SQL
    Apache Hadoop
    Cloudera
  • $60 hourly
    🚀 Elevating Data for a Brighter Tomorrow 🌐 📊 Data & AI Enthusiast | Architect | Innovator 🔗 Let's Connect and Transform the Data Landscape Together
    🌟 About Me: I'm not your typical Data Architect. I'm a relentless seeker of elegant solutions in the labyrinth of data. With over a decade of software development experience, my journey has been a relentless pursuit of harnessing the power of data to shape the future.
    💡 My Path: I've always been drawn to the intersection of technology, innovation, and data. Over the years, I've built a career marked by:
    🔹 Deep Domain Knowledge: A versatile architect with an unquenchable thirst for understanding diverse domains, having dived deep into the essence of many industries.
    🔹 Innovation Incarnate: I thrive on redefining the limits of what's possible. In R&D, I've crafted innovative software designs and solutions that change the game.
    🔹 Technology Savvy: My toolkit includes cutting-edge technologies: Scala, Python, Akka, Spark, Kafka, AWS, GCP, and more. It's not about the tech; it's about what we can achieve with it.
    🔹 Architectural Vision: Actively shaping high-level and low-level design decisions, I'm the maestro behind the symphony of architecture.
    🔹 End-to-End Expertise: My journey spans from network infrastructure to the cloud, from code to deployment. Versatility is my middle name.
    🌌 My Vision: My career is a quest to conquer the data cosmos. I'm not just solving problems; I'm crafting solutions to real-world challenges. The thrill of making technology serve humanity fuels my passion.
    🌎 The Future Beckons: In a world awash with data, I'm your guide to unlocking its potential. If you're ready to embark on a journey where data isn't just a tool but a game-changer, let's connect.
    Featured Skill Apache Spark
    Amazon Web Services
    Google Cloud Platform
    Data Integration
    Scala
    ETL Pipeline
    Python
    Big Data
    Snowflake
    Database Architecture
    BigQuery
    Apache Cassandra
    Data Science
    Apache Kafka
    Machine Learning
  • $30 hourly
    Data Architect with sound knowledge of data analytics, architecture, and engineering, proficient in cloud technologies, Python, Spark, and Big Data ecosystems, with 8+ years of experience in problem solving and consulting. I am also proficient in designing and implementing end-to-end architectures for data-driven approaches, with performance and efficiency optimized by design.
    Featured Skill Apache Spark
    Data Analysis
    Machine Learning
    PySpark
    Big Data
    Databricks Platform
    Python
  • $60 hourly
    * Data Engineer with around 15+ years of extensive experience analyzing requirements and designing data solutions for credit card, insurance, banking, healthcare, and retail companies.
    * Designed, architected, and developed a data-ingestion framework to ingest terabytes of data for top retail clients at Toshiba Global Commerce Solutions.
    * Well versed in technologies such as Azure Databricks PySpark ETL and related Azure services: Azure Blob Storage, Azure Key Vault, Azure Database for PostgreSQL, and Azure Synapse.
    * Experience implementing the Medallion architecture with Databricks Delta tables to extract, transform, and load terabytes of data across Bronze/Silver/Gold layers using the Databricks Auto Loader utility.
    * Experience implementing real-time streams using Spark.
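    The Bronze/Silver/Gold layering mentioned above can be sketched in plain Python to show the shape of the pattern. This is only an illustration, not the Databricks/Delta implementation: each "layer" is just a list or dict here, and the field names ("region", "amount") are hypothetical:

```python
def medallion(raw_rows):
    """Illustrative Bronze/Silver/Gold layering as plain Python.

    In a real Databricks pipeline each layer would be a Delta table
    fed by Auto Loader; here a list stands in for each table, and
    the schema ("region", "amount") is hypothetical.
    """
    # Bronze: land raw records as-is, tagging provenance
    bronze = [dict(r, _source="ingest") for r in raw_rows]
    # Silver: cleanse - drop rows missing required fields, cast types
    silver = [
        {"region": r["region"], "amount": float(r["amount"])}
        for r in bronze
        if r.get("region") and r.get("amount") is not None
    ]
    # Gold: aggregate for consumption - total amount per region
    gold = {}
    for r in silver:
        gold[r["region"]] = gold.get(r["region"], 0.0) + r["amount"]
    return bronze, silver, gold

_, _, totals = medallion([
    {"region": "east", "amount": "10"},
    {"region": "east", "amount": "5"},
    {"region": None, "amount": "7"},  # dropped at the Silver layer
])
print(totals)
```

    The key design point the layers encode is that raw data is never mutated: each layer derives from the previous one, so cleansing or aggregation logic can be re-run over Bronze at any time.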
    Featured Skill Apache Spark
    Scala
    Python
    PySpark
    Databricks Platform
    Ab Initio
  • $40 hourly
    Seasoned Data Engineer and Generative AI Specialist with extensive experience in building scalable, end-to-end data solutions.
    - Proficient in Databricks, Snowflake, AWS, and Azure.
    - I design and optimize data pipelines, cloud architectures, and AI-driven applications.
    - My expertise spans advanced data engineering, Generative AI, and cloud-native solutions to drive business insights and innovation.
    Let's collaborate to transform your data challenges into efficient, impactful results.
    Featured Skill Apache Spark
    Golang
    Data Analytics
    NoSQL Database
    AWS Glue
    Amazon Web Services
    Elixir
    Python
    Machine Learning
    Generative AI
    ETL Pipeline
    ETL
    PySpark
    Snowflake
    Databricks Platform
  • $50 hourly
    Experience: 11.7 years. Seasoned Senior Data Engineer with over 11 years of proven expertise in the Information Technology industry, specializing in delivering robust, scalable, and innovative data solutions across diverse domains including Banking, Finance, and Telecom.
    Big Data Expertise: 9+ years of comprehensive experience in Big Data ecosystems; proficient in Hadoop, MapReduce, Spark (Scala and PySpark), Hive, Kafka, Sqoop, Apache Pig, and Oozie.
    Cloud Proficiency: Hands-on expertise in leading cloud platforms, including Google Cloud Platform (GCP) with tools like BigQuery, Dataproc, and Cloud Composer, and Azure with Databricks, Azure Data Lake Storage (ADLS), and Azure Data Factory (ADF). Demonstrated ability to design and implement solutions leveraging Delta tables and Databricks for PoCs.
    Data Engineering & Processing: Adept in data cleansing, curation, migration, and ingestion.
    Featured Skill Apache Spark
    BigQuery
    Google Cloud Platform
    Big Data File Format
    Data Engineering
    dbt
    Sqoop
    Apache Hive
    Apache Kafka
    Apache Airflow
    Apache Hadoop
    Python
    Scala
    PySpark
    Big Data
  • $50 hourly
    SUMMARY: Professional with 7.5 years of extensive experience in Java, Hadoop, Spark, Hive, Kafka, HBase, graph databases, relational databases, and streaming applications. Excellent problem-solving and interpersonal skills, and a quick learner. Experienced in creating complex queries in MySQL and Spark SQL and in debugging issues. Worked on various phases of the SDLC, such as requirement analysis, design, implementation, testing, and documentation. Works within a team, collaborating and providing comments and suggestions to achieve client requirements. Worked on distributed-systems architecture and handled huge amounts of data using technologies like HDFS, HBase, Spark, and NiFi (Spark SQL). Knowledge of RF analytics, wireless networking, and the banking domain.
    Featured Skill Apache Spark
    Data Extraction
    YARN
    HDFS
    Apache Impala
    Oracle
    MySQL
    Neo4j
    Apache HBase
    Hive
    Java
  • $35 hourly
    I'm Yogendra, a hands-on technology executive with 19+ years of experience building AI-native platforms, enterprise data systems, and cloud-scale infrastructure. I've led global engineering teams at MSCI and BNY Mellon, built data platforms handling billions of daily events, and delivered low-latency APIs over petabyte-scale datasets. As founder of Colrows, I designed a proprietary SQL engine and orchestrated LLMs (ChatGPT, Claude, Gemini) using Graph RAG and agentic AI to turn natural language into accurate, actionable insights.
    Expertise:
    LLM integration, prompt engineering, Graph RAG
    Cloud-native architecture (AWS, Azure, GCP)
    Data lakes, real-time systems, and platform scalability
    Team building, roadmap ownership, GTM support
    I work with startups and enterprises as a fractional CTO, technical advisor, or platform architect, helping scale products, modernize data infra, or bring AI into production. Let's build something impactful.
    Featured Skill Apache Spark
    Data Engineering
    Stream Processing Framework
    MongoDB
    Vector Database
    PySpark
    Apache Cassandra
    Elasticsearch
    LangChain
    LLM Prompt Engineering
    Apache Kafka
    Java
    Artificial Intelligence
    Machine Learning Model
    ETL
  • $18 hourly
    I am a full-stack developer/lead and solution architect with 10+ years of experience and the expertise to deliver a wide range of projects. I have strong experience building full applications that require scalable architecture, having worked on all stages of development, from design and development to deployment. My passion for programming and coding led me to Upwork, a platform where I can put my knowledge, experience, passion, and geekiness together and set my own limits.
    My expertise:
    ✔️ Front-end Development: JavaScript / React / React Native / Redux / Angular / Ionic / Vue
    ✔️ Back-end Development: Python / Node / Express / Java Spring Boot / REST API / Golang / Laravel / Nest.js / Next.js
    ✔️ Databases: PostgreSQL / MySQL / MongoDB / DynamoDB
    ✔️ Data Engineering: Data Pipelines / ETL / Hive / Spark / Kafka / Drill
    ✔️ AWS Cloud Services: Amplify / Lambda / EC2 / CloudFront / S3 / Microservices
    ✔️ Responsibilities and Contributions:
    • Involved in various stages of the software development life cycle, including development, testing, and implementation.
    • Analyzing and validating functional requirements.
    • Suggesting better approaches, preparing detailed documents, and periodically estimating the time required to deliver the system.
    • Configuring and customizing the application per the given business requirements.
    • Using a sandbox for testing and thereafter migrating code to the deployment instance.
    • Analyzing requirements and developing modules.
    • Discussing requirements, the feasibility of changes, and the impact on current functionality on site.
    I have excellent time-management skills to define priorities and implement activities tailored to meet deadlines. My aptitude and creative problem-solving skills help me apply innovative solutions to complex issues. I am always eager to offer added value to customers by providing suggestions about the project.
    Featured Skill Apache Spark
    React
    React Native
    Angular 10
    Apache Kafka
    AWS Lambda
    Golang
    Apache Hive
    Spring Boot
    NodeJS Framework
    Vue.js
    Amazon EC2
    Python
    Java
  • $90 hourly
    *******Certified Apache Airflow Developer*******
    With more than 7 years of professional experience and a master's of engineering in Information Technology, I currently work full time as a Senior Consultant with a multinational company in a Data Engineering role, mostly using Python, PySpark, Airflow, Palantir Foundry, Collibra, and SQL. In past professional years I have also worked as a full-stack developer building REST APIs and UI functionality, and I have mobile development experience using Flutter, Android, and Xojo (for iOS). Please consider me if you want your work done on time.
    Featured Skill Apache Spark
    Amazon Web Services
    RabbitMQ
    Node.js
    Amazon S3
    JavaScript
    PySpark
    Databricks Platform
    Apache Airflow
    SQL
    Python
    ETL Pipeline
    Kubernetes
    Docker
    Java
  • $20 hourly
    I understand your business needs well, find problems in your business using your past data, and find or create new ways to solve them.
    Featured Skill Apache Spark
    Snowflake
    PySpark
    Databricks Platform
    Weka
    Apache Spark MLlib
    Data Science
    Data Mining
    Oracle PLSQL
    Apache Kafka
    Scala
    Python
    SQL
    Microsoft SQL Server
    Spring Framework
  • $17 hourly
    Welcome to my profile! With over 5 years of hands-on experience in cloud technology, specializing in AWS and Azure, I am a dedicated Data Engineer passionate about transforming complex data into actionable insights. I thrive on designing and implementing scalable solutions that drive organizational growth and efficiency. As a Cloud Data Engineer, I possess a deep understanding of cloud architecture, data storage, and processing frameworks. My expertise extends to AWS services, such as EC2, S3, Redshift, Glue, and Lambda, as well as Azure services, including Azure Data Factory, Azure Databricks, and Azure SQL Database. I leverage these tools and technologies to build robust data pipelines, optimize data ingestion, and ensure data integrity. Throughout my career, I have successfully executed end-to-end data engineering projects, collaborating with cross-functional teams to deliver high-quality solutions. I have a proven track record of designing and implementing data warehouses, data lakes, and ETL processes to enable efficient data management and analysis. In previous engagements, I have tackled complex challenges, such as data integration across multiple systems, real-time data processing, and implementing scalable architectures to handle large volumes of data. I am skilled in transforming raw data into meaningful insights using SQL, Python, and other relevant programming languages. My commitment to delivering excellence is complemented by my ability to understand business requirements and translate them into technical solutions. I prioritize performance, security, and cost optimization in every project, ensuring that my clients achieve their desired outcomes while maximizing ROI. Client satisfaction is at the core of my work philosophy. I communicate effectively, maintain regular progress updates, and actively seek client feedback to ensure alignment and exceed expectations. 
I am committed to fostering long-term partnerships and providing ongoing support to my clients. I hold a certification in AWS and continually expand my knowledge through professional development initiatives, staying up to date with the latest advancements in cloud technology and data engineering. If you are seeking a dedicated Cloud Data Engineer who can drive your data initiatives forward, I am ready to collaborate with you. Let's discuss your project requirements and how I can leverage my expertise to deliver exceptional results. Contact me now to get started!
    Featured Skill Apache Spark
    Databricks Platform
    Data Analysis
    Git
    Microsoft Azure
    AWS Glue
    Database Modeling
    Data Cleaning
    AWS IoT Analytics
    PySpark
    AWS Lambda
    Spreadsheet Software
    Amazon Redshift
    Apache Kafka
    Data Scraping
    Amazon S3
    Microsoft Azure SQL Database
    Amazon EC2
    Data Lake
    SQL
    Python
  • $6 hourly
    EXECUTIVE SUMMARY: A dynamic professional in the field of data analysis, continuously learning data analysis tools, programming languages, and databases. Seeking an opportunity to start a career on a positive note in the data analysis field!
    PROJECTS HANDLED
    EDA (Exploratory Data Analysis):
    1. Zomato
    2. CarDekho
    Tableau Dashboards:
    1. COVID-19 India Data Analysis
    2. Amazon Stocks Analysis
    3. Employee Joining Data Analysis
    4. Sentiment Analysis Using TabPy
    5. Data Analysis on Foreign Direct Investment (FDI) in India
    Featured Skill Apache Spark
    Databricks Platform
    Data Warehousing & ETL Software
    Informatica
    Artificial Intelligence
    Python
    Tableau
    MySQL
  • $25 hourly
    I am a freelance project manager leading a team of big data developers, and we have delivered multiple big data projects. We also provide big data training in Spark, Scala, and PySpark, customized to clients' needs, as well as job support.
    Skills: Big Data, Apache Spark, Scala, PySpark, Spark SQL, Spark Streaming, Kafka, Cassandra, Oozie, Apache Airflow, the Hadoop technology stack, HDFS, MapReduce, Sqoop, Hive, and AWS big data services
    Featured Skill Apache Spark
    Databricks Platform
    AWS Lambda
    AWS Glue
    Data Engineering
    Project Management
    Big Data
    Apache Hadoop
    Scala
    Python
    PySpark
  • $20 hourly
    SUMMARY Dedicated Data Engineer with 4+ years of experience in Spark, PySpark, SQL and Python. Proven ability to quickly learn and adapt, delivering high-quality data solutions.
    Featured Skill Apache Spark
    Amazon Athena
    AWS Glue
    Core Java
    SAP HANA
    Hive
    Python
    SQL
    PySpark
  • $30 hourly
    Results-driven Data Engineer with expertise in building and optimizing scalable data pipelines using Apache Spark and Kafka for real-time and batch processing. Skilled in managing Hadoop ecosystems (HDFS, Hive, YARN) and leveraging cloud platforms like AWS (S3, Redshift, Glue, EC2) for seamless data integration, storage, and analytics. Proficient in implementing CI/CD pipelines with tools like Jenkins and Terraform to streamline deployments and automation. Experienced in enforcing data-security protocols through Apache Ranger and Kerberos, ensuring compliance and integrity across systems. Strong analytical mindset with a proven ability to monitor, analyze, and optimize system performance, enabling data-driven decisions that align with business goals. Collaborative team player with a commitment to delivering high-quality, actionable insights.
    EXPERIENCE: Apache NiFi, Apache Hive, Apache Impala, Apache HUE, Apache Sqoop, Apache Oozie, Apache Ranger, Spark, Hadoop, Kubernetes, Terraform, CI/CD
    Featured Skill Apache Spark
    Terraform
    Kerberos
    Solution Architecture
    Linux
    Apache Zookeeper
    YARN
    Apache Hadoop
    PySpark
    Apache Kafka
    Mining
    Data Mining
    ETL Pipeline
    ETL
    Data Extraction
  • $25 hourly
    I'm a Data Engineer with over 9 years of experience building scalable data platforms and ETL pipelines using AWS, Python, PySpark, and SQL. I specialize in serverless architectures, big data processing, and cloud-native data engineering. I have led agile teams and delivered high-impact solutions for global clients in healthcare, banking, and pharma. I'm passionate about clean architecture, automation, and solving complex data challenges at scale. Available for part-time and contract-based remote work.
    Featured Skill Apache Spark
    PostgreSQL
    Relational Database
    pandas
    Git
    GitLab
    Docker
    Flask
    AWS Lambda
    SQL
    Data Engineering
    Python
    PySpark
    AWS Glue
    ETL Pipeline
  • $30 hourly
    I’m a results-driven Data Engineer with strong expertise in building, managing, and optimizing scalable data pipelines and cloud-based solutions. With hands-on experience across leading technologies including Google Cloud Platform (GCP), BigQuery, Apache Airflow, Apache Spark, Snowflake, Hive, and SQL/PLSQL, I help businesses turn raw data into reliable, analytics-ready assets. Over the years, I’ve successfully delivered projects involving data migration, ETL pipeline development, data warehousing, and data administration, primarily using tools like PySpark, Airflow, and GCP-native services. I'm also proficient with version control and agile tools such as Git and Jira, ensuring smooth, collaborative workflows. Whether you're migrating data to the cloud, orchestrating complex workflows, or optimizing your existing data infrastructure, I bring the technical acumen and reliability needed to get it done right. Let’s discuss how I can help you build robust data systems that scale with your business.
    Featured Skill Apache Spark
    Data Migration
    Jira
    Git
    PySpark
    Database Administration
    Oracle PLSQL
    Hive
    Microsoft SQL Server
    Snowflake
    Teradata
    Apache Airflow
    BigQuery
    Google Cloud Platform
  • $50 hourly
    I'm a Python, Scala, C++, and Go developer with extensive experience in building big data pipelines and Machine Learning models and managing Linux containers. I really like toying with systems-level stuff like compilers and OS schedulers. I have expertise in building teams and working on complex systems. I like working on projects with a team that cares about creating new technologies. It's important to me to build long-term relationships with clients, so I'm primarily looking for long-term projects. I'm flexible with my working hours and am happy to work closely with any existing freelancers you work with. I look forward to hearing from you!
    Featured Skill Apache Spark
    CUDA
    Machine Learning
    Statistics
    Apache Flink
    Big Data
    Distributed Computing
    Amazon Kinesis Video Streams
    Apache Kafka
    AWS Lambda
    pandas
    Python
    C++
    Java
    PyTorch
  • $40 hourly
    📊 Passionate Big Data Technologist | Transforming Data into Insights 🚀 👋 Hello! I'm Monu Choudhary, a dedicated professional in the exciting world of Big Data technology. With 4 years of experience, I thrive on harnessing the power of data to drive informed decision-making, spark innovation, and deliver measurable results. 💡 My Expertise: 🔹 Big Data Analytics 🔹 Data Engineering 🔹 Data Warehousing 🔹 Data Computing 🔹 Data Visualization 🔹 Cloud Computing 🔍 What Sets Me Apart: 🌟 I possess a strong foundation in both the technical and strategic aspects of Big Data. I love tackling complex challenges, whether it's optimizing data pipelines, developing predictive models, or crafting compelling data stories. 💼 Professional Journey: 🚀 Throughout my career, I've had the privilege of working with leading organizations, where I've contributed to data-driven successes. I've played a key role in implementing cutting-edge data solutions and have been part of cross-functional teams that have made a significant impact. 🤖 A Glimpse into My Passion: 👉 I'm a firm believer in the potential of data to drive positive change. My goal is to empower businesses to unlock the full value of their data assets. I'm a continuous learner, staying current with the latest trends and technologies to provide innovative data solutions. 🔗 Let's Connect: I'm always open to connecting with fellow Big Data enthusiasts, professionals, and thought leaders. Whether you want to discuss a fascinating data project, share insights, or explore potential collaborations, feel free to reach out. Let's grow and excel together in the ever-evolving landscape of Big Data. 📩 Contact Me: 📧 Email: monuwats1996@gmail.com Thank you for stopping by my profile. Let's embark on this data-driven journey together! 📈🌐 #BigData #DataAnalytics #DataEngineer #DataScience
    Featured Skill Apache Spark
    Microsoft Azure SQL Database
    Apache Hive
    Databricks Platform
    Microsoft Azure
    Data Lake
    HDFS
    ETL
    Apache Hadoop
    PySpark
    Data Engineering
  • $6 hourly
Prashant Patil. Seasoned professional with a decade of experience in the corporate world, currently excelling as a Team Lead at a renowned multinational corporation. I bring a wealth of expertise in Big Data solutions, using data-driven insights to make informed decisions and drive significant business growth. My role involves leading a talented team to tackle complex data challenges, delivering innovative and impactful results. In addition to my proficiency in Big Data, I have made substantial contributions in web and backend application development. With extensive experience in the financial and banking industries, I have collaborated on creating robust and efficient software solutions. My diverse skill set enables me to bridge the gap between cutting-edge technology and the specific needs of these industries, ensuring optimal outcomes for stakeholders. My career journey has been enriched with diverse experiences, enabling me to contribute effectively to team success and organizational growth. I am passionate about leveraging technology, fostering collaboration, and leading with a solution-oriented mindset to achieve excellence.
    Featured Skill Apache Spark
    Splunk
    Apache NiFi
    Android SDK
    Material Design
    WebRTC
    Web Services Development
    Cloudera
    Apache Hadoop
    Apache Hive
    Big Data
    API Development
    Spring Boot
    Java
    Scala

How hiring on Upwork works

1. Post a job

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.


How do I hire an Apache Spark Engineer near Pune on Upwork?

You can hire an Apache Spark Engineer near Pune on Upwork in four simple steps:

  • Create a job post tailored to your Apache Spark Engineer project scope. We’ll walk you through the process step by step.
  • Browse top Apache Spark Engineer talent on Upwork and invite them to your project.
  • Once the proposals start flowing in, create a shortlist of top Apache Spark Engineer profiles and interview your favorites.
  • Hire the right Apache Spark Engineer for your project from Upwork, the world’s largest work marketplace.

At Upwork, we believe talent staffing should be easy.

How much does it cost to hire an Apache Spark Engineer?

Rates charged by Apache Spark Engineers on Upwork can vary based on a number of factors, including experience, location, and market conditions. See hourly rates for in-demand skills on Upwork.

Why hire an Apache Spark Engineer near Pune on Upwork?

As the world’s work marketplace, we connect highly skilled freelance Apache Spark Engineers with businesses and help them build trusted, long-term relationships so they can achieve more together. Let us help you build the dream Apache Spark Engineer team you need to succeed.

Can I hire an Apache Spark Engineer near Pune within 24 hours on Upwork?

Depending on availability and the quality of your job post, it’s entirely possible to sign up for Upwork and receive Apache Spark Engineer proposals within 24 hours of posting a job description.