Hire the best Apache Spark Engineers in Pune, IN

Check out Apache Spark Engineers in Pune, IN with the skills you need for your next job.
Clients rate Apache Spark Engineers 4.7/5, based on 283 client reviews.
  • $35 hourly
    I have 15+ years of experience in software development in the Telecom, Banking, and Healthcare domains. My primary skill set includes the Big Data ecosystem (Apache Spark, Hive, MapReduce, Cassandra), Scala, Core Java, Python, and C++. I am well versed in designing and implementing Big Data solutions, ETL and data pipelines, and serverless and event-driven architectures on Google Cloud Platform (GCP) and Cloudera Hadoop 5.5. I like to work with organizations to develop sustainable, scalable, and modern data-oriented software systems.
    - Keen eye for the scalability and sustainability of the solution
    - Can quickly come up with maintainable, well-structured object-oriented designs
    - Highly experienced in working seamlessly and effectively with remote teams
    - Aptitude for recognizing business requirements and solving the root cause of the problem
    - Can quickly learn new technologies
    Sound experience with the following technology stacks:
    Big Data: Apache Spark, Spark Streaming, HDFS, Hadoop MR, Hive, Apache Kafka, Cassandra, Google Cloud Platform (Dataproc, Cloud Storage, Cloud Functions, Datastore, Pub/Sub), Cloudera Hadoop 5.x
    Languages: Scala, Python, Java, C++, C
    Build Tools: sbt, Maven
    Databases: Postgres, Oracle
    Input & storage formats: CSV, XML, JSON, MongoDB, Parquet, ORC
    Apache Spark
    C++
    Java
    Scala
    Apache Hadoop
    Python
    Apache Cassandra
    Oracle PLSQL
    Apache Hive
    Cloudera
    Google Cloud Platform
  • $90 hourly
    *******Certified Apache Airflow Developer******* With more than 7 years of professional experience, I hold a Master of Engineering in Information Technology. I currently work full time as a Senior Consultant with a multinational company, in a Data Engineering role working mostly with Python, PySpark, Airflow, Palantir Foundry, Collibra, and SQL. In past years I have also worked as a Full Stack Developer building REST APIs and UI functionality, and I have mobile development experience with Flutter, Android, and Xojo (for iOS). Please consider me if you want your work done on time.
    Apache Spark
    Amazon Web Services
    RabbitMQ
    Node.js
    Amazon S3
    JavaScript
    PySpark
    Databricks Platform
    Apache Airflow
    SQL
    Python
    ETL Pipeline
    Kubernetes
    Docker
    Java
  • $20 hourly
    I understand your business needs very well, identify problems in your business using your past data, and find or create new ways to solve them.
    Apache Spark
    Snowflake
    PySpark
    Databricks Platform
    Weka
    Apache Spark MLlib
    Data Science
    Data Mining
    Oracle PLSQL
    Apache Kafka
    Scala
    Python
    SQL
    Microsoft SQL Server
    Spring Framework
  • $75 hourly
    Certified TOGAF 9 Enterprise Architect with over 18 years of IT services experience, specializing in solution architecture, innovation, consulting, and leading diverse projects. My extensive background in IT services has honed my skills in consulting, architecture, and software development. I am now focused on leveraging these skills in AI, Machine Learning, Data Lakes, and Analytics, seeking opportunities that challenge me to continue learning and applying cutting-edge technologies in real-world applications.
    Recent projects and specializations:
    Artificial Intelligence & Machine Learning: Developed several generative AI projects, including a solution for manufacturing operators that provides real-time fixes based on user-generated prompts and descriptions; an AI-driven healthcare lab assistant that suggests diagnostic tests based on user inputs; and advanced ML algorithms for monitoring pH levels in sugar production, crucial for maintaining quality control over product consistency. Implemented an ML model for HVAC systems that predicts power consumption spikes and potential breakdowns, enhancing maintenance efficiency and energy management.
    Data Science & Big Data: Expertise in handling large-scale data environments from terabytes to petabytes, developing actionable insights across multiple domains including Retail, Finance, Manufacturing, IoT, and Healthcare. Proficient in Apache Hadoop, Spark, Cloudera CDH, Hortonworks, and MapR, and in real-time data processing with Apache Hive and Elasticsearch.
    Cloud Architecting & Data Lakes: Skilled in designing and implementing robust cloud solutions and data lakes that streamline data accessibility and analysis, supporting high-level decision-making processes.
    Business Intelligence & Analytics: Experienced in integrating BI tools and technologies like Splunk, Tableau, and OBIEE to transform raw data into valuable business insights.
    Industry expertise: Telecom, Retail, Banking & Financial Services, Utilities, Education
    Apache Spark
    Apache Superset
    Amazon Web Services
    CI/CD Platform
    Google Cloud Platform
    Cloud Computing
    Cloud Migration
    Microsoft Azure
    Cloud Security
    Data Privacy
    Data Management
    Data Ingestion
  • $40 hourly
    Experienced AWS-certified Data Engineer with around 4 years of experience in Big Data and related tools.
    AWS | GCP
    Hadoop | HDFS | Hive | Sqoop
    Apache Airflow | Apache Spark | Apache Kafka | Apache NiFi | Apache Iceberg
    Python | Bash | SQL | PySpark | Scala | Delta Lake
    DataStage | Git | Jenkins | SnapLogic | Snowflake
    Apache Spark
    Amazon API Gateway
    Google Cloud Platform
    Apache Kafka
    Apache Airflow
    Big Data
    Data Migration
    Apache NiFi
    Amazon Redshift
    Amazon Web Services
    PySpark
    AWS Lambda
    AWS Glue
    ETL
    Python
    SQL
  • $40 hourly
    I have 9+ years of total experience in Apache Spark using Java/Scala/Python, with the same years of experience in Kafka and Cassandra. I worked in the geospatial domain for 3 years, and I work part time as training faculty, preparing candidates for AWS, Spark, and Python.
    Apache Spark
    Amazon S3
    Amazon Web Services
    AWS Glue
    Apache Hadoop
    Apache Kafka
    AWS Lambda
    Scala
    Python
    Java
  • $60 hourly
    Database professional with vast, proven experience in the design, development, BI development, configuration, installation, and administration of different databases. I have been involved with very large database/data warehouse implementations of over 2 TB of data. I have completed a number of development, administration, and BI development projects on the following platforms:
    1] SQL Server 2005-2016, Azure DB
    2] MySQL 5.1-5.7, AWS Aurora
    3] MongoDB
    4] Amazon DynamoDB
    5] Amazon Redshift
    6] Elasticsearch
    Apache Spark
    Snowflake
    Microsoft SQL SSAS
    Elasticsearch
    AWS Development
    Azure Service Fabric
    Azure Machine Learning
    Python
    Microsoft Azure
    Data Modeling
    Data Lake
    Databricks Platform
    Data Engineering
    Azure Cosmos DB
    SQL
  • $45 hourly
    As a highly experienced Data Engineer with over 10 years of expertise in the field, I have built a strong foundation in designing and implementing scalable, reliable, and efficient data solutions for a wide range of clients. I specialize in developing complex data architectures that leverage the latest technologies, including AWS, Azure, Spark, GCP, SQL, Python, and other big data stacks. My extensive experience includes designing and implementing large-scale data warehouses, data lakes, and ETL pipelines, as well as data processing systems that process and transform data in real time. I am also well versed in distributed computing and data modeling, having worked extensively with Hadoop, Spark, and NoSQL databases. As a team leader, I have successfully managed and mentored cross-functional teams of data engineers, data scientists, and data analysts, providing guidance and support to ensure the delivery of high-quality, data-driven solutions that meet business objectives. If you are looking for a highly skilled Data Engineer with a proven track record of delivering scalable, reliable, and efficient data solutions, please do not hesitate to contact me. I am confident that I have the skills, experience, and expertise to meet your data needs and exceed your expectations.
    Apache Spark
    Snowflake
    ETL
    PySpark
    MongoDB
    Unix Shell
    Data Migration
    Scala
    Microsoft Azure
    Amazon Web Services
    SQL
    Apache Hadoop
    Cloudera
  • $29 hourly
    *Experience*
    • Hands-on experience upgrading HDP or CDH clusters to Cloudera Data Platform Private Cloud (CDP Private Cloud).
    • Extensive experience installing, deploying, configuring, supporting, and managing Hadoop clusters using Cloudera (CDH) distributions and HDP, hosted on Amazon Web Services (AWS) and Microsoft Azure.
    • Experience in upgrading Kafka, Airflow, and CDSW.
    • Configured various components such as HDFS, YARN, Sqoop, Flume, Kafka, HBase, Hive, Hue, Oozie, and Sentry.
    • Implemented Hadoop security.
    • Deployed production-grade Hadoop clusters and their components through Cloudera Manager/Ambari in virtualized environments (AWS/Azure) as well as on-premises.
    • Configured HA for Hadoop services with backup & disaster recovery.
    • Set up Hadoop prerequisites on Linux servers.
    • Secured clusters using Kerberos & Sentry as well as Ranger and TLS.
    • Experience designing and building scalable infrastructure and platforms to collect and process very large amounts of structured and unstructured data.
    • Experience adding and removing nodes, monitoring critical alerts, configuring high availability, configuring data backups, and data purging.
    • Cluster management and troubleshooting across the Hadoop ecosystem.
    • Performance tuning and solving Hadoop issues using the CLI, CMUI, and the Apache web UI.
    • Report generation on running nodes using various benchmark operations.
    • Worked with AWS services such as EC2 instances, S3, Virtual Private Cloud, and security groups, and Microsoft Azure services such as resource groups, resources (VMs, disks, etc.), Azure Blob Storage, and Azure storage replication.
    • Configured private and public IP addresses, network routes, network interfaces, subnets, and virtual networks on AWS/Microsoft Azure.
    • Troubleshooting, diagnosing, performance tuning, and solving Hadoop issues.
    • Linux installation and administration.
    • Fault finding, analysis, and logging information for reports.
    • Expert in Kafka administration and deploying UI tools to manage Kafka.
    • Implementing HA for MySQL.
    • Installing/configuring Airflow for job orchestration.
    Apache Spark
    Apache Kafka
    Apache Hive
    Apache Airflow
    YARN
    Hortonworks
    Apache Hadoop
    Apache Zookeeper
    Cloudera
    Apache Impala
  • $50 hourly
    A skilled programmer with strong domain knowledge and work experience in:
    • E-Commerce • Adtech • Finance • Trading • Telecommunication • Software Design & Development • Data Platform Design, Development & Maintenance • Data Engineering • DevOps • Data Science • AI • Consulting
    A fast learner and an active contributor to various open source projects. An ardent mathematician, proficient in problem analysis & solving.
    I build, deploy, and maintain ETL/ELT data ingestion pipelines and CI/CD pipelines for code distribution; create, deploy & maintain webhooks and APIs developed as serverless microservices; and design, build, and maintain databases, data warehouses, and modern data platforms (data lakes, big data systems, etc.). Task automation in Python; web automation in Python using Selenium.
    Skilled in C, C++ (with STL), Java & Java EE, Go, Python, Haskell, Scala, SQL, PHP, Perl, JavaScript, TypeScript; various Python, JavaScript & TypeScript frameworks, libraries & APIs such as Django, Flask, Angular, ReactJS, React Native, NodeJS, ExpressJS; Apache Hadoop, Apache Kafka, MapReduce, Apache Airflow, Apache Cassandra, PyTorch, Elasticsearch, Docker, Caffe.
    Database systems: Oracle, MongoDB, PostgreSQL, & other SQL & NoSQL DBs. Data structures & algorithms for computation in parallel and distributed environments & cloud-based platforms.
    Robotic Process Automation tools: Automation Anywhere, UiPath.
    Business Intelligence tools: Sigma, Looker, Tableau, PowerBI, Data Studio, Excel, Spreadsheet, etc.
    An aspirant in Data Science, Expert Systems, Machine Learning, Deep Learning, Artificial & Swarm Intelligence. Proficient in data science skills:
    • Data Engineering • Importing & Cleaning Data • Data Manipulation • Data Visualization • Probability & Statistics • Machine Learning • Applied Finance • Reporting • Case Studies • Management • Theory
    According to my AMCAT score, I am employable for the following profiles: Software Engineer, IT Services, Associate ITES/BPO.
    Apache Spark
    Cloud Computing
    Big Data
    Database Management System
    Data Interpretation
    Data Analysis
    Amazon Web Services
    Bash Programming
    Database Programming
    API
    Bash
    BigQuery
    ETL Pipeline
    Docker
    Apache Hadoop
  • $50 hourly
    I am a passionate coder and data enthusiast who loves solving complex problems using data and models. I currently work with the tools and frameworks required for building efficient and scalable data pipelines on AWS- and GCP-based cloud platforms.
    My skills: Computer Vision, Google Cloud, infrastructure setup, Big Data, Machine Learning, MapReduce, SQL, search technologies.
    Tools and languages: PyTorch, TensorFlow, OpenCV, Apache Hadoop, Apache Kafka, Apache Spark, Apache Hive, Apache Impala, Apache Jena, AWS Cognito, AWS IoT Core, AWS Lambda, Django, Flask, Graphene, GraphQL, AWS DynamoDB, AWS S3, RDF triple stores, time series databases like Axibase, Apache Solr, Apache Lucene, MarkLogic, Metafacts, Jenkins, Telegraf, Grafana, Kubernetes, Docker, AWS ECS, AWS EKS, GCP Kubernetes, Databricks solutions, Python, SQL.
    Apache Spark
    Kubernetes
    Apache Kafka
    Architectural Design
    TensorFlow
    AWS Lambda
    PyTorch
    Internet of Things Solutions Design
    Apache Hive
    Apache Hadoop
    Internet of Things
    Google Cloud Platform
    Cloud Computing
    Amazon Web Services
  • $40 hourly
    8 years of global experience in Big Data analytics, data warehouse, and data science projects. Engineered end-to-end, data-driven solutions to business challenges to help clients achieve their strategic objectives.
    Apache Spark
    Analytics
    Exploratory Data Analysis
    CI/CD
    Big Data
    BigQuery
    Data Analytics
    Cloud Computing
    Analytical Presentation
    Apache Airflow
  • $35 hourly
    * 9.6 years of experience in Apache Spark, Python, the MS Azure cloud platform, Azure Data Factory, Azure Data Lake Storage, Azure Databricks, the Hadoop ecosystem (MapReduce, Hive, Sqoop, HDFS), Oracle, Git, JIRA, and Agile methodology.
    * Involved in understanding business requirements and providing solutions and designs: understanding the data, its parameters, the associated schema, and the behaviour of the data to better perform operations and transformations on it.
    * Contributed and proposed the best possible technical specifications to resolve limitations and defects in the existing application.
    * Involved in code development and testing.
    Apache Spark
    Microsoft Azure
    AWS Glue
    Data Lake
    ETL Pipeline
    PaaS
    Agile Project Management
    Agile Software Development
    Git
    Cloud Computing
    Databricks Platform
    Apache Hadoop
    Apache Hive
  • $30 hourly
    A seasoned Data Engineer and Software Developer with 3+ years of industry experience in building scalable cloud applications.
    Apache Spark
    GraphQL
    React Native
    Amazon EC2
    Search Tool
    Search Engine
    Amazon S3
    Amazon Web Services
    Google Cloud Platform
    Java
    Python
  • $35 hourly
    Experienced Enterprise Architect with a 20+ year track record in Java-based technologies, domain-driven design, microservices, and cloud solutions. I have worked in top enterprises delivering end-to-end solutions. My areas of expertise are:
    Microservices: Java, Spring, Spring Boot, Kubernetes, Docker, AWS (S3, Redshift, Aurora, EKS, ECS, Kafka, Kinesis, SNS, DynamoDB, Serverless), OpenShift, Istio Service Mesh, continuous integration using SVN, Git, Maven, and Jenkins, Drools, Hibernate, SQL and NoSQL databases
    Integration: Kafka, RESTful services, SOAP, SOA, ActiveMQ, Camel, JMS, Solace MQ
    Data Architecture: OLTP & OLAP data modeling; data formats: Avro, Parquet
    Big data analytics: Lambda architecture, Kappa architecture, Flink, Spark, HDFS, Hadoop, HBase, Hazelcast, Hive, Sqoop, Hue
    Analytics tools: NumPy, Pandas, scikit-learn, and Matplotlib; time series forecasting using SMA, EMA, and ARIMA models
    Leadership: Enterprise Architecture, Solution Design, Program Management, Technology Evaluation, Budgets, Stakeholder Relations, Vendor Relations, Change Management, Promoting Diversity, Agile, Waterfall & Hybrid
    Apache Spark
    Low Code & RAD Software
    Project Management
    Hibernate
    Oracle Database
    Cloud Architecture
    Apache Camel
    MySQL
    MongoDB
    Spring Boot
    Solution Architecture
    Cloud Computing
    Microservice
    Apache Kafka
    Java
  • $60 hourly
    I’m a developer experienced in building websites for small and medium-sized businesses. Whether you’re trying to win work, list your services, or create a new online store, I can help.
    Knows HTML and CSS3, PHP, jQuery, Python, PySpark, and Django
    Full project management from start to finish
    Regular communication is important to me, so let’s keep in touch.
    Apache Spark
    CSS 3
    Django
    SQL
    PySpark
    JavaScript
    CSS
    HTML
    Python Script
    Python
  • $18 hourly
    I am a full-stack developer/lead and solution architect with 10+ years of experience and the expertise to complete many kinds of projects. I have very good experience working on full applications that require scalable architecture, having worked on all stages of development, from design and development to deployment.
    My passion for programming and coding led me to Upwork, a platform where I can put my knowledge, experience, passion, and geekiness together and define and set my own limits.
    My expertise:
    ✔️ Front-end development: JavaScript / React / React Native / Redux / Angular / Ionic / Vue
    ✔️ Back-end development: Python / Node / Express / Java Spring Boot / REST API / Golang / Laravel / Nest.js / Next.js
    ✔️ Databases: PostgreSQL / MySQL / MongoDB / DynamoDB
    ✔️ Data engineering: Data Pipelines / ETL / Hive / Spark / Kafka / Drill
    ✔️ AWS cloud services: Amplify / Lambda / EC2 / CloudFront / S3 Bucket / Microservices
    ✔️ Responsibilities and contributions:
    • Involved in various stages of the software development life cycle, including development, testing, and implementation.
    • Analyzing and validating functional requirements.
    • Suggesting better approaches, preparing detailed documents, and periodically estimating the time required for delivery.
    • Configuration and customization of the application as per the given business requirements.
    • Used the sandbox for testing and migrated the code to the deployment instance thereafter.
    • Analysis of requirements and involvement in the development of modules.
    • Discussing requirements, the feasibility of changes, and the impact on current functionality onsite.
    I have excellent time management skills to define priorities and implement activities tailored to meet deadlines. My aptitude and creative problem-solving skills help me apply innovative solutions to complex issues. I am always eager to offer added value to customers by providing suggestions about the project.
    Apache Spark
    React
    React Native
    Angular 10
    Apache Kafka
    AWS Lambda
    Golang
    Apache Hive
    Spring Boot
    NodeJS Framework
    Vue.js
    Amazon EC2
    Python
    Java
  • $25 hourly
    10+ years of experience in full-stack and full-service data engineering/data visualization/data science/AI-ML with enterprise clients such as Walmart, Procter & Gamble, Amazon, Johnson & Johnson, etc., as well as SMEs like TheCreditPros, StructuredWeb, NorthCoastMedical, PoshPeanuts, EffectiveSpend, etc.
    Domain experience:
    • Retail and E-commerce • Banking and Financial Services • Telecom • Sports & Gaming • Operations / ERP / CRM Analytics
    Tools expertise:
    Data Engineering/ETL/Data Pipelines/Data Warehousing:
    • Talend Studio, Stitchdata, Denodo, Fivetran, CloverDX
    • AWS (Glue-RDS-Redshift-Step Functions-Lambda)
    • Azure (ADF-Data Lake-ADLS Gen2)
    • GCP (Cloud Composer-Cloud Functions-BigQuery)
    • SQL, MongoDB, DBT
    • ETL through REST and SOAP APIs for Salesforce, Netsuite, Fulfil, Pardot, Facebook, Linkedin, Twitter, Instagram, Google Adwords, Yahoo Gemini, Bing Ads, Google Analytics, Zendesk, Mailchimp, Zoho, Five9, etc.
    • Data streaming (Apache Spark, Flink, Flume, Kafka)
    • API/webhook design through Python FastAPI + Uvicorn
    • Twilio, Asterisk
    Data Visualization/Business Intelligence:
    • Power BI (Pro-Premium-Embedded-Report Server, DAX/Power Query M)
    • Tableau (Prep-Cloud-Server-AI-Pulse, Functions/LOD Expressions)
    • Looker (Studio, Pro, LookML)
    • Qlik Sense
    • DOMO
    Voicebot-Chatbots:
    • NLP - LLM (Natural Language Processing - Large Language Models)
    • ChatGPT • Deepgram • Langchain • Llama2 • Falcon • deBerta • T5 • Bert
    Speech engineering tools/techniques:
    • Kaldi / Speechbrain / Whisper / Nvidia Riva / EspNet / Bark
    • AWS-GCP-Azure ASR & TTS
    • Amazon AWS Polly, Transcribe, Translate
    • Automated Speech Recognition • Speaker Diarization • Wake Word Detection • Speech Biometrics • Intent Recognition • Speaker Separation
    Apache Spark
    Talend Open Studio
    Snowflake
    Amazon Redshift
    BigQuery
    Qlik Sense
    dbt
    AWS Glue
    Microsoft Azure
    QlikView
    Automatic Speech Recognition
    SQL
    Tableau
    Microsoft Power BI
    Databricks Platform
  • $40 hourly
    I am a Senior Data Engineer with 8+ years of extensive experience in data engineering with Python, Spark, Databricks, ETL pipelines, and Azure and AWS services. I develop PySpark scripts and store data in ADLS using Azure Databricks. Additionally, I have created data pipelines that read streaming data from MongoDB and developed Neo4j graphs based on that stream data, and I am well versed in designing and modeling databases using Neo4j and MongoDB. I am seeking a challenging opportunity in a dynamic organization that can enhance my personal and professional growth while enabling me to make valuable contributions toward achieving the company's objectives.
    • Utilizing Azure Databricks to develop PySpark scripts and store data in ADLS.
    • Developing producers and consumers for stream-based data using Azure Event Hubs.
    • Designing and modeling databases using Neo4j and MongoDB.
    • Creating data pipelines for reading streaming data from MongoDB.
    • Creating Neo4j graphs based on stream-based data.
    • Visualizing data for supply-demand analysis using Power BI.
    • Developing data pipelines on Azure to integrate Spark notebooks.
    • Developing ADF pipelines for a multi-environment, multi-tenant application.
    • Utilizing ADLS and Blob Storage to store and retrieve data.
    • Proficient in Spark, HDFS, Hive, Python, PySpark, Kafka, SQL, Databricks, and Azure and AWS technologies.
    • Utilizing AWS EMR clusters to run Hadoop ecosystem components such as HDFS, Spark, and Hive.
    • Experienced in using AWS DynamoDB for data storage and caching data on ElastiCache.
    • Involved in data migration projects that move data from SQL and Oracle databases to AWS S3 or Azure storage.
    • Skilled in designing and deploying dynamically scalable, fault-tolerant, and highly available applications on the AWS cloud.
    • Executed transformations using Spark and MapReduce, loaded data into HDFS, and used Sqoop to extract data from SQL into HDFS.
    • Proficient in working with Azure Data Factory, Azure Data Lake, Azure Databricks, Python, Spark, and PySpark.
    • Implemented a cognitive model for telecom data using NLP and a Kafka cluster.
    • Competent in big data processing using Hadoop, MapReduce, and HDFS.
    Apache Spark
    Microsoft Azure SQL Database
    SQL
    MongoDB
    Data Engineering
    Microsoft Azure
    Apache Kafka
    Apache Hadoop
    AWS Glue
    PySpark
    Databricks Platform
    Hive Technology
    Azure Cosmos DB
    Apache Hive
    Python
  • $10 hourly
    Currently working as a Data Engineer. Experienced in Hadoop, Spark, Hive, Kafka, Python, SQL, and AWS services such as EMR, EC2, S3, Redshift, and Lambda.
    Apache Spark
    Hive
    AWS Lambda
    Linux
    PostgreSQL
    PySpark
    Apache Kafka
    Apache Hadoop
    Python
    Apache Hive
    AWS Glue
  • $5 hourly
    CAREER SUMMARY: A passionate and organized individual seeking a Data Engineer position in the field of Big Data. Skilled in Python programming, SQL, ML, and Hadoop core and ecosystem. Strong ability to handle complex problems. Innovative, creative, and willing to contribute ideas and learn new things.
    Skills: Python, Machine Learning, Deep Learning, DBMS, SQL, Power BI, Big Data technologies (Hadoop MapReduce, Spark, Hive)
    RELEVANT PROJECTS:
    Dog Breed Classification Using Deep Learning (Python, Deep Learning | Duration: 1 month)
    The purpose of this project was to build a model using deep learning algorithms and image processing to predict the breed of a dog from its image. We used a Convolutional Neural Network to build the model: the network identifies the patterns in the image matrix so that it can later recognize that image on its own. We used a pretrained model with transfer learning to increase identification accuracy.
    Movie Recommender System (Python, Machine Learning | Duration: 1 month)
    The simple recommender offers generalized recommendations to every user based on movie popularity and (sometimes) genre. The basic idea behind this recommender is that movies that are more popular and more critically acclaimed have a higher probability of being liked by the average audience.
    Apache Spark
    Wing Python IDE
    SQL Programming
    Hive
    Analytics
    Data Science
    Big Data
  • $22 hourly
    I have 9+ years of total experience in building Big Data decision systems and have developed small, medium, and large Big Data processing projects. Expertise in distributed processing using Apache Spark. Migrated existing big data pipelines to AWS and GCP.
    Apache Spark
    Apache Airflow
    ETL Pipeline
    Data Processing
    Algorithms
    Google Cloud Platform
    Java
    Amazon Web Services
    Big Data
  • $15 hourly
    I am a dedicated data engineer with 2+ years of experience, committed to building cutting-edge, dependable data solutions tailored to clients’ distinct business requirements. I continually enhance my knowledge to stay up to date with the evolving data engineering landscape.
    Apache Spark
    Apache HTTP Server
    PySpark
    Amazon
    Python Script
    Amazon EC2
    Amazon Web Services
    Python
    Apache Airflow
  • $15 hourly
    Senior Software Engineer (7+ years) with a demonstrated history of working in the information technology and services industry. Skilled in Big Data, GCP, BigQuery, Cloud Storage, Pub/Sub, Dataproc, Dataflow, Apache Beam, Cloud SQL, Bigtable, Python, Spark with Java and Scala, PySpark, Hive, HDFS, Kafka, REST APIs, Oracle Database, Extract, Transform, Load (ETL), and data warehousing.
    Apache Spark
    Google Cloud Platform
    Data Warehousing & ETL Software
    ETL
    Apache Airflow
    Apache Kafka
    PySpark
    Java
    Apache Beam
    BigQuery
    Google Dataflow
    Big Data
    Python
  • $20 hourly
    I am a highly skilled, results-driven Data Engineer/Architect with 7+ years of experience in designing and implementing robust data solutions, adept at integrating complex data systems, optimizing data pipelines, and ensuring data quality and integrity. Overall, I have strong technical knowledge across multiple projects with technologies like Spark, Hadoop, Hive, Sqoop, Oozie, Python, Scala, SQL, Snowflake, AWS services (S3, Glue, Lambda, Step Functions, EventBridge, SNS, Redshift), and Microsoft Azure services (Blob, ADLS, Databricks, Data Factory, SQL Server). I am seeking challenging opportunities to leverage my expertise in data engineering, architecture, and analytics to drive business growth and enable data-driven decision-making.
    Apache Spark
    Microsoft Azure SQL Database
    Data Lake
    Apache Hive
    PostgreSQL
    AWS Lambda
    Amazon CloudWatch
    Amazon S3
    Snowflake
    Amazon Redshift
    Databricks Platform
    AWS Glue
    Microsoft Azure
    Amazon Web Services
    PySpark
  • $20 hourly
    Overall 3+ years of IT expertise as an AWS Data Engineer, with hands-on experience in Amazon Web Services components and Big Data technologies, and experience developing ETL pipelines for big data and AWS environments. Good working knowledge of Python and SQL principles, and a solid understanding of big data tools such as Hadoop, HDFS, and Hive. Expertise with AWS services such as S3, Glue, Athena, IAM, RDS, Redshift, EMR, Lambda, Step Functions, and Kinesis.
    Apache Spark
    Apache Kafka
    PySpark
    Amazon Web Services
    Amazon EC2
    AWS CodePipeline
    AWS Lambda
    Amazon Redshift
    AWS Glue
    SQL
    Python
  • $25 hourly
    Summary: Experienced Data Engineer with a demonstrated history of working in the information technology and services industry. Certified Developer in Big Data technologies such as Spark (from Databricks) and certified Machine Learning Engineer (from Udacity). Also has hands-on experience with the Azure, AWS, and GCP clouds. Excellent technical communication, team building, and public speaking. A team player with good communication skills and high, optimized quality of technical work; self-driven, dedicated, and highly self-motivated. Strong coding skills, creative by nature, with the technical acumen to work independently for mutual growth.
    • Scala, Java, Python, SQL & MySQL
    • Windows, Linux & Mac; GitLab, VSTS, Bitbucket, GitHub
    • SQL Server 2016, Databricks Delta tables, Hive, Hudi, Redshift, RDS, NoSQL
    • IntelliJ, Eclipse, Spyder, Jupyter, VS Code, and Databricks Notebook
    Apache Spark
    Apache Hadoop
    Apache Kafka
    Python Script
    Amazon Web Services
    Google Cloud Platform
    SQL
    Hive
    Microsoft Azure
    Scala
    Apache Airflow
    Python

How hiring on Upwork works

1. Post a job (it’s free)

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.

How do I hire an Apache Spark Engineer near Pune on Upwork?

You can hire an Apache Spark Engineer near Pune on Upwork in four simple steps:

  • Create a job post tailored to your Apache Spark Engineer project scope. We’ll walk you through the process step by step.
  • Browse top Apache Spark Engineer talent on Upwork and invite them to your project.
  • Once the proposals start flowing in, create a shortlist of top Apache Spark Engineer profiles and interview them.
  • Hire the right Apache Spark Engineer for your project from Upwork, the world’s largest work marketplace.

At Upwork, we believe talent staffing should be easy.

How much does it cost to hire an Apache Spark Engineer?

Rates charged by Apache Spark Engineers on Upwork can vary with a number of factors, including experience, location, and market conditions. See hourly rates for in-demand skills on Upwork.

Why hire an Apache Spark Engineer near Pune on Upwork?

As the world’s work marketplace, we connect highly skilled freelance Apache Spark Engineers with businesses and help them build trusted, long-term relationships so they can achieve more together. Let us help you build the dream Apache Spark Engineer team you need to succeed.

Can I hire an Apache Spark Engineer near Pune within 24 hours on Upwork?

Depending on availability and the quality of your job post, it’s entirely possible to sign up for Upwork and receive Apache Spark Engineer proposals within 24 hours of posting a job description.