Hire the best Apache Spark Engineers in New Delhi, IN

Check out Apache Spark Engineers in New Delhi, IN with the skills you need for your next job.
Clients rate Apache Spark Engineers 4.7/5, based on 283 client reviews.
  • $35 hourly
    Shaikh is an experienced Certified Cloud Data Engineer with over three years of expertise in designing end-to-end ETL pipelines. He is passionate about unlocking the value of data and believes in its power to drive business growth. His skills are rooted in his experience working with Google Cloud Platform (GCP). Shaikh can help you leverage GCP services such as BigQuery, Bigtable, Data Studio (now Looker Studio), Cloud Functions, Cloud Storage, Cloud Scheduler, Scheduled Queries, Cloud SQL, Dataflow, Datafusion, and more. His expertise can empower your organization to efficiently manage and analyze large datasets, improve data-driven decision-making, and derive valuable insights.
    Apache Spark
    ETL Pipeline
    Apache Beam
    Google Analytics
    Microsoft PowerPoint
    BigQuery
    Databricks Platform
    Google Cloud Platform
    Apache NiFi
    Snowflake
    SQL
    Google Sheets
    Python
  • $40 hourly
    Hi, I'm Pradeep, a seasoned ML Software Engineer with a rich history spanning over 8 years. Throughout my career, I've honed my skills in crafting cutting-edge solutions tailored to the needs of diverse businesses, ranging from startups to enterprise-level corporations. I approach every project with a sense of ownership, striving to not only meet but exceed my clients' expectations. My expertise encompasses a wide array of disciplines, including Machine Learning, Data Engineering, Backend Development, DevOps, and Cloud technologies. With a keen eye for detail and a passion for innovation, I employ proven methodologies to deliver tangible results that drive business growth and success.
    Apache Spark
    Apache Kafka
    Docker
    Kubernetes
    BigQuery
    DevOps
    Django
    Machine Learning
    Computer Vision
    Python
    PyTorch
  • $50 hourly
    Hello, I'm Raj, a data/ML professional with over 7 years of experience building large-scale recommender systems and data science solutions in the ad-tech space.
    - Data-driven statistician with a passion for leveraging insights to drive well-informed business decisions.
    - 2+ years of experience driving impactful data science solutions in the microblogging domain, contributing to the advancement of ML capabilities at Koo.
    - 5+ years of experience delivering innovative data science solutions in ad-tech and e-commerce, spanning the programmatic stack (SSP, DSP, DMP, RTB).
    - Used the AWS tech stack to quickly analyze billions of log events and extract actionable insights from 2 TB/day of RTB logs.
    - Enthusiastic learner, actively participating in MOOCs and translating knowledge into projects like Gitdiscoverer.com.
    In my industry experience, I've worked with:
    - Languages: Python, R, PySpark
    - Dashboards & visualizations: RShiny, Apache Superset, Kibana, Redash, Metabase
    - Cloud: AWS, GCP
    - Databases/query engines: MySQL, Citus (PostgreSQL), Postgres, Hive, Elasticsearch, Amazon Athena, MongoDB, Snowflake
    - Transformations: AWS Glue, dbt
    - Cloud data warehouses: Amazon Redshift, Snowflake
    - Big Data frameworks: Apache Spark, Databricks
    Notable achievements at Koo:
    - Led the development of the 'Recommended For You' feature (comparable to LinkedIn's 'People You May Know') for Koo (India's Twitter), built from scratch with a remarkable team effort.
    - Implemented a weighted PageRank algorithm at Koo to enhance content personalization and user recommendation systems.
    - Designed and developed insightful dashboards for ML initiatives, providing key stakeholders with valuable visualizations and actionable insights.
    - Designed and developed a suite of ETL processes that powered data transformation for downstream machine learning products.
    - Led the development of the 'For You' tab, applying advanced techniques such as Locality-Sensitive Hashing (LSH) for vector search within PySpark to deliver scalable, personalized content recommendations.
    - Led the migration of the entire recommender system from AWS to GCP and transitioned Koo's search functionality from managed to self-hosted OpenSearch, enhancing performance, scalability, and control while significantly reducing operational costs.
    At Class One Exchange (C1X):
    - Set up self-serve analytics: enabled data access across the entire organization with Apache Superset and Metabase, letting teams pull rich reports in a self-service manner using industry-leading tools. Easy access to rich reports is a game-changer in any organization.
    - Developed a URL classification model to identify and enrich ad impressions, providing more context for campaign selection.
    - Reduced AWS costs by developing a routing model to match ad impressions with bidders.
    - Predicted anomalous behavior in revenue and sent alerts with the affected metrics, so stakeholders knew immediately when revenue dropped or overspend occurred.
    - Identified top monetization-friendly users based on various buying intents, revealing which publishers had premium audiences so we could focus our efforts on growing those accounts.
    Apache Spark
    LLM Prompt Engineering
    Data Management
    ETL
    PostgreSQL
    Analytics
    R Shiny
    Machine Learning
    Data Science
    Data Analysis
    Apache Hive
    Statistics
    R
    Python
    SQL
  • $30 hourly
    With over 4 years of IT experience, I bring a robust skill set to the table, showcasing proficiency in Python, PySpark, Spark SQL, and a solid understanding of DevOps practices. I hold multiple certifications, including Databricks Data Engineer Professional, Microsoft DP 203, DP 900, AZ 900, and AI 900, validating my expertise as a data engineer with over 3 years of hands-on experience on the Microsoft Azure cloud platform. Furthermore, my certification as a SAFe 5 Agile Practitioner underscores my commitment to agile methodologies and practices.
    Apache Spark
    Code Review
    Jira
    Data Cleaning
    PostgreSQL
    Microsoft Azure
    ETL
    Azure DevOps
    Databricks Platform
    PySpark
    Python
    ETL Pipeline
  • $70 hourly
    A highly motivated graduate professional specializing in Data Science. I believe my technical knowledge, along with my managerial skills, will help me excel in my work, providing a path for future growth. My education and experience so far have strengthened and enriched my business acumen, logical thinking, analytical reasoning, and decision making, enabling me to create a niche for myself. My objective is to broaden my knowledge of information technology and business solutions and apply it in my work. I'm pursuing my Master's in Information Technology & Analytics and actively looking for full-time job opportunities.
    Areas of expertise:
    - Advanced Microsoft Excel skills, including pivot tables, Power Pivot, and advanced formulas and functions
    - Business intelligence: data analytics, reporting, dashboards, trending
    - Languages: Python (Pandas, NumPy, Selenium, Transformer, FastAPI), MySQL, HTML, CSS
    Apache Spark
    Natural Language Processing
    Web Development
    Artificial Intelligence
    LangChain
    Elasticsearch
    Apache Hadoop
    Web Scraping
    Django
    Machine Learning
    API Development
    Java
    Python
    NoSQL Database
    SQL
  • $25 hourly
    I am an accomplished Software Engineer with expertise in designing, building, and maintaining high-performance real-time backend systems capable of handling 35,000 queries per second. I have a strong background in Big Data, processing billions of candidate-recommendation emails daily. Through the implementation of new features and improved email relevance via A/B experiments, I helped increase annual revenue by $50 million. My leadership skills shine through my ability to oversee projects, manage teams, and deliver products incrementally while facilitating seamless collaboration across diverse teams. I am dedicated to providing exceptional client support, promptly addressing inquiries and fulfilling feature requests. I excel in clear communication, regularly updating stakeholders, leaders, and managers on project progress, experiment outcomes, and performance metrics. Furthermore, I take pride in mentoring junior team members, guiding them towards adopting superior software practices and nurturing their growth to advanced proficiency.
    Apache Spark
    Amazon
    Apache Kafka
    API
    Amazon DynamoDB
    PostgreSQL
    MySQL
    Amazon Web Services
    Django
    Python
    Redis
    Java
    Scala
  • $23 hourly
    Senior Project Engineer with 8+ years of experience carrying the dual responsibility of developing, supporting, and testing business-solution software and analyzing business operations, along with a techno-functional role involving system design and implementation. Highly motivated, positive individual with strong organizational and communication skills; a customer-service master and efficient problem solver. I deftly manage the administrative functions of the practice, provide thorough answers and solutions, and deliver an exceptional customer experience. I aim to use my strong prioritization skills, analytical ability, and domain expertise to achieve the goals of the organization I work with, and I am adept at interdepartmental coordination to maximize business functionality and efficiency.
    Apache Spark
    Unix
    Informatica
    Microsoft Excel
    Data Warehousing
    Data Lake
    ETL
    Talend Open Studio
    Hive
    Apache Hadoop
    SQL
  • $25 hourly
    Experienced Data Engineer and AWS Certified Solutions Architect with 6+ years of experience serving 5+ Fortune 500 clients. I am proficient in:
    1. SQL
    2. Data Analytics
    3. Python
    4. PySpark
    5. AWS
    6. Cloud Ops
    7. Scala
    8. Apache Airflow
    Apache Spark
    Apache Hadoop
    Machine Learning
    Amazon Athena
    AWS Glue
    PySpark
    Data Analysis
    ETL Pipeline
    Data Visualization
    Python
    Tableau
    SQL
    Microsoft Power BI
  • $30 hourly
    I'm a Data, ML, and DevOps Engineer. My work revolves around solution architecture, development, building data & ML platforms, data engineering, model serving, and providing data architecture & governance solutions. I work extensively on Cloud, DevOps, DE, DataOps & MLOps using K8s, Docker, Kafka, Hadoop, Spark, Airflow, Terraform, AWS, Azure, and GCP.
    Skills and competencies:
    - Apache Hadoop: Cloudera CDH, Hadoop, Pig Latin, Impala, HiveQL, Hive, Beeline, Sqoop, Flume, HUE
    - Apache Spark: Spark SQL, Spark Streaming, DataFrames, RDD, PySpark, Spark on YARN, Mesos, and Kubernetes
    - Data formats: Apache Parquet, Delta Lake, Apache Hudi, Apache Avro, RCFile, JSON
    - Streaming: Apache Kafka, Apache Flink
    - Data governance: Apache Ranger, Apache Atlas
    - AWS stack: EKS, ECR, EMR, EC2, Glue, Lambda, S3, DynamoDB, Athena, CloudWatch, SageMaker, SES, SNS, SQS, IAM, Cloud9, VPC, Route 53, load balancers, etc.
    - Azure stack: AKS, ACR, HDInsight, Azure Databricks, VM, ADF, Blob, ADLS Gen2, Active Directory, Key Vault
    - GCP stack: Compute Engine, Cloud Storage, Bigtable, Cloud Datastore, BigQuery, Cloud Dataproc, Cloud Dataflow
    - Machine learning orchestration: Kubeflow, MLflow
    - Programming & scripting languages: Python, Scala, Unix, shell scripting, Base SAS, SAS PROC SQL, SAS macros
    - Logging and monitoring: Elasticsearch, Grafana, and Prometheus
    - Development notebooks and IDEs: Zeppelin, Jupyter, Databricks, Visual Studio Code, PyCharm & Spyder
    - Job workflow schedulers: Control-M, Apache Oozie, Apache Airflow, Papermill
    - Relational, NoSQL, and graph databases: IBM Db2, Informix, MySQL, PostgreSQL, Teradata, HBase, Presto, ArangoDB
    - CI/CD & DevOps: Jira, GitLab, Confluence, Wiki, Jenkins, GitLab CI, Spinnaker, Argo CD
    - Infrastructure as code: Terraform, CloudFormation, Azure ARM
    - Container orchestration: Docker, Kubernetes, Helm
    Apache Spark
    Microsoft Azure
    Atlas
    Terraform
    Amazon Web Services
    Kubernetes
    MLflow
    Apache Hive
    Docker
    Apache Kafka
  • $3 hourly
    Career objective: Seeking an opportunity to apply my knowledge and experience towards a challenging career in a growth-oriented, leading organization that would provide opportunities to enhance my professional, technical, and personal skills.
    Summary:
    * Overall 10+ years of experience with the Hadoop ecosystem and related tools/technologies, including Apache Spark with Scala, PySpark, Python, Hive, Kafka, Hadoop, Cassandra, Sqoop, Airflow, and SQL Server 2008 R2.
    * Good exposure to the Agile software development process.
    * Proven skills in managing teams to work in sync with corporate parameters and motivating them to achieve business and individual goals.
    * Expert technical knowledge of Big Data technologies, mainly Spark with Python & Scala, Cassandra, Kafka, HDFS, Hive, and Sqoop.
    * Handling large datasets using effective partitioning, Spark in-memory capabilities, pair
    Apache Spark
    Apache Airflow
    Hive
    Scala
    Python
    Apache Spark MLlib
  • $20 hourly
    Data Engineer with 2+ years of experience building and maintaining high-performance data pipelines using Big Data technologies (Hadoop, Spark, AWS). Successfully implemented two data lakes, resulting in a 30% improvement in data-analysis time and a 15% increase in data accessibility for key stakeholders. I seek to leverage expertise in data modeling, ETL processes, and data governance to contribute to innovative projects that drive business growth.
    Apache Spark
    Apache Hadoop
    Python
    Amazon RDS
    Apache Airflow
    AWS Glue
    Amazon Redshift
    Amazon EC2
    Amazon Athena
    AWS Lambda
    Amazon S3
    SQL
    PySpark
    Apache Hive
    Sqoop
  • $5 hourly
    Data Engineer with 3+ years of experience with Big Data technologies. Skilled in HDFS, Sqoop, Hive, Scala, Spark SQL, Apache Spark, and PySpark, with good knowledge of RDBMS, SQL, and CI/CD.
    * Acquired data and processed it with the Spark framework.
    * Wrote a custom Spark application to read data from files, apply transformations, and generate insights.
    * Filtered data based on provided conditions, pushed it to an RDBMS, and used it for report generation.
    Apache Spark
    Apache Kafka
    Python
    SQL
    Apache NiFi
    Hive
    PySpark
    Scala

How hiring on Upwork works

1. Post a job

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.


How do I hire an Apache Spark Engineer near New Delhi on Upwork?

You can hire an Apache Spark Engineer near New Delhi on Upwork in four simple steps:

  • Create a job post tailored to your Apache Spark Engineer project scope. We’ll walk you through the process step by step.
  • Browse top Apache Spark Engineer talent on Upwork and invite them to your project.
  • Once the proposals start flowing in, create a shortlist of top Apache Spark Engineer profiles and interview them.
  • Hire the right Apache Spark Engineer for your project from Upwork, the world’s largest work marketplace.

At Upwork, we believe talent staffing should be easy.

How much does it cost to hire an Apache Spark Engineer?

Rates charged by Apache Spark Engineers on Upwork can vary with a number of factors, including experience, location, and market conditions. See hourly rates for in-demand skills on Upwork.

Why hire an Apache Spark Engineer near New Delhi on Upwork?

As the world’s work marketplace, we connect highly skilled freelance Apache Spark Engineers with businesses and help them build trusted, long-term relationships so they can achieve more together. Let us help you build the dream Apache Spark Engineer team you need to succeed.

Can I hire an Apache Spark Engineer near New Delhi within 24 hours on Upwork?

Depending on availability and the quality of your job post, it’s entirely possible to sign up for Upwork and receive Apache Spark Engineer proposals within 24 hours of posting a job description.