Hire the best PySpark Developers in Gurgaon, IN

Check out PySpark Developers in Gurgaon, IN with the skills you need for your next job.
  • $49 hourly
    Come to me if others are not able to scrape it! Get data from ANY website despite HEAVY anti-bot protection. I am an expert in desktop and mobile web scraping and crawling, and I can extract data from Android and iOS mobile apps even when they are protected with SSL pinning. With 11+ years of proven experience, I can scrape complex websites and apps on short notice.
    1. Expertise in Python, PySpark, Databricks, Requests, REST APIs, BeautifulSoup, regular expressions, XPath, Pandas, NumPy, Matplotlib, Flask, Scrapy, TensorFlow, Spark, machine learning, NLP, scraping, and RPA tools, including UiPath.
    2. Web scraping (desktop & mobile) and mobile app scraping (Android, iOS, Windows).
    3. Expert in defeating anti-bot measures (Google reCAPTCHA, FunCaptcha, Distil Networks, Incapsula, Cloudflare, MyraCloud, PerimeterX, AJAX, JavaScript, etc.), no matter how complex they are.
    4. Scraped data delivered in any required format: TSV, Google Sheets, CSV, JSON, Excel, XML, MySQL, MongoDB, etc., hosted on Google Cloud/AWS/DigitalOcean, or simply as an executable for local runs.
    5. Super-fast delivery of data with no hiccups in the delivery date.
    6. Vast knowledge of various industries, including retail, travel, finance, real estate, e-commerce, insurance, advertising, and hedge funds.
    7. Market research and data scraping for 500 companies among the Forbes 2000.
    8. Beautiful visualizations and insights from the scraped data using Google Data Studio and Tableau.
    So whether you want information scraped from websites or apps (no matter how stubborn they are) or want insights based on the scraped data, I can do it with 100% precision.
    My web-scraping skills have been particularly valuable in helping clients collect and analyze data from a variety of sources. I have experience using tools such as Beautiful Soup, Selenium, and Scrapy to extract data from websites and APIs, and I am proficient in cleaning and transforming data using Python libraries such as Pandas and NumPy (a minimal scrape-and-clean sketch follows this profile's skill list).
    In addition to web scraping, I am skilled in PySpark and Databricks, powerful tools for large-scale data processing and analysis. I have used them to build efficient data pipelines that automate data processing tasks and enable real-time analysis, helping clients reduce processing time and reach insights faster.
    I am a highly skilled and experienced data analyst with expertise in web scraping, PySpark, Databricks, Tableau, REST APIs, and Streamlit. I have worked on a variety of projects, from data collection and analysis to API integration and visualization. In previous roles I developed efficient data pipelines, automated data processing tasks, built RESTful APIs, and integrated third-party APIs to deliver real-time data to applications. I have also developed numerous interactive dashboards and visualizations in Tableau and Streamlit to help clients gain valuable insight into their data. My clients have consistently praised my attention to detail, communication skills, and ability to deliver high-quality work within tight deadlines.
    I have been associated with various startups in various capacities and continue to be: a mentor for some and a doer for others. I am the founder of BOXnMOVE, a packers-and-movers relocation startup based out of Gurgaon, which launched MOVER Delivery, an on-demand bike and truck delivery app. More can be found at boxnmove.com and mover.delivery.
    Overall, I believe my skills in web scraping, PySpark, Databricks, Tableau, REST APIs, and Streamlit, combined with my experience in data analysis and project management, make me an ideal candidate for your data-related project. Thank you for considering my profile, and I look forward to the opportunity to work with you. Please feel free to contact me with any task related to web scraping, web crawling, data mining, automated data extraction, data visualization, or Python, SQL, and Streamlit solutions.
    Data Extraction
    Big Data
    PySpark
    Data Scraping
    API Integration
    Web Crawling
    Apache Airflow
    Data Visualization
    Streamlit
    Databricks Platform
    Scrapy
    Scala
    pandas
    Python
    Machine Learning
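    As a rough illustration of the scrape-and-clean workflow this profile describes (a sketch under assumptions, not this freelancer's actual code), here is a minimal Python example using requests, BeautifulSoup, and pandas. The URL, CSS selectors, and output file are hypothetical placeholders.
    # Minimal scrape-and-clean sketch: fetch a page, parse listing cards, tidy with pandas.
    # The target URL and the .item/.title/.price selectors are hypothetical.
    import requests
    import pandas as pd
    from bs4 import BeautifulSoup

    HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; demo-scraper/1.0)"}

    def fetch(url: str, retries: int = 3) -> str:
        """Fetch a URL, retrying a few times on transient failures."""
        for _ in range(retries):
            resp = requests.get(url, headers=HEADERS, timeout=10)
            if resp.ok:
                return resp.text
        resp.raise_for_status()  # surface the last HTTP error

    def parse_items(html: str) -> list[dict]:
        """Pull title/price pairs out of hypothetical .item cards."""
        soup = BeautifulSoup(html, "html.parser")
        return [
            {"title": card.select_one(".title").get_text(strip=True),
             "price": card.select_one(".price").get_text(strip=True)}
            for card in soup.select(".item")
        ]

    if __name__ == "__main__":
        html = fetch("https://example.com/listings")  # placeholder URL
        df = pd.DataFrame(parse_items(html))
        # Strip currency symbols etc. so the price column is numeric.
        df["price"] = pd.to_numeric(df["price"].str.replace(r"[^\d.]", "", regex=True))
        df.to_csv("listings.csv", index=False)
    Real anti-bot work typically layers proxies, browser automation, and CAPTCHA handling on top of a skeleton like this.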
  • $25 hourly
    Hello, I'm Aditya Johar, a seasoned Data Scientist and Full Stack Developer. With over 9 years of hands-on experience, I bring a wealth of expertise to the table. Here are the top 5 qualities that make me a reliable, highly experienced, and talented expert for your project collaborations:
    [1] My journey through the world of data science and full-stack development has exposed me to a plethora of tools and technologies. I am well-versed in Python, along with its data science libraries like Pandas, NumPy, and Scikit-Learn. For deep learning and AI, I work with frameworks such as TensorFlow and PyTorch. In the full-stack domain, I'm proficient in Node.js, Express, and MongoDB.
    [2] I thrive on tackling complex challenges. I have a track record of turning data into actionable insights using data visualization tools like Matplotlib and Seaborn, and of efficiently managing databases through SQL and NoSQL systems. When it comes to full-stack development, I excel in both front-end technologies (React, HTML, CSS) and back-end frameworks (Django, Flask).
    [3] My versatility allows me to cover the entire project pipeline. I can dive into data analysis, modeling, and end-to-end application development with proficiency in React and Redux. This comprehensive approach streamlines project execution.
    [4] Effective communication is at the core of my work. I excel at translating complex technical concepts into plain language, making them accessible to non-technical stakeholders.
    [5] I've accumulated a portfolio of successful projects showcasing my ability to deliver high-quality solutions on time and within budget. You'll find case studies, project highlights, and testimonials from satisfied clients.
    TOP USE CASES COVERED:
    ✅ NATURAL LANGUAGE PROCESSING (NLP): Sentiment Analysis, Text Summarization, Chatbots and Virtual Assistants, Language Translation
    ✅ COMPUTER VISION: Image and Video Classification, Object Detection, Facial Recognition, Medical Image Analysis
    ✅ RECOMMENDATION SYSTEMS: Product Recommendations (e.g., e-commerce), Content Recommendations (e.g., streaming services), Personalized Marketing
    ✅ PREDICTIVE ANALYTICS: Sales and Demand Forecasting, Customer Churn Prediction, Stock Price Prediction, Equipment Maintenance Prediction
    ✅ E-COMMERCE OPTIMIZATION: Dynamic Pricing, Inventory Management, Customer Lifetime Value Prediction
    ✅ TIME SERIES ANALYSIS: Financial Market Analysis, Energy Consumption Forecasting, Weather Forecasting
    ✅ SPEECH RECOGNITION: Virtual Call Center Agents, Voice Assistants (e.g., Siri, Alexa)
    ✅ AI IN FINANCE: Credit Scoring, Algorithmic Trading, Fraud Prevention
    ✅ AI IN HR: Candidate Screening, Employee Performance Analysis, Workforce Planning
    ✅ CONVERSATIONAL AI: Customer Support Chatbots, Virtual Shopping Assistants, Voice Interfaces
    ✅ AI IN EDUCATION: Personalized Learning Paths, Educational Chatbots, Plagiarism Detection
    ✅ AI IN MARKETING: Customer Segmentation, Content Personalization, A/B Testing
    ✅ SUPPLY CHAIN OPTIMIZATION: Demand Forecasting, Inventory Optimization, Route Planning
    And many more use cases that we can discuss when we connect. Ready to turn these possibilities into realities? I'm just a click away! Feel free to contact me, or if you're eager to get started, simply click the 'Invite to Job' or 'Hire Now' button in the top right corner of your screen. Let's kick off your project and make it a success!
    Django
    Apache Airflow
    Apache Hadoop
    Terraform
    PySpark
    Apache Kafka
    Flask
    BigQuery
    BERT
    Apache Spark
    Python Scikit-Learn
    pandas
    Python
    TensorFlow
    Data Science
  • $40 hourly
    Google Cloud Certified Professional Data Engineer with 5.7+ years of experience.
    - Proficient in the GCP stack: BigQuery, Composer, Cloud Storage, Dataflow, Pub/Sub
    - Skilled in migrating legacy systems to GCP technologies
    - Expertise in data ingestion, processing, and transformation at enterprise scale
    - Experienced with AWS EMR, Spark, and Snowflake
    - Proficient in Python and SQL, with strong problem-solving skills
    - Developed numerous ETL data pipelines using Apache Airflow, Python, and SQL (a minimal DAG sketch follows this profile's skill list)
    - Strategic mindset for delivering tangible business outcomes through data-driven insights
    - Passionate about leveraging data to drive innovation and business growth
    Databricks Platform
    Data Engineering
    Amazon Web Services
    Snowflake
    PySpark
    ETL Pipeline
    Google Cloud Platform
    Python Script
    Cloud Computing
    Apache Beam
    SQL
    BigQuery
    Big Data
    Python
    Apache Airflow
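    In the spirit of the Airflow ETL pipelines this profile mentions, here is a minimal DAG sketch (my illustration, not the freelancer's code); the dag_id and task logic are hypothetical, and the `schedule` argument assumes Airflow 2.4+.
    # Minimal daily ETL DAG sketch: extract rows, pass them via XCom, "load" them.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # Placeholder: pull rows from a source system.
        return [{"id": 1, "amount": 42.0}]

    def load(ti):
        # Placeholder: write the extracted rows to a warehouse table (e.g. BigQuery).
        rows = ti.xcom_pull(task_ids="extract")
        print(f"would load {len(rows)} rows")

    with DAG(
        dag_id="daily_sales_etl",        # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",               # Airflow 2.4+ spelling of schedule_interval
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        load_task = PythonOperator(task_id="load", python_callable=load)
        extract_task >> load_task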
  • $25 hourly
    A highly motivated person with strong technical, problem-solving, and excellent time-management skills, who is likely to create an impact on the organization he is part of and who loves to socialize and experience new things in life. My hunger for new challenges makes me unique.
    MySQL
    Scala
    Microsoft Azure
    Data Analytics
    Snowflake
    SQL Programming
    Data Engineering
    Data Warehousing & ETL Software
    ETL Pipeline
    PySpark
    Apache Spark
    SQL
    Databricks Platform
    Python
  • $25 hourly
    My name is Abhinav Gundapaneni and I work at Microsoft as a Software Engineer. I have been with Microsoft for the last 3 years, and over that time I have gained the valuable skills highlighted below:
    1. Designing and developing ETL/ELT solutions for complex datasets from various clients using data engineering tools.
    2. Building scalable and efficient data pipelines to handle large amounts of customer data.
    3. Solid understanding of Azure cloud infrastructure.
    4. 4+ years of working with web and software applications in Python.
    5. Developing PySpark notebooks and applications that handle large datasets and complex requirements.
    6. Hands-on experience managing hundreds of SQL databases in production.
    7. Designing and developing web applications at scale in Django.
    Apart from these skills, I'm a great team player and add value to the team's growth.
    ETL
    Web Application
    Microsoft Azure
    Django
    PySpark
    SQL
    Python
  • $20 hourly
    I am a bioinformatics engineer with a strong background in programming languages such as Python, R, and Bash. I specialize in developing scalable workflows in Nextflow and have extensive experience working with a variety of omics data types, including bulk RNA-seq, single-cell RNA-seq, whole-genome sequencing, whole-exome sequencing, and GWAS data. My expertise includes:
    - Developing pipelines for quality control, alignment, and analysis of large-scale omics data
    - Utilizing machine learning and statistical modeling to extract insights from complex datasets
    - Implementing reproducible research practices for efficient data management and sharing
    - Collaborating with biologists and clinicians to interpret and validate results
    I am passionate about using my skills to drive innovation and discovery in the field of bioinformatics, and I am dedicated to delivering high-quality work and ensuring client satisfaction. If you have any projects or opportunities that may benefit from my expertise, please don't hesitate to contact me. I would be happy to discuss your needs in detail and provide a tailored solution.
    Data Analytics
    Bioinformatics
    PySpark
    Data Mining
    Machine Learning Model
    Kubernetes
    AWS CodeDeploy
    Data Annotation
    AWS CodePipeline
    Docker
    Python
    SQL
  • $100 hourly
    Dedicated Product Manager aiming to contribute 7+ years of experience in the IT and data analytics industry to an internal product manager role, focusing on building products and delivering strategic objectives. Seeking to leverage analysis, problem-solving skills, and technical knowledge in ETL development, data warehousing, business intelligence, cloud technologies, and SQL to establish an effective interface between core business and technology teams. Committed to the mission of enabling effective collaboration across multifunctional Agile teams to drive fruitful results.
    Professional summary: 7+ years of experience in the IT industry in roles spanning data development, software engineering, technical program management, lead data engineering, and product management. Certified Scrum Master and accomplished project manager, proficient at
    Hobbies: book reading, tech blog writing.
    PySpark
    Python
    PowerPoint Presentation
    Microsoft Excel
    SQL Programming
    SQL
    Expert
    Data Analytics
  • $35 hourly
    I am a Data Engineer with expertise in managing end-to-end data flow, from ingestion through reporting.
    Artificial Intelligence
    SQL
    PySpark
    SQL Programming
    Apache NiFi
    Apache Spark MLlib
    AWS Glue
    Apache Spark
  • $25 hourly
    I am a Full Stack (Python with React / PHP with React) developer experienced on the LAMP/WAMP/Python/Django platforms, with a Master's in Software Engineering. My experience and skills are listed below.
    - 10+ years of experience in web application development using PHP, Python, MySQL, and JavaScript
    - 4+ years of relevant experience in Python with Django and API development
    - 3+ years of relevant experience in web scraping, data extraction, web crawling, data mining, and data engineering with Python, PySpark, and Scrapy (a minimal spider sketch follows this profile's skill list)
    - Expert knowledge of the CodeIgniter, Laravel, and Yii frameworks; experience with Symfony and CakePHP
    - In-depth knowledge of search technology (Elasticsearch and Solr)
    - Good experience with AWS services
    - In-depth knowledge of REST and SOAP with third-party APIs such as Amazon MWS and the eBay API, and public APIs such as the Facebook and Google APIs
    - In-depth knowledge of systems architecture and software design methodologies
    - Good experience in GUI development using jQuery, ReactJS, and Twitter Bootstrap
    - Extensive experience with relational database management systems, mainly MySQL and PostgreSQL
    - Experience with the NoSQL database MongoDB
    - Experience in Linux systems administration (LAMP, Nginx)
    - I practice Agile methodologies in the development process
    - I use Git for version control, plus CI/CD pipelines with Jenkins and CodeDeploy
    I am a self-driven, very fast learner.
    FastAPI
    Elasticsearch
    CodeIgniter
    Django
    Apache Hadoop
    PHP
    PySpark
    MySQL
    Apache Solr
    Apache Spark
    Laravel
    API
    Apache Airflow
    Python
    Data Scraping
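    To make the web-crawling item above concrete, here is a minimal Scrapy spider sketch (an illustrative assumption, not this freelancer's code); the domain, selectors, and pagination link are placeholders.
    # Minimal Scrapy spider: scrape product cards and follow pagination.
    # Run with: scrapy runspider product_spider.py -o products.json
    import scrapy

    class ProductSpider(scrapy.Spider):
        name = "products"
        start_urls = ["https://example.com/products"]  # placeholder URL

        def parse(self, response):
            for card in response.css(".product"):      # hypothetical selector
                yield {
                    "name": card.css(".name::text").get(),
                    "price": card.css(".price::text").get(),
                }
            # Queue the next listing page, if any, with the same callback.
            next_page = response.css("a.next::attr(href)").get()
            if next_page:
                yield response.follow(next_page, callback=self.parse)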
  • $3 hourly
    I have been working as a data engineer for nearly 3 years. I have experience with Python, SQL, PySpark, and ETL; AWS experience with services such as Glue, EMR, Redshift, S3, and RDS; and some Snowflake experience too.
    Selenium WebDriver
    Linux
    Amazon EC2
    Amazon Athena
    AWS Lambda
    Amazon S3
    Amazon Redshift
    AWS Glue
    Apache Spark
    PySpark
    SQL
    Python
  • $22 hourly
    Around 8 years of experience in ETL with Informatica PowerCenter/IICS. Completed an end-to-end implementation project and worked on enhancement and support projects using Informatica PC/IICS. Used Scrum/Agile project management methodologies and automated manual tasks using Informatica PC, UNIX, and VBA macros. Strong in designing relational databases and handling complex SQL queries on databases such as SQL Server and Oracle. Experience in big data technologies such as Hadoop, Hive, Pig, and Sqoop. Experience in deep learning neural networks and natural language processing in Python and R. Worked in an AWS environment for the development and deployment of custom ETL mappings.
    Data Scraping
    Data Mining
    EasyVista
    Data Warehousing
    Informatica
    Data Analysis
    PySpark
    Microsoft SQL Server
    Machine Learning
    Python
  • $100 hourly
    Experienced in creating scalable data architectures and pipelines in the cloud. Expert tech stack: Python, SQL, Spark, Snowflake, AWS, Azure, data modeling, data warehousing, Airflow, dbt, data processing, and data analytics.
    Amazon Athena
    Amazon Redshift
    AWS CloudFormation
    Amazon DynamoDB
    AWS Lambda
    AWS Glue
    Data Warehousing
    Apache Airflow
    Data Engineering
    Data Analytics
    dbt
    Snowflake
    Git
    PySpark
    Apache Spark
    Docker
    Python
    SQL
  • $13 hourly
    Hi, I'm Jaswinder Singh, a software engineer with 1 year of experience in front-end development using JavaScript, HTML, and CSS, as well as in building web applications using the React, Next.js, and Vue.js frameworks. I also have experience in back-end development with Python and have worked with databases such as PostgreSQL and MongoDB. I am passionate about creating clean, well-organized, and efficient code to deliver high-quality software solutions to clients. I am constantly seeking to expand my knowledge and skill set by learning new technologies and staying up to date with industry trends. As a freelancer, I am dedicated to providing top-quality work to my clients and ensuring that their needs are met. I am a detail-oriented individual who values effective communication and collaboration with clients to achieve their goals. I am excited to bring my skills and expertise to your projects and help you achieve your software development goals. Please don't hesitate to contact me to discuss your project requirements or to learn more about my experience and capabilities.
    Next.js
    PySpark
    React
    Java
    JavaScript
    Python
    HTML
    SQL
    CSS
    Vue.js
  • $20 hourly
    SUMMARY
    * Data Engineer with two-plus years of experience on the Azure cloud platform, data lake architecture, and data warehousing. Worked closely with major clients in the FMCG and BFSI domains.
    * Expertise in developing CI/CD pipelines using Azure DevOps.
    * Major skills in PySpark, Python, T-SQL, Snowflake, dbt, and the Azure cloud.
    CI/CD
    Microsoft Azure SQL Database
    Distributed Computing
    Data Warehousing
    Azure DevOps
    Cloud Computing
    Microsoft Azure
    Data Lake
    Databricks Platform
    PySpark
    Apache Spark
    ETL Pipeline
    Python
    Microsoft Excel
  • $35 hourly
    Hello there! I'm a seasoned software developer with a passion for crafting innovative solutions. Here are some of my skills:
    1. Seasoned software developer specializing in Python development
    2. Proficient in the AWS and Azure cloud platforms for building robust and scalable systems
    3. Experienced in graph database technologies, particularly Neo4j, for efficient data modeling
    4. Skilled in PySpark for high-performance data processing in data-intensive environments
    5. Enthusiastic about integrating machine learning algorithms into software solutions
    6. Knowledgeable in machine learning techniques and algorithms for building intelligent software solutions
    Committed to turning ideas into impactful applications through collaborative efforts.
    API Development
    PySpark
    Scala
    Neo4j
    AWS Lambda
    Python
    Amazon Web Services
  • $20 hourly
    I’m a Data Engineer experienced in building ETL pipelines and data lakes, and I have also built reverse ETL pipelines. I have experience working with Snowflake and in optimizing data pipelines and systems.
    Amazon Athena
    Python
    Amazon S3
    Amazon DynamoDB
    Snowflake
    ETL
    Amazon Web Services
    ETL Pipeline
    PySpark
    AWS Glue
    Data Extraction
  • $10 hourly
    PROJECTS
    Real-time data pipeline for customer experience data:
    - Used Kafka, Snowflake, Python, Scylla, Spark, Hive, Spark optimization, and SCD techniques to develop this system
    - Upstream flows events through Kafka topics; we wrote processor code in Python to handle the processing logic for different events
    - Post-processing, domain events are produced to our domain Kafka topic, from which they flow to a Snowflake sink table and finally get merged into the Snowflake final DM table
    - Looker explores/dashboards are created with the Snowflake DM table as the source (a minimal consume-process-produce sketch follows this profile's skill list)
    - Currently working on a Snowflake-to-Hive migration for cost cutting; developed a Snowflake-to-Hive backfilling pipeline to transfer historical data
    - Applied the CAP theorem for optimal tuning of consistency against availability
    Exception-handling framework:
    - Automated the exception-handling mechanism for the real-time data pipeline
    - Reduced 90% of the human effort involved while encountering any kind of
    MySQL
    SQL
    NoSQL Database
    Apache Cassandra
    Apache HBase
    Apache Airflow
    Snowflake
    Apache Kafka
    Scala
    Python
    PySpark
    Hive
    Apache Spark
    Real Time Stream Processing
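    To sketch the consume-process-produce flow this profile describes (my hedged reconstruction, not the actual production code), here is a minimal example with the kafka-python client; the topic names, broker address, and event schema are all assumptions.
    # Consume upstream events, apply per-event-type logic, emit domain events.
    import json

    from kafka import KafkaConsumer, KafkaProducer

    consumer = KafkaConsumer(
        "upstream-events",                      # hypothetical source topic
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda d: json.dumps(d).encode("utf-8"),
    )

    def handle(event: dict) -> dict | None:
        """Processing logic for one event type; unknown types are dropped."""
        if event.get("type") == "customer_feedback":
            return {"domain": "cx", "id": event.get("id"), "score": event.get("score")}
        return None

    for message in consumer:
        domain_event = handle(message.value)
        if domain_event is not None:
            producer.send("domain-events", domain_event)  # hypothetical sink topic
    A connector or sink job would then land the domain topic in Snowflake and merge it into the final DM table, as the profile outlines.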
  • $50 hourly
    I’m a developer experienced in backend development and data engineering.
    1. Knows PySpark, SQL, and AWS.
    2. Builds REST APIs using Flask, Django, and FastAPI.
    3. Full project management from start to finish.
    4. Regular communication is important to me, so let’s keep in touch.
    PySpark
    SQL
    ETL Pipeline
    Data Engineering
    FastAPI
    Django
    Flask
    Python Script
    Python
    Back-End Development
  • $10 hourly
    I can help you with your data analytics, data mining, visualization, and IoT-related projects. I have experience with manufacturing, EV, retail, and healthcare data.
    Data Analysis
    NumPy
    PySpark
    Data Visualization
    Databricks Platform
    Python Folium
    pandas
    Python
    Microsoft Power BI
    Data Analytics
  • $12 hourly
    Data Engineer
    Experienced Data Engineer with 3 years of expertise in designing, building, and optimizing data pipelines for efficient data processing. Proficient in ETL processes, data warehousing, and data modeling, delivering valuable insights for informed decision-making.
    AWS Glue
    PostgreSQL
    Snowflake
    Microsoft Azure
    Microsoft SQL Server
    Databricks Platform
    PySpark
    SQL
    Python
  • $25 hourly
    As a seasoned Cloud Data Engineer & Analytics Architect, I specialize in empowering financial insights using cutting-edge technologies. With over 5 years of hands-on experience, I can revolutionize data strategies within the Banking and Financial Services domain. My expertise spans the entire data lifecycle, from collection and preparation to advanced analytics and visualization. I can leverage Python, AWS, PySpark, Docker, Airflow, and Power BI to elevate your organization to new heights of success.
    My key responsibilities include:
    - Python scripting mastery: I can use Python to engineer bespoke data solutions, automate processes, and unlock actionable insights with precision and efficiency.
    - AWS cloud expertise: By leveraging AWS services, I can architect scalable and secure data ecosystems that empower your organization to thrive in the cloud era with unmatched flexibility and agility.
    - PySpark prowess: With PySpark, I can orchestrate distributed data processing and analysis, enabling lightning-fast insights and optimization of your data workflows.
    - Containerization with Docker: I can use Docker to containerize data applications, ensuring seamless deployment and portability across environments, from development to production.
    - Airflow automation: By employing Airflow, I can automate data workflows, orchestrating complex pipelines with ease, reliability, and scalability.
    - Power BI visualization: With Power BI, I can craft captivating visualizations and reports that transform raw data into actionable insights, driving informed decision-making at every level of your organization.
    You should choose me because:
    - Proven financial-sector expertise: With a specialization in the Banking and Financial Services domain, I understand the unique challenges and opportunities inherent in the industry and deliver tailored solutions that drive tangible business outcomes.
    - End-to-end data excellence: From data collection to visualization, I offer comprehensive solutions that optimize every aspect of your data landscape, ensuring maximum value extraction and competitive advantage.
    - Cutting-edge technology adoption: By staying at the forefront of technological advancements, I ensure that your organization remains ahead of the curve, continuously evolving and innovating in the rapidly changing world of data.
    Let's transform your data strategy. Are you ready to embark on a journey of data-driven transformation? Partner with me to unlock the full potential of your data assets and accelerate your path to success. Together, we'll navigate the complexities of modern data ecosystems, harnessing the power of Python, AWS, PySpark, Docker, Airflow, and Power BI to drive innovation, efficiency, and growth within your organization.
    Model Validation
    Machine Learning
    Docker
    SQL
    Microsoft Power BI Data Visualization
    Apache Kafka
    Apache Airflow
    Git
    PySpark
    AWS Glue
    AWS Application
    Python
    Cloud Computing
  • $25 hourly
    As a Data Engineer cum Consultant, I bring experience in SQL, Spark, AWS services, and various data analysis and visualization tools.
    Tech & IT
    Event Management
    Tableau
    Data Engineering
    Microsoft Excel
    Data Augmentation
    Data Analytics & Visualization Software
    PySpark
    SQL
    Amazon EC2
    Amazon S3
    Amazon Redshift
    AWS CodePipeline
    AWS Glue
    AWS Development
  • $15 hourly
    I have over 4 years of experience in data engineering, especially using Spark and PySpark to extract value from massive amounts of data. I have worked with analysts and data scientists on Hive/Spark, resolving their issues with the big data ecosystem. I also have experience with Hadoop and with building ETL, especially PySpark- and Hive-based solutions. I have good hands-on experience optimizing query performance and providing and maintaining key datasets for many important customer-facing dashboards (a minimal tuning sketch follows this profile's skill list).
    Data Warehousing & ETL Software
    ETL Pipeline
    Qubole
    Big Data
    Snowflake
    SQL
    Python
    PySpark
    Apache Kafka
    Apache Airflow
    Apache Hive
    Hive
    Engineering & Architecture
    Data Warehousing
    Data Engineering
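    As an illustration of the query-performance work this profile mentions (a sketch under assumptions, not the freelancer's code), here are two standard PySpark optimizations: broadcasting a small dimension table and partitioning output by date. Paths and column names are placeholders.
    # Two common Spark tuning moves: broadcast join and partitioned output.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

    events = spark.read.parquet("s3://bucket/events/")          # placeholder path
    countries = spark.read.parquet("s3://bucket/dim_country/")  # small dimension table

    # Broadcasting the small side avoids a shuffle-heavy sort-merge join.
    enriched = events.join(broadcast(countries), on="country_code")

    # Partitioning by date lets downstream queries prune files they don't read.
    enriched.write.mode("overwrite").partitionBy("event_date").parquet(
        "s3://bucket/events_enriched/"
    )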
  • $30 hourly
    As a seasoned data engineer with seven years of experience, I specialize in Python, SQL, GCP, Docker, Spark, and Databricks. I excel in designing scalable data pipelines, optimizing workflows, and leveraging cloud technologies for actionable insights. With expertise in deploying and maintaining distributed computing frameworks, I deliver high-quality solutions for freelance projects requiring top-tier data engineering and analytics skills.
    PySpark
    SQL
    Python
    Google Cloud Platform
    Databricks Platform
    Docker
    Scripting
    Apache Airflow
  • $35 hourly
    As someone passionate about machine learning, data science, and cutting-edge technologies, I'm drawn to projects that offer opportunities for innovation, problem-solving, and skill enhancement. Previously I have worked on:
    1. GNNs for fraud reduction in the financial sector.
    2. Training/fine-tuning LLMs for use as chatbots in medical use cases.
    3. Building an agent for black-box tool usage.
    4. Implementing an open-set facial recognition model on edge devices.
    5. Creating alphas for WorldQuant as a part-time research consultant.
    C
    MySQL
    Graph Neural Network
    Large Language Model
    TensorFlow
    Git
    C++
    Hugging Face
    LangChain
    pandas
    PySpark
    Python
    PyTorch
    Artificial Intelligence
    Machine Learning
  • $20 hourly
    Results-driven Data Engineer with 8+ years of experience in software development, debugging, and process improvement. Successful career chronicle using Python, PySpark, MySQL, Hadoop, Hive, Sqoop, etc. Possess extensive experience with AWS services such as Glue, Redshift, S3, Athena, Boto3, EC2, SES, IAM users, SQS, and Step Functions. Distinguished capabilities in data analysis and in using DataFrames through Spark SQL. Proven track record of working with Spark RDDs and map/reduce functions using PySpark. Adept in data migration to AWS S3 from SAS 9.2 using PySpark and Hive, and from S3 to Redshift (a minimal load sketch follows this profile's skill list). Sound knowledge of the relational database model. Experienced with various Python integrated development environments such as IDLE and Jupyter. Extensive experience in handling large volumes of data and data migration/export. Worked regularly on tables, procedures, packages, functions, collections, shell scripting, and server management. Contributed significantly to bulk data migration through Unix shell scripting. Developed and enhanced PL/SQL programs.
    AWS services: Glue, EC2, Boto3, SES, Athena, Systems Manager, IAM users, Lambda, State Machines/Step Functions, SQS, CloudWatch, Crawlers
    IDEs: PyCharm, Eclipse, Jupyter
    Big data ecosystems: PySpark, Hadoop, MapReduce, HDFS, Hive, Sqoop, Flume
    Programming languages: C, Python, Core Java, SQL, HiveQL
    Scripting languages: Python
    Databases: MySQL, HBase, Redshift
    Apache HBase
    Amazon Redshift
    Sqoop
    HDFS
    SQL Programming
    AWS Lambda
    Glue & Other Adhesives
    Apache Flume
    Apache Hadoop
    PySpark
    Python
    Data Warehousing & ETL Software
    AWS Glue
    Amazon Web Services
    Database Management
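    For the S3-to-Redshift migration mentioned above, here is a hedged sketch using plain Spark JDBC (in practice, AWS Glue jobs or Redshift COPY are often preferred for bulk loads); the connection URL, credentials, table names, and driver class are placeholders/assumptions.
    # Read exported Parquet from S3 and append it to a Redshift staging table.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("s3-to-redshift").getOrCreate()

    df = spark.read.parquet("s3://bucket/exported/sas_tables/")  # placeholder path

    (df.write
       .format("jdbc")
       .option("url", "jdbc:redshift://cluster.example:5439/dev")  # placeholder
       .option("dbtable", "staging.sas_migrated")                  # placeholder
       .option("user", "etl_user")
       .option("password", "***")  # use a secrets manager in real pipelines
       .option("driver", "com.amazon.redshift.jdbc.Driver")        # Redshift JDBC v2 class
       .mode("append")
       .save())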
  • $20 hourly
    I am a developer with 8 years of experience in various technologies.
    * Experienced in Azure Synapse, Data Factory, and Databricks.
    * Worked with APIs, Logic Apps, and Azure Functions.
    * Proficient in SQL and MSBI technologies.
    * I fully manage projects from planning through delivery of the final product.
    Over the last 8 years, I have worked with different companies, enriching my experience across multiple technologies.
    Database
    ETL
    Microsoft Azure
    Hive
    Data Lake
    Python Script
    API
    PySpark
    ETL Pipeline
    Microsoft Excel
    Apache Hadoop
    Python

How hiring on Upwork works

1. Post a job (it’s free)

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.


How do I hire a PySpark Developer near Gurgaon, IN on Upwork?

You can hire a PySpark Developer near Gurgaon, IN on Upwork in four simple steps:

  • Create a job post tailored to your PySpark Developer project scope. We’ll walk you through the process step by step.
  • Browse top PySpark Developer talent on Upwork and invite them to your project.
  • Once the proposals start flowing in, create a shortlist of top PySpark Developer profiles and interview them.
  • Hire the right PySpark Developer for your project from Upwork, the world’s largest work marketplace.

At Upwork, we believe talent staffing should be easy.

How much does it cost to hire a PySpark Developer?

Rates charged by PySpark Developers on Upwork can vary with a number of factors, including experience, location, and market conditions. See hourly rates for in-demand skills on Upwork.

Why hire a PySpark Developer near Gurgaon, IN on Upwork?

As the world’s work marketplace, we connect highly skilled freelance PySpark Developers with businesses and help them build trusted, long-term relationships so they can achieve more together. Let us help you build the dream PySpark Developer team you need to succeed.

Can I hire a PySpark Developer near Gurgaon, IN within 24 hours on Upwork?

Depending on availability and the quality of your job post, it’s entirely possible to sign up for Upwork and receive PySpark Developer proposals within 24 hours of posting a job description.