Hire the best PySpark developers in Bengaluru, IN

Check out PySpark developers in Bengaluru, IN with the skills you need for your next job.
  • $90 hourly
    I pride myself on a perfect record of 5-star ratings across all projects. My expertise in cloud data engineering and full-stack development has been honed through experience with premier institutions like Goldman Sachs, Morgan Stanley, a member of the Big Four, and a Fortune 500 company. With over 9 years of experience in data engineering and programming, I bring a commitment to excellence and a passion for perfection to every project I undertake. My approach centers on delivering not just functional but highly efficient, optimized code, ensuring top-quality output that consistently impresses my clients. Extensive experience on both the GCP and AWS cloud platforms allows me to provide solutions that are not only effective but also innovative and forward-thinking. I believe in going beyond the basics, striving for excellence in every aspect of my work, and delivering results that speak for themselves. Choose me if you prioritize top-notch quality in your projects and appreciate a freelancer who autonomously makes sound decisions, seeking clarification only when absolutely necessary.
    Areas of Expertise:
    - Cloud: GCP (Google Cloud Platform), AWS (Amazon Web Services)
    - Programming Languages: Java, Scala, Python, Ruby, HTML, JavaScript
    - Data Engineering: Spark, Kafka, Crunch, MapReduce, Hive, HBase, AWS Glue, PySpark, BigQuery, Snowflake, ETL, Data Warehouse, Databricks, Data Lake, Airflow, CloudWatch
    - Cloud Tools: AWS Lambda, Cloud Functions, App Engine, Cloud Run, Datastore, EC2, S3
    - DevOps: GitHub, GitLab, BitBucket, Chef, Docker, Kubernetes, Jenkins, Cloud Deploy, Cloud Build
    - Web & API: Spring Boot, Jersey, Flask, HTML & JSP, ReactJS, Django
    Reviews:
    "Amar is a highly intelligent and experienced individual who exceeds expectations with his service. He has very deep knowledge across the entire field of data engineering and is a very passionate individual, so I am extremely happy to have finished my data engineering project with such a responsible, fantastic guy. I was able to complete my project faster than anticipated. Many thanks..."
    "Amar is an exceptional programmer who is hard to find on Upwork. He combines top-notch technical skills in Python and Big Data, excellent work ethic, communication skills, and strong dedication to his projects. Amar systematically works to break down complex problems, plan an approach, and implement thought-out, high-quality solutions. I would highly recommend Amar!"
    "Amar is a fabulous developer. He is fully committed and not a clock watcher. Technically very strong; his Java and Python skills are top-notch. What I really like about him is his attitude of taking a technical challenge personally and putting in a lot of hours to solve the problem. Best yet, he does not charge the client for all those hours; he sticks to the agreement. Very professional. It was a delight working with him, and I will reach out to him if I have a Java or Python task."
    Google App Engine
    Software Development
    Web Development
    Machine Learning
    Big Data
    Google Cloud Platform
    Amazon Web Services
    BigQuery
    PySpark
    Apache Airflow
    Apache Spark
    Data Engineering
    SQL
    Python
    Java
  • $60 hourly
    Expert Data Engineer and Certified Generalist Software Engineer
    Languages:
    - Expert: Python | SQL
    - Intermediate: JavaScript | Java | Shell Script (Bash) | Solidity
    - Syntax Knowledge: C | C++
    Big Data Stack: Apache Spark/PySpark | Apache NiFi | Apache Kafka | Apache Flink | dbt
    Blockchain Stack: Solidity | Web3j | Chainlink | Moralis | StackOS | IPFS
    Chains: Ethereum, Polygon (any EVM-compatible chain: BSC, Polkadot, Avalanche, etc.). Create ERC-20 tokens and ERC-721/1155 NFTs; store NFTs/metadata on Filecoin/IPFS; custom smart contracts.
    Frontend Frameworks: Vue.js | Bootstrap | jQuery
    Backend Frameworks: Flask | Express.js | PHP | Spring Boot
    Cloud Infrastructure: AWS (S3, EC2, EMR, Redshift, SQS, Glue)
    Databases: PostgreSQL | Redis | Redshift
    Deployment: Docker | Docker Compose | Kubernetes (K8s - amateur)
    Schedulers: Azkaban | Airflow
    Skills: Web Scraping | ETL | ELT | Data Warehouse | Data Mining | Full-Stack Web Development | REST API | Data Wrangling
    Misc: Discord | Binance API | Selenium | Metabase
    Python Packages: BeautifulSoup, Requests, Selenium, PySpark, PyFlink, pandas, scikit-learn, etc.
    Algorithm Development
    PostgreSQL
    Flask
    Cryptocurrency
    Amazon Redshift
    Redis
    PySpark
    Apache Kafka
    Apache NiFi
    Linux
    SQL
    Python
  • $40 hourly
    🚀 Greetings! 🚀
    I'm a seasoned Senior Data Engineer with a robust background in architecting and implementing sophisticated data solutions that drive decision-making and business intelligence. With a knack for data wrangling, transformation, normalization, and crafting end-to-end data pipelines, I bring to the table a wealth of expertise aimed at optimizing your data infrastructure for peak performance and insight generation.
    🔍 What Sets Me Apart? 🔍
    - Proven Track Record: Successfully deployed multiple complex data pipelines using industry-standard tools like Apache Airflow and Apache Oozie, demonstrating my capability to handle projects of any scale.
    - Fortune 500 Experience: Contributed significantly to data platform teams at renowned companies, tackling intricate data challenges, managing voluminous datasets, and enhancing data flow efficiency.
    - Holistic Skillset: My proficiency isn't limited to engineering. I excel in Business Intelligence, ETL processes, and crafting complex SQL queries, ensuring a comprehensive approach to data management.
    - Efficiency & Simplicity: I prioritize creating solutions that are not only effective but also straightforward and maintainable, ensuring long-term success and ease of use.
    🛠 Tech Arsenal 🛠
    - Cloud Platforms: Mastery over GCP (Google Cloud Platform) and AWS (Amazon Web Services), enabling seamless data operations in the cloud.
    - Programming Languages: Skilled in Java, Scala, and Python, offering versatility in tackling various data engineering challenges.
    - Data Engineering Tools: Expertise in Spark, PySpark, Kafka, and more, equipped to build robust data processing applications.
    - Data Warehousing: Proficient with AWS Athena, Google BigQuery, and Snowflake, ensuring scalable and efficient data storage solutions.
    - Orchestration & Scheduling: Adept in managing complex workflows with tools like Airflow and Oozie, coupled with container orchestration using Docker.
    🌟 Why Collaborate With Me? 🌟
    Beyond my technical prowess, I am detail-oriented, organized, and highly responsive, prioritizing clear communication and project efficiency. I am passionate about unlocking the potential of data to fuel business growth and innovation. Let's embark on this data-driven journey together! Connect with me to discuss how we can elevate your data infrastructure to new heights.
    Apache Airflow
    Apache Kafka
    Data Warehousing
    Data Lake
    ETL Pipeline
    ETL
    AWS Lambda
    AWS Glue
    Microsoft Azure
    Data Integration
    Apache Hive
    Data Transformation
    PySpark
    SQL
    Python
  • $35 hourly
    I am a Cloud Data Architect at Greenway Health, a leading provider of software and services for the healthcare industry. I have more than two decades of experience in cloud architecture, database architecture, project management, and data and database migration. I also hold the Azure Solutions Architect Expert and Azure Data Engineer certifications, demonstrating my proficiency and expertise in Azure cloud services. In my current role, I design and implement data solutions that leverage the power and scalability of Azure Data Factory, Data Lake, Blob Storage, Backup, DevOps, Event Grid, and Monitor. I have successfully led and delivered several projects for clients across different domains, such as healthcare, finance, energy, telecom, and media, using these Azure services. I am also skilled in Azure IaaS and PaaS, infrastructure as code, and DevOps with ARM, CI/CD, and Terraform pipelines. My mission is to provide innovative and reliable solutions that optimize cloud data performance, security, and availability.
    Snowflake
    PySpark
    MLflow
    MLOps
    Terraform
    Cloud Implementation
    Cloud Computing
    Cloud Database
    Cloud Architecture
    Data Center Migration
    Data Analytics & Visualization Software
    Data Analysis Consultation
    Data Access Layer
    Microsoft Power BI Data Visualization
    Microsoft Azure
  • $8 hourly
    Are you ready to take your business to new heights with data-driven decision-making? Look no further! As an accomplished Data Analyst with a track record of success, I bring two years of hands-on experience in collecting and analyzing data across various business missions. My passion lies in uncovering the hidden gems within structured and unstructured data, empowering businesses to make informed decisions and solve complex problems effectively.
    With my strong analytical mindset and meticulous attention to detail, I have the expertise to transform raw data into actionable insights that can drive your business forward. Whether you need assistance in data extraction from diverse sources or conducting complex data transformations, I am well-equipped to deliver results that matter. As a data enthusiast, I thrive on translating intricate data into simplified visualizations and reports, ensuring that you and your stakeholders grasp the crucial information effortlessly.
    My educational background is grounded in excellence: I hold a Master's degree in Data Science from the prestigious LJMU (Liverpool John Moores University), England. This solid foundation has equipped me with the skills and qualifications to provide your business with accurate, timely, and reliable data delivery, metrics, reporting, and analysis, all tailored to support your strategic goals.
    Working with me means having a partner who is not only passionate about data but also understands the significance of aligning insights with your business objectives. I take pride in my ability to adapt and learn swiftly, enabling me to stay ahead of industry trends and provide cutting-edge solutions for your business.
    Let's embark on a transformative journey together, where data will be the driving force behind your success. Whether you need a data-driven strategy, in-depth market analysis, or insights to optimize your operations, I'm here to make it happen. Reach out now, and let's start leveraging the power of data to make your business thrive!
    Oracle
    Data Scraping
    Data Engineering
    Data Analysis
    PySpark
    Matplotlib
    Flask
    SQL
    pandas
    Data Science
    Tableau
    Machine Learning
    Keras
    Python
  • $25 hourly
    An engineer by training, I work as a Data Scientist and hold a PG Diploma in AI & ML. I have 10 years of industry experience as a programmer, data scientist, and data analyst. I have worked on multiple projects for various banking- and automobile-based industries, providing optimal solutions for complex business problems. Alongside industry projects, I am also passionate about mentoring students as they build their data science careers.
    Skills:
    • Automation and scripting - Python, SQL
    • Machine Learning and Deep Learning methodologies
    • Statistical methods
    • Optimization techniques
    • Data Mining & Analytics
    • Data Visualization - Tableau
    • Cloud-based solutions
    Data Analytics
    Microsoft Excel
    Statistical Analysis
    Data Analysis
    Data Visualization
    PySpark
    SQL
    Data Science
    Machine Learning
    Data Science Consultation
    Databricks Platform
    Python
    Deep Learning
  • $50 hourly
    Enthusiastic Data Engineer experienced in AWS data services such as EMR, Glue, Lambda, S3, Redshift, RDS, CloudWatch, and Athena, along with Azure Databricks and Azure SQL DB.
    Amazon S3
    Amazon Athena
    AWS Glue
    Databricks Platform
    AWS Lambda
    Amazon Redshift
    PySpark
    Apache Spark
    Python
  • $18 hourly
    I am a cloud engineer with experience building ETL pipelines on Azure, with hands-on experience in Azure Data Factory, Databricks, Azure Monitor, and other services. I enjoy exploring and learning new technologies and implementations, and have done so throughout my career. Be it a new POC or an existing solution, I can help develop both.
    - Experienced in developing cloud solutions on Azure
    - Can fully manage a project end to end
    - Effective and timely communication to keep you up to date
    Data Engineering
    Data Lake
    Apache Spark
    Microsoft Windows PowerShell
    CI/CD
    Microsoft SQL Server
    Data Warehousing
    Python
    Microsoft Azure
    Azure DevOps
    Distributed Computing
    SQL
    Microsoft Azure SQL Database
    PySpark
    Databricks Platform
  • $40 hourly
    I am a Senior Data Engineer with extensive expertise in data wrangling, transformation, normalization, and setting up comprehensive end-to-end data pipelines. My skills also include proficiency in Business Intelligence, ETL processes, and writing complex SQL queries. I have successfully implemented multiple intricate data pipelines using tools like Apache Airflow and Apache Oozie in my previous projects. I have had the opportunity to contribute to the data platform teams at Fortune 500 companies, where my role involved solving complex data issues, managing large datasets, and optimizing data streams for better performance and reliability. I prioritize reliability, efficiency, and simplicity in my work, ensuring that the data solutions I provide are not just effective but also straightforward and easy to maintain. Over the years, I have worked with a variety of major databases, programming languages, and cloud platforms, accumulating a wealth of experience and knowledge in the field.
    Skills:
    - Cloud: GCP (Google Cloud Platform), AWS (Amazon Web Services)
    - Programming Languages: Java, Scala, Python
    - Data Engineering: Spark, PySpark, Kafka, Crunch, MapReduce, Hive, HBase, AWS Glue
    - Data Warehousing: AWS Athena, Google BigQuery, Snowflake, Hive
    - Schedulers: Airflow, Oozie, etc.
    - Orchestration: Docker
    I am highly attentive to detail, organised, efficient, and responsive. Let's connect.
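At their core, the scheduler-driven pipelines this kind of profile describes come down to running tasks in dependency order. A toy sketch of that idea in plain Python (stdlib only; the task names are hypothetical, and a real project would express this as an Airflow or Oozie DAG rather than a hand-rolled runner):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Toy dependency graph: each task maps to the set of tasks it depends on.
# These names are illustrative; in Airflow they would be operators in a DAG.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

def run_order(graph):
    """Return task names in a valid execution order (raises on cycles)."""
    return list(TopologicalSorter(graph).static_order())

print(run_order(pipeline))  # → ['extract', 'transform', 'load', 'report']
```

Schedulers add retries, backfills, and parallelism on top, but the topological ordering above is the invariant they all preserve.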
    Data Warehousing & ETL Software
    API Integration
    Apache Airflow
    Apache Spark
    Apache Hadoop
    Apache Kafka
    PySpark
    ETL Pipeline
    Data Engineering
    Data Preprocessing
    Data Integration
    Apache Hive
    Python
    SQL
    Data Transformation
  • $30 hourly
    Using pandas or PySpark, I support data ingestion and cleaning for batch and streaming data (projects uploaded on Git) and store the results in databases like PostgreSQL or MongoDB. Using Django, I build apps that run scheduled Python scripts using Celery with Redis as the message broker, and create APIs to serve front-end needs using Django REST Framework. As a Data Engineer, I build pipelines and maintain large open-source OLAP platforms like Apache Druid, Apache Pinot, and Presto that ingest big data in real time using Kafka. I write SQL queries (Advanced SQL certifications taken) to support dashboards in Superset or Power BI. I use Airflow or cron to schedule jobs and tasks, Kubernetes or Docker for deployments, and Git for version control.
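The batch-cleaning step this profile describes can be sketched minimally in plain Python (stdlib only, standing in for the equivalent PySpark `dropna`/`withColumn` logic; the `user_id`/`email` field names are hypothetical):

```python
# Minimal batch-cleaning sketch. In PySpark the same logic would be
# df.dropna(subset=["user_id"]) plus a withColumn() normalizing email.

def clean_records(records):
    """Drop rows missing a user_id and normalize the email field."""
    cleaned = []
    for row in records:
        if not row.get("user_id"):       # dropna equivalent: skip incomplete rows
            continue
        row = dict(row)                  # copy to avoid mutating the caller's data
        email = (row.get("email") or "").strip().lower()
        row["email"] = email or None     # empty strings become explicit nulls
        cleaned.append(row)
    return cleaned

raw = [
    {"user_id": "u1", "email": "  Alice@Example.COM "},
    {"user_id": None, "email": "dropped@example.com"},
    {"user_id": "u2", "email": ""},
]
print(clean_records(raw))
```

The same drop-then-normalize shape scales from a list of dicts to a Spark DataFrame; only the API changes.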
    MySQL
    BigQuery
    PostgreSQL
    MongoDB
    Apache Flink
    RESTful API
    Data Engineering
    Apache Kafka
    Docker
    Django
    PySpark
    pandas
    Python
    SQL
    Apache Spark
  • $10 hourly
    "10% of all bottom-line earnings from your work will go to Save the Children, The Akshaya Patra Foundation (India), or a charity of your choice" - Making a difference, one step at a time!
    I completed my education at IIT Kharagpur, India. I have a keen interest in practical uses of data science, and have patented a deep-learning-based solution framework for a banking solution. I have also published 2 research papers in renowned journals in the fields of economics and quantitative finance.
    I will squeeze every ounce of insight from your data and deliver an actionable product. From data engineering to data science, to MLOps and quantitative finance, my skill set is wide, and I like to work fast to help you move from data to insight to action and finally to a product. I work the full gamut of analytics, from data extraction and cleaning to data analysis, visualization, model development, and deployment.
    My skills include:
    - Quantitative Finance, Statistics, Stochastic Processes
    - Python, R, SAS, SQL
    - PyTorch, TensorFlow, Keras, scikit-learn
    - Google Cloud Platform (GCP)
    - ETL Processes, Exploratory Data Analysis
    - Machine Learning, Deep Learning
    - Natural Language Processing (NLP)
    - Model Development, Monitoring, and Deployment
    - Automation and Web Scraping
    - Trading Strategy, Valuations, Portfolio Optimization
    - Research and Patents in Data Science and Economics
    - Banking Domain and Digital Marketing Expertise
    I am eager to collaborate with you and confident that our partnership will be both productive and impactful!
    PySpark
    Git
    Google Cloud Platform
    Jupyter Notebook
    Linear Programming
    Cloud Computing
    Deep Learning
    TensorFlow
    Machine Learning Model
    Python
    C++
    pandas
    Machine Learning
    Keras
    Deep Neural Network
  • $10 hourly
    I am an experienced Data Engineer with 1+ years in the insurance, automobile, and banking sectors. I hold a Bachelor of Technology in Electronics and Communication Engineering from SRM University. My skills include Python, SQL, Power BI, Big Data technologies (Hadoop, Sqoop, Hive), PySpark, Kafka, Azure Data Factory, Azure Databricks, and AWS Glue. I have developed data pipelines on different platforms, including Data Factory, Informatica, DataStage, and Glue, and have also built Power BI reports, mostly on clients' financials. I am responsible and always punctual with deadlines. My goal is to make every client satisfied. Thank you!
    Apache Hadoop
    Hive
    Python
    PySpark
    Microsoft Azure
    Microsoft Power BI
    Business Intelligence
    Apache Kafka
    Amazon S3
    RESTful API
    Databricks Platform
    Amazon EC2
    Database Management System
    Data Ingestion
    Data Lake
  • $50 hourly
    "Hello! I'm Jeevan, a dedicated and experienced data engineer with a passion for transforming raw data into actionable insights. With a strong background in data management, analysis, and visualization, I help businesses make informed decisions and drive growth through data-driven strategies.
    I have a proven track record of designing and implementing robust data pipelines, ensuring efficient data extraction, transformation, and loading (ETL) processes. I specialize in leveraging cutting-edge technologies such as Apache Spark and SQL to process and manipulate large datasets, enabling faster and more accurate analytics.
    My expertise extends to data modeling and database design, where I have worked extensively with both relational and NoSQL databases like MySQL, PostgreSQL, MongoDB, and Cassandra. I ensure optimal data storage structures and efficient query performance to support the needs of data-driven applications.
    Additionally, I am skilled in data warehousing solutions, having hands-on experience with cloud-based platforms such as Google BigQuery and Snowflake. I can design and develop scalable data architectures that accommodate evolving business requirements while ensuring data integrity and security.
    Moreover, I am well-versed in data visualization tools like Tableau and Power BI, and Python libraries such as Matplotlib and Plotly. I create visually compelling dashboards and reports that effectively communicate insights to stakeholders, enabling them to make informed decisions.
    Collaboration and communication are key strengths of mine. I work closely with cross-functional teams, including data scientists, analysts, and stakeholders, to understand their requirements and deliver tailored solutions. I am highly adaptable, and I thrive in dynamic and fast-paced environments.
    If you are looking for a skilled data engineer who can turn your data into a valuable asset, I'm here to help. Let's connect and discuss how I can assist you in unlocking the full potential of your data-driven initiatives!"
    NoSQL Database
    MySQL
    Data Analysis
    Tableau
    Google Cloud Platform
    Matplotlib
    pandas
    NumPy
    Databricks MLflow
    PySpark
    BigQuery
    Python
    SQL
  • $25 hourly
    With a decade-long background in backend and data engineering across diverse domains, I offer extensive expertise in managing large-scale data and building robust data systems.
    Terraform
    Docker
    Kubernetes
    PySpark
    Django
    FastAPI
    Python
    RabbitMQ
    Apache Kafka
    BigQuery
    ETL
    Google Cloud Platform
    MLOps
    ETL Pipeline
  • $50 hourly
    Accomplished IT professional with over 25 years of experience designing, architecting, and implementing digital transformation and data analytics products and services in large enterprises across the globe and a wide range of industries.
    • Expertise in data engineering projects using big data technologies: Kafka, Apache Spark, Python, HBase, Hive, RDS, Redshift.
    • Expertise in Oracle Analytics projects such as platform upgrades, migrations, building data warehouses from the ground up, lift-and-shift of data warehouses, sourcing multiple source system types (Oracle, SQL Server, JSON, CSV file systems), automating large numbers of parallel data loads using scheduling tools, feeding multiple downstream systems using the ETL tool ODI, and real-time integration with GoldenGate.
    • In-depth knowledge of statistics, machine learning algorithms, hypothesis testing, regression analysis, clustering, and ML libraries in Python for data science projects.
    • Skilled in deploying cloud computing platforms on AWS, Azure, and GCP for big data projects.
    • Completed a 12-month Executive Post Graduate Program in Data Science at IIITB, Bangalore; pursuing an MS in Data Science at LJMU, UK.
    • Improved ETL run time from 18 hours to 4 hours for a global semiconductor client.
    • Expertise in SQL performance tuning of large and highly complex enterprise data loads, improving ETL run times from 30 hours to 6 hours.
    • Extensively involved in DR site setup, DR testing, data validation, and UAT for highly complex data integration and analytics projects for a large oil and gas customer.
    • Well versed with Agile project management tools (Jira) for complex data integration and analytics projects for global customers.
    • Architected a cloud-based IoT solution for one of the largest Indian machine tool manufacturing groups for remote monitoring of machines at customer locations across India.
    • As Engineering Manager, led a team of 10+ developers responsible for product development across multiple releases of the Oracle Agile PLM Analytics (OPLA) solution on the Oracle analytics platforms ODI and OBIEE, and supported deployment at large Fortune 500 customers in the high-tech, medical device, and CPG sectors.
    • Led design and development of a large banking data warehouse project spanning multiple years of development, maintenance, and support.
    • In-depth knowledge of RDBMS systems (Oracle, Microsoft SQL Server, PostgreSQL) and performance tuning of complex database systems.
    • Worked with customers, project teams, and stakeholders to manage scope and deliverables.
    Agile Project Management
    Apache Spark MLlib
    Hive
    Apache HBase
    HDFS
    Amazon Redshift
    PySpark
    Apache Kafka
    Oracle OBIEE Plus
    Oracle Data Integrator
    Python
    Oracle PLSQL
    Oracle Database
  • $35 hourly
    I am a dedicated Data Engineer with over 2 years of hands-on experience specializing in Python, AWS services, data pipeline construction, and ETL processes. Throughout my professional journey, I've adeptly navigated various tech stacks, refining skills integral to big data, data warehousing, and cloud computing environments.
    What I Do:
    - Chatbots: Created a healthcare chatbot from scratch using the Rasa framework.
    - Machine Learning: Hands-on experience developing and deploying machine learning models for ASR, TTS, face recognition, and object detection.
    - Web Development: Proficient in back-end development; have created multiple web applications using the Flask, Django, and FastAPI frameworks.
    - Data Pipelines: Skilled in real-time data pipeline architecture, primarily using Apache Kafka and PySpark for seamless data streaming.
    - Database Management: Experience working with MySQL, MongoDB, PostgreSQL, and big data technologies.
    - Data Visualization: Utilize tools like Tableau and Power BI.
    REST API
    Big Data
    NLP Tokenization
    GPT-4
    Rasa
    Amazon Athena
    Amazon Redshift
    Amazon S3
    MongoDB
    AWS Glue
    Apache Spark
    PySpark
    Apache Kafka
    SQL
    Python
  • $35 hourly
    SUMMARY
    I am a Senior Data Engineer with 7 years of experience solving complex architectural and scalability challenges in the healthcare and finance sectors. I am presently involved in building a robust, scalable, and fault-tolerant architecture for managing sales data.
    Snowflake
    Amazon Redshift
    Amazon EC2
    AWS Lambda
    AWS Glue
    Amazon Athena
    Amazon S3
    PySpark
    Python
    SQL
    Databricks Platform
    Data Warehousing & ETL Software
    ETL
  • $100 hourly
    I am currently a Principal Consultant in Data Science at Evalueserve, with around 9 years of analytics work experience in the CPG and retail domains. I have worked with multiple retail business groups, spanning customer insights, finance, pricing (markdown), demand planning, and supply chain teams. I have worked extensively with R, Python, PySpark, Excel, and SQL databases. Skilled in building propensity model solutions, markdown/pricing solutions, creating ADS (analytical datasets) for insight generation, demand forecasting, statistical modeling, descriptive reporting, and client/project management.
    Machine Learning
    Inventory Management
    PySpark
    SQL
    Microsoft Excel
    R
    Python
    Exploratory Data Analysis
    Markdown
    Time Series Forecasting
    Forecasting
    Demand Planning
    Price Optimization
    Retail
    Data Science
  • $49 hourly
    Hello! I'm a certified Data Engineer with expertise in Big Data, ETL, and major data engineering technologies. I have 6 years of industry experience and currently work as a Senior Data Engineer, which has exposed me to a wide range of data engineering problems. Get in touch to discuss your needs.
    Amazon Web Services
    Microsoft Azure
    Google Cloud Platform
    pandas
    PySpark
    ETL
    Cloud Engineering
    SQL
    Big Data
    Python
  • $50 hourly
    As a seasoned Data Engineer, I specialize in crafting robust data solutions tailored to meet your business needs. Whether you're looking to optimize data processing, integrate machine learning models, or enhance decision-making capabilities, I've got you covered.
    Key skills and services I offer:
    - Expertise in ETL processes and database design
    - Integration of machine learning models into data workflows
    - Deployment and management of data solutions on any cloud platform
    - Full project management from conceptualization to delivery
    - Clear and regular communication throughout the project lifecycle
    Let's collaborate to unlock the full potential of your data assets and drive business growth. Reach out to discuss how I can contribute to your data success.
    AWS Glue
    ETL
    Automation
    PySpark
    Python
    Database
    Database Management System
    Big Data
  • $30 hourly
    As a dedicated Data Scientist at Citicorp, I bring an unwavering commitment to excellence and a track record of delivering impactful solutions. With a solid foundation from IIT Kharagpur, encompassing a Bachelor's degree in Civil Engineering, a Master's degree in AI, and a minor in Computer Science, I am confident in my ability to tackle complex data challenges.
    Skills:
    - Proficient in the Python, C/C++, and SAS programming languages
    - Experienced in PySpark, PostgreSQL, Flask, Selenium, and Tableau
    - Expertise in machine learning, deep learning, natural language processing, and data analysis
    - Skilled in developing NLP models, deep learning pipelines, and machine learning applications
    - Capable of end-to-end ML modeling pipelines, encompassing data cleaning, model building, and deployment
    - Strong in deriving actionable insights from extensive datasets and presenting them through compelling visualizations and reports
    - Adept at statistical analysis and hypothesis testing
    - Experienced in data cleaning, preprocessing, and feature engineering techniques
    - Proficient in database management and SQL querying
    - Familiar with data visualization libraries such as Matplotlib and Seaborn, and software like Tableau and Excel
    - Knowledgeable in time series analysis and forecasting methods
    - Experienced with A/B testing methodologies and experimentation design
    - Skilled in data storytelling and effective communication of insights with MS PowerPoint
    Experience:
    - Developed cutting-edge NLP models for semantic similarity evaluation
    - Created robust deep learning pipelines for crowd counting applications
    - Applied machine learning techniques to accurately estimate soil attributes
    - Spearheaded a Master's thesis focused on building a chatbot leveraging large language models with minimal data
    - Demonstrated proficiency in building and deploying ML models as APIs or websites
    Extracurricular:
    - Founded and led a cohort-based classroom on my campus to teach juniors about data science, receiving excellent reviews for my mentorship and teaching abilities
    - Actively engaged in organizing workshops and mentoring programs, showcasing my leadership and organizational prowess
    Driven by a passion for data science and armed with a diverse skill set, I am ready to take on the most challenging data analysis tasks and deliver results that exceed expectations.
    Microsoft PowerPoint
    PySpark
    Data Analysis
    Deep Learning
    Microsoft Excel
    Tableau
    Selenium WebDriver
    Python
    Flask
    Chatbot Development
    Natural Language Processing
    Statistical Analysis
    Machine Learning
  • $150 hourly
    Career Summary: Results-oriented Big Data Cloud Engineer with 11+ years of experience in AWS Cloud, Snowflake, Databricks, PySpark, Python, Airflow, ETL/ELT tools, and big data platforms like Hadoop and Snowflake, as well as traditional platforms (Oracle, Sybase, SQL Server). A motivated team player, well acquainted with the Agile model of delivery, dedicated to streamlining project issues with patience and perseverance, and willing to drive projects and take ownership.
    Technical Summary:
    * Expertise in handling complex ingestion pipelines using AWS/Snowflake, PySpark, ETL tools, AWS Glue, Lambda, and AWS S3.
    * Expertise in building orchestration pipelines using Kafka and ActiveMQ.
    * Proficiency in designing resilient, robust, and scalable data warehousing solutions using HDFS, Hive, and ETL tools (Informatica BDM 10.x, BDE 9.6.1). Expertise in handling complex data migration projects as well as designing new warehouses.
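The Glue/Lambda/S3 ingestion pattern mentioned above typically starts with a function triggered by an S3 event. A minimal Lambda-style sketch in plain Python (stdlib only, no AWS SDK calls; the bucket and key names are hypothetical, but the event parsing follows the standard S3 event payload shape):

```python
import urllib.parse

# Minimal S3-triggered ingestion entry point (Lambda-style handler).
# It only parses the event; a real handler would go on to read the object.
def handler(event, context=None):
    """Return (bucket, key) pairs for every object in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = urllib.parse.unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects

event = {"Records": [{"s3": {"bucket": {"name": "raw-data"},
                             "object": {"key": "incoming/file+1.csv"}}}]}
print(handler(event))  # → [('raw-data', 'incoming/file 1.csv')]
```

From here, a pipeline would hand each (bucket, key) pair to a Glue job or a PySpark read for the actual transformation.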
    Apache Hadoop
    SQL Programming
    PySpark
    Microsoft Azure
    AWS Development
    Databricks Platform
    Data Warehousing & ETL Software
    Python Script
    Python
    Snowflake
  • $50 hourly
    I am a full-stack developer with experience in AWS services, Python, React.js, and Next.js. Experienced in building web apps as well as large-scale, data-oriented solutions in AWS environments.
    AWS Fargate
    AWS Glue
    PySpark
    Python
    React
    Next.js
    JavaScript
    AWS Lambda
    AWS Amplify
  • $60 hourly
    I am an experienced Quality Assurance Automation Engineer specializing in Data Analytics. I excel at constructing automation frameworks tailored to diverse environments, both traditional and Big Data, encompassing cloud applications and server-side functionality. Proficient in Python, AWS, PySpark, SQL, and REST API testing, I bring a comprehensive understanding of these technologies to designing and implementing automated testing solutions. My versatility enables me to adapt to varied project requirements and deliver high-quality results consistently.
    QA Automation
    PySpark
    SQL
    AWS CloudFormation
    Python
  • $60 hourly
    Data Engineer. I am a results-driven professional with 4.8 years of experience in business intelligence, data analytics, and software development. I have strong experience analyzing and visualizing data, and I am seeking a challenging role where I can apply my expertise in data analysis, statistical modeling, and visualization to an organization's success. I have good communication skills and experience handling clients and solving problems, with the ability to work in fast-paced environments and deliver results under tight deadlines.
    Hive
    Apache Hadoop
    MongoDB
    Snowflake
    BigQuery
    Amazon Redshift
    AWS Glue
    Google Cloud Platform
    SQL
    Python
    PySpark
    Apache Airflow
  • $100 hourly
    I'm a senior Data Architect experienced in solving, designing, and building products and solutions on both on-premise and cloud infrastructure.

    Areas of expertise:
    1. Databases: Oracle, SQL Server, Redshift, NoSQL (MongoDB, DynamoDB)
    2. Data lakes: Hadoop, Amazon S3/Athena/Glue
    3. Programming: Spark, Python, .NET
    PySpark
    Oracle
    Microsoft SQL Server
    MongoDB
    Apache Hadoop
    Spring Boot
    .NET Core
    AWS Application
    Microsoft Azure
  • $30 hourly
    I am a Data Scientist with more than three years of professional experience. The following are the areas I specialize in:
    ✅ Data cleaning and manipulation
    ✅ Analysing data to drive key business insights
    ✅ Visualizing data through charts and infographics
    ✅ Creating appealing dashboards
    ✅ Building machine learning models (linear or non-linear) for key predictions on data based on relevant performance metrics (using sklearn as the primary library)

    TOOLS AND TECHNOLOGIES
    ✅ SQL
    ✅ Data Analysis and Development - Python, Python libraries (numpy, pandas, sklearn, etc.), Git
    ✅ Web Analytics - Google Analytics
    ✅ Dashboarding - Power BI, DAX, Streamlit
    ✅ Big Data - PySpark, Databricks Notebooks

    CERTIFICATIONS
    ✅ Microsoft Certified Data Analyst Associate
    ✅ Google Analytics Individual Qualification

    I understand each project has its own nuances and may require learning on the job, which I am very much open to. I prioritize client satisfaction and have a 100% job completion rate on Upwork, with reviews like:

    ⭐ "I feel like 5 stars is very less as a benchmark for Piyush. He is absolutely exceptional in his knowledge of Python, professionalism, Critical thinking, timeliness, sense of urgency of deliverables, requirement understanding with minimal client time and going above & beyond to compete the task. I am so glad I was able to find him for my understanding of Python concepts. He is such a good teacher and make sure you get it. I would recommend him to everyone for any task."

    ⭐ "Piyush delivered great work and submitted it in advance. He communicates promptly and clearly. Piyush demonstrated a solid knowledge of data science topics and great attention to detail. In addition, he pointed out an issue in the dataset, which he handled perfectly and which we were then able to fix thanks to him. It was a pleasure working with Piyush, and we have already hired him again for a similar job."
    Statistics
    Microsoft Excel
    Data Analysis
    Data Mining
    Analytics
    Dashboard
    Microsoft Power BI
    PySpark
    Data Visualization
    Technical Writing
    SQL
    Data Science
    Machine Learning
    Python

How hiring on Upwork works

1. Post a job (it’s free)

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.


How do I hire a Pyspark Developer near Bengaluru on Upwork?

You can hire a Pyspark Developer near Bengaluru on Upwork in four simple steps:

  • Create a job post tailored to your Pyspark Developer project scope. We’ll walk you through the process step by step.
  • Browse top Pyspark Developer talent on Upwork and invite them to your project.
  • Once the proposals start flowing in, create a shortlist of top Pyspark Developer profiles and interview.
  • Hire the right Pyspark Developer for your project from Upwork, the world’s largest work marketplace.

At Upwork, we believe talent staffing should be easy.

How much does it cost to hire a Pyspark Developer?

Rates charged by Pyspark Developers on Upwork can vary with a number of factors including experience, location, and market conditions. See hourly rates for in-demand skills on Upwork.

Why hire a Pyspark Developer near Bengaluru on Upwork?

As the world’s work marketplace, we connect highly skilled freelance Pyspark Developers and businesses and help them build trusted, long-term relationships so they can achieve more together. Let us help you build the dream Pyspark Developer team you need to succeed.

Can I hire a Pyspark Developer near Bengaluru within 24 hours on Upwork?

Depending on availability and the quality of your job post, it’s entirely possible to sign up for Upwork and receive Pyspark Developer proposals within 24 hours of posting a job description.