Hire the best PySpark Developers in Bengaluru, IN

Check out PySpark Developers in Bengaluru, IN with the skills you need for your next job.
  • $90 hourly
    I'm a software architect and data engineering expert with over 10 years of proven mastery in building scalable, secure, and high-performance backend and data-driven systems. My deep specialization includes robust data pipelines, sophisticated backend architectures, and strategic cloud solutions. I've successfully led and delivered critical technology initiatives for global industry leaders such as Goldman Sachs, Morgan Stanley, KPMG, and Oracle.
    Key Competencies:
    - Data Engineering & Databases: SQL across Oracle, Postgres, MySQL, Redshift, SQL Server, Snowflake, and BigQuery; PySpark, Meltano, data warehousing, query optimization, Kafka, Spark, Airflow, dbt.
    - Cloud Platforms: AWS, GCP, Azure, OCI, Kubernetes, Docker, Terraform, CI/CD automation.
    - Backend Development: Java, Python, modern web frameworks, API integrations.
    Strategic & Technical Partnership: I go beyond simply delivering code. I partner with you to deeply understand your business challenges, advise on industry best practices, and implement tailored, future-proof solutions that optimize operations, reduce costs, and accelerate growth.
    Client Feedback:
    “Amar consistently exceeds expectations—he doesn't just deliver technical solutions, he delivers strategic insights that drive real business results.”
    “Highly intelligent and experienced individual with deep knowledge across data engineering.”
    “Top performer—will be working with Amar long time I hope!”
    If you're looking for a seasoned expert who combines technical mastery and strategic vision, let's connect to discuss how I can help you build impactful, business-driven solutions.
    Featured Skill: PySpark
    API Development
    Flask
    Google App Engine
    Software Development
    Big Data
    Google Cloud Platform
    Amazon Web Services
    BigQuery
    PySpark
    Apache Airflow
    Apache Spark
    Data Engineering
    SQL
    Python
    Java
  • $45 hourly
    👋 Hi! I'm a Senior Data Engineer with over 5 years of hands-on experience in SaaS development, data engineering, backend, and automation. Having successfully delivered 35+ diverse projects on Upwork and other platforms, I'm confident in my ability to deliver holistic solutions in software, data, and DevOps that not only meet but exceed your expectations, ensuring 100% client satisfaction and remarkable results.
    ⭐ Data Engineering Expertise ⭐
    ✅ Custom application development using Python, Flask, Django
    ✅ Data analysis tools such as Pandas, NumPy
    ✅ MongoDB, MySQL, PostgreSQL, SQL Server
    ✅ Automated web scraping with Requests, Beautiful Soup, Selenium
    ✅ Data warehousing with Snowflake, Google BigQuery, AWS Redshift
    ✅ End-to-end ETL/ELT data pipeline design and implementation
    ✅ Database development & modeling
    ✅ Automation & orchestration using Apache Airflow, Airtable, Make.com, Power Apps
    ✅ Data transformation using dbt (Data Build Tool)
    ✅ Generative AI with OpenAI, ChatGPT, Google Vertex AI, Claude
    ✅ The full Software Development Life Cycle (SDLC)
    ⭐ DevOps Solutions ⭐
    ✅ Containerization & orchestration with Docker and Kubernetes
    ✅ Infrastructure automation using Terraform
    ✅ Workflow management in GitHub, Jira
    ✅ CI/CD process development using Jenkins, GitHub Actions
    ✅ Cloud platforms such as AWS, GCP, Azure, and DigitalOcean
    ✅ Code quality assurance with SonarQube
    🤝 Previous Partnerships 🤝
    ⚡ Fostered deep strategic collaborations with a diverse range of startup founders and CXOs, offering comprehensive support and guidance in product development, backend architecture, infrastructure planning, AI integration, and data management solutions.
    ⚡ Contributed significantly to strategic planning, technical problem-solving, and innovation to fulfill advanced business requirements.
    ⚡ Developed a keen understanding of executive-level challenges, providing customized and effective business strategies.
    Ready to elevate your project? Let's discuss how we can work together to achieve your vision! 🚀
    Featured Skill: PySpark
    Amazon Web Services
    Microsoft Azure
    Databricks Platform
    PySpark
    Google Cloud Platform
    dbt
    API Development
    Airtable
    Automation
    Apache Airflow
    ETL Pipeline
    Data Engineering
    Python
    SQL
    Database Architecture
  • $35 hourly
    ☑️ Quality-focused, end-to-end ☁️ cloud solutions & consulting 🔒 architecture done right the first time. 🎯 20+ years of professional experience, with multiple projects delivered for Fortune 500 companies and deep expertise in the healthcare domain. ☑️
    📞 Invite me to your job and we can book a complimentary 30-minute consultation that's earnestly helpful. 📞
    Microsoft Fabric | Azure Functions | Azure Synapse | Azure Data Factory | Azure Key Vault | Azure Analysis Services | Azure SQL Database | Azure OpenAI | Azure Sentiment Analysis | Azure Large Language Models | Azure Logic Apps | Azure Active Directory (Entra) | Azure Data Lake Storage | Azure Blob Storage | Azure Delta Lake
    Power BI | DAX | DAX Studio | Tabular Editor | Performance Analyzer | SQL Profiler
    You can hire someone cheaper who will take five times longer to complete the job correctly. Alternatively, you can hire me, an expert, to efficiently deliver the right solution while you relax. Consider me your cloud-solution Advil: I consistently alleviate headaches rather than create them.
    ✅ What I do:
    - Take your raw data, process it, transform it, and present it in a meaningful, reproducible way, while keeping you aware of what's happening (in plain language, if you're a non-tech person) and working together with your team (if you already have a tech structure) on every step of the process, from data extraction to final presentation.
    - I enjoy solving complex data issues with my skill set. My passion is data readiness and making data "work" for my clients.
    - Performance tuning is what I love to do. I have worked on tables with over a billion records and am an expert in table partitioning, indexing, and related techniques, including processing and transforming terabyte-scale streaming data for the gold layer of a data lake.
    ✅ Services offered:
    - Data analytics
    - Power Query programming
    - Dashboard design and implementation
    - Data extraction
    - Database architecture
    - End-to-end BI solution design
    - Performance tuning
    - Table partitioning
    - Automation of almost everything in a BI solution
    ✅ Programming languages: Python, PowerShell, Batch scripting, DAX, MDX
    ✅ Databases: Microsoft SQL Server (2008 to 2019), Azure SQL Server, Azure Data Warehouse, Cosmos DB, MySQL, PostgreSQL, Oracle, MongoDB
    ✅ ETL: SSIS (SQL Server Integration Services), ADF (Azure Data Factory), ADB (Azure Databricks), Airflow, Alteryx, Tableau Prep Builder
    ✅ Reporting: Microsoft Power BI, Google Data Studio, Tableau
    ✅ Analytics: SSAS (multidimensional and tabular cubes), Azure Databricks, Azure Analysis Services, dbt
    ✅ Solutions for business: Manufacturing, Human Resources, Healthcare, Marketing, Project Management, Sales, Finance, Media and Advertisement
    Featured Skill: PySpark
    Snowflake
    PySpark
    MLflow
    MLOps
    Terraform
    Cloud Implementation
    Cloud Computing
    Cloud Database
    Cloud Architecture
    Data Center Migration
    Data Analytics & Visualization Software
    Data Analysis Consultation
    Data Access Layer
    Microsoft Power BI Data Visualization
    Microsoft Azure
  • $70 hourly
    Data Science professional blending Machine Learning and Data Engineering expertise to develop AI+ solutions that address business problems on both edge and cloud. Adept at maintaining productive client relationships.
    Featured Skill: PySpark
    PySpark
    MLOps
    Computer Vision
    Object Detection
    Llama 3
    LLM Prompt Engineering
    Generative AI
    Python
    Deep Learning
    Artificial Intelligence
    Machine Learning
  • $8 hourly
    Are you ready to take your business to new heights with data-driven decision-making? Look no further! As an accomplished Data Analyst with a track record of success, I bring two years of hands-on experience in collecting and analyzing data across various business missions. My passion lies in uncovering the hidden gems within structured and unstructured data, empowering businesses to make informed decisions and solve complex problems effectively.
    With my strong analytical mindset and meticulous attention to detail, I have the expertise to transform raw data into actionable insights that can drive your business forward. Whether you need assistance in data extraction from diverse sources or conducting complex data transformations, I am well-equipped to deliver results that matter. As a data enthusiast, I thrive on translating intricate data into simplified visualizations and reports, ensuring that you and your stakeholders grasp the crucial information effortlessly.
    My educational background is grounded in excellence: I hold a Master's degree in Data Science from the prestigious LJMU (Liverpool John Moores University), England. This solid foundation has equipped me with the skills and qualifications to provide your business with accurate, timely, and reliable data delivery, metrics, reporting, and analysis, all tailored to support your strategic goals.
    Working with me means having a partner who is not only passionate about data but also understands the significance of aligning insights with your business objectives. I take pride in my ability to adapt and learn swiftly, enabling me to stay ahead of industry trends and provide cutting-edge solutions for your business.
    Let's embark on a transformative journey together, where data will be the driving force behind your success. Whether you need a data-driven strategy, in-depth market analysis, or insights to optimize your operations, I'm here to make it happen. Reach out now, and let's start leveraging the power of data to make your business thrive!
    Featured Skill: PySpark
    Oracle
    Data Scraping
    Data Engineering
    Data Analysis
    PySpark
    Matplotlib
    Flask
    SQL
    pandas
    Data Science
    Tableau
    Machine Learning
    Keras
    Python
  • $60 hourly
    Expert Data Engineer and Certified Generalist Software Engineer
    Languages:
    - Expert: Python | SQL
    - Intermediate: JavaScript | Java | Shell script (Bash) | Solidity
    - Syntax knowledge: C | C++
    Big Data stack: Apache Spark/PySpark | Apache NiFi | Apache Kafka | Apache Flink | dbt
    Blockchain stack: Solidity | Web3j | Chainlink | Moralis | StackOS | IPFS
    Chains: Ethereum, Polygon (any EVM-compatible chain: BSC, Polkadot, Avalanche, etc.). Create ERC-20 tokens and ERC-721/1155 NFTs, store NFTs/metadata on Filecoin/IPFS, build custom smart contracts.
    Frontend frameworks: Vue.js | Bootstrap | jQuery
    Backend frameworks: Flask | Express.js | PHP | Spring Boot
    Cloud infrastructure: AWS (S3, EC2, EMR, Redshift, SQS, Glue)
    Databases: PostgreSQL | Redis | Redshift
    Deployment: Docker | Docker Compose | Kubernetes (K8s, amateur)
    Schedulers: Azkaban | Airflow
    Skills: Web scraping | ETL | ELT | Data warehousing | Data mining | Full-stack web development | REST APIs | Data wrangling
    Misc: Discord | Binance API | Selenium | Metabase
    Python packages: BeautifulSoup, Requests, Selenium, PySpark, PyFlink, pandas, scikit-learn, etc.
    Featured Skill: PySpark
    Algorithm Development
    PostgreSQL
    Flask
    Cryptocurrency
    Amazon Redshift
    Redis
    PySpark
    Apache Kafka
    Apache NiFi
    Linux
    SQL
    Python
  • $10 hourly
    "10% of all bottom-line earnings from your work will go to Save the Children, The Akshaya Patra Foundation(India) or a charity of your choice" - Making a difference, one step at a time! I have completed my education from IIT Kharagpur, India. I am having a keen interest in data science practical uses. Therefore, patented a deep learning based solution framework for a banking solution. Also, I have published 2 research papers in renowned journal in the field of economics and quantitative finance. I will squeeze every ounce of insight from your data and deliver an actionable product. From data engineer to data scientist, to MLOPs and quantitative finance, my skillset is wide and I like to work fast to help you move from data to insight to action and finally to a product. I work the full gamut of analytics from data extraction and clearing to data analysis, to visual, to model development and deployment. My skills include : - Quantitative Finance, Statistics, Stochastic processes - Python, R, SAS, SQL - PyTorch, TensorFlow, Keras, Scikit-learn - Google Cloud Platform (GCP) - ETL Process, Exploratory Data Analysis, - Machine Learning, Deep Learning - Natural Language Processing (NLP) - Model Development, Monitoring, and Deployment - Automation and Web Scraping - Trading Strategy, Valuations, Portfolio Optimization - Research and Patents in Data Science and Economics - Banking Domain and Digital Marketing Expertise Eager to collaborate with you and I am confident that our partnership will be both productive and impactful!
    Featured Skill: PySpark
    PySpark
    Git
    Google Cloud Platform
    Jupyter Notebook
    Linear Programming
    Cloud Computing
    Deep Learning
    TensorFlow
    Machine Learning Model
    Python
    C++
    pandas
    Machine Learning
    Keras
    Deep Neural Network
  • $30 hourly
    As an experienced Data Engineer with over 3 years of expertise in the entertainment and banking sectors, I bring a strong background in data engineering and analytics. I hold a Bachelor's in Electronics and Communication Engineering from SRM University and have developed proficiency in a range of technologies, including Python, SQL, Power BI, and big data technologies such as Hadoop, Sqoop, and Hive. My skills extend to PySpark, Kafka, Azure Data Factory, Azure Databricks, and AWS Glue. I specialize in developing robust data pipelines across various platforms, including Databricks, Data Factory, Informatica, DataStage, and Glue. Additionally, I have extensive experience creating detailed Power BI reports, particularly focused on financial data analysis. Known for my punctuality and commitment to meeting deadlines, I am dedicated to ensuring client satisfaction in every project I undertake. Thank you for considering my profile!
    Featured Skill: PySpark
    Apache Hadoop
    Hive
    Python
    PySpark
    Microsoft Azure
    Microsoft Power BI
    Business Intelligence
    Apache Kafka
    Amazon S3
    RESTful API
    Databricks Platform
    Amazon EC2
    Database Management System
    Data Ingestion
    Data Lake
  • $10 hourly
    - 8.9 years of experience as a developer, designer, and analyst in the BI space (Data & Analytics), including 5.5 years with Azure, Databricks, Spark, and the big data stack
    - Microsoft Certified Azure Data Engineer and Databricks Certified Data Engineer
    - Key skills include Spark, Azure Data Factory, Azure Data Lake, Databricks, Logic Apps, T-SQL, Informatica PowerCenter, Oracle PL/SQL, PySpark, and Function Apps
    - Rich experience delivering excellence in the Agile (Scrum) methodology using JIRA and Azure DevOps
    - Performed various customer-facing roles in India and London, including requirement gathering, analysis, design, development, driving consensus among stakeholders, and ensuring delivery on schedule
    - Holds a B.Tech in Computer Science & Engineering with first grade (2014)
    Featured Skill: PySpark
    Client Interview
    Candidate Interviewing
    Unix
    Data Modeling
    Data Analytics
    ETL
    Microsoft Azure
    Databricks Platform
    PySpark
    Apache Spark
    Data Engineering
    Informatica
    SQL
    Python
  • $25 hourly
    With a decade-long background in backend and data engineering across diverse domains, I offer extensive expertise in managing large-scale data and building robust data systems.
    Featured Skill: PySpark
    Terraform
    Docker
    Kubernetes
    PySpark
    Django
    FastAPI
    Python
    RabbitMQ
    Apache Kafka
    BigQuery
    ETL
    Google Cloud Platform
    MLOps
    ETL Pipeline
  • $18 hourly
    - Data Scientist with 8 years of experience solving business problems using advanced analytical and machine learning approaches
    - Working experience across multiple domains, including BFSI, CPG & FMCG, retail, travel and hospitality, and telecom, both as an in-house expert and as a consultant
    - Employed as a data science consultant at MNCs including PwC, ITC, and HSBC
    - End-to-end handling of data science projects: requirement gathering, design, execution, deployment, and monitoring
    - Hands-on experience with software such as R, Python, PySpark, Tableau, Alteryx, MS Excel, MySQL, and MS SQL; proficient in coding in R, Python, PySpark, and SAS
    - Detailed knowledge of implementing AI & ML algorithms (regression, GAM, random forest, SVM, deep learning techniques, boosting algorithms such as GBM and XGB, unsupervised learning, hypothesis testing) in real-life problems to provide business-driven solutions, along with model validation
    - In-depth statistical analysis, with knowledge of econometrics and time series forecasting
    - Day-to-day client management, with requirement gathering, project delivery, and project management skills
    - Basic knowledge of model deployment and graph theory
    - Repository maintenance in GitHub
    Featured Skill: PySpark
    Econometrics
    Biostatistics
    Project Management
    Statistical Analysis
    Model Validation
    PySpark
    Microsoft Excel
    Time Series Analysis
    Model Deployment
    Neural Network
    Machine Learning Model
    Tableau
    SAS
    R
    Python
  • $75 hourly
    Are you seeking a skilled Data Science Engineer to elevate your team's analytics capabilities? Look no further! With a solid background in statistical analysis, Python, R, SQL, and Excel, coupled with extensive experience in trend analysis, I am well-equipped to tackle your data challenges head-on. At Tiger Analytics, I contributed significantly to our data science team, honing my expertise in handling vast datasets typical of large companies. My proficiency extends to essential technologies such as DBT, Airflow, Snowflake, and Jenkins, ensuring seamless integration and efficient data processing workflows. In addition to my technical prowess, I bring a meticulous approach to project management, ensuring that tasks are executed flawlessly from inception to completion. Regular communication is paramount to me, as it fosters transparency and ensures alignment with your objectives. Let's collaborate to unlock actionable insights from your data and drive your business forward. Reach out, and let's embark on this data-driven journey together.
    Featured Skill: PySpark
    Data Modeling
    Data Extraction
    Data Engineering
    Data Collection
    Data Analysis Expressions
    Data Analytics & Visualization Software
    Data Analytics Framework
    PySpark
    Data Science Consultation
    Data Science
    Data Cleaning
    Data Analytics
    Data Analysis Consultation
    Data Analysis
    Python
  • $15 hourly
    Unlock up to 20% revenue growth, 25% faster decision-making, and a 100% efficiency boost through automation, data-driven insights, predictive models, and real-time solutions tailored to your business. With a Master's in Artificial Intelligence and Machine Learning, I can handle your unique challenges quickly and accurately.
    What can I do for you?
    Consultation: 🎓 With a Master's in AI/ML and 3 years of experience across fintech, real estate, agri-tech, smart city, and big tech (Citi), I deliver high-value, high-ROI solutions. Book a call to discuss your unique data challenges!
    Data Cleaning: 🧹 Expert in Python, Excel, SQL, PySpark, and Dask for error elimination, data standardisation, missing-value management, and integration from diverse sources (CRM, sales, marketing); a minimal PySpark cleaning sketch follows the skill list below.
    Data Processing and Engineering: 🛠️ Skilled in developing efficient ETL pipelines, data normalization, and optimization with scalable cloud solutions (databases, data lakes). Proficient in orchestrating with Apache Spark, Kafka, and Docker.
    Data Visualization: 📊 Advanced in creating interactive dashboards and reports using Tableau, Excel (PivotTables, VBA), and custom Python visualizations (Seaborn, Plotly). Expert in visual storytelling for actionable insights.
    Machine Learning Modeling: 🤖 Proficient in building, validating, and deploying predictive models using regression, classification, clustering, deep learning, NLP, and computer vision. Skilled in Python, PyTorch, and AWS SageMaker.
    Deployment: 🚀 Experienced in containerizing models with Docker, using AWS for cloud deployment, implementing real-time applications, and streamlining updates with CI/CD pipelines. I design APIs with Flask and data-driven websites with Django.
    Monitoring: 📈 Implement comprehensive monitoring with AWS CloudWatch, DataDog, and Prometheus to track performance, detect data drift, and ensure accuracy. Set up automated alerts for efficient issue resolution.
    Automation: 🤖 Automate data pipelines, model training, reporting, and deployment using Apache Airflow and CI/CD pipelines for optimized workflows.
    Business Integration and Communication: 💼 Collaborate with cross-functional teams to translate technical findings into strategic insights through clear communication and presentations.
    What my clients are saying about me:
    ✅ "Manasi's data cleaning expertise in Python and SQL improved our retail customer database accuracy by 95%, leading to a 20% increase in targeted marketing ROI."
    ✅ "Manasi optimized our ETL pipelines in the fintech industry using Apache Spark and Docker, reducing data processing time by 50% and enabling faster financial analysis."
    ✅ "The dashboards Manasi created in Tableau and Python for our healthcare analytics team delivered actionable insights that accelerated decision-making on patient care strategies."
    ✅ "In our e-commerce business, Manasi's predictive models using deep learning and NLP enhanced our sales forecasting and customer segmentation, boosting sales by 15%."
    ✅ "Manasi streamlined our machine learning model deployment in the agri-tech sector with Docker and AWS, ensuring real-time analysis for crop yield predictions."
    ✅ "Manasi's monitoring systems with AWS CloudWatch in our smart city project kept our models accurate, reducing system downtime by 20% and ensuring continuous service delivery."
    ✅ "For our logistics company, Manasi automated data scraping pipelines and reporting using Apache Airflow and Selenium, optimizing delivery route planning and saving operational time by 30%."
    ✅ "Manasi turned complex data insights into clear, strategic recommendations for our real estate development projects, making them a crucial part of our planning team."
    We will be a good fit if:
    ⭐ Long-Term Collaborations: Ideal for clients seeking ongoing data science and analytics partnerships.
    ⭐ Quality-Focused Projects: Suited for clients who value meticulously crafted, high-quality solutions.
    ⭐ Strategic Decision-Makers: Perfect for businesses needing comprehensive, data-driven strategies and support.
    ⭐ Complex Projects: Best for clients with intricate, multi-phase data challenges requiring deep analysis.
    ⭐ Iterative Development: Ideal for those who appreciate the value of perfecting each project phase.
    ⭐ Growing Businesses: Well-suited for companies scaling operations or undergoing digital transformation.
    ⭐ Resource-Invested Clients: A great fit for clients ready to invest in the resources necessary for high-performance data solutions.
    Who am I? With a B.Tech + M.Tech dual degree in Civil Engineering and Artificial Intelligence from IIT Kharagpur and over 3 years of hands-on experience, I specialize in data cleaning, visualization, and machine learning. I deliver high-impact solutions across industries, providing strategic insights that drive business growth. Let's leverage data to achieve your goals: book a call with me or click the Invite button!
    Featured Skill: PySpark
    Microsoft PowerPoint
    PySpark
    Data Analysis
    Deep Learning
    Microsoft Excel
    Tableau
    Selenium WebDriver
    Python
    Flask
    Chatbot Development
    Natural Language Processing
    Statistical Analysis
    Machine Learning
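    As referenced in the Data Cleaning service above, here is a minimal, hypothetical PySpark sketch of that kind of standardisation, deduplication, and missing-value handling; the file, column, and app names are illustrative only, not this freelancer's actual code.
        from pyspark.sql import SparkSession, functions as F

        spark = SparkSession.builder.appName("cleaning-sketch").getOrCreate()

        # Hypothetical CRM export with typical problems: stray whitespace, duplicates, blanks.
        df = spark.read.csv("crm_contacts.csv", header=True, inferSchema=True)

        cleaned = (
            df.withColumn("email", F.lower(F.trim("email")))  # standardise casing and whitespace
              .dropDuplicates(["email"])                      # keep one row per email address
              .na.fill({"country": "unknown"})                # make missing values explicit
              .filter(F.col("email").rlike(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"))  # drop malformed rows
        )

        cleaned.write.mode("overwrite").parquet("clean/contacts/")
    Normalising the email column before deduplicating means "A@B.com" and "a@b.com" collapse into one record, which is usually the intent when merging CRM sources.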
  • $20 hourly
    🥇 Senior Data Scientist with over 5 years of experience ⭐ Industries: CPG, HR, retail, and manufacturing ✅ 100% customer satisfaction
    I am passionate about and experienced in data science and ML engineering, with expertise across a range of technologies. My skills include:
    ✅ Data cleaning, data modeling, EDA, feature engineering, model selection, evaluation, deployment, CI/CD pipelines, LLMs, GenAI APIs, and visualizations
    ✅ Python, SQL, R
    ✅ Supervised and unsupervised learning, regression, classification, recommendation systems, MMM, CNNs, RNNs, NLP, GenAI
    ✅ PySpark, TensorFlow, Databricks, ADF, Scikit-learn, H2O, MLOps
    ✅ GenAI, OpenAI, Gemini, LLMs, OCR, OpenCV, NLP
    ✅ JIRA / Trello / Microsoft
    ✅ VS Code / Jupyter / Git / GitHub
    ✅ Azure / AWS / GCP / IBM Cloud
    MY PROCESS
    🔶 1. Discover: first, I learn about your requirements.
    🔶 2. Strategy: next, I determine the best way to meet the proposed goal.
    🔶 3. Development: I build the end-to-end solution for the problem and test it rigorously.
    🔶 4. Delivery: finally, I package it all up and deliver the solution on time and within budget, incorporating storytelling to deliver impactful business solutions and clear manuals.
    Interested? Let's get on a quick 15-minute free consultation call. Hit the interview button and let's talk! 🙌🏼 If you think I can help you with your project, invite me to your project, and let's make it a success.
    Keywords: data science, ML, machine learning, data cleaning, data modeling, EDA, feature engineering, model selection, evaluation, deployment, CI/CD pipelines, GenAI APIs, visualizations, Python, SQL, R, supervised and unsupervised learning, regression, classification, recommendation systems, time series, MMM, PySpark, TensorFlow, Databricks, ADF, Scikit-learn, H2O, MLOps, GenAI, OpenAI, Gemini, LLMs, OCR, OpenCV, NLP, JIRA, Trello, Microsoft, Git, GitHub, Azure, AWS, GCP, IBM Cloud
    Looking forward to working with you!
    Featured Skill: PySpark
    Microsoft Power BI
    Gemini
    OpenAI API
    Cluster Analysis
    Regression Analysis
    Classification
    PySpark
    Databricks Platform
    Microsoft Azure
    Python
    Generative AI
    Machine Learning
    Data Science
    Machine Learning Model
  • $12 hourly
    With a strong background in IT, I specialize in building robust data pipelines and managing large-scale data projects. I can optimize data workflows, implement ETL processes, and create insightful data visualizations to drive data-driven decision-making. I am currently working as a Data Engineer, focusing on end-to-end project management for data engineering tasks. My work involves data integration and transformation, ensuring data quality and validation, and building and maintaining data lakes and warehouses.
    My expertise includes:
    - Programming languages: proficient in SQL, Python, and PySpark
    - Data warehousing solutions: experienced with Snowflake and Redshift
    - Big data technologies: knowledge of Hadoop and Spark
    - Cloud platforms: familiar with AWS and Azure
    - Azure data services: extensive experience with Azure Data Factory, Azure SQL Database, and Azure Databricks
    Regular communication is important to me, so let's keep in touch!
    Featured Skill: PySpark
    Apache Airflow
    Snowflake
    Data Analytics
    Databricks Platform
    PySpark
    Big Data
    Microsoft Azure
    Cloud Computing
    Apache Spark
    Python
    SQL
    Data Warehousing
    Data Warehousing & ETL Software
    ETL Pipeline
    Data Engineering
  • $40 hourly
    Senior software engineer with more than 7 years of experience across various industries in both the consumer and B2B spaces.
    Featured Skill: PySpark
    Apache Kafka
    Software
    Microservice
    Machine Learning
    AI Agent Development
    MySQL
    Redis
    Kubernetes
    PySpark
    Java
    C++
    Python
  • $100 hourly
    IT Industry Professional / Hadoop Developer / Big Data Engineer
    PROFESSIONAL SNAPSHOT
    * A competent, seasoned professional with around 8 years of IT industry experience, including around 6 years in the Big Data Hadoop ecosystem: HDFS, Spark with Java/Spring Boot, Azure Databricks, PySpark, Spark with Python, YugabyteDB (Cassandra/PostgreSQL-compatible), and MySQL.
    * Good exposure to Kubernetes, used for application deployment.
    * Work closely with the business and analytics teams to gather system requirements.
    * Hands-on experience creating Hive tables and using partitioning and bucketing techniques to improve Hive query performance (see the sketch after the skill list below).
    * Create Spark jobs, and series of jobs, to run and schedule on Databricks with JARs, notebooks, and wheels.
    * Build Spark-Java and PySpark applications in the banking and payments domain.
    * Work on Spark SQL, Spark Streaming, RDDs, DataFrames, and batch processing.
    Featured Skill: PySpark
    Analytical Presentation
    Big Data
    Big Data File Format
    Java
    Data Extraction
    ETL Pipeline
    Microsoft Azure Administration
    Data Analysis
    PySpark
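    For readers unfamiliar with the partitioning and bucketing techniques this profile mentions, here is a minimal, hypothetical PySpark sketch: the database, table, column, and path names are illustrative, and it assumes a Spark session with Hive metastore support.
        from pyspark.sql import SparkSession

        # Hive support registers saved tables in the Hive metastore.
        spark = (
            SparkSession.builder
            .appName("hive-partitioning-sketch")
            .enableHiveSupport()
            .getOrCreate()
        )

        events = spark.read.parquet("s3://my-bucket/raw/events/")  # hypothetical input path

        (
            events.write
            .partitionBy("event_date")        # one directory per date, enabling partition pruning
            .bucketBy(16, "user_id")          # hash rows into 16 buckets to speed joins on user_id
            .sortBy("user_id")
            .mode("overwrite")
            .saveAsTable("analytics.events")  # bucketing requires saveAsTable, not a bare path
        )

        # Filters on the partition column scan only the matching directories.
        spark.sql("SELECT COUNT(*) FROM analytics.events WHERE event_date = '2024-01-01'").show()
    Partitioning prunes whole directories at read time, while bucketing co-locates rows that share a key, which is what makes the join and aggregation speedups the profile describes possible.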
  • $35 hourly
    Oracle (OCP) & Cloud Data Engineering (Azure, GCP) Certified Professional with over 20 years of diverse IT experience. Expertise in team management, application and database development, performance tuning, data modelling, and data engineering. Proficient in data architecture, data warehousing/ETL, data integration, and service delivery, with a strong focus on delivering high-quality solutions and enhancing operational efficiency. My domain expertise spans the banking, logistics, manufacturing, healthcare, information security, and fund management sectors. Collaborated with clients onsite in Sweden for one month, the UK for two weeks, and Bhutan for ten months.
    Skills
    • Databases: Oracle, SQL Server
    • Database programming & scripting languages: SQL, Oracle PL/SQL, SQL Server T-SQL
    • Data warehousing & ETL tools: Microsoft SSIS, Talend
    • Programming languages: Python, Scala, Spark SQL
    • APIs & environments: PySpark, Apache Spark
    • Cloud environment: Azure
    • Cloud technologies: Azure Storage Explorer, Azure Data Factory, Azure Databricks
    • Version control: Azure DevOps, Git, Bitbucket
    • Design & development tools: Oracle Forms
    • Reporting and visualization tools: Oracle Reports, Power BI
    • IDE: Visual Studio
    • Operating systems: Windows, Unix
    Core Competencies:
    • Extraordinary technical and functional skills in the database programming area
    • Proficient in ETL, data warehousing (DWH), data modelling, data pipelines, and data architecture
    • Proficient in analyzing existing systems and developing strategies for improvement
    • Exceptional functional and technical skills in the data engineering area
    • Proficient in performance tuning and optimization of databases and applications, consistently reducing both time and costs by at least 25%
    • Well versed in managing and mentoring a team
    • Excellent communication skills with clients and various stakeholders
    • Experienced in operating within DevOps and Agile environments, leveraging tools such as JIRA, ServiceNow, GitHub, and Jenkins
    • Well-versed in all aspects of the Software Development Life Cycle (SDLC)
    • Strong analytical, problem-solving, and troubleshooting skills
    • Strong understanding and implementation of coding standards, code indentation, and code review processes
    Featured Skill: PySpark
    Data Modeling
    Database Development
    Database Architecture
    Azure DevOps
    Python
    PySpark
    Data Lake
    Oracle Reports
    Oracle Forms
    Microsoft Azure SQL Database
    Transact-SQL
    Oracle PLSQL
    Data Warehousing & ETL Software
    ETL
    Data Extraction
  • $65 hourly
    Experienced Backend Developer & Educator | AWS | Python | FastAPI | ETL | Snowflake
    Professional Summary: With over 10 years of experience as a backend developer, I specialize in building scalable APIs, designing efficient ETL pipelines, and developing cloud-based solutions using Python, AWS, and modern data engineering tools. I've collaborated with leading enterprises such as Accenture, Cisco, and Conduent, as well as the innovative startup DgCrux Technology, delivering high-performance backend systems and automation solutions. In addition to my development work, I have four years of teaching experience with Besant Technologies and Bytebits Technologies, where I instructed professionals and students in Python, Flask, Django, advanced Python, and database management.
    Services Offered:
    - API development: design and develop FastAPI/Flask-based APIs, ensuring scalability, security, and high performance.
    - ETL pipelines: build and optimize ETL workflows using Airflow, Snowflake, dbt, and Pandas to handle large-scale data processing.
    - Cloud & serverless architectures: implement AWS services such as Lambda, API Gateway, SNS, EventBridge, DynamoDB, S3, and EFS to create cost-efficient solutions.
    - Infrastructure as Code (IaC): automate cloud deployments using Terraform, Kubernetes, and Docker for scalable and maintainable applications.
    - Database optimization: work with MySQL, PostgreSQL, Snowflake, and SQL Server, ensuring optimized query performance and data security.
    - Technical training & mentorship: provide training sessions and mentorship in Python, Flask, Django, advanced Python, and database management, leveraging my teaching experience to empower teams and individuals.
    Tech Stack:
    - Backend: Python, FastAPI, Flask, Django, Celery, Redis
    - Cloud & DevOps: AWS (Lambda, API Gateway, SNS, DynamoDB, CloudWatch, S3, EFS, EventBridge), Kubernetes, Terraform, Docker
    - Data & ETL: Snowflake, dbt, Apache Airflow, Pandas, SQL (PostgreSQL, MySQL, SQL Server)
    Why Work With Me?
    - Extensive experience: over a decade in backend development and cloud computing, coupled with four years of teaching, demonstrating both practical and instructional expertise.
    - Proven track record: successful delivery of enterprise-grade solutions for industry leaders and innovative startups.
    - Strong communication: adept at understanding business needs and translating them into technical solutions, with a passion for mentoring and knowledge sharing.
    - Commitment to excellence: dedicated to writing efficient, scalable, and well-documented code, and fostering continuous learning environments.
    Let's Collaborate: I'm eager to contribute to your project's success. Feel free to reach out, and let's discuss how my expertise can meet your needs.
    #BackendDevelopment #AWS #Python #FastAPI #ETL #Snowflake #CloudComputing #APIDevelopment #DataEngineering #DevOps #TechnicalTraining #Mentorship
    Featured Skill: PySpark
    Software Architecture & Design
    FastAPI
    Flask
    Django
    Amazon RDS
    Data Engineering
    SQL
    PySpark
    AWS Lambda
    AWS Glue
    Terraform
    Python
    ETL
    Data Extraction
    ETL Pipeline
  • $50 hourly
    With 17.5 years of expertise as a seasoned Data Engineer, I have successfully delivered data-driven solutions across Banking & Finance, Retail, Networking, and Healthcare industries. My technical proficiency spans Google Cloud Platform (GCP) and Amazon Web Services (AWS), leveraging advanced tools such as BigQuery, Dataflow, DataProc, Cloud Composer, Apache Airflow, Cloud Functions, Google Cloud Storage (GCS), Pub/Sub, Data Fusion, Kafka, Spark, and PySpark. Additionally, I specialize in AWS Redshift, AWS Glue, AWS Lambda, and AWS S3, ensuring efficient cloud-based data processing, transformation, and analytics. I am passionate about designing scalable data pipelines, optimizing cloud-based architectures, and enabling real-time data insights to drive business value.
    Featured Skill: PySpark
    Informatica
    PySpark
    BigQuery
    Google Dataflow
    SQL
    Python
    Data Engineering
  • $35 hourly
    With a robust 16-year career in the technology industry, I bring profound expertise in data engineering across platforms such as Azure, GCP, and Snowflake. My proficiency extends to a broad spectrum of tools and disciplines, including ETL, Business Intelligence (BI), Azure, DevOps, automation, SAP, big data, and data warehouse management. My passion lies in harnessing the power of data to drive business growth and decision-making. I leverage my vast experience and deep understanding of data engineering to deliver transformative solutions that align with your organization's objectives. Engage with me to uncover the hidden value in your data, and let's take your business to new heights of success.
    Featured Skill: PySpark
    Microsoft Power BI Development
    BigQuery
    Big Data
    Apache Hive
    Data Warehousing & ETL Software
    Apache Airflow
    PySpark
    Databricks Platform
    Google Cloud Platform
    Microsoft Azure
    Software QA
    Data Analysis
    ETL Pipeline
    ETL
    Data Extraction
  • $50 hourly
    I am a Data Engineer with experience in designing and building end-to-end data pipelines. I have worked extensively with Apache Spark, Hive, and Python, along with cloud-based technologies such as Azure Data Factory and Azure Databricks. My expertise includes writing complex SQL queries, optimizing data workflows, and implementing dimensional modeling techniques to structure and manage large-scale data efficiently. I am skilled in handling diverse data integration challenges and ensuring seamless data processing for analytics and business insights.
    Featured Skill: PySpark
    Data Extraction
    Microsoft Azure SQL Database
    Databricks Platform
    Unix Shell
    Apache Hadoop
    Hive
    PySpark
    Python
    ETL
    ETL Pipeline
  • $50 hourly
    I am a Data Engineer with experience in building scalable data pipelines. Below are my skills and the technologies I have worked with (a minimal Airflow sketch follows the skill list below):
    1. Developing data pipelines using Python and integrating them with Airflow for creating and scheduling DAGs.
    2. Building end-to-end data pipelines on AWS services and scheduling them using AWS Glue jobs.
    3. Proficient in using Spark SQL and the Spark APIs for data transformation and manipulation.
    4. Hands-on experience with AWS components such as S3, AWS Glue, Redshift, Athena, Lambda, Workflows, and Step Functions, along with an understanding of the functionality and use cases of other AWS components.
    5. Hands-on experience with the Snowflake data warehouse and dbt Cloud; proficient in analyzing data using SQL queries through Athena and SQL Workbench.
    6. Experience working with various SDLC methodologies, such as Agile and Scrum, for developing and deploying projects.
    7. Experience working with Jupyter Notebook; skilled in analyzing requirements and architecture specifications to create detailed design documents.
    8. Knowledge of Hadoop MapReduce, HDFS, and Hive concepts.
    Featured Skill: PySpark
    Data Extraction
    CI/CD
    Data Engineering
    Apache Airflow
    Amazon Redshift
    Snowflake
    Apache Hadoop
    Apache Spark
    PostgreSQL
    SQL
    PySpark
    Python
    AWS Glue
    ETL Pipeline
    ETL
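    As a concrete illustration of item 1 in the profile above, here is a minimal Airflow 2.x DAG sketch; the DAG ID, task names, and placeholder callables are hypothetical, not this freelancer's actual pipeline.
        from datetime import datetime

        from airflow import DAG
        from airflow.operators.python import PythonOperator


        def extract_orders(**context):
            # Placeholder: pull the day's records from the source system.
            print("extracting orders for", context["ds"])


        def transform_orders(**context):
            # Placeholder: apply transformations, e.g. by submitting a Spark SQL job.
            print("transforming orders for", context["ds"])


        with DAG(
            dag_id="daily_orders_pipeline",  # hypothetical pipeline name
            start_date=datetime(2024, 1, 1),
            schedule="@daily",               # run once per day (Airflow 2.4+ keyword)
            catchup=False,                   # do not backfill past runs
        ) as dag:
            extract = PythonOperator(task_id="extract", python_callable=extract_orders)
            transform = PythonOperator(task_id="transform", python_callable=transform_orders)

            extract >> transform             # transform runs only after extract succeeds
    The >> operator is how Airflow expresses the task dependency; the scheduler then runs one DAG instance per day and retries or alerts on failures according to the DAG's settings.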
  • $56 hourly
    Experienced, results-driven Big Data Engineer with a proven track record across Hadoop and big data technologies, backed by 4.6 years of hands-on experience in the domain, including expertise in Python, PySpark, Hive, SQL, Kafka, and GCP. Proficient in data ingestion, processing, and analytics, with a demonstrated ability to drive impactful results. Skilled in developing and maintaining big data solutions that meet business objectives, and committed to staying current with emerging trends and technologies in the field.
    Featured Skill: PySpark
    YARN
    HDFS
    Sqoop
    Apache Kafka
    Hive Technology
    BigQuery
    Google Cloud Platform
    SQL
    PySpark
    Python
    Data Analysis
    ETL Pipeline
    ETL
  • $60 hourly
    PROFILE SUMMARY
    Over 8.10 years of experience in the IT industry, with 6+ years specializing in designing, developing, and optimizing data engineering pipelines. Currently working as a Senior Engineer at ZS Associates. Big Data Engineer with extensive experience in the Spark ecosystem; Microsoft Azure certified. Skilled in building scalable ETL pipelines, data processing, and automation using modern big data technologies.
    IT SKILLS
    Proficient in Apache Spark, Python, PySpark, Pandas, Azure Databricks, Azure Data Factory (ADF), Airflow, Snowpark, Snowflake, SQL, Hive, Unity Catalog, big data, Hadoop, and HDFS. Experienced with AWS services such as S3, EC2, EMR, Airflow, and notebooks, and with managing clusters.
    Featured Skill: PySpark
    GitHub
    Database
    SQL
    Python
    Hive
    Snowflake
    Databricks Platform
    Azure App Service
    Apache Hadoop
    Apache Spark
    Apache Airflow
    PySpark
  • $45 hourly
    A data-driven professional with expertise in data engineering, specializing in transforming raw data into meaningful insights. Skilled in modern tools such as Azure Data Factory, Databricks, and Data Lake to design and implement efficient cloud-based data pipelines and solutions. Experienced in handling pharmaceutical data, ensuring high-quality data processing for analytics and decision-making. Passionate about continuous learning and staying updated with the latest advancements in the field, with a strong commitment to developing impactful data solutions.
    Featured Skill: PySpark
    R
    Retail
    Supply Chain Management
    Pharmaceutical Industry
    Python
    SQL
    Azure DevOps
    Data Engineering
    PySpark
    ETL
    ETL Pipeline
    Data Extraction
    Big Data
  • $75 hourly
    Professional Summary
    * 18+ years of overall IT development experience across technology areas including web applications, data, and integrations.
    * Strong execution leadership, with a proven track record of incubating and scaling large data teams by conducting discoveries, defining roadmaps and governance structures, and setting up high-performing teams.
    * Strong consultative approach to steering enterprise-level digital transformations and defining the Data Platform's role in them.
    * Deep expertise in defining data engineering solution and implementation stacks on the cloud (Azure, AWS).
    * Hands-on approach to solving technical problems; experienced in designing frameworks for data engineering teams.
    Featured Skill: PySpark
    Data Model
    Data Transformation
    Data Integration
    Data Ingestion
    Data Warehousing & ETL Software
    Architectural Design
    PySpark
    dbt
    Streamlit
    Microsoft Power BI
    Data Visualization
    AWS Lambda
    Amazon Redshift
    Data Analytics
    Data Engineering

How hiring on Upwork works

1. Post a job

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.


How do I hire a PySpark Developer near Bengaluru on Upwork?

You can hire a PySpark Developer near Bengaluru on Upwork in four simple steps:

  • Create a job post tailored to your PySpark Developer project scope. We’ll walk you through the process step by step.
  • Browse top PySpark Developer talent on Upwork and invite them to your project.
  • Once the proposals start flowing in, create a shortlist of top PySpark Developer profiles and interview.
  • Hire the right PySpark Developer for your project from Upwork, the world’s largest work marketplace.

At Upwork, we believe talent staffing should be easy.

How much does it cost to hire a PySpark Developer?

Rates charged by PySpark Developers on Upwork can vary with a number of factors, including experience, location, and market conditions. See hourly rates for in-demand skills on Upwork.

Why hire a PySpark Developer near Bengaluru on Upwork?

As the world’s work marketplace, we connect highly skilled freelance PySpark Developers with businesses and help them build trusted, long-term relationships so they can achieve more together. Let us help you build the dream PySpark Developer team you need to succeed.

Can I hire a PySpark Developer near Bengaluru within 24 hours on Upwork?

Depending on availability and the quality of your job post, it’s entirely possible to sign up for Upwork and receive PySpark Developer proposals within 24 hours of posting a job description.