Hire the best Pyspark Developers in Pune, IN
Check out Pyspark Developers in Pune, IN with the skills you need for your next job.
- $40 hourly
- 3.1/5
- (10 jobs)
As a Senior Data Engineer with 9 years of extensive experience in data engineering with Python, Spark, Databricks, ETL pipelines, and Azure and AWS services, I develop PySpark scripts and store data in ADLS using Azure Databricks. I have also created data pipelines for reading streaming data from MongoDB and developed Neo4j graphs based on stream-based data. I am well versed in designing and modeling databases using Neo4j and MongoDB. I am seeking a challenging opportunity in a dynamic organization that can enhance my personal and professional growth while enabling me to make valuable contributions towards achieving the company's objectives.
• Utilizing Azure Databricks to develop PySpark scripts and store data in ADLS (a minimal sketch follows below).
• Developing producers and consumers for stream-based data using Azure Event Hub.
• Designing and modeling databases using Neo4j and MongoDB.
• Creating data pipelines for reading streaming data from MongoDB.
• Creating Neo4j graphs based on stream-based data.
• Visualizing data for supply-demand analysis using Power BI.
• Developing data pipelines on Azure to integrate Spark notebooks.
• Developing ADF pipelines for a multi-environment and multi-tenant application.
• Utilizing ADLS and Blob storage to store and retrieve data.
• Proficient in Spark, HDFS, Hive, Python, PySpark, Kafka, SQL, Databricks, Azure, and AWS technologies.
• Utilizing AWS EMR clusters to run Hadoop ecosystem components such as HDFS, Spark, and Hive.
• Experienced in using AWS DynamoDB for data storage and caching data with ElastiCache.
• Involved in data migration projects that move data from SQL and Oracle to AWS S3 or Azure storage.
• Skilled in designing and deploying dynamically scalable, fault-tolerant, and highly available applications on the AWS cloud.
• Executed transformations using Spark and MapReduce, loaded data into HDFS, and used Sqoop to extract data from SQL into HDFS.
• Proficient in working with Azure Data Factory, Azure Data Lake, Azure Databricks, Python, Spark, and PySpark.
• Implemented a cognitive model for telecom data using NLP and a Kafka cluster.
• Competent in big data processing using Hadoop, MapReduce, and HDFS.
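To make the first bullet concrete, here is a minimal, hedged PySpark sketch of the Databricks-to-ADLS pattern described above. It is an illustration, not this engineer's actual code: the storage account, container names, and paths are hypothetical placeholders, and it assumes the Databricks workspace already has credentials configured for the storage account.

```python
# Minimal sketch: read raw JSON, transform with PySpark, and persist to ADLS Gen2.
# The storage account, containers, and abfss:// paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("adls-ingest-example").getOrCreate()

source_path = "abfss://raw@examplestorage.dfs.core.windows.net/events/"
target_path = "abfss://curated@examplestorage.dfs.core.windows.net/events_daily/"

df = spark.read.json(source_path)

# Example transformation: keep valid records and derive a date column.
curated = (
    df.filter(F.col("event_id").isNotNull())
      .withColumn("event_date", F.to_date("event_ts"))
)

# Partition by date so downstream jobs can prune files efficiently.
curated.write.mode("overwrite").partitionBy("event_date").parquet(target_path)
```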
Skills: Microsoft Azure SQL Database, SQL, MongoDB, Data Engineering, Microsoft Azure, Apache Kafka, Apache Hadoop, AWS Glue, PySpark, Databricks Platform, Hive Technology, Apache Spark, Azure Cosmos DB, Apache Hive, Python
- $50 hourly
- 5.0/5
- (3 jobs)
I have hands-on experience developing analytics, machine learning, data science, big data, and AWS solutions.
Skills: Apache Cordova, Cloud Services, Analytics, PySpark, Data Science, Python, Apache Spark, Machine Learning
- $35 hourly
- 5.0/5
- (8 jobs)
Welcome to my profile! I'm a dedicated freelancer specializing in API development, data analytics, and web scraping. With strong expertise in Python, Django, Flask, Docker, Kubernetes, Java, and Angular, I offer a wide range of specialized services to cater to your specific needs.
My Services:
✔ Building robust REST APIs using Flask, Django, Angular, AngularJS, and Java Spring Boot.
✔ Web scraping using Selenium and Beautiful Soup to efficiently extract data from websites (a minimal sketch follows below).
✔ Data visualization using Python, R, Apache Superset, and Kibana (ELK stack) to transform complex data into actionable insights.
✔ Data analysis using Python and R to uncover valuable patterns and trends.
✔ Data modeling using SQL and Hive Query Language (HQL) for structured data stored in Hadoop.
✔ Identifying and formulating Key Performance Indicators (KPIs) tailored to your domain.
✔ Creating visually appealing dashboards to provide real-time, data-driven decision-making capabilities.
✔ Web scraping and basic descriptive statistics on tabular data.
✔ Implementing CI/CD pipelines for efficient software development and deployment.
✔ Leveraging Natural Language Processing (NLP) techniques to extract insights from textual data.
✔ Applying machine learning and data science algorithms to drive predictive analytics.
✔ Data engineering using Spark and PySpark for big data processing.
I have successfully delivered solutions based on data analytics, predictive analytics using various machine learning techniques (especially regression and tree-based models), data visualization, and dashboard creation across diverse industries including retail, marketing, manufacturing, e-commerce, and more.
My expertise also extends to web scraping with Python, where I use powerful tools such as Selenium and Beautiful Soup to efficiently extract data from websites. Whether it's scraping product information, news articles, or any other data source, I can deliver accurate and reliable results to meet your requirements.
I guarantee high-quality work at an affordable price, ensuring 100% accuracy in all my deliverables. Your satisfaction is my top priority, and I am committed to exceeding your expectations.
If you're seeking a reliable and skilled freelancer to elevate your business with cutting-edge APIs, data analytics, visualization solutions, or web scraping capabilities, feel free to reach out. Let's collaborate and achieve your objectives! Contact me today to discuss your project requirements in detail.
Thank you for visiting my profile!
Sameer A
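As a concrete, hedged illustration of the scraping service above (not the freelancer's actual code), a minimal requests + Beautiful Soup sketch might look like the following. The URL and CSS selectors are hypothetical, and JavaScript-rendered pages would call for Selenium instead.

```python
# Minimal scraping sketch: fetch a listing page and extract names and prices.
# The URL and CSS selectors below are hypothetical placeholders.
import requests
from bs4 import BeautifulSoup

url = "https://example.com/products"
resp = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")

rows = []
for card in soup.select("div.product-card"):      # hypothetical selector
    name = card.select_one("h2.title")            # hypothetical selector
    price = card.select_one("span.price")         # hypothetical selector
    if name and price:
        rows.append({"name": name.get_text(strip=True),
                     "price": price.get_text(strip=True)})

print(rows)
```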
Skills: JSON, Ruby on Rails, API, Data Mining, Angular, Android App, Artificial Intelligence, Django, PySpark, Flask, Machine Learning, Data Science, Python, Java
- $40 hourly
- 5.0/5
- (4 jobs)
Selected Achievements:
# Spearheaded the creation of a comprehensive Bill of Materials for the Pharma, MedTech, and Vision Care sectors, leading to a 20% increase in inventory efficiency. This involved detailed analysis and categorization of over 3,000 individual components, ensuring compliance with industry standards and streamlining supply chain processes.
# Developed and optimized supplier volume data for key business partners, enhancing the procurement strategy. This involved aggregating and analysing data from more than 50 suppliers, leading to a 15% reduction in costs and a 10% improvement in supplier delivery times.
# Managed and executed an end-to-end ETL pipeline, integrating data from multiple sources to various destinations. This project involved processing over 5TB of data monthly, resulting in a 30% improvement in data processing efficiency and a 25% reduction in related errors.
# As the lead of the Data Engineering team, successfully delivered numerous high-impact projects, resulting in a significant increase in team productivity and a 35% improvement in data quality. This was achieved by implementing best practices in data management and fostering a collaborative team environment.
Ms. Mangal is a distinguished Data Engineer with 6.5 years of experience in diverse sectors such as banking and healthcare. Known for quick learning and problem-solving, she specialises in ETL/ELT pipeline development using Azure technologies and in crafting complex database queries, enhancing efficiency. Her career reflects a deep passion for innovation and a commitment to delivering high-quality data solutions.
She specialises in:
- Azure Data Factory, Azure Databricks, Azure Logic Apps
- Azure Data Lake Storage, Azure Blob Storage
- Azure Key Vault, Azure App Directory
- SQL Server, Azure SQL Database, Oracle, MongoDB, SQL/PL-SQL
- Python, PySpark
- ETL/ELT pipeline development using ADF and ASA
- Migration using ARM templates
- Jenkins, CI/CD pipelines, SonarQube
- Docker
- GIT, Bitbucket
- Tableau
Together, let's unlock the mysteries of your data, propelling your business towards uncharted territories of success. The data universe is vast, and I'm here to help you navigate its stars! 🌌 Let's make data magic happen! 🌌
Skills: PySpark, Python, SQL, Oracle, Data Flow Diagram, Databricks Platform, Azure App Service, Microsoft Azure SQL Database, Data Integration, Oracle PLSQL, Data Lake, Microsoft Azure, Data Engineering, Microsoft SQL Server, Tableau
- $45 hourly
- 5.0/5
- (77 jobs)
As a highly experienced Data Engineer with over 10 years of expertise in the field, I have built a strong foundation in designing and implementing scalable, reliable, and efficient data solutions for a wide range of clients. I specialize in developing complex data architectures that leverage the latest technologies, including AWS, Azure, GCP, Spark, SQL, Python, and other big data stacks. My extensive experience includes designing and implementing large-scale data warehouses, data lakes, and ETL pipelines, as well as data processing systems that transform data in real time. I am also well versed in distributed computing and data modeling, having worked extensively with Hadoop, Spark, and NoSQL databases. As a team leader, I have successfully managed and mentored cross-functional teams of data engineers, data scientists, and data analysts, providing guidance and support to ensure the delivery of high-quality, data-driven solutions that meet business objectives. If you are looking for a highly skilled Data Engineer with a proven track record of delivering scalable, reliable, and efficient data solutions, please do not hesitate to contact me. I am confident that I have the skills, experience, and expertise to meet your data needs and exceed your expectations.
Skills: Snowflake, ETL, PySpark, MongoDB, Unix Shell, Data Migration, Scala, Microsoft Azure, Amazon Web Services, SQL, Apache Hadoop, Cloudera, Apache Spark
- $20 hourly
- 5.0/5
- (11 jobs)
- Senior Software Engineer with 8+ years of experience in building data-intensive applications and tackling challenging architectural/scalability problems.
- Showcasing excellence in delivering analytical and technical solutions in accordance with customer requirements.
- Hands-on experience in data engineering functions including, but not limited to, data extraction, transformation, loading, and integration in support of enterprise data infrastructures, including data warehouses, operational data stores, and master data management.
- Taking ownership of delivery, coordinating with relevant stakeholders, updating the client on status daily, and coordinating with the testing team and other teams to fix bugs.
- Relevant experience of 3+ years in Big Data analytics and Big Data handling using Hadoop ecosystem tools such as Hive, HDFS, Spark, Sqoop, and YARN.
- Extensive experience working on cloud platforms such as AWS, with hands-on experience using AWS services – S3, Glue, Lambda & Step Functions, RDS (Aurora DB), and Redshift (a minimal Glue job sketch follows below).
- Applied knowledge of AWS EMR, EC2, SNS, SQS, and CloudWatch.
- More than 3 years of experience in handling and processing unstructured and structured data using Python, PySpark, and SQL.
- Comprehensive knowledge of query building and expertise in handling RDBMS systems – MS SQL, MySQL, PostgreSQL, and Oracle (PL/SQL).
- Strong experience in creating ETL pipelines using tools like Talend Studio DI and its Big Data platform.
- Proficient in analyzing requirements and architecture specifications to create detailed designs, and in providing technical advice, training, and mentoring to other associates in a lead capacity.
- Good understanding of machine learning and statistical analysis.
- Sound knowledge of the retail domain; currently building on beginner-level knowledge of the insurance domain.
- Participating in deployment releases and release-readiness reviews, and maintaining the release repository.
- Interacting with clients to gather requirements, participating in PI planning, and ensuring the timely completion of projects; analyzing and designing project requirements, the various modules, and their functionality.
- Hands-on experience handling on-site applications and taking part in client review meetings and brainstorming sessions with the technical team, team lead, and product delivery manager.
- Excellent analytical skills that often surface requirement gaps at an early stage, which ultimately helps with timely delivery and avoiding production issues.
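For illustration only, the AWS Glue work mentioned in the list above usually follows the standard Glue PySpark job skeleton sketched below. This is a generic template rather than this engineer's production code, and the catalog database, table, and S3 bucket names are hypothetical.

```python
# Standard AWS Glue PySpark job skeleton: catalog read -> transform -> S3 write.
# The database, table, and bucket names are hypothetical placeholders.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a table registered in the Glue Data Catalog.
orders = glue_context.create_dynamic_frame.from_catalog(
    database="example_db", table_name="orders"
)

# Drop malformed records with a plain Spark DataFrame filter.
orders_df = orders.toDF().filter("order_id IS NOT NULL")

# Write the cleaned data back to S3 as Parquet.
orders_df.write.mode("overwrite").parquet("s3://example-bucket/curated/orders/")

job.commit()
```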
Skills: Data Extraction, Data Scraping, Amazon Redshift, MongoDB, Generative AI, LLM Prompt Engineering, PostgreSQL Programming, Data Warehousing & ETL Software, Big Data, Amazon Web Services, PySpark, Databricks Platform, Machine Learning, Python, Apache Hadoop, SQL, Talend Open Studio, Data Migration
- $30 hourly
- 4.9/5
- (2 jobs)
Data Architect with sound knowledge of data analytics, architecture, and engineering; proficient in cloud technologies, Python, Spark, and big data ecosystems, with 8+ years of experience in problem solving and consulting. I am also proficient in designing and implementing end-to-end architectures for data-driven approaches, optimized for performance and efficiency.
Skills: Data Analysis, Machine Learning, PySpark, Big Data, Apache Spark, Databricks Platform, Python
- $64 hourly
- 0.0/5
- (1 job)
I'm a cloud enthusiast with experience in building AWS cloud-native apps as well as migrating existing code and apps to AWS. I am also a certified AWS architect, holding both the AWS Associate Architect and AWS Professional Architect certifications. Whether you are thinking of moving your existing app to the cloud or planning to build a new one directly on the cloud, I can help! I'll manage the project brief from start to finish.
Skills: AWS CodePipeline, AWS CodeDeploy, AWS Development, Amazon Athena, AWS Glue, AWS CloudFormation, PySpark, Amazon S3, AWS Cloud9, Amazon RDS, AWS Application, Amazon EC2, Cloud Computing, Python
- $5 hourly
- 5.0/5
- (1 job)
Nidhi Sharma
2x Certified Microsoft Azure Data Engineer Associate (DP-203)
2x Certified Microsoft Power BI Data Analyst Associate (PL-300)
Certified in Microsoft Azure Data Fundamentals (DP-900)
Certified in Microsoft Azure Fundamentals (AZ-900)
Professional Summary
A data engineering consultant with 4 years of experience in Python programming, Azure data engineering, and machine learning. A good grasp of SQL, Azure Data Factory, data warehousing concepts, Databricks, and machine learning. Actively involved in data ingestion, transformation, and warehousing. Also worked on automation and manual testing, which involved creating test cases and running them with PyCharm and Data Analytics Studio, using JIRA X-ray reports and test plans for their proper storage and implementation. Experienced in communicating with clients and stakeholders and in helping the team with release activities and proper documentation. A good communicator with strong analytical skills.
Skills: Data Engineering, Data Analytics & Visualization Software, Data Analysis, PySpark, Microsoft Azure, Databricks Platform, Python
- $50 hourly
- 4.3/5
- (3 jobs)
Proficient Palantir developer with experience in Azure Databricks, Azure Data Factory, Microsoft Azure, and the Palantir Foundry application, along with Python, MySQL, TypeScript, and PySpark.
Skills: Microsoft Power BI, Power Query, Databricks Platform, Big Data, PySpark, Microsoft Azure, TypeScript, pandas, SQL, Python
- $15 hourly
- 4.5/5
- (11 jobs)
I am a big data analyst and data scientist with 3+ years of financial industry experience, and I hold a PG Diploma in Big Data Analysis. Skilled in Python, PySpark, Java, SQL, Streamlit, data visualisation, and data manipulation, along with AI/ML. I have completed projects automating risk scorecard development, including deployment pipelines and model development.
Skills: YOLO, PyQt, Selenium, Web Scraping, API, AI Chatbot, Data Science, Streamlit, Front-End Development, Java, PySpark, SQL, Python, Data Analysis
- $60 hourly
- 0.0/5
- (0 jobs)
* Data Engineer with around 15+ years of extensive experience analyzing requirements and designing data solutions for credit card, insurance, banking, healthcare, and retail companies.
* Designed, architected, and developed a data ingestion framework to ingest terabytes of data for top retail clients at Toshiba Global Commerce Solutions.
* Well versed in technologies like Azure Databricks PySpark ETL and related Azure services such as Azure Blob Storage, Azure Key Vault, Azure Database for PostgreSQL, and Azure Synapse.
* Experienced in implementing the medallion architecture with Databricks Delta tables to extract, transform, and load terabytes of data across Bronze/Silver/Gold layers using the Databricks Auto Loader utility (a minimal sketch follows below).
* Experienced in implementing real-time streams using Spark.
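As a hedged sketch of the Auto Loader ingestion pattern referenced above (an illustration under assumed names, not this engineer's framework), a minimal bronze-layer stream on Databricks could look like this. The input path, checkpoint locations, and target table are hypothetical, and a real pipeline would add schema-evolution and error handling.

```python
# Minimal Databricks Auto Loader sketch: incrementally ingest raw JSON files
# into a bronze Delta table. Paths, checkpoints, and table names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # provided automatically on Databricks

bronze_stream = (
    spark.readStream.format("cloudFiles")                    # Auto Loader source
         .option("cloudFiles.format", "json")
         .option("cloudFiles.schemaLocation", "/mnt/checkpoints/orders/schema")
         .load("/mnt/raw/orders/")
)

(
    bronze_stream.writeStream.format("delta")
                 .option("checkpointLocation", "/mnt/checkpoints/orders/stream")
                 .trigger(availableNow=True)   # drain pending files, then stop
                 .toTable("bronze.orders")     # assumes the bronze schema exists
)
```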
Skills: Scala, Python, Apache Spark, PySpark, Databricks Platform, Ab Initio
- $40 hourly
- 0.0/5
- (0 jobs)
Seasoned Data Engineer and Generative AI Specialist with extensive experience in building scalable, end-to-end data solutions.
- Proficient in Databricks, Snowflake, AWS, and Azure.
- I design and optimize data pipelines, cloud architectures, and AI-driven applications.
- My expertise spans advanced data engineering, Generative AI, and cloud-native solutions to drive business insights and innovation.
Let's collaborate to transform your data challenges into efficient, impactful results.
Skills: Golang, Data Analytics, NoSQL Database, Apache Spark, AWS Glue, Amazon Web Services, Elixir, Python, Machine Learning, Generative AI, ETL Pipeline, ETL, PySpark, Snowflake, Databricks Platform
- $50 hourly
- 0.0/5
- (0 jobs)
Experience: 11.7 years
Seasoned Senior Data Engineer with over 11 years of proven expertise in the Information Technology industry, specializing in delivering robust, scalable, and innovative data solutions across diverse domains including banking, finance, and telecom.
Big Data Expertise: 9+ years of comprehensive experience in Big Data ecosystems; proficient in Hadoop, MapReduce, Spark (Scala and PySpark), Hive, Kafka, Sqoop, Apache Pig, and Oozie.
Cloud Proficiency: Hands-on expertise in leading cloud platforms, including Google Cloud Platform (GCP) with tools like BigQuery, Dataproc, and Cloud Composer, and Azure with Databricks, Azure Data Lake Storage (ADLS), and Azure Data Factory (ADF). Demonstrated ability to design and implement solutions leveraging Delta tables and Databricks for PoCs.
Data Engineering & Processing: Adept in data cleansing, curation, migration, and ingestion.
Skills: BigQuery, Google Cloud Platform, Big Data File Format, Data Engineering, dbt, Sqoop, Apache Hive, Apache Kafka, Apache Airflow, Apache Hadoop, Python, Scala, Apache Spark, PySpark, Big Data
- $70 hourly
- 0.0/5
- (0 jobs)
• Expertise in cloud platforms: Cloudera, AWS, GCP, and Azure.
• Expertise in designing end-to-end solutions for Data Governance, Data Warehousing, Business Intelligence (BI), Data Modelling, Data Integration, Data Replication, MDM, Data Quality, and Data Migration projects.
• Expertise in Big Data Hadoop development, including design and architecture, development, system integration, and infrastructure readiness.
• Extensively worked with Teradata utilities like BTEQ, FastExport, FastLoad, and MultiLoad to export and load data to/from different source systems, including flat files.
• Good understanding of GCP managed services, e.g., Dataproc, Dataflow, Pub/Sub, Cloud Functions, Cloud Composer, BigQuery, and Bigtable.
• Good understanding of GCP core services like Google Cloud Storage, Google Compute Engine, Cloud SQL, and Cloud IAM.
• Good experience in ADB (Azure Databricks), Azure Data Lake, and Azure Synapse.
• Experience in continuous delivery through CI/CD pipelines, containers, and orchestration technologies.
• Expert knowledge of Agile approaches to software development, able to put key Agile principles into practice to deliver solutions incrementally.
• Expertise in architecting reporting and analytics solutions.
Skills: Teradata, Technical Support, Microsoft Power BI Data Visualization, Data Migration, AWS Application, Cloudera, PySpark, dbt, Snowflake, Data Engineering, Data Extraction, Machine Learning Model, ETL, Artificial Intelligence, Data Analysis
- $35 hourly
- 0.0/5
- (0 jobs)
Total work experience: 8+ years
KEY SKILLS: Spark, Hive, Impala, Snowflake, Snowpipe, Airflow, Apache NiFi, PySpark, EMR, Delta, AWS, Python, Data Warehousing, Snowpark, Azure Databricks, ETL, Data Modeling, Kafka, Azure Data Factory, Git, Spinnaker, SQL Server, Amazon Redshift, Data Build Tool (dbt), Power BI, Scala, SQL, Hadoop, HDFS, Shell Scripting, Data Analytics, Big Data, HBase, Azure Data Lake, System Design
PROFILE SUMMARY: Experienced Big Data Engineer with 8 years in the field, skilled in developing Big Data applications using a diverse tech stack including Spark, Scala, Hive, Impala, Azure Data Lake, Databricks, ADF, Python, AWS services, ETL, PySpark, Snowflake, SQL, Scope, T-SQL, Cosmos, and Power BI. Expertise in understanding project requirements and creating high-level designs, and proficient in writing low-level code based on established designs. Experienced in setting up Bitbucket, Git, and Spinnaker (CI/CD) pipelines across multiple environments.
Skills: Apache Airflow, Apache Hadoop, Docker, SQL Server Integration Services, Java, Scala, Snowflake, Dashboard, SQL, Azure App Service, AWS Application, PySpark, Data Extraction, Data Analysis, ETL
- $35 hourly
- 0.0/5
- (0 jobs)
I'm Yogendra, a hands-on technology executive with 19+ years of experience building AI-native platforms, enterprise data systems, and cloud-scale infrastructure. I've led global engineering teams at MSCI and BNY Mellon, built data platforms handling billions of daily events, and delivered low-latency APIs over petabyte-scale datasets. As founder of Colrows, I designed a proprietary SQL engine and orchestrated LLMs (ChatGPT, Claude, Gemini) using Graph RAG and agentic AI to turn natural language into accurate, actionable insights.
Expertise:
- LLM integration, prompt engineering, Graph RAG
- Cloud-native architecture (AWS, Azure, GCP)
- Data lakes, real-time systems, and platform scalability
- Team building, roadmap ownership, GTM support
I work with startups and enterprises as a fractional CTO, technical advisor, or platform architect, helping scale products, modernize data infrastructure, or bring AI into production. Let's build something impactful.
Skills: Apache Spark, Data Engineering, Stream Processing Framework, MongoDB, Vector Database, PySpark, Apache Cassandra, Elasticsearch, LangChain, LLM Prompt Engineering, Apache Kafka, Java, Artificial Intelligence, Machine Learning Model, ETL
- $40 hourly
- 0.0/5
- (0 jobs)
Experienced Data Engineer | Python • Spark • AWS • Elasticsearch • EMR
I'm a results-driven Data Engineer with a strong background in designing and building scalable data pipelines for big data environments. I specialize in:
- Python scripting and automation
- Distributed data processing with Apache Spark
- Search and analytics using Elasticsearch / OpenSearch
- Real-time caching with Redis (a minimal sketch follows below)
- Cloud-native workflows using AWS EMR and S3
Whether you need to process large datasets, build ETL/ELT pipelines, or integrate search into your applications, I can manage projects end to end with clean, efficient code and best practices.
💬 I value clear and consistent communication and always keep clients updated throughout the project. Let's connect and discuss how I can help bring your data project to life!
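As a small, hedged sketch of the Redis caching pattern listed above (the host, key format, TTL, and the fallback lookup are all hypothetical placeholders), a read-through cache in Python might look like:

```python
# Read-through cache sketch: check Redis first, fall back to the slow source,
# then cache the result with a TTL. Host, keys, and TTL are hypothetical.
import json

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def load_profile_from_database(user_id: str) -> dict:
    # Stand-in for a real (slow) database or API lookup.
    return {"id": user_id, "name": "example"}

def get_user_profile(user_id: str) -> dict:
    cache_key = f"user:{user_id}"
    cached = r.get(cache_key)
    if cached is not None:
        return json.loads(cached)                  # cache hit

    profile = load_profile_from_database(user_id)  # cache miss: fetch
    r.setex(cache_key, 300, json.dumps(profile))   # cache for 5 minutes
    return profile
```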
Skills: Redis, PySpark, Apache Kafka, Java, Elasticsearch, Python
- $35 hourly
- 0.0/5
- (0 jobs)
Results-oriented Data Engineer with 8+ years of experience building scalable data pipelines and business intelligence solutions for products, using batch and real-time ETL/ELT and big data frameworks like Apache Spark. Skilled in Python scripting, databases, and dimensional data modeling; leveraging SQL and dbt for data transformations; and deploying solutions on cloud platforms such as AWS, Snowflake, and Databricks.
Skills: Amazon Redshift, ETL Pipeline, Data Modeling, PySpark, ETL, Data Warehousing & ETL Software, Apache Superset, dbt, Snowflake, Databricks Platform, AWS Glue, Data Visualization, Data Engineering, Python, SQL
- $25 hourly
- 5.0/5
- (39 jobs)
🚀 Skyrocket Your Business with Cutting-Edge AI and ML Solutions! 🚀
⭐ Fortune 500 | 🤖 AI & Data Science and Analytics Expert | 🤖 LLM, AI Automation & Business Intelligence Consultant | Natural Language Processing | Generative AI | ✅ 15+ Years of Experience
In today's fast-evolving landscape of AI, Data Science, and Business Intelligence (BI), I understand the challenge of finding reliable expertise. My clients, from Fortune 500 companies to startups worldwide across various sectors, have partnered with me to navigate these complexities. I have streamlined processes, built scalable AI models, and transformed their data into actionable strategies for growth and efficiency. My approach focuses on selecting the right tools tailored to each project's needs, rather than chasing every new trend.
My expertise:
✅ AI & Machine Learning – I develop AI-driven predictive analytics, anomaly detection, and natural language processing (NLP) solutions to automate processes and uncover valuable insights. By leveraging technologies such as TensorFlow, PyTorch, Hugging Face, scikit-learn, GPT-4, LangChain, LangGraph, and the OpenAI API, I help businesses improve decision-making and operational efficiency.
✅ Data Analysis & Big Data – I process and analyze large-scale structured and unstructured datasets for both real-time and batch analytics. My proficiency in SQL, BigQuery, Snowflake, Apache Spark, and Databricks helps businesses build scalable cloud-based analytics platforms that enhance decision-making capabilities.
✅ ETL & Data Engineering – I build efficient, automated ETL/ELT pipelines for fast, scalable data transformation. I specialize in Apache Airflow, AWS Glue, and Databricks, ensuring businesses have clean, structured, and reliable data pipelines for analytics, AI, and reporting.
✅ Business Intelligence (BI) & Data Visualization – I design real-time, interactive dashboards that simplify complex data and enhance decision-making. Using Power BI (DAX, Power Query), Tableau, Streamlit, Shiny, and Google Data Studio, I create custom BI solutions integrated with ERP, CRM, and SaaS systems to deliver seamless analytics.
✅ AI-Driven Automation & RPA – I streamline manual workflows and enhance efficiency with AI-powered automation. Using Microsoft Power Automate, I design end-to-end automation workflows, AI-driven RPA solutions, and API integrations that optimize business processes.
Why Work With Me?
✅ Data-Driven Results – Leveraged Industry 4.0 technologies to enhance manufacturing processes and operational efficiency, reducing operational costs by $0.5 million annually per factory.
✅ Scalable AI, ML & BI Solutions – Predictive analytics that reduced manufacturing operational costs by 27%; AI-powered solutions integrated into data products, achieving efficiency gains of ~$4M annually.
✅ Enterprise-Grade Data Infrastructure – Optimized ETL pipelines, reducing data processing time by 30%.
Skills: ETL, PySpark, Data Visualization, Tableau, R, SQL, Python, Microsoft Power BI, Industry 4.0, Databricks Platform, Machine Learning, OpenAI API, Natural Language Processing, Generative AI, Data Analytics
- $40 hourly
- 5.0/5
- (2 jobs)
I'm a highly results-driven Automation and Data Engineering expert with 15+ years of experience building and scaling robust, efficient, and cost-effective solutions for businesses of all sizes. I specialize in cloud computing (Azure, AWS, GCP), automation, and data engineering, with a strong background in full-stack development. My passion lies in leveraging cutting-edge technologies to solve complex challenges and deliver tangible business value. I'm eager to collaborate with you on your next project and help you achieve your goals.

**Core Competencies:**

* **Data Engineering & Analytics:** I design, build, and maintain high-performance data pipelines using Azure Data Engineering tools, PySpark, and Databricks. I have extensive experience in data warehousing, ETL processes, data modeling, and business intelligence. For example, I recently developed a PySpark-based ETL pipeline that reduced data processing time by 60% for a financial services client, resulting in significant cost savings and improved reporting accuracy. I'm also proficient in Big Data technologies like Hadoop, MapReduce, and Pig.
* **DevOps & Cloud:** I excel at implementing and managing CI/CD pipelines using Jenkins, Azure DevOps, GCP Cloud Build, and AWS CodePipeline. I'm highly experienced with containerization and orchestration technologies like Docker, Kubernetes (AKS, EKS, GKE), and OpenShift. I'm proficient in infrastructure as code (Terraform) and cloud platforms (Azure, GCP, and AWS). In a recent project, I automated the deployment process for a SaaS application using Kubernetes and Terraform, reducing deployment time from 2 days to 2 hours and increasing release frequency by 50%. I also specialize in monitoring and logging with Splunk and the ELK stack (Elasticsearch, Kibana, Logstash).
* **Automation:** I have a deep understanding of Robotic Process Automation (RPA) using UiPath (Studio, Orchestrator, and the RE Framework). I have a proven track record of automating complex business processes, resulting in increased efficiency and reduced operational costs. For instance, I automated a manual invoice-processing workflow for a manufacturing company, saving them 15 hours per week and eliminating human error. I also leverage Python scripting for various automation tasks.
* **Full Stack Development:** I'm a proficient Java full-stack developer with experience building robust and scalable web applications. I'm skilled in front-end technologies (HTML, CSS, JavaScript, React) and back-end frameworks (Spring Boot, REST APIs). I recently developed a web application for an e-commerce startup using Spring Boot and React, which handled over 10,000 transactions per day with 99.9% uptime.

**Technical Skills (Upwork Keywords):** DevOps, Data Engineering, Cloud Computing (AWS, Azure, GCP), Kubernetes (AKS, EKS, GKE), Docker, Terraform, Jenkins, CI/CD, RPA (UiPath), Python, PySpark, Java, Full Stack Development, Web Development (HTML, CSS, JavaScript, React), SQL (MSSQL, MySQL, PostgreSQL), Big Data (Hadoop, MapReduce, Pig), ELK Stack (Elasticsearch, Kibana, Logstash), Splunk, Agile, JIRA, AzDo Boards, Git, GitHub, GitLab, Business Process Automation, REST APIs, API Development, AWS Lambda, Azure Functions, GCP Cloud Functions, AWS Fargate, Azure Container Registry, Cloud Foundry, Red Hat OpenShift, Amazon ECS, Google Kubernetes Services, Data Warehousing, ETL, Data Modeling, Business Intelligence, Databricks

**Availability:** Full-time

**Contact:** I'm available for consultations and eager to discuss your project requirements. Let's connect and explore how I can help you achieve your business objectives.

**Cheers!**
Skills: Automation, Scala, Web Scraping, Data Analytics, Data Analytics & Visualization Software, Microsoft Power Automate, Big Data, ETL, Analytics, PySpark, Data Engineering, Terraform, UiPath, SQL, Python
- $90 hourly
- 5.0/5
- (5 jobs)
******* Certified Apache Airflow Developer *******
With more than 7 years of professional experience, I hold a Master of Engineering in Information Technology. Currently working full time as a Senior Consultant with a multinational company, I'm in a Data Engineering role working mostly with Python, PySpark, Airflow, Palantir Foundry, Collibra, and SQL (a minimal Airflow sketch follows below). In my past professional years I have also worked as a Full Stack Developer building REST APIs and UI functionalities. I also have mobile development experience using Flutter, Android, and Xojo (for iOS). Please consider me if you want your work done on time.
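To make the Airflow experience concrete, here is a minimal Airflow 2.x TaskFlow sketch (an illustration, not this consultant's code; the DAG id, schedule, and task bodies are hypothetical placeholders):

```python
# Minimal Airflow 2.x TaskFlow DAG: extract -> transform -> load, run daily.
# The DAG id, schedule, and task bodies are hypothetical placeholders.
from datetime import datetime

from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_etl():
    @task
    def extract() -> list:
        return [{"id": 1, "value": 10}, {"id": 2, "value": 20}]

    @task
    def transform(rows: list) -> list:
        return [{**row, "value": row["value"] * 2} for row in rows]

    @task
    def load(rows: list) -> None:
        print(f"Loading {len(rows)} rows")  # stand-in for a warehouse write

    load(transform(extract()))

example_etl()
```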
Skills: Amazon Web Services, RabbitMQ, Node.js, Amazon S3, JavaScript, PySpark, Databricks Platform, Apache Airflow, SQL, Python, ETL Pipeline, Kubernetes, Docker, Java, Apache Spark
- $40 hourly
- 4.9/5
- (0 jobs)
Professional Summary
Versatile Solution Architect and Developer with 17 years of extensive IT experience across multiple domains including healthcare, manufacturing, telecom, banking, insurance, retail, e-commerce, energy, government, and education. Proven track record delivering enterprise-grade solutions in data engineering, cloud infrastructure, and application development. Expert in AWS, Azure, GCP, Oracle, SQL Server, Snowflake, Python, Java, Spark, Databricks, and modern AI/ML technologies. I provide comprehensive end-to-end solutions with meticulous attention to performance optimization and scalability, ensuring detailed documentation and adherence to industry best practices.
Core Competencies
Data Engineering & Analytics: Expert in Apache Spark/PySpark, Databricks, AWS Glue, Azure Data Factory, Informatica, Kafka, NiFi, Airflow, and the Python data stack (pandas, numpy, polars, scikit-learn). Delivered enterprise-scale ETL pipelines and real-time data streaming solutions across multiple domains.
Data Warehousing & Database Technologies: Extensive experience with modern data platforms including Snowflake, Redshift, BigQuery, Synapse, Oracle, SQL Server, PostgreSQL, MySQL, MongoDB, Cassandra, Redis, DynamoDB, time-series databases (InfluxDB, Timestream), graph databases (Neo4j, Neptune), and vector databases (Pinecone, Weaviate). Expert in advanced data modeling, MPP optimization, and SQL performance tuning for analytical workloads.
Cloud & Infrastructure: Certified cloud architect with hands-on expertise across AWS (EMR, Glue, Athena, Redshift, Lambda, S3), Azure (Databricks, Synapse, Data Lake, Functions, Cosmos DB), and GCP (BigQuery, Dataflow, Dataproc, Spanner). Proficient in Infrastructure as Code (Terraform, CloudFormation), serverless architectures, and multi-cloud strategy planning.
AI/ML Implementation: Implemented end-to-end ML solutions using TensorFlow, PyTorch, and JAX frameworks with MLOps pipelines (MLflow, Kubeflow, SageMaker). Experience in feature engineering, LLM fine-tuning, and production model deployment across healthcare analytics, fraud detection systems, recommendation engines, and predictive maintenance applications.
Development & Programming: Proficient in multiple programming languages (Python, Java, JavaScript/TypeScript, C#, SQL, Bash/PowerShell) with expertise in web technologies (REST/GraphQL APIs, microservices), DevOps practices (CI/CD, Docker, Kubernetes), and comprehensive testing frameworks. Developed and maintained mission-critical applications across diverse industry verticals.
Service Offering
As a Solution Architect and Developer, I provide: comprehensive solution design documents, production-ready code implementation, performance optimization and tuning, testing strategy and implementation, deployment automation, scalability planning, and knowledge transfer and documentation.
Domain Expertise
Over 17 years of implementing technology solutions across diverse industries:
Healthcare: patient data management, claims processing, clinical analytics
Banking & Finance: transaction processing, fraud detection, regulatory compliance
Telecom: customer data management, network analytics, billing systems
Manufacturing: supply chain optimization, IoT integration, predictive maintenance
Retail & E-commerce: inventory management, recommendation engines, customer analytics
Insurance: risk assessment, claims processing, policy management
Energy & Utilities: smart grid solutions, consumption analytics
Government: secure data management, compliance reporting
Transportation & Logistics: route optimization, fleet management
Approach
I prioritize understanding business requirements before recommending technology solutions. My process involves thorough requirements gathering, architecture design, implementation planning, and continuous validation to ensure solutions meet both current needs and future scalability requirements.
Ready to tackle challenging projects as a contractor, consultant, or project-based freelancer. I deliver production-ready solutions with comprehensive documentation, knowledge transfer, and ongoing support. Available for remote collaboration worldwide with flexible scheduling options.
Skills: Data Integration, Data Ingestion, Java, Perl, Redis, PySpark, Databricks Platform, Amazon Redshift, Snowflake, MLOps, SQL, Microsoft Azure, Amazon Web Services, Informatica Cloud, Python
- $20 hourly
- 5.0/5
- (17 jobs)
I understand your business needs very well, find problems in your business using your past data, and find or create new ways to solve them.
Skills: Snowflake, PySpark, Databricks Platform, Weka, Apache Spark MLlib, Data Science, Data Mining, Oracle PLSQL, Apache Kafka, Scala, Python, SQL, Microsoft SQL Server, Spring Framework, Apache Spark
- $15 hourly
- 0.0/5
- (1 job)
Dedicated Data Engineer with 3.5 years of hands-on experience in building and maintaining data systems. Skilled in using SQL, Python, and ETL tools to handle and process large volumes of data. Experienced with cloud services like Azure, and knowledgeable in data warehousing and real-time data processing. Strong problem-solving abilities with a focus on ensuring data quality, security, and efficiency. Committed to supporting data-driven decisions and enhancing business intelligence through reliable data solutions.
Skills: Databricks Platform, Data Model, Interactive Data Visualization, Data Warehousing & ETL Software, ETL, Data Warehousing, Data Modeling, SQL, Python, PySpark, Microsoft Power BI Development, Microsoft Power BI Data Visualization, Microsoft Power BI
- $15 hourly
- 0.0/5
- (3 jobs)
I am a dedicated Data Engineer with over 6 years of experience in designing and building data pipelines, data warehouses, and data lakes. My expertise includes ETL processes using AWS services, Databricks, and other cutting-edge technologies. I have a proven track record of transforming complex business requirements into scalable and efficient data solutions. I am proficient in Python, SQL, and cloud platforms like AWS and Azure, and have extensive experience in optimizing data workflows and ensuring data quality.
Key Skills:
- ETL & ELT Processes: Expertise in extracting, transforming, and loading data using modern tools and techniques.
- Data Warehousing: Proficient in designing and managing data warehouses for large-scale data storage and retrieval.
- AWS Services: Skilled in using AWS services like S3, EC2, Lambda, Redshift, Athena, and Glue for various data engineering tasks.
- Databricks & Airflow: Experienced in building and orchestrating data workflows using Databricks and Airflow.
- Programming Languages: Proficient in Python, SQL, and PySpark for data manipulation and analysis.
- Cloud Platforms: Extensive experience with AWS and Azure for cloud-based data solutions.
- Big Data Technologies: Knowledgeable in big data tools and technologies for handling large datasets.
- Process Improvement: Skilled in using tools like GitHub and Bitbucket for version control and process improvement.
Key Projects:
Item Maintenance: Created a real-time process to gather and transform item-related data from Oracle UCM using Databricks and AWS services. Technologies: AWS S3, Databricks, Elasticsearch.
Visibility Report: Developed a real-time ERP process to gather supplier and item data, applied business rules, and exposed data to Elasticsearch and Unity Catalog for dashboard creation. Technologies: AWS S3, Databricks, Power BI.
Oasis Claims Data Analytics: Built custom queries and modules to process patient medical data, optimized Spark jobs, and created ETL pipelines using Airflow. Technologies: PySpark, SQL, AWS Athena.
Education:
- M.Sc. Computer Science – University of Pune, 2020 (9.45 CGPA)
- B.Sc. Computer Science – University of Pune, 2018 (81%)
I am passionate about leveraging data to drive business insights and look forward to collaborating on projects that require innovative data solutions. Let's work together to turn your data into a strategic asset!
Skills: Microsoft Power BI, Looker Studio, Amazon RDS, AWS CloudFormation, AWS Lambda, DevOps, Big Data, GitHub, Databricks Platform, PySpark, Apache Airflow, Python, SQL, AWS Glue, ETL Pipeline
- $17 hourly
- 0.0/5
- (0 jobs)
Welcome to my profile! With over 5 years of hands-on experience in cloud technology, specializing in AWS and Azure, I am a dedicated Data Engineer passionate about transforming complex data into actionable insights. I thrive on designing and implementing scalable solutions that drive organizational growth and efficiency.
As a Cloud Data Engineer, I possess a deep understanding of cloud architecture, data storage, and processing frameworks. My expertise extends to AWS services such as EC2, S3, Redshift, Glue, and Lambda, as well as Azure services including Azure Data Factory, Azure Databricks, and Azure SQL Database. I leverage these tools and technologies to build robust data pipelines, optimize data ingestion, and ensure data integrity.
Throughout my career, I have successfully executed end-to-end data engineering projects, collaborating with cross-functional teams to deliver high-quality solutions. I have a proven track record of designing and implementing data warehouses, data lakes, and ETL processes to enable efficient data management and analysis.
In previous engagements, I have tackled complex challenges such as data integration across multiple systems, real-time data processing, and implementing scalable architectures to handle large volumes of data. I am skilled in transforming raw data into meaningful insights using SQL, Python, and other relevant programming languages.
My commitment to delivering excellence is complemented by my ability to understand business requirements and translate them into technical solutions. I prioritize performance, security, and cost optimization in every project, ensuring that my clients achieve their desired outcomes while maximizing ROI.
Client satisfaction is at the core of my work philosophy. I communicate effectively, maintain regular progress updates, and actively seek client feedback to ensure alignment and exceed expectations. I am committed to fostering long-term partnerships and providing ongoing support to my clients.
I hold an AWS certification and continually expand my knowledge through professional development, staying up to date with the latest advancements in cloud technology and data engineering.
If you are seeking a dedicated Cloud Data Engineer who can drive your data initiatives forward, I am ready to collaborate with you. Let's discuss your project requirements and how I can leverage my expertise to deliver exceptional results. Contact me now to get started!
Skills: Apache Spark, Databricks Platform, Data Analysis, Git, Microsoft Azure, AWS Glue, Database Modeling, Data Cleaning, AWS IoT Analytics, PySpark, AWS Lambda, Spreadsheet Software, Amazon Redshift, Apache Kafka, Data Scraping, Amazon S3, Microsoft Azure SQL Database, Amazon EC2, Data Lake, SQL, Python
How hiring on Upwork works
1. Post a job
Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.
2. Talent comes to you
Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.
3. Collaborate easily
Use Upwork to chat or video call, share files, and track project progress right from the app.
4. Payment simplified
Receive invoices and make payments through Upwork. Only pay for work you authorize.
How do I hire a Pyspark Developer near Pune, IN on Upwork?
You can hire a Pyspark Developer near Pune, IN on Upwork in four simple steps:
- Create a job post tailored to your Pyspark Developer project scope. We’ll walk you through the process step by step.
- Browse top Pyspark Developer talent on Upwork and invite them to your project.
- Once the proposals start flowing in, create a shortlist of top Pyspark Developer profiles and interview.
- Hire the right Pyspark Developer for your project from Upwork, the world’s largest work marketplace.
At Upwork, we believe talent staffing should be easy.
How much does it cost to hire a Pyspark Developer?
Rates charged by Pyspark Developers on Upwork can vary with a number of factors including experience, location, and market conditions. See hourly rates for in-demand skills on Upwork.
Why hire a Pyspark Developer near Pune, IN on Upwork?
As the world’s work marketplace, we connect highly skilled freelance Pyspark Developers with businesses and help them build trusted, long-term relationships so they can achieve more together. Let us help you build the dream Pyspark Developer team you need to succeed.
Can I hire a Pyspark Developer near Pune, IN within 24 hours on Upwork?
Depending on availability and the quality of your job post, it’s entirely possible to sign up for Upwork and receive Pyspark Developer proposals within 24 hours of posting a job description.