Hire the best PySpark Developers in the United Kingdom
Check out PySpark Developers in the United Kingdom with the skills you need for your next job.
- $40 hourly
- 5.0/5
- (5 jobs)
Data Scientist | AI Specialist | Machine Learning Engineer | Python, SQL, Spark, GitHub, Docker, Airflow, LLM, LangChain, CrewAI, PydanticAI | 6+ Years of Experience ◾️ Who am I? I am a full-stack data scientist, AI specialist, and machine learning engineer with over 6 years of experience, helping businesses solve problems through data-driven approaches. I've been recognized as both an industry leader and emerging talent by the UK Global Talent Program, and as a top 1% mentor by ADPList (the largest online mentorship community). ◾️ Why work with me? ✅ I am experienced, reliable, trustworthy, and professional. ✅ I put your needs first and provide tailored solutions to ensure the finished product is a great fit. ✅ I have experience working across multiple industries, from startups to global clients with $2 billion turnover. ✅ I have over 6 years of industry experience across data and AI. ◾️ More About Me 🔗 LinkedIn: younes-sandi/ 🔗 ADPList: adplist.org/mentors/younes-sandi 🔗 GitHub: github.com/Unessam 🔗 Website: ds4technology.com/ ◾️ Accepting • Data Analysis • Data Visualization • Web Scraping • Data Modeling • ETL • Data Manipulation • Machine Learning • Data Engineering • Automation and Pipeline Development • Model Deployment • LLM Development • AI Agent Development • MLOps • AIOps • Tutoring • Mentorship ◾️ Tools • Python • SQL • PySpark • Microsoft Azure • AWS • GCP • Tableau • Power BI • Google Analytics • Excel • Docker • Airflow • MLflow • GitHub • OpenAI • Claude • Gemini • LangChain • CrewAI • Pydantic • n8n ◾️ Projects • Predictive Modeling • Recommendation Systems • Segmentation and Clustering • Path Analysis • LLM Integration • AI Integration • Business Automation • Sales & Demand Prediction • Customer Segmentation • Customer Attribution Modeling • Customer Churn/Retention Prediction • Predictive Maintenance Modeling • Anomaly Detection and Modeling • NLP and Sentiment Analysis • Survival Analysis and Modeling • A/B Testing • Impact Analysis • Performing Complex Statistical Tests • Agentic Workflow Development • AI Agent Development • AI Chatbot Development
AI Agent Development · AI Model Development · AI Consulting · OpenAI API · LLM Prompt Engineering · Survival Analysis · Predictive Analytics · Data Analysis · PySpark · Machine Learning · Data Science · Recommendation System · Deep Learning · SQL · Python
- $40 hourly
- 5.0/5
- (10 jobs)
Data Engineer with over 5 years of experience in developing Python-based solutions and leveraging machine learning algorithms to address complex challenges. I have a strong background in Data Integration, Data Warehousing, Data Modelling, and Data Quality. I excel at implementing and maintaining both batch and streaming Big Data pipelines with automated workflows. My expertise lies in driving data-driven insights, optimizing processes, and delivering value to businesses through a comprehensive understanding of data engineering principles and best practices. KEY SKILLS Python | SQL | PySpark | JavaScript | Google Cloud Platform (GCP) | Azure | Amazon Web Services (AWS) | TensorFlow | Keras | ETL | ELT | dbt | BigQuery | Bigtable | Redshift | Snowflake | Data Warehouse | Data Lake | Dataproc | Dataflow | Data Fusion | Dataprep | Pub/Sub | Looker | Data Studio | Data Factory | Databricks | AutoML | Vertex AI | Pandas | Big Data | NumPy | Dask | Apache Beam | Apache Airflow | Azure Synapse | Cloud Data Loss Prevention | Machine Learning | Deep Learning | Kafka | Scikit-learn | Data Visualisation | Tableau | Power BI | Django | Git | GitLab
Data Engineering · dbt · ETL · Chatbot · CI/CD · Kubernetes · Docker · Apache Airflow · Apache Kafka · PySpark · Machine Learning · Exploratory Data Analysis · Python · SQL · BigQuery
- $30 hourly
- 5.0/5
- (4 jobs)
As an Azure-certified (DP-203) Data Engineer with a strong focus on data modeling and advanced cloud data architecture, I specialize in creating and optimizing data warehouses, lakehouses, and integrated data ecosystems tailored to business needs. Leveraging best practices in data engineering, I utilize a wide range of Azure tools to design and deploy robust, scalable, and highly efficient data solutions. My expertise includes end-to-end data pipeline design, data modeling, and transformation using leading Azure services like Azure Data Factory, Azure Synapse Analytics, and Azure Data Lake Storage, alongside the power of Azure Databricks for big data processing. I have extensive experience with multi-cloud solutions, incorporating other cloud platforms such as AWS and Google Cloud to enhance flexibility and scalability. Core Competencies: ◉ Data Warehousing & Lakehouse Architecture: Skilled in implementing scalable data warehouses and lakehouses using Azure Synapse Analytics, SQL Database, and Azure Data Lake. ◉ Data Modeling & ETL/ELT Pipelines: Expert in data transformation and ETL/ELT pipeline design with Azure Data Factory and Databricks, focusing on efficient data flow and storage. ◉ Azure Databricks & Spark for Big Data: Proven experience in big data processing, utilizing Databricks for both real-time and batch processing to deliver high-performance data solutions. ◉ Multi-Cloud Integration: Capable of integrating Azure with AWS, Google Cloud, and other platforms to create seamless multi-cloud architectures. ◉ Data Governance & Security: Proficient in implementing data governance and security practices with Azure Active Directory, Role-Based Access Control (RBAC), and data masking. Let’s work together to unlock the power of your data and drive your business to new heights with modern data architecture and cloud solutions tailored to your needs!
Data Transformation · Data Analysis · Microsoft Power BI · Apache Kafka · BigQuery · Snowflake · Apache Airflow · Data Warehousing · Data Lake · Microsoft Azure · Databricks Platform · PySpark · ETL Pipeline · Python · SQL
- $50 hourly
- 5.0/5
- (2 jobs)
As a seasoned Data Scientist and Technical Product Manager, I bring extensive experience in Financial Crime Risk and Credit Risk management, coupled with deep proficiency in Python, Spark, SAS (Base, EG, and DI Studio), Hadoop, and SQL. Transitioning into freelancing, I am eager to leverage my skills to contribute to diverse projects. While Upwork's guidelines restrict sharing direct links to external profiles, I am happy to provide a detailed portfolio from my LinkedIn upon request.
Data Mining · Big Data · Data Science · Fraud Detection · Data Analysis · PySpark · SAS · Credit Scoring · Apache Hadoop · SQL · Python
- $60 hourly
- 5.0/5
- (7 jobs)
I'm a data scientist with a Master's in Analytics and 3 years of industry experience. I have experience in all areas of data science but specialise in: - Developing and deploying machine learning models - Natural language processing - Analysing and visualising data with interactive dashboards - Creating clear, well-documented and reusable Python code - AWS Certified Cloud Practitioner Get in touch and find out how I can help!
Data Analytics · GitHub · Algorithm Development · Network Analysis · Analytics · PySpark · SQL · Tableau · Data Science · Python · Machine Learning Model · Deep Learning · Machine Learning · Natural Language Processing · Amazon SageMaker
- $50 hourly
- 5.0/5
- (4 jobs)
Results-driven Data Engineer/Scientist with experience in Data Analytics, Data Warehousing, Statistical Modelling and Visualization. Certified by Microsoft as a Microsoft Azure Data Scientist & Engineer.
Data Analysis · Natural Language Processing · Data Analytics · Cloud Computing · Artificial Intelligence · Microsoft Azure · Data Analytics & Visualization Software · Data Cleaning · Deep Learning · Machine Learning · Data Ingestion · PySpark · Data Engineering · SQL · Python
- $51 hourly
- 5.0/5
- (4 jobs)
🏆 Multi-Award Winner with Big5 company ex-clients in the UK, Europe and the Caribbean 🎯 Digital & Data Analytics Strategist. 📈 Expert at helping you drive Digital Transformation in any industry. 🏆 Worked with the biggest mobile networks and finance clients in the UK and Europe, and with 30 million+ customer and data systems. 📈 Process Transformation, Data-Driven Decision Making Enabler. Developing and implementing a successful Business Intelligence strategy should not feel daunting with expert guidance and support from the early stages. That's where I can help you. For start-ups, small enterprises, and corporate companies, I specialize in developing scalable Business Intelligence and Data Analytics solutions. Using the most recent tools and technologies, I will work hard to develop you a high-performing BI solution. I am well versed in the tools & technologies listed below (expert level & practitioner). -- Visualization Tools: Microsoft Fabric, Tableau, Data Studio, Power BI, QlikView, BIRST, Jaspersoft and Cognos. -- Cloud ETL: Funnel.io, Supermetrics, Fivetran, Stitch, Snowflake, Synapse -- Cloud Platforms: Amazon AWS, GCP, Microsoft Azure, Microsoft Fabric -- Microsoft Fabric, Redshift, Google Cloud Platform, Google Data Studio, Looker Studio, Google BigQuery, SQL, ETL Pipelines, Power BI I can help turn your data or digital transformation project into a money-spinner. 💰 I would love to arrange an initial consultation call with you to discuss your ideas for maximizing your business performance. ════════════ SERVICES ════════════ 🟢 End-to-end Data Architecture strategy and development according to your key business priorities 🟢 ETL data pipelines from any data source into AWS, Azure and GCP 🟢 Creating complex SQL queries 🟢 Beautiful-looking dashboards (in Power BI, Google Data Studio or Tableau) 🟢 Business Analysis, i.e., providing actionable insights to solve business problems 🟢 Transition from Google Sheets/Excel reports into automated data dashboards. 🟢 Interactive dashboards 🟢 Creating reports and dashboards I've worked with big telecom and finance companies with 1 billion+ revenue and 10 million+ transactions per day. I have worked with companies that asked me to work on several key business performance indicators such as ROI, CPC, CPA, AOV, CAR, and COGS. Feel free to schedule a call with me to talk about your project. Kind Regards, Savneet
Data Warehousing · PySpark · .NET Core · ETL Pipeline · Business Intelligence · Microsoft Power BI · Database Design · Data Visualization · Microsoft Azure · Python · Angular · C# · SQL
- $100 hourly
- 5.0/5
- (11 jobs)
💬 "Every month, I spend hours manually pulling reports instead of focusing on our strategy" 💬 "I just want to see my team's performance without having to juggle different spreadsheets" 💬 "End-of-month reporting shouldn't feel like assembling a thousand-piece jigsaw puzzle" 💬 "I just need the figures to reconcile. Why is it such a hassle to get consistent data?" If you find yourself nodding to any of these, you're in the right place. I'm Ayub, and I specialise in streamlining data and reporting processes, so you can focus on what truly matters: growing your business. Let's make your data work for you, not the other way around. 𝗜𝗡𝗧𝗥𝗢 With 7+ years in the data and analytics space, I've collaborated with the likes of Meta, HelloFresh, Capgemini, and several thriving startups. 𝗦𝗨𝗖𝗖𝗘𝗦𝗦 𝗦𝗧𝗢𝗥𝗜𝗘𝗦 Online consumer services business: Worked closely with senior management to gather reporting requirements and developed a suite of Tableau reports following data visualisation best practices. These dashboards allowed everyone in the business to finally automate and track business KPIs with ease. ⭐️ Testimonial: "Ayub is exemplary in his work and delivery. He is quick in understanding the exact requirement, his planning is meticulous and he has an eye for details. He is very good with data visualization and his dashboards have made it easy for our organization to make sense of numbers. I enjoyed working with Ayub and would love to work with him in future as well." E-commerce agency: Built a data pipeline to extract and load live tracking and price history data, and built dashboards in Tableau, Power BI, Google Data Studio, and Klipfolio. These dashboards were used by the business as an analytics offering to consolidate and present their clients' data in a compact and easy-to-digest set of dashboards. ⭐️ Testimonial: "I've worked with Ayub for over a year on some complex data and data visualisation projects in Tableau, Power BI and Klipfolio. I've found him to be very competent and an excellent problem solver, as well as responsive and efficient. Looking forward to working with him again in the future!" Drop me a message anytime to discuss your challenges. All the best, Ayub
Data Management · Amazon Redshift · ETL · PySpark · Amazon S3 · BigQuery · PostgreSQL · Data Vault · Data Modeling · Apache Airflow · Apache Spark · Data Warehousing · dbt · Amazon Web Services · Google Cloud Platform · Terraform · Cloud Engineering · Snowflake · SQL · Python · Data Engineering
- $30 hourly
- 4.7/5
- (19 jobs)
Hi there! I have over 4 years of experience in Data Engineering and Data Analytics. I use Python as my daily driver, and I regularly work with technologies and frameworks like SQL, Azure Databricks, Azure Data Factory, Azure Synapse Analytics and Power BI. I can help you with tasks like Data Extraction, Data Cleaning, Data Transformation, Data Analysis and Data Visualisation. Feel free to reach out if you'd like to discuss your project with me! Languages - Python, SQL Cloud Tools - Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Azure Data Lake Storage Data Processing, Transformation and Analysis - Apache Spark, PySpark, Pandas Data Visualisation - Power BI Data Storage Formats - CSV, Microsoft Excel, Google Sheets, Parquet Others - Jupyter Notebook, ipynb
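Data cleaning and transformation tasks like those listed above usually reduce to a few concrete steps: dropping duplicates, filtering incomplete records, and casting types. Here is a minimal plain-Python sketch of those steps; the field names are hypothetical, and in PySpark the same logic would typically be expressed with `dropDuplicates` and `withColumn`:

```python
from datetime import datetime

def clean_records(rows):
    """Deduplicate raw rows, drop incomplete ones, and normalise types."""
    seen = set()
    cleaned = []
    for row in rows:
        key = row.get("id")
        # Skip rows missing required fields or already seen (dedup on id).
        if key is None or row.get("amount") is None or key in seen:
            continue
        seen.add(key)
        cleaned.append({
            "id": key,
            "amount": float(row["amount"]),                      # cast to numeric
            "date": datetime.strptime(row["date"], "%Y-%m-%d"),  # parse ISO date
        })
    return cleaned

raw = [
    {"id": 1, "amount": "10.5", "date": "2024-01-02"},
    {"id": 1, "amount": "10.5", "date": "2024-01-02"},  # duplicate
    {"id": 2, "amount": None, "date": "2024-01-03"},    # incomplete
    {"id": 3, "amount": "7", "date": "2024-01-04"},
]
print([r["id"] for r in clean_records(raw)])  # → [1, 3]
```

At scale the same pattern runs distributed: Spark partitions the rows across executors, but the per-record logic stays this simple.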
Algorithm Development · Data Management · Java · Data Analysis · Data Structures · Resume · Interview Preparation · Candidate Interviewing · Machine Learning · Data Science · Career Coaching · PySpark · Apache Spark · Python · SQL
- $45 hourly
- 0.0/5
- (1 job)
🎯 Microsoft Fabric | Power BI | SQL | BI Developer | Financial & Commercial Data Specialist | I'm a results-driven BI Developer with a strong background in finance, commercial analytics, and data engineering, delivering impactful data solutions that support strategic decisions at all levels of the business. I specialize in building robust, scalable reporting systems that bring clarity and insight to complex datasets. I've worked closely with senior global stakeholders, especially in financial and commercial environments, to automate reporting, optimize performance, and uncover opportunities through data. 💼 Core Strengths: - Financial & Commercial Expertise: Proven track record in revenue analysis, forecasting, performance tracking, and cost optimization reporting - Power BI Development: Dashboards, KPIs, advanced DAX, and data storytelling tailored to financial and business users - Microsoft Fabric: Data pipelines, lakehouses/warehouses, dataflows, notebooks (Python/Spark) - Advanced SQL & Excel: ETL, complex joins, financial modeling, and dynamic reporting - Data Governance & Documentation: SOPs, data dictionaries, glossaries, and process flows - Stakeholder Engagement: Skilled at translating commercial and financial needs into reliable, high-impact BI tools If you need a BI expert who understands the numbers and the business, let's connect. I help teams turn data into growth.
Fabric · PySpark · Python · Microsoft Azure · Microsoft Azure SQL Database · Azure DevOps · Microsoft Windows PowerShell · Microsoft PowerApps · Power Query · Microsoft Power BI · SQL Server Reporting Services · SQL Server Integration Services · PostgreSQL · Microsoft SQL Server · Microsoft Excel
- $40 hourly
- 0.0/5
- (0 jobs)
Proficient Azure SQL Database and Managed Instance Support Engineer with experience in business analytics, data visualization, web development, and business reporting. Well-versed in Azure cloud networking and architecture, SQL database performance optimization, providing customer support, presentations, debugging, and teaching/knowledge transfer sessions. I have substantial experience in Azure SQL Database/SQL MI support for Microsoft as a client, for 1.5 years, where I gained extensive knowledge and expertise in Azure cloud networking and architecture, 24/7 customer support, SQL database optimization, data migration, backup and restore, and problem-solving. I have recently completed the Microsoft Certified Azure Data Engineer Associate certification exam (DP-203) and have a strong understanding of Azure fundamentals. Prior to that, I obtained a Master of Science degree in Information Technology and Management from the University of Texas at Dallas, USA, where I got hands-on experience in academic projects on Advanced Business Analytics, Big Data, Sentiment Analysis, System Analysis and Project Management, Statistics, Data Warehousing, and Agile Project Management. Earlier, I worked as an Application Development Analyst at Accenture in India for over 4 years, where I gained experience in web development, including skills such as .NET, C#, AngularJS, Data Management, VBA development, project deployment, MS Excel reporting, and presentations. I am well-versed in data visualization, ETL, Python, SQL, Power BI, MS Excel, Databricks, data engineering, and Microsoft Azure Cloud technologies and customer support.
Data Migration · Tableau · Microsoft Power BI Data Visualization · Business Analysis · Databricks Platform · Data Engineering · Big Data · Business · PySpark · Data Lake · Data Analysis · Microsoft Azure · R · SQL · Python
- $20 hourly
- 5.0/5
- (8 jobs)
Dynamic and results-driven Cloud Data Engineer with a proven track record in designing and implementing end-to-end batch and streaming data pipelines. Proficient in orchestrating data workflows within multi-cloud environments, combining technical expertise with a commitment to optimizing data-driven solutions for enhanced business outcomes. Put your faith in me!
Agile Software Development · Terraform · Amazon Web Services · Google Cloud Platform · Apache Spark · Apache Kafka · PySpark · Apache Airflow · SQL · Python
- $20 hourly
- 5.0/5
- (1 job)
🏆 Most recent achievements: ✅ LinkedIn made me a "Top Data Engineering Voice" for my contributions to Data Analytics & Data Engineering ✅ Teaching Python, SQL, Power BI to over 33,000 data enthusiasts on TikTok, YouTube, LinkedIn etc. ✅ Saved my clients ~$500k in computing costs building custom data tools with Python, SQL, Databricks etc. ✅ I have a YouTube channel with over 1.87k subscribers where I walk through data projects step by step (titled “Stephen | Data”) 💻 What do I do? I help companies build data products that generate the ROI they're after. I've done this using Python, SQL and Spark for 7+ years in data engineering. Here are some of the tools + resources I can design and build from start to finish: 📍 data strategies 📍 data workflows 📍 data quality frameworks 📍 automation + augmentation tools ...among other solutions tailored to your data team's needs. Show me your data challenges and I'll create the solutions for them. 🌐 Other things to note: ✅ Response time: <24 hours ✅ Availability: immediate (most of the time) Feel free to check out my portfolio and online handles for more information about me.
pandas · Data Ingestion · Data Transformation · Data Extraction · Data Analytics · Data Engineering · Data Warehousing & ETL Software · ETL · Databricks Platform · PySpark · SQL · Python
- $60 hourly
- 0.0/5
- (0 jobs)
PhD in Computer Science with strong technical skills and a research track in Linked Data, Databases, and Large-Scale Machine Learning. Highly skilled in Apache Spark, PySpark, Python, SQL, Datalog, and designing scalable data processing pipelines. Passionate about enhancing input data quality in machine learning. For more details, please visit my GitHub or LinkedIn profile.
Neo4j · NoSQL Database · Query Optimization · Java · OWL · SPARQL · RDF · Deep Neural Network · Regression Analysis · Machine Learning · Apache Hadoop · PySpark · SQL · Python
- $250 hourly
- 0.0/5
- (0 jobs)
An experienced ML Engineer with over a decade of professional experience, I specialize in transforming data into actionable insights and building robust machine learning models. Dedicated to helping clients succeed, I offer freelance services that ensure data-driven decisions and advanced ML solutions. Let's collaborate to turn your data challenges into success stories.
Data Science · PyTorch · Machine Learning · Data Engineering · Web Scraping · Data Analysis · MLOps · PySpark · TensorFlow · Databricks Platform · SQL · Python
- $150 hourly
- 0.0/5
- (0 jobs)
Technical Profile * PyKX * Python * Q * FX Trading * Cash/Money Markets * Fixed Income Investments * KDB+ * Java * KDB Gateway * KDB Insights Platform * KDB Insights Enterprise * KDB.AI * Chained Ticker Plant * Linux * RDB * HDB * Docker * MongoDB * Asset Management * Wealth Management * KX Technology * Risk Management * GitHub * Athena * Spring Boot * Microsoft SQL Server I have 12+ years of experience in planning, building, implementing, and integrating full-scale commercial projects across verticals such as Financial, Retail, Insurance, Banking, High-tech, Social Media, Oil and Gas, and Networking/Telecom. I have worked with various Agile practices like TDD, BDD, pair programming, continuous integration and Scrum, and with programming languages and technologies like Java, Q, PyKX, Scala, Python, Spark, shell scripting, KDB Gateway, KDB+ tick configuration, KDB+ tick profiling, query routing, KDB+, KX Technology, and time-series processing.
dbt · Apache Airflow · Apache Kafka · Apache Cassandra · Apache HBase · Big Data · Hive · Scala · kdb+ · Java · Python · Databricks Platform · Azure DevOps · DevOps · PySpark
- $18 hourly
- 5.0/5
- (3 jobs)
I help organizations leverage the power of cloud and big data to drive actionable insights and optimize their data pipelines. With extensive experience in Azure and a solid command of tools like Spark, Python, and SQL, I specialize in building efficient ETL processes, creating advanced data visualizations, and optimizing data systems to support scalable business intelligence solutions. 🔹 My Tech Stack & Expertise: ✅ Cloud & Data Engineering: Azure, Databricks, Azure Data Factory, Azure Logic Apps, Apache NiFi ✅ ETL & Data Integration: Extraction, Transformation, Loading (ETL) workflows ✅ Data Analytics & Visualization: Microsoft Power BI, Excel ✅ Data Warehousing: Azure Synapse Data Warehouse ✅ Programming & Optimization: Python, Spark, SQL, Performance Optimization, Quality Assurance I have worked across a variety of industries, enabling businesses to unlock the value of their data and make data-driven decisions through powerful, scalable solutions. My focus is on providing end-to-end data engineering services that deliver results. 🔹 What I Offer: ✔️ Custom ETL Pipelines & Data Integration ✔️ Scalable & Secure Data Solutions on Azure ✔️ Data Analytics & Visualization using Power BI and Excel ✔️ Performance Tuning & Optimization for Big Data ✔️ Quality Assurance for Data Workflows ➡️ Why Clients Love Working With Me: 🔹 Proven Success: Expertise in large-scale data solutions with a focus on optimization and quality. 🔹 Fast Turnaround: Efficient and reliable, with strong collaboration skills. 🔹 Cost-Effective Solutions: Streamlined processes that reduce overhead and drive ROI. 🔹 Long-Term Collaboration: Trusted by organizations to consistently deliver high-quality results. 💬 Client Testimonials: ● "Umair has been a crucial asset to our data team, optimizing our data pipelines and ensuring smooth integration with Azure. His technical expertise and problem-solving skills are top-notch." ● "With Umair's guidance, we were able to build a robust data warehousing solution, enabling us to leverage analytics effectively. His attention to detail and focus on performance made all the difference." ● "Great man, 5+ stars! Was able to handle any and all requests." ● "Awesome job as always! Thank you for all the things you did." 📩 Let’s build your next big project! Message me now for a FREE consultation.
Apache NiFi · Price Optimization · Data Lake · Data Modeling · Python · Microsoft Excel · Data Analysis · Databricks Platform · SQL · Data Warehousing & ETL Software · Data Visualization · Microsoft Power BI · PySpark · Microsoft Azure
- $50 hourly
- 0.0/5
- (0 jobs)
• Experienced Data Engineer: 4+ years of hands-on experience in designing, building, and maintaining scalable data pipelines and ETL workflows on cloud platforms like AWS and GCP. Expertise in data migration, database optimization, and real-time data processing using tools like Apache Kafka, Spark Streaming, and Airflow. • Python & PySpark Expert: Proficient in Python for data processing, transformation, and analysis. Skilled in building distributed data pipelines using PySpark for large-scale data processing. • Cloud Data Specialist: Proficient in leveraging AWS services (Glue, Athena, DMS, EMR, S3, Aurora) and GCP for building robust data solutions. Skilled in containerization (Docker, Kubernetes) and orchestration tools for efficient data pipeline management. • Database Engineering Expert: 10+ years of experience in database design, development, SQL query tuning and administration, with deep expertise in MySQL, PostgreSQL and AWS Aurora. Proven track record in ETL development, data migration, and performance tuning for high-volume transactional systems. • Data-Driven Problem Solver: Adept at transforming complex data into actionable insights through data visualization (Tableau, Power BI) and data storytelling. Strong background in data analysis, feature engineering, and data quality assurance. • Certified Data Professional: MSc in Data Science from the University of East Anglia, complemented by extensive hands-on experience in data engineering and cloud platforms. • Experienced ML Engineer: 3+ years of ML engineering experience implementing sophisticated algorithms and managing end-to-end ML lifecycles, with particular expertise in the Airline Loyalty domain.
Apache Airflow · Databricks Platform · AWS Glue · Snowflake · PostgreSQL Programming · Tableau · Microsoft Power BI · PostgreSQL · Apache Spark · Apache Kafka · PySpark · Python · Database · MLOps · Machine Learning
- $40 hourly
- 0.0/5
- (0 jobs)
Highly experienced Data Engineer with 7+ years of expertise designing and implementing scalable, cloud-native data solutions on the Azure platform, primarily within the healthcare domain. Proficient in building end-to-end data pipelines using Azure Data Factory, Azure Databricks, Delta Lake, and Azure SQL Database to drive efficient data integration, transformation, and analytics. Skilled in PySpark, SQL, Python, and big data technologies such as Hadoop and Hive. Experienced in developing HIPAA-compliant Data Lakehouse solutions to support secure, large-scale healthcare data processing. Adept at optimizing ETL/ELT workloads for performance, scalability, and cost-efficiency. A collaborative team player with a strong track record of partnering with global teams and stakeholders to deliver high-impact data solutions. Committed to continuous learning and staying current with advancements in cloud and data engineering technologies.
ETL Pipeline · Data Extraction · Databricks Platform · Hive · Apache Spark · Big Data · Apache Hadoop · SQL · PySpark · Python · Microsoft Azure
- $65 hourly
- 0.0/5
- (0 jobs)
I help businesses in the banking, finance, insurance, and sustainability sectors solve complex data problems. From cloud migration to analytics-ready data lakes, I’ve led solutions that save time, reduce cost, and boost reporting. > 5+ years of data engineering (16+ in IT overall). > Expertise in Azure (ADF, Synapse, Databricks, Fabric), SQL, and Spark/PySpark. > Reduced cloud costs by £10K/month through optimized pipelines. > Passionate about clean data, automation, and end-user value.
PySpark · Apache Spark · Databricks Platform · Big Data · Expert Knowledge · Synapse · Fabric · Azure DevOps · Data Engineering · ETL Pipeline · ETL · Data Extraction · Data Analysis
- $50 hourly
- 0.0/5
- (0 jobs)
I am a Data Engineer with experience building and maintaining production data pipelines. I have experience working with Python3, SQL and various other tools to help make data easy to access and use.
Data Extraction · ETL · Jira · Atlassian Confluence · dbt · Terraform · Databricks Platform · Apache Spark · PySpark · AWS Lambda · Amazon S3 · SQL · Time Series Analysis · Apache Airflow · Python
- $85 hourly
- 5.0/5
- (13 jobs)
I am an experienced full-stack data and backend engineer. My background and skills include: - Expert in Python, SQL and NodeJS - Certified AWS Cloud Practitioner, studying for exams in Certified Developer, Solutions Architect, SysOps Administrator and Data Analytics - Two MScs, in Mathematics and in Data Science - Databases (Postgres, PostGIS, Aurora, DynamoDB, MySQL, Mongo, Redis, Snowflake, Redshift, Neo4j) and ORMs - Cloud Infrastructure (AWS, GCP, Terraform) - APIs (FastAPI, Flask, Express, GraphQL) - Orchestration (Airflow, Kubernetes) - Pub/Sub + Queuing (Kafka, RabbitMQ) - Version control (Git) - CI/CD (Docker, GitHub Actions, Jenkins) - GIS (PostGIS, Shapely, GDAL, H3) - Web crawling (Scrapy, custom crawlers in Python/Node/Go) - PySpark/AWS EMR/AWS Batch I have worked in data and tech for 4 years, including full-time roles as a Data Scientist, Data Engineer, Backend Engineer and Head of Data. In a previous life I worked in investment banking in Sales and Trading for 4 years. Previous projects include: - Designed and built a booking engine for a client to allow them to expand into the reservation and beauty treatment business (GCP, Python, Postgres, Redis and FastAPI). - Assisted a bottling company in expanding their tech capabilities by moving them from spreadsheets into the cloud and developing APIs to automate their business with other partners (Python, Postgres, AWS, FastAPI). - Refactored an employer's crawling system to make it more efficient. Previously each data entity was crawled on a regular frequency, but the vast majority of entities very rarely changed, leading to unnecessary crawling and resource use. The refactor took into account the changes in the data and how often they occurred to predict the next optimal time to crawl, resulting in a 15% reduction in cloud costs (AWS, Postgres, NodeJS, Python, RabbitMQ). - Developed a streaming change data capture pipeline handling 50 million unique payloads per day, resulting in a 15% reduction in total cloud costs while allowing live data to be available to customers (AWS, Python, NodeJS, RabbitMQ, Postgres, Snowflake, Airflow) - Created a data architecture to allow aggregation of geospatial time series for any possible geographic polygon across 20bn+ global data points in sub-second time (AWS, Clickhouse, Python, FastAPI, Postgres, Uber H3, Redis) - Created an autonomous on-demand Excel and PDF reporting system to allow a sales team to generate their own reports from data stores with no required input from developers (AWS, Python) - Developed multiple machine learning models running in production, including an age/gender classification for faces in photos (Python, Keras, GCP) and an entity resolution system combining tabular, text and image embeddings to deduplicate 30mm+ listings across multiple provider platforms (AWS, Python, PyTorch, RabbitMQ, Neo4j, Postgres)
Amazon Web Services · Terraform · RabbitMQ · Flask · Machine Learning · RESTful API · PostgreSQL · Node.js · Snowflake · PySpark · Apache Kafka · Docker · Apache Airflow · Python · SQL
- $24 hourly
- 4.9/5
- (49 jobs)
Looking to build a cutting-edge social media analytics product that harnesses the power of AI? Look no further! As a seasoned MLOps and machine learning engineer with expertise in NLP and textual data analysis, I have the skills and tools you need to take your project to the next level. 💪🚀 With my experience using state-of-the-art natural language processing models like GPT-3 and GPT-4, I can help you unlock new levels of insight from your data. These powerful language models use deep learning techniques to generate natural language text that is virtually indistinguishable from that written by a human, making them ideal for tasks like language translation, content creation, and more. 📚💡 And when it comes to deploying your models at scale, I'm a pro with ML automation and deployment tools like MLflow, PySpark, TensorFlow Extended, Kubeflow, and more. Whether you're working with AWS, Databricks, or another cloud platform, I'll help you ensure your models are deployed reliably and efficiently, with careful unit testing and source control. 🚀☁️ So if you're looking for a top-notch machine learning engineer to join your team and help you build a social media analytics product that can transform the way you do business, look no further. With my expertise in NLP, MLOps, and machine learning, we'll unlock new insights and drive innovation together. By utilizing my expertise in NLP, MLOps, and machine learning, your organization can benefit in the following ways: 🔍 Improved decision-making through advanced NLP analysis of customer sentiment and market trends. 🤝 Enhanced customer experience by personalizing products and services based on customer preferences. ⚙️ Streamlined operations and cost savings through ML automation and deployment tools. 🚀 Competitive advantage by leveraging state-of-the-art NLP models for content generation and recommendations. 💪 Scalability and cost-efficiency through reliable deployment on cloud platforms. 
These benefits will contribute to increased profitability and drive innovation within your organization. Let's collaborate to unlock the full potential of your social media analytics product! 🌟 Let's get started! 🎉
Docker • Bash • PySpark • Golang • GPT-3 • Data Science • Data Mining • Databricks Platform • pandas • SQL • Apache Spark • Python • MLflow • Machine Learning • Natural Language Processing
- $50 hourly
- 0.0/5
- (0 jobs)
⚡️ Friendly but professional, and solution-oriented. Hi! I'm a Data Engineer with over 10 years of experience building scalable, high-performance data pipelines, automating ETL workflows, and managing cloud-based infrastructure on platforms like AWS, Azure, and Snowflake.
🔧 I specialize in:
• End-to-end ETL pipeline development (Python, PySpark, SQL)
• Data modeling and warehouse design (Kimball, Snowflake)
• Cloud-based data engineering (AWS Glue, Azure Data Factory)
• Dashboarding and reporting (Power BI, Palantir Foundry, Tableau)
🚀 What sets me apart:
• Deep understanding of large-scale data systems (public health service and supply chain experience)
• Clean, modular, production-ready code
• Clear communication and a collaborative approach
Whether you're a startup needing a quick MVP or a company scaling your data infrastructure, I can help you get there, on time and with confidence. Let's talk about your project!
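As a sketch of the end-to-end ETL pipeline development mentioned above, the snippet below reads a CSV with PySpark, normalizes column names, de-duplicates on a key, and writes Parquet. This is a minimal illustration, not code from the profile: the paths, the `id` key column, and the snake_case naming convention are all assumptions.

```python
from typing import Dict, List


def snake_case_columns(columns: List[str]) -> Dict[str, str]:
    """Map raw column names to snake_case, a typical first cleanup step."""
    return {c: c.strip().lower().replace(" ", "_") for c in columns}


def run_etl(input_path: str, output_path: str) -> None:
    """Minimal PySpark ETL: CSV in, cleaned Parquet out.

    Paths and the `id` key column are placeholders. pyspark is imported
    lazily so the pure helper above works without a Spark installation.
    """
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()
    df = spark.read.csv(input_path, header=True, inferSchema=True)

    # Rename columns to a consistent snake_case convention.
    for old, new in snake_case_columns(df.columns).items():
        df = df.withColumnRenamed(old, new)

    # Drop rows missing the key, then de-duplicate on it.
    df = df.dropna(subset=["id"]).dropDuplicates(["id"])

    df.write.mode("overwrite").parquet(output_path)
    spark.stop()
```

The lazy `SparkSession` import keeps the column-mapping helper independently testable; in a real pipeline the session would usually be created once and passed in rather than built per job.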
dbt • Data Warehousing • PySpark • Data Modeling • Tableau • Microsoft Power BI Development • Microsoft Power BI • Python • SQL • Snowflake • Microsoft Azure SQL Database • Microsoft Azure • AWS CodePipeline • ETL Pipeline • Data Engineering
- $30 hourly
- 0.0/5
- (2 jobs)
Data Engineer with over 4 years of professional experience developing ETL/ML pipelines, APIs, app backends, and SQL databases using Python. I also have experience developing machine learning models.
PySpark • Artificial Intelligence • Flask • PyTorch • Keras • Python
- $35 hourly
- 0.0/5
- (1 job)
I possess a comprehensive skillset that seamlessly blends data engineering expertise with data science proficiency. My expertise lies in architecting and implementing robust data pipelines, utilizing Python, Spark, PostgreSQL, and Amazon Web Services to ensure data reliability and accessibility. My proficiency in data preprocessing, cleansing, and transformation enables me to develop pipelines for analysis and predictive modeling in both R and Python. This unique fusion of skills allows me to transition smoothly between data engineering and data science tasks, so I can efficiently handle the entire data lifecycle from acquisition to insight generation and extract actionable insights from complex datasets.
Outside of my professional work, I have consistently engaged in personal projects that further enhance my data engineering and data science skills. My recent project involved designing and executing a data ingestion and storage solution in AWS DynamoDB and an S3 bucket using Python, AWS Lambda, Pandas, and Boto3. This project showcased my proficiency in AWS setup and configuration, ensuring seamless interaction between AWS services.
I am also adept at conducting comprehensive data cleaning and exploratory data analysis using R and the ggplot2 package, revealing key insights and trends from complex datasets. My data visualization skills are well-honed, as evidenced by my ability to clean and visualize data using Power BI and DAX on an academic institution's payment collection dataset.
My latest project involved developing a database schema, performing data ingestion into the database, and carrying out exploratory analysis using SQL to extract valuable insights for business decision-making. This project further solidified my database modeling and SQL skills.
I have a working understanding of Machine Learning Operations (MLOps) and its application in data engineering and the model deployment lifecycle (MDLC), and I have experience using GitHub Actions for managing CI/CD in a professional setting.
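A DynamoDB/S3 ingestion flow like the one described above might look roughly like this. It is a hedged sketch, not code from the project: the table name, bucket, and record shape are illustrative, and `chunk_records` reflects DynamoDB's documented limit of 25 items per `batch_write_item` call.

```python
from typing import Any, Dict, Iterator, List

# DynamoDB's batch_write_item accepts at most 25 put/delete requests
# per call, so records are split into batches of that size.
BATCH_LIMIT = 25


def chunk_records(
    records: List[Dict[str, Any]], size: int = BATCH_LIMIT
) -> Iterator[List[Dict[str, Any]]]:
    """Yield successive batches of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def ingest(records: List[Dict[str, Any]], table_name: str,
           bucket: str, key: str) -> None:
    """Write records to DynamoDB in batches and archive the raw payload to S3.

    Table and bucket names are placeholders. boto3 is imported lazily so
    the pure batching helper above runs without AWS dependencies.
    """
    import json
    import boto3

    table = boto3.resource("dynamodb").Table(table_name)
    for batch in chunk_records(records):
        # batch_writer buffers puts into batch_write_item calls and retries
        # unprocessed items; chunking explicitly also bounds memory use.
        with table.batch_writer() as writer:
            for item in batch:
                writer.put_item(Item=item)

    # Keep a raw JSON copy of the payload in S3 for replay/auditing.
    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=json.dumps(records).encode("utf-8")
    )
```

In the described project the ingestion would run inside an AWS Lambda handler rather than as a standalone function, but the batching and archival steps are the same.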
SQL Server Integration Services • AWS Glue • Snowflake • PySpark • ETL Pipeline • Microsoft Power BI • R • AWS Lambda • PostgreSQL • SQL • React • Python • PHP
- $100 hourly
- 0.0/5
- (0 jobs)
DATA ENGINEER with 4 years of working experience in big data technologies.
PERSONAL PROJECTS
• Face Recognition: a face recognizer built with the OpenCV library that can detect faces using a Haar cascade classifier, create a dataset to train itself, and recognize faces live via webcam.
• Pose Detection: pose detection of the human body using OpenPose and OpenCV, with model weights trained on the COCO keypoint and MPII human pose datasets. The software reads an image with OpenCV, converts it to an input blob that is fed to the network, and then predicts the pose.
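The Haar cascade face detection step described above can be sketched as follows. This is an assumption-laden illustration, not the project's code: it uses OpenCV's stock frontal-face cascade, and the corner-conversion helper is a hypothetical convenience for cropping or drawing boxes.

```python
from typing import List, Tuple

# (x1, y1, x2, y2) or (x, y, w, h) rectangle as plain ints.
Box = Tuple[int, int, int, int]


def to_corners(detections: List[Box], width: int, height: int) -> List[Box]:
    """Convert (x, y, w, h) rectangles from detectMultiScale into
    (x1, y1, x2, y2) corners clipped to the image bounds."""
    return [
        (max(0, x), max(0, y), min(width, x + w), min(height, y + h))
        for x, y, w, h in detections
    ]


def detect_faces(image_path: str) -> List[Box]:
    """Detect faces with OpenCV's bundled Haar frontal-face cascade.

    cv2 is imported lazily so the pure helper above runs without OpenCV
    installed; scaleFactor/minNeighbors are common starting values.
    """
    import cv2

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    image = cv2.imread(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    h, w = gray.shape[:2]
    return to_corners([tuple(int(v) for v in f) for f in faces], w, h)
```

For the live-webcam case described in the project, the same detection loop would run per frame from `cv2.VideoCapture(0)` instead of a single `imread`.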
Hive • SQL Programming • Amazon S3 • Oracle • PySpark • ETL
Want to browse more freelancers?
Sign up
How hiring on Upwork works
1. Post a job
Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.
2. Talent comes to you
Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.
3. Collaborate easily
Use Upwork to chat or video call, share files, and track project progress right from the app.
4. Payment simplified
Receive invoices and make payments through Upwork. Only pay for work you authorize.