Hire the best PySpark Developers in the United Kingdom

Check out PySpark Developers in the United Kingdom with the skills you need for your next job.
  • $40 hourly
Data Scientist | AI Specialist | Machine Learning Engineer | Python, SQL, Spark, GitHub, Docker, Airflow, LLM, LangChain, CrewAI, PydanticAI | 6+ Years of Experience ◾️ Who am I? I am a full-stack data scientist, AI specialist, and Machine Learning Engineer with over 6 years of experience, helping businesses solve problems through data-driven approaches. I've also been recognized as an industry leader and emerging talent by the UK Global Talent Program, and as a top 1% mentor by ADPList (the largest online mentorship community). ◾️ Why work with me? ✅ I am experienced, reliable, trustworthy, and professional. ✅ I put your needs first and provide tailored solutions to ensure the finished product is a great fit. ✅ I have experience working across multiple industries, with clients ranging from startups to global businesses with $2 billion turnover. ✅ I have over 6 years of industry experience across Data and AI. ◾️ More About Me 🔗 LinkedIn: younes-sandi/ 🔗 ADPList: adplist.org/mentors/younes-sandi 🔗 GitHub: github.com/Unessam 🔗 Website: ds4technology.com/ ◾️ Accepting • Data Analysis • Data Visualization • Web Scraping • Data Modeling • ETL • Data Manipulation • Machine Learning • Data Engineering • Automation and Pipeline Development • Model Deployment • LLM Development • AI Agent Development • MLOps • AIOps • Tutoring • Mentorship ◾️ Tools • Python • SQL • PySpark • Microsoft Azure • AWS • GCP • Tableau • Power BI • Google Analytics • Excel • Docker • Airflow • MLflow • GitHub • OpenAI • Claude • Gemini • LangChain • CrewAI • Pydantic • n8n ◾️ Projects • Predictive Modeling • Recommendation Systems • Segmentation and Clustering • Path Analysis • LLM Integration • AI Integration • Business Automation • Sales & Demand Prediction • Customer Segmentation • Customer Attribution Modeling • Customer Churn/Retention Prediction • Predictive Maintenance Modeling • Anomaly Detection and Modeling • NLP and Sentiment Analysis • Survival Analysis and Modeling • A/B Testing • Impact Analysis • Performing Complex Statistical Tests • Agentic Workflow Development • AI Agent Development • AI Chatbot Development
Featured Skill PySpark
    AI Agent Development
    AI Model Development
    AI Consulting
    OpenAI API
    LLM Prompt Engineering
    Survival Analysis
    Predictive Analytics
    Data Analysis
    PySpark
    Machine Learning
    Data Science
    Recommendation System
    Deep Learning
    SQL
    Python
  • $40 hourly
Data Engineer with over 5 years of experience in developing Python-based solutions and leveraging Machine Learning algorithms to address complex challenges. I have a strong background in Data Integration, Data Warehousing, Data Modelling, and Data Quality. I excel at implementing and maintaining both batch and streaming Big Data pipelines with automated workflows. My expertise lies in driving data-driven insights, optimizing processes, and delivering value to businesses through a comprehensive understanding of data engineering principles and best practices. KEY SKILLS Python | SQL | PySpark | JavaScript | Google Cloud Platform (GCP) | Azure | Amazon Web Services (AWS) | TensorFlow | Keras | ETL | ELT | dbt | BigQuery | Bigtable | Redshift | Snowflake | Data Warehouse | Data Lake | Dataproc | Dataflow | Data Fusion | Dataprep | Pub/Sub | Looker | Data Studio | Data Factory | Databricks | AutoML | Vertex AI | Pandas | Big Data | NumPy | Dask | Apache Beam | Apache Airflow | Azure Synapse | Cloud Data Loss Prevention | Machine Learning | Deep Learning | Kafka | Scikit-learn | Data Visualisation | Tableau | Power BI | Django | Git | GitLab
Featured Skill PySpark
    Data Engineering
    dbt
    ETL
    Chatbot
    CI/CD
    Kubernetes
    Docker
    Apache Airflow
    Apache Kafka
    PySpark
    Machine Learning
    Exploratory Data Analysis
    Python
    SQL
    BigQuery
  • $30 hourly
As an Azure-certified (DP-203) Data Engineer with a strong focus on data modeling and advanced cloud data architecture, I specialize in creating and optimizing data warehouses, lakehouses, and integrated data ecosystems tailored to business needs. Leveraging best practices in data engineering, I utilize a wide range of Azure tools to design and deploy robust, scalable, and highly efficient data solutions. My expertise includes end-to-end data pipeline design, data modeling, and transformation using leading Azure services like Azure Data Factory, Azure Synapse Analytics, and Azure Data Lake Storage, alongside the power of Azure Databricks for big data processing. I have extensive experience with multi-cloud solutions, incorporating other cloud platforms such as AWS and Google Cloud to enhance flexibility and scalability. Core Competencies: ◉ Data Warehousing & Lakehouse Architecture: Skilled in implementing scalable data warehouses and lakehouses using Azure Synapse Analytics, SQL Database, and Azure Data Lake. ◉ Data Modeling & ETL/ELT Pipelines: Expert in data transformation and ETL/ELT pipeline design with Azure Data Factory and Databricks, focusing on efficient data flow and storage. ◉ Azure Databricks & Spark for Big Data: Proven experience in big data processing, utilizing Databricks for both real-time and batch processing to deliver high-performance data solutions. ◉ Multi-Cloud Integration: Capable of integrating Azure with AWS, Google Cloud, and other platforms to create seamless multi-cloud architectures. ◉ Data Governance & Security: Proficient in implementing data governance and security practices with Azure Active Directory, Role-Based Access Control (RBAC), and data masking. Let's work together to unlock the power of your data and drive your business to new heights with modern data architecture and cloud solutions tailored to your needs!
Featured Skill PySpark
    Data Transformation
    Data Analysis
    Microsoft Power BI
    Apache Kafka
    BigQuery
    Snowflake
    Apache Airflow
    Data Warehousing
    Data Lake
    Microsoft Azure
    Databricks Platform
    PySpark
    ETL Pipeline
    Python
    SQL
  • $50 hourly
    As a seasoned Data Scientist and Technical Product Manager, I bring extensive experience in Financial Crime Risk and Credit Risk management, coupled with deep proficiency in Python, Spark, SAS (Base, EG, and DI Studio), Hadoop, and SQL. Transitioning into freelancing, I am eager to leverage my skills to contribute to diverse projects. While Upwork's guidelines restrict sharing direct links to external profiles, I am happy to provide a detailed portfolio from my LinkedIn upon request.
Featured Skill PySpark
    Data Mining
    Big Data
    Data Science
    Fraud Detection
    Data Analysis
    PySpark
    SAS
    Credit Scoring
    Apache Hadoop
    SQL
    Python
  • $60 hourly
I'm a data scientist with a Master's in Analytics and 3 years of industry experience. I have experience in all areas of data science but specialise in: - Developing and deploying machine learning models - Natural language processing - Analysing and visualising data with interactive dashboards - Creating clear, well-documented, reusable Python code - AWS Certified Cloud Practitioner Get in touch and find out how I can help!
Featured Skill PySpark
    Data Analytics
    GitHub
    Algorithm Development
    Network Analysis
    Analytics
    PySpark
    SQL
    Tableau
    Data Science
    Python
    Machine Learning Model
    Deep Learning
    Machine Learning
    Natural Language Processing
    Amazon SageMaker
  • $50 hourly
Results-driven Data Engineer/Data Scientist with experience in Data Analytics, Data Warehousing, Statistical Modelling and Visualization. Certified by Microsoft as an Azure Data Scientist & Data Engineer.
Featured Skill PySpark
    Data Analysis
    Natural Language Processing
    Data Analytics
    Cloud Computing
    Artificial Intelligence
    Microsoft Azure
    Data Analytics & Visualization Software
    Data Cleaning
    Deep Learning
    Machine Learning
    Data Ingestion
    PySpark
    Data Engineering
    SQL
    Python
  • $51 hourly
🏆 Multi-Award Winner with Big5 company ex-clients in the UK, Europe, and the Caribbean 🎯 Digital & Data Analytics Strategist. 📈 Expert at helping you drive Digital Transformation in any industry. 🏆 Worked with the biggest mobile networks and finance clients in the UK and Europe, with 30 million+ customers and data systems. 📈 Process Transformation, Data-Driven Decision-Making Enabler. Developing and implementing a successful Business Intelligence strategy should not feel daunting with expert guidance and support from the early stages. That's where I can help you. For start-ups, small enterprises, and corporate companies, I specialize in developing scalable Business Intelligence and Data Analytics solutions. Using the most recent tools and technologies, I will work hard to develop a high-performing BI solution for you. I am well versed in the tools & technologies listed below (expert level & practitioner). -- Visualization Tools: Microsoft Fabric, Tableau, Data Studio, Power BI, QlikView, BIRST, Jaspersoft and Cognos. -- Cloud ETL: Funnel.io, Supermetrics, Fivetran, Stitch, Snowflake, Synapse -- Cloud Platforms: Amazon AWS, GCP, Microsoft Azure, Microsoft Fabric -- Microsoft Fabric, Redshift, Google Cloud Platform, Google Data Studio, Looker Studio, Google BigQuery, SQL, ETL Pipelines, Power BI I can help turn your data or digital transformation project into a money-spinner. 💰 I would love to arrange an initial consultation call with you to discuss your ideas for maximizing your business performance. ════════════ SERVICES ════════════ 🟢 End-to-End Data Architecture Strategy and development according to your key business priorities 🟢 ETL Data Pipelines from any data source into AWS, Azure and GCP 🟢 Creating complex SQL queries 🟢 Beautiful-looking dashboards (in Power BI, Google Data Studio or Tableau) 🟢 Business Analysis, i.e., providing actionable insights to solve business problems 🟢 Transition from Google Sheets/Excel reports into automated data dashboards 🟢 Interactive dashboards 🟢 Creating reports and dashboards I've worked with big telecom and finance companies with 1 billion+ revenue and 10 million+ transactions per day. I have worked with companies that asked me to work on several KEY business performance indicators such as ROI, CPC, CPA, AOV, CAR, and COGS. Feel free to schedule a call with me to talk about your project. Kind Regards, Savneet
Featured Skill PySpark
    Data Warehousing
    PySpark
    .NET Core
    ETL Pipeline
    Business Intelligence
    Microsoft Power BI
    Database Design
    Data Visualization
    Microsoft Azure
    Python
    Angular
    C#
    SQL
  • $100 hourly
💬 "Every month, I spend hours manually pulling reports instead of focusing on our strategy" 💬 "I just want to see my team's performance without having to juggle different spreadsheets" 💬 "End-of-month reporting shouldn't feel like assembling a thousand-piece jigsaw puzzle" 💬 "I just need the figures to reconcile. Why is it such a hassle to get consistent data?" If you find yourself nodding to any of these, you're in the right place. I'm Ayub, and I specialise in streamlining data and reporting processes, so you can focus on what truly matters: growing your business. Let's make your data work for you, not the other way around. 𝗜𝗡𝗧𝗥𝗢 With 7+ years in the data and analytics space, I've collaborated with the likes of Meta, HelloFresh, Capgemini, and several thriving startups. 𝗦𝗨𝗖𝗖𝗘𝗦𝗦 𝗦𝗧𝗢𝗥𝗜𝗘𝗦 Online consumer services business: Worked closely with senior management to gather reporting requirements and developed a suite of Tableau reports following data visualisation best practices. These dashboards allowed everyone in the business to finally automate and track business KPIs with ease. ⭐️ Testimonial: "Ayub is exemplary in his work and delivery. He is quick in understanding the exact requirement, his planning is meticulous and he has an eye for details. He is very good with data visualization and his dashboards have made it easy for our organization to make sense of numbers. I enjoyed working with Ayub and would love to work with him in future as well." E-commerce agency: Built a data pipeline to extract and load live tracking and price history data, and built dashboards in Tableau, Power BI, Google Data Studio, and Klipfolio. These dashboards are used by the business as an analytics offering for their clients, consolidating and presenting their clients' data in a compact and easy-to-digest set of dashboards. ⭐️ Testimonial: "I've worked with Ayub for over a year on some complex data and data visualisation projects in Tableau, Power BI and Klipfolio. I've found him to be very competent and an excellent problem solver, as well as responsive and efficient. Looking forward to working with him again in the future!" Drop me a message anytime to discuss your challenges. All the best, Ayub
Featured Skill PySpark
    Data Management
    Amazon Redshift
    ETL
    PySpark
    Amazon S3
    BigQuery
    PostgreSQL
    Data Vault
    Data Modeling
    Apache Airflow
    Apache Spark
    Data Warehousing
    dbt
    Amazon Web Services
    Google Cloud Platform
    Terraform
    Cloud Engineering
    Snowflake
    SQL
    Python
    Data Engineering
  • $30 hourly
Hi there! I have over 4 years of experience in Data Engineering and Data Analytics. I use Python as my daily driver, and I regularly work with technologies and frameworks like SQL, Azure Databricks, Azure Data Factory, Azure Synapse Analytics and Power BI. I can help you with tasks like Data Extraction, Data Cleaning, Data Transformation, Data Analysis and Data Visualisation. Feel free to reach out if you'd like to discuss your project with me! Languages - Python, SQL Cloud Tools - Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Azure Data Lake Storage Data Processing, Transformation and Analysis - Apache Spark, PySpark, Pandas Data Visualisation - Power BI Data Storage Formats - CSV, Microsoft Excel, Google Sheets, Parquet Others - Jupyter Notebook, ipynb
Featured Skill PySpark
    Algorithm Development
    Data Management
    Java
    Data Analysis
    Data Structures
    Resume
    Interview Preparation
    Candidate Interviewing
    Machine Learning
    Data Science
    Career Coaching
    PySpark
    Apache Spark
    Python
    SQL
  • $45 hourly
🎯 Microsoft Fabric | Power BI | SQL | BI Developer | Financial & Commercial Data Specialist | I'm a results-driven BI Developer with a strong background in finance, commercial analytics, and data engineering, delivering impactful data solutions that support strategic decisions at all levels of the business. I specialize in building robust, scalable reporting systems that bring clarity and insight to complex datasets. I've worked closely with senior global stakeholders, especially in financial and commercial environments, to automate reporting, optimize performance, and uncover opportunities through data. 💼 Core Strengths: - Financial & Commercial Expertise: Proven track record in revenue analysis, forecasting, performance tracking, and cost optimization reporting - Power BI Development: Dashboards, KPIs, advanced DAX, and data storytelling tailored to financial and business users - Microsoft Fabric: Data pipelines, lakehouses/warehouses, dataflows, notebooks (Python/Spark) - Advanced SQL & Excel: ETL, complex joins, financial modeling, and dynamic reporting - Data Governance & Documentation: SOPs, data dictionaries, glossaries, and process flows - Stakeholder Engagement: Skilled at translating commercial and financial needs into reliable, high-impact BI tools If you need a BI expert who understands the numbers and the business, let's connect. I help teams turn data into growth.
Featured Skill PySpark
    Fabric
    PySpark
    Python
    Microsoft Azure
    Microsoft Azure SQL Database
    Azure DevOps
    Microsoft Windows PowerShell
    Microsoft PowerApps
    Power Query
    Microsoft Power BI
    SQL Server Reporting Services
    SQL Server Integration Services
    PostgreSQL
    Microsoft SQL Server
    Microsoft Excel
  • $40 hourly
Proficient Azure SQL Database and Managed Instance Support Engineer with experience in business analytics, data visualization, web development, and business reporting. Well-versed in Azure cloud networking and architecture, SQL database performance optimization, providing customer support, presentations, debugging, and teaching/knowledge-transfer sessions. I have substantial experience in Azure SQL Database/SQL MI support with Microsoft as a client for 1.5 years, where I gained extensive knowledge and expertise in Azure cloud networking and architecture, 24/7 customer support, SQL database optimization, data migration, backup/restore, and problem-solving. I recently completed the Microsoft Certified: Azure Data Engineer Associate certification exam (DP-203) and have a strong understanding of Azure fundamentals. Prior to that, I obtained a Master of Science degree in Information Technology and Management from the University of Texas at Dallas, USA, where I got hands-on experience through academic projects in Advanced Business Analytics, Big Data, Sentiment Analysis, System Analysis, Project Management, Statistics, Data Warehousing, and Agile Project Management. Earlier, I worked as an Application Development Analyst at Accenture in India for over 4 years, where I gained experience in web development, including the following skills: .NET, C#, AngularJS, Data Management, VBA development, project deployment, MS Excel reporting, and presentations. I am well-versed in data visualization, ETL, Python, SQL, Power BI, MS Excel, Databricks, data engineering, Microsoft Azure cloud technologies, and customer support.
Featured Skill PySpark
    Data Migration
    Tableau
    Microsoft Power BI Data Visualization
    Business Analysis
    Databricks Platform
    Data Engineering
    Big Data
    Business
    PySpark
    Data Lake
    Data Analysis
    Microsoft Azure
    R
    SQL
    Python
  • $20 hourly
    Dynamic and results-driven Cloud Data Engineer with a proven track record in designing and implementing end-to-end batch and streaming data pipelines. Proficient in orchestrating data workflows within multi-cloud environments, combining technical expertise with a commitment to optimizing data-driven solutions for enhanced business outcomes. Put your faith in me!
Featured Skill PySpark
    Agile Software Development
    Terraform
    Amazon Web Services
    Google Cloud Platform
    Apache Spark
    Apache Kafka
    PySpark
    Apache Airflow
    SQL
    Python
  • $20 hourly
🏆 Most recent achievements: ✅ LinkedIn made me a "Top Data Engineering Voice" for my contributions to Data Analytics & Data Engineering ✅ Teaching Python, SQL, Power BI to over 33,000 data enthusiasts on TikTok, YouTube, LinkedIn etc ✅ Saved my clients ~$500k in computing costs building custom data tools with Python, SQL, Databricks etc ✅ I have a YouTube channel with 1.87k+ subscribers where I walk through data projects step by step (titled "Stephen | Data") 💻 What do I do? I help companies build data products that generate the ROI they're after. I've done this using Python, SQL and Spark for 7+ years in data engineering. Here are some of the tools + resources I can design and build from start to finish: 📍 data strategies 📍 data workflows 📍 data quality frameworks 📍 automation + augmentation tools ...among other solutions tailored to your data team's needs. Show me your data challenges and I'll create the solutions for them. 🌐 Other things to note: ✅ Response time: <24 hours ✅ Availability: Immediate (most of the time) Feel free to check out my portfolio and online handles for more information about me.
Featured Skill PySpark
    pandas
    Data Ingestion
    Data Transformation
    Data Extraction
    Data Analytics
    Data Engineering
    Data Warehousing & ETL Software
    ETL
    Databricks Platform
    PySpark
    SQL
    Python
  • $60 hourly
    PhD in Computer Science with strong technical skills and a research track in Linked Data, Databases, and Large-Scale Machine Learning. Highly skilled in Apache Spark, PySpark, Python, SQL, Datalog, and designing scalable data processing pipelines. Passionate about enhancing input data quality in machine learning. For more details, please visit my GitHub or LinkedIn profile.
Featured Skill PySpark
    Neo4j
    NoSQL Database
    Query Optimization
    Java
    OWL
    SPARQL
    RDF
    Deep Neural Network
    Regression Analysis
    Machine Learning
    Apache Hadoop
    PySpark
    SQL
    Python
  • $250 hourly
An experienced ML Engineer with over a decade of professional experience, I specialize in transforming data into actionable insights and building robust machine learning models. Dedicated to helping clients succeed, I offer freelance services that ensure data-driven decisions and advanced ML solutions. Let's collaborate to turn your data challenges into success stories.
Featured Skill PySpark
    Data Science
    PyTorch
    Machine Learning
    Data Engineering
    Web Scraping
    Data Analysis
    MLOps
    PySpark
    TensorFlow
    Databricks Platform
    SQL
    Python
  • $150 hourly
Technical Profile * PyKX * Python * Q * FX Trading * Cash/Money Markets * Fixed Income investments * KDB+ * Java * KDB Gateway * KDB Insights platform * KDB Insights Enterprise * KDB.AI * Chained Ticker Plant * Linux * RDB * HDB * Docker * MongoDB * Asset Management * Wealth Management * KX Technology * Risk Management * GitHub * Athena * Spring Boot * Microsoft SQL Server I have 12+ years of experience in planning, building, implementing, and integrating full-scale commercial projects across verticals such as Financial, Retail, Insurance, Banking, High-tech, Social Media, Oil and Gas, and Networking/Telecom. I have worked with Agile practices like TDD, BDD, pair programming, continuous integration, and Scrum, and with programming languages and technologies including Java, Q, PyKX, Scala, Python, Spark, shell scripting, KDB Gateway, KDB+ tick configuration, KDB+ tick profiling, query routing, KDB+, KX Technology, and time-series processing.
Featured Skill PySpark
    dbt
    Apache Airflow
    Apache Kafka
    Apache Cassandra
    Apache HBase
    Big Data
    Hive
    Scala
    kdb+
    Java
    Python
    Databricks Platform
    Azure DevOps
    DevOps
    PySpark
  • $18 hourly
I help organizations leverage the power of cloud and big data to drive actionable insights and optimize their data pipelines. With extensive experience in Azure and a solid command of tools like Spark, Python, and SQL, I specialize in building efficient ETL processes, creating advanced data visualizations, and optimizing data systems to support scalable business intelligence solutions. 🔹 My Tech Stack & Expertise: ✅ Cloud & Data Engineering: Azure, Databricks, Azure Data Factory, Azure Logic Apps, Apache NiFi ✅ ETL & Data Integration: Extraction, Transformation, Loading (ETL) workflows ✅ Data Analytics & Visualization: Microsoft Power BI, Excel ✅ Data Warehousing: Azure Synapse Data Warehouse ✅ Programming & Optimization: Python, Spark, SQL, Performance Optimization, Quality Assurance I have worked across a variety of industries, enabling businesses to unlock the value of their data and make data-driven decisions through powerful, scalable solutions. My focus is on providing end-to-end data engineering services that deliver results. 🔹 What I Offer: ✔️ Custom ETL Pipelines & Data Integration ✔️ Scalable & Secure Data Solutions on Azure ✔️ Data Analytics & Visualization using Power BI and Excel ✔️ Performance Tuning & Optimization for Big Data ✔️ Quality Assurance for Data Workflows ➡️ Why Clients Love Working With Me: 🔹 Proven Success: Expertise in large-scale data solutions with a focus on optimization and quality. 🔹 Fast Turnaround: Efficient and reliable, with strong collaboration skills. 🔹 Cost-Effective Solutions: Streamlined processes that reduce overhead and drive ROI. 🔹 Long-Term Collaboration: Trusted by organizations to consistently deliver high-quality results. 💬 Client Testimonials: ● "Umair has been a crucial asset to our data team, optimizing our data pipelines and ensuring smooth integration with Azure. His technical expertise and problem-solving skills are top-notch." 
● "With Umair's guidance, we were able to build a robust data warehousing solution, enabling us to leverage analytics effectively. His attention to detail and focus on performance made all the difference." ● Great man, 5+ stars! Was able to handle any and all requests. ● Awesome job as always! Thank you for all the things you did. 📩 Let’s build your next big project! Message me now for a FREE consultation.
Featured Skill PySpark
    Apache NiFi
    Price Optimization
    Data Lake
    Data Modeling
    Python
    Microsoft Excel
    Data Analysis
    Databricks Platform
    SQL
    Data Warehousing & ETL Software
    Data Visualization
    Microsoft Power BI
    PySpark
    Microsoft Azure
  • $50 hourly
• Experienced Data Engineer: 4+ years of hands-on experience in designing, building, and maintaining scalable data pipelines and ETL workflows on cloud platforms like AWS and GCP. Expertise in data migration, database optimization, and real-time data processing using tools like Apache Kafka, Spark Streaming, and Airflow. • Python & PySpark Expert: Proficient in Python for data processing, transformation, and analysis. Skilled in building distributed data pipelines using PySpark for large-scale data processing. • Cloud Data Specialist: Proficient in leveraging AWS services (Glue, Athena, DMS, EMR, S3, Aurora) and GCP for building robust data solutions. Skilled in containerization (Docker, Kubernetes) and orchestration tools for efficient data pipeline management. • Database Engineering Expert: 10+ years of experience in database design, development, SQL query tuning and administration, with deep expertise in MySQL, PostgreSQL and AWS Aurora. Proven track record in ETL development, data migration, and performance tuning for high-volume transactional systems. • Data-Driven Problem Solver: Adept at transforming complex data into actionable insights through data visualization (Tableau, Power BI) and data storytelling. Strong background in data analysis, feature engineering, and data quality assurance. • Certified Data Professional: MSc in Data Science from the University of East Anglia, complemented by extensive hands-on experience in data engineering and cloud platforms. • Experienced ML Engineer: 3+ years of ML Engineer experience implementing sophisticated algorithms and managing end-to-end ML lifecycles, with particular expertise in the Airline Loyalty Domain.
Featured Skill PySpark
    Apache Airflow
    Databricks Platform
    AWS Glue
    Snowflake
    PostgreSQL Programming
    Tableau
    Microsoft Power BI
    PostgreSQL
    Apache Spark
    Apache Kafka
    PySpark
    Python
    Database
    MLOps
    Machine Learning
  • $40 hourly
Highly experienced Data Engineer with 7+ years of expertise designing and implementing scalable, cloud-native data solutions on the Azure platform, primarily within the healthcare domain. Proficient in building end-to-end data pipelines using Azure Data Factory, Azure Databricks, Delta Lake, and Azure SQL Database to drive efficient data integration, transformation, and analytics. Skilled in PySpark, SQL, Python, and big data technologies such as Hadoop and Hive. Experienced in developing HIPAA-compliant Data Lakehouse solutions to support secure, large-scale healthcare data processing. Adept at optimizing ETL/ELT workloads for performance, scalability, and cost-efficiency. A collaborative team player with a strong track record of partnering with global teams and stakeholders to deliver high-impact data solutions. Committed to continuous learning and staying current with advancements in cloud and data engineering technologies.
Featured Skill PySpark
    ETL Pipeline
    Data Extraction
    Databricks Platform
    Hive
    Apache Spark
    Big Data
    Apache Hadoop
    SQL
    PySpark
    Python
    Microsoft Azure
  • $65 hourly
I help businesses in banking, finance, insurance, and sustainability sectors solve complex data problems. From cloud migration to analytics-ready data lakes, I've led solutions that save time, reduce cost, and boost reporting. > 5+ years of data engineering (16+ in IT overall). > Expertise in Azure (ADF, Synapse, Databricks, Fabric), SQL, and Spark/PySpark. > Reduced cloud costs by £10K/month through optimized pipelines. > Passionate about clean data, automation, and end-user value.
Featured Skill PySpark
    PySpark
    Apache Spark
    Databricks Platform
    Big Data
Azure Synapse Analytics
    Fabric
    Azure DevOps
    Data Engineering
    ETL Pipeline
    ETL
    Data Extraction
    Data Analysis
  • $50 hourly
I am a Data Engineer with experience building and maintaining production data pipelines. I work with Python 3, SQL and various other tools to make data easy to access and use.
Featured Skill PySpark
    Data Extraction
    ETL
    Jira
    Atlassian Confluence
    dbt
    Terraform
    Databricks Platform
    Apache Spark
    PySpark
    AWS Lambda
    Amazon S3
    SQL
    Time Series Analysis
    Apache Airflow
    Python
  • $85 hourly
I am an experienced full-stack data and backend engineer. My background and skills include: - Expert in Python, SQL and NodeJS - Certified AWS Cloud Practitioner, studying for the Certified Developer, Solutions Architect, SysOps Administrator and Data Analytics exams - Two MScs, in Mathematics and in Data Science - Databases (Postgres, PostGIS, Aurora, DynamoDB, MySQL, Mongo, Redis, Snowflake, Redshift, Neo4j) and ORMs - Cloud Infrastructure (AWS, GCP, Terraform) - APIs (FastAPI, Flask, Express, GraphQL) - Orchestration (Airflow, Kubernetes) - Pub/Sub + Queuing (Kafka, RabbitMQ) - Version control (Git) - CI/CD (Docker, GitHub Actions, Jenkins) - GIS (PostGIS, Shapely, GDAL, H3) - Web crawling (Scrapy, custom crawlers in Python/Node/Go) - PySpark/AWS EMR/AWS Batch I have worked in data and tech for 4 years, including full-time roles as a Data Scientist, Data Engineer, Backend Engineer and Head of Data. In a previous life I worked in investment banking in Sales and Trading for 4 years. Previous projects include: - Designing and building a booking engine for a client to allow them to expand into the reservation and beauty treatment business (GCP, Python, Postgres, Redis and FastAPI). - Assisting a bottling company in expanding their tech capabilities by moving them from spreadsheets into the cloud and developing APIs to automate their business with other partners (Python, Postgres, AWS, FastAPI). - Refactoring an employer's crawling system to make it more efficient. Previously each data entity was crawled at a regular frequency, but the vast majority of entities very rarely changed, leading to unnecessary crawling and resource use. The refactor took into account the changes in the data and how often they occurred to predict the next optimal time to crawl, resulting in a 15% reduction in cloud costs (AWS, Postgres, NodeJS, Python, RabbitMQ). 
- Developing a streaming change data capture pipeline handling 50 million unique payloads per day, resulting in a 15% reduction in total cloud costs while allowing live data to be available to customers (AWS, Python, NodeJS, RabbitMQ, Postgres, Snowflake, Airflow) - Creating a data architecture to allow aggregation of geospatial time series for any possible geographic polygon across 20bn+ global data points in sub-second time (AWS, Clickhouse, Python, FastAPI, Postgres, Uber H3, Redis) - Creating an autonomous on-demand Excel and PDF reporting system to allow a sales team to generate their own reports from data stores with no required input from developers (AWS, Python) - Developing multiple machine learning models running in production, including an age/gender classifier for faces in photos (Python, Keras, GCP) and an entity resolution system combining tabular, text and image embeddings to deduplicate 30mm+ listings across multiple provider platforms (AWS, Python, PyTorch, RabbitMQ, Neo4j, Postgres)
    Featured Skill Pyspark
    Amazon Web Services
    Terraform
    RabbitMQ
    Flask
    Machine Learning
    RESTful API
    PostgreSQL
    Node.js
    Snowflake
    PySpark
    Apache Kafka
    Docker
    Apache Airflow
    Python
    SQL
  • $24 hourly
Looking to build a cutting-edge social media analytics product that harnesses the power of AI? Look no further! As a seasoned MLOps and machine learning engineer with expertise in NLP and textual data analysis, I have the skills and tools you need to take your project to the next level. 💪🚀

With my experience using state-of-the-art natural language processing models like GPT-3 and GPT-4, I can help you unlock new levels of insight from your data. These powerful language models use deep learning techniques to generate natural language text that is virtually indistinguishable from that written by a human, making them ideal for tasks like language translation, content creation, and more. 📚💡

And when it comes to deploying your models at scale, I'm a pro with ML automation and deployment tools like MLflow, PySpark, TensorFlow Extended, Kubeflow, and more. Whether you're working with AWS, Databricks, or another cloud platform, I'll help you ensure your models are deployed reliably and efficiently, with careful unit testing and source control. 🚀☁️

So if you're looking for a top-notch machine learning engineer to join your team and help you build a social media analytics product that can transform the way you do business, look no further. With my expertise in NLP, MLOps, and machine learning, we'll unlock new insights and drive innovation together.

By utilizing my expertise in NLP, MLOps, and machine learning, your organization can benefit in the following ways:
🔍 Improved decision-making through advanced NLP analysis of customer sentiment and market trends.
🤝 Enhanced customer experience by personalizing products and services based on customer preferences.
⚙️ Streamlined operations and cost savings through ML automation and deployment tools.
🚀 Competitive advantage by leveraging state-of-the-art NLP models for content generation and recommendations.
💪 Scalability and cost-efficiency through reliable deployment on cloud platforms.

These benefits will contribute to increased profitability and drive innovation within your organization. Let's collaborate to unlock the full potential of your social media analytics product! 🌟 Let's get started! 🎉🔍
    Featured Skill Pyspark
    Docker
    Bash
    PySpark
    Golang
    GPT-3
    Data Science
    Data Mining
    Databricks Platform
    pandas
    SQL
    Apache Spark
    Python
    MLflow
    Machine Learning
    Natural Language Processing
  • $50 hourly
⚡️ Friendly but professional, and solution-oriented. Hi! I'm a Data Engineer with over 10 years of experience building scalable, high-performance data pipelines, automating ETL workflows, and managing cloud-based infrastructure on platforms like AWS, Azure, and Snowflake.

🔧 I specialize in:
- End-to-end ETL pipeline development (Python, PySpark, SQL)
- Data modeling and warehouse design (Kimball, Snowflake)
- Cloud-based data engineering (AWS Glue, Azure Data Factory)
- Dashboarding and reporting (Power BI, Palantir Foundry, Tableau)

🚀 What sets me apart:
- Deep understanding of large-scale data systems (public health service and supply chain experience)
- Clean, modular, production-ready code
- Clear communication and a collaborative approach

Whether you're a startup needing a quick MVP or a company scaling your data infrastructure, I can help you get there, on time and with confidence. Let's talk about your project!
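As a minimal illustration of the extract-transform-load pattern this profile describes, here is a sketch in plain Python (not the engineer's code; field names like order_id are hypothetical, and an in-memory SQLite database stands in for a real warehouse):

```python
import csv
import io
import sqlite3

def extract(csv_text):
    """Extract: parse raw CSV rows into dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: cast types and drop rows missing the required key."""
    out = []
    for r in rows:
        if not r.get("order_id"):
            continue  # reject malformed rows before loading
        out.append({"order_id": int(r["order_id"]),
                    "amount": round(float(r["amount"]), 2)})
    return out

def load(rows, conn):
    """Load: idempotent upsert into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders "
                 "(order_id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT OR REPLACE INTO orders "
                     "VALUES (:order_id, :amount)", rows)
    conn.commit()

raw = "order_id,amount\n1,9.991\n,3.50\n2,20\n"
conn = sqlite3.connect(":memory:")
load(transform(extract(raw)), conn)
```

In PySpark the same three stages would map onto `spark.read`, DataFrame transformations, and `DataFrame.write`; the upsert-style load keeps reruns of the pipeline from duplicating data.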
    Featured Skill Pyspark
    dbt
    Data Warehousing
    PySpark
    Data Modeling
    Tableau
    Microsoft Power BI Development
    Microsoft Power BI
    Python
    SQL
    Snowflake
    Microsoft Azure SQL Database
    Microsoft Azure
    AWS CodePipeline
    ETL Pipeline
    Data Engineering
  • $30 hourly
Data Engineer with over 4 years of professional experience developing ETL/ML pipelines, APIs, app backends, and SQL databases using Python. I also have experience developing machine learning models.
    Featured Skill Pyspark
    PySpark
    Artificial Intelligence
    Flask
    PyTorch
    Keras
    Python
  • $35 hourly
I possess a comprehensive skillset that seamlessly blends data engineering expertise with data science proficiency. My expertise lies in architecting and implementing robust data pipelines, utilizing Python, Spark, PostgreSQL, and Amazon Web Services to ensure data reliability and accessibility. My proficiency in data preprocessing, cleansing, and transformation enables me to develop pipelines for analysis and predictive modeling in both R and Python.

This fusion of skills allows me to transition seamlessly between data engineering and data science tasks, handling the entire data lifecycle from acquisition to insight generation, and ensures that I can extract actionable insights from complex datasets.

Outside of my professional work, I consistently pursue personal projects that further sharpen these skills. A recent project involved designing and executing a data ingestion and storage solution in AWS DynamoDB and S3 using Python, AWS Lambda, Pandas, and Boto3, demonstrating my proficiency in AWS setup and configuration and in making AWS services work together. I am also adept at data cleaning and exploratory data analysis using R and the ggplot2 package, revealing key insights and trends in complex datasets. My data visualization skills are well-honed, as evidenced by cleaning and visualizing an academic institution's payment collection dataset using Power BI and DAX. My latest project involved developing a database schema, ingesting data into the database, and carrying out exploratory analysis in SQL to extract insights for business decision-making, further solidifying my database modeling and SQL skills.

I have a working understanding of Machine Learning Operations (MLOps) and its application across the data engineering and model deployment lifecycle (MDLC), and I have experience using GitHub Actions to manage CI/CD in a professional setting.
    Featured Skill Pyspark
    SQL Server Integration Services
    AWS Glue
    Snowflake
    PySpark
    ETL Pipeline
    Microsoft Power BI
    R
    AWS Lambda
    PostgreSQL
    SQL
    React
    Python
    PHP
  • $100 hourly
DATA ENGINEER with 4 years of working experience in big data technologies.

PERSONAL PROJECTS
- Face Recognition: a face recognizer built with the OpenCV library that detects faces using a Haar Cascade face classifier, builds a dataset to train itself, and recognizes faces live from a webcam.
- Pose Detection: human body pose detection using OpenPose and OpenCV, with model weights from the COCO keypoint and MPII human pose datasets. The software reads an image with OpenCV, converts it to an input blob that can be fed to the network, and predicts the pose.
    Featured Skill Pyspark
    Hive
    SQL Programming
    Amazon S3
    Oracle
    PySpark
    ETL

How hiring on Upwork works

1. Post a job

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.