Hire the best Apache Spark Engineers in Poland

Check out Apache Spark Engineers in Poland with the skills you need for your next job.
  • $55 hourly
    Over 22 years of experience working on enterprise-class systems, across a variety of projects and customers. Always focused on performance and reliability, always at the cutting edge of technology, and always keeping business objectives and customers' interests in mind.
    Competencies:
    • Solution selection, architecture design, implementation, customization, migration, and internal adoption: freelance consulting & audit
    Big Data & Machine Learning expertise:
    • Machine learning model adoption, from business use case through production maintenance
    • Data collection, data pipelines, and data workflows
    • Big data ETL & mining
    • System integration
    I have a solid portfolio of successfully implemented enterprise-level solutions, as well as ad-hoc optimizations of existing systems in large, global organizations.
    Featured Skill Apache Spark
    Kubernetes
    BigQuery
    SQL Programming
    Big Data
    Docker
    Apache Kafka
    NoSQL Database
    Machine Learning
    Scala
    Python
  • $100 hourly
    I have over 4 years of experience in Data Engineering, especially using Spark and PySpark to extract value from massive amounts of data. I have worked with analysts and data scientists, conducting workshops on working with Hadoop/Spark and resolving their issues with the big data ecosystem. I also have experience in Hadoop maintenance and in building ETL, especially between Hadoop and Kafka (a minimal sketch of that kind of pipeline follows the skill list below). You can find my profile on Stack Overflow (link in the Portfolio section), where I mostly answer questions tagged spark and pyspark.
    Featured Skill Apache Spark
    MongoDB
    Data Warehousing
    Data Scraping
    ETL
    Data Visualization
    PySpark
    Python
    Data Migration
    Apache Airflow
    Apache Kafka
    Apache Hadoop
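    A minimal sketch of the kind of Hadoop-to-Kafka ETL mentioned above, assuming a hypothetical HDFS path, broker address, and topic name, and assuming the spark-sql-kafka connector is on the classpath:

      from pyspark.sql import SparkSession
      from pyspark.sql.functions import to_json, struct

      # Run with the Kafka connector, e.g.:
      # spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.0 hdfs_to_kafka.py
      spark = SparkSession.builder.appName("hdfs-to-kafka").getOrCreate()

      # Hypothetical HDFS path; each row is serialized to JSON as the Kafka message value.
      events = spark.read.parquet("hdfs:///data/events/dt=2024-01-01")

      (events
          .select(to_json(struct(*events.columns)).alias("value"))
          .write
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder broker
          .option("topic", "events")                          # placeholder topic
          .save())

    In batch mode this publishes the whole snapshot; a streaming variant would swap in readStream/writeStream and add a checkpoint location.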
  • $60 hourly
    🚀 Senior Data Engineer | Databricks Expert | Data Solutions Architect
    With over 7 years of experience as a Data Engineer, I help businesses transform raw data into real, measurable value. I've managed infrastructures scaling beyond 2 petabytes and, in one project alone, handled 500+ TB of data, building and optimizing large-scale pipelines that are fast, reliable, and cost-efficient.
    🧠 Areas of Expertise:
    ✅ Data pipelines (ETL/ELT)
    ✅ Data governance & security
    ✅ Data silos & integration
    ✅ Data CI/CD
    ✅ Anything related to Databricks and data engineering
    🛠️ Services I Offer:
    ✅ Databricks implementation or migration
    ✅ Organizing and structuring your data for insight and scalability
    ✅ Choosing the right data architecture for your goals
    ✅ Building both batch and real-time streaming pipelines
    ✅ Creating your data platform from the ground up
    ✅ Training and deploying ML/AI models using Databricks
    ✅ Implementing Unity Catalog and the Data Lakehouse architecture
    ✅ Data strategy and consulting
    ✅ Training your team to be self-sufficient with Databricks
    💡 My work consistently delivers impact: I've helped clients reduce storage and processing costs by an average of 75%, translating into substantial long-term savings.
    While I'm proficient in Snowflake, AWS Glue, SageMaker, Azure Synapse, GCP BigQuery, and other major cloud platforms, Databricks is my specialty and my preferred environment. I use it to build everything from high-performance ETL/ELT pipelines to ML model training workflows, data governance solutions, and real-time analytics (a minimal sketch of such a pipeline follows the skill list below). It's simply the best platform for modern data engineering, hands down.
    🔍 I solve data challenges like:
    ✅ Inefficient or outdated pipelines
    ✅ Lack of data governance and lineage
    ✅ Disconnected and siloed data sources
    ✅ Difficult-to-maintain or poorly designed data architectures
    📈 My goal is simple: help clients understand, use, and profit from their data. Whether you're scaling up, cleaning up, or starting from scratch, I can help.
    Industries I know really well:
    ✅ Aviation
    ✅ Logistics / ocean freight
    ✅ SaaS / PaaS products
    Let's unlock the value of your data and turn it into real business results.
    Featured Skill Apache Spark
    Data Profiling
    Microsoft Azure
    PySpark
    SQL
    Python
    Data Cleaning
    Data Analytics & Visualization Software
    Data Analysis
    Machine Learning
    Data Mining
    ETL Pipeline
    ETL
    Data Engineering
    Databricks Platform
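    A minimal sketch of the kind of incremental Databricks ingestion pipeline described above, assuming a Databricks runtime (where spark is predefined) and hypothetical storage paths and Unity Catalog table names:

      # Auto Loader (cloudFiles) discovers new files incrementally and appends them to Delta.
      raw = (spark.readStream
             .format("cloudFiles")
             .option("cloudFiles.format", "json")
             .option("cloudFiles.schemaLocation", "/Volumes/main/bronze/_schemas/orders")  # placeholder
             .load("s3://example-bucket/raw/orders/"))                                     # placeholder

      (raw.writeStream
          .format("delta")
          .option("checkpointLocation", "/Volumes/main/bronze/_checkpoints/orders")  # placeholder
          .trigger(availableNow=True)      # process the backlog, then stop (batch-style incremental run)
          .toTable("main.bronze.orders"))  # three-level Unity Catalog name, illustrative

    The availableNow trigger lets the same pipeline run as a scheduled job for cost control or continuously for low latency, without code changes.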
  • $80 hourly
    With over 8 years of professional experience in Data Engineering, I specialize in building and optimizing high-performance data infrastructures using Apache Spark and related big data technologies. My expertise lies in architecting scalable solutions for processing and analyzing massive datasets, enabling businesses to extract actionable insights, drive innovation, and deliver cutting-edge, top-tier AI solutions.
    I possess an in-depth understanding of Apache Spark and have served as the author, designer, and lead developer of several libraries for Apache Spark, including:
    ● Spark OCR (Visual NLP): Developed advanced solutions for Optical Character Recognition (OCR) and Visual Natural Language Processing (NLP) within the Spark ecosystem.
    ● PDF DataSource for Apache Spark: Creator of and contributor to the open-source spark-pdf datasource project, written in Scala, enhancing Spark's data processing capabilities.
    🔑 Key Skills & Expertise:
    🔷 Big Data Processing & Analytics: Extensive experience with Apache Spark (PySpark, Spark ML, Spark Structured Streaming) to build distributed systems and handle complex ETL workflows and real-time data pipelines.
    🔷 Real-Time Data Streaming: Proficient in designing and deploying real-time aggregation systems with tools like Kafka, Kinesis, and Spark Streaming (a minimal sketch of such an aggregation follows the skill list below).
    🔷 Data Engineering Workflows: Skilled in end-to-end development of robust ETL processes, batch pipelines, and automated workflows to transform and enrich large-scale data.
    🔷 Cloud & Distributed Systems: Hands-on expertise with AWS, Databricks, and containerized environments (Docker), ensuring efficient and scalable infrastructure for big data solutions.
    🔷 Optimization & Performance Tuning: Proven ability to optimize Spark jobs and workflows, reducing execution time and improving throughput to handle 50 GB+ daily datasets efficiently.
    🔷 Programming Expertise: Advanced skills in Scala and Python, leveraging best practices for big data applications and distributed systems.
    🔷 Data De-identification & Anonymization: Expert in anonymizing sensitive data from text, images, PDFs, and DICOM files. I ensure privacy, security, and compliance with GDPR and HIPAA standards, using NLP, OCR, and computer vision to remove or mask personal information and safeguard data confidentiality.
    🔷 Healthcare, Pharma, MedTech, BioTech Expertise: Over 5 years of experience in the healthcare and life sciences sectors, with a strong understanding of formats like DICOM and expertise in delivering solutions tailored to the unique needs of these industries.
    TOP 5 Reasons to Work With Me:
    ✅ End-to-End Expertise
    ✅ Complex Problem-Solving Ability
    ✅ Timely Delivery
    ✅ Transparent Communication
    ✅ Scalable Solutions
    🏆 100% Job Success Score
    📈 15+ years of experience
    🕛 15,000+ Upwork hours
    🎓 Master's in Applied Mathematics
    Professional Skills:
    🛠️ Programming Languages: Python, Scala
    ⚡ Big Data & Distributed Systems: Big Data Processing, ETL, Stream Processing, Real-Time Aggregation, Apache Spark (PySpark, Spark ML, Spark Structured Streaming), Kinesis, Kafka, Databricks
    🚀 Cloud Computing & Infrastructure: Amazon Web Services (AWS), Distributed Systems, CI/CD Pipelines, Docker, Jenkins, Graphite, Grafana, Elasticsearch, Kibana
    ⚙️ Databases: PostgreSQL, MongoDB, Redis, DynamoDB
    📊 Data Science & Machine Learning: NLP, Computer Vision, Large Language Models (LLMs), Optical Character Recognition (OCR), Model Productionization, Deep Learning (PyTorch, TensorFlow, Hugging Face Transformers, ONNX, Pandas, CLIP)
    📅 Availability: Committed to long-term collaborations. Available full-time for your next project.
    🔍 Keywords: Technical Architect, Big Data, Document Processing, ML Infrastructure, MLOps Engineer, ETL & ML Pipeline, Cloud Architecture (SaaS), Machine Learning
    Featured Skill Apache Spark
    AI Development
    LangChain
    PySpark
    Databricks Platform
    Software Architecture & Design
    Hugging Face
    Large Language Model
    Machine Learning
    Tesseract OCR
    Python
    Scala
    PyTorch
    Computer Vision
    Natural Language Processing
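    A minimal sketch of a real-time aggregation of the sort described above, using Spark Structured Streaming over Kafka; the broker address, topic name, and event schema are illustrative assumptions:

      from pyspark.sql import SparkSession
      from pyspark.sql import functions as F
      from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

      spark = SparkSession.builder.appName("realtime-aggregation").getOrCreate()

      # Illustrative event schema.
      schema = StructType([
          StructField("user_id", StringType()),
          StructField("amount", DoubleType()),
          StructField("event_time", TimestampType()),
      ])

      events = (spark.readStream
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder
                .option("subscribe", "transactions")                # placeholder
                .load()
                .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
                .select("e.*"))

      # Tumbling 5-minute windows per user; the watermark bounds state kept for late events.
      per_user = (events
                  .withWatermark("event_time", "10 minutes")
                  .groupBy(F.window("event_time", "5 minutes"), "user_id")
                  .agg(F.sum("amount").alias("total_amount")))

      (per_user.writeStream
          .outputMode("update")
          .format("console")   # stand-in sink for demonstration
          .start()
          .awaitTermination())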
  • $120 hourly
    ✅ AWS Certified Solutions Architect
    ✅ Google Cloud Certified Professional Data Engineer
    ✅ SnowPro Core Certified Individual
    ✅ Upwork Certified Top Rated Professional Plus
    ✅ Author of a Python package for the Currency.com cryptocurrency market (python-currencycom)
    Specializing in Business Intelligence development, ETL development, and API development with Python, Apache Spark, SQL, Airflow, Snowflake, Amazon Redshift, GCP, and AWS. I have delivered many projects, from the highly complex to the straightforward, including:
    ✪ Highly scalable distributed applications for real-time analytics
    ✪ Data warehouse design and ETL pipelines for multiple mobile apps
    ✪ Cost optimization for existing cloud infrastructure
    But the main point: I take responsibility for the final result. (A minimal sketch of an Airflow-orchestrated ETL follows the skill list below.)
    Featured Skill Apache Spark
    Data Scraping
    Snowflake
    ETL
    BigQuery
    Amazon Redshift
    Big Data
    Data Engineering
    Cloud Architecture
    Google Cloud Platform
    ETL Pipeline
    Python
    Amazon Web Services
    Apache Airflow
    SQL
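    A minimal sketch of an Airflow DAG orchestrating a daily ETL of the kind listed above; the DAG id, schedule, and task bodies are placeholders, and it assumes Airflow 2.4+:

      from datetime import datetime
      from airflow import DAG
      from airflow.operators.python import PythonOperator

      # Placeholder task bodies; in a real pipeline these would call out to
      # Spark, Snowflake, Redshift, and so on.
      def extract():
          print("pull source data")

      def transform():
          print("clean and model the data")

      def load():
          print("load into the warehouse")

      with DAG(
          dag_id="daily_warehouse_load",  # hypothetical DAG id
          start_date=datetime(2024, 1, 1),
          schedule="@daily",              # Airflow 2.4+ keyword; older versions use schedule_interval
          catchup=False,
      ) as dag:
          extract_t = PythonOperator(task_id="extract", python_callable=extract)
          transform_t = PythonOperator(task_id="transform", python_callable=transform)
          load_t = PythonOperator(task_id="load", python_callable=load)

          extract_t >> transform_t >> load_t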
  • $48 hourly
    Hi! I am a Data Engineer with more than 4 years of experience building robust data pipelines and warehouses. I can help with handling all types of data challenges. I am a great enthusiast of both classic and new data technologies, and I love squeezing out their potential to help companies make new sense of their data. I work with many technologies, including Python, Scala, SQL, Git, Airflow, Hadoop, and Spark, both on-premises and in the cloud. Apart from that, I am an open-minded and extremely easygoing person, and I like working in a good team environment. If you think I could help with your data challenge, send me a message!
    Featured Skill Apache Spark
    BigQuery
    Data Analytics & Visualization Software
    Data Engineering
    Git
    Jira
    Microsoft Excel
    Scala
    Tableau
    Apache Hadoop
    Apache Airflow
    SQL
    Amazon Web Services
    Google Cloud Platform
    Python
  • $100 hourly
    I am an experienced software developer who combines practical skills with a deep respect for the theoretical foundations of software engineering and related disciplines. In my work, I believe that different classes of problems require tailored solutions, so I continuously expand my toolkit and programming practices by attending industry conferences, reading technical literature, and staying current through podcasts and articles. I adhere to the principles of Software Craftsmanship, which emphasize quality, attention to detail, and a responsible approach to software development. I am particularly interested in both low-level and high-level architectural patterns, as well as methodologies for analyzing and translating complex business requirements into scalable, working software.
    In my career, I am also developing expertise in Data Science, including data exploration and analysis, designing machine learning models, and deploying them to production environments. I work with tools such as Python, scikit-learn, TensorFlow, and PyTorch, as well as techniques for optimizing analytical processes.
    In the field of AWS, I focus on designing and implementing cloud solutions, including building scalable microservices, implementing CI/CD strategies, optimizing costs, and ensuring system reliability. I have experience with key services such as AWS Lambda, S3, EC2, RDS, EKS, and the machine learning services.
    I value responsibility and autonomy, both in software engineering and in everyday life. This approach allows me to deliver tangible value to all project stakeholders while fostering collaboration and understanding within teams.
    Featured Skill Apache Spark
    Web Application
    Web API
    Machine Learning Model
    Spring Boot
    SQL
    Teaching Programming
    AWS CloudFormation
    AWS Development
    Python
    Data Science
    Data Analysis
  • $10 hourly
    I am a Master's degree student in Data Science, specializing in data-driven solutions. I develop scalable models using machine learning frameworks like scikit-learn and PyTorch, combined with SQL-based data analysis, and I have achieved over 90% accuracy in customer behavior prediction and A/B testing projects (a minimal training sketch follows the skill list below). I'm here to help you align your business goals with data-driven insights.
    Featured Skill Apache Spark
    Web Scraping
    Big Data
    AWS CloudFormation
    Tableau
    PySpark
    pandas
    TensorFlow
    SQL
    Python
    Adobe InDesign
    Adobe Photoshop
    Adobe Illustrator
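    A minimal sketch of the kind of supervised model training mentioned above, using scikit-learn on synthetic stand-in data rather than real customer data:

      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.metrics import accuracy_score
      from sklearn.model_selection import train_test_split

      # Synthetic stand-in for customer behavior features and a binary outcome.
      X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
      X_train, X_test, y_train, y_test = train_test_split(
          X, y, test_size=0.2, random_state=42)

      model = RandomForestClassifier(n_estimators=200, random_state=42)
      model.fit(X_train, y_train)

      # Held-out accuracy; real projects would also look at precision/recall and calibration.
      print(f"accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")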
  • $20 hourly
    - 10+ years of professional experience in JVM-based software development. - Expertise in Scala’s modern FP stack (Typelevel, ZIO, Akka), data engineering (Hadoop, AWS, Spark), and Java (Spring) for high-performance solutions. - A product-oriented mindset, focused on understanding the business domain deeply to deliver solutions that align with and exceed business expectations.
    Featured Skill Apache Spark
    Distributed Database
    Distributed Computing
    Microsoft SQL Server
    PostgreSQL
    ClickHouse
    PySpark
    Google Cloud Platform
    Amazon Web Services
    Microsoft Azure
    Java
    Apache Flink
    Apache Kafka
    Python
    Scala
  • $12 hourly
    I'm a Data Engineer with a passion for building end-to-end data pipelines and solving real-world problems. I specialize in working with Python, SQL, Apache Spark, and Kafka, delivering reliable, scalable systems for both batch and real-time processing (a minimal streaming sketch follows the skill list below).
    🔹 Experienced with cloud platforms like AWS, GCP, Snowflake, and Databricks
    🔹 Skilled in orchestrating workflows using Airflow and CI/CD tools like Docker & GitHub Actions
    🔹 Comfortable designing ETL/ELT pipelines, validating data quality, and creating insightful dashboards with Streamlit and Grafana
    🔹 Built and deployed 7+ production-grade projects, including real-time fraud detection, outage monitoring, and IoT data analytics
    🔹 Work independently as a B2B contractor with flexible hours; no need for sponsorship or relocation
    I'm self-taught, fast to adapt, and genuinely enjoy working on data problems. Whether it's automation, analytics, or infrastructure, I'll bring clarity to your data and value to your team.
    Featured Skill Apache Spark
    Data Cleaning
    Real Time Stream Processing
    Dashboard
    Analytics Dashboard
    PostgreSQL
    Streamlit
    Apache Kafka
    Apache Airflow
    SQL
    Python
    Data Visualization Framework
    Data Analysis
    ETL Pipeline
    ETL
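    A minimal sketch of a real-time fraud-detection-style filter of the kind mentioned above, reading from Kafka and writing flagged rows to PostgreSQL; the broker, topic, schema, threshold, and connection details are all illustrative assumptions:

      from pyspark.sql import SparkSession
      from pyspark.sql import functions as F
      from pyspark.sql.types import StructType, StructField, StringType, DoubleType

      spark = SparkSession.builder.appName("fraud-alerts").getOrCreate()

      # Illustrative transaction schema.
      schema = StructType([
          StructField("tx_id", StringType()),
          StructField("amount", DoubleType()),
      ])

      txs = (spark.readStream
             .format("kafka")
             .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder
             .option("subscribe", "payments")                    # placeholder
             .load()
             .select(F.from_json(F.col("value").cast("string"), schema).alias("t"))
             .select("t.*"))

      # Toy rule standing in for a real fraud model.
      suspicious = txs.filter(F.col("amount") > 10_000)

      def write_alerts(batch_df, batch_id):
          # Hypothetical JDBC sink; connection details are placeholders.
          (batch_df.write
              .format("jdbc")
              .option("url", "jdbc:postgresql://db:5432/alerts")
              .option("dbtable", "suspicious_tx")
              .option("user", "etl")
              .option("password", "change-me")
              .mode("append")
              .save())

      (suspicious.writeStream
          .foreachBatch(write_alerts)
          .option("checkpointLocation", "/tmp/checkpoints/fraud")
          .start()
          .awaitTermination())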
  • $11 hourly
    I am a Data Engineer and Analyst with a strong background in data processing, automation, and big data technologies. Currently, I am pursuing a Master’s degree in Data Science at Cracow University of Technology, where I focus on building efficient data pipelines and optimizing data workflows. I have hands-on experience with Python, SQL, Apache Spark, Kafka, Airflow, and Power BI, allowing me to work with large datasets, automate data transformations, and develop scalable solutions. Additionally, I am proficient in Microsoft Excel and have a keen eye for detail, ensuring accuracy in data entry and organization. I am highly motivated, analytical, and always eager to take on new challenges. If you need a reliable and detail-oriented professional for your data-related tasks, feel free to reach out—I’d love to collaborate! If you're interested in checking out some of the projects I've been working on, feel free to take a look at my LinkedIn profile and GitHub repository.
    Featured Skill Apache Spark
    Git
    GitHub
    Microsoft Azure
    Data Visualization
    Microsoft Power BI
    Java
    Apache Superset
    Apache Kafka
    ETL
    PySpark
    Python
    Docker
    NoSQL Database
    SQL

How hiring on Upwork works

1. Post a job

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.