Hire the best Apache Spark Engineers

Check out Apache Spark Engineers with the skills you need for your next job.
Clients rate Apache Spark Engineers 4.8/5 based on 775 client reviews.
  • $30 hourly
    Big Data / Data Engineer with almost 4 years of experience using the AWS and GCP clouds
    ✅ Building data lakes and data warehouses on AWS/GCP cloud infrastructure
    ✅ 💯 Certified AWS Developer 💯
    ✅ Building complex analytical queries using cloud engines
    ✅ Data ingestion from various sources: RDBMS, API, SFTP, S3, GCS
    Open-minded software engineer, eager to work with complex distributed systems and components. Capable of building back-end solutions, with strong design, integration, and problem-solving skills. Skilled in Python, AWS, GCP, Apache Airflow, Apache Spark, and Apache Kafka, with database analysis and design. I also have good experience with the Odoo ERP framework. Creative thinker; highly disciplined, punctual, demanding, and a good team player, with strong written and verbal communication. 𝛑 Finished my Master's degree in Applied Math in 2020 at Lviv Polytechnic National University. ⚡⚡⚡ Worked for big clients such as Dyson, Syngenta, and Deloitte, helping them with the digital transformation of their business, in most cases constructing centralized data storage for them, as well as providing support and improvements to existing solutions. Here is what one of my clients said about me (you can check it on my LinkedIn profile): 💥"Yurii is a highly skilled and capable software engineer working in the Big Data space. He worked on a large data lake platform in the AWS Cloud environment, for which I was the project manager. He was consistently responsive and skillful and came through with timely deliveries when needed. Yurii is a pleasure to work with, due to his technical mastery coupled with strong interpersonal skills. I highly recommend working with him, and will seek him out for future engagements."💥
    Apache Spark
    Apache Airflow
    SQL Programming
    dbt
    Amazon Athena
    Data Warehousing
    Google Dataflow
    AWS Glue
    AWS Lambda
    Snowflake
    Python
    BigQuery
  • $38 hourly
    💡 If you want to turn data into actionable insights, are planning to use the 5 V's of big data, or want to turn your idea into a complete web product... I can help. 👋 Hi. My name is Prashant and I'm a Computer Engineer. 💡 My true passion is creating robust, scalable, and cost-effective solutions, mainly using Java and open-source technologies. 💡 During the last 11 years, I have worked with:
    💽 Big Data: Apache Spark, Hadoop, HBase, Hive, Impala, Flume, Sqoop
    🔍 Searching: ElasticSearch, Logstash, Kibana, Lucene, Apache Solr, Filebeat, Winlogbeat
    ☁️ Cloud services: AWS EMR, AWS S3, AWS EC2, AWS RDS, AWS ElasticSearch, AWS Lambda, AWS Redshift
    5-step approach 👣 Requirements Discussion + Prototyping + Visual Design + Backend Development + Support = Success! Usually, we customize that process depending on the project's needs and final goals. How to start? 🏁 Every product requires a clear roadmap and meaningful discussion to keep everything in check. But first, we need to understand your needs. Let's talk! 💯 Working with me, you will receive a modern, good-looking application that meets all guidelines, with easy navigation, and of course you will have unlimited revisions until you are 100% satisfied with the result. Keywords that you can use to find me: Java Developer, ElasticSearch Developer, Big Data Developer, Team Lead for Big Data applications, Corporate, IT, Tech, Technology.
    Apache Spark
    Big Data
    ETL
    Data Visualization
    Amazon Web Services
    SQL
    Amazon EC2
    ETL Pipeline
    Data Integration
    Data Migration
    Logstash
    Apache Kafka
    Elasticsearch
    Apache Hadoop
    Core Java
  • $400 hourly
    I excel at analyzing and manipulating data, from megabytes to petabytes, to help you complete your task or gain a competitive edge. My first and only language is English. My favorite tools: Tableau, Alteryx, Spark (EMR & Databricks), Presto, Nginx/OpenResty, Snowflake, and any Amazon Web Services tool/service (S3, Athena, Glue, RDS/Aurora, Redshift Spectrum). I have these third-party certifications:
    - Alteryx Advanced Certified
    - Amazon Web Services (AWS) Certified Solutions Architect - Professional
    - Amazon Web Services (AWS) Certified Big Data - Specialty
    - Amazon Web Services (AWS) Certified Advanced Networking - Specialty
    - Amazon Web Services (AWS) Certified Machine Learning - Specialty
    - Databricks Certified Developer: Apache Spark™ 2.X
    - Tableau Desktop Qualified Associate
    I'm looking for one-time and ongoing projects. I especially enjoy working with large datasets in the finance, healthcare, ad tech, and business operations industries. I possess a combination of analytic, machine learning, data mining, and statistical skills, plus experience with algorithms and software development/authoring code. Perhaps the most important skill I possess is the ability to explain the significance of data in a way that others can easily understand. Types of work I do:
    - Consulting: How to solve a problem without actually solving it.
    - Doing: Solving your problem based on your existing understanding of how to solve it.
    - Concept: Exploring how to get the result you are interested in.
    - Research: Finding out what is possible, given a limited scope (time, money) and your resources.
    - Validation: Guiding how your existing or new team is going to solve your problem.
    My development environment: I generally use a dual-computer, quad-monitor setup to access my various virtualized environments over my office fiber connection. This allows me to use any OS needed (macOS/Windows/*nix) and also to rent any AWS hardware needed for faster project execution and to simulate clients' production environments as needed. I also have all tools installed in the environments where they make the most sense. I'm authorized to work in the USA. I can provide signed nondisclosure, noncompete, and invention assignment agreements above and beyond the Upwork terms if needed. However, I prefer to use the pre-written Optional Service Contract Terms: www [dot] upwork [dot] com/legal#optional-service-contract-terms.
    Apache Spark
    CI/CD
    Systems Engineering
    Google Cloud Platform
    DevOps
    BigQuery
    Amazon Web Services
    Web Service
    Amazon Redshift
    ETL
    Docker
    Predictive Analytics
    Data Science
    SQL
    Tableau
  • $40 hourly
    I am a highly skilled Data Scientist with a master's degree in Data Science, Top Rated Plus on Upwork, providing services in Data Science and MLOps. With over 4 years of experience in the field, I offer a broad range of NLP and Computer Vision services. I am an expert in ChatGPT, GPT, Large Language Models (LLMs), and LangChain, and experienced in managing vector databases such as Milvus and Faiss. My proficiency lies in building models for NLP, which includes text preprocessing, sentiment analysis, topic modeling, text classification, OCR, visual question answering, text summarization, document classification, named entity recognition (NER), text generation, machine translation, and speech-to-text and text-to-speech capabilities for audio data. I leverage the latest NLP tools and technologies, such as SBERT, spaCy, the Hugging Face Transformers library, ChatGPT, GPT-4, and SentenceTransformer, to ensure high accuracy and efficiency in all projects. In addition, I have a firm grip on the Computer Vision field and can handle a wide range of image and video processing tasks, including action recognition, object tracking, optical flow analysis, scene segmentation, image segmentation, image classification, object detection, image captioning, visual question answering, and video processing. These services are critical for applications such as sports analysis, medical imaging, security systems, and any situation where real-time video processing is essential. I can also combine NLP and Computer Vision techniques to build your application. In the realm of data engineering and MLOps, I excel in designing, building, deploying, and maintaining large-scale, data-driven (distributed) systems. I can proficiently handle tasks involving data collection, processing, and ETL, converting models to ONNX, quantizing and optimizing AI models, and model deployment with FastAPI, Spark, and Kubernetes at scale. I am also adept at building complex pre-processing and post-processing pipelines for model inference and handling big data within a distributed environment using PySpark. Regardless of the scale of your user base, I can design a system that efficiently manages concurrent requests, ensuring a seamless user experience with minimal latency. I utilize MLOps and data engineering technologies and tools, including PySpark, Docker, Kubernetes, Jina AI, and Airflow, for big data processing and efficient model inference. My skill set extends to the integration of multimodal models like CLIP, BLIP, Donut, and LayoutLMv3, and diffusion techniques for art generation models, including Stable Diffusion and DreamBooth training. I am well-versed in the latest computer vision and NLP tools, technologies, libraries, and frameworks, such as OpenCV, Python, PyTorch, TensorFlow, SBERT, spaCy, the Hugging Face library, SentenceTransformer, OpenAI, LangChain, Large Language Models (LLMs), and vector databases (Milvus, Faiss, etc.). I have worked on numerous projects, delivering high-quality and scalable solutions to my clients. I have a firm grip on the following technologies:
    - Frameworks: Jina AI, Haystack, PySpark, FastAPI, ONNX, Airflow
    - AI Inference Libraries: PyTorch, Hugging Face, OpenAI, LangChain, Transformers, ONNX
    - Vector Databases: Milvus, Pinecone, Faiss, Weaviate
    - Databases: PostgreSQL, SQL, NoSQL, Hive, MySQL
    - MLOps: Docker, Kubernetes.
    Apache Spark
    Artificial Intelligence
    Big Data
    Data Mining
    Image Processing
    Computer Vision
    Data Science
    PyTorch
    Python
    Keras
    TensorFlow
    Natural Language Processing
    Deep Learning Modeling
    Machine Learning Model
    Model Optimization
  • $100 hourly
    I have over 4 years of experience in Data Engineering (especially using Spark and PySpark to gain value from massive amounts of data). I have worked with analysts and data scientists, conducting workshops on working with Hadoop/Spark and resolving their issues with the big data ecosystem. I also have experience in Hadoop maintenance and building ETL pipelines, especially between Hadoop and Kafka. You can find my profile on Stack Overflow (link in the Portfolio section), where I mostly help with spark- and pyspark-tagged questions.
    Apache Spark
    MongoDB
    Data Warehousing
    Data Scraping
    ETL
    Data Visualization
    PySpark
    Python
    Data Migration
    Apache Airflow
    Apache Kafka
    Apache Hadoop
  • $50 hourly
    DataOps Leader with 20+ Years of Experience in Software Development and IT. Expertise in a wide range of cutting-edge technologies:
    * Databases: NoSQL, SQL Server, SSIS, Cassandra, Spark, Hadoop, PostgreSQL, PostGIS, MySQL, GIS, Percona, TokuDB, HandlerSocket (NoSQL), CRATE, Redshift, Riak, Hive, Sqoop
    * Search Engines: Sphinx, Solr, Elasticsearch, AWS CloudSearch
    * In-Memory Computing: Redis, Memcached
    * Analytics: ETL, analyzing data from a few million to billions of rows, sentiment analysis, Google BigQuery, Apache Zeppelin, Splunk, Trifacta Wrangler, Tableau
    * Languages & Scripting: Python, PHP, shell scripts, Scala, Bootstrap, C, C++, Java, Node.js, .NET
    * Servers: Apache, Nginx, CentOS, Ubuntu, Windows, distributed data, EC2, RDS, and Linux systems
    Proven track record of success in leading IT initiatives and delivering solutions:
    * Full lifecycle project management experience
    * Hands-on experience in leading all stages of system development
    * Ability to coordinate and direct all phases of project-based efforts
    * Proven ability to manage, motivate, and lead project teams
    Ready to take on the challenge of DataOps: I am a highly motivated and results-oriented IT specialist with a proven track record of success in leading IT initiatives and delivering solutions. I am confident that my skills and experience would be a valuable asset to any team looking to implement DataOps practices, and I am excited about the opportunity to help organizations of all sizes achieve their data goals.
    Apache Spark
    Python
    Scala
    ETL Pipeline
    Data Modeling
    NoSQL Database
    BigQuery
    Sphinx
    Linux System Administration
    Amazon Redshift
    PostgreSQL
    ETL
    MySQL
    Database Optimization
    Apache Cassandra
  • $120 hourly
    I have over 12 years of experience, of which about 8 years were spent working with different Big Data technologies (Hadoop, Spark); the rest of the time I mostly worked on writing Python scrapers, scripts, and API services, and also built iOS applications using Objective-C.
    - Experience building data pipelines to process petabyte-scale data and optimising them for cost and performance
    - Experience fine-tuning Spark jobs to the most optimal level, thereby cutting infrastructure costs by 50-80%
    - Experience building data lakes for major e-commerce and fintech companies
    - Worked at different startups throughout my career; highly adaptable to different working methodologies like Agile and Kanban
    Apache Spark
    Big Data
    Apache Hadoop
    PySpark
    Scala
    Python
  • $30 hourly
    I have been working in Machine Learning model development for over 5 years with success. I use Docker and Docker Compose to containerize services/microservices, making deployment easier and scaling simpler. Machine learning expertise:
    - Python
    - HuggingFace/Transformers
    - Keras, PyTorch, TensorFlow
    - spaCy, NLTK
    - scikit-learn
    - scikit-image
    - OpenCV
    - pandas
    - NumPy
    I have also worked with backend and frontend frameworks for model deployment, such as Flask, Tornado, Django, ReactJS, and React Native. I have configured several ML infrastructures in GCP and AWS so that they achieve low latency and low cost at the same time. Types of projects I have completed in recent years:
    #1. Natural Language Processing/Understanding/Generation
    - Chat engine with HuggingFace, GPT, LangChain, OpenAI, ChromaDB
    - Virtual personal assistant with Whisper, LLaMA, Tacotron
    - Customer support with Whisper, GPT-4, Google TTS, Twilio
    - Tweet generator
    - Recommendation systems
    #2. Computer Vision
    - OCR with Textract, Tesseract, docTR
    - Real-time person detection and recognition
    - Image segmentation
    - Image depth measurements
    #3. Reinforcement Learning
    - Fine-tuning GPT-2 for review generation with TRL
    - Toy game player/agent with DQN
    #4. Artificial Intelligence Algorithms
    - Genetic Algorithm
    - DPLL
    #5. Regression Models
    - Sales prediction
    - Stock price prediction
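As an illustration of the Docker Compose containerization mentioned above, a minimal sketch of a compose file might look like the following; the service names, port, and image are hypothetical placeholders, not a real project:

```
# Illustrative docker-compose.yml: a model-serving API plus a cache.
services:
  model-api:
    build: .            # image built from a local Dockerfile (placeholder)
    ports:
      - "8000:8000"     # expose the API port (placeholder)
    depends_on:
      - redis
  redis:
    image: redis:7      # off-the-shelf cache for predictions
```

Scaling then becomes a matter of `docker compose up --scale model-api=3` or moving the same containers to Kubernetes.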
    Apache Spark
    Google Cloud Platform
    OCR Algorithm
    API Integration
    Text Recognition
    Autoencoder
    Deep Learning
    Machine Learning
    Model Tuning
    Amazon SageMaker
    TensorFlow
    Natural Language Processing
    OpenCV
    Python
    Model Optimization
  • $18 hourly
    #React #Node #Angular #Python #TypeScript Greetings! ❆ I am a seasoned Senior Full-Stack Developer with expertise in React, Angular, Python, and Node.js. With extensive experience in web development, I have successfully delivered a wide range of projects, showcasing my proficiency across both frontend and backend technologies. Skills:
    ✈ Building dynamic and interactive user interfaces with React, Redux, Angular, and the Angular CLI
    ✈ Building server-side applications using Python, Django, Node.js, and Express.js
    ✈ Experience with the Django ORM, models, and database interactions
    ✈ HTML5, CSS3, and responsive web design principles
    ✈ Experience with frontend build tools and package managers (Webpack, npm, etc.)
    ✈ Debugging and troubleshooting to identify and resolve issues
    ✈ Proficient in Git and version control workflows
    ✈ Experience with CI/CD pipelines and deployment using tools like Jenkins or GitLab CI
    ✈ Excellent verbal and written communication skills
    ✈ Ability to collaborate effectively with team members and stakeholders
    ❆ If you are seeking a highly skilled Senior Full-Stack expert with proficiency in React, Angular, Python, and Node.js, I am confident in my ability to contribute to your projects and drive their success. Let's collaborate to create cutting-edge software solutions that propel your business forward.
    Apache Spark
    Redux Saga
    Angular Material
    Amazon Web Services
    Django
    JavaScript
    Redux
    TypeScript
    ExpressJS
    Git
    MongoDB
    webpack
    Node.js
    Python
    React
  • $45 hourly
    I am a versatile professional with a unique trajectory spanning both physics research (where I obtained a PhD) and information technology. Over the course of my career, I have amassed expertise that has enabled me to transition seamlessly from scientific research to the IT landscape. For the past six years, I have been fully immersed in IT, leveraging my skills to develop robust Python backend solutions and data-centric methodologies within IT consulting and product-oriented international IT startups. I have demonstrated the ability to rapidly gain competencies in any sub-domain of IT as I enter each new project, and I thrive in interdisciplinary environments, where my background allows me to bridge the gap between technical and scientific domains. My competencies:
    ◦ Strong practical knowledge of risk-model calibration methodologies and machine learning methods
    ◦ Deep knowledge of various statistical methods, including time series analysis
    ◦ Deep knowledge of numerical methods, including Monte Carlo, Markov chain Monte Carlo, finite element, and finite difference
    ◦ Strong programming, algorithmic, and problem-solving skills, and vast experience in Python backend development
    Apache Spark
    Dialogflow
    Web Services Development
    Mathematical Modeling
    Python Asyncio
    Statistical Analysis
    COMSOL Multiphysics
    Python
  • $40 hourly
    I am a passionate person, and I am most passionate about solving problems with data. As a Data Scientist with 4 years of industrial experience, I am equipped with the machine learning knowledge to make the world a better place with Data Science. In my professional career, I have worked both as a freelance Data Scientist and as a full-time employee. I have worked in the IoT industry for clients in Pakistan and the Middle East, and I also have experience in the transport industry, providing solutions using text analytics and NLP. My current industry is retail: I am working as a Data Scientist for MATAS, a Danish retail and beauty company, where I am responsible for all stages of the data science process, from business understanding to model deployment. Skill sets:
    - Understanding the business problem and where data science can create value.
    - Ability to research academia and industry for modern solutions.
    - Ability to explain data science to non-technical business stakeholders.
    - Key areas where I consider myself well versed: recommendation systems, multi-armed bandits, send-time optimization, demand forecasting, price elasticity, word2vec and sentence embeddings, and pretty much all the machine learning algorithms.
    - Well versed in big data frameworks such as Spark, with hands-on experience with PySpark DataFrames and the Databricks platform.
    - Building data integration pipelines and collaborating with data engineers to support the ETL.
    - Designing Power BI dashboards to present insights to stakeholders.
    - Developing DevOps pipelines for model deployment using Docker and Kubernetes.
    - Maintaining motivation and enthusiasm within the team when model accuracy falls.
    Apache Spark
    ETL Pipeline
    Data Integration
    PySpark
    Data Visualization
    Machine Learning
    Apache Spark MLlib
    Python
    R
    Natural Language Processing
    Deep Learning
    Recommendation System
    Databricks Platform
    Computer Vision
  • $45 hourly
    Personal skills
    • Analytical and quality-oriented. I am interested in researching best practices and creating solutions that are resilient and work in general contexts. In the long term, 5-star systems pay off the extra effort.
    • Communicative. I don't like waiting 3 days to receive feedback, so I don't do that to others. You can expect fast feedback from me, at the very least a "looking into it". Also, if I think you're wrong in your requirements, I'll tell you and suggest alternative solutions :) Besides, I consider myself a friendly and approachable person who loves to help my colleagues whenever I can!
    Technical skills
    —— Data Engineering
    • Expertise in Python: I ranked in the top 15% out of 1.3 million people on the LinkedIn Python assessment (see portfolio).
    • SQL
    • ETL/ELT with Python, Databricks (PySpark), dbt, Dagster, Airbyte, and a lot of AWS services.
    • Agile, Extreme Programming (XP), Clean Code, and the Google Python Style Guide.
    —— Cloud/DevOps
    • AWS: Batch, Step Functions, Glue, Athena, Boto3, Lambda, S3, EC2, IAM, KMS, SQS, etc.
    • Bash. Docker + ECS. CI/CD with GitHub Actions. Terraform, SAM, CodePipeline.
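As a sketch of the ETL/ELT pattern listed above, here is a minimal, standard-library-only Python example; the column names and data are illustrative, and a real pipeline would read from S3 or a source database rather than an inline string:

```python
import csv
import io
import sqlite3

# Extract: parse CSV text (in practice this would come from S3/SFTP/an API).
raw = "order_id,amount\n1,10.5\n2,invalid\n3,4.0\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Transform: drop rows that fail validation, cast types.
clean = []
for r in rows:
    try:
        clean.append((int(r["order_id"]), float(r["amount"])))
    except ValueError:
        continue  # skip malformed records

# Load: write the validated rows into a warehouse table (SQLite stands in here).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)", clean)
total = con.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(total)  # 14.5 — the malformed row was filtered out in the transform step
```

Tools like dbt, Dagster, or Airbyte industrialize exactly these three stages with scheduling, lineage, and retries.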
    Apache Spark
    Amazon S3
    Amazon API Gateway
    Terraform
    Tableau
    pandas
    AWS Lambda
    Google Cloud Platform
    Amazon Web Services
    dbt
    PySpark
    Databricks Platform
    Python
    Docker
  • $40 hourly
    Experienced AWS-certified Data Engineer with around 4 years of experience in Big Data and its tools.
    AWS | GCP
    Hadoop | HDFS | Hive | Sqoop
    Apache Airflow | Apache Spark | Apache Kafka | Apache NiFi | Apache Iceberg
    Python | Bash | SQL | PySpark | Scala | Delta Lake
    DataStage | Git | Jenkins | SnapLogic | Snowflake
    Apache Spark
    Amazon API Gateway
    Google Cloud Platform
    Apache Kafka
    Apache Airflow
    Big Data
    Data Migration
    Apache NiFi
    Amazon Redshift
    Amazon Web Services
    PySpark
    AWS Lambda
    AWS Glue
    ETL
    Python
    SQL
  • $110 hourly
    Top-rated developer working (mostly) with big data, artificial intelligence, machine learning, analytics & back-end architecture. I specialize in Big Data (Hadoop, Apache Spark, Sqoop, Flume, Hive, Pig, Scala, Apache Kudu, Kafka, Python, shell scripting, core Java, Machine Learning). As a Big Data architect, I work as part of a team responsible for designing and building applications for online analytics. Outgoing, motivated team player eager to contribute dynamic customer service, administrative, supervisory, team-building, and organizational skills towards supporting the objectives of an organization that rewards reliability, dedication, and solid work ethics with opportunities for professional growth. Skill set: Hadoop, Spark, Scala, Python, Bash, Tableau, Jenkins, Ansible, HBase, Sqoop, Flume, Neo4j, Machine Learning, Java, NiFi, AWS, Azure, GCP, Databricks, Datameer, Kafka, Confluent, Schema Registry, SQL, DB2, CDC. Why should you hire me?
    ✅ 1400+ productive Upwork hours logged with 100% customer satisfaction
    » Passion for Data Engineering and Machine Learning
    » Experience with functional Scala: shapeless, cats, itto-csv, neotypes
    » Familiar with the Hadoop ecosystem: Apache Spark, Hive, YARN, Apache Drill, Sqoop, Flume, ZooKeeper, HDFS, MapReduce, Machine Learning, Airflow
    » Worked with JWT authentication, reactive JDBC-like connectors for PostgreSQL, MySQL & MariaDB, and reactive MongoDB
    » Microservices expert. Worked mostly with Lagom: Akka Persistence, event sourcing
    » Defining scalable architectures on top of AWS, Google Cloud, DigitalOcean, and Alibaba Cloud
    » Elasticsearch stack pro: Elasticsearch, Logstash, Beats, Kibana
    » Efficient project manager
    Let's discuss your idea and build the next big thing!
    Apache Spark
    Google Cloud Platform
    Apache HBase
    Snowflake
    Machine Learning
    Apache Spark MLlib
    Databricks Platform
    ETL Pipeline
    AWS Glue
    Apache Hive
    Scala
    SQL
    Docker
    Apache NiFi
    Apache Kafka
    Apache Hadoop
  • $20 hourly
    I understand your business needs very well, can find problems in your business using your past data, and can find or create new ways to solve them.
    Apache Spark
    Snowflake
    PySpark
    Databricks Platform
    Weka
    Apache Spark MLlib
    Data Science
    Data Mining
    Oracle PLSQL
    Apache Kafka
    Scala
    Python
    SQL
    Microsoft SQL Server
    Spring Framework
  • $100 hourly
    I hold a BS in Physics from the University of Colorado and a Master's equivalent in computer science and physics with an emphasis in data science. I have expertise in numerical analysis, modeling, and machine learning. I have a diverse mathematical and programming skillset and specialize in Python, Julia and Mathematica development. I can build complex applications in a broad range of domains based on my experience and learning methodology. My strengths are in problem-solving, analysis, and communication.
    Apache Spark
    API Development
    Financial Modeling
    Mathematical Modeling
    Database Design
    Mathematica
    Artificial Intelligence
    Physics
    Julia
    Data Extraction
    Forecasting
    Algorithm Development
    Machine Learning
    Data Science
    Statistical Analysis
    Python
  • $30 hourly
    Software backend ★ DevOps ★ Database performance tuning and recovery ★ CI/CD ★ Elasticsearch ★ Redis caching ★ Azure ★ Google Cloud ★ AWS
    Building containerized backend APIs on cloud servers. Managing and deploying microservices applications using DevOps practices. Database performance tuning for better caching and latency. Setting up CPU- and memory-optimized clusters in the cloud for data science applications. Optimizing compression, storage, and access of big data using Apache Spark.
    We provide Data Distribution, Data API delivery, Data Analysis, and DevOps support services. The technologies we use are the following:
    • Cloud providers: AWS, Azure, Google Cloud, DigitalOcean
    • Scripting languages: Python, Bash/shell scripting, Java
    • CI/CD integration: Azure DevOps, Travis CI, GitLab, Jenkins, Terraform
    • Databases: MySQL, PostgreSQL, TimescaleDB
    • Libraries for data analysis: Pandas, Dask
    • Containerization: Docker, Podman, Buildah, Docker Compose, Kubernetes, Helm
    • Observability: Prometheus, Grafana, Jaeger, Kibana
    • Big data frameworks: Elasticsearch, Apache Spark, Dask
    • ML/DL frameworks: scikit-learn, XGBoost, PyTorch, Keras, TensorFlow
    • Web frameworks: Django, Flask, FastAPI
    • Data visualization: Dash, Plotly, Bokeh
    • API delivery: REST API, JSON API
    • Automation: Selenium
    DevOps automation: We build fully automated continuous integration/continuous delivery pipelines for building and releasing to production.
    Data distribution services: We provide relevant data for your needs - no software, hardware, or fetching skills needed - we do the job as you request.
    Data API requests: We build custom APIs for websites that have rate-limited or data-limited APIs, so that you can use the data in your applications.
    Data analysis: Gain powerful insight metrics from your data and improve performance indicators.
    Machine learning solutions: We build machine learning models based on the data we gather for you, so you can build a powerful decision-making process.
    Apache Spark
    Microsoft Azure
    Kubernetes
    Amazon Web Services
    Elasticsearch
    Google Cloud Platform
    DevOps
    Redis
    Docker
    Jenkins
    CI/CD
    Terraform
    Amazon EC2
    Microservice
    Python
  • $35 hourly
    🚀 Experienced Project Manager 🚀 Offering over 10 years of success in leading all phases of diverse technology projects for medium and large businesses.
    🎯 My Expertise Includes 🎯 Project management, requirement detailing, planning, coding, debugging, and deployment.
    🏆 Achievements 🏆 Completed 50+ projects; over 40 million downloads for mobile apps.
    💻 My Team 💻 Includes specialists in:
    Back-end development: PHP, Python, Node.js, Java, .NET
    Front-end development: JavaScript, TypeScript, CSS, HTML
    Mobile development: iOS (Swift/Objective-C/RxSwift), Android (Java/Kotlin), cross-platform (Xamarin, React Native, PhoneGap, Cordova)
    🔧 We Also Specialize In 🔧
    Testing: Selenium, Appium, TDD, BDD, unit testing, REST API
    Cloud: Amazon, Microsoft, Google
    DevOps: Docker, Jenkins, TeamCity, Azure DevOps, Kibana, ELK, Telegraf, Ansible, Terraform
    Data science and exploratory data analysis: statistical analytics, hypotheses, insights, feature engineering
    Machine learning and AI: Computer Vision (CV) and Image Processing (IP); Natural Language Processing/Understanding/Generation (NLP/NLU/NLG) for text analysis, dialog systems, text-to-speech and vice versa, etc.
    UX/UI design: Sketch, Figma, Adobe Photoshop, Axure RP
    🌟 We Call Ourselves Specialists 🌟 Because we fully undertake all responsibility, reputational, and financial risks to provide a solution that allows your business to develop effectively in a competitive market.
    👨‍⚕️ Our Expertise In 👨‍⚕️ Machine learning, AI, public safety, healthcare, e-sports, education, social platforms, and business process automation.
    🤝 Partner With Us 🤝 If you are looking for a reliable developer and partner, invite me, and I will try to help you!
    Apache Spark
    PHP
    Kotlin
    JavaScript
    CSS
    HTML
    Objective-C
    Swift
    Node.js
    Python
    Java
  • $25 hourly
    Seasoned Data Scientist driving transformation with analytics and machine learning, with 16 years of overall experience in building machine learning models, deep learning models, IBM Qiskit, BigQuery, GCP, data visualization, SQL Server DBA & development, performance tuning, and web and REST API applications. I have worked with investment banks and startups, and consulted for technology services companies. I specialize in:
    - Data analytics and machine learning with PyTorch and TensorFlow
    - BigQuery, Google Cloud Platform, Cloud SQL
    - Web scraping/crawling and data mining
    - Data visualization
    - MSSQL, MongoDB
    - IBM Qiskit
    With a Master's in Computer Applications with Mathematics and extensive experience in an Agile development environment, I have the necessary skill set and problem-solving abilities to get your job done and deliver on expectations.
    Apache Spark
    Project Management
    BigQuery
    Microsoft SQL Server Programming
    Google Cloud Platform
    Data Mining
    Transact-SQL
    Deep Neural Network
    Convolutional Neural Network
    SQL
    Machine Learning
    Data Science
    Python
    TensorFlow
    Deep Learning
  • $30 hourly
I am a full-stack analytics expert with more than 8 years of experience in Python, Machine Learning, Analytics, Data Modeling, eCommerce, Visualization, Dashboard Development, API Integration, and MERN stack development. I design reports that are intuitive, attractive, and insightful, so you can quickly make decisions that improve your business. My areas of expertise are: ✅ Python Development ✅ Data Visualization ✅ Dashboard Development ✅ Data extraction from PDF ✅ Text Analysis and NLP ✅ Data compiling from Business Directories ✅ Web scraping, crawling, parsing, data extraction (Scrapy, Selenium, BeautifulSoup) ✅ Gathering data from a website and entering it into a spreadsheet ✅ JavaScript development using React, Node, and D3 ✅ Backend API development ✅ Google Data Studio ✅ Kibana/Grafana ✅ GraphDB/Neo4j ✅ SQL/MySQL ✅ Data Mining ✅ Data Collector ✅ KPI Please feel free to let me know if you'd like any other details, and I would be happy to help. My core technical skills are: A. Dashboard & Data Visualization: I. Analytics: R/SAS, KPI, Tableau, Kibana, Elasticsearch, Grafana, Power BI, Google Data Studio II. Cloud: AWS, Azure, Google Cloud III. Database: SQL, MySQL, MongoDB, PostgreSQL, Neo4j, GraphDB B. Backend Development: Python, Node.js, PHP, Ruby on Rails, Laravel, Docker, Django, Apache, etc. C. Frontend Development: JavaScript, TypeScript, React.js, D3.js, Chart.js, Angular, SASS, Bootstrap, Django, jQuery, Vue.js, Git, etc.
Apache Spark
    Data Analysis Consultation
    API Integration
    ChatGPT
    Neo4j
    Data Visualization
    Elasticsearch
    Dashboard
    Microsoft Power BI Development
    Kibana
    JavaScript
    Node.js
    React
    Artificial Intelligence
    Machine Learning Framework
    Python
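The web scraping and data extraction services listed above typically boil down to parsing fetched HTML into structured records. As a minimal, hypothetical illustration (the sample markup and class name are invented for the example), here is a link extractor using only the Python standard library:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect (text, href) pairs for every anchor tag in a page."""
    def __init__(self):
        super().__init__()
        self.links = []      # list of (text, href) tuples
        self._href = None    # href of the anchor currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._href is not None and data.strip():
            self.links.append((data.strip(), self._href))

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None

# Sample markup stands in for a fetched page; real use would download HTML first
# (e.g. with urllib, or Scrapy/Selenium for dynamic sites, as mentioned above).
sample = '<ul><li><a href="/a">First</a></li><li><a href="/b">Second</a></li></ul>'
parser = LinkExtractor()
parser.feed(sample)
print(parser.links)  # [('First', '/a'), ('Second', '/b')]
```

The extracted tuples can then be written to a spreadsheet or database, which is the "gathering data from a website and entering it into a spreadsheet" step in practice.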
  • $39 hourly
You can set up a free consultation using: calendly.com/gaurav-soni226/gaurav-consultation-1-1 Hello, I am a Data Architect and Big Data Engineer with extensive experience building large-scale analytics solutions, from solution architecture design through implementation and subsequent maintenance, including building and managing cloud infrastructure. EXPERIENCE 9+ years working in data warehousing, ETL, Cloud Computing (Google Cloud Platform & AWS), and real-time streaming. MY TOP SKILLS - Python, Java, Scala, SQL, TSQL, HQL - Apache Spark, Flink, Kafka, NiFi, Hive, Presto, Apache Beam (Dataflow) - Azure: Azure Databricks, Azure Data Factory, Azure Synapse, Azure Data Warehouse - GCP: Google Dataproc, BigQuery, Bigtable, Cloud Storage, Cloud Pub/Sub - AWS: EMR, Redshift, DynamoDB, AWS Glue, AWS Athena, Kinesis Streams, S3 - File Formats: Parquet, Avro, CSV, JSON - Other: Data Migration, Snowflake, Pandas, PyArrow, Delta Lake - Cloud Infra: Kubernetes, GKE, Azure Kubernetes Service, EC2, Lambda functions House of Apache Spark: - Spark job tuning: executors, cores, memory, shuffle partitions, data skewness - Spark SQL: Catalyst optimizer and Tungsten optimizer - Spark MLlib: machine learning with PySpark - Streaming: Spark Structured Streaming (DataFrames), Spark Streaming (RDDs) Data Stores: - SQL: PostgreSQL, MySQL, Oracle, Azure SQL, DynamoDB - NoSQL: Cassandra, Elasticsearch ILM, OpenSearch ISM, MongoDB, HBase - File systems: HDFS, Object Storage, Block Storage (Azure Blob, AWS S3) Data Orchestrators: - Apache Airflow, Apache Oozie, Azkaban Authentication: - Azure Active Directory - LDAP - Kerberos - SAML Next Steps 👣 Requirements Discussion + Prototyping + Visual Design + Backend Development + Support = Success!
Apache Spark
    Data Migration
    Amazon Redshift
    Apache Hadoop
    PySpark
    Microsoft Azure SQL Database
    AWS Glue
Azure Synapse
    Apache Airflow
    Data Modeling
    NoSQL Database
    Databricks Platform
    Scala
    Apache Hive
Azure IoT Hub
    Elasticsearch
    SQL
    Python
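The Spark tuning levers named above (executors, cores, memory, shuffle partitions) are typically set when submitting a job. A sketch of a `spark-submit` invocation with illustrative values (the numbers and file name are placeholders, not recommendations; tune to your cluster and workload):

```shell
# Illustrative resource settings for a Spark job; values are placeholders.
# spark.sql.adaptive.enabled turns on Adaptive Query Execution,
# which can help mitigate data skew at shuffle time.
spark-submit \
  --num-executors 10 \
  --executor-cores 4 \
  --executor-memory 8g \
  --conf spark.sql.shuffle.partitions=200 \
  --conf spark.sql.adaptive.enabled=true \
  my_job.py
```

Shuffle partition count and executor sizing interact: too few partitions concentrate skewed keys on a handful of tasks, while too many add scheduling overhead.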
  • $60 hourly
Professional Data Engineer with 7 years of experience in ETL, building data lakes and pipelines, and data analytics. Worked with Java, Scala, Python, and R.
Apache Spark
    ClickHouse
    Apache Kafka
    Java
    ETL Pipeline
    Amazon Web Services
    Apache Spark MLlib
    Logstash
    Elasticsearch
    AWS Lambda
    Apache Hadoop
    R
    Python
    Scala
  • $22 hourly
If you are expecting things like "High Quality", "Clean Code", and "On-Time Delivery", you are at the right place. I never compromise on the quality of delivery, and client satisfaction is my main motto. My Expertise Areas: Prototyping: Software prototyping is essential to get an idea of the user experience in your application. For prototyping I use tools like InVision, Axure RP, Balsamiq, and Adobe XD. UX/UI (Graphic & Web Design): A beautifully designed user interface can make or break the opinion of your brand, which is why I devote time to understanding your target audience, carry out thorough research on their needs, wants, and expectations, and accordingly zero in on the most probable USPs for your project. I use tools like Adobe Photoshop, CorelDRAW, Adobe Illustrator, and Sketch. Front-End Development: Building state-of-the-art, easy-to-use, user-friendly websites and applications is truly a passion of mine, and I am confident I would be an excellent addition to your organisation. In front-end development I have expertise with Angular.js, React.js, Cordova, TypeScript, HTML5, CSS3, Bootstrap, JavaScript, and Angular 4 through 7. Back-End Development: I create back-end code that adds utility to everything the front-end designer builds. I'm passionate about the real-world impact my skills can have, and firmly believe I can create innovative solutions to business processes and problems that ultimately lead to a better user experience. I have expertise with most of the leading platforms, including Python, Django, PHP, Laravel, CodeIgniter, CakePHP, Yii, and Node.js. Database Development: With more than 7 years of experience as a database developer, I am adept at query writing, information security, and quality assurance.
I excel at MySQL administration, MongoDB, MySQL, and PostgreSQL. Shopify skills: e-commerce development – theme setup and customisation – store maintenance – custom e-commerce design – custom checkouts & upgrades (Shopify Plus) – Shopify Scripts – integrating Amazon, Facebook, eBay, and other web services – marketplaces. WordPress/WooCommerce skills: responsive UX/UI development, mobile-responsive, cross-browser front ends – theme/plugin development, CMS customisation & development, integration of APIs such as Google Maps, Google advertising, booking engines, various payment methods, etc. – WordPress Performance Optimization (WPO). Mobile App Design & Development (iOS & Android): My professional experience includes designing and crafting code for various mobile applications, then testing the resulting code to meet client needs. As a mobile app developer I have expertise with Ionic, React Native, hybrid apps, native Android, native iOS, PhoneGap, C#, Java, C++, Objective-C, and Swift. Full-Stack Development: MEAN and MERN are the leading stacks I have learned to become a full-stack developer. Scripting & Automation: Automation is necessary in almost all applications these days, because human errors outnumber machine errors. I have worked with face recognition, voice recognition, chatbots, InstaBot, API integrations, web crawlers, RESTful APIs, Socket.io, and push notifications. DevOps: I am familiar with GitHub, Bitbucket, Jira, Basecamp, Asana, Trello, and Mantis. You are here, so why wait? If you have an idea, get in touch with me and I'll help you turn it into reality. Cheers...
Apache Spark
    API Development
    AngularJS
    Agile Software Development
    Laravel
    CodeIgniter
    Node.js
    Mobile UI Design
    User Experience Design
    Adobe Photoshop
    Adobe Illustrator
  • $105 hourly
I am a seasoned Lead IT and Data Architect with over 11 years of extensive experience in IT architecture and data integration. I excel in designing and delivering robust, scalable integration solutions compatible with both cloud and on-premise applications, and I have a strong track record of building and delivering automated integration solutions across diverse industries. As a Data Architect, I specialize in data integration and migration, leveraging the Dell Boomi iPaaS platform. I have served as a Solutions/Cloud Architect for renowned organizations such as Ericsson, Yellow Pages, GTES, CLEAResult, and DigitalRadius, among others. My project portfolio ranges from designing and building automated pipelines for Big Data initiatives, Salesforce integrations with different data sources, and BI systems using iPaaS (Dell Boomi), to executing a comprehensive Okta SSO solution for over 150 apps (including SAML SP/IdP-initiated login, SWA, CORS, etc.). I have also taken part in developing international projects like GSN, deploying middleware in a distributed cluster on the cloud, and much more. My primary objective is to enable businesses of all sizes to leverage emerging technologies effectively, providing them with cost-efficient solutions to unlock their full potential. Here's a brief overview of areas/platforms where I can offer my expertise: * Data Integration: Dell Boomi * Leading applications: Salesforce.com, SAP, Coupa, NetSuite, Microsoft Dynamics, MS SQL Server, MySQL, Workday, (s)FTP, REST/SOAP APIs * File Storage/Sharing: Box, Google Cloud Storage * Big Data: BigQuery, Hadoop * BI & Data Visualization: Tableau * Cloud Computing: AWS, Rackspace, Private Cloud (virtualized Linux/Windows), MS Azure * SSO: Okta, feide.no, Active Directory * E-commerce: Shopify, Amazon.com, eBay
Apache Spark
    Data Management
    API
    RESTful API
    OKTA
    Customer Relationship Management
    HTTP
    Microsoft Azure SQL Database
    Salesforce CRM
    Dell Boomi
    Oracle NetSuite
    Salesforce
    Database
    Data Warehousing
    Amazon Web Services
    API Integration
  • $47 hourly
I am a Software/Big Data Engineer with 9+ years of experience in building highly scalable and robust applications. I am your one-stop solution for end-to-end software and Big Data solutions. ☆ Software Areas: - Developing scalable, high-quality applications (with best practices) - Test-Driven Development (great test coverage) - Deployment using best practices (complete DevOps) - Continuous Integration and Delivery ☆ Big Data Ecosystem: - Data Engineering - Data Streaming & Processing - Data Warehousing - ETL Development - Data Science & Machine Learning - Analytics & Dashboarding ☆ Core Languages Python, Ruby, Scala, NodeJS, Golang, Java, JavaScript, PHP, R, Shell, SQL ☆ Frameworks/Libraries Ruby on Rails, ExpressJS, Flask, Sinatra, CodeIgniter, Bootstrap, Django. ☆ Big Data and NoSQL Expertise - Databases: Hadoop, Cassandra, DataStax Enterprise, MongoDB, Redshift, RDS, DynamoDB, Hive, Redis, Memcached, HBase. - Frameworks: Apache Spark, Amazon EMR, Airflow, Pentaho, PrestoDB - Streams & Brokers: Apache Kafka, RabbitMQ, Amazon SQS, ZeroMQ, Kinesis - Others: ECS, Beanstalk, Nutch, Solr, Mahout. ☆ Cloud Platforms Amazon Web Services (AWS), Google Cloud, Docker, DigitalOcean, Heroku, OpenStack. ☆ Mobile & Apps Android Native, Ubuntu Apps, Windows Apps, Chrome Extensions. I am also an experienced Consultant, Tech Lead & Manager. I promise quick turnaround times and great customer service!
Apache Spark
    Data Management
    Web Development
    Big Data
    Data Science
    Machine Learning
    Amazon Web Services
    Ruby on Rails
    Apache Cassandra
    Python
  • $40 hourly
Successful delivery of 10+ complex client-facing projects, with exposure to the Telecom, Retail, Automobile, and Banking industries and a focus on data, analytics, and the analytical and consulting skills needed to deliver in any challenging environment. Strong track record in Data Engineering, with hands-on experience successfully delivering challenging implementations. I offer data services and implementation to set up data warehouses and data solutions for analytics and development in retail, telecom, fintech, automobile, etc. I am a software and data developer. I earned a Bachelor's degree in computer science and have 10+ years of experience in Data Engineering and cloud infrastructure. Tech Stack: * Snowflake (certified) * Teradata (certified) * Informatica (certified) * WhereScape RED * Airflow * AWS Athena and EC2 * Python, Pandas & NumPy * Data Warehousing (certified) * Data Scraping, Data Mining * Data Modeling * Netezza, DB2 * Oracle PL/SQL * C# .NET * Automation * SQL & NoSQL databases
Apache Spark
    PDF Conversion
    Web Crawling
    Data Integration
    Data Vault
    Python
    Informatica
    API
    Snowflake
    Data Warehousing
    Database Management
    ETL Pipeline
    Apache Airflow
    MySQL
  • $30 hourly
Seasoned data engineer with over 11 years of experience in building sophisticated and reliable ETL applications using Big Data and cloud stacks (Azure and AWS). TOP RATED PLUS. Collaborated with over 20 clients, accumulating more than 2,000 hours on Upwork. 🏆 Expert in creating robust, scalable, and cost-effective solutions using Big Data technologies for the past 9 years. 🏆 My main areas of expertise are: 📍 Big Data - Apache Spark, Spark Streaming, Hadoop, Kafka, Kafka Streams, HDFS, Hive, Solr, Airflow, Sqoop, NiFi, Flink 📍 AWS Cloud Services - AWS S3, AWS EC2, AWS Glue, AWS Redshift, AWS SQS, AWS RDS, AWS EMR 📍 Azure Cloud Services - Azure Data Factory, Azure Databricks, Azure HDInsight, Azure SQL 📍 Google Cloud Services - GCP Dataproc 📍 Search Engine - Apache Solr 📍 NoSQL - HBase, Cassandra, MongoDB 📍 Platform - Data Warehousing, Data Lake 📍 Visualization - Power BI 📍 Distributions - Cloudera 📍 DevOps - Jenkins 📍 Accelerators - Data Quality, Data Curation, Data Catalog
Apache Spark
    SQL
    AWS Glue
    PySpark
    Apache Cassandra
    ETL Pipeline
    Apache Hive
    Apache NiFi
    Apache Kafka
    Big Data
    Apache Hadoop
    Scala
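At its core, the ETL work described in these profiles follows the same extract-transform-load pattern regardless of stack. As a toy sketch (the field names and data are invented for illustration), using only the Python standard library:

```python
import csv
import io

# Extract: a CSV string stands in for a real source (file, API response, S3 object).
raw = "city,amount\nParis,10\nParis,5\nOslo,7\n"

# Transform: parse rows and aggregate amount per city.
totals = {}
for row in csv.DictReader(io.StringIO(raw)):
    totals[row["city"]] = totals.get(row["city"], 0) + int(row["amount"])

# Load: here we just print; a real pipeline would write to a warehouse table.
print(totals)  # {'Paris': 15, 'Oslo': 7}
```

Frameworks like Spark, Glue, or Data Factory scale this same pattern out: the extract step reads partitioned sources, the aggregation becomes a distributed group-by, and the load step writes to a lake or warehouse.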
  • Want to browse more freelancers?
    Sign up

How it works

1. Post a job (it’s free)

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.


How do I hire an Apache Spark Engineer on Upwork?

You can hire an Apache Spark Engineer on Upwork in four simple steps:

  • Create a job post tailored to your Apache Spark Engineer project scope. We’ll walk you through the process step by step.
  • Browse top Apache Spark Engineer talent on Upwork and invite them to your project.
  • Once the proposals start flowing in, create a shortlist of top Apache Spark Engineer profiles and interview.
  • Hire the right Apache Spark Engineer for your project from Upwork, the world’s largest work marketplace.

At Upwork, we believe talent staffing should be easy.

How much does it cost to hire an Apache Spark Engineer?

Rates charged by Apache Spark Engineers on Upwork can vary with a number of factors including experience, location, and market conditions. See hourly rates for in-demand skills on Upwork.

Why hire an Apache Spark Engineer on Upwork?

As the world’s work marketplace, we connect highly skilled freelance Apache Spark Engineers with businesses and help them build trusted, long-term relationships so they can achieve more together. Let us help you build the Apache Spark Engineer dream team you need to succeed.

Can I hire an Apache Spark Engineer within 24 hours on Upwork?

Depending on availability and the quality of your job post, it’s entirely possible to sign up for Upwork and receive Apache Spark Engineer proposals within 24 hours of posting a job description.

Schedule a call