Hire the best Apache Spark Engineers

Check out Apache Spark Engineers with the skills you need for your next job.
Clients rate Apache Spark Engineers
Rating is 4.8 out of 5.
based on 775 client reviews
  • $140 hourly
    AWS RDS | MySQL | MariaDB | Percona | Semarchy xDM | AWS Glue | PySpark | dbt | SQL Development | Disaster Recovery | Business Continuity | ETL Development | Data Governance / Master Data Management | Data Quality Assessments | Appsheet | Looker Studio | Percona PMM *** Please see my portfolio below.*** I have over two decades of experience immersed in a variety of data systems oriented roles on both cloud-based and on-premise platforms. Throughout my career, I have served in senior-level roles as Data Architect, Data Engineer, Database Administrator, and Director of IT. My technology and platform specialties are diverse, including but not limited to AWS RDS, MySQL, MariaDB, Redshift, Percona XtraDB Cluster, PostgreSQL, Semarchy xDM, Apache Spark/PySpark, AWS Glue, Airflow, dbt, Amazon AWS, Hadoop/HDFS, Linux (Ubuntu, Red Hat). My Services Include: Business Continuity, High Availability, Disaster Recovery: Ensuring minimal downtime of mission-critical databases by utilizing database replication, clustering, and backup testing and validation. Performance Tuning: I can analyze the database configuration, errors and events, physical resources, physical table design, and SQL queries to address performance issues. Infrastructure Engineering: In the AWS environment I use a combination of Ansible, Python with the boto3 SDK, as well as the command line interface (CLI) to create and manage a variety of AWS services including EC2, RDS, S3, and more. System Monitoring: Maintaining historical performance metrics can be useful for proactive capacity planning, immediate outage detection, alerting, and analysis for optimization. I can use tools including Percona Monitoring & Management (PMM), and AWS tools such as Performance Insights and CloudWatch. ETL Development: I develop data processing pipelines using Python, Apache Spark/PySpark, and dbt. For process orchestration, I utilize AWS Glue or Airflow. 
I am experienced in integrating a variety of sources including AWS S3, REST APIs, and all major relational databases. Data Governance / Master Data Management: I am experienced in all phases of development and administration on the Semarchy xDM Master Data Management Platform. - Building the infrastructure and installing the software in AWS. - Entity design. - Developing the UI components for use by the data stewards to view and manage master data. - Creating the internal procedures for data enrichment, validation, and duplicate consolidation. - Data ingestion (ETL) - Dashboard creation.
    Apache Spark
    Database Management
    Looker Studio
    Data Lake
    Apache Airflow
    AWS Glue
    PySpark
    Amazon RDS
    dbt
    System Monitoring
    Master Data Management
    High Availability and Disaster Recovery
    MySQL
    MariaDB
    Database Administration
    SQL Programming
  • $30 hourly
    Big Data/Data engineer with almost 4 years of experience using AWS or GCP cloud ✅ Building Data Lakes and Data Warehouses using AWS/GCP cloud infrastructure ✅ 💯 Certified AWS Developer 💯 ✅ Building complex analytical queries using cloud engines ✅ Data ingestion from various sources: RDBMS, API, SFTP, S3, GCS Open-minded software engineer, eager to work with complex distributed systems and components. Capable of building back-end solutions. Strong design, integration, and problem-solving skills. Skilled in Python, AWS, GCP, Apache Airflow, Apache Spark, and Apache Kafka, with database analysis and design. Good experience with the Odoo ERP framework. Capable of creative thinking, highly disciplined, punctual, demanding, good team player. Strong written and verbal communication. 𝛑 Finished my Master's degree in Applied Math in 2020 at Lviv Polytechnic National University. ⚡⚡⚡ Worked for big clients such as Dyson, Syngenta, and Deloitte, helping them with the digital transformation of their business, in most cases by constructing centralized data storage for them, as well as providing support and improvements to existing solutions. Here is what one of my clients said about me (you can check it on my LinkedIn profile): 💥"Yurii is a highly skilled and capable software engineer working in the Big Data space. He worked on a large data lake platform in the AWS Cloud environment, for which I was the project manager. He was consistently responsive and skillful and came through with timely deliveries when needed. Yurii is a pleasure to work with, due to his technical mastery coupled with strong interpersonal skills. I highly recommend working with him, and will seek him out for future engagements. "💥
    Apache Spark
    Apache Airflow
    SQL Programming
    dbt
    Amazon Athena
    Data Warehousing
    Google Dataflow
    AWS Glue
    AWS Lambda
    Snowflake
    Python
    BigQuery
  • $38 hourly
    💡 If you want to turn data into actionable insights, plan to use the 5 V's of big data, or want to turn your idea into a complete web product... I can help. 👋 Hi. My name is Prashant and I'm a Computer Engineer. 💡 My true passion is creating robust, scalable, and cost-effective solutions using mainly Java and open source technologies. 💡 During the last 11 years, I have worked with: 💽 Big Data: Apache Spark, Hadoop, HBase, Hive, Impala, Flume, Sqoop 🔍 Searching: ElasticSearch, Logstash, Kibana, Lucene, Apache Solr, Filebeat, Winlogbeat ☁️ Cloud services: AWS EMR, AWS S3, AWS EC2, AWS RDS, AWS ElasticSearch, AWS Lambda, AWS Redshift 5-step Approach 👣 Requirements Discussion + Prototyping + Visual Design + Backend Development + Support = Success! Usually, we customize that process depending on the project's needs and final goals. How to start? 🏁 Every product requires a clear roadmap and meaningful discussion to keep everything in check. But first, we need to understand your needs. Let’s talk! 💯 Working with me, you will receive a modern, good-looking application that meets all guidelines and is easy to navigate, and of course, you will have unlimited revisions until you are 100% satisfied with the result. Keywords that you can use to find me: Java Developer, ElasticSearch Developer, Big Data Developer, Team lead for Big Data application, Corporate, IT, Tech, Technology.
    Apache Spark
    Big Data
    ETL
    Data Visualization
    Amazon Web Services
    SQL
    Amazon EC2
    ETL Pipeline
    Data Integration
    Data Migration
    Logstash
    Apache Kafka
    Elasticsearch
    Apache Hadoop
    Core Java
  • $500 hourly
    I excel at analyzing and manipulating data, from megabytes to petabytes, to help you complete your task or gain a competitive edge. My first and only language is English. My favorite tools: Tableau, Alteryx, Spark (EMR & Databricks), Presto, Nginx/Openresty, Snowflake and any Amazon Web Services tool/service (S3, Athena, Glue, RDS/Aurora, Redshift Spectrum). I have these third-party certifications: - Alteryx Advanced Certified - Amazon Web Services (AWS) Certified Solutions Architect - Professional - Amazon Web Services (AWS) Certified Big Data - Specialty - Amazon Web Services (AWS) Certified Advanced Networking - Specialty - Amazon Web Services (AWS) Certified Machine Learning - Specialty - Databricks Certified Developer: Apache Spark™ 2.X - Tableau Desktop Qualified Associate I'm looking for one-time and ongoing projects. I especially enjoy working with large datasets in the finance, healthcare, ad tech, and business operations industries. I possess a combination of analytic, machine learning, data mining, and statistical skills, and experience with algorithms and software development/authoring code. Perhaps the most important skill I possess is the ability to explain the significance of data in a way that others can easily understand. Types of work I do: - Consulting: How to solve a problem without actually solving it. - Doing: Solving your problem based on your existing understanding of how to solve it. - Concept: Exploring how to get the result you are interested in. - Research: Finding out what is possible, given a limited scope (time, money) and your resources. - Validation: Guiding how your existing or new team is going to solve your problem. My development environment: I generally use a dual-computer, quad-monitor setup to access my various virtualized environments over my office fiber connection. This allows me to use any OS needed (macOS/Windows/*nix) and also to rent any AWS hardware needed for faster project execution time and to simulate clients' production environments as needed. I also have all tools installed in the environments which make the most sense. I'm authorized to work in the USA. I can provide signed nondisclosure, noncompete and invention assignment agreements above and beyond the Upwork terms if needed. However, I prefer to use the pre-written Optional Service Contract Terms www [dot] upwork [dot] com/legal#optional-service-contract-terms.
    Apache Spark
    CI/CD
    Systems Engineering
    Google Cloud Platform
    DevOps
    BigQuery
    Amazon Web Services
    Web Service
    Amazon Redshift
    ETL
    Docker
    Predictive Analytics
    Data Science
    SQL
    Tableau
  • $60 hourly
    Reliable data engineer with 10 years of proven industry experience in data lake development, data analytics, real-time streaming, and back-end application development. My work is used by millions of people in the legal and entertainment industries. I have built exceptionally stable solutions for high-traffic, high-visibility projects, and understand what it takes to ensure products are robust and dependable. I also have expertise in the Apache Spark ecosystem, Elastic Search, ETL, AWS Glue, DMS, Athena, EMR, Data Lake, AWS Big Data, Apache Kafka, Java, and NoSQL. Specific Experience 1. Databricks : 5+ years of experience 2. Unity Catalog: 2+ years of experience 3. Apache Spark: 8+ years of experience 4. ETL: 8+ years of experience 5. SQL: 9+ years of experience 6. AWS: 8+ years of experience 7. Azure and GCP: 5+ years of experience I am a data professional, worked with many companies, and delivered some of the enormous data engineering and data science projects in the past. My focus is always on scalable, sustainable, and robust software building. As a data scientist, I will use data modeling, programming, analysis, visualization, and writing skills to help people have the insight to develop products, customers, and impact. As a data scientist, I care deeply about the data from beginning to end—I am actively involved in all aspects of data analysis, from data modeling tasks to writing reports and making visualizations. Python/Scala Programming, Linux Admin, Data Wrangling, Data Cleansing & Data Extraction services utilizing Python 3 or Python 2 Programming or Scala/Spark on Linux or Windows. I slice, dice, extract, transform, sort, calculate, cleanse, collect, organize, migrate, and otherwise handle data management for clients. 
Services Provided: - Big data processing using Spark Scala - Building large-scale ETL - Cloud Management - Distributed platform development - Machine learning - Python Programming - Algorithm Development - AWS Glue - PySpark - Data Conversion (Excel to CSV, PDF to Excel, CSV to Excel, Audio) - Data Mining - Data extraction - ETL Data Transformation - Data Cleansing - Linux Server Administration - Website & Data Migrations - DevOps (AWS, Azure)
    Apache Spark
    Amazon EC2
    Data Warehousing & ETL Software
    PySpark
    ETL Pipeline
    Redis
    AWS Glue
    Databricks MLflow
    Databricks Platform
    Python
  • $60 hourly
    Senior data engineer with product analytics and data science background, having worked with Fortune 500 companies (Procter & Gamble, Merck, Anheuser-Busch), as well as top-notch data-driven startups. Skilled in translating complex business problems into data solutions, designing data pipelines, providing high-quality data for data-driven insights and decision-making, as well as building KPIs, conducting statistical analyses, and creating impactful visualizations. Problems I'm good at solving: • Data Warehousing and Analytics • ETL / ELT data pipelines • SQL query tuning • Data Modeling and Database Design • Reporting • Data Analysis • Data Cleaning, Pre-Processing • Data Visualization • NLP problems I have a bachelor's in engineering from the top LATAM university (Universidade de São Paulo) with a track record of supporting organizations across various industries, including remote hiring, real estate, and consumer goods. Skills and Expertise ✅ SQL ✅ Python Databases ✅ Snowflake ✅ Redshift ✅ BigQuery ✅ Athena ✅ Trino ✅ Postgres ✅ MySQL Big Data Cloud Technologies ✅ Amazon Web Services – AWS Certified (Redshift, Athena, S3, Lambda, Glue ...) ✅ Google Cloud Platform Other Data Engineering Tools ✅ dbt ✅ Airflow ✅ Fivetran ✅ Git, Gitlab, and Github ✅ Rundeck ✅ Docker Data Visualization ✅ Looker (LookML Expert) ✅ PowerBI ✅ Metabase ✅ Looker Studio (Data Studio) Data Science and Machine Learning ✅ Sci-kit learn, pandas, etc ✅ NLP analysis ✅ Spark ✅ Databricks ✅ Hex ✅ Jupyter Notebooks User Behavioral Analytics ✅ Snowplow ✅ Indicative ✅ Heap ✅ Amplitude ✅ Google Analytics
    Apache Spark
    Snowflake
    dbt
    Amazon Redshift
    BigQuery
    ETL Pipeline
    Looker
    Data Analysis
    Data Modeling
    Data Visualization
    Business Intelligence
    Data Warehousing
    Machine Learning
    Python
    SQL
  • $50 hourly
    DataOps Leader with 20+ Years of Experience in Software Development and IT Expertise in a Wide Range of Cutting-Edge Technologies * Databases: NoSQL, SQL Server, SSIS, Cassandra, Spark, Hadoop, PostgreSQL, Postgis, MySQL, GIS Percona, Tokudb, HandlerSockets (nosql), CRATE, RedShift, Riak, Hive, Sqoop * Search Engines: Sphinx, Solr, Elastic Search, AWS cloud search * In-Memory Computing: Redis, memcached * Analytics: ETL, Analytic data from few millions to billions of rows and analytics on it, Sentiment analysis, Google BigQuery, Apache Zeppelin, Splunk, Trifacta Wrangler, Tableau * Languages & Scripting: Python, php, shell scripts, Scala, bootstrap, C, C++, Java, Nodejs, DotNet * Servers: Apache, Nginx, CentOS, Ubuntu, Windows, distributed data, EC2, RDS, and Linux systems Proven Track Record of Success in Leading IT Initiatives and Delivering Solutions * Full lifecycle project management experience * Hands-on experience in leading all stages of system development * Ability to coordinate and direct all phases of project-based efforts * Proven ability to manage, motivate, and lead project teams Ready to Take on the Challenge of DataOps I am a highly motivated and results-oriented IT Specialist with a proven track record of success in leading IT initiatives and delivering solutions. I am confident that my skills and experience would be a valuable asset to any team looking to implement DataOps practices. I am excited about the opportunity to use my skills and experience to help organizations of all sizes achieve their data goals.
    Apache Spark
    Python
    Scala
    ETL Pipeline
    Data Modeling
    NoSQL Database
    BigQuery
    Sphinx
    Linux System Administration
    Amazon Redshift
    PostgreSQL
    ETL
    MySQL
    Database Optimization
    Apache Cassandra
  • $100 hourly
    I have over 4 years of experience in Data Engineering (especially using Spark and PySpark to gain value from massive amounts of data). I have worked with analysts and data scientists, conducting workshops on working in Hadoop/Spark and resolving their issues with the big data ecosystem. I also have experience with Hadoop maintenance and building ETL, especially between Hadoop and Kafka. You can find my profile on Stack Overflow (link in Portfolio section), where I mostly help with questions tagged spark and pyspark.
    Apache Spark
    MongoDB
    Data Warehousing
    Data Scraping
    ETL
    Data Visualization
    PySpark
    Python
    Data Migration
    Apache Airflow
    Apache Kafka
    Apache Hadoop
  • $95 hourly
    I'm a seasoned Data Engineer with 10+ years of experience specializing in end-to-end data platform development and AI integration solutions. My expertise spans: 🏗️ Building scalable data platforms using AWS, GCP, Snowflake, and Databricks ⚡ Optimizing ETL/ELT pipelines that process billions of records 🧠 Integrating AI/ML solutions with enterprise data systems 💡 Advising startups on cost-effective, scalable data strategies Notable Achievements: 🌟 🎓 Received IT4BI Erasmus Mundus Category 'A' Scholarship (top 6 out of 468 applicants) ⚡ Improved ETL performance by 60-70% using 3-phase optimization strategy 📈 Migrated ~8 billion Tracking Click Events via Spark Jobs & AWS EMR 🤖 Developed end-to-end Data Platform Medallion Architecture using Delta Lake 💪 Increased team productivity by 30% through Python automation I don't just write code – I architect complete solutions considering performance, cost, and reliability. My approach includes: 📋 Production-ready code with proper architecture 📚 Comprehensive documentation and knowledge transfer 🛠️ One week post-deployment support 💬 Regular communication and progress updates Whether you need data pipeline optimization, AI integration, or strategic guidance, I deliver solutions that scale with your business. Technical Stack 🛠️: Python, Spark, dbt, Airflow, AWS, GCP, Snowflake, Databricks, Docker, Kubernetes, ChatGPT/LLM Integration Let's discuss how I can help transform your data challenges into opportunities! 🚀
    Apache Spark
    Fivetran
    Data Engineering
    Databricks Platform
    Snowflake
    Data Lake
    Data Annotation
    Data Integration
    ETL Pipeline
    Data Analysis
    Machine Learning
    Big Data
    Python
    Apache Kafka
    Amazon Web Services
  • $35 hourly
    Seasoned data engineer with over 11 years of experience in building sophisticated and reliable ETL applications using Big Data and cloud stacks (Azure and AWS). TOP RATED PLUS . Collaborated with over 20 clients, accumulating more than 2000 hours on Upwork. 🏆 Expert in creating robust, scalable and cost-effective solutions using Big Data technologies for past 9 years. 🏆 The main areas of expertise are: 📍 Big data - Apache Spark, Spark Streaming, Hadoop, Kafka, Kafka Streams, Trino, HDFS, Hive, Solr, Airflow, Sqoop, NiFi, Flink 📍 AWS Cloud Services - AWS S3, AWS EC2, AWS Glue, AWS RedShift, AWS SQS, AWS RDS, AWS EMR 📍 Azure Cloud Services - Azure Data Factory, Azure Databricks, Azure HDInsights, Azure SQL 📍 Google Cloud Services - GCP DataProc 📍 Search Engine - Apache Solr 📍 NoSQL - HBase, Cassandra, MongoDB 📍 Platform - Data Warehousing, Data lake 📍 Visualization - Power BI 📍 Distributions - Cloudera 📍 DevOps - Jenkins 📍 Accelerators - Data Quality, Data Curation, Data Catalog
    Apache Spark
    SQL
    AWS Glue
    PySpark
    Apache Cassandra
    ETL Pipeline
    Apache Hive
    Apache NiFi
    Apache Kafka
    Big Data
    Apache Hadoop
    Scala
  • $55 hourly
    I focus on data engineering, software engineering, ETL/ELT, SQL reporting, high-volume data flows, and development of robust APIs using Java and Scala. I prioritize three key elements: reliability, efficiency, and simplicity. I hold a Bachelor's degree in Information Systems from Pontifícia Universidade Católica do Rio Grande do Sul as well as graduate degrees in Software Engineering from Infnet/FGV and Data Science (Big Data) from IGTI. In addition to my academic qualifications I have acquired a set of certifications: - Databricks Certified Data Engineer Professional - AWS Certified Solutions Architect – Associate - Databricks Certified Associate Developer for Apache Spark 3.0 - AWS Certified Cloud Practitioner - Databricks Certified Data Engineer Associate - Academy Accreditation - Databricks Lakehouse Fundamentals - Microsoft Certified: Azure Data Engineer Associate - Microsoft Certified: DP-200 Implementing an Azure Data Solution - Microsoft Certified: DP-201 Designing an Azure Data Solution - Microsoft Certified: Azure Data Fundamentals - Microsoft Certified: Azure Fundamentals - Cloudera CCA Spark and Hadoop Developer - Oracle Certified Professional, Java SE 6 Programmer My professional journey has been marked by a deep involvement in the world of Big Data solutions. I've fine-tuned my skills with Apache Spark, Apache Flink, Hadoop, and a range of associated technologies such as HBase, Cassandra, MongoDB, Ignite, MapReduce, Apache Pig, Apache Crunch and RHadoop. Initially, I worked extensively with on-premise environments but over the past five years my focus has shifted predominantly to cloud based platforms. I've dedicated over two years to mastering Azure and I’m currently immersed in AWS. I have a great experience with Linux environments as well as strong knowledge in programming languages like Scala (8+ years) and Java (15+ years). 
In my earlier career phases, I had experience working with Java web applications and Java EE applications, primarily leveraging the WebLogic application server and databases like SQL Server, MySQL, and Oracle.
    Apache Spark
    Scala
    Apache Solr
    Apache Kafka
    Bash Programming
    Elasticsearch
    Java
    Progress Chef
    Apache Flink
    Apache HBase
    Apache Hadoop
    MapReduce
    MongoDB
    Docker
  • $60 hourly
    ✅ AWS Certified Solutions Architect ✅ Google Cloud Certified Professional Data Engineer ✅ SnowPro Core Certified Individual ✅ Upwork Certified Top Rated Professional Plus ✅ Author of the Python package for the cryptocurrency market Currency.com (python-currencycom) Specializing in Business Intelligence Development, ETL Development, and API Development with Python, Apache Spark, SQL, Airflow, Snowflake, Amazon Redshift, GCP, and AWS. Completed many projects, complicated and not so complicated, such as: ✪ Highly scalable distributed applications for real-time analytics ✪ Designing a data warehouse and developing ETL pipelines for multiple mobile apps ✪ Cost optimization for existing cloud infrastructure But the main point: I take responsibility for the final result.
    Apache Spark
    Data Scraping
    Snowflake
    ETL
    BigQuery
    Amazon Redshift
    Big Data
    Data Engineering
    Cloud Architecture
    Google Cloud Platform
    ETL Pipeline
    Python
    Amazon Web Services
    Apache Airflow
    SQL
  • $20 hourly
    Proficient data engineer experienced in big data pipeline development and designing data solutions for retail, healthcare, etc. I've designed and implemented multiple cloud-based data pipelines for companies located in Europe and the USA. I'm experienced in designing enterprise-level data warehouses, have good analytical and communication skills, and am a hard-working team player. Experience: - More than 4 years of experience in data engineering. - Hands-on experience in developing data-driven solutions using cloud technologies. - Designed multiple data warehouses using Snowflake and Star schema. - Requirement gathering and understanding business needs, to propose solutions. Certified: - Databricks Data Engineer Certified. - Microsoft Azure Associate Data Engineer. Tools and tech: - PySpark - dbt - Airflow - Azure Cloud - Python - Data Factory - Snowflake - Databricks - C# - AWS - Docker - CI/CD - RESTful API Development
    Apache Spark
    AWS Lambda
    PySpark
    Microsoft Azure
    Databricks MLflow
    dbt
    Snowflake
    API Development
    Data Lake
    ETL
    Databricks Platform
    Python
    Apache Airflow
  • $40 hourly
    I am a passionate person, and I am most passionate about solving problems with data. As a Data Scientist with 4 years of industry experience, I am equipped with the machine learning knowledge to make the world a better place with Data Science. In my professional career, I have worked both as a freelance Data Scientist and a full-time employee. I have worked in the IoT industry for clients in Pakistan and the Middle East. I also have experience working in the transport industry, providing solutions using text analytics and NLP. My current industry is retail, and I am working for the Danish retail and beauty company MATAS as a Data Scientist. I am responsible for all stages of the Data Science process, from business understanding to model deployment. Skillsets: - Understanding of the business problem and where Data Science can create value. - Ability to research academia and industry for modern solutions. - Ability to explain Data Science to non-technical business stakeholders. - Key areas where I consider myself well versed are Recommendation Systems, Multi-Armed Bandits, Send Time Optimization, Demand Forecasting, Price Elasticity, Word2vec and sentence embeddings, and pretty much all the machine learning algorithms. - Well versed in Big Data frameworks such as Spark, with hands-on experience with PySpark DataFrames and the Databricks platform. - Building data integration pipelines and collaborating with Data Engineers to support the ETL. - Designing Power BI dashboards to present insights to stakeholders. - Developing the DevOps pipeline for model deployment using Docker and Kubernetes. - Maintaining motivation and enthusiasm within the team when the model accuracy falls.
    Apache Spark
    ETL Pipeline
    Data Integration
    PySpark
    Data Visualization
    Machine Learning
    Apache Spark MLlib
    Python
    R
    Natural Language Processing
    Deep Learning
    Recommendation System
    Databricks Platform
    Computer Vision
  • $150 hourly
    I am a Data Engineer/Cloud Engineer/Python Engineer/Full stack developer with over 10 years of experience. I am skilled in the following areas: 1. Data Engineering: I have experience building data pipelines with tools such as Python, SQL, AWS Glue, Apache Hadoop, Apache Spark, Apache Airflow, Google BigQuery, AWS Redshift, AWS Kinesis, AWS S3, Google Cloud Bigtable, Google Cloud Dataflow, Google Cloud Pub/Sub, Google Cloud Storage, Microsoft Azure Data Factory, Microsoft Azure HDInsight, Microsoft Azure Stream Analytics, Oracle Exadata, PostgreSQL, MySQL, MongoDB, CouchDB, Elasticsearch, Neo4j, Snowflake, Databricks, Talend, Tableau, Power BI, Looker, Plotly, Matplotlib. 2. Cloud Engineering: I have experience building automated CI/CD pipelines and deploying cloud infrastructure to cloud service providers such as AWS, GCP, Azure, and DigitalOcean with tools such as Terraform, CloudFormation, AWS SAM, Ansible, Chef, Puppet, Jenkins. I also have experience using containerisation tools such as Docker and container orchestration tools such as Kubernetes. 3. Python Engineering: I have experience building web application backends and backend APIs with frameworks such as Django, Flask, FastAPI. I also have experience building web scraping tools and scraping spiders with tools such as Scrapy, requests, BeautifulSoup, lxml. I have experience building machine learning models and deep learning models with tools such as PyTorch, TensorFlow, Keras, XGBoost, LightGBM, Scikit-learn. I also have experience working with machine/deep learning algorithms such as Random Forest, Naive Bayes, Support Vector Machines, K-Nearest Neighbors, Decision Trees, Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks, Long Short-Term Memory, Deep Belief Networks, Generative Adversarial Networks, Reinforcement Learning, AutoML, GPT-2, GPT-3. 4. Full Stack Engineering: I have experience building full stack web applications with Angular, React, VueJs for the front end and NodeJS, ExpressJS for the backend. I also have experience integrating full stack applications with various databases such as PostgreSQL, MSSQL, MySQL. 5. Database Engineering: I have experience working with various databases such as MySQL, MSSQL, PostgreSQL. I have experience designing, implementing, and maintaining databases for efficient storage and retrieval of data. I also have experience with performance tuning for large databases and large-scale data migration. I am also experienced in developing backup and recovery plans for databases. 6. Blockchain Engineering: I have experience building smart contracts on various blockchains such as Ethereum, Solana, Polygon etc. I have experience building various types of applications on the blockchain such as DEX arbitrage applications, digital identity management systems, cryptocurrencies, NFTs, voting apps, supply chain apps.
    Apache Spark
    Node.js
    Azure DevOps
    JavaScript
    AWS Application
    Django
    Automation
    AWS Glue
    DevOps
    AWS Lambda
    Cloud Engineering
    Docker
    AWS Fargate
    Selenium
    Beautiful Soup
    Amazon S3
  • $25 hourly
    Data Engineer specializing in data and cloud (AWS). Experienced in building scalable, robust data pipelines and integrations, cloud and data architecture, and solution design. Proficient with open-source technologies. Experienced in: - Python, PySpark, Flask, FastAPI, Django - SQL, PostgreSQL, MySQL, SQLite, MS SQL, MariaDB, Oracle, Snowflake Cloud Data Warehouse, Greenplum, MongoDB, Firebase - AWS (Amazon Web Services), Databricks, Docker, Kubernetes, IaC (Terraform, CloudFormation) - Mage-ai, Airbyte, Airflow, singer.io, Talend Open Studio for Data Integration, Informatica - TypeScript, React.js - Power BI, Metabase, Google Data Studio
    Apache Spark
    Data Analysis
    Database Management
    Amazon API Gateway
    Amazon Web Services
    SQL Programming
    ETL
    Automation
    Data Warehousing & ETL Software
    Amazon S3
    Selenium
    AWS Lambda
    Data Scraping
    Python
    SQL
    Data Integration
    AWS Glue
  • $100 hourly
    I am a Data Architect/Sr. Data Engineer with 11 years of experience with RDBMS/NoSQL databases and processing large amounts of data. Please note that the minimum engagement is part-time (20 hrs/week) and a month long; this ensures the quality of the delivered solution and mutual benefit. My past experience is in enterprise-level and high-profile projects, but now I'm helping startups and small to mid-sized companies achieve their goals! My core competencies are: Data Modelling, Data Architecture on Cloud platforms, Database development, ETL and Business Intelligence, Database Administration Solution Architecture : Design of solution architectures for data processing systems of various scale and purpose. Definition of up-to-date technical solutions including data storage, network, data processing (ELT/ETL), BI and AI/ML components. Process and method definitions and optimisations. AI/ML : Amazon AI Services, Bedrock, Sagemaker, Pinecone, pgvector, Databricks Mosaic Data Modelling : Modelling of OLTP and data warehouse systems. This could be design of a new schema, normalization/denormalization of an existing model, enterprise data warehouse design based on Kimball/Inmon, Data Lake and Data Vault architectures, or modernization of an existing data landscape. Data Lakes : Modern data lakes built on S3, GCS with Databricks, AWS Glue, Trino DBA Activities : DB migrations, Backup & Recovery, Upgrades, Instance configurations, DB Monitoring, Horizontal scaling, Streaming/BDR replications. Sharding with PostgreSQL extensions. 
Data Integration and ETL : Traditional batch ETL - Informatica, Talend, AWS Datapipeline, Matillion ETL Serverless ETL - AWS Lambda, AWS Glue, Batch, AWS DMS, Google Cloud Functions, Databricks Streaming ETL - Apache NiFi, Kafka, Kinesis streams SaaS ETL - Stitch, Alooma, Fivetran, Airbyte BI/data layers - dbt/Prefect Direct loading with DBMS tools & scripting Data Governance and MDM: Design and implementation of solutions based on Informatica DG/MDM, Alation, Atlan, custom data quality solutions. BI Systems : Design and implementation of BI systems. I have experience with major industry-leading tools such as Tableau, PBI, Looker and cloud alternatives. Additionally, I have experience with old-style reporting solutions from SAP, Qlik, Jasper. Cloud containerization and deployment : Docker, Mesos/Kubernetes Java development : EE/SE, Spring, Hibernate, RESTful APIs, Maven Clouds : - Cloud migrations (AWS, Azure, GCP) - Cloud infrastructures (VPCs, EC2, Load balancing, Autoscaling, Security in AWS/GCP) Thank you for getting to the end of these boring details; I look forward to working on exciting projects together :) Best Regards, Yegor.
    Apache Spark
    Oracle Database Administration
    Amazon EC2
    Amazon RDS
    Amazon Web Services
    Amazon Redshift
    Tableau
    Oracle Performance Tuning
    PostgreSQL Programming
    Oracle PLSQL
    ETL
  • $35 hourly
    Over 5 years of working experience in data engineering, ETL, AWS, ML, and Python. AWS data analytics and machine learning certified.
    Apache Spark
    OpenAI Embeddings
    Docker
    Terraform
    Amazon ECS
    AWS Lambda
    Amazon Redshift
    Amazon S3
    Amazon Web Services
    Analytics
    PostgreSQL
    PySpark
    SQL
    pandas
    AWS Glue
    Python
  • $70 hourly
    🎓 𝗖𝗲𝗿𝘁𝗶𝗳𝗶𝗲𝗱 𝗗𝗮𝘁𝗮 𝗣𝗿𝗼𝗳𝗲𝘀𝘀𝗶𝗼𝗻𝗮𝗹 with 𝟲+ 𝘆𝗲𝗮𝗿𝘀 of experience and hands-on expertise in Designing and Implementing Data Solutions. 🔥 4+ Startup Tech Partnerships ⭐️ 100% Job Success Score 🏆 In the top 3% of all Upwork freelancers with Top Rated Plus 🏆 ✅ Excellent communication skills and fluent English If you’re reading my profile, you’ve got a challenge you need to solve and you are looking for someone with a broad skill set, minimal oversight and ownership mentality, then I’m your go-to expert. 📞 Connect with me today and let's discuss how we can turn your ideas into reality with creative and strategic partnership.📞 ⚡️Invite me to your job on Upwork to schedule a complimentary consultation call to discuss in detail the value and strength I can bring to your business, and how we can create a tailored solution for your exact needs. 𝙄 𝙝𝙖𝙫𝙚 𝙚𝙭𝙥𝙚𝙧𝙞𝙚𝙣𝙘𝙚 𝙞𝙣 𝙩𝙝𝙚 𝙛𝙤𝙡𝙡𝙤𝙬𝙞𝙣𝙜 𝙖𝙧𝙚𝙖𝙨, 𝙩𝙤𝙤𝙡𝙨 𝙖𝙣𝙙 𝙩𝙚𝙘𝙝𝙣𝙤𝙡𝙤𝙜𝙞𝙚𝙨: ► BIG DATA & DATA ENGINEERING Apache Spark, Hadoop, MapReduce, YARN, Pig, Hive, Kudu, HBase, Impala, Delta Lake, Oozie, NiFi, Kafka, Airflow, Kylin, Druid, Flink, Presto, Drill, Phoenix, Ambari, Ranger, Cloudera Manager, Zookeeper, Spark-Streaming, Streamsets, Snowflake ► CLOUD AWS -- EC2, S3, RDS, EMR, Redshift, Lambda, VPC, DynamoDB, Athena, Kinesis, Glue GCP -- BigQuery, Dataflow, Pub/Sub, Dataproc, Cloud Data Fusion Azure -- Data Factory, Synapse. HDInsight ► ANALYTICS, BI & DATA VISUALIZATION Tableau, Power BI, SSAS, SSMS, Superset, Grafana, Looker ► DATABASE SQL, NoSQL, Oracle, SQL Server, MySQL, PostgreSQL, MongoDB, PL/SQL, HBase, Cassandra ► OTHER SKILLS & TOOLS Docker, Kubernetes, Ansible, Pentaho, Python, Scala, Java, C, C++, C# 𝙒𝙝𝙚𝙣 𝙮𝙤𝙪 𝙝𝙞𝙧𝙚 𝙢𝙚, 𝙮𝙤𝙪 𝙘𝙖𝙣 𝙚𝙭𝙥𝙚𝙘𝙩: 🔸 Outstanding results and service 🔸 High-quality output on time, every time 🔸 Strong communication 🔸 Regular & ongoing updates Your complete satisfaction is what I aim for, so the job is not complete until you are satisfied! 
Whether you are a 𝗦𝘁𝗮𝗿𝘁𝘂𝗽, 𝗘𝘀𝘁𝗮𝗯𝗹𝗶𝘀𝗵𝗲𝗱 𝗕𝘂𝘀𝗶𝗻𝗲𝘀𝘀 𝗼𝗿 𝗹𝗼𝗼𝗸𝗶𝗻𝗴 𝗳𝗼𝗿 your next 𝗠𝗩𝗣, you will get 𝗛𝗶𝗴𝗵-𝗤𝘂𝗮𝗹𝗶𝘁𝘆 𝗦𝗲𝗿𝘃𝗶𝗰𝗲𝘀 at an 𝗔𝗳𝗳𝗼𝗿𝗱𝗮𝗯𝗹𝗲 𝗖𝗼𝘀𝘁, 𝗚𝘂𝗮𝗿𝗮𝗻𝘁𝗲𝗲𝗱. I hope you become one of my many happy clients. Reach out by inviting me to your project. I look forward to it! All the best, Anas ⭐️⭐️⭐️⭐️⭐️ 🗣❝ Muhammad is really great with AWS services and knows how to optimize each so that it runs at peak performance while also minimizing costs. Highly recommended! ❞ ⭐️⭐️⭐️⭐️⭐️ 🗣❝ You would be silly not to hire Anas, he is fantastic at data visualizations and data transformation. ❞ 🗣❝ Incredibly talented data architect, the results thus far have exceeded our expectations and we will continue to use Anas for our data projects. ❞ ⭐️⭐️⭐️⭐️⭐️ 🗣❝ The skills and expertise of Anas exceeded my expectations. The job was delivered ahead of schedule. He was enthusiastic and professional and went the extra mile to make sure the job was completed to our liking with the tech that we were already using. I enjoyed working with him and will be reaching out for any additional help in the future. I would definitely recommend Anas as an expert resource. ❞ ⭐️⭐️⭐️⭐️⭐️ 🗣❝ Muhammad was a great resource and did more than expected! I loved his communication skills and always kept me up to date. I would definitely rehire again. ❞ ⭐️⭐️⭐️⭐️⭐️ 🗣❝ Anas is simply the best person I have ever come across. Apart from being an exceptional tech genius, he is a man of utmost stature. We blasted off with our startup, high on dreams and code. We were mere steps from the MVP. Then, pandemic crash. Team bailed, funding dried up. Me and my partner were stranded and dread gnawed at us. A hefty chunk of cash, Anas and his team's livelihood, hung in the balance, It felt like a betrayal. We scheduled a meeting with Anas to let him know we were quitting and request to repay him gradually over a year, he heard us out. Then, something magical happened. A smile. "Forget it," he said, not a flicker of doubt in his voice. 
"The project matters. Let's make it happen!" We were floored. This guy, owed a small fortune, just waved it away? Not only that, he offered to keep building, even pulled his team in to replace our vanished crew. As he spoke, his passion was a spark that reignited us. He believed. In us. In our dream. In what he had developed so far. That's the day Anas became our partner. Not just a contractor, but a brother in arms. Our success story owes its spark not to our own leap of faith, but from the guy who had every reason to walk away. Thanks, Anas, for believing when we couldn't.❞
    Apache Spark
    Solution Architecture Consultation
    AWS Lambda
    ETL Pipeline
    Data Management
    Data Warehousing
    AWS Glue
    Amazon Redshift
    ETL
    Python
    SQL
    Marketing Analytics
    Big Data
    Data Visualization
    Artificial Intelligence
  • $20 hourly
    I'm a digital architect, blending artistry with engineering to build robust, scalable, and innovative solutions. With a deep passion for technology, I create exceptional digital experiences that drive business growth. 𝗠𝘆 𝗧𝗲𝗰𝗵 𝗔𝗿𝘀𝗲𝗻𝗮𝗹 𝗙𝘂𝗹𝗹-𝗦𝘁𝗮𝗰𝗸 𝗠𝗮𝘀𝘁𝗲𝗿𝘆: I wield JavaScript (React, Next, Vue, Express, Node) with precision, PHP (Laravel, CodeIgniter) for backend muscle, and Python (Django, Flask) for AI-driven magic. 𝗖𝗿𝗼𝘀𝘀-𝗣𝗹𝗮𝘁𝗳𝗼𝗿𝗺 𝗕𝗿𝗶𝗹𝗹𝗶𝗮𝗻𝗰𝗲: React Native and Flutter are my tools for creating stunning mobile apps that seamlessly bridge the web and native worlds. 𝗗𝗮𝘁𝗮 𝗩𝗶𝘀𝘂𝗮𝗹𝗶𝘇𝗮𝘁𝗶𝗼𝗻 & 𝗖𝗼𝗹𝗹𝗮𝗯𝗼𝗿𝗮𝘁𝗶𝗼𝗻: I leverage Power BI to transform complex data into compelling insights and SharePoint to create dynamic, collaborative platforms. 𝗔𝗜 𝗜𝗻𝗻𝗼𝘃𝗮𝘁𝗶𝗼𝗻: Fueled by Python's prowess, I breathe intelligence into your products using cutting-edge AI/ML algorithms. 𝗪𝗵𝗮𝘁 𝗦𝗲𝘁𝘀 𝗠𝗲 𝗔𝗽𝗮𝗿𝘁: 𝗣𝗿𝗼𝗯𝗹𝗲𝗺 𝗦𝗼𝗹𝘃𝗲𝗿: I don't just code; I analyze, strategize, and deliver solutions that exceed expectations. 𝗣𝗮𝘀𝘀𝗶𝗼𝗻𝗮𝘁𝗲 𝗟𝗲𝗮𝗿𝗻𝗲𝗿: The tech world is constantly evolving, and I'm always hungry to explore new frontiers. 𝗖𝗹𝗶𝗲𝗻𝘁-𝗖𝗲𝗻𝘁𝗿𝗶𝗰: Your vision is my mission. I collaborate closely to transform ideas into reality. 𝗟𝗲𝘁'𝘀 𝗕𝘂𝗶𝗹𝗱 𝗦𝗼𝗺𝗲𝘁𝗵𝗶𝗻𝗴 𝗔𝗺𝗮𝘇𝗶𝗻𝗴 𝗧𝗼𝗴𝗲𝘁𝗵𝗲𝗿! Whether you need a complex web application, a high-performance mobile app, insightful data visualizations, or effective collaboration platforms, I'm your go-to tech wizard. Let's create digital magic that leaves a lasting impact.
    Apache Spark
    Microsoft SharePoint
    Microsoft Power BI
    Flutter
    React Native
    Next.js
    CodeIgniter
    Laravel
    MongoDB
    Vue.js
    ExpressJS
    Node.js
    React
  • $15 hourly
    -- Cloud Big Data Engineer: I am an Azure-certified data engineer with professional experience in Databricks, Data Factory, Stream Analytics, Event Hubs, and Data Lake Store. I have developed API-driven and Data Factory orchestration, as well as Databricks job orchestration, cluster creation, and job management through the Databricks REST API. I have successfully developed around 3 full-scale enterprise solutions on the Microsoft cloud (Databricks, Data Factory, Stream Analytics, Data Lake Store, Blob Storage). I have developed Databricks orchestration and cluster management mechanisms in .NET C#, Java, and Python. I hope my experience and knowledge will let me serve you well. I have expertise in the following big data and cloud tools: Apache Spark, Scala, Python, Kafka, Data Factory, Stream Analytics, Event Hubs, Spark Streaming, Azure Data Lake Store, Azure Blob Storage, Parquet files, Snowflake MPP, Databricks, .NET C#. -- Web Scraping and Data Mining: I have professional experience in data mining and web scraping with Selenium and Python. I have professional experience scraping many e-commerce sites such as Amazon, AliExpress, eBay, and Walmart, social sites such as Facebook, Twitter, and LinkedIn, and many other sites. I will provide the required scraped data and the script, as well as support. I hope my relevant professional experience and knowledge will let me serve you well.
    Apache Spark
    Google Cloud Platform
    Apache Airflow
    Data Management
    Microsoft Azure
    Snowflake
    Big Data
    Selenium
    Data Scraping
    Python
  • $35 hourly
    👉6+ years in Data Engineering, Analysis & Science 👉Completed projects with 99.7% positive Feedback 👉Delivered significant cost savings and scalability improvements through optimized data pipelines on GCP, AWS and Azure platform 👉Achieved 99.9% data reliability and security across multiple large-scale deployments, ensuring robust and error-free data systems. Following are my major skills: 🟢Data Engineering: ● ETL: Specialize in architecting end-to-end data pipelines and ETL processes across e-commerce, real estate, logistics, pharma, and healthcare, ensuring accurate and high-quality data movement from diverse sources to data warehouses. ● Cloud Data Solutions: Proficient in designing end-to-end data solutions on GCP, AWS and Azure optimizing cloud services for scalable data pipelines. ●API Management: Experienced in API integration for data pipelines, including social media, Screaming Frog, DataforSEO, Apollo, GA4, Google Search Console, HubSpot. ● Data Warehousing: Skilled in constructing and refining data warehouses with BigQuery, Redshift, and Snowflake. 🟢Data Analyst: ●Data Analysis: My proficiency in data analysis allows me to uncover meaningful patterns and trends within datasets using python and SQL. I can perform various types of analysis. 🔸Predictive analysis 🔸Statistical analysis 🔸Descriptive analysis 🔸Exploratory analysis 🔸Diagnostic analysis 🔸Time Series Analysis ●Data Visualization: I visualize complex data insights clearly using Tableau, Power BI, Looker Studio and Google Analytics 4 (GA4). 🟢Machine Learning: Experienced in developing and deploying machine learning models across various domains, including regression, classification, clustering, and natural language processing. These models drive predictive analytics and enable data-driven decision-making. ●Time-Series Forecasting: Led projects in time-series forecasting, optimizing tariffs and revenue using deep learning models like LSTM and GRU. 
● Recommendation Systems: I specialize in designing and deploying recommendation systems that boost customer engagement and sales. ● Deep Learning: Proficient in TensorFlow and PyTorch, I deploy deep neural networks for tasks including image recognition, natural language processing, and recommendation systems. 🟢Tools and Technologies: ● Data Engineering: ETL | GCP | Azure | API | DBT | Firebase | Redshift | Synapse | Big Query | Data Robot | Oracle DB | Cloud Composer | DataProc | Data Lake | Google Cloud Function | Google Pub/Sub | Apache Airflow|Data Form | Google Cloud Run | Docker | Google Cloud Storage | Google App Script ● API: Screaming Frog | DataforSEO | Apollo | Facebook | GA4 | Google Search Console | HubSpot | SerpAPI ● Data analysis: Power BI | Looker | Tableau | SQL | Python ● Data Science: Clustering | Time series forecasting | Recommendation system ● Other Platforms: Heroku | Stripe | Google Sheets | Salesforce | Confluence | Jira | Slack I prioritize data integrity to consistently deliver impactful results and with a strong foundation in effective communication and collaboration, I bring flexibility and a proactive approach to every project.
    Apache Spark
    dbt
    SQL Server Integration Services
    ETL
    Google Cloud Platform
    API Integration
    Looker
    Microsoft Azure
    BigQuery
    Data Engineering
    Data Modeling
    Data Integration
    Data Analysis
    SQL
    Data Science
    Python
  • $350 hourly
    I am a full-stack data scientist/data engineer with 16000+ hours on Upwork and many more offline. I am familiar with almost all major tech stacks on data science/engineering and app development. Front end: ui/ux, nodejs, react, angular Back end: micro service, rest api, database performance optimization CI/CD: jenkins, gitlab, Kubernetes Security: secure file transferring and oAuth etc ETL: scriptella, informatica, nifi Search Engine: elasticsearch Software Design Documentation: graphviz, mermaid Web scraping: Scrapy, Rotation Proxy, Selenium, Beautifulsoup Data Science: Python, Java, R, C/C++, NLP/NLG, AIGC, GPT-3, ChatGPT, HuggingFace, ML Predictive Modeling, Reinforcement Learning, Knowledge Graph Neo4j, Recommendation Engine, Deep Learning, Computer Vision, OCR, GAN, Stable Diffusion, Signal Processing, Voice Clone, Chatbot, Sports betting, Price Optimization, Time Series Analysis/Forecasting, Crypto, Solidity, Tokenomics etc. Research: Solid experience in ML, Algorithm, Bioinformatics, Healthcare. Published around 100 papers on top-tier conferences and journals and 7 patents. I worked as a research scientist on machine learning and algorithm for IBM T.J. Watson Research Center, Industry Solution Group, from 2012 to 2017. I worked as a J2EE software engineer from 2006 to 2008. I obtained a Ph.D of Computer Science from University of California, Los Angeles, with major in Machine Learning and minors in Artificial Intelligence and Data Mining. I have been awarded the Most Outstanding Ph.D Graduate Award, the Northrup-Grumman Outstanding Graduate Student Research Award, the Chancellor Award for Most Outstanding Applicants, all from Computer Science Department, UCLA and the Chinese Government Award for Outstanding Chinese Students Overseas, 2010. I had worked as a consultant for many start-ups on various projects and I have solid background on both research and development. I am also an instructor teaching machine learning related courses on Udemy. 
Simply search my name and you can find my courses there.
    Apache Spark
    Data Scraping
    Smart Contract
    Binance Coin
    Blockchain
    Bioinformatics
    Predictive Analytics
    Python
    Machine Learning
    Recommendation System
    Computer Vision
    Chatbot
    Natural Language Processing
    Deep Learning
  • $45 hourly
    As a highly experienced Data Engineer with 10+ years of expertise in the field, I have built a strong foundation in designing and implementing scalable, reliable, and efficient data solutions for a wide range of clients. I specialize in developing complex data architectures that leverage the latest technologies, including AWS, Azure, Spark, GCP, SQL, Python, and other big data stacks. My extensive experience includes designing and implementing large-scale data warehouses, data lakes, and ETL pipelines, as well as data processing systems that process and transform data in real time. I am also well-versed in distributed computing and data modeling, having worked extensively with Hadoop, Spark, and NoSQL databases. As a team leader, I have successfully managed and mentored cross-functional teams of data engineers, data scientists, and data analysts, providing guidance and support to ensure the delivery of high-quality data-driven solutions that meet business objectives. If you are looking for a highly skilled Data Engineer with a proven track record of delivering scalable, reliable, and efficient data solutions, please do not hesitate to contact me. I am confident that I have the skills, experience, and expertise to meet your data needs and exceed your expectations.
    Apache Spark
    Snowflake
    ETL
    PySpark
    MongoDB
    Unix Shell
    Data Migration
    Scala
    Microsoft Azure
    Amazon Web Services
    SQL
    Apache Hadoop
    Cloudera
  • $55 hourly
    AWS certified solution architect and expert in data engineering. I can provide a variety of custom data solutions, from the architecture and engineering of your data to the storage of data in your data lake and data warehouse. I can architect and implement your data pipeline and data warehouse solution using the latest modern cloud services. I have expertise in Snowflake Computing, Redshift, Teradata, Fivetran, AWS Lake Formation, AWS Glue, Kinesis, Lambda, DynamoDB, and much more. Key strengths include: AWS Glue, AWS DataBrew, QuickSight, Kinesis, Lambda, Redshift, Snowflake, Teradata, RDS, and DynamoDB. Python, PySpark, Java, and PL/SQL developer. Education: Master in Data Science, University of Sterlings, UK; Bachelors in Computer Engineering, GIK University, PAK; AWS Certified Solution Architect; Teradata Associate Solution Architect
    Apache Spark
    Fivetran
    Data Extraction
    Snowflake
    Amazon Redshift
    Java
    CI/CD
    Amazon S3
    Amazon Web Services
    ETL Pipeline
    AWS Glue
    Cloud Migration
    SQL
    Python
    Database Architecture
  • $80 hourly
    Hi, I'm Movses, a Data Engineer with 10+ years of experience. I'm a certified AWS Data Engineer with extensive experience in the AWS data stack. You name it, I've done it. More than data engineering, I love database development. With Oracle, Postgres, SQL Server, and Snowflake, I feel at home. SQL is my native language, and I code in Python, Scala, PL/SQL, and Bash. I specialize in creating ETL/ELT jobs, data pipelines, and orchestration jobs. I'm here to help you overcome challenges and set optimal data goals and designs. I stay involved after the project wraps up, not for the money, but because I only take on projects that genuinely interest me. I believe in delivering results that matter to both of us.
    Apache Spark
    Django
    Oracle Database Administration
    AWS Lambda
    AWS Glue
    Oracle
    Apache Superset
    Data Visualization
    Oracle PLSQL
    Scala
    Python
    Microsoft SQL Server
    SQL
    SQL Server Integration Services
    PostgreSQL
  • $235 hourly
    As a seasoned Data Scientist and AI strategist with over 10 years of experience across diverse industries, I possess a deep understanding of leveraging AI and data analytics to drive business growth and innovation. My expertise includes: 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝘃𝗲 𝗔𝗜 𝗠𝗼𝗱𝗲𝗹𝗶𝗻𝗴: I specialize in developing and implementing cutting-edge Generative AI models to address complex business challenges and unlock new possibilities. 𝗠𝗮𝗻𝗮𝗴𝗲𝗺𝗲𝗻𝘁 𝗖𝗼𝗻𝘀𝘂𝗹𝘁𝗶𝗻𝗴: I provide strategic guidance and actionable insights to businesses seeking to leverage AI and data-driven approaches for enhanced decision-making and transformative growth. 𝗔𝗜 𝗦𝗼𝗹𝘂𝘁𝗶𝗼𝗻𝘀: I offer end-to-end AI product development services, from ideation and design to implementation and deployment, ensuring alignment with your specific business objectives. 𝗗𝗮𝘁𝗮 𝗔𝗻𝗮𝗹𝘆𝘁𝗶𝗰𝘀: I excel in extracting meaningful insights from data, empowering your business with data-driven decision-making capabilities for a competitive edge. 𝗘𝘅𝗲𝗰𝘂𝘁𝗶𝘃𝗲 𝗖𝗼𝗮𝗰𝗵𝗶𝗻𝗴: I provide expert mentorship to executives navigating the complexities of AI adoption and strategy, enabling them to make informed decisions and lead with confidence. 𝗣𝗿𝗼𝗷𝗲𝗰𝘁 𝗠𝗮𝗻𝗮𝗴𝗲𝗺𝗲𝗻𝘁: I possess a proven track record of successfully delivering large-scale AI projects, ensuring seamless execution, timely completion, and optimal outcomes. 𝗞𝗲𝘆 𝗦𝗸𝗶𝗹𝗹𝘀: 𝗣𝗿𝗼𝗴𝗿𝗮𝗺𝗺𝗶𝗻𝗴 & 𝗧𝗼𝗼𝗹𝘀: Python, R, Docker, Bash/Shell, Perl, C++, C 𝗔𝗜 & 𝗠𝗟 𝗧𝗲𝗰𝗵𝗻𝗶𝗾𝘂𝗲𝘀: Natural Language Processing (NLP), Reinforcement Learning, Computer Vision, Transfer Learning, Supervised, Unsupervised, Semi-Supervised Learning 𝗗𝗶𝘀𝘁𝗿𝗶𝗯𝘂𝘁𝗲𝗱 𝗖𝗼𝗺𝗽𝘂𝘁𝗶𝗻𝗴:: Spark, Ray, Dask, MPI 𝗔𝗜 𝗟𝗶𝗯𝗿𝗮𝗿𝗶𝗲𝘀: PyTorch, DeepSpeed, FairScale 𝗖𝗼𝗻𝘀𝘂𝗹𝘁𝗶𝗻𝗴 & 𝗦𝘁𝗿𝗮𝘁𝗲𝗴𝘆: Business Analysis, Management Consulting, Data Analytics, Deep Learning 𝗪𝗵𝘆 𝗖𝗵𝗼𝗼𝘀𝗲 𝗠𝗲? 
𝗣𝗿𝗼𝘃𝗲𝗻 𝗥𝗲𝘀𝘂𝗹𝘁𝘀: I have a history of delivering exceptional results, including resolving multi-million-dollar issues and driving significant revenue increases for my clients (>$40M) 𝗜𝗻𝗱𝘂𝘀𝘁𝗿𝘆 𝗘𝘅𝗽𝗲𝗿𝘁𝗶𝘀𝗲: I have worked across a broad range of sectors, including Healthcare, Pharmaceuticals, Financial Services, Education, Aerospace, and Defense. 𝗖𝗹𝗶𝗲𝗻𝘁-𝗖𝗲𝗻𝘁𝗿𝗶𝗰 𝗔𝗽𝗽𝗿𝗼𝗮𝗰𝗵: I am committed to understanding your unique needs and tailoring solutions that deliver tangible business outcomes. 𝗖𝗲𝗿𝘁𝗶𝗳𝗶𝗲𝗱 & 𝗧𝗿𝘂𝘀𝘁𝗲𝗱: I am an Upwork Expert-Vetted professional, placing me in the top 1% of my field, and I am certified in Emotional and Social Intelligence, demonstrating my commitment to effective communication and client satisfaction. I am eager to collaborate with businesses seeking to embrace the transformative power of AI. Let's work together to achieve your business goals and drive innovation. Contact me today!
    Apache Spark
    LLM Prompt Engineering
    Hugging Face
    Large Language Model
    AI Consulting
    Generative AI
    Artificial Intelligence
    Distributed Computing
    Business Planning & Strategy
    White Paper Writing
    Data Science
    Natural Language Processing
    Computer Vision
    Machine Learning
    Deep Learning
    Python

How it works

1. Post a job

Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.

2. Talent comes to you

Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.

3. Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

4. Payment simplified

Receive invoices and make payments through Upwork. Only pay for work you authorize.


How do I hire an Apache Spark Engineer on Upwork?

You can hire an Apache Spark Engineer on Upwork in four simple steps:

  • Create a job post tailored to your Apache Spark Engineer project scope. We’ll walk you through the process step by step.
  • Browse top Apache Spark Engineer talent on Upwork and invite them to your project.
  • Once the proposals start flowing in, create a shortlist of top Apache Spark Engineer profiles and interview.
  • Hire the right Apache Spark Engineer for your project from Upwork, the world’s largest work marketplace.

At Upwork, we believe talent staffing should be easy.

How much does it cost to hire an Apache Spark Engineer?

Rates charged by Apache Spark Engineers on Upwork can vary with a number of factors including experience, location, and market conditions. See hourly rates for in-demand skills on Upwork.

Why hire an Apache Spark Engineer on Upwork?

As the world’s work marketplace, we connect highly-skilled freelance Apache Spark Engineers and businesses and help them build trusted, long-term relationships so they can achieve more together. Let us help you build the dream Apache Spark Engineer team you need to succeed.

Can I hire an Apache Spark Engineer within 24 hours on Upwork?

Depending on availability and the quality of your job post, it’s entirely possible to sign up for Upwork and receive Apache Spark Engineer proposals within 24 hours of posting a job description.
