Hire the Best PySpark Developers in Brazil

Upwork is rated 4.5/5 by G2 peer reviewers, based on more than 3,000 reviews on G2.
Leo R.

Curitiba, Brazil

$40/hr
4.1
9 jobs

You probably think clicking "deploy" on Databricks from the cloud marketplace is all it takes to build a modern data stack. Instead, you get unmanageable infrastructure, skyrocketing costs, and pipelines feeding reports nobody trusts. I fix that. No agencies, no bloat. Just a multi-certified Cloud Solutions Architect with 5+ years of experience building automated, high-integrity platforms that turn raw data into a competitive advantage.

If you send me an invitation or a message, I'll send back a personalized Loom video on how I may be able to help you and, of course, to prove that I'm the real deal: no AI involved!

Whether you are building a greenfield lakehouse from scratch or migrating legacy systems to the cloud, I architect efficient, cost-effective environments that scale without the overhead. I understand the business bottom line just as well as the underlying code.

✪ 100% Job Success Score | 5.0★ average
✪ Proven experience on multi-cloud architectures

💡 What I do:
  • Data Platform Engineering: I build production-ready environments using Terraform. No manual marketplace or standard deployments that break at scale.
  • Reliable Data Engineering: Raw data becomes actionable. I build resilient Medallion architectures and automated ETL/ELT pipelines so your stakeholders actually trust the numbers.
  • Production MLOps: I bridge the gap between data engineering and machine learning. Using MLflow and Databricks Model Serving, I operationalize models into scalable, real-time REST endpoints and automated streaming inference pipelines.
  • Governance & Security: Proper data governance using Unity Catalog (no legacy Hive metastores) to ensure your data is accessible, secure, and future-proof.
  • Cloud Cost Optimization: Most companies overspend on cloud infrastructure. I architect systems that pay for themselves in weeks by eliminating overhead and inefficiencies through auditing and monitoring.

✅ Certifications (verified):
  • Databricks Professional Data Engineer
  • Databricks Associate Data Engineer
  • Databricks Lakehouse Fundamentals
  • GCP Professional Data Engineer
  • GCP Associate Cloud Engineer
  • GCP Cloud Digital Leader
  • AWS Associate Solutions Architect
  • AWS Cloud Practitioner

🔧 Experience with Cloud Services:
  • Databricks: Workflows, LDP (Lakeflow Declarative Pipelines), Unity Catalog, Databricks SQL, MLflow.
  • Amazon Web Services (AWS): EMR, Athena, Redshift, Glue, S3, RDS, Kinesis Data Firehose, and Kinesis Data Streams.
  • Google Cloud Platform (GCP): BigQuery, Dataform, Composer, Dataflow, Dataproc, Cloud Storage, Pub/Sub, Cloud Functions, and Looker Studio.
  • Microsoft Azure: Data Factory, Synapse, and Storage Account.
  • Others: Terraform, dbt, Airflow, Airbyte, Hadoop, and Hive.

⚙️ Core expertise:
  • Roles: Data Architect, Data Engineer, Solutions Architect, Platform Engineer
  • Platforms: Databricks (Delta Lake, Unity Catalog, Lakeflow, Workflows), BigQuery
  • Infrastructure: Infrastructure as Code (IaC), Terraform, Multi-Cloud (AWS, GCP, Azure)
  • Architecture: Medallion Architecture, Data Lakehouse, Data Governance, Data Quality, Machine Learning
  • Engineering: PySpark, Python, SQL, dbt, Apache Airflow, ETL/ELT, CDC, Batch and Stream Processing

  • PySpark
  • Cloud Architecture
  • Cloud Computing
  • Databricks Platform
  • Data Engineering
  • Python
  • SQL
  • Apache Airflow
  • Google Cloud Platform
  • Amazon Web Services
  • Microsoft Azure
  • ETL
  • Data Analysis
  • Bash
  • Data Modeling
  • Data Warehousing
  • Continuous Improvement
Paulo P.

Sao Vicente, Brazil

$10/hr
5.0
9 jobs

Hi there, thanks for considering me for your project! 👋

My name is Paulo, a Computer Science graduate from Brazil with 2.5 years of experience in Data Engineering using Python. With a strong background in algorithms, mathematics, and logic, I'll quickly adapt to new tools and technologies to find the best solutions for your needs. 📚 🇧🇷

What I can do for you:
  • Web Scraping – Extract and process data from websites 🔎
  • Develop ETL & Data Pipelines – Automate data collection and transformation 🔄
  • API Integrations – Connect and exchange data efficiently 🔌
  • Train and Deploy Machine Learning Models – Build intelligent, data-driven solutions 🤖

I'll break down technical concepts in an intuitive and easy-to-understand way. 💡
My work is driven by commitment, dedication, and professionalism. 👨🏾‍💻
Available in your time zone for smooth collaboration. ⌚️

  • ETL
  • Data Extraction
  • ETL Pipeline
  • Machine Learning
  • Machine Learning Model
  • Database Modeling
  • Database Architecture
  • Data Engineering
  • Data Cloud
  • Data Integration
  • Big Data
Pedro S.

Porto Alegre, Brazil

$8/hr
5.0
13 jobs

I have almost 5 years of experience working with data, having started as a data analyst, then data scientist, and for the past 4 years, as a data engineer. I've worked on many projects related to API consumption, web scraping, and automation. I work with Python, SQL, PySpark, and AWS services like Glue, S3, Redshift, and IAM. I also have experience with Azure services like ADLS and Functions. I have hands-on experience building scalable, production-grade pipelines using the medallion architecture, with automation and orchestration through serverless services, as well as experience with infrastructure-as-code practices using Terraform. I'm passionate about clean engineering, automation, and creating end-to-end data solutions that drive business value. For web scraping, these are the tools I use depending on the need: requests, Selenium, bs4, and Playwright. To deal with anti-bot measures, we can always use proxies (which can be gathered for free, and I already have a database of them), user agents, and cookies to mimic human-like behavior. We scrape the data as JSON, HTML, XML, or plain text and turn it into structured data such as an Excel file, a CSV file, or a database.

  • PySpark
  • ETL
  • Data Extraction
  • Data Mining
  • ETL Pipeline
  • Web Scraping
  • Python
  • API
  • Data Entry
  • Data Analytics
  • Machine Learning
Felipe C.

Belo Horizonte, Brazil

$10/hr
5.0
3 jobs

I am a Software Engineer with 8 years of experience in developing applications. I have worked with many technologies, including VB.NET, C#, ASP.NET Web Forms, ASP.NET Web API, ASP.NET MVC, jQuery, AngularJS, Angular, Entity Framework, NHibernate, and Delphi. I have experience in both back-end and front-end development. I have a lot of experience in T-SQL (SQL Server), and I also have some knowledge of Oracle, MySQL, PostgreSQL, and MongoDB. For version control, I have worked with Git, Bitbucket, and TFS. Recently I've shifted my career to Big Data: I'm finishing my thesis on Data Science and Advanced Analytics, and I'm currently working as a Big Data Engineer.

  • PySpark
  • Python
  • Databricks Platform
  • Machine Learning
  • SQL
  • Transact-SQL
  • PostgreSQL Programming
  • BigQuery
  • Google Cloud Platform
  • HubSpot
  • Apache Airflow
  • Google Search Console
Christian G.

Divinopolis, Brazil

$150/hr
5.0
32 jobs

I build AI systems that go beyond prototypes: production-ready, well-architected, and designed to scale. With 7+ years of experience across industry and academic research (MSc in NLP, published at ACL & WSDM), I bring both deep technical skills and a practical, results-driven approach to every project.

My clients tend to stay. My longest engagement ran 3 years, and most of my Upwork projects have been 1+ year collaborations. I don't just write code; I ask the right questions, suggest better approaches, and make sure the final product actually works in the real world.

🔎 What I build
★ AI Agents & Chatbots: text, voice, multi-language (OpenAI, Anthropic, Azure AI, LiveKit)
★ RAG Pipelines: semantic search, vector databases, MCP servers, knowledge retrieval
★ NLP & LLM Applications: classification, entity recognition, summarization, Q&A
★ Data Extraction: OCR, web scraping, document parsing, transaction processing
★ Cloud-Native Backends: microservices, event-driven architectures, serverless
★ Real-Time Audio/Video Processing: transcription, translation, speaker diarization
★ MLOps & Monitoring: experiment tracking, evaluation, observability

🛠️ Tech I work with daily
★ Python: FastAPI, Flask, Streamlit, Pandas, scikit-learn, TensorFlow/Keras, spaCy, NLTK
★ AI/LLM: OpenAI, Anthropic Claude, DeepSeek, Gemini, LangChain, LlamaIndex, CrewAI, Guardrails AI
★ Cloud: AWS (Lambda, SQS, S3, Batch, EC2, RDS, SageMaker), Azure, GCP, Terraform, Docker
★ Databases: PostgreSQL, Supabase, MongoDB, DynamoDB
★ Vector DBs: Weaviate, Pinecone, Milvus, pgvector
★ MLOps: MLflow, Langfuse, DeepEval, Weights & Biases, OpenTelemetry
★ Other: Tesseract OCR, Faster-Whisper, Pyannote, FastMCP, Selenium

🚀 Recent projects
→ 3-year AI platform for a finance company: chatbot (voice + text), OCR pipelines, merchant classification with GPT-4o, web scraping, MLOps setup
→ Real-time transcription & translation system on AWS for multi-language conferences: GPU-accelerated Whisper, speaker diarization, LangChain summarization
→ RAG backend with MCP server: semantic search across Slack, ClickUp, and Fireflies using Weaviate, LlamaIndex, and Supabase pgvector

📢 What clients say about me
▸ "Christian is a great developer, and asks relevant questions for the problems we give him, he's not just a 'pair of hands' but a helpful advisor for improving your initial suggestion on how to solve the problem. He's very fast to iterate and delivers code with great quality."
▸ "Went above and beyond to help with code for a large project, and completed tasks quickly and efficiently. Understood exactly what was needed for the job and executed with precision. I will absolutely be working with him again in the future!"

Let's build something great: send me a message and let's talk about your project.

  • Generative AI
  • Natural Language Processing
  • Python
  • Machine Learning
  • Amazon Web Services
  • Claude
  • LangChain
  • Deep Learning
  • Data Science
  • Google Cloud Platform
  • Artificial Intelligence
  • Streamlit
  • Azure Cognitive Services
  • Retrieval Augmented Generation
  • Microservice
  • Audio Transcription
  • MLflow
  • Vector Database
  • Text Summarization
  • Classification
Alexsander S.

Betim, Brazil

$20/hr
5.0
4 jobs

Most data teams have infrastructure that looks good on paper but still can't answer basic business questions fast enough. I fix that end to end. With 7+ years in data engineering, I specialize in Snowflake, Databricks, and AWS, building pipelines that actually run in production without breaking at 6am.

Recent work includes:
  • CI/CD pipeline for a US healthcare client using Schemachange + Snowflake with automated DQ checks and deployment gating
  • Databricks-to-Snowflake migration of a 200-view mart package, rebuilt natively in Snowflake SQL
  • ML-powered matching service on FastAPI with pgvector semantic search and a LightGBM reranker
  • End-to-end RAG pipeline using Airflow, Pinecone, and OpenAI

I work as an independent contractor: no middleman, no overhead. I treat your problem as mine to solve, not a ticket to close. If your data is slow, unreliable, or not being used strategically, let's talk.

  • Apache Spark
  • Microsoft Power BI
  • Python
  • SQL Programming
  • ETL
  • Database
  • Amazon Web Services
  • Snowflake
  • Apache Airflow
  • Apache Kafka
  • Cloud Architecture
  • dbt
  • AI Consulting
  • Microsoft Azure
  • Artificial Intelligence

How it works

Post a job for free

Tell us what you need. Create your own job post or generate one with AI, then filter talent matches.

Hire top talent fast

Consult, interview, and hire quickly, so you can meet the freelancers you're excited about.

Collaborate easily

Use Upwork to chat or video call, share files, and track project progress right from the app.

Payment simplified

Manage payments in one place with flexible billing options. Only pay for approved work, hourly or by milestone.

How do I hire a PySpark Developer in Brazil on Upwork?

You can hire a PySpark Developer in Brazil on Upwork in four simple steps:

  • Create a job post tailored to your PySpark Developer project scope. We'll walk you through the process step by step.
  • Browse top PySpark Developer talent on Upwork and invite them to your project.
  • Once the proposals start flowing in, create a shortlist of top PySpark Developer profiles and interview them.
  • Hire the right PySpark Developer for your project from Upwork, the world's largest work marketplace.

At Upwork, we believe talent staffing should be easy.

How much does it cost to hire a PySpark Developer?

Rates charged by PySpark Developers on Upwork can vary with a number of factors, including experience, location, and market conditions. See hourly rates for in-demand skills on Upwork.

Why hire a PySpark Developer in Brazil on Upwork?

As the world's work marketplace, we connect highly skilled freelance PySpark Developers and businesses and help them build trusted, long-term relationships so they can achieve more together. Let us help you build the dream PySpark Developer team you need to succeed.

Can I hire a PySpark Developer in Brazil within 24 hours on Upwork?

Depending on availability and the quality of your job post, it's entirely possible to sign up for Upwork and receive PySpark Developer proposals within 24 hours of posting a job description.