Hire the best PySpark Developers in Canada
Check out PySpark Developers in Canada with the skills you need for your next job.
- $25 hourly
- 5.0/5
- (5 jobs)
I am a Data Engineer and tech enthusiast experienced with multiple cloud platforms. Whatever your data-centric need, I am the go-to person!
- Expert with Airflow, Python, and Spark for developing robust end-to-end ETL solutions.
- Experienced in advanced technical writing, with several articles and a book already published.
- Hobbyist turned professional IoT project creator; I will support you throughout the process.
- Regular communication is important to me, so let's keep in touch.
Technical Writing, MQTT, Amazon Web Services, Google Cloud Platform, AWS IoT Core, PySpark, Apache Spark, Apache Airflow, Kubernetes, OpenAI Embeddings, Data Science, Machine Learning, Data Engineering, Python, Raspberry Pi
- $80 hourly
- 4.9/5
- (32 jobs)
I have been working on a variety of projects involving project management, coding, machine learning, neural networks, and data presentation. I am well versed in ML tools, cloud-based applications, and data exploration.
Android Studio, PostgreSQL Programming, Artificial Intelligence, IBM Watson, SQLite Programming, PySpark, Django, Deep Neural Network, Flask, Tableau, Apache Spark, Python, Data Science, Java, Machine Learning Model
- $85 hourly
- 5.0/5
- (5 jobs)
I have 7+ years of experience working in the big data ecosystem, both on-premises and in the cloud. I developed an ETL framework that eases ingestion for complex data pipelines with minimal code for the end developer, ingested streaming data using custom-built Spark sources and sinks, and created a framework to deploy Spark pipelines on Kubernetes using Argo Workflows. Extensive knowledge of Python and Scala; working knowledge of MLOps and DevOps.
Big Data, Apache Spark, Snowflake, MLflow, Kubernetes, Python Scikit-Learn, Apache Airflow, Apache Kafka, PySpark, AWS Lambda, pandas, Python
- $125 hourly
- 5.0/5
- (20 jobs)
You need a data scientist, but finding the right one for your project can be laborious. What makes me trustworthy:
🔸 15 years of experience as a senior data scientist
🔸 Strong work ethic
🔸 Varied industry experience (media measurement, banking, retail loyalty programs (B2C), distribution and logistics (B2B), early-stage startups)
🔸 PhD in Physics
🔸 High Performance Computing background
🔸 Expertise in predictive analytics using supervised machine learning algorithms (regression and classification models) and prescriptive models (mixed-integer optimization, non-linear optimization, pricing models, procurement planning)
🔸 Open-source solutions using Python, PySpark, Jupyter notebooks, Scikit-Learn, NumPy, etc., consuming data from SQL databases, binary files (hdf5, netcdf, ...), or flat files (csv, txt, ...)
I'd like to learn more about your project and better understand your needs. Please reach out with your availability, and we can coordinate a Zoom meeting to discuss further.
Examples of deliverables:
🔸 One of my customers faced the challenge of optimizing its procurement strategy to maximize savings while managing stock levels. The company sources products from multiple suppliers, each offering various price breaks and conditions for free shipping based on minimum order quantities. Using mixed-integer programming (MIP), we aimed to determine the optimal order quantities from each supplier that not only satisfy inventory requirements but also minimize overall purchasing costs. This MIP model involves defining decision variables for the quantity ordered from each supplier, constraints to keep stock levels within specific limits, and an objective function that minimizes the total cost, including adjustments for suppliers' discounts and shipping costs (a toy version is sketched after this profile). By implementing this solution, my customer saves thousands of dollars each week by optimizing their purchases.
🔸 Designing and building custom forecast models for planning future demand and aligning resources accordingly. These models leverage advanced analytics to predict demand patterns accurately, enabling efficient resource allocation. From sales forecasting to inventory management, customized models empower businesses to stay agile and responsive to market fluctuations, ensuring optimal operational efficiency and customer satisfaction.
🔸 Developed propensity models in Jupyter notebooks to predict lead-conversion likelihood, enabling prioritization of leads that align with business objectives and optimize ROI. The models were seamlessly integrated into production on a SQL server.
🔸 Created a price simulation tool to estimate the impact of various input parameters on profitability, accounting for non-linear effects such as churn rate driven by margin and marketing cost to expand the customer base.
🔸 Implemented price optimization models to maximize revenue or profit, collaborating closely with sales and marketing teams to align with business objectives and revenue-management rules.
🔸 Designed a customized optimization process to find the parameter values maximizing a business outcome within constraints.
🔸 Generated synthetic tabular data using Generative Adversarial Networks (GANs) with the ctgan library. Designed a custom cost function to produce synthetic records that adhere to specific criteria, including distribution functions and goal metrics.
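To make the MIP formulation above concrete, here is a toy sketch in PuLP. The suppliers, prices, free-shipping thresholds, and demand figure are hypothetical stand-ins, not the consultant's actual model:

```python
# Toy version of the supplier-ordering MIP described above, using PuLP.
# All numbers (prices, shipping fees, thresholds, demand) are hypothetical.
from pulp import LpProblem, LpMinimize, LpVariable, LpBinary, lpSum, value

suppliers = {
    # name: (unit_price, shipping_fee, free_shipping_min_qty)
    "A": (10.0, 25.0, 50),
    "B": (11.5, 15.0, 30),
}
demand = 80    # units needed to keep stock within limits
M = 10_000     # big-M upper bound on any single order

prob = LpProblem("procurement", LpMinimize)
qty = {s: LpVariable(f"qty_{s}", lowBound=0, cat="Integer") for s in suppliers}
ordered = {s: LpVariable(f"ordered_{s}", cat=LpBinary) for s in suppliers}
pay_ship = {s: LpVariable(f"ship_{s}", cat=LpBinary) for s in suppliers}

# Objective: unit costs, plus shipping fees on orders below the free-shipping threshold.
prob += lpSum(p * qty[s] + f * pay_ship[s] for s, (p, f, _) in suppliers.items())

prob += lpSum(qty.values()) >= demand          # satisfy inventory requirements
for s, (_, _, thresh) in suppliers.items():
    prob += qty[s] <= M * ordered[s]           # no quantity without placing an order
    # If an order is placed but stays below the threshold, the fee must be paid.
    prob += qty[s] >= thresh * ordered[s] - M * pay_ship[s]

prob.solve()
for s in suppliers:
    print(s, int(qty[s].value()), "units, pays shipping:", bool(pay_ship[s].value()))
print("total cost:", value(prob.objective))
```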
Generative Adversarial Network, Price Optimization, Pricing Research, Pricing, Statistics, PySpark, Operations Research, Jupyter Notebook, Predictive Analytics, Python Scikit-Learn, Machine Learning, Python, Data Science, pandas
- $140 hourly
- 5.0/5
- (9 jobs)
Mark is a senior data engineer with over 25 years of experience. His primary expertise is in analytics, data warehousing, and business intelligence, with a specific focus on the Microsoft Fabric platform. He began his consulting career with Accenture and then founded IGENO in 1999. Mark has consulted in Canada, the United States, Australia, and the UAE, across a variety of industries including healthcare, e-commerce, real estate, retail, telecom, financial services, tourism, and hospitality. He holds a Bachelor of Engineering (electrical) from the University of Victoria and an MBA from the University of Western Ontario (Ivey). He also served as an Adjunct Professor at the University of British Columbia (UBC), where he taught a course in data analytics.
MongoDB, Data Engineering, Microsoft Azure, Azure DevOps, React, NodeJS Framework, Microsoft Power BI Data Visualization, Microsoft SQL Server Programming, Data Science, NoSQL Database, JavaScript, PySpark, SQL, Python, Fabric
- $75 hourly
- 5.0/5
- (9 jobs)
🔹 Welcome! I'm Rizwan, a Senior Data Engineer & AI Consultant with 10+ years of experience designing, optimizing, and deploying scalable data solutions, AI-driven analytics, and cloud architectures. I specialize in Big Data, ETL, AI-powered automation, and cloud-based data engineering, helping businesses process large-scale datasets, integrate AI models, and optimize cloud infrastructure for faster decision-making and business growth.
💡 Industries I've Worked With: Healthcare, FinTech, SaaS, E-commerce, Infrastructure, AI Startups
🚀 What I Offer:
✅ Data Engineering & ETL Pipelines – Optimized ETL/ELT workflows using Python, Apache Spark, dbt, and SQL for real-time data processing.
✅ Big Data & Cloud Solutions – Proficient in Hadoop, Databricks, Snowflake, Redshift, and BigQuery for scalable and cost-efficient analytics.
✅ AI & Machine Learning Integration – Expertise in LLMs, Retrieval-Augmented Generation (RAG), NLP, AI chatbots, and data-driven AI workflows.
✅ Database Design & Optimization – SQL & NoSQL databases (PostgreSQL, MongoDB, Pinecone, Faiss) for efficient querying and high-speed analytics.
✅ Cloud Data Engineering & API Development – Deploying AWS (Glue, Redshift, Lambda), GCP (BigQuery, Vertex AI), and Azure Data Factory for secure & scalable AI + data solutions.
🔥 Why Hire Me?
🚀 AI-Driven Data Architect – Built high-performance AI + data pipelines processing billions of records.
⚡ Proven Track Record – Delivered enterprise-grade AI & data solutions across multiple industries.
💡 Scalable & Future-Proof Designs – Implementing highly efficient cloud-native architectures.
📢 Clear Communication & Transparency – Providing real-time updates and reports.
❓ FAQs
💾 What data engineering tools & frameworks do you use? Python, Spark, Hadoop, Databricks, Snowflake, PostgreSQL, MongoDB, Kafka, dbt, Airflow, FastAPI, Power BI, Looker.
⚡ Can you handle enterprise-level AI & big data projects? Absolutely! I specialize in AI-powered data pipelines, LLM integrations, and scalable cloud architectures for large-scale businesses.
☁️ Do you work with cloud AI & data platforms? Yes! Expertise in AWS (Redshift, Glue, EMR), GCP (BigQuery, Vertex AI), and Azure Data Factory.
📊 Can you help optimize my data infrastructure for AI & analytics? Yes! I can design data pipelines, automate AI workflows, and optimize your cloud infrastructure to boost performance and reduce costs.
📩 Let's Connect & Build Scalable AI-Powered Data Solutions! Looking to transform your business with AI, big data, and cloud automation? Let's discuss how I can help optimize your data infrastructure, AI applications, and analytics workflows. 🚀 Let's collaborate and scale your AI-driven success!
Data Migration, BigQuery, Big Data, ETL, Data Analysis, Databricks Platform, Apache Spark, Back-End Development, Django, Django Stack, ETL Pipeline, Python, PySpark, Data Science, Data Engineering
- $25 hourly
- 5.0/5
- (1 job)
As a dedicated and skilled Data Engineer, I specialize in designing, building, and optimizing data pipelines and systems that empower businesses to make data-driven decisions. My expertise spans various technologies and tools, ensuring seamless data processing, storage, and integration tailored to meet unique business needs.
What I Bring to the Table
- Data Engineering Expertise: Proficient in SQL, Python, PySpark, SparkSQL, and modern data frameworks for efficient processing and transformation.
- Cloud Solutions Mastery: Extensive experience with Azure services like Azure Data Factory, Azure Synapse Analytics, and Azure Blob Storage for building scalable cloud-based solutions.
- Data Warehousing & Modeling: Expertise in developing high-performance data warehouses and implementing effective data models to support analytics and reporting.
- End-to-End Workflow Automation: Skilled in orchestrating workflows, automating ETL/ELT processes, and integrating diverse data sources into unified systems.
Why Work With Me?
- Proven ability to deliver scalable, reliable, and secure data solutions tailored to business goals.
- Focus on writing clean, efficient, and maintainable code.
- Strong analytical and problem-solving skills to handle complex data challenges.
- Transparent communication and commitment to delivering quality results on time.
Services I Offer
- Design and development of data pipelines and workflows.
- Implementation of ETL/ELT processes for data integration and transformation.
- Cloud-based data engineering with Azure tools and platforms.
- Performance optimization for data processing jobs and systems.
- Development and management of data warehouses and models.
Let's Collaborate
Whether you're building a new data solution or optimizing your existing systems, I can help you unlock the full potential of your data. Let's work together to create impactful, efficient, and reliable data solutions that drive insights and growth for your business.
Apache Hive, Apache Spark, Apache Hadoop, Database Management System, Sqoop, Linux, PySpark, MySQL, SQL Programming, SQL
- $20 hourly
- 5.0/5
- (13 jobs)
I'm a Software Engineer specialising in Python and Azure, with 5+ years of experience. Certified as an Azure Developer Associate (AZ-203), I can build software using advanced Python practices. Whether it's creating object-oriented backend solutions or navigating the cloud with Azure, I've got it covered. I have previously helped large organisations in the FMCG, Manufacturing, and Finance domains, as well as startups, on digital transformation and backend development projects.
Core Skills:
- Orchestrating cloud infrastructure, with experience setting up Azure VMs/AWS EC2 instances, Cloud Storage, Azure Data Lake, cloud data warehousing resources, Log Analytics, etc.
- System design and architecture of data applications for both cloud and self-hosted solutions
- Data transformation in Python using pandas, koalas, and duckdb
- Building modern data applications with interactive GUIs in Python using PyQt, tkinter, Streamlit, Panel
- Production-grade web scraping solutions using Beautiful Soup and requests
- Building backends and REST APIs in Python using FastAPI, with experience automating Postman tasks
- Custom LLM and Gen AI solutions using FastAPI and OpenAI
- Experience with both SQL and NoSQL databases, including Microsoft SQL Server, Azure Cosmos DB, MySQL
- CI/CD using Git actions and Azure DevOps, and experience using Git for complete version-control efficiency
- Hands-on experience implementing OOP and SOLID design principles in my projects
- Experience with many types of data and databases, which has enabled me to develop optimised solutions for different use cases. I am always keen on developing reusable and optimised solutions.
- Deployment and maintenance of solutions on servers/cloud platforms like AWS, Azure, Heroku.
I thrive in big team vibes, ensuring our projects hit the sweet spot of collaboration and innovation.
Keywords: python, azure, API developer, web scraping, data extraction, aws, pyspark, cloud developer, data developer, data ingestion, pandas, mongo db
Microsoft Azure, Data Management, Databricks Platform, Apache Spark, PySpark, Microsoft Azure SQL Database, Data Analysis, Python, SQL, Microsoft Power BI
- $45 hourly
- 0.0/5
- (1 job)
Pablo Guinea Benito - 🌐 Welcome to My World of Full Stack Development!
🔍 Who Am I? As a Computer Engineer specialised in AI & Robotics, I bring a rich blend of over 6 years in Full Stack Development and cutting-edge technological innovation. My journey in tech is grounded in a strong academic foundation, including a B.Sc. in Computer Science with specialisation in Robotics and a Postgraduate Diploma in Applied AI Solutions Development.
🔧 My Technical Strengths:
- Advanced Technologies: Mastery of Python, JavaScript, R, MATLAB, SQL, HTML, and CSS.
- Data Analytics & Machine Learning: Proficient in TensorFlow, Keras, PyTorch, Pandas, NumPy, and more.
- Web Development: Expertise in Vue.js, React, HTML5, CSS3.
- Business Intelligence & Cloud Technologies: Skilled in Tableau, Power BI, AWS, Azure, and GCP.
🏆 Accomplishments & Projects:
- Co-led the development of an innovative medical question-answering system with SyTaCa, a startup recognised in the European Startup Challenge. This project aimed to revolutionise patient-doctor interactions, targeting a 65% reduction in consultation time.
- Collaborated on the creation of a Decision Support System for early detection of Alzheimer's and Parkinson's diseases, showcasing my ability to apply AI in healthcare.
📚 Educational Background & Certifications:
- DeepLearning.AI TensorFlow Developer Professional Certificate (2023)
- IBM Data Analytics Professional Certificate (2022)
- Machine Learning: Data Science in Python – Udemy (2020)
- Advanced Python for Data Scientists – LinkedIn (2021)
✨ Why Choose Me?
- Innovative Problem-Solver: Leveraging AI and robotics knowledge to develop advanced solutions.
- Effective Communicator: Fluent in both Spanish and English, ensuring clear and efficient communication.
- Proven Leadership: Successful track record in leading project teams and delivering high-impact solutions.
🔭 Looking Ahead: Ready to bring my diverse expertise to your project, I am committed to realising your vision with top-notch quality and efficiency. Let's connect and embark on a journey of innovation and excellence together!
Data Processing, PySpark, Apache Hadoop, LLM Prompt Engineering, Genetic Algorithm, Trading Strategy, Statistical Process Control, Probability Theory, Mathematical Modeling, RESTful API, Python Scikit-Learn, Amazon Web Services, Docker, TensorFlow, Flask
- $28 hourly
- 5.0/5
- (6 jobs)
I am an experienced consultant and developer in the Data Analytics (Big Data & AI) domain. As part of my job responsibilities, I have gained proficiency in gathering, interpreting, and understanding client requirements. Furthermore, I have designed and developed clients' big data analytics workflows and architectures on the cloud in an optimal way by evaluating cost/performance trade-offs. Some of my successfully implemented tasks are:
💎 Designing large-scale big data pipelines using AWS, GCP & Azure.
💎 Developing curation jobs in Apache Spark for data integration, cleaning, entity resolution, and format conversion for optimized performance in complex data processing.
💎 Writing and optimizing Extract, Transform and Load (ETL) jobs in Spark under resource constraints to transform curated data into analyzed data sets where it starts to add value to the business (a minimal example follows this profile).
💎 Developing methods to store transformed full-load/incremental data using Apache Hudi.
💎 Developing automated pipelines to export transformed data from cloud storage services to SQL and NoSQL databases for data warehousing and report creation.
💎 Deploying and maintaining big data infrastructure as code.
💎 Designing strategies for testing data conformity, accuracy, duplication, consistency, validity, and completeness.
💎 Developing automated pipelines to migrate data files among multiple clouds.
💎 Using Databricks for data orchestration and ETL development, and Azure Functions and Queue Storage to push and store events in Redis for near-real-time Spark ETL job processing.
💎 A customer sentiment analysis (Natural Language Processing) project on Google Cloud Platform to check whether customer reviews of the client's products are positive, negative, or neutral.
💎 A Google Cloud Platform-based architecture where Cloud Functions and Cloud Pub/Sub automatically load incoming CSV files from a client's Gmail attachments to Cloud Storage and then to BigQuery for queries and analysis.
💎 Multiple successfully delivered proofs of concept (POCs) in the Big Data Analytics domain.
Highlighted Skills
🔷 Amazon Web Services: EC2, EMR, S3, Step Functions, CloudFormation, Lambda, Kinesis, SNS, CloudWatch, CloudTrail, IAM, Redshift
🔷 Google Cloud Platform: Cloud Functions, Dataflow, Pub/Sub, Cloud Storage, Dataproc, BigQuery, Compute, AutoML, Natural Language
🔷 Microsoft Azure: Data Factory, Databricks, Blob Storage, Redis, Azure Functions, Azure Queue Storage, Azure Synapse
🔷 Big Data: Hadoop, HDFS, Spark, Sqoop, MapReduce, Hudi, ETL, APIs
🔷 Databases: PostgreSQL, SQL Server, AWS Redshift
🔷 Programming: Python, C++, SQL, Spark, MATLAB, Assembly Language, ML (scikit-learn, Keras), R
🔷 Others: Linux, GitHub, GitLab, TFS, Robotics, Microcontrollers, MS Project, MS Office, MS Visio
In my work, I do my best to meet my clients' expectations and deadlines. Looking forward to discussing your project together!
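As an illustration of the Spark curation/ETL work described in this profile, here is a minimal batch-job sketch: read raw CSV, standardize and deduplicate, then write partitioned Parquet. The paths, column names, and cleaning rules are hypothetical placeholders:

```python
# Minimal sketch of a Spark curation/ETL job: raw CSV in, curated Parquet out.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("curation-job").getOrCreate()

raw = (spark.read
       .option("header", True)
       .option("inferSchema", True)
       .csv("s3://raw-bucket/orders/*.csv"))         # hypothetical source path

curated = (raw
           .withColumn("order_ts", F.to_timestamp("order_ts"))
           .withColumn("country", F.upper(F.trim(F.col("country"))))
           .filter(F.col("order_id").isNotNull())
           .dropDuplicates(["order_id"]))            # simple dedup stand-in for entity resolution

(curated.write
        .mode("overwrite")
        .partitionBy("country")
        .parquet("s3://curated-bucket/orders/"))     # hypothetical target path
```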
Data Processing, Terraform, PySpark, Databricks Platform, Data Engineering, Microsoft SQL Server, Data Analytics, ETL, Database, Apache Spark, Python, SQL, Microsoft Azure, Google Cloud Platform, Amazon Web Services
- $95 hourly
- 5.0/5
- (2 jobs)
Data Engineering and Business Intelligence professional with 8+ years of experience specializing in SQL, Python, and Azure Cloud Services. Pioneered a machine learning classifier project that secured $300,000 in funding. Led data integration initiatives, improving operational efficiency by 30%. Expertise in cloud migration, data governance, business intelligence solutions, and database design.
SQL, Microsoft SQL Server, SQL Server Integration Services, Cloud Architecture, Cloud Migration, Microsoft Azure SQL Database, ETL Pipeline, PySpark, Python, Data Engineering, Data Warehousing, Data Modeling, Business Intelligence
- $40 hourly
- 0.0/5
- (0 jobs)
I am a passionate, results-driven Big Data professional with proven knowledge of Python, PySpark, Java, SQL, Hadoop, Kafka, Spark, Scala, Sqoop, R, Hive, Pig Latin, MySQL, NoSQL, HiveQL, Flume, and Oozie. I have utilized hands-on and academic knowledge to provide efficient solutions for Big Data problems, with a willingness to learn and grow.
Big Data, Eclipse IDE, Jupyter Notebook, Microsoft Azure, Data Analysis, Java, Python, Microsoft Excel, ETL Pipeline, Scala, SQL, Apache Kafka, Apache Spark, PySpark, Data Engineering
- $10 hourly
- 5.0/5
- (1 job)
Statistician / data analyst interested in stochastic modelling, parameter estimation, and forecasting.
Lasso, Python Scikit-Learn, pandas, NumPy, CI/CD, SQL, PySpark, R, Forecasting, Experimental Music, Python
- $70 hourly
- 0.0/5
- (0 jobs)
As an experienced and highly skilled data scientist, I have consistently applied my knowledge of machine learning, product analytics, statistical analysis, online experimentation, dashboard creation, and stakeholder communication in previous roles. I thrive on translating business needs into empirical data science projects that bring meaning and focus to overwhelming data and insights. If you are interested in data science or need related projects delivered, I can help.
Data Science, Statistical Analysis, Data Analysis, A/B Testing, Machine Learning, Mode Analytics, GitHub, Apache Airflow, Git, BigQuery, PySpark, Python, SQL
- $45 hourly
- 0.0/5
- (0 jobs)
I am a Data Professional passionate about leveraging data to provide valuable insights and analytics. I am a proactive problem solver with solid organizational and time-management skills.
Machine Learning, Data Analysis, Report Writing, Microsoft PowerPoint, Microsoft Excel, PySpark, Tableau, SQL, Python
- $18 hourly
- 5.0/5
- (1 job)
Hello, I'm Paul, a Data Scientist passionate about transforming raw data into actionable insights. With expertise in:
- Excel, Python (Pandas, NumPy, Scikit-learn), SQL, data visualization tools
- Machine learning algorithms for predictive modeling
- Statistical analysis and data interpretation
I specialize in data analysis and modeling for diverse applications. I believe in regular communication throughout the project to ensure your needs are met. Let's work together to harness the power of data for your business success.
SQL, Data Modeling, Microsoft Power BI Data Visualization, Scientific Illustration, Data Analysis, Environment, Climate Science, Plotly, PySpark, Matplotlib, NumPy, Apache Spark, Microsoft Excel, Python, Data Science
- $80 hourly
- 0.0/5
- (0 jobs)
I am a Data Engineering Specialist with over 10 years of experience building robust, scalable data pipelines and solutions for governments, banks, and the transportation industry. My expertise includes Python, MongoDB, Sybase, PostgreSQL, and Azure cloud technologies. I specialize in ETL processes, database optimization, and big data analytics to help businesses transform their data into actionable insights. Let's collaborate to make your data work efficiently!
Snowflake, PySpark, Databricks Platform, Microsoft Azure, ETL Pipeline, ETL, Data Extraction
- $75 hourly
- 0.0/5
- (0 jobs)
Senior Data Engineer with over 6 years of experience in data engineering and data science. Specialized in data pipeline optimization and cloud computing solutions, with a passion for technological innovation and continuous improvement. Proven experience across a range of sectors, including energy, transportation, and finance.
Kubernetes, PySpark, Python, SQL, Data Analysis, ETL, ETL Pipeline, Data Extraction
- $40 hourly
- 0.0/5
- (0 jobs)
TECHNICAL SUMMARY
Microsoft Azure Data Engineer Associate certified Data Engineer with over 10 years of experience in the IT industry and more than 4 years of expertise in data engineering. Demonstrates a strong focus on designing, developing, and maintaining sophisticated data infrastructure and pipelines. Adept at constructing and optimizing data systems, ETL processes, and data warehouses to guarantee the dependable and efficient collection, storage, and retrieval of data. Possesses a comprehensive skill set in data modeling and database management, with a high level of proficiency in tools such as Apache Spark and diverse database technologies. Proven ability to contribute to the creation of data architectures that align with organizational goals.
Data Warehousing, Microsoft Azure, ADF Faces, PySpark, Databricks Platform, Python, SQL, ETL Pipeline, ETL
- $45 hourly
- 0.0/5
- (0 jobs)
I am a Senior IT Developer / Data Engineer with 5 years of experience designing, developing, and optimizing large-scale data pipelines in cloud environments. Currently, I work at TD Bank, where I specialize in data ingestion, transformation, and automation using Azure, Databricks, Kafka, and Delta Lake.
- End-to-End Data Engineering: Expertise in building robust data pipelines for batch and streaming workloads, ensuring efficient ingestion, transformation, and storage of large datasets.
- Cloud & Big Data Technologies: Strong proficiency in Azure Data Factory (ADF), Databricks, Delta Lake, Kafka, and Snowflake, enabling scalable and high-performance data processing.
- Automation & Optimization: Experience implementing CI/CD pipelines, automating data workflows, and optimizing ETL processes to enhance efficiency and reduce manual effort.
- Financial & Fraud Data Processing: Deep understanding of financial data models, fraud detection systems, and regulatory compliance, having worked extensively on projects like DCMCR, TSYS Pipeline, and UAP Ingestion.
- Solution Architecture & Data Governance: Skilled in defining data models, schema enforcement, deduplication strategies, and rule-based transformations to maintain data integrity and quality.
Key Projects & Achievements:
- DCMCR Project – Led the end-to-end processing of financial data, ensuring seamless integration between DSAP, FRAM, and ACI systems while maintaining security and compliance.
- TSYS Pipeline – Developed a rule-based JSON mapping approach for ingesting MBNA, ADS, and fraud logs into Delta tables, handling structured and semi-structured data efficiently.
- UAP Ingestion & Migration – Engineered solutions to support missing batch ingestion in streaming pipelines, transforming Avro source files into Delta tables with optimized workflows (a sketch of this pattern follows below).
- PRM Upgrade – Enhanced existing streaming Delta tables by incorporating new columns and handling schema evolution, solving technical challenges such as 256-parameter limits in linked services.
I am passionate about solving complex data challenges and building innovative solutions that drive data-driven decision-making. My goal is to transition into a leadership role in data engineering or solution architecture, where I can mentor teams, shape data strategies, and design high-impact cloud solutions.
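A rough sketch of the Avro-to-Delta ingestion pattern mentioned in the UAP project above, assuming a cluster with the spark-avro and Delta Lake packages available; paths, keys, and columns are hypothetical, not the bank's actual pipeline:

```python
# Sketch: read Avro batch files, deduplicate on a business key, and append
# to a Delta table with schema enforcement. All names are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("avro-to-delta").getOrCreate()

src = spark.read.format("avro").load(
    "abfss://landing@account.dfs.core.windows.net/txns/")   # hypothetical path

# Keep only the latest record per transaction id (a simple dedup strategy).
w = Window.partitionBy("txn_id").orderBy(F.col("event_ts").desc())
deduped = (src.withColumn("rn", F.row_number().over(w))
              .filter("rn = 1")
              .drop("rn"))

# Delta enforces the existing table schema on append, rejecting mismatched writes.
(deduped.write
        .format("delta")
        .mode("append")
        .save("abfss://curated@account.dfs.core.windows.net/txns_delta/"))
```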
Bitbucket, GitHub, Microsoft Windows PowerShell, Visualization, Microsoft Azure, Microsoft Azure SQL Database, SQL Server Reporting Services, Data Mining, Data Analysis, Snowflake, Scala, PySpark, SQL, ETL Pipeline, Data Extraction
- $65 hourly
- 0.0/5
- (0 jobs)
SUMMARY AND TECHNICAL SKILLS: 5+ years of ML/MLOps experience in Generative AI, Machine Learning, Data Engineering, and Data Science across the Public Services, Aviation, Market Research, Medical, Finance, Insurance, and Customer Service domains.
SKILLS: AI, ML, Data Science, Cloud Engineering, DevOps, and BI Tools:
• Computer Languages: Python, C++, MS SQL, Java, Shell scripting, JavaScript, Bash
• Toolkits: Databricks, NumPy, Pandas, Sklearn, PySpark, MLOps, NLP, AWS, Azure OpenAI, GCP, Copilot Studio, Visual Studio Code, Docker, Kubernetes, LangChain, Google Vertex AI, Linux, Jupyter, GitLab, PyTorch, TensorFlow, LLM, RAG, Azure DevOps, Azure ML Studio, Azure App Services, MLflow, Spark, Kafka, Azure AI Search, PowerApps, AWS SageMaker
Stanford CoreNLP, PySpark, SQL, Python, Databricks MLflow, ETL, Machine Learning, Machine Learning Model, Artificial Intelligence
- $85 hourly
- 0.0/5
- (0 jobs)
AVAILABILITY
* Interview Availability: 1-2 days' notice.
* Start: 2 weeks' notice upon offer (open to discussion).
* Vacations: No vacations planned for the next 6 months.
* Work Status: Permanent resident in Canada; no sponsorship required.
PROFESSIONAL SUMMARY
* I'm a Senior Big Data Engineer with 8+ years of experience providing both on-premises and cloud-based solutions across industries such as finance, technology, insurance, capital markets, telecommunications, and government. I have extensive experience designing and implementing ETL pipelines that process terabytes of data daily, ensuring efficient and scalable data integration across various systems.
* Azure Certified Data Engineer Associate and Databricks Certified Data Engineer Associate with sound analytical and big data expertise, having hands-on experience across multiple data
Tableau, Snowflake, Apache Hadoop, PySpark, Databricks Platform, Microsoft Azure, Python, SQL, Data Extraction, Mining, Data Analysis, ETL Pipeline, ETL
- $35 hourly
- 0.0/5
- (0 jobs)
SUMMARY
* Skilled IT professional with 8+ years of diverse experience, excelling as a Big Data Engineer for the past 4+ years. Proficient in developing industry-specific software applications and implementing Big Data technologies in core and enterprise environments.
* Expertise in analysis, design, development, deployment, and integration, utilizing SQL and Big Data tools.
* Proficient in SQL programming across various databases including MySQL, SQL Server, Oracle, Cassandra, and HBase.
* Demonstrated proficiency in Hadoop architecture and components such as HDFS, Hive, Pig, and MapReduce.
* Skilled in Scala for functional programming and Python for scripting and data analysis.
* Experience in Extraction, Transformation, and Loading (ETL) processes, adept at data processing and manipulation using Apache Spark.
* Certified AWS Solutions Architect Associate with hands-on experience in AWS services such as
PySpark, Apache Spark, Scala, SQL, Data Analysis, ETL Pipeline, Data Extraction, ETL
- $70 hourly
- 0.0/5
- (0 jobs)
Professional Summary
* Over 15 years of IT experience, including 7+ years in cloud analytics using Azure, AWS, GCP, Snowflake, ETL, Business Objects, and SAP HANA.
* Experience with Azure transformation projects and Azure architecture decision-making; architect and implement ETL and data-movement solutions using Azure Data Factory (ADF).
* Implemented Copy activity and custom Azure Data Factory pipeline activities for on-cloud ETL processing against various databases, including Azure Synapse and Snowflake.
* Spearheaded the development of medium- to large-scale BI solutions leveraging Azure Data Platform services, including Azure Data Factory (ADF), Azure Databricks, and Snowflake, alongside Power BI for comprehensive data analytics and visualization.
* Executed meticulous data extraction, transformation, loading, ingestion, integration, cleansing, and aggregation processes across diverse sources, ensuring data accuracy.
Microsoft Azure, PySpark, Python, Snowflake, Microsoft Power BI, Tableau, PostgreSQL, MongoDB, SQL, Data Engineering, Data Visualization, Data Modeling, Data Warehousing, Data Analysis, ETL
- $34 hourly
- 0.0/5
- (0 jobs)
I'm a Business Intelligence Analyst with 6+ years of experience helping organizations make data-driven decisions. Whether you need interactive dashboards, data analysis, or workflow optimization, I can help. I'm proficient in Power BI, Tableau, SQL, Python, and database technologies. I manage projects end-to-end and prioritize clear communication. Let's keep in touch!
PySpark, SQL, Python, Tableau, Microsoft Power BI Data Visualization, ETL Pipeline, Mining, Beta Testing, Alpha Testing, Data Analysis, ETL, Data Extraction, Data Mining, Agriculture & Mining, Analytical Presentation
- $20 hourly
- 0.0/5
- (1 job)
Over 5 years of experience as a Data Engineer with expertise in Data Analytics & Statistics, Web Scraping, Big Data, Data QA, and Business Intelligence. I have experience working with automation, data analysis, and dashboard creation.
Technical Skills:
• Data Analysis & Data Visualization
• Data Mining
• Web Scraping
• Big Data Analysis using PySpark, Pandas, MySQL
• Business Intelligence using Google Data Studio and Tableau
• Quality Assurance of Data
• Creating Data Analysis tools using Python
• Google Sheets & Excel
Programming Languages:
• Python
• SQL
PySpark, Data Science, Web Scraping, Data Analysis, Selenium WebDriver, Data Visualization, Data Mining, Python, Data Scraping, pandas
- $25 hourly
- 5.0/5
- (1 job)
As a dedicated and detail-oriented Data Engineer with 4 years of proven experience, I offer a robust skill set tailored to meet your project needs. Proficient in designing, constructing, and maintaining highly scalable data management systems, I specialize in developing efficient data pipeline architectures capable of handling large volumes of complex data from diverse sources. My strengths lie in collaborating closely with data and business analysts to enhance data models and systems, ensuring alignment with evolving business requirements.
Key highlights of my experience include:
- Developing ingestion frameworks for streaming data sources using technologies such as Databricks, PySpark, Apache Hive, and Kafka, resulting in a 60% reduction in development time.
- Successfully designing and implementing data ingestion frameworks and pipelines, including the configuration of Spark Streaming to receive real-time data from Kafka (see the sketch below).
- Leveraging Azure Databricks to mount and transform different data storages, improving data quality and speeding up analysis processes.
- Coordinating automation strategies to improve system monitoring, reducing service tickets through process restructuring.
- Completing a Master's degree in Applied Computing with a focus on Artificial Intelligence from the University of Windsor, complemented by certifications and a strong academic background in Computer Science.
With a track record of delivering high-quality projects on time and within budget, I am committed to continuous learning and staying abreast of cutting-edge technologies to drive innovation. I am adept at project management, delegation, and maintaining thorough documentation, ensuring seamless collaboration within teams. If you are seeking a skilled Data Engineer with a proven ability to tackle complex challenges and deliver results, I am confident in my ability to exceed your expectations.
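A minimal sketch of the Kafka-to-Spark configuration mentioned above, using PySpark Structured Streaming. The broker address, topic, schema, and paths are hypothetical, and the spark-sql-kafka and Delta Lake packages are assumed to be available on the cluster:

```python
# Sketch: consume JSON events from Kafka and land them in a Delta table.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-stream").getOrCreate()

schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
])

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
          .option("subscribe", "events")                      # hypothetical topic
          .load()
          .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Write the parsed stream to a Delta table, with checkpointing for recovery.
query = (events.writeStream
         .format("delta")
         .option("checkpointLocation", "/mnt/checkpoints/events")
         .outputMode("append")
         .start("/mnt/curated/events"))
query.awaitTermination()
```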
Microsoft Azure, Azure DevOps, Apache Kafka, PySpark, Relational Database, Django, Azure Machine Learning, Apache Spark, Machine Learning, Keras, TensorFlow, Python, C#, WordPress, Java
How hiring on Upwork works
1. Post a job
Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.
2. Talent comes to you
Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.
3. Collaborate easily
Use Upwork to chat or video call, share files, and track project progress right from the app.
4. Payment simplified
Receive invoices and make payments through Upwork. Only pay for work you authorize.