Hire the best Apache Spark Engineers in California
Check out Apache Spark Engineers in California with the skills you need for your next job.
- $150 hourly
- 5.0/5
- (12 jobs)
I am a professional cloud architect, data engineer, and software developer with 18 years of solid work experience. I deliver solutions using a variety of technologies, selected based on the best fit for the task. I have experience aiding startups, consulting for small and medium-sized businesses, and working on large enterprise initiatives. I am an Amazon Web Services (AWS) Certified Solutions Architect. I also have expertise in data engineering and data warehouse architecture. I am well versed in cloud-native ETL scenarios drawing from various source systems (SQL, NoSQL, files, streams, and web scraping). I use Infrastructure as Code (IaC) tools and am well versed in writing continuous integration/delivery (CI/CD) processes. Equally important are my communication skills and ability to interface with business executives, end users, and technical personnel. I strive to deliver elegant, performant solutions that provide value to my stakeholders in a "sane," supportable way. I have bachelor's degrees in Information Systems and Economics as well as a Master of Science degree in Information Management. I recently helped a client architect, develop, and grow a cloud-based advertising attribution system into a multi-million-dollar profit center for their company. The engagement lasted two years, during which I designed the platform from inception, conceived and deployed new capabilities, led client onboardings, and led the team that ran the product. The project started from loosely defined requirements, and I transformed it into a critical component of my client's business.
Skills: Apache Spark, Data Management, Business Intelligence, API Development, Amazon Redshift, Amazon Web Services, MongoDB, Data Warehousing, ETL, Node.js, Docker, AWS Glue, Apache Airflow, SQL, Python
- $85 hourly
- 5.0/5
- (5 jobs)
I'm a developer with a diverse range of experiences, having worked as a builder, as a consultant, and in product sales. I understand that the success of a project lies not in the technology itself but in identifying and solving the right problem for yourself or your business. I bring a wealth of experience in various programming languages, frameworks, data modeling in SQL and NoSQL databases, and CI/CD tooling. Additionally, I have expertise in Retrieval-Augmented Generation (RAG) and other Artificial Intelligence/Machine Learning (AI/ML) technologies. I have applied RAG, which combines traditional language models with information retrieval systems, to enhance the quality and accuracy of generated outputs by incorporating external knowledge sources. Projects: - Cohesive AI: Utilized Whisper, GPT-3.5, and Scenario.com to generate automated summaries of sales calls, update data in Salesforce, and attach customers to open feature requests to reduce data duplication and improve data quality. - Barcade.ai: A web-based arcade for LLM-powered agents to compete on games. The goal is to introduce audience interaction into the games, building more engaging experiences where the agents have to adapt to audience-controlled environments.
Skills: Apache Spark, Automation, JavaScript, Terraform, Vault by HashiCorp, CI/CD, React, Python, Golang
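To illustrate the retrieval-augmented generation pattern this profile refers to, here is a minimal sketch of the retrieve-then-generate flow. The toy embedding, example documents, and prompt format are illustrative assumptions only, not this freelancer's actual implementation; a real system would use a learned embedding model, a vector database, and an LLM API for the final generation step.

```python
# Minimal retrieval-augmented generation (RAG) sketch: retrieve the most
# relevant documents for a query, then prepend them to the prompt that a
# language model would receive. The embedding here is a toy bag-of-words;
# a real system would use a learned embedding model and a vector database.
from collections import Counter
import math

DOCS = [
    "Refund requests must be filed within 30 days of purchase.",
    "Enterprise customers get a dedicated support channel.",
    "The API rate limit is 100 requests per minute per key.",
]

def embed(text: str) -> Counter:
    """Toy embedding: lower-cased token counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Ground the model by injecting retrieved context ahead of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How long do I have to request a refund?"))
# The resulting prompt would then be sent to an LLM (e.g. OpenAI or Anthropic APIs).
```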
- $150 hourly
- 5.0/5
- (7 jobs)
Do you need help getting your machine learning initiative moving? I can help move you from concept or POC to a weaponized production implementation. I'm an independent data consultant helping small and medium enterprise companies develop and execute on data strategy. I specialize in helping where off-the-shelf solutions can't. I've worked in a variety of industries including ecommerce, aerospace, dating apps, chatbots, and scraping. If your data pipeline or architecture needs work, I can help get you on track. I believe in best practices for CI/CD and reproducibility, and I can walk you through how I've helped other companies achieve these goals. If you're starting from a sound foundation, I can help you identify and execute on the best use cases for machine learning, computer vision, natural language processing, and stochastic optimization that will impact your business.
Skills: Apache Spark, PostgreSQL, Amazon DynamoDB, Docker, Elasticsearch, AWS Lambda, Machine Learning, TensorFlow
- $130 hourly
- 4.9/5
- (15 jobs)
AWS cloud architect and data engineer with 6+ years of expertise in application development, data engineering, data modeling, data & machine learning pipelines, serverless applications, cloud engineering, ETL/ELT, and applied Artificial Intelligence solutions. I have worked in a wide variety of domains including LLM application development, web analytics, genomics, cybersecurity, advertising, social media, and natural language processing (NLP). Services: * Cloud-hosted application development & implementation * AWS Solutions Architecture * AI & LLM applications * Software engineering * Python package development * Consultation & Auditing * Unit & Integration testing * Documentation Programming languages: * Python * TypeScript * SQL * Bash * Makefile * Cypher Databases, Data Warehouses & Storage: * PostgreSQL & MySQL (RDS, Aurora & Docker hosted) * Hadoop/Hive/PrestoDB on AWS S3 (using AWS Athena) * Snowflake * Neo4j * Kafka * DynamoDB * Weaviate & other vector databases * Redshift * Databricks AWS Services & APIs: * Athena * Batch * CDK * CloudFormation * CloudFront * CloudWatch * CodeBuild * CodeDeploy * CodePipeline * DataSync * DataPipeline * DynamoDB * EC2 * ECS & Fargate * ECR * EFS * EMR, EMR Studio & EMR Serverless * OpenSearch (formerly AWS Elasticsearch) * Glue * IAM * Kinesis * Lambda * Neptune * RDS * Redshift * S3 * SQS * SSM * SNS * StepFunctions * Systems Manager * VPC & Networking Libraries & Frameworks: * CDK for TypeScript & Python * CloudFormation * AWS Serverless Application Model (SAM) * Pytest & Unittest * Apache Spark (PySpark) * pandas, polars * DSPy * Docker * Sphinx Documentation * PyTorch * Hugging Face transformers * scikit-learn * dbt Models: * Ollama models & Modelfiles for custom models * Chatbot & Large Language Model (LLM) powered application development using OpenAI and Anthropic APIs * OpenAI Whisper for audio transcription * Other models offered by Hugging Face's transformers library My GitHub profile includes the following work: * gfe-db: Genomics data pipeline using AWS StepFunctions and Batch to build and load alleles into a Neo4j graph database running on EC2 and served through a public API for clinicians and researchers. * serverless-streaming-reddit-pipeline: Infinitely scalable serverless data mining app built on Lambda, SNS/SQS, Kinesis Firehose, S3, Glue, and Athena, capable of rapidly ingesting gigabytes of parquet data. * aws-open-data-registry-neural-search: Semantic search application over AWS Open Data Registry datasets using the Weaviate vector database. Vector databases store embeddings of records in addition to the records themselves for rapid topic modeling, Q&A search, NER, and similarity searches for images, text, or both using CLIP. (Work in progress as of December 2022.) * aws-cdk-ec2-weaviate: CDK application to deploy a Weaviate instance on an EC2 instance. Configures Weaviate to use text2vec-transformers and sentence-transformers-multi-qa-MiniLM-L6-cos-v1 for text2vec. * emr-managed-scaling-cluster: Automated CloudFormation deployment of an EMR cluster with a managed scaling policy for large workloads. The basic configuration deploys Hadoop, Hive, and Spark applications and can be configured for Flink, MXNet, Pig, TensorFlow, Delta Lake, Hudi, Iceberg, and Presto. Can be combined with EMR Studio for a highly capable analytics & ETL development environment. * emr-studio: Deploy an EMR Studio environment using CDK TypeScript.
Useful for organizations needing a development environment backed by an EMR cluster for analyzing large volumes of data with Jupyter Notebook (also see emr-managed-scaling-cluster). * aws-getting-started-opensearch: CloudFormation deployment following the AWS documentation tutorial for OpenSearch. Can be used as a starting point to get an OpenSearch domain up and running. * neo4j-titanic: Demonstration data pipeline to load the Titanic dataset into the Neo4j graph database. Certifications held: AWS DevOps Professional - Validation number 4115382bda3f421cafede0d8cc11b02a (1/2025 - 1/2028); AWS Developer Associate - Credential ID 00WR156CTN1EQG54 (1/2023 - 1/2026); AWS SysOps Administrator Associate - Credential ID JNX5EPN1ZM14QDCE (1/2023 - 1/2026); AWS Solutions Architect Associate - Credential ID ELY44WSCFEQQ1G5M (11/2019 - 11/2022)
Skills: Apache Spark, Data Scraping, AWS Lambda, Database Design, Neo4j, Bash Programming, Docker, Database Architecture, AWS CloudFormation, PySpark, AWS Glue, Serverless Computing, Amazon S3, ETL Pipeline, SQL, Python
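As a rough illustration of the kind of batch job the parquet-on-S3 projects above typically feed (not code from any of those repositories), here is a minimal PySpark sketch that reads parquet from S3, aggregates it, and writes a partitioned result. The bucket paths and column names are hypothetical placeholders.

```python
# Minimal PySpark batch-aggregation sketch: read parquet from S3, compute a
# daily count per group, and write the result back partitioned by day.
# All paths and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-post-counts").getOrCreate()

posts = spark.read.parquet("s3://example-bucket/reddit/posts/")  # placeholder input path

daily_counts = (
    posts
    .withColumn("day", F.to_date("created_at"))   # assumes a timestamp column named created_at
    .groupBy("day", "subreddit")                  # assumes a grouping column named subreddit
    .agg(F.count("*").alias("post_count"))
)

daily_counts.write.mode("overwrite").partitionBy("day").parquet(
    "s3://example-bucket/reddit/daily_counts/"    # placeholder output path
)
```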
- $100 hourly
- 0.0/5
- (1 job)
At UC Berkeley, I helped submit several Computer Vision and Reinforcement Learning papers. Here is a short list for context: - GANs for Model-Based Reinforcement Learning - Frame Rate Upscaling with Convolutional Networks - Neural Multi-Style Transfer At Amazon, I built a pipeline framework to store and serve sales data for the millions of third-party merchants on Amazon.com. More recently, I have taken on part-time consulting. These are some of the clients and projects I have worked on in the past: - GitHub on Improving Code Classification with SVMs - SAP on Applying HANA Vora to Load Forecasting - Intuit on Quantifying Brand Exposure From Unstructured Text Unlike these previous engagements, I am now looking to take on more projects, each with a smaller time commitment.
Skills: Apache Spark, ETL, Apache Hadoop, Machine Learning, Deep Learning, TensorFlow, Keras, Python, Java, Computer Vision
- $105 hourly
- 0.0/5
- (1 job)
Tableau developer and dashboard designer with five years of development experience. I specialize in KPI selection, Tableau dashboard development, data visualization, and data transformation with SQL. Tenured analyst with over ten years of experience in the marketing analytics landscape. Extensive experience across B2B, B2C, SaaS, and eCommerce business models, focused on improving revenue metrics without relying on increased marketing spend. I thrive in fast-paced, quick-turnaround engagements. If you have raw data that needs to be transformed or engineered into a dataset for visualizing, dashboarding, or general reporting purposes, I can't wait to help you solve your challenges. I have extensive experience working with leading technology providers like Atlassian, providing data engineering support that transforms jumbled data sets into a cohesive series of tables and business logic, ready to be visualized for you. Core skills include: Marketing Analytics | Visualizations | Executive Dashboards | Segmentation | KPI Selection | Product Analytics Technical skills include: SQL | Tableau | ETL | Google Analytics | Data Studio | Looker And accolades I have received from prior stakeholders: - I've worked with Tyler for nearly four years and there is so much I could say about him. Even in the very beginning, Tyler was handed nearly impossible problems to solve. It seems like the harder the problem, the more invested he is in solving it. Never satisfied with being comfortable and always onto the next adventure, Tyler is an analytics powerhouse that I'm lucky to have on the team. In addition to his work ethic, his personality and ability to play as a team make him such an important factor in team morale. He's always helping others, making people laugh, and down to hang out after work for a beer or two. I've enjoyed every minute of working with him! (Jessica Vetorino, Atlassian) - Tyler and I worked together as part of the Stride/Hipchat team. Tyler was part of the extended marketing team and was responsible for the marketing analytics function. He did a kick-ass job of reporting daily/weekly/monthly on the health of the business and actively looking for insights in the data. The last part is most important - you elevate yourself as an analyst when you don't wait for questions to be asked but formulate them yourself. He not only excelled analytically but was also very pleasant to work with. You want to be surrounded by team players like Tyler. I will hire him again in a heartbeat. (Raj Sarkar, Atlassian)
Skills: Apache Spark, Data Visualization Framework, ETL, Business Logic Layer, Databricks Platform, Interactive Data Visualization, Visualization, Data Analytics & Visualization Software, Looker Studio, Looker, Data Visualization, Data Analysis, SQL, Tableau, Google Analytics
- $80 hourly
- 5.0/5
- (1 job)
I'm a full-stack senior software engineer with more than 10 years of experience developing big-data and machine learning systems. I have extensive experience in creating scalable machine learning and data science applications and high-load web applications. My main areas of expertise are: - Python 2/3, R, Scala, Java, PHP - Machine Learning: Regression, Decision Trees, PCA, SVD, Clustering, Image Processing - NLP: Bag of Words, LDA, LSI - Node.js and Python web development - Deep Learning: Word2Vec, Neural Networks, CNN - Frameworks: Play, Django/Flask, Spring, Symfony - Databases: SQL (MySQL, PostgreSQL), NoSQL (MongoDB, HBase, Cassandra), Druid - Distributed Tools: Storm, Kafka, RabbitMQ I am very proficient with data structures and algorithms. I have designed very sophisticated and scalable architectures on different cloud providers, including AWS and Rackspace. I have experience working with Fortune 100 clients and large universities; references can be provided. OTHER - Git version control system knowledge - Project management skills
Skills: Apache Spark, Amazon Web Services, Django, Distributed Computing, Machine Learning, Java, Python, Deep Learning, Scala
- $85 hourly
- 2.8/5
- (14 jobs)
🏅 TOP-RATED Plus 🏆 100% Job Success Score 🔰 7+ years of experience 🕛 3.9k Upwork Hours Hello! My name is Kerim. Thank you for taking the time to learn about my expertise in data engineering, machine learning, and generative AI. I look forward to the opportunity to work together! WHY CHOOSE ME: - **Data Engineering and Machine Learning Expert** Skilled in designing end-to-end data engineering solutions and building machine learning models to generate predictive insights and enhance business intelligence. - **Specialist in Multi-Agent AI Workflows** Experienced in developing complex multi-agent AI workflows, similar to those in CrewAI and AWS Bedrock, allowing for highly interactive, scalable solutions in areas like customer service, recommendation engines, and real-time analytics. - **Python and Cloud Proficiency** Proficient in building and optimizing ETL pipelines, working with large datasets, and deploying machine learning models in cloud environments, particularly AWS. - **Strategic and Insightful Problem-Solver** Adept at understanding project requirements and translating them into structured, actionable AI-driven solutions tailored to deliver impactful results. - **Quick Learner and Adaptable** Continuously up-to-date with emerging technologies, incorporating new ideas and best practices to achieve efficient and innovative solutions. I am open to collaborating with other developers. Whether you're expanding your team or need an experienced data engineering and machine learning specialist, I'm here to help. I have built data engineering and AI-driven platforms across these industries: E-Commerce: ETL pipelines, personalized recommendation systems Cyber Security: Anomaly detection, AI-driven threat intelligence Web Services and API Development: Generative APIs, real-time data processing Financial Technology: Predictive analytics, fraud detection models Healthcare Services: Data-driven patient management, diagnostic AI Gaming Analytics: Player behavior modeling, multi-agent adaptive content Multi-Agent AI Workflows: Experienced in creating advanced, multi-agent systems that perform complex tasks, manage workflows, and interact intelligently across distributed platforms, similar to CrewAI and AWS Bedrock frameworks. Working hours: 40 hrs/week Feel free to reach out to discuss your project and see how I can contribute to your success. Best Regards, Kerim Tricic
Skills: Apache Spark, ETL, Data Migration, Apache Airflow, Computer Science, Database Programming, Python Script, SQL Programming, ETL Pipeline, Data Visualization, Microsoft Power BI, SQL, Python
- $90 hourly
- 0.0/5
- (0 jobs)
SUMMARY An accomplished computer scientist and engineer with 13+ years of experience developing high-quality software and machine learning applications. Equipped with diversified computer science skills including machine learning, deep learning, network communications, the Internet of Things, and software engineering, I am interested in building something superb by applying my skills in data science, deep learning, and the Internet of Things (IoT). What I can bring to you: * Develop deep learning models and applications, reinforcement learning models, and supervised machine learning algorithms. * Internet of Things technologies and IoT smart services: consultation and implementation. * Design, develop, and deploy applications on cloud platforms (Google Cloud Platform, Amazon Web Services) and big data frameworks (Hadoop, Spark). * Design, develop, review, and consult on research proposals. Some of my open source projects: github.com/mehdimo
Skills: Apache Spark, Software Development, Database Management, Big Data, Software Design, Biology Consultation, Database Design, Amazon Web Services, Machine Learning, Python, Reinforcement Learning, Supervised Learning
- $50 hourly
- 0.0/5
- (0 jobs)
I'm a software developer 👩💻 currently pursuing my master's in Computer Science and looking for opportunities.
Skills: Apache Spark, Database, Artificial Intelligence, GraphQL, API, Data Structures, Database Management System, Data Visualization, API Development, Python
- $140 hourly
- 4.5/5
- (1 job)
I am a senior architect and a proven leader in the reliability engineering, DevOps, and cloud computing world, with over 15 years of experience managing highly critical infrastructure for businesses. I have a proven track record of helping organizations get started on the cloud, set up for scale, and ensure the reliability of their systems. I am well versed in implementing the latest CI/CD, WAF, security, and compliance certifications such as CCF and PCI for organizational needs. I am experienced with cloud infrastructure (especially AWS), data systems, and stream/event-processing and batch-processing pipelines. For any data-related question or work, I am here to help.
Skills: Apache Spark, Amazon S3, Amazon EC2, AWS Systems Manager, AWS CloudFormation, Database Design, Oracle Database Administration, DevOps Engineering, Apache Hadoop, Database Architecture, Apache HBase, Kubernetes, MySQL, Cloud Computing, Amazon Web Services
- $75 hourly
- 0.0/5
- (1 job)
Professional with deep and broad experience applying quantitative analysis, statistical learning, Machine Learning, and Artificial Intelligence. Experienced in implementing CI/CD throughout the MLOps lifecycle.
Skills: Apache Spark, Apache Kafka, JavaScript, PySpark, Artificial Intelligence, Flask, Tableau, PyTorch, Keras, TensorFlow, SAS, R, Python, Machine Learning, Machine Learning Model
- $50 hourly
- 0.0/5
- (0 jobs)
I'm a Data Engineer with a strong background in building scalable data pipelines and transforming complex datasets into actionable insights. I've worked extensively with Iceberg and Trino to manage modern data lakehouse environments, and I have hands-on experience with MLOps and building machine learning pipelines in Databricks. Whether it's setting up cloud data warehousing, optimizing workflows, or bridging the gap between data engineering and AI solutions, I'm ready to help. I'm also skilled in Snowflake, dbt, and Airflow, and I'm available for freelance projects where you need expert data engineering to drive results.
Skills: Apache Spark, Automation, ETL Pipeline, Machine Learning, SQL, Apache Airflow, dbt, Python, Snowflake
- $45 hourly
- 0.0/5
- (0 jobs)
I am a seasoned software engineer with expertise in big data and distributed frameworks, underpinned by a strong grasp of block and object storage systems. I have demonstrated notable proficiency in optimizing performance for AI/deep learning and data-intensive applications on parallel and distributed computing frameworks. My academic journey culminated in a Ph.D. in Computer Engineering from Northeastern University, where my research focused on GPU computing, big data frameworks, and modern storage systems. I addressed data transfer bottlenecks between NVMe SSD storage and GPUs, as well as on distributed platforms such as Apache Spark and Hadoop. After completing my academic pursuits, I ventured into the professional landscape at Samsung's memory solutions lab, where I honed my expertise in optimizing data pipeline bandwidth for distributed object storage over fabric, specifically for AI/deep learning applications on the PyTorch platform.
Skills: Apache Spark, Unix Shell, Bash Programming, Linux, SSD, AWS Development, Apache Hadoop, Multithreaded, Parallel, & Distributed Programming Language, Distributed Computing, OpenCL, PyTorch, Python, C++, CUDA
- $60 hourly
- 0.0/5
- (2 jobs)
Generative AI – LLM – STT – TTS – Talking-Face Generation – Chatbot – Spark – SQL – Databricks – ETL – Airflow – Hadoop – Python – DWH – AWS – GCP – MapReduce – BI – Analytics – NoSQL. 𝗢𝗽𝗲𝗻 𝗽𝗿𝗼𝗳𝗶𝗹𝗲 𝘁𝗼 𝘀𝗲𝗲 𝗱𝗲𝘁𝗮𝗶𝗹𝘀. 🤝 𝙒𝙃𝘼𝙏 𝙔𝙊𝙐 𝙂𝙀𝙏 𝙃𝙄𝙍𝙄𝙉𝙂 𝙈𝙀: — Data Engineering and AI Solutions: From crafting sophisticated data platforms tailored to your use case to integrating advanced chatbot solutions, I deliver end-to-end expertise. — Data Scraping and Mining: Extract as much data as you want from any source. — LLM and Chatbot Innovation: Leveraging the latest in AI, I provide guidance in implementing Large Language Models for various applications including conversational AI, enhancing user interaction through intelligent chatbot systems. — BI & Data Visualization: Proficient in tools like Tableau, Power BI, and Looker, I turn complex data into actionable insights. — Automation: Proficient in workflow platforms like Zapier, GHL, and Make.com, I turn complex processes into completed automation workflows. 😉 𝙄 𝙝𝙖𝙫𝙚 𝙚𝙭𝙥𝙚𝙧𝙞𝙚𝙣𝙘𝙚 𝙞𝙣 𝙩𝙝𝙚 𝙛𝙤𝙡𝙡𝙤𝙬𝙞𝙣𝙜 𝙖𝙧𝙚𝙖𝙨, 𝙩𝙤𝙤𝙡𝙨 𝙖𝙣𝙙 𝙩𝙚𝙘𝙝𝙣𝙤𝙡𝙤𝙜𝙞𝙚𝙨: ► AI ENGINEERING LLM models (GPT-x, Llama, LangChain), STT (Whisper, Deepgram), TTS (Bark, ElevenLabs), DeepFake (Wav2Lip, SadTalker), Model Optimization ► BIG DATA & DATA ENGINEERING Apache Spark, Apache Airflow, Hadoop, ClickHouse, Amplitude, MapReduce, YARN, Pig, Hive, HBase, Kafka, Druid, Flink, Presto (incl. AWS Athena) ► ANALYTICS, BI & DATA VISUALIZATION SQL: experienced with complex queries and analytical tasks. BI: Tableau, Redash, Superset, Grafana, DataStudio, Power BI, Looker ► WORKFLOW AUTOMATION Zapier, GoHighLevel, Make.com, Zoho, N8N ► OTHER SKILLS & TOOLS Docker, Terraform, Kubernetes, Pentaho, NoSQL databases 𝙈𝙮 𝙧𝙚𝙘𝙚𝙣𝙩 𝙥𝙧𝙤𝙟𝙚𝙘𝙩𝙨: — Real-time crypto status tracking and technical analysis, AI auditor for blockchain code and crypto contracts — GCP-based ML-oriented ETL infrastructure (using Airflow, Dataflow) — Real-time events tracking system (utilizing Amplitude, DataLens, serverless) — Data Analytics platform for CRM analysis of online-game websites — Data visualization project for an E-commerce company 𝙎𝙠𝙞𝙡𝙡𝙨: — Spark expert and experienced Data Engineer 😉 — Extensive experience with MapReduce and BI tools. — Effective communicator, responsible, team-oriented. — Major remote experience: I build an effective work process for a distributed team. — Interested in high-load back-end development, ML, and analytical research. — Data Visualization expert — Data Scraping expert
Skills: Apache Spark, Data Visualization, Data Scraping, Data Warehousing, Python, Apache Hadoop, Apache Airflow, ETL, Databricks Platform, SQL, Chatbot, AI Text-to-Speech, AI Speech-to-Text, Natural Language Processing, Generative AI
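As a generic illustration of the Airflow-orchestrated ETL work listed in the projects above (not code from any of those engagements), here is a minimal DAG sketch of the extract, transform, load pattern. The task bodies are stubs, and the DAG id, schedule, and function names are illustrative assumptions for Airflow 2.x.

```python
# Minimal Airflow 2.x DAG sketch of an extract -> transform -> load pipeline.
# Task bodies are stubs; the dag_id, schedule, and function names are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw events from the source system")

def transform():
    print("clean and aggregate the raw events")

def load():
    print("write the results to the warehouse")

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Run the three steps in sequence.
    t_extract >> t_transform >> t_load
```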
- $60 hourly
- 0.0/5
- (0 jobs)
I've been a Software Engineer and Data Scientist. I've also been a Business Analyst, IT Project Manager, and Scrum Master, so I can handle both technical and managerial tasks. This has allowed me to communicate in a way that both technical and non-technical people can understand.
Skills: Apache Spark, Agile Software Development, Scrum, ChatGPT, TypeScript, Next.js, API, React, PySpark, JavaScript, Data Science, NumPy, Databricks Platform, Computer Vision, Python
- $60 hourly
- 0.0/5
- (2 jobs)
Hi, I'm Najeeb Al-Amin. I'm a multimedia / IT / business intelligence professional with 4+ years of experience architecting, testing, and deploying highly effective and scalable big data solutions. With 15 years of IT and multimedia experience, plus early project experience concentrated on building physical machine networks, digital automated dialogue replacement (voiceover), and virtual server networks under my belt, I'm a one-stop shop. I am also experienced in complex ETL development, with a focus on installing, designing, configuring, and administering Hadoop architecture as well as ecosystem components, and on building data models to support business reporting and analysis requirements. I am highly familiar and experienced with implementing business intelligence methodologies in a flexible, situation-based, custom way, which has become indispensable in architecting top-tier information systems. Digital audiobook production, remote and on-site digital audio workstation engineering, and video tutorial creation are only some of the many ways a strong multimedia background helps me reshape how I can provide quality services to clients. Effective metadata strategies are key when it comes to being able to deliver solutions, and I am very knowledgeable about transfer capabilities. I have 4+ years of experience working on small to mid-scale data warehouse projects, executing roles such as big data developer, big data consultant, Hadoop administrator, Hive developer, lab technician, and logistics consultant.
Skills: Apache Spark, Apache Hive, Big Data, Data Analysis, Apache Hadoop, Sqoop, Apache Flume, System Administration, Data Modeling, Apache Kafka
- $120 hourly
- 0.0/5
- (0 jobs)
I was a senior machine learning engineer for Fanatics, a leading e-commerce platform. I built machine learning pipelines and models at Fanatics to match customers with products. I am experienced in AWS SageMaker, Apache Spark, and PyTorch. I would love to design and build machine learning solutions for your business.
Skills: Apache Spark, Big Data, Amazon Web Services, NodeJS Framework, Machine Learning, PyTorch
- $150 hourly
- 0.0/5
- (0 jobs)
I am an experienced and ambitious engineering leader with over a decade of professional experience building software solutions.
Skills: Apache Spark, Natural Language Processing, Machine Learning, AWS Glue, PostgreSQL, Apache Kafka, Database Design, Data Science, PyTorch, Data Engineering, AWS Lambda, Snowflake, Rust, API Development, Python
- $25 hourly
- 5.0/5
- (1 job)
PROFESSIONAL SUMMARY * Machine Learning (ML) professional with accomplished, in-depth knowledge, skilled in leading strategic early-stage and large-scale machine intelligence algorithm development and applying machine learning techniques in the field of Additive Manufacturing (AM). * Possess considerable experience across different data types, working on AI solutions incorporating recommender systems, Computer Vision (CV), and Reinforcement Learning (RL). * Demonstrated strong problem-solving and analytical skills in process troubleshooting, root cause analysis, continuous improvement, and high safety standards.
Skills: Apache Spark, Amazon Web Services, Hive, Apache Hadoop, Computer Vision, Data Analysis, Big Data, Apache Hive, Data Visualization
- $75 hourly
- 0.0/5
- (0 jobs)
My expertise lies in data cleaning and organization, a crucial skill in the world of data-driven decision-making. With my background, I can be a valuable asset to businesses seeking a virtual assistant to handle data-related tasks efficiently and effectively. I am an aerospace engineer with two years of industry experience, so I understand the importance of quality work delivered in good time.
Skills: Apache Spark, LangChain, OpenAI API, Machine Learning Algorithm, Data Scraping, Workato
- $30 hourly
- 5.0/5
- (1 job)
I am a versatile AI/ML Developer proficient in diverse software paradigms, driving profitability, optimising business processes, enhancing energy efficiency, and ensuring customer satisfaction. With expertise in scalable AI/ML product development, I have catered to diverse industries including Renewable Energy, Education, Finance, Sales, and Healthcare. My strong leadership background, both academically and professionally, enables me to excel in creating proprietary AI/ML solutions with impactful results.
Skills: Apache Spark, Web Development, Data Analytics & Visualization Software, Deep Learning, Snowflake, Apache Airflow, Generative AI, Energy Modeling Software, SQL, Python, Data Engineering, Data Science, AI Development, Machine Learning, Business Development
- $30 hourly
- 0.0/5
- (0 jobs)
Data Engineer with 1.5 years of experience at a Fortune 5 company, with a background in developing and optimizing data pipelines and migrating codebases. I have also interned at Microsoft and at a startup, where I developed ETL pipelines and APIs for applications, with expertise in Python, Scala, Apache Spark, and cloud technologies.
Skills: Apache Spark, Scala, Natural Language Processing, Machine Learning, ETL Pipeline, ETL, Data Extraction
- $5 hourly
- 0.0/5
- (0 jobs)
I am a seasoned data analyst with 4 years of experience delivering actionable insights to drive business decisions. My expertise lies in transforming complex datasets into clear, impactful strategies that optimize performance and fuel growth. - Proficient in SQL, Python, Tableau, and Power BI, with a strong focus on data visualization and predictive modeling. - Skilled in identifying trends, reducing inefficiencies, and delivering analytics-driven solutions tailored to business needs. - Experienced in collaborating with cross-functional teams to design and implement data-driven strategies. With a commitment to precision and innovation, I excel at turning raw data into meaningful narratives that empower stakeholders. Let's uncover opportunities and achieve measurable success together!
Skills: Apache Spark, Retrieval Augmented Generation, Artificial Neural Network, YARN, Apache Beam, MongoDB, Data Visualization, Time Series Forecasting, Machine Learning Model, Data Mining, ETL Pipeline, Docker, Python, R, SQL
- $125 hourly
- 0.0/5
- (1 job)
I am an expert cloud data architect and backend engineer with 10+ years' experience developing complex microservice-based software stacks, real-time fast-data platforms, production container deployments using Docker and/or Kubernetes, and modern DevOps and CI/CD practices. I completed my Ph.D. in 2016 at the University of California, Davis. My research focused on next-generation broadband access network architectures and services, software-defined networks, and traffic modeling for modern Internet-based video applications such as IPTV. One of my papers received the Best Paper Award at ANTS 2013, and another paper was a semi-finalist for the Corning Outstanding Student Paper Award at OFC 2014. I have served as a reviewer for prestigious journals such as IEEE Transactions on Communications (TCOM), IEEE Journal of Optical Communications and Networking (JOCN), and Photonic Network Communications (PNET). After completing my Ph.D., I joined Ennetix, Inc., a start-up incubated from UC Davis, where I was one of the very first full-time employees. During my stint at Ennetix, first as a Senior R&D Engineer and then as Director of Engineering, I designed the entire microservice-based software architecture of Ennetix's flagship AIOps product, xVisor. I was in charge of technical leadership, where I interfaced with stakeholders and customers to solidify requirements of Ennetix's software products. I made decisions regarding technology choices along the entire software stack, with a focus on quality, stability, and performance. I was also in charge of engineering management at Ennetix - I created a project management and work tracking framework following agile methodologies in Azure Boards. I managed a team of developers, using Kanban boards, sprints, backlogs, and milestones. I mentored engineers at various stages of their careers, and educated them on best practices for code design, style, and reviews. After more than six busy and fascinating years at Ennetix, I decided to take a break. I was battling some health issues, and needed a more flexible schedule where I could work as an individual contributor on my own terms. Freelancing was a natural fit for me. Since October 2022, I have been working with a stealth startup seeking to bring exciting new tech to the Ethereum blockchain space, especially the NFT market. I am the principal lead backend engineer, developing real-time analytics and machine learning on Ethereum blockchain data using Apache Spark, as well as fast, optimized APIs to power our first-of-its-kind dashboard. We have been building many things that are firsts in this space, for example intelligent asset valuation and advanced portfolio tracking. If you decide to hire me, you will get deep expertise in modern cloud-native data-intensive application design at pennies on the dollar. I feel this can be a great opportunity for both of us, and we can form a lasting relationship if we end up being a great fit. Besides software and architecture development, I can guide and mentor junior engineers, improve existing processes for build and testing pipelines, prepare design documents, manage infrastructure, improve existing cloud deployments by optimizing cost and ease of maintenance, etc. Below is a list of areas I have expertise in: - Real-time streaming applications using Apache Kafka/Apache Pulsar/Google PubSub/Azure Event Bus/AWS Kinesis, and Apache Spark/Apache Flink/ksqlDB/Kafka Streams/Google Cloud Dataflow/Azure HDInsight/AWS EMR, etc.
- RESTful API development in Golang/Scala - Databases such as PostgreSQL, TimescaleDB, MySQL, MariaDB - Analytics datastores such as Elasticsearch, ClickHouse, InfluxDB, Prometheus, Druid, Pinot - Caching using Redis, Dragonfly, Memcached - Infrastructure as code using Terraform, AWS CloudFormation, Azure Resource Manager - Cloud orchestration on AWS/Azure/Google Cloud Platform - Container orchestration on Kubernetes in GKE/AKS/EKS - DevOps and CI/CD using CircleCI/Travis CI/Azure DevOps/GitHub Actions/GitLab Pipelines
Skills: Apache Spark, Microsoft Azure, Docker, API Development, Scala, PostgreSQL, Docker Compose, Elasticsearch, Terraform, Data Engineering, Google Cloud Platform, Apache Kafka, RESTful Architecture, Golang, Kubernetes
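As a generic illustration of the Kafka-plus-Spark streaming pattern this profile describes (not code from the freelancer's actual engagement), here is a minimal PySpark Structured Streaming sketch that subscribes to a topic, parses JSON messages, and maintains a running aggregate. The broker address, topic name, and message schema are hypothetical placeholders, and running it requires the spark-sql-kafka connector package.

```python
# Minimal PySpark Structured Streaming sketch: consume JSON messages from a
# Kafka topic and keep a running average price per token, printed to the console.
# Broker, topic, and schema are hypothetical; requires the spark-sql-kafka package.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-stream-demo").getOrCreate()

# Assumed message shape: {"token": "...", "price": 123.45}
schema = StructType([
    StructField("token", StringType()),
    StructField("price", DoubleType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder broker
    .option("subscribe", "trades")                        # placeholder topic
    .load()
)

avg_prices = (
    raw.select(F.from_json(F.col("value").cast("string"), schema).alias("t"))
    .select("t.*")
    .groupBy("token")
    .agg(F.avg("price").alias("avg_price"))
)

query = (
    avg_prices.writeStream
    .outputMode("complete")   # emit the full aggregate table on each trigger
    .format("console")
    .start()
)
query.awaitTermination()
```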
- $30 hourly
- 4.6/5
- (1 job)
I am a highly skilled Data Scientist with a passion for leveraging data-driven insights to solve complex business problems. With a background in computer science, business analytics, and information systems and technology, I have developed a strong foundation in machine learning, data visualization, natural language processing, and statistical analysis. My professional experience includes working with technologies such as Python, Tableau, Power BI, Spark/PySpark, and SQL to perform root cause analysis, design interactive dashboards, and communicate KPI insights to technical and non-technical stakeholders. I'm a motivated and detail-oriented individual who is dedicated to providing valuable insights to organizations and helping them make data-driven decisions.
Skills: Apache Spark, Database, pandas, Qualtrics, Analytics, Apache Spark MLlib, Machine Learning, Analytical Presentation, Data Analysis, Tableau, Google Analytics, Data Visualization, SQL, Microsoft Power BI, Python
- $70 hourly
- 0.0/5
- (0 jobs)
Service- and product-minded data engineer with experience building terabyte-scale data platforms from scratch, at organizations ranging from stealth mode to unicorn. Experienced working in regulated healthcare environments.
Skills: Apache Spark, DevOps, Apache Airflow, dbt, Snowflake, Amazon Redshift, Tableau, SQL, Python
How hiring on Upwork works
1. Post a job
Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.
2. Talent comes to you
Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.
3. Collaborate easily
Use Upwork to chat or video call, share files, and track project progress right from the app.
4. Payment simplified
Receive invoices and make payments through Upwork. Only pay for work you authorize.