Hire the best Apache Spark Engineers in Texas
Check out Apache Spark Engineers in Texas with the skills you need for your next job.
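The profiles below repeatedly mention Spark, MapReduce, and ETL pipelines. As a rough orientation for what that work involves, here is a minimal, self-contained Python sketch of the map/shuffle/reduce pattern that Spark parallelizes across a cluster — plain standard-library code stands in for a real Spark job, and the data and function names are purely illustrative:

```python
from collections import defaultdict

def map_phase(records):
    # Map: emit (key, value) pairs — here (word, 1) for a word count
    for line in records:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Shuffle: group values by key, as Spark does between stages
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: combine each key's values — here by summing the counts
    return {key: sum(values) for key, values in grouped.items()}

lines = ["Spark runs on Hadoop", "Spark streams from Kafka"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["spark"])  # each input line mentions Spark once, so this prints 2
```

In an actual Spark job the same idea is expressed through the RDD or DataFrame API (e.g., `flatMap` followed by `reduceByKey`), with the shuffle handled by the cluster rather than an in-memory dict.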
- $175 hourly
- 5.0/5
- (4 jobs)
Mr. Joshua B. Seagroves is a seasoned professional who has served as an Enterprise Architect and Senior Data Engineer for multiple Fortune 100 companies. A former startup founder and CTO, he specializes in the strategic design, development, and implementation of advanced technology systems. Throughout his career he has architected and delivered cutting-edge solutions, particularly in data engineering and data science, spearheading the implementation of such systems and applications for a diverse range of clients. In his current role he contributes to prototyping and research in data engineering and data science, developing operational systems for critical mission systems. Drawing on his extensive background in architecture and software modeling methodologies, he has led and collaborated with multidisciplinary teams, integrating distributed computing technologies including Hadoop, NiFi, HBase, Accumulo, and MongoDB. His comprehensive knowledge and hands-on expertise in advanced technology systems and big data make him a valuable asset to any organization.
Skills: Apache Spark, YARN, Apache Hadoop, Big Data, Apache Zookeeper, TensorFlow, Apache NiFi, Apache Kafka, Artificial Neural Network, Artificial Intelligence
- $125 hourly
- 4.8/5
- (14 jobs)
🏆 Achieved Top-Rated Freelancer status (top 10%) with a proven track record of success. Past experience: Twitter, Spotify, & PwC. I am a certified data engineer & software developer with 5+ years of experience, familiar with almost all major tech stacks in data science/engineering and app development. If you need support on your projects, please get in touch.
Programming Languages: Python | Java | Scala | C++ | Rust | SQL | Bash
Big Data: Airflow | Hadoop | MapReduce | Hive | Spark | Iceberg | Presto | Trino | Scio | Databricks
Cloud: GCP | AWS | Azure | Cloudera
Backend: Spring Boot | FastAPI | Flask
AI/ML: PyTorch | ChatGPT | Kubeflow | ONNX | spaCy | Vertex AI
Streaming: Apache Beam | Apache Flink | Apache Kafka | Spark Streaming
SQL Databases: MSSQL | Postgres | MySQL | BigQuery | Snowflake | Redshift | Teradata
NoSQL Databases: Bigtable | Cassandra | HBase | MongoDB | Elasticsearch
DevOps: Terraform | Docker | Git | Kubernetes | Linux | GitHub Actions | Jenkins | GitLab
Skills: Apache Spark, Java, Apache Hadoop, Amazon Web Services, Snowflake, Microsoft Azure, Google Cloud Platform, Database Management, Linux, ETL, API Integration, Scala, SQL, Python
- $80 hourly
- 5.0/5
- (6 jobs)
I am a versatile professional with extensive expertise in MLOps, machine learning engineering, data engineering, and data science. I specialize in building and deploying scalable AI solutions, automating ML pipelines, and transforming data into actionable insights that drive business value. With hands-on experience across industries including banking, freelancing platforms, and government services, I have a proven track record of delivering robust, production-grade machine learning systems and end-to-end AI solutions.
1. What I Offer
a. MLOps & DevOps Excellence: Expertise in deploying scalable ML models using Docker, Kubernetes, and Terraform. Built CI/CD pipelines with Jenkins, Azure DevOps, and GitHub for seamless ML lifecycle management. Managed cloud-native infrastructure on AWS and Azure, integrating services like SageMaker, Azure ML, AKS, and EKS.
b. Machine Learning Expertise: Delivered predictive models using LightGBM, XGBoost, and CatBoost for tasks such as job matching, customer analytics, and time-series forecasting. Developed LLM-based solutions with OpenAI GPT-3/4 for natural language processing tasks like resume matching, skill extraction, and market insights generation. Built high-performing recommendation systems, graph-based inference models, and advanced computer vision pipelines.
c. Data Engineering Proficiency: Designed and implemented data pipelines for real-time and batch processing using Databricks, Azure Data Factory, and AWS services. Proficient in integrating and transforming data from diverse sources such as APIs, databases, and unstructured data streams.
d. AI-Powered Insights & Analytics: Built a knowledge graph for entity linking and profile enrichment, leveraging NLP and embedding-based similarity models. Created digital twins for city-scale social issue monitoring and policy simulations using time-series forecasting and knowledge graphs. Extracted actionable insights through sentiment analysis, entity recognition, and topic modeling.
2. Key Achievements
Successfully deployed scalable machine learning models for Citibank, enhancing predictive capabilities for financial operations. Developed a job connection prediction model for a leading freelancing platform, improving application and hiring rates with real-time optimization. Built a state-of-the-art face similarity search pipeline capable of handling cross-domain challenges with high accuracy and scalability. Delivered a knowledge graph-based linking engine for entity matching, revolutionizing data enrichment processes for large-scale internet traffic.
3. Technical Skills
Programming & Frameworks: Python, PyTorch, TensorFlow, Spark, Flask, FastAPI.
Cloud & DevOps: AWS (SageMaker, EKS, Rekognition), Azure (AKS, Functions, ML), Terraform, Docker, Kubernetes.
Machine Learning Tools: LightGBM, XGBoost, CatBoost, OpenAI GPT-3/4, MLflow.
Data Tools: SQL, Elasticsearch, Databricks, GroundTruth.
4. Why Choose Me?
I bring a results-driven approach to every project, ensuring that solutions are not only technically sound but also aligned with business goals. Whether you need end-to-end ML deployment, advanced analytics, or AI-driven insights, I am here to help you unlock the full potential of your data.
Skills: Apache Spark, Amazon Web Services, Microsoft Azure, MLOps, Docker, LLM Prompt, Microsoft Power BI, Elasticsearch, MongoDB, SQL, Machine Learning, Python, Natural Language Processing, Deep Learning, Python Scikit-Learn
- $100 hourly
- 5.0/5
- (140 jobs)
— TOP RATED PLUS Freelancer on Upwork
— EXPERT-VETTED Freelancer (among the top 1% of Upwork freelancers)
— Full Stack Engineer — Data Engineer
✅ AWS Infrastructure, DevOps, AWS Architect, AWS Services (EC2, ECS, Fargate, S3, Lambda, DynamoDB, RDS, Elastic Beanstalk, AWS CDK, AWS CloudFormation, etc.), serverless application development, AWS Glue, AWS EMR
Frontend Development: ✅ HTML, CSS, Bootstrap, JavaScript, React, Angular
Backend Development: ✅ Java, Spring Boot, Hibernate, JPA, Microservices, Express.js, Node.js
Content Management: ✅ WordPress, Wix, Squarespace
Big Data: ✅ Apache Spark, ETL, Big Data, MapReduce, Scala, HDFS, Hive, Apache NiFi
Database: ✅ MySQL, Oracle, SQL Server, DynamoDB
Build/Deploy: ✅ Maven, Gradle, Git, SVN, Jenkins, QuickBuild, Ansible, AWS CodePipeline, CircleCI
As a highly skilled and experienced Lead Software Engineer, I bring a wealth of knowledge and expertise in Java, Spring, Spring Boot, Big Data, MapReduce, Spark, React, graphics design, logo design, email signatures, flyers, web development (HTML, CSS, Bootstrap, JavaScript & frameworks, PHP, Laravel), responsive web page development, WordPress design, and testing. With over 11 years of experience in the field, I have a deep understanding of Java, Spring Boot, and Microservices, as well as Java EE technologies such as JSP, JSF, Servlet, EJB, JMS, JDBC, and JPA. I am also well-versed in Spring technologies including MVC, IoC, Security, Boot, Data, and Transaction. I possess expertise in web services, including REST and SOAP, and am proficient in web development frameworks such as WordPress, PHP, Laravel, and CodeIgniter. I am also highly skilled in JavaScript, jQuery, React.js, AngularJS, Vue.js, Node.js, C#, and ASP.NET MVC. In the field of big data, I have experience working with MapReduce, Spark, Scala, HDFS, Hive, and Apache NiFi, and I am well-versed in cloud technologies such as PCF, Azure, and Docker.
Furthermore, I am proficient in various databases including MySQL, SQL Server, and Oracle, and familiar with build tools such as Maven, Gradle, Git, SVN, Jenkins, QuickBuild, and Ansible.
Skills: Apache Spark, Database, WordPress, Cloud Computing, Spring Framework, Data Engineering, NoSQL Database, React, Serverless Stack, Solution Architecture Consultation, Spring Boot, DevOps, Microservice, AWS Fargate, AWS CloudFormation, Java, CI/CD, Amazon ECS, Containerization
- $50 hourly
- 5.0/5
- (46 jobs)
🏅 Expert-Vetted | 🏆 100% Job Success Rate | ⭐ 5-Star Ratings | 💎 $1 Million+ Earnings | 🕛 Full-Time Availability | ✅ Verifiable Projects | ❇️ 16,000+ Hours
🏆 Winner of 2 Presidential Awards in the country of operations
🏆 Member of the Chamber of Commerce in the country of operations
🏆 Work recognized on TV shows and blogs
🏆 Regularly conducts IT bootcamps in the country of operations
🚀 Streamlining Business Operations Through Data Empowerment! 🚀
9+ years of experience | Automation Specialist | Data-Driven Operations | Web Development | Data Science | Data Management | AI/ML Implementations | Deep Learning Solutions | Process Optimization | Business Performance Enhancement
👉 Big Data: I help business owners and their teams unlock the true value of their historic data, aligning sales, marketing, and product teams with the demand and supply dynamics of both the market and the company's product/service offerings.
✅ Big Data Tools Integration (e.g., Apache Hadoop, Apache Spark)
✅ ETL Process Implementation (Extract, Transform, Load)
✅ Data Retention Policies
✅ Distributed Computing
✅ Hadoop Cluster Management
✅ Stream Processing Systems (e.g., Apache Kafka)
✅ NoSQL Databases (e.g., MongoDB, DynamoDB)
✅ ML Toolkits (e.g., TensorFlow)
✅ Lambda Architecture
👉 Data Science: I help business owners and their teams build robust predictive analytics systems that better align budgeted figures with year-end actuals, reducing the delta, increasing the likelihood of shareholder alignment, strengthening the company's going-concern assumption for stakeholders, and improving share value in the long run.
⚡ Selecting features and building and optimizing classifiers using ML techniques
⚡ Data mining
⚡ Data collection (Apache Kafka, Logstash)
⚡ Third-Party Data Integration (e.g., APIs, Selenium, Postman)
⚡ Data Integrity Verification
⚡ Anomaly Detection Systems
⚡ Query Language Proficiency (e.g., SQL, MySQL, MongoDB, BigQuery, PostgreSQL)
⚡ Scripting Skills (e.g., Python)
⚡ Statistical Analysis
⚡ Effective Communication Skills
👉 Core Technology:
✅ Frontend: HTML, CSS, JavaScript (ES6, ES7, TypeScript), JavaScript frameworks (e.g., React, Next.js, Angular), GraphQL clients (e.g., Apollo), CSS preprocessors (e.g., SASS, LESS), JavaScript charting libraries (e.g., D3, Highcharts, Recharts), state management (e.g., Redux, Redux Saga, Redux Toolkit), RESTful APIs
✅ Backend: Node.js, Nest.js, Express, Python
✅ Cloud Computing Platforms (AWS, Google Cloud, IBM Cloud)
✅ Testing frameworks (e.g., Jest, Mocha)
✅ CI/CD tools (e.g., GitHub Actions, Jenkins, AWS CodePipeline)
✅ Version control tools (e.g., Git)
✅ MERN Stack (React.js, Node.js, MongoDB, Express.js)
✅ CMS (WordPress, WooCommerce)
✅ Deployment: Docker, Heroku, AWS, Azure
✅ UI/UX: Material Design, Figma, Miro, HTML5, CSS, JavaScript, XML
✅ AI and Machine Learning (AI Assistants and Chatbots, ML Models, Predictive Analytics, Sentiment Analysis, NLP, Audio/Video/Speech-to-Text)
✅ Data Visualisation (Power BI, Tableau)
Driven by a passion for innovation and backed by a strong foundation in cutting-edge technologies, I'm committed to propelling your business towards unparalleled success. Let's harness the power of tech to revolutionize your operations! 💡🌟
Skills: Apache Spark, Elasticsearch, Apache Kafka, Data Modeling, Data Integration, API Integration, ETL Pipeline, BigQuery, Artificial Intelligence, Django, Data Analysis, Data Mining, Data Science, Machine Learning, Python
- $50 hourly
- 0.0/5
- (0 jobs)
Hands-on design and development experience with the Hadoop ecosystem (Hadoop, HBase, Pig, Hive, and MapReduce) and related big data technologies, including Scala, Spark, Sqoop, Flume, Kafka, and Python, along with strong ETL and PostgreSQL experience. Strong background in all aspects of software engineering, with skills in parallel data processing, data flows, REST APIs, JSON, XML, and microservice architecture.
* Experience with the Cloudera stack, Hortonworks, and Amazon EMR
* Strong experience using Excel, SQL, SAS, Python, and R to extract and analyze data based on business needs
* Strong experience in data analysis, data migration, data cleansing, transformation, integration, data import, and data export using multiple ETL tools such as Ab Initio and Informatica PowerCenter
* Strong understanding and hands-on programming/scripting experience with UNIX shell
* An excellent team player and technically strong engineer
Skills: Apache Spark, Amazon S3, Data Warehousing & ETL Software, Big Data, Amazon Web Services, Hive, Data Science, ETL, Data Lake, Data Cleaning, Apache Hive, Apache Hadoop, Apache Kafka, Data Migration, ETL Pipeline
- $60 hourly
- 0.0/5
- (1 job)
* I'm a data scientist with 9 years of experience analyzing and modeling data, with expertise in quantitative research and statistical programming. My track record includes statistical and predictive modeling of large-scale cross-sectional and time series data. If you need help with data cleaning, data wrangling, data analysis, visualization, result interpretation, or presentation, I can help!
* I'm proficient in Python, R, MATLAB, and SQL
* I'm comfortable working with any stage of a project, even managing a project start to finish
* Regular communication is really important to me, so let's keep in touch!
Skills: Apache Spark, Statistical Analysis, Data Wrangling, Data Analysis, Analytical Presentation, MATLAB, Data Cleaning, Feature Extraction, Data Modeling, Python, Random Forest, Machine Learning, Linear Regression, Data Visualization, R
- $30 hourly
- 5.0/5
- (3 jobs)
I am a professional software developer and analyst. I earned my MS in computer science from Georgia Tech and my BS in math from UT Austin. I mainly use Python, SQL, R, and Java.
Skills: Apache Spark, Regression Analysis, Statistical Analysis, R, HTML, TypeScript, Snowflake, Tableau, Machine Learning, Data Analysis, SQL, JavaScript, Git, Java, Python
- $60 hourly
- 0.0/5
- (0 jobs)
Self-motivated software engineer with 13 years of experience in designing, developing, testing, and implementing applications for banking, health insurance, PBM, and IT industry organizations, mainly using Java, Python, and React.
TECHNICAL COMPETENCIES:
-- Languages: Java, JavaScript, Python
-- J2EE Technologies: Spring Boot, Spring Framework, Spring Data, Spring JPA, Spring Scheduler, Hibernate, JSP, JDBC, RESTful, SOAP, Spring Cloud
-- JavaScript: React.js, Redux, React Router, Axios, jQuery, webpack, Babel
-- Cloud Technologies: AWS (Lambda, S3, Route 53, API Gateway, SNS, SQS, EMR, EC2), Pivotal Cloud Foundry (PCF)
-- Web: HTML, CSS, Bootstrap, XML, XPath, JSON, YAML, shell and batch scripting, XQuery
-- Databases: Oracle, MySQL, SQL Server, PostgreSQL, MongoDB, Teradata, MariaDB, DynamoDB, DB2
-- Application Servers: WebSphere, Apache Tomcat, JBoss, GlassFish, WebLogic
-- IDEs: MyEclipse, RAD, Eclipse, NetBeans, Spring Tool Suite, IntelliJ, WebStorm, Visual Studio Code, PyCharm
-- SDLC: Agile-Scrum, Waterfall, Kanban
-- Tools: Apache Spark, Docker, JIRA, SVN, GitHub, Jenkins, XL Release, JFrog, Maven, Ant, Ansible, CyberArk, Checkmarx, Splunk, New Relic, LDAP, Terraform
-- Platforms: Windows, Linux, Unix, AIX
-- Testing Tools: Selenium, JUnit, Jest, Mockito, Jasmine
-- Networks: TCP/IP, DNS, Proxy, DHCP Server, LAN/WAN, Cisco routers and switches
Skills: Apache Spark, GitHub, Jenkins, Jira, Teradata, IBM WebSphere, Ab Initio, Terraform, Docker, Oracle, Amazon DynamoDB, AWS Lambda, React, Java, Python
- $49 hourly
- 0.0/5
- (0 jobs)
I’m a developer experienced in building websites for small and medium-sized businesses. Whether you’re trying to win work, list your services, or create a new online store, I can help. Backend, frontend, DevOps, data engineering, event sourcing, TDD - we can work it out by pairing.
Skills: Apache Spark, CI/CD, Azure App Service, Apache Avro, Apache Kafka, .NET Core, pandas, React, Azure Cosmos DB, MongoDB, GraphQL, Azure DevOps, SQL, Node.js
- $150 hourly
- 0.0/5
- (0 jobs)
8 Years of Expertise in Big Data, Cloud Technologies, and Advanced Analytics
With eight years of extensive experience across data engineering, machine learning, and cloud technologies, I deliver innovative and scalable solutions tailored to diverse business needs. My expertise spans a wide range of tools, frameworks, and platforms, enabling me to manage complex data ecosystems and drive impactful insights.
Core Technical Proficiencies:
Programming and Scripting:
> Mastery of Java, Scala, C, C#, Python, and JavaScript for building robust, scalable applications.
> Advanced scripting capabilities with Bash, shell, and tools like awk for process automation.
Data Storage and Processing:
> Distributed Storage: Expertise with Apache Hadoop (HDFS), Amazon S3, Azure Blob, and Google Cloud Storage.
> Processing Frameworks: In-depth experience with Apache Spark, Flink, MapReduce, and Google Dataflow for batch and stream processing.
> Data Lakehouses: Skilled in Snowflake, Databricks, Hive, and Delta Lake for modern analytics architecture.
Databases and Data Models:
> SQL Databases: Extensive work with PostgreSQL, MySQL, Oracle, and MSSQL.
> NoSQL Databases: Advanced proficiency in MongoDB, Cassandra, DynamoDB, and Aerospike.
> Specialized Technologies: Amazon QLDB (ledger databases), Milvus and Chroma (vector databases), and graph solutions like Neo4j and JanusGraph.
Real-Time Data and Messaging Systems:
> Proficient in setting up and managing Apache Kafka, RabbitMQ, and Amazon Kinesis for real-time data pipelines.
> Expertise in distributed query engines like Presto, Trino, and Athena for high-performance data analytics.
Data Integration and Transformation:
> Data Tools: Skilled in Logstash, Fluentd, and Talend for data ingestion and transformation.
> File Formats: Comprehensive knowledge of Parquet, Avro, Delta, JSON, and other data formats.
Machine Learning and Data Science:
> Advanced ML Techniques: Feature engineering, dimensionality reduction, and model optimization.
> Algorithms: Expertise in Random Forest, Logistic Regression, SVM, and more.
> Tools and Platforms: Hands-on with MLflow, AWS SageMaker, and Jupyter for model development, training, and deployment.
Visualization and Reporting:
> Dashboarding expertise with Tableau, Kibana, AWS QuickSight, and Looker for actionable insights and analytics reporting.
Cloud and DevOps Proficiency:
> Cloud Platforms: Deep experience with AWS (Lambda, EMR, ECS), Google Cloud Platform, and Azure services.
> DevOps Tools: Skilled in Docker, Kubernetes, Terraform, and ArgoCD for infrastructure management and deployment.
Workflow Orchestration and Automation:
> Workflow Platforms: Expertise in Airflow, Prefect, and Oozie for efficient pipeline automation.
Advanced AI and NLP Solutions:
> Proficiency in integrating OpenAI (GPT-4, LLMs), AWS Transcribe, and AWS Translate for AI-driven applications.
> Specialized knowledge in RAG (Retrieval-Augmented Generation) and AI-enabled data insights.
Why Choose Me?
> Broad Expertise: Comprehensive experience across modern data ecosystems ensures tailored, end-to-end solutions.
> Scalable Solutions: Proven ability to deliver scalable and efficient systems for real-world business challenges.
> Cutting-Edge Tools: Up-to-date knowledge and hands-on experience with the latest in big data, AI, and cloud technologies.
> Collaborative Approach: A strong communicator and team player, committed to achieving your project’s goals efficiently.
Let’s collaborate to transform your data challenges into opportunities. Get in touch to discuss your project today!
Skills: Apache Spark, Architecture, Apache Kafka, Generative AI, Node.js, Java, Python, SQL, NoSQL Database, dbt, ETL Pipeline, Machine Learning, Artificial Intelligence, Apache Flink, Databricks Platform
- $75 hourly
- 0.0/5
- (0 jobs)
Senior Data Engineer
Senior Data Engineer with 14 years of extensive experience in designing and implementing large-scale data pipelines and cloud-based solutions. Expertise spans AWS, Azure, and Snowflake, with a proven track record in building and optimizing data platforms. Proficient in big data technologies, including Apache Spark and Hadoop, with deep knowledge of ETL processes, data modeling, and performance tuning.
Skills: Apache Spark, CSV, JSON, ORC, Apache Avro, Parquet, Apache Hadoop, Big Data File Format, Big Data, PySpark, Oracle, Microsoft SQL Server, Informatica Cloud, Databricks Platform, Snowflake
Want to browse more freelancers?
Sign up
How hiring on Upwork works
1. Post a job
Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.
2. Talent comes to you
Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.
3. Collaborate easily
Use Upwork to chat or video call, share files, and track project progress right from the app.
4. Payment simplified
Receive invoices and make payments through Upwork. Only pay for work you authorize.