Hire the best PySpark Developers in Nepal
Check out PySpark Developers in Nepal with the skills you need for your next job.
- $15 hourly
- 5.0/5
- (10 jobs)
Do you need someone to build data processes and applications for driving your business forward? What can I do?
- Process your raw data and present it in a meaningful way: this could be output from a machine learning pipeline, a BI dashboard, or an intermediate table for you to work with.
- Migrate your data to your desired destination, following data security protocols and optimization techniques.
- Architect your data flow and its processing as per your business needs.
- Create new channels to update your existing systems with insights and KPIs derived from BI projects.
- Suggest the best course of action for your data processes moving forward.
- Suggest possible new revenue sources or cost reductions.
- Document your existing system, or the one I develop, translating technical concepts into understandable terms for non-technical stakeholders.
I feel comfortable with most modern tech stacks, but the following are the technologies I have used in past projects:
LANGUAGES: Python, C, C++, Java, R, SQL
LIBRARIES: PySpark, Kafka, Pandas, NumPy
WEB DEV: Django, Flask, Bootstrap, APIs, Authentication, FastAPI
ETL: AWS Glue, Talend, MuleSoft, Airflow, dbt
DATA WAREHOUSE: Snowflake, AWS Redshift, Google BigQuery, Azure Synapse
BI & ANALYTICS: Plotly Dash, Streamlit, Power BI, Tableau, Qlik, Data Studio, QuickSight
DATABASE: Amazon RDS, Postgres, MSSQL, MySQL, SQLite, MongoDB, DynamoDB, Aurora, Cassandra, Stored Procedures
CLOUD: AWS, VPC, IAM, EC2, Lambda, S3, Glue, Athena, Azure, GCP
VERSION CONTROL: Git, GitHub, Bitbucket, AWS CodeCommit
SCRAPING: Selenium, Beautiful Soup
PROJECT MANAGEMENT: Jira, Redmine
CONTAINERS: Docker
Want to discuss your project? Please drop me a message. Thank you so much for considering my profile. Sincerely, ShusantS
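The first bullet above, turning raw data into a BI-ready table, is typical PySpark territory. A minimal sketch of that pattern; the S3 paths and column names are hypothetical placeholders:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("raw_to_bi").getOrCreate()

# Read raw event data (path and columns are illustrative placeholders).
events = spark.read.option("header", True).csv("s3://example-bucket/raw/events/")

# Clean: drop rows without a user id and derive a date column.
cleaned = (
    events.filter(F.col("user_id").isNotNull())
    .withColumn("event_date", F.to_date("event_timestamp"))
)

# Aggregate into a KPI table that a BI dashboard can read directly.
daily_kpis = cleaned.groupBy("event_date").agg(
    F.countDistinct("user_id").alias("daily_active_users"),
    F.count("*").alias("total_events"),
)

# Persist as an intermediate table, partitioned by date.
daily_kpis.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_kpis/"
)
```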
Skills: Beautiful Soup, Data Visualization, Data Analysis, Data Cleaning, Data Scraping, Talend Data Integration, Apache Kafka, Docker, Data Science, Machine Learning, pandas, Selenium, Amazon S3, Amazon Redshift, Snowflake, Apache Airflow, PySpark, Java, Python, ETL
- $20 hourly
- 5.0/5
- (2 jobs)
🙋♂️ Hello, I am a data engineer with 4+ years of experience creating data pipelines, scheduling jobs in Airflow, and transforming data with Spark on AWS and Azure cloud services.
MY TOP SKILLS
✔️ Python ✔️ ETL ✔️ PySpark ✔️ Apache Airflow ✔️ SQL
🛠️ THE WHOLE THING
✔️ Data engineering / ETL / API
- Apache Airflow
- PySpark
- Airbyte
- Apache Superset
- Django/Flask
✔️ Cloud
- AWS: Glue, EC2, S3, RDS, Lambda, DynamoDB, Athena
- Azure: ADLS, Azure Functions, ADF, Fabric, Databricks
✔️ Databases / SQL / NoSQL / Cloud Storage
- Relational: PostgreSQL, MySQL
- MongoDB
- Redshift
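As an illustration of the Airflow scheduling this profile mentions, here is a minimal sketch of a DAG that runs a PySpark job once a day, assuming Airflow 2.4+; the DAG id and script path are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_sales_etl",        # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",               # run once per day
    catchup=False,
) as dag:
    # Submit the PySpark transformation script to the cluster.
    run_spark_transform = BashOperator(
        task_id="run_spark_transform",
        bash_command="spark-submit /opt/jobs/transform_sales.py",
    )
```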
Skills: Amazon Redshift, AWS Lambda, PostgreSQL, MongoDB, MySQL, Django, PySpark, Flask, Machine Learning, AWS Glue, Apache Airflow, Docker, Python
- $20 hourly
- 5.0/5
- (1 job)
Full Stack Developer experienced with the Django framework and the MERN stack. I have created and deployed multiple applications, ranging from email marketing campaign tools to AI API wrappers and booking websites. I have DevOps experience managing multiple VPSs and handling deployment for Django and MERN stack applications with NGINX and system configuration, and I have set up CI/CD pipelines to automate the build and deployment of new features. I also manage and configure mail servers, resolving issues with appropriate solutions.
Below are the tools and technologies I am familiar with:
- REST API development
- Django (intermediate)
- Celery (intermediate)
- Nginx (intermediate)
- VPS server deployment
- Docker (beginner)
- MySQL (intermediate)
- MERN stack development: JavaScript (Node.js), Express REST APIs, ReactJS
- Scraping: BeautifulSoup
- Mail servers: Postfix SMTP server, SPF, DKIM, and DMARC setup
- Stripe integration
Skills: Chart.js, Celery, Apache Kafka, PySpark, Tableau, Machine Learning, Web Scraping, RESTful API, NGINX, React, MySQL, JavaScript, ExpressJS, Django, Python
- $45 hourly
- 4.0/5
- (11 jobs)
Reliable data engineer with 10 years of proven industry experience in data lake development, data analytics, real-time streaming, and back-end application development. My work is used by millions of people in the legal and entertainment industries. I have built exceptionally stable solutions for high-traffic, high-visibility projects, and understand what it takes to ensure products are robust and dependable. I also have expertise in the Apache Spark ecosystem, Elasticsearch, ETL, AWS Glue, DMS, Athena, EMR, data lakes, AWS big data services, Apache Kafka, Java, and NoSQL.
Specific experience:
1. Databricks: 5+ years
2. Unity Catalog: 2+ years
3. Apache Spark: 8+ years
4. ETL: 8+ years
5. SQL: 9+ years
6. AWS: 8+ years
7. Azure and GCP: 5+ years
I am a data professional who has worked with many companies and delivered some enormous data engineering and data science projects. My focus is always on building scalable, sustainable, and robust software. As a data scientist, I use data modeling, programming, analysis, visualization, and writing skills to help people gain the insight to develop products, reach customers, and make an impact. I care deeply about data from beginning to end: I am actively involved in all aspects of data analysis, from data modeling tasks to writing reports and making visualizations.
I offer Python/Scala programming, Linux administration, data wrangling, data cleansing, and data extraction services using Python 3, Python 2, or Scala/Spark on Linux or Windows. I slice, dice, extract, transform, sort, calculate, cleanse, collect, organize, migrate, and otherwise handle data management for clients.
Services provided:
- Big data processing using Spark and Scala
- Building large-scale ETL
- Cloud management
- Distributed platform development
- Machine learning
- Python programming
- Algorithm development
- AWS Glue
- PySpark
- Data conversion (Excel to CSV, PDF to Excel, CSV to Excel, audio)
- Data mining
- Data extraction
- ETL data transformation
- Data cleansing
- Linux server administration
- Website and data migrations
- DevOps (AWS, Azure)
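To make the AWS Glue and PySpark services listed above concrete, here is a minimal sketch of a Glue job that reads a catalogued table, filters out bad records, and writes partitioned Parquet to S3; the database, table, and bucket names are hypothetical:

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a catalogued source table as a Glue DynamicFrame.
orders = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders"
)

# Drop records missing a primary key, using plain Spark SQL filters.
orders_df = orders.toDF().filter("order_id IS NOT NULL")

# Write the cleansed data back to S3 as date-partitioned Parquet.
orders_df.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/cleansed/orders/"
)

job.commit()
```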
Skills: Amazon S3, Data Scraping, Data Science, Data Engineering, Amazon EC2, Data Warehousing & ETL Software, PySpark, ETL Pipeline, Redis, AWS Glue, Databricks MLflow, Databricks Platform, Apache Spark, Python
- $15 hourly
- 5.0/5
- (1 job)
Looking to transform your raw data into meaningful insights? I'm here to help! With over 2 years of experience as a Data Engineer, I specialize in building scalable data pipelines, architecting cross-cloud infrastructures, and delivering real-time analytics that empower businesses to make smarter decisions.
My expertise lies in integrating AWS and Azure ecosystems, leveraging tools like Databricks, PySpark, and Delta Lake to ensure data flows seamlessly from source to insights. Whether it's automating ETL processes, setting up incremental data loading, or building Power BI dashboards, I make complex data projects simple and efficient.
Here's what I bring to the table:
• Cross-cloud Data Architecture: Seamlessly connect AWS and Azure to ensure smooth data movement and storage.
• End-to-End Data Pipelines: Automate and optimize your data workflows, from ingestion to transformation.
• Real-time & Batch Processing: Build data systems that handle both real-time and historical data, giving you the flexibility you need.
• Advanced Analytics: Create sales, user, and RFM analysis data marts to unlock deep insights.
• Power BI & Tableau Dashboards: Develop interactive, real-time reports that drive data-driven decisions.
• Security & Compliance: Implement Role-Based Access Control (RBAC) to keep your data secure and compliant.
• Consulting & Cost Optimization: Get expert advice on cloud infrastructure design and cost-effective data solutions.
What sets me apart is my commitment to delivering high-impact results. I work closely with clients to understand their business needs, and I build tailored data solutions that are both scalable and cost-effective. No project is too complex: whether you're looking for a robust cloud architecture or seamless data integration, I'll help you bring your vision to life.
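The incremental loading mentioned above is commonly implemented as a Delta Lake merge. A minimal sketch, assuming a Databricks or delta-spark environment; the table paths and key column are hypothetical:

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# New batch of records landed from the source system.
updates = spark.read.parquet("s3://example-bucket/landing/customers/")

target = DeltaTable.forPath(spark, "s3://example-bucket/delta/customers/")

# Upsert: update existing customers, insert new ones.
(
    target.alias("t")
    .merge(updates.alias("u"), "t.customer_id = u.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```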
Skills: Azure Service Fabric, Apache Spark, ETL, Data Warehousing, Microsoft Azure, Tableau, Data Analysis, Data Lake, Data Ingestion, Databricks Platform, SQL, PySpark, Python, Data Engineering
- $8 hourly
- 5.0/5
- (2 jobs)
Data Engineer | Data QA Engineer | SQL | Python | PySpark | Databricks | Snowflake | MS Office | MS Excel
With 8 years of experience in Data Engineering and Data Quality Assurance, I specialize in building, optimizing, and validating robust data pipelines for businesses that rely on high-quality data. My expertise spans SQL, Python, PySpark, Pandas, Databricks, and Snowflake, ensuring scalable and reliable data solutions.
🔹 Data Engineering & ETL Development
- Design and develop efficient data pipelines for structured and unstructured data.
- Implement ETL/ELT processes for seamless data integration across platforms.
- Optimize big data workflows using Spark, Databricks, and Snowflake.
🔹 Data Quality & Validation
- Ensure accuracy, integrity, and consistency of data across pipelines.
- Develop automated data validation frameworks to catch anomalies.
- Performance-tune SQL queries and optimize data processing workflows.
🔹 Automation & Performance Optimization
- Build custom scripts for automation in Python, PySpark, and SQL.
- Implement monitoring solutions to track data quality and system performance.
- Improve query performance and data model efficiency for faster analytics.
🚀 Whether you need a data pipeline built from scratch, an ETL process optimized, or a data quality framework implemented, I'm here to help! Let's connect and turn your data into a powerful asset.
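As a concrete example of the automated validation this profile describes, here is a minimal PySpark sketch of pre-publish quality gates; the path and column names are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.read.parquet("s3://example-bucket/staging/transactions/")

# Rule 1: the batch must not be empty.
assert df.count() > 0, "staging table is empty"

# Rule 2: key columns must not contain nulls.
null_keys = df.filter(F.col("transaction_id").isNull()).count()
assert null_keys == 0, f"{null_keys} rows have a null transaction_id"

# Rule 3: the primary key must be unique.
dupes = df.groupBy("transaction_id").count().filter("count > 1").count()
assert dupes == 0, f"{dupes} duplicate transaction_ids found"
```

In a production framework these asserts would typically be replaced by checks that log failures and block the downstream load.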
Skills: HTML5, PostgreSQL, PySpark, Databricks Platform, Python, Oracle PLSQL, MySQL Programming, Data Entry, Microsoft Excel, Microsoft Word
- $25 hourly
- 0.0/5
- (0 jobs)
I am an experienced and knowledgeable professional with a strong background in data engineering, ETL, and cloud platforms. For the past few years, I have been actively involved in the data engineering domain and have gained extensive experience in this field. Working as a Data Engineer, I have collaborated closely with teams across the world to implement data warehousing solutions. My professional experience includes building and implementing data warehousing solutions across different subject areas using cloud solutions such as Snowflake and AWS business analytics services.
Skills: Microsoft Excel, Flutter, AWS Lambda, AWS Glue, Snowflake, Looker, PySpark, Data Ingestion, Data Analytics & Visualization Software, Tableau, C, Python, SQL
- $8 hourly
- 0.0/5
- (1 job)
Data Engineer II with experience in cloud services and big data technologies such as PySpark. I am highly proficient in programming languages such as Python, Java, C++, PySpark, and SQL, and have a deep understanding of data warehousing and relational database systems, analyzing and transforming data, and maintaining ETL/ELT workflows. I also have expertise in web scraping and scraping tools. One of my major achievements is the automation of ETL pipelines.
PROJECT EXPERIENCE
Internal Analytics: company-wide EDWH supporting BI and advanced analytics
* Developed an ELT (Extract, Load, Transform) framework to process data from various sources such as CRM (HubSpot), semi-structured data (MongoDB), and APIs, developing custom scripts as well as using Airbyte.
* Orchestrated the Internal Analytics pipeline using Airflow.
* Transformed and calculated business-specific metrics according to project requirements.
* Linked Athena databases to Apache Superset for visualizing the metrics and creating interactive dashboards.
Find-A-Play ETL
* Created an automated pipeline to scrape and collect data from different plays and drama licensing houses in the US.
* Transformed and cleaned the crawled data with PySpark and merged it into a standard format with the client's data.
* Made the processed data accessible to the ML team through Azure Synapse for building a drama recommendation system, and to the full-stack team for creating the "Find a Play" APIs.
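The Find-A-Play cleaning-and-merging step above maps to a common PySpark pattern: rename scraped columns to a standard schema, then union with the client data by name. A minimal sketch, assuming Spark 3.1+ for allowMissingColumns; paths and column names are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Scraped plays from one licensing house, with its own column names.
scraped = (
    spark.read.json("s3://example-bucket/crawled/house_a/")
    .select(
        F.col("play_title").alias("title"),
        F.col("author").alias("playwright"),
        F.trim(F.col("genre")).alias("genre"),
    )
)

# Client-provided catalogue already in the standard format.
client = spark.read.parquet("s3://example-bucket/client/plays/")

# Union by column name and de-duplicate on the natural key.
merged = client.unionByName(scraped, allowMissingColumns=True).dropDuplicates(
    ["title", "playwright"]
)
merged.write.mode("overwrite").parquet("s3://example-bucket/standard/plays/")
```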
Skills: Python, Docker, pandas, Selenium, Git, Apache Superset, Apache Airflow, Microsoft SQL Server, PostgreSQL, MySQL, PySpark, Interactive Data Visualization, Data Warehousing & ETL Software, Amazon Web Services, ETL Pipeline
- $15 hourly
- 0.0/5
- (0 jobs)
I am an aspiring Data Engineer with a strong foundation in data engineering principles and hands-on experience in data science and machine learning projects. My expertise includes building and managing scalable data pipelines, performing data wrangling and preprocessing, and implementing efficient ETL processes using tools like Python, SQL, and cloud platforms such as AWS.
I have worked on various ML projects involving data analysis, feature engineering, and model development using libraries like Pandas and Scikit-learn. This experience has helped me understand end-to-end data workflows, from raw data ingestion to model deployment, bridging the gap between data engineering and data science.
I'm passionate about clean data architecture, automation, and enabling data-driven decision-making. I thrive in collaborative environments and am eager to contribute to impactful projects by combining my technical skills with continuous learning.
Skills: Redis, Scala, Data Science, Analytical Presentation, Campaign Reporting, SQL, PySpark, Python, Data Extraction, Machine Learning, Data Analysis, ETL Pipeline, ETL
- $5 hourly
- 0.0/5
- (0 jobs)
I am a skilled Data Engineer with over 2 years of professional experience in building and optimizing data pipelines, as well as an additional year of expertise as a Back-End Developer. My career has been focused on delivering robust data solutions and creating efficient systems for data-driven decision-making.
Tools & Technologies:
* Data Engineering Platforms: Databricks, Apache Livy
* Programming & Scripting: PySpark, Python, SQL
* Cloud Services: AWS, Azure, Google Cloud
* Databases: MySQL, Oracle, PostgreSQL, MSSQL, Snowflake, Redshift, SAP HANA, BigQuery, and more
Highlights of Expertise:
* Designing and managing scalable data pipelines.
* Working with structured and unstructured data across diverse sources.
* Integrating cloud platforms to streamline data workflows.
* Collaborating with cross-functional teams to implement data solutions that meet business needs.
With a passion for clean, efficient, and reliable data systems, I bring technical proficiency and a results-oriented approach to every project. I look forward to collaborating with clients to unlock the full potential of their data.
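Apache Livy, listed above, exposes Spark job submission over REST. A minimal sketch of submitting a PySpark batch through Livy's /batches endpoint; the Livy URL and script path are hypothetical:

```python
import requests

LIVY_URL = "http://livy.example.com:8998"

# POST /batches starts a batch session running the given PySpark file.
resp = requests.post(
    f"{LIVY_URL}/batches",
    json={
        "file": "s3://example-bucket/jobs/transform_orders.py",
        "args": ["--run-date", "2024-01-01"],
        "conf": {"spark.executor.memory": "4g"},
    },
)
resp.raise_for_status()
batch_id = resp.json()["id"]

# Poll GET /batches/{id}/state to track the job.
state = requests.get(f"{LIVY_URL}/batches/{batch_id}/state").json()["state"]
print(f"batch {batch_id} is {state}")
```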
Skills: Database Architecture, Database Modeling, Big Data, Data Warehousing & ETL Software, Databricks Platform, MERN Stack, Google Cloud Platform, Web Scraping, Scripting Language, Terraform, Microsoft Azure SQL Database, AWS Cloud9, pandas, Python, PySpark
- $30 hourly
- 0.0/5
- (0 jobs)
Hello! I'm Kiran Bhandari, an enthusiastic and skilled Data Engineer with a Bachelor's degree in Information Technology. Over the past year, I have built a solid foundation in data engineering, specializing in AWS, ETL pipelines, and database management. My goal is to leverage data to drive informed decision-making and help businesses unlock actionable insights through seamless data processing and management.
What I Do Best:
- Data Engineering Excellence: I excel at designing, developing, and optimizing robust ETL pipelines that streamline data flow, improve performance, and ensure data integrity. I work with large-scale data processing systems to ensure timely and reliable delivery.
- AWS Cloud Expertise: With in-depth experience in Amazon Web Services (AWS), I specialize in utilizing the cloud to implement scalable, efficient data processing, storage, and warehousing solutions. My skills in AWS Glue, S3, and Redshift enable seamless integration across platforms, ensuring high-performance workflows.
- Database Management & SQL: I am proficient in SQL and experienced in managing relational databases, creating data models, and ensuring optimal performance through efficient query design and indexing. I focus on data quality and performance in every stage of the pipeline.
- Python Scripting for Automation: I leverage Python to create custom scripts that automate data processing tasks, build ETL workflows, and enhance overall efficiency. Whether it's parsing large datasets or integrating with APIs, I have the tools and expertise to get the job done.
- CI/CD Pipeline & Automation: I have hands-on experience designing and implementing CI/CD pipelines for AWS Glue workflows, utilizing AWS CloudFormation for automated deployment and management. This ensures continuous integration and delivery of high-quality data engineering solutions.
- Data Quality & Monitoring: I implement data quality checks and monitoring to identify and resolve inconsistencies or errors in datasets, ensuring that the data is accurate and reliable for analysis.
- Data Visualization & Reporting: I specialize in creating impactful data visualizations and dashboards using Amazon QuickSight and Tableau. By delivering clear, actionable insights, I empower stakeholders to make data-driven decisions that lead to business growth.
- Continuous Learning & Knowledge Sharing: I am deeply passionate about staying up-to-date with the latest technology trends. I regularly share insights and provide training sessions for junior team members on best practices in ETL development, cloud technologies, and data engineering methodologies.
Key Achievements:
- Designed and optimized ETL processes for large datasets, enabling seamless extraction, transformation, and loading of data into centralized data warehouses.
- Implemented delta load processes and data partitioning techniques, significantly improving data refresh efficiency and reducing processing time (see the sketch below).
- Enhanced CI/CD pipelines for AWS Glue, automating deployment and improving data processing workflows.
- Successfully collaborated with data scientists, analysts, and cross-functional teams to meet evolving data processing requirements.
- Delivered intuitive, data-driven dashboards and reports, empowering stakeholders to make informed decisions.
If you're looking for a dedicated, results-driven data engineer who can transform your data workflows, ensure the accuracy of your data, and help you leverage AWS cloud technologies for scalable solutions, I'm here to help. Let's collaborate and bring your data initiatives to life!
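The delta load pattern referenced in the achievements above usually means loading only rows newer than a stored watermark. A minimal PySpark sketch, assuming an initial load already exists; the paths and watermark column are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Find the last loaded timestamp in the existing warehouse table.
warehouse = spark.read.parquet("s3://example-bucket/warehouse/orders/")
watermark = warehouse.agg(F.max("updated_at")).first()[0]

# Pull only source rows changed since that watermark.
source = spark.read.parquet("s3://example-bucket/raw/orders/")
delta = source.filter(F.col("updated_at") > F.lit(watermark))

# Append the new slice, partitioned by date for faster refreshes.
delta.withColumn("load_date", F.to_date("updated_at")).write.mode(
    "append"
).partitionBy("load_date").parquet("s3://example-bucket/warehouse/orders/")
```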
Skills: Apache Spark MLlib, Python Script, pandas, ETL, Apache Spark, Python, SQL, PySpark, Data Engineering
- $10 hourly
- 0.0/5
- (0 jobs)
I'm a skilled Data Engineer/Data Scientist with expertise in building efficient data pipelines, performing in-depth analysis, and developing predictive models. I have hands-on experience with Python, PySpark, SQL, Azure, and cloud technologies, enabling me to clean, transform, and optimize datasets for actionable insights. My expertise includes:
- Designing ETL workflows for structured and unstructured data
- Data visualization and storytelling using Matplotlib and Power BI
- Machine learning and predictive modeling for business solutions
- Cloud-based data architecture with Azure Blob Storage, ADF, Databricks, and Snowflake
Skills: Microsoft Power BI, Data Analytics, Machine Learning, Databricks Platform, Snowflake, Amazon S3, Microsoft Azure SQL Database, PySpark, SQL, Python
- $15 hourly
- 0.0/5
- (0 jobs)
Data Engineer specializing in data pipeline design, ETL, and data modeling. Skilled in Spark, Python, Hadoop, and cloud platforms (AWS, Azure, GCP). Expertise in building data lakes and warehouses and visualizing data with Power BI and Tableau.
Skills: dbt, Machine Learning, Data Visualization, ETL Pipeline, Data Warehousing, Data Modeling, Snowflake, Apache Hadoop, Apache Kafka, Apache Airflow, SQL, Python, Apache Spark, Cloud Computing, PySpark
How hiring on Upwork works
1. Post a job
Tell us what you need. Provide as many details as possible, but don’t worry about getting it perfect.
2. Talent comes to you
Get qualified proposals within 24 hours, and meet the candidates you’re excited about. Hire as soon as you’re ready.
3. Collaborate easily
Use Upwork to chat or video call, share files, and track project progress right from the app.
4. Payment simplified
Receive invoices and make payments through Upwork. Only pay for work you authorize.