MLShepherds
Overview
I have 12+ years of development experience: 6.5 years as a Java developer, 4 years as a freelance big data developer, and 1.5 years in my current role as Big Data Architect and Senior Data Engineer. I currently work on big data projects using Apache Spark, Hadoop, MongoDB, Kafka, Cassandra and Elasticsearch. I am one of the core contributors to the open-source library Spark NLP.

Some of my past projects:
- Spark NLP: Core developer, building and supporting the Spark NLP library.
- Disperse.io: Analyzed and discovered trends in Wi-Fi data for London metro stations. Worked as a data engineer on data cleaning and data analysis using Spark SQL.
- Discovergy: Worked on smart meter applications. Developed a machine-learning-based platform on top of Spark Streaming to collect, store and analyze data from smart meters.
- L3Networks: Designed and implemented a system that processes web logs arriving at a rate of 46 GB/min in real time. Designed the architecture and the data processing pipeline with Apache Kafka, Apache Spark and Apache Hadoop, using Java and Scala.
- Big Data Architect: Designed the big data platform for a supermarket chain in Spain to help them migrate from their traditional in-house data warehouse system to a cloud data lake for big data analytics use cases.
- Big Data Architect: Designed and helped develop a new platform for an advertising company in Madrid for storing user web clickstream data and providing recommendations and predictions using Spark.
- Big Data Engineer: Developed an Elasticsearch plugin for data clustering, to be used in an EU search platform.
- Big Data Engineer: Developed and helped redesign the data visualization platform BCNNow as part of the DECODE project.
- Researcher: Working on a data privacy tool for the EU project SMOOTH to help micro and small enterprises with GDPR compliance.
- Research Engineer: Worked on developing Kalium, a large social graph mining engine, using OrientDB and Spark.
- Big Data Engineer: Maintain and support the in-house cloud for Eurecat using Red Hat OpenStack Platform.
Services
Data Extraction/ETL
We specialize in data platform design and data pipeline orchestration.