Data Engineer - DW/BI & Big Data: Hadoop, Spark, NoSQL
Holds an Engineering degree in Computer Science and is experienced in building enterprise DW/BI and Big Data applications.
Experienced in migrating traditional Data Warehouse applications to a Big Data Lambda architecture.
Has worked in the London (UK) area for Tier 1 business organisations, building enterprise DW and Big Data applications.
Trained in the Big Data ecosystem (Hadoop, MapReduce, Pig, HiveQL, HBase, Spark/Shark). Implemented a Big Data Lambda architecture on multi-node Hadoop 2.0 clusters.
Hands-on experience with major RDBMSs (MySQL, Oracle, PostgreSQL, SQL Server) and with
NoSQL - MongoDB and its document query language.
• DW/BI and OLAP data modelling (Star and Snowflake schemas, Normalisation)
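The star-schema modelling listed above can be sketched minimally with SQLite: one fact table joined to dimension tables, then aggregated along a dimension attribute. Table and column names here are illustrative only, not from any specific project.

```python
import sqlite3

# Minimal star schema: one fact table surrounded by dimension tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, iso_date TEXT, year INTEGER);
CREATE TABLE fact_sales  (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    quantity   INTEGER,
    revenue    REAL
);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
INSERT INTO dim_date    VALUES (10, '2014-01-01', 2014), (11, '2014-01-02', 2014);
INSERT INTO fact_sales  VALUES (1, 10, 5, 50.0), (2, 10, 2, 40.0), (1, 11, 1, 10.0);
""")

# A typical OLAP query: aggregate the fact table, sliced by a dimension attribute.
rows = cur.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales f JOIN dim_product p ON f.product_id = p.product_id
    GROUP BY p.category
""").fetchall()
print(rows)  # [('Hardware', 100.0)]
```

A Snowflake schema would further normalise the dimensions (e.g. splitting category out of dim_product into its own table).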
• ETL data integration with Pentaho Data Integration (Kettle/Spoon), Python ETL, Oracle
SQL*Loader and Oracle Data Pump.
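The Python ETL work mentioned above follows the usual extract/transform/load shape; a minimal stdlib-only sketch (source data and table names are illustrative):

```python
import csv
import io
import sqlite3

# Extract: read raw rows (here from an in-memory CSV; in practice a file or API).
raw = io.StringIO("id,amount\n1,10.5\n2,\n3,7.0\n")
rows = list(csv.DictReader(raw))

# Transform: drop rows with missing amounts and cast the remaining fields.
clean = [(int(r["id"]), float(r["amount"])) for r in rows if r["amount"]]

# Load: bulk-insert the cleaned rows into a target table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE staging_amounts (id INTEGER, amount REAL)")
db.executemany("INSERT INTO staging_amounts VALUES (?, ?)", clean)
total = db.execute("SELECT SUM(amount) FROM staging_amounts").fetchone()[0]
print(total)  # 17.5
```

Tools such as Kettle/Spoon express the same extract-transform-load steps graphically rather than in code.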
• Advanced PostgreSQL and PL/pgSQL / Oracle PL/SQL programming (functions, packages, collections,
partitioning, hierarchical queries, bulk binding, inlining, indexing, dynamic SQL, analytic
functions, Data Mining API, performance tuning, subquery factoring, pipelined functions, Explain Plan,
trace output, etc.)
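Hierarchical queries (Oracle's CONNECT BY, recursive CTEs in PostgreSQL) can be illustrated portably with SQLite's WITH RECURSIVE; the employee/manager table below is a made-up example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
INSERT INTO emp VALUES (1, 'Ann', NULL), (2, 'Bob', 1), (3, 'Cat', 2), (4, 'Dan', 1);
""")

# Walk the management chain top-down, tracking each employee's depth in the tree.
hierarchy = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM emp WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, c.depth + 1
        FROM emp e JOIN chain c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth, name
""").fetchall()
print(hierarchy)  # [('Ann', 0), ('Bob', 1), ('Dan', 1), ('Cat', 2)]
```

In Oracle the equivalent would use START WITH ... CONNECT BY PRIOR; in PostgreSQL the WITH RECURSIVE form above works as-is.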
• Java: Collections (List, TreeMap, HashMap, Set, etc.), Generics, Fork/Join framework.
• Data visualisation, dashboard creation, customer scorecards and KPI modelling
• Reporting - Tableau data visualisation; analytical and operational reports; statistical
analysis of customer data, including multivariate analysis, clustering (K-means) and hypothesis testing
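The K-means clustering listed above can be sketched in plain Python (one dimension, k = 2, fixed starting centroids so the run is deterministic); production work would use Spark MLlib or a statistics library instead:

```python
# Plain-Python K-means sketch: alternate assignment and update steps.
def kmeans_1d(points, centroids, iterations=10):
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

centroids, clusters = kmeans_1d([1.0, 1.5, 2.0, 10.0, 11.0, 12.0], [0.0, 5.0])
print(centroids)  # [1.5, 11.0]
```

The two centroids converge to the means of the two obvious groups; real data would use multiple dimensions and a distance metric such as Euclidean distance.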
• Big Data ecosystem:
o Data Management: Hadoop & YARN
o Data Access: Java MapReduce, Pig, Hive, Solr, Spark, HBase, Storm
o Governance & Integration: Sqoop, Flume, Kafka
o Operations: Oozie, ZooKeeper
o Libraries: Spark SQL, Spark MLlib
o NoSQL: MongoDB, HBase
• Functional programming in Scala and implementation of Spark Streaming jobs.
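The MapReduce processing model underlying several of the tools above can be illustrated with the canonical word-count example, here in plain Python (real jobs would run distributed on the cluster, with the shuffle handled by the framework):

```python
from collections import defaultdict

# Map phase: emit (word, 1) pairs from each input line.
lines = ["big data", "big clusters", "data pipelines"]
mapped = [(word, 1) for line in lines for word in line.split()]

# Shuffle phase: group the intermediate pairs by key.
grouped = defaultdict(list)
for word, count in mapped:
    grouped[word].append(count)

# Reduce phase: sum the counts for each word.
counts = {word: sum(vals) for word, vals in grouped.items()}
print(sorted(counts.items()))
# [('big', 2), ('clusters', 1), ('data', 2), ('pipelines', 1)]
```

Spark expresses the same pipeline as flatMap / map / reduceByKey over an RDD, which is what makes migrating Java MapReduce jobs to Spark largely mechanical.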