GetInData
Data-processing challenges addressed with experience and passion
Overview
We are data engineers, developers and system administrators with practical multi-year experience in Big Data technologies. Full engagement, true passion, continuous improvement and a strong desire to challenge the status quo are a big part of our DNA.

Working with Big Data for companies such as Spotify, IBM, Netezza and Allegro, serving as Authorised Cloudera Training Partners, and building solutions for our clients, we have learned how to use Big Data technologies to solve business problems. We implement data-driven applications to improve your product and discover valuable insights that are hidden in massive volumes of data.

Tasks and Big Data technologies that we have deployed in production:
* Installation, administration and security of Hadoop clusters - Hortonworks (HDP, Ambari), Cloudera (CDH, Cloudera Manager), Kerberos, Sentry, Ranger
* Large-scale log delivery - Kafka
* Large-scale ETL processes - NiFi, Spark, Hive, Oozie
* Batch processing and analysis - Spark, Spark SQL, Hive
* Real-time stream processing - Flink, Storm, Spark Streaming
* Real-time random read-write requests (NoSQL) - Cassandra, HBase
* Low-latency analytics, search and BI backends - Flink, Elasticsearch, Druid, Solr, Phoenix, Impala, Kylin
* Advanced analytics and machine learning - e.g. text mining, anomaly detection, sentiment analysis, classification, natural language processing, statistical methods, with tools like R, SparkR