Trying out Containerized Applications on Apache Hadoop YARN 3.1
This is the 5th blog of the Hadoop Blog series (part 1, part 2, part 3, part 4). In this blog, we will explore running Docker containers on YARN for faster time to market and faster time to insights...
Containerized Apache Spark on YARN in Apache Hadoop 3.1
This is the 6th blog of the Hadoop Blog series (part 1, part 2, part 3, part 4, part 5). In this blog, we will explore how to leverage Docker for Apache Spark on YARN for faster time to insights for...
Apache Hive LLAP as a YARN Service
This is the 7th blog of the Hadoop Blog series (part 1, part 2, part 3, part 4, part 5, part 6). In this blog, we will share our experiences running LLAP as a YARN Service. This is specifically a...
Increasing Hadoop Storage Scale by 4x!
This is the 8th blog of the Hadoop Blog series (part 1, part 2, part 3, part 4, part 5, part 6, part 7). In this blog, we will discuss how NameNode Federation provides scalability and performance...
A Step-by-Step Guide for HDFS Replication
This blog focuses on on-prem to on-prem HDFS replication for HDP clusters using Hortonworks Data Lifecycle Manager (DLM), an extensible service built on the Hortonworks DataPlane Platform....
First Class GPUs support in Apache Hadoop 3.1, YARN & HDP 3.0
This blog is also co-authored by Zian Chen and Sunil Govindan from Hortonworks. Introduction – Apache Hadoop 3.1, YARN, & HDP 3.0 GPUs are increasingly becoming a key tool for many big data...
Apache Hadoop Meetup July 2018 – Bangalore Chapter
Meetup Link: https://www.meetup.com/Bangalore-Hadoop-Meetups/events/252534327/ The Bangalore Apache Hadoop Meetup group, with over 3400 members who share common interests and ideas in the Hadoop...
Distributed Pricing Engine using Dockerized Spark on YARN w/ HDP 3.0 [Part 1/4]
This is the 1st blog in a 4-part blog series where we will look at an architectural approach to implementing a distributed compute engine for pricing financial derivatives using Hortonworks Data...
Distributed Pricing Engine using Dockerized Spark on YARN w/ HDP 3.0 [Part 2/4]
This is the 2nd blog in a 4-part blog series (see part 1) where we will dive into representative pricing semantics and architectural aspects of a prototype implementation using HDP 3.0. Pricing...
Distributed Pricing Engine using Dockerized Spark on YARN w/ HDP 3.0 [Part 3/4]
This is the 3rd blog in the series (see part 1, part 2) where we will walk through the tech stack and prepare the environment. The entire infrastructure is provisioned on OpenStack private cloud using...
Distributed Pricing Engine using Dockerized Spark on YARN w/ HDP 3.0 [Part 4/4]
This is the finale of the blog series (see part 1, part 2, part 3) where having discussed the problem domain, looked at the functional and architectural aspects and prepared the environment, we are now...
Upgrading your clusters and workloads from Apache Hadoop 2 to Apache Hadoop 3
If you’re interested in learning more, go to our recap blog here! Introduction The Apache Hadoop community announced Hadoop 3.0 GA in December, 2017 and Hadoop 3.1 in April, 2018 loaded with great...
Benchmark Update: Apache Hive and Druid Integration in HDP 3.0
Earlier we talked about the reasons for integrating Druid and Hive in a three-part series (Part 1, Part 2, Part 3) on doing ultra-fast OLAP analytics with Apache Hive and Druid. Since then we have spent...
Introducing Apache Hadoop Ozone: An Object Store for Apache Hadoop
1. Introduction The Apache Hadoop Distributed File System (HDFS) has been the de facto file system for big data. It is easy to forget just how scalable and robust HDFS is in the real world. Our...
Apache Hadoop Ozone – Object Store Overview
1. Introduction This article is second in the series about Ozone – a distributed key-value store that can efficiently manage small and large files alike. An earlier article introduced the Ozone design...
Apache Hadoop Ozone – Object Store Architecture
1. Introduction Apache Hadoop Ozone is a distributed key-value store that can efficiently manage both small and large files alike. Ozone is designed to work well with the existing Apache Hadoop...
Enterprise Search with HDP Search
We are excited to announce the immediate availability of HDPSearch 4.0. As you are aware, HDP Search offers a performant, scalable, and fault-tolerant enterprise search solution. With HDP Search 4.0,...
Open Hybrid Architecture- Bringing Cloud Native to On-Premises
What Customers Are Telling Us We recently unveiled our Open Hybrid Architecture Initiative where Arun articulated the vision in his blog. We are excited by the huge interest from our customers and...
Open Hybrid Architecture: Real World Use-Case
Building on the vision and concepts outlined previously in Arun and Saumitra’s blogs, we wanted to show the Open Hybrid Architecture Initiative (OHAI) concepts in action, and see how they could be used...
A Step-by-Step Replication Guide between On-Prem HDFS and Amazon Web Services
This blog was co-authored by Ryan Peterson, Head of Global Data Segment at AWS. Central to empowering businesses to deliver the right data in the right environment to power the right use case is the...