From the Dev Team – Hortonworks

Trying out Containerized Applications on Apache Hadoop YARN 3.1

May 16, 2018, 9:08 am

This is the 5th blog of the Hadoop Blog series (part 1, part 2, part 3, part 4). In this blog, we will explore running Docker containers on YARN for faster time to market and faster time to insights...

Containerized Apache Spark on YARN in Apache Hadoop 3.1

May 24, 2018, 7:00 am

This is the 6th blog of the Hadoop Blog series (part 1, part 2, part 3, part 4, part 5). In this blog, we will explore how to leverage Docker for Apache Spark on YARN for faster time to insights for...

Apache Hive LLAP as a YARN Service

June 14, 2018, 7:00 am

This is the 7th blog of the Hadoop Blog series (part 1, part 2, part 3, part 4, part 5, part 6). In this blog, we will share our experiences running LLAP as a YARN Service. This is specifically a...

Increasing Hadoop Storage Scale by 4x!

June 28, 2018, 7:00 am

This is the 8th blog of the Hadoop Blog series (part 1, part 2, part 3, part 4, part 5, part 6, part 7). In this blog, we will discuss how NameNode Federation provides scalability and performance...

A Step-by-Step Guide for HDFS Replication

July 12, 2018, 8:53 am

This blog focuses on on-prem to on-prem HDFS replication for HDP clusters using Hortonworks Data Lifecycle Manager (DLM), an extensible service built on the Hortonworks DataPlane Platform....

First Class GPUs support in Apache Hadoop 3.1, YARN & HDP 3.0

August 1, 2018, 12:14 pm

This blog is also co-authored by Zian Chen and Sunil Govindan from Hortonworks. Introduction – Apache Hadoop 3.1, YARN, & HDP 3.0 GPUs are increasingly becoming a key tool for many big data...

Apache Hadoop Meetup July 2018 – Bangalore Chapter

August 2, 2018, 1:00 pm

Meetup Link: https://www.meetup.com/Bangalore-Hadoop-Meetups/events/252534327/ The Bangalore Apache Hadoop Meetup group, with over 3400 members who share common interests and ideas in the Hadoop...

Distributed Pricing Engine using Dockerized Spark on YARN w/ HDP 3.0 [Part 1/4]

August 6, 2018, 8:40 am

This is the 1st blog in a 4-part blog series where we will look at an architectural approach to implementing a distributed compute engine for pricing financial derivatives using Hortonworks Data...

Distributed Pricing Engine using Dockerized Spark on YARN w/ HDP 3.0 [Part 2/4]

August 7, 2018, 10:00 am

This is the 2nd blog in a 4-part blog series (see part 1) where we will dive into representative pricing semantics and architectural aspects of a prototype implementation using HDP 3.0. Pricing...

Distributed Pricing Engine using Dockerized Spark on YARN w/ HDP 3.0 [Part 3/4]

August 8, 2018, 1:31 pm

This is the 3rd blog in the series (see part 1, part 2) where we will walk through the tech stack and prepare the environment. The entire infrastructure is provisioned on OpenStack private cloud using...

Distributed Pricing Engine using Dockerized Spark on YARN w/ HDP 3.0 [Part 4/4]

August 9, 2018, 9:00 am

This is the finale of the blog series (see part 1, part 2, part 3) where, having discussed the problem domain, looked at the functional and architectural aspects, and prepared the environment, we are now...

Upgrading your clusters and workloads from Apache Hadoop 2 to Apache Hadoop 3

September 20, 2018, 8:00 am

If you’re interested in learning more, go to our recap blog here! Introduction: The Apache Hadoop community announced Hadoop 3.0 GA in December 2017 and Hadoop 3.1 in April 2018, loaded with great...

Benchmark Update: Apache Hive and Druid Integration in HDP 3.0

October 1, 2018, 10:00 am

Earlier we talked about reasons for integrating Druid and Hive in a three-part series (Part 1, Part 2, Part 3) on ultra-fast OLAP analytics with Apache Hive and Druid. Since then we have spent...

Introducing Apache Hadoop Ozone: An Object Store for Apache Hadoop

October 8, 2018, 8:00 am

1. Introduction The Apache Hadoop Distributed File System (HDFS) has been the de facto file system for big data. It is easy to forget just how scalable and robust HDFS is in the real world. Our...

Apache Hadoop Ozone – Object Store Overview

October 10, 2018, 8:00 am

1. Introduction This article is the second in the series about Ozone – a distributed key-value store that can efficiently manage small and large files alike. An earlier article introduced the Ozone design...

Apache Hadoop Ozone – Object Store Architecture

October 15, 2018, 9:00 am

1. Introduction Apache Hadoop Ozone is a distributed key-value store that can efficiently manage small and large files alike. Ozone is designed to work well with the existing Apache Hadoop...

Enterprise Search with HDP Search

October 22, 2018, 11:00 am

We are excited to announce the immediate availability of HDP Search 4.0. As you are aware, HDP Search offers a performant, scalable, and fault-tolerant enterprise search solution. With HDP Search 4.0,...

Open Hybrid Architecture- Bringing Cloud Native to On-Premises

November 5, 2018, 9:30 am

What Customers Are Telling Us We recently unveiled our Open Hybrid Architecture Initiative where Arun articulated the vision in his blog. We are excited by the huge interest from our customers and...

Open Hybrid Architecture: Real World Use-Case

November 12, 2018, 9:30 am

Building on the vision and concepts outlined previously in Arun and Saumitra’s blogs, we wanted to show the Open Hybrid Architecture Initiative (OHAI) concepts in action, and see how they could be used...

A Step-by-Step Replication Guide between On-Prem HDFS and Amazon Web Services

November 20, 2018, 8:45 am

This blog was co-authored by Ryan Peterson, Head of Global Data Segment at AWS. Central to empowering businesses to deliver the right data in the right environment to power the right use case is the...
