From the Dev Team – Hortonworks


Open Hybrid Architecture: Real World Use-Case

November 12, 2018, 9:30 am

Building on the vision and concepts outlined previously in Arun and Saumitra’s blogs, we wanted to show the Open Hybrid Architecture Initiative (OHAI) concepts in action, and see how they could be used...


A Step-by-Step Replication Guide between On-Prem HDFS and Amazon Web Services

November 20, 2018, 8:45 am

This blog was co-authored by Ryan Peterson, Head of Global Data Segment at AWS. Central to empowering businesses to deliver the right data in the right environment to power the right use case is the...


An S3 Gateway to Apache Hadoop Ozone

November 27, 2018, 9:00 am

The AWS S3 protocol is the de facto interface for modern object stores. The Ozone-0.3.0-Alpha release adds the S3 protocol as a first-class notion to Ozone. For all practical purposes, a user of S3 can start...


Open Hybrid Architecture: O3, the New Rocket Ship

December 3, 2018, 9:00 am

Introducing our Storage Environment O3: Building on the last three blogs (vision, key tenets/concepts, real-world use case) in the Open Hybrid Architecture series, we now want to take a deeper dive into...


Getting the Most Out of Your Data in the Cloud with Cloudbreak

December 3, 2018, 11:00 am

There are three common abilities across the cloud providers that I want to focus on and to see how they work together and build on each other to help you maximize agility and data insights in the...


Data Science & Engineering Platform: Data Lineage and Provenance for Apache...

December 11, 2018, 11:00 am

This is the third in a series of data engineering blogs that we plan to publish. The first blog outlined the data science and data engineering capabilities of Hortonworks Data Platform. Motivation...


2x Faster BI Interactive queries with HDP 3.0

December 17, 2018, 8:30 am

Hortonworks announced the general availability of HDP 3.0 this year. You may read more about it here. Bundled with HDP 3.0, Apache Hive 3 with LLAP took a significant leap as an Enterprise Ready Real...


Open Hybrid Architecture: Running Stateful Containers on YARN

December 17, 2018, 10:00 am

The Why: In the previous blog, we talked about the Open Hybrid Architecture. This architecture decouples storage and computation, so computation tasks need to access various types of storage systems....


Big Data Processing Engines – Which one do I use?: Part 1

December 18, 2018, 10:00 am

Special thanks to Bill Preachuk and Brandon Wilson for reviewing and providing their expertise. Introduction: Columnar storage is an often-discussed topic in the big data processing and storage world...


Monitoring Kafka Streams Microservices with Hortonworks Streams Messaging...

December 18, 2018, 11:00 am

In last week’s blog Secure and Governed Microservices with HDF/HDP Kafka Streams Support, we walked through how to build microservices with the new Kafka Streams support in HDF 3.3 and HDP 3.1 that is...


Introducing Hive-Kafka integration for real-time Kafka SQL queries

December 19, 2018, 9:00 am

Our last few blogs in the Kafka Analytics blog series focused on the addition of Kafka Streams to HDP and HDF and on how to build, secure, and monitor Kafka Streams apps and microservices. In this blog,...


{Submarine} : Running deep learning workloads on Apache Hadoop

December 20, 2018, 9:30 am

(This blog post is coauthored by Xun Liu and Quan Zhou from Netease.) Introduction: Hadoop is the most popular open source framework for the distributed processing of large, enterprise data sets. It is...


Query Federation with Apache Hive

December 20, 2018, 12:41 pm

Organizations commonly use a plethora of data storage and processing systems today. These different systems offer cost-effective performance for their respective use cases. Besides traditional RDBMSs...


Apache Hive Warehouse Connector Use-Cases

December 27, 2018, 11:26 am

1. Motivation: The HiveWarehouseConnector (HWC) is an open-source library that provides new interoperability capabilities between Hive and Spark. In practice, Hive and Spark are often leveraged...


Open Hybrid Architecture Initiative: Game Changing User Experience Powering...

January 14, 2019, 11:00 am

This is part seven of an ongoing series about the Open Hybrid Architecture Initiative. You can learn more about the vision, key tenets, real-world use case, new storage environment of O3,...
