Channel: From the Dev Team – Hortonworks
Browsing all 333 articles

An S3 Gateway to Apache Hadoop Ozone

The AWS S3 protocol is the de facto interface for modern object stores. The Ozone-0.3.0-Alpha release adds the S3 protocol as a first-class notion to Ozone. For all practical purposes, a user of S3 can start...

Open Hybrid Architecture: O3, the New Rocket Ship

Introducing our Storage Environment O3. Building on the last three blogs (vision, key tenets/concepts, real-world use case) in the Open Hybrid Architecture series, we now want to take a deeper dive into...

Getting the Most Out of Your Data in the Cloud with Cloudbreak

There are three common abilities across the cloud providers that I want to focus on and to see how they work together and build on each other to help you maximize agility and data insights in the...

Data Science & Engineering Platform: Data Lineage and Provenance for Apache...

This is the third in a series of data engineering blogs that we plan to publish. The first blog outlined the data science and data engineering capabilities of Hortonworks Data Platform. Motivation...

2x Faster BI Interactive queries with HDP 3.0

Hortonworks announced the general availability of HDP 3.0 this year. You may read more about it here. Bundled with HDP 3.0, Apache Hive 3 with LLAP took a significant leap as an Enterprise Ready Real...

Open Hybrid Architecture: Running Stateful Containers on YARN

The Why: In the previous blog, we talked about the Open Hybrid Architecture. This architecture decouples storage and computation, so computation tasks need to access various types of storage systems....

Big Data Processing Engines – Which one do I use?: Part 1

Special thanks to Bill Preachuk and Brandon Wilson for reviewing and providing their expertise. Introduction: Columnar storage is an often-discussed topic in the big data processing and storage world...

Monitoring Kafka Streams Microservices with Hortonworks Streams Messaging...

In last week’s blog Secure and Governed Microservices with HDF/HDP Kafka Streams Support, we walked through how to build microservices with the new Kafka Streams support in HDF 3.3 and HDP 3.1 that is...

Introducing Hive-Kafka integration for real-time Kafka SQL queries

Our last few blogs in the Kafka Analytics blog series focused on the addition of Kafka Streams to HDP and HDF and on how to build, secure, and monitor Kafka Streams apps / microservices. In this blog,...

{Submarine} : Running deep learning workloads on Apache Hadoop

(This blog post is coauthored by Xun Liu and Quan Zhou from Netease.) Introduction: Hadoop is the most popular open source framework for the distributed processing of large, enterprise data sets. It is...

Query Federation with Apache Hive

Organizations commonly use a plethora of data storage and processing systems today. These different systems offer cost-effective performance for their respective use cases. Besides traditional RDBMSs...

Apache Hive Warehouse Connector Use-Cases

1. Motivation: The HiveWarehouseConnector (HWC) is an open-source library that provides new interoperability capabilities between Hive and Spark. In practice, Hive and Spark are often leveraged...

Open Hybrid Architecture Initiative: Game Changing User Experience Powering...

This is part seven of an ongoing series about the Open Hybrid Architecture Initiative. You can learn more about the vision, key tenets, real-world use case, new storage environment of O3,...
