
Help! There’s a yellow elephant in my server room!


That could well be what's going through an admin's mind during a first attempt at deploying Hadoop. It isn't necessarily that hard to install, but understanding how to scale it and how to work with it takes some proper time.

How about we try to make Hadoop easier for everyone to understand and use? That's exactly what the team in the Open Innovations Lab at EMC thought, and they've now released a full whitepaper called "EMC Hadoop Starter Kit – EMC Isilon and VMware Big Data Extensions for Hadoop". Now you might wonder what Isilon and VMware have to do with Hadoop, and I'll get to that in just a bit.

Hadoop + Serengeti + Isilon = AWESOME

First, let's look at which Hadoop distribution we're talking about deploying here. There are several distributions (or versions) of the lovely elephant out in the wild; the most notable ones are Pivotal HD, Hortonworks, Cloudera and, of course, the original open source Apache one. For the purpose of this whitepaper, the Open Innovations Lab team decided to start with the Apache Hadoop distribution.

Now, what about VMware and Hadoop? We're actually talking about virtualizing Hadoop here, something that usually gets a big "heck no" in Hadoop circles. In reality, most companies with an existing VMware virtualization environment have plenty of resources just sitting there idle and ready to use. Why not use them for Hadoop and help your organization get some good, real information out of all that data you're already storing? Other benefits of virtualizing Hadoop are:

  • Rapid provisioning – quickly create a new cluster or node when needed
  • High availability – protect single points of failure, like the NameNode, with the help of VMware HA
  • Elasticity – scale your Hadoop cluster to the size you want it to be, with resources still shared with other applications in your virtualized environment
  • Multi-tenancy – run multiple Hadoop clusters in the same environment, dividing up data but centralizing management
  • Portability – use and mix any of the popular Hadoop distributions (Apache, Pivotal HD, Cloudera, Hortonworks) with no data migration

Some of you might now wonder how we can achieve zero data migration, since data is usually tied to a Hadoop cluster through HDFS. Well, that's been taken care of as well, thanks to the inclusion of EMC Isilon in the whitepaper. Isilon is the only scale-out NAS platform with HDFS natively integrated, meaning we can present and mount HDFS filesystems to any new cluster or node that's created.
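
To give a feel for what that looks like from the Hadoop side, here's a minimal sketch of the client-side configuration (my own illustration, not taken from the whitepaper): the compute cluster's core-site.xml simply points the default filesystem at Isilon instead of at a NameNode VM. The hostname isilon.example.com and port 8020 are placeholders; in a real deployment you'd use your Isilon SmartConnect zone name, and on older Hadoop 1.x releases the property is fs.default.name rather than fs.defaultFS.

  <!-- core-site.xml on the Hadoop compute nodes (illustrative sketch) -->
  <configuration>
    <property>
      <name>fs.defaultFS</name>
      <!-- Placeholder: point at your Isilon SmartConnect zone and HDFS port -->
      <value>hdfs://isilon.example.com:8020</value>
    </property>
  </configuration>

Because every compute cluster points at the same Isilon-backed namespace, a freshly provisioned cluster sees the existing data immediately, which is exactly what makes the zero-migration claim work.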

By separating compute and data, we achieve elasticity in both. Want more compute? Scale out your VMs. Need more capacity? Scale out your storage. This gives you an unprecedented ability to start your journey to Big Data in a more cost-effective and efficient manner. So how do we piece it all together? With the vSphere Big Data Extensions, powered by something called Project Serengeti (the Serengeti is a vast region of Africa, home to large animals like elephants, get the reference? :) ), which gives you as an administrator an easy-to-use interface to create, manage, scale and decommission Hadoop clusters in your environment.
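
To give a taste of how simple that is, here's a rough sketch of a Serengeti CLI session (the command names follow the Serengeti documentation, but exact flags can vary between BDE versions, and the cluster name myhadoop is just an example):

  serengeti> cluster create --name myhadoop
  serengeti> cluster list
  serengeti> cluster resize --name myhadoop --nodeGroup worker --instanceNum 10

That last command is the elasticity point from the list above in action: growing the worker node group straight from the CLI, with no physical servers to rack.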

For the full whitepaper, including all the step-by-step instructions on how to get your own Hadoop Starter Kit going, have a look here:

http://community.emc.com/docs/DOC-26892


