Oracle Alchemist

Steve Karam’s Blog


Category: Big Data

Big Data is all the rage nowadays, and there are a ton of tools out there to help you gather and aggregate it. These articles cover the tools, techniques, and concepts you need to separate the wheat from the chaff in big data.

Another Great OpenWorld

October 10, 2014 Steve Karam Big Data, Development, Fun, News, Oracle, Technology

Last week I attended Oracle OpenWorld 2014, and it was an outstanding event filled with great people, awesome sessions, and a few notable experiences. Personally, I thought the messaging behind the conference itself wasn’t as amazing and upbeat as OpenWorld 2013, but that’s almost to be expected. Last year


Hadoop Streaming, Hue, Oozie Workflows, and Hive

October 24, 2013 Steve Karam Big Data, Development, News 10 comments

MapReduce with Hadoop Streaming in bash – Bonus! To conclude my three-part series on writing MapReduce jobs with shell scripts for use with Hadoop Streaming, I’ve decided to throw together a video tutorial on running the jobs we’ve created in Oozie, a workflow editor for Hadoop that allows jobs


MapReduce with Hadoop Streaming in bash – Part 3

October 23, 2013 Steve Karam Big Data, Development, News 2 comments

In our first MapReduce with Hadoop Streaming in bash article, we took a collection of Stephen Crane poems and used a MapReduce job to calculate ‘term frequency’, meaning we counted the number of times each word appeared in the collection. In the second part, we calculated ‘document frequency’


MapReduce with Hadoop Streaming in bash – Part 2

October 22, 2013 Steve Karam Big Data, Development, News 6 comments

In MapReduce with Hadoop Streaming in bash – Part 1 we found the ‘term frequency’ of words within a collection of documents. For the documents I chose 8 Stephen Crane poems, and our bash Map and Reduce jobs tokenized the words and found their frequency across the entire set. The


MapReduce with Hadoop Streaming in bash – Part 1

October 21, 2013 Steve Karam Big Data, Development, News 8 comments

To commemorate my recent certification, and because my Java absolutely sucks, I decided to implement a common algorithm using Hadoop Streaming.

Hadoop Streaming

Hadoop Streaming allows you to write MapReduce code in any language that can read from stdin and write to stdout. This includes Python, PHP, Ruby, Perl, bash, node.js, and
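As a rough illustration of that stdin/stdout contract, here is a minimal word-count sketch. It is written in Python rather than the bash used in the series, and it is not the code from the posts. The names `mapper`, `reducer`, and `run_locally` are hypothetical, and the tokenization rule (lowercase letters and apostrophes) is an assumption made for the example.

```python
import re
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated 'word<TAB>1' record per token (assumed tokenizer)."""
    for line in lines:
        for word in re.findall(r"[a-z']+", line.lower()):
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts for each run of identical keys; input must be sorted by key."""
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_locally(lines):
    """Simulate map -> sort (the shuffle step) -> reduce without a cluster."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for record in run_locally(["the sea the sky"]):
        print(record)
```

In a real streaming job the mapper and reducer would be two separate scripts, each iterating over `sys.stdin` and printing its records, with the framework handling the sort between them. The local `sorted` call here only stands in for that step.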


Cloudera Certified Developer for Hadoop (CCDH)

October 17, 2013 Steve Karam Big Data, News 7 comments

Taking the Cloudera Developer Training for Apache Hadoop had many rewards, one of which was a free voucher to take the CCD-410 Exam (normally $295), which you must pass to get CCDH certified. I’m not sure if that’s a Cloudera University or Global Knowledge thing, but either way it


Hadoop Developer Training – Day 4

October 4, 2013 Steve Karam Big Data, Development, News 2 comments

This will be a short(ish) post, as my brain is relatively fried from the obscene amount of knowledge imparted by the Hadoop class (and it’s Friday so I’m allowed to be lazy nanny nanny boo boo). To be honest, this was probably the most fun day of class. While the


Hadoop Developer Training – Day 3

October 3, 2013 Steve Karam Big Data, Development, News 5 comments

Cloudera Developer Training for Apache Hadoop is almost over, and I’m somewhat sad that my Hadoopin’ days are nearly done–in the classroom at least. However, the breadth of this training has been great and I can definitely say I’ve gotten my (company’s) money’s worth. Being that I’m three days in,


Hadoop Developer Training – Day 2

October 2, 2013 Steve Karam Big Data, Development, News 3 comments

Yesterday I completed the second day of Cloudera Developer Training for Apache Hadoop. While the first day focused on Hadoop core technology like HDFS, the second day was all about MapReduce. That means it was the day that whole ‘developer’ thing was thrown into sharp relief. I’ve been a DBA


Hadoop Developer Training – Day 1

October 1, 2013 Steve Karam Big Data, Development, News 3 comments

What’s the best way to follow up a week of Oracle OpenWorld? Cloudera Developer Training for Apache Hadoop, of course. So today I had my first day. I won’t detail the course itself (though I hope there will be many Hadoop posts to come). But I would like to share



Series

  • LabAlchemy
  • Hadoop Streaming
  • Cloudera Hadoop Training
  • Optimizing WordPress
  • Adventures of Ace, DBA
  • Grow Your Career
  • Database Diversity

Oracle Blogs

  • Amardeep Sidhu
  • Bertrand Drouvot
  • Bobby Curtis
  • Brian Pardy
  • Chet Justice
  • David Aldridge
  • Dimitri Gielis
  • Doug Burns
  • Jonathan Lewis
  • Kellyn Pot'Vin
  • Kevin Closson
  • Kyle Hailey
  • Marco Gralike
  • Oracle Base
  • Osama Mustafa
  • Pete Finnigan
  • Radio Free Tooting
  • Richard Foote
  • Tim Hall
