As you know, Pig is a member of the Hadoop ecosystem: a framework for analyzing large data sets and a first-class Apache project. This article is a mini-tutorial that shows how Pig works with Hadoop and HDFS, and just how much you can accomplish with only a few lines of script.