Tuesday, 8 April 2014

Custom Parameters To Pig Script

There may be scenarios where we need to write custom Pig scripts that can accept arbitrary arguments.

Below is sample code for a custom Pig script.

Sample Pig Script

The "customparam.pig" script loads an input file given as a custom argument, generates a single field from the input bag into another bag, and stores the new bag to HDFS.

Here the input path, the delimiter for the input file, the output path, and the field to extract are all passed as custom arguments to the Pig script.
--load data from HDFS/local fs, splitting fields on the given delimiter
original = load '$input' using PigStorage('$delimiter');
--project a single field from the original bag into another bag
filtered = foreach original generate $split;
--store the resulting bag to HDFS/local fs
store filtered into '$output';
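As a side note (not part of the original script), Pig's parameter substitution also supports the %default preprocessor statement, which supplies a fallback value when a parameter is not passed on the command line. A minimal sketch of how the same script could declare defaults:

```pig
-- %default gives a parameter a fallback value; a -param flag
-- on the command line still overrides it
%default delimiter ',';
%default output 'OUT/pig';

original = load '$input' using PigStorage('$delimiter');
filtered = foreach original generate $split;
store filtered into '$output';
```

With these defaults in place, only input and split must be supplied on every run.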

Pig scripts can be run in local mode or in MapReduce mode.

Local Mode

pig -x local -f customparam.pig -param input=Pig.csv -param output=OUT/pig -param delimiter="," -param split='$1'
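When several parameters are involved, Pig can also read them from a parameter file passed with -param_file, so the values need not be repeated on every run. A minimal sketch (the file name customparam.params is hypothetical; lines starting with # are comments in Pig's parameter-file syntax):

```pig
# customparam.params -- one name = value pair per line
input = Pig.csv
output = OUT/pig
delimiter = ,
```

pig -x local -f customparam.pig -param_file customparam.params -param split='$1'

Here split stays on the command line because its value contains a $, which Pig may otherwise treat as a parameter reference inside the file; -param values given on the command line take precedence over those in the file.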

This is the sample "Pig.csv" file used as the custom input on the command line. The custom delimiter is ",".

Pig2,6.88,Not Matched
Pig3,6.1,Not Matched

Here we separate the second column from the original bag into a new bag. Fields in Pig are referenced positionally starting from $0 ($0, $1, $2, ...), so to generate the second column the split parameter should be "$1". Note the single quotes around '$1' in the command above, which keep the shell from expanding it.

After executing the above command, the part file content will be:

6.88
6.1

If the command is run in MapReduce mode, the part file is stored in HDFS.
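In MapReduce mode the same command is issued without -x local (MapReduce is Pig's default execution mode), and the input and output parameters then refer to HDFS paths. A sketch, with hypothetical paths, that cannot run without a Hadoop cluster:

```shell
# copy the sample input into HDFS (path is hypothetical)
hadoop fs -put Pig.csv /user/hadoop/Pig.csv

# MapReduce mode is the default, so no -x flag is needed
pig -f customparam.pig -param input=/user/hadoop/Pig.csv -param output=/user/hadoop/OUT/pig -param delimiter="," -param split='$1'

# inspect the part file written to HDFS
hadoop fs -cat /user/hadoop/OUT/pig/part-*
```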


  1. The above example is not working in my cluster; I'm using the pig-0.8.1-cdh3u5.tar.gz version of Pig. Is there any version dependency for running the above example? Please suggest.

    1. Will let you know, Anil. Will check with the same version of Pig.