Sunday, 27 April 2014

Can we change the default key-value output separator in Hadoop MapReduce?


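Yes. If the job uses TextOutputFormat as its output format, we can change the separator by setting the "mapred.textoutputformat.separator" property in the Driver class. The default separator is a tab ("\t"). In Hadoop 2 and later this property name is deprecated in favour of "mapreduce.output.textoutputformat.separator", although the old name is still mapped to the new one.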


Change to ","
Configuration conf = getConf();
conf.set("mapred.textoutputformat.separator", ","); 

Change to ";"
Configuration conf = getConf();
conf.set("mapred.textoutputformat.separator", ";"); 

Change to ":"
Configuration conf = getConf();
conf.set("mapred.textoutputformat.separator", ":"); 
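
To make the effect of the property easier to picture, here is a minimal sketch in plain Java of how a text output line is put together from a key, a separator and a value. It is only a simplified model, not Hadoop's own code: the class SeparatorSketch and the method formatLine are hypothetical names made up for this illustration, and a null argument stands in for a NullWritable key or value.

```java
// Simplified model of how a text output line is assembled.
// NOT Hadoop code: SeparatorSketch and formatLine are hypothetical names,
// and null stands in for NullWritable.
class SeparatorSketch {

    // Returns the line that would be written, or null when both
    // key and value are absent (nothing is written in that case).
    static String formatLine(Object key, Object value, String separator) {
        boolean hasKey = key != null;
        boolean hasValue = value != null;
        if (!hasKey && !hasValue) {
            return null;
        }
        StringBuilder line = new StringBuilder();
        if (hasKey) {
            line.append(key);
        }
        // The separator is emitted only when both sides are present.
        if (hasKey && hasValue) {
            line.append(separator);
        }
        if (hasValue) {
            line.append(value);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Default tab separator, then a custom comma separator.
        System.out.println(formatLine("hai", "Hello", "\t"));
        System.out.println(formatLine("hai", "Hello", ","));
        // With no key, the separator is skipped entirely.
        System.out.println(formatLine(null, "haiHello", ","));
    }
}
```

In this model the separator appears only when both a key and a value are written, which is why changing the property has no visible effect on records that carry just one of the two.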

Happy Hadooping ...

3 comments:

  1. Hi,

    I don't want any separator between the key and the value in the output. How can I achieve that?

    1. I set the property as shown below:

      conf.set("mapred.textoutputformat.separator", "");

      But the output still contains tab-separated key/value pairs.

    2. If you don't want a separator, you can emit your result as just a key or just a value.
      For example:
      String key = "hai";
      String value = "Hello";
      context.write(new Text(key), new Text(value)); // writes key, separator, value
      To avoid the separator, pass everything as the value with a NullWritable key:
      context.write(NullWritable.get(), new Text(key.concat(value)));
