Monday, 21 April 2014

Code for Deleting the Output Folder If It Exists in Hadoop MapReduce Jobs




Most Hadoop MapReduce jobs take two arguments: an input directory and an output directory.

Each time we run a MapReduce job, we have to give a non-existent folder as the output path; if the folder already exists, the job fails. So while we are developing MR jobs by trial and error, it helps if the job automatically deletes the output folder when it exists.

Here is the code for that:
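/* Imports needed at the top of the driver class */
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/* Provides access to configuration parameters */
Configuration conf = new Configuration();
/* Create a FileSystem object with the configuration */
FileSystem fs = FileSystem.get(conf);
/* The output path is the second argument (args[1]) */
Path outputPath = new Path(args[1]);
/* Check whether the output path exists */
if (fs.exists(outputPath)) {
    /* If it exists, delete it recursively (second argument is true) */
    fs.delete(outputPath, true);
}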
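
If you want to try the same exists-then-delete logic without a cluster, here is a minimal sketch that uses only java.nio.file on the local filesystem. It is a stand-in for the Hadoop FileSystem calls above, not a replacement for them, and the names OutputDirCleaner and deleteIfExists are hypothetical ones chosen for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/* Hypothetical helper: local-filesystem analogue of fs.exists + fs.delete(path, true) */
final class OutputDirCleaner {

    /* Deletes dir and everything under it; returns false if dir did not exist */
    static boolean deleteIfExists(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(dir)) {
            /* Reverse order puts children before their parent directories */
            entries = walk.sorted(Comparator.reverseOrder())
                          .collect(Collectors.toList());
        }
        for (Path entry : entries) {
            Files.delete(entry);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        /* Simulate a leftover output folder from a previous run */
        Path out = Files.createTempDirectory("mr-output");
        Files.createFile(out.resolve("part-r-00000"));

        System.out.println(deleteIfExists(out)); // prints true
        System.out.println(Files.exists(out));   // prints false
    }
}
```

As with the Hadoop version, the delete is recursive and permanent, so it is safest to point it only at a folder that holds nothing but job output.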

5 comments:

  1. Basically I know nothing about Hadoop, but it feels great to see an attempt to share the things you learn. Really nice. I will make use of this information someday for sure. :-)

  2. Fortunately, Apache Hadoop is a tailor-made solution that delivers on both counts, by turning big data insights into actionable business enhancements for long-term success. To know more, visit Hadoop Training Bangalore

  3. Actually, I'm using Scala. I deleted the existing output folder and tried it. It does not throw any exception, but my output file is not created at run time, and I cannot access the output values I stored. What should I do about this?
