There are cases where we need to run more than one MapReduce job, for example:
Map1 -> Reduce1 -> Map2 -> Reduce2
How do you manage the jobs so they are executed in order? There are several approaches. Here is one that makes it easy to chain jobs together: configure one job after another inside a single driver, using the output of each job as the input of the next:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

/**
 * @author Unmesha SreeVeni U.B
 */
public class ChainJobs extends Configured implements Tool {

  private static final String OUTPUT_PATH = "intermediate_output";

  @Override
  public int run(String[] args) throws Exception {
    /*
     * Job 1
     */
    Configuration conf = getConf();
    FileSystem fs = FileSystem.get(conf);
    // Clear any stale intermediate output left behind by a previous run,
    // otherwise Job 1 fails because its output directory already exists.
    fs.delete(new Path(OUTPUT_PATH), true);

    Job job = new Job(conf, "Job1");
    job.setJarByClass(ChainJobs.class);
    job.setMapperClass(MyMapper1.class);
    job.setReducerClass(MyReducer1.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    TextInputFormat.addInputPath(job, new Path(args[0]));
    TextOutputFormat.setOutputPath(job, new Path(OUTPUT_PATH));
    // Stop the chain if the first job fails.
    if (!job.waitForCompletion(true)) {
      return 1;
    }

    /*
     * Job 2
     */
    Job job2 = new Job(conf, "Job 2");
    job2.setJarByClass(ChainJobs.class);
    job2.setMapperClass(MyMapper2.class);
    job2.setReducerClass(MyReducer2.class);
    job2.setOutputKeyClass(Text.class);
    job2.setOutputValueClass(Text.class);
    job2.setInputFormatClass(TextInputFormat.class);
    job2.setOutputFormatClass(TextOutputFormat.class);
    TextInputFormat.addInputPath(job2, new Path(OUTPUT_PATH));
    TextOutputFormat.setOutputPath(job2, new Path(args[1]));
    return job2.waitForCompletion(true) ? 0 : 1;
  }

  /**
   * Reads the arguments from the command line and runs the chained
   * jobs to completion.
   */
  public static void main(String[] args) throws Exception {
    if (args.length != 2) {
      System.err.println("Usage: ChainJobs <input directory> <output location>");
      System.exit(1);
    }
    System.exit(ToolRunner.run(new Configuration(), new ChainJobs(), args));
  }
}
The above code defines two jobs, Job1 and Job2, which run one after the other.
private static final String OUTPUT_PATH = "intermediate_output";
String "OUTPUT_PATH" is used to write the output for first job.
TextInputFormat.addInputPath(job, new Path(args[0]));
TextOutputFormat.setOutputPath(job, new Path(OUTPUT_PATH));
So in the first job the input is args[0] and the output is written to OUTPUT_PATH.
First Job Configuration
/*
 * Job 1
 */
Configuration conf = getConf();
FileSystem fs = FileSystem.get(conf);
// Clear any stale intermediate output left behind by a previous run.
fs.delete(new Path(OUTPUT_PATH), true);

Job job = new Job(conf, "Job1");
job.setJarByClass(ChainJobs.class);
job.setMapperClass(MyMapper1.class);
job.setReducerClass(MyReducer1.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
TextInputFormat.addInputPath(job, new Path(args[0]));
TextOutputFormat.setOutputPath(job, new Path(OUTPUT_PATH));
// Stop the chain if the first job fails.
if (!job.waitForCompletion(true)) {
  return 1;
}
Once the first job has completed successfully, OUTPUT_PATH serves as the input to the second job, and the output of job2 is written to args[1].
TextInputFormat.addInputPath(job2, new Path(OUTPUT_PATH));
TextOutputFormat.setOutputPath(job2, new Path(args[1]));
Second Job Configuration
/*
 * Job 2
 */
Job job2 = new Job(conf, "Job 2");
job2.setJarByClass(ChainJobs.class);
job2.setMapperClass(MyMapper2.class);
job2.setReducerClass(MyReducer2.class);
job2.setOutputKeyClass(Text.class);
job2.setOutputValueClass(Text.class);
job2.setInputFormatClass(TextInputFormat.class);
job2.setOutputFormatClass(TextOutputFormat.class);
TextInputFormat.addInputPath(job2, new Path(OUTPUT_PATH));
TextOutputFormat.setOutputPath(job2, new Path(args[1]));
return job2.waitForCompletion(true) ? 0 : 1;
Happy Hadooping . . .
Where is the code for ChainJobs1.java and ChainJobs2.java?
There is no ChainJobs2.java. Apologies for the confusion, and thanks for pointing it out. I have updated the post.
Sorry, I am new to Hadoop. Could you please give some examples of how to read a file from the map/reduce function? Do you just do fs.open(), or is there any built-in magic from TextInputFormat.addInputPath()?
Thanks!
You can read files in a MapReduce job using TextInputFormat: supply the file as the job input and read the records in the map function. You can also read files from the distributed cache in the setup function.
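Here is a minimal sketch of the setup() route mentioned above; the class name, the tab-separated lookup-file format, and the key/value types are illustrative assumptions, not from the post. The driver would register the file with DistributedCache.addCacheFile(...), and each map task reads it once before processing records.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class LookupMapper extends Mapper<LongWritable, Text, Text, Text> {

  private final Map<String, String> lookup = new HashMap<String, String>();

  @Override
  protected void setup(Context context) throws IOException {
    // Files added in the driver via DistributedCache.addCacheFile(...)
    // are copied to the local disk of each task; read them here once
    // per task instead of once per record.
    Path[] cacheFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
    if (cacheFiles != null && cacheFiles.length > 0) {
      BufferedReader reader = new BufferedReader(new FileReader(cacheFiles[0].toString()));
      try {
        String line;
        while ((line = reader.readLine()) != null) {
          String[] parts = line.split("\t", 2);   // assumes tab-separated pairs
          lookup.put(parts[0], parts.length > 1 ? parts[1] : "");
        }
      } finally {
        reader.close();
      }
    }
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Normal input records still arrive here via TextInputFormat.
    String joined = lookup.get(value.toString());
    if (joined != null) {
      context.write(value, new Text(joined));
    }
  }
}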
Let me know if you have further doubts.
Thank you very much!
Thank you very much for such a helpful post.
Keep posting such stuff on Hadoop.
Nishit
Sure.
The second job doesn't seem to run for me. The mapper's setup() runs, but not the map function of the second mapper. Is it because of format issues? Otherwise there doesn't seem to be anything wrong in my program.
Could you please share your code? Or you can ping me at unmeshabiju@gmail.com
Hi,
I am running a Hadoop chain job. With small data sets (i.e. 10-20 files) it works perfectly, but with more than 30 files the second job gets a "connection refused" error after the first job finishes. This has already happened a couple of times. Can you please let me know why I am facing this issue? I have also tried addDependingJob, but with that the output path for job2 does not get validated.
Thanks,
Shuvankar
Can you please paste the error?
Hi Unmesha Sreeveni, great post! You saved me! :D
I found some errors, like FileNotFoundException, and I solved them by appending "/part-r-00000" (the name of the output file).
In my application I am implementing the GIM-V algorithm, which basically multiplies a matrix by a vector, then multiplies the matrix by the resulting vector, and so on.
Finally I wrote a loop that creates the new jobs, something like this:
Configuration conf = getConf();
Job job = new Job(conf, "matrix-multiply-vector");
// See Amareshwari Sri Ramadasu's comment in this thread...
// http://lucene.472066.n3.nabble.com/Distributed-Cache-with-New-API-td722187.html
// you need to do job.getConfiguration() instead of conf.
DistributedCache.addCacheFile(new Path(args[1]).toUri(),
job.getConfiguration());
job.setJarByClass(MatrixMultiplyVector.class);
job.setMapperClass(Mapper1.class);
job.setReducerClass(Reducer1.class);
job.setMapOutputKeyClass(LongWritable.class);
job.setMapOutputValueClass(DoubleWritable.class);
job.setInputFormatClass(TextInputFormat.class);
//setoutputFormat...
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[2]));
boolean succ = job.waitForCompletion(true);
int nroRepeticiones =Integer.parseInt(args[3]);
String salida = args[2];
String nuevaSalida=salida;
for(int i=1;i<nroRepeticiones;i++){
Configuration conf2 = new Configuration();
Job job2 = new Job(conf2, "ENCADENADOJOB");
// See Amareshwari Sri Ramadasu's comment in this thread...
// http://lucene.472066.n3.nabble.com/Distributed-Cache-with-New-API-td722187.html
// you need to do job.getConfiguration() instead of conf.
DistributedCache.addCacheFile(new Path(nuevaSalida+"/part-r-00000").toUri(),
job2.getConfiguration());
job2.setJarByClass(MatrixMultiplyVector.class);
job2.setMapperClass(Mapper1.class);
job2.setReducerClass(Reducer1.class);
job2.setMapOutputKeyClass(LongWritable.class);
job2.setMapOutputValueClass(DoubleWritable.class);
job2.setInputFormatClass(TextInputFormat.class);
//setoutputFormat...
nuevaSalida = salida+"-"+String.valueOf(i);
FileInputFormat.addInputPath(job2, new Path(args[0]));
FileOutputFormat.setOutputPath(job2, new Path(nuevaSalida));
System.out.println("-----iteracion:"+i);
succ = job2.waitForCompletion(true);
}
return 5;
Thank you again :D
Thanks.
Yes, for the distributed cache you need to mention the part file as well, but when you are supplying input to an MR job you only need to specify the folder.
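In code, that distinction looks like this. This is only a fragment, not a full driver: it assumes a Job named job is already configured, and "prev_output" is an illustrative path.

// Distributed cache: name the concrete part file inside the first job's output folder.
DistributedCache.addCacheFile(
    new Path("prev_output/part-r-00000").toUri(), job.getConfiguration());

// Job input: name just the folder; Hadoop enumerates the part files inside it.
FileInputFormat.addInputPath(job, new Path("prev_output"));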
Nice work Unmesha. I will try out the code; meanwhile I have a few questions.
1. As OUTPUT_PATH is intermediate output, where is it stored: HDFS or local disk (like mapper output)?
2. Does it persist, or does it get deleted after the job finishes? If it persists, can we see the file contents (or will it be serialized)?
The intermediate output is written to HDFS; that is how you can use the output path of the first job as the input for the next.
Following the above question: is it necessary to store the results in HDFS? Is there any way to redirect them to the next mapper without wasting resources on creating a new file?
DeleteThanks for the blog its really helpful.The chaining job is very interesting one.Thanks for the nice blog.Besant Technologies Reviews | Besant Technologies Reviews
ReplyDeleteFor latest and updated Cloudera certification dumps in PDF format contact us at completeexamcollection@gmail.com.
ReplyDeleteRefer our blog for more details http://completeexamcollection.blogspot.in/2015/04/cloudera-hadoop-certification-dumps.html
Nice example. But if I need to chain n jobs where n is not predefined, then what should be done? Let's say for an iterative algorithm that terminates only when certain conditions are met.
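One common pattern for the open-ended case asked about above is to drive each round from a loop and use a job counter to decide when to stop. Here is a rough sketch under assumed names: IterMapper, IterReducer, and the Convergence counter are placeholders, and the reducer would call context.getCounter(Convergence.NOT_CONVERGED).increment(1) whenever the result has not yet converged.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IterativeDriver {

  // Hypothetical counter the reducer bumps while results keep changing.
  public enum Convergence { NOT_CONVERGED }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String input = args[0];
    int iteration = 0;
    long notConverged = 1;

    while (notConverged > 0) {
      String output = args[1] + "-" + iteration;   // one directory per round

      Job job = new Job(conf, "iteration-" + iteration);
      job.setJarByClass(IterativeDriver.class);
      job.setMapperClass(IterMapper.class);        // placeholder classes
      job.setReducerClass(IterReducer.class);
      job.setOutputKeyClass(Text.class);
      job.setOutputValueClass(Text.class);
      FileInputFormat.addInputPath(job, new Path(input));
      FileOutputFormat.setOutputPath(job, new Path(output));

      if (!job.waitForCompletion(true)) {
        System.exit(1);                            // stop the chain on failure
      }
      // Read back the counter to decide whether another round is needed.
      notConverged = job.getCounters()
          .findCounter(Convergence.NOT_CONVERGED).getValue();

      input = output;                              // this round's output feeds the next
      iteration++;
    }
  }
}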
I am using the same example, but when it executes the second job it says the input file is not found, and the output file is not created even though the first job executes successfully.
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:54310/user/output1232
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:321)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:385)
at org.apache.hadoop.mapreduce.lib.input.DelegatingInputFormat.getSplits(DelegatingInputFormat.java:115)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:597)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:614)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
at com.hadoop.intellipaat.JoinClickImpressionDetailJob.run(JoinClickImpressionDetailJob.java:418)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.hadoop.intellipaat.JoinClickImpressionDetailJob.main(JoinClickImpressionDetailJob.java:422)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Can you post your driver class code snippet?
Very very helpful!
Hi Unmesha Sreeveni,
Thanks a lot for the detailed explanation, very helpful.
I am a beginner in Hadoop and I don't know why I get these errors in the driver code.
Could you please advise me?
The driver code has been sent to this mail: unmeshabiju@gmail.com
In Hadoop, MapReduce is a computational model that decomposes large processing jobs into individual tasks that can be executed in parallel across a cluster of servers, and the results of the tasks can be joined together to compute the final result.
Hello. I am trying to create a chained job in Hadoop. The algorithm I want to implement requires map2 to take as its input the output of map1, and Job1 has both a map and a reduce phase. Is there any possible way to make something like this happen?
Thanks in advance.
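For the [MAP1 | REDUCE1 | MAP2] pattern asked about above, Hadoop ships ChainMapper and ChainReducer in org.apache.hadoop.mapreduce.lib.chain, which run several mappers around a single reduce phase inside one job. A minimal sketch, where Map1, Reduce1, Map2 and the key/value types are placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.chain.ChainMapper;
import org.apache.hadoop.mapreduce.lib.chain.ChainReducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapReduceMapDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "map1-reduce1-map2");
    job.setJarByClass(MapReduceMapDriver.class);

    Configuration noExtraConf = new Configuration(false);

    // First mapper of the chain: Map1 (placeholder class).
    ChainMapper.addMapper(job, Map1.class,
        LongWritable.class, Text.class,   // Map1 input key/value
        Text.class, IntWritable.class,    // Map1 output key/value
        noExtraConf);

    // The single reduce phase: Reduce1 (placeholder class).
    ChainReducer.setReducer(job, Reduce1.class,
        Text.class, IntWritable.class,    // Reduce1 input key/value
        Text.class, IntWritable.class,    // Reduce1 output key/value
        noExtraConf);

    // Map2 runs directly on Reduce1's output, inside the same job.
    ChainReducer.addMapper(job, Map2.class,
        Text.class, IntWritable.class,    // Map2 input key/value
        Text.class, IntWritable.class,    // Map2 output key/value
        noExtraConf);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}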
Thanks for some really useful code. This was just what I needed to finish my Hadoop assignment.