Sunday, 2 November 2014

How To Install Apache Mahout on Ubuntu


Prerequisites:

  1.  Hadoop Cluster
  2.  Maven
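Before starting, it can help to confirm that the prerequisite tools are actually on your PATH. The helper below is only a sketch: `missing_tools` is a hypothetical name, and checking for `java` as well is an assumption (both Hadoop and Maven need a JDK) rather than something listed above.

```shell
# Print the name of every listed command that is NOT found on PATH.
# Prints nothing when all of the commands are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example (assumed tool names): missing_tools hadoop mvn java
```

If the example call prints nothing, all three tools were found; otherwise install whatever it lists first.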


STEP 1: Download the latest Mahout source code from

http://www.apache.org/dyn/closer.cgi/lucene/mahout/

Make sure you download the source archive, mahout-distribution-x.x-src.zip.


STEP 2: Unzip the file into a folder named “mahout”

unzip -a mahout-distribution-x.x-src.zip -d mahout

STEP 3: Move mahout to /usr/local

mv mahout /usr/local

STEP 4: Build Mahout

unmesha@client:~$ cd /usr/local/mahout/mahout-distribution-0.9
unmesha@client:/usr/local/mahout/mahout-distribution-0.9$ ls
bin         core          examples     LICENSE.txt  math-scala  pom.xml     src
buildtools  distribution  integration  math         NOTICE.txt  README.txt  target
unmesha@client:/usr/local/mahout/mahout-distribution-0.9$ mvn install

Wait until Mahout is built. The build also runs the tests. It is recommended to let the tests complete the first time. Later you can skip them using:

mvn install -Dmaven.test.skip=true
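Maven has two common ways to skip tests: `-DskipTests` still compiles the test classes but does not run them, while `-Dmaven.test.skip=true` skips both compiling and running them. A small wrapper, shown here only as a sketch (`build_mahout` and its mode names are made up for illustration), keeps the three variants in one place:

```shell
# build_mahout [MODE]   (run from the Mahout source directory)
#   full      complete build including tests (recommended the first time)
#   skip-run  compile the tests but do not run them
#   skip-all  neither compile nor run the tests (the form shown above)
build_mahout() {
    case "${1:-full}" in
        full)     mvn install ;;
        skip-run) mvn install -DskipTests ;;
        skip-all) mvn install -Dmaven.test.skip=true ;;
        *)        echo "usage: build_mahout [full|skip-run|skip-all]" >&2
                  return 2 ;;
    esac
}
```

Calling `build_mahout` with no argument runs the full build; `build_mahout skip-all` is equivalent to the command above.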

Once the tests are done and Mahout is built, Maven prints a BUILD SUCCESS message.


Congratulations, Apache Mahout is installed.


If you are using the Cloudera (CDH) packages, you can install Mahout in just one step:
apt-get install mahout

The mahout command is then available in /usr/bin. If you want to run Mahout on a Hadoop cluster, go to /usr/lib, reference mahout-cdhx-core-job.jar, and give the fully qualified class name of the job you want to run.
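Running a job from the job jar goes through the usual `hadoop jar <jar> <class> [args]` form. The wrapper below is a sketch under assumptions: `run_mahout_job` is a hypothetical helper, the jar name keeps the `cdhx` placeholder from above (substitute the file actually present on your system), and `org.apache.mahout.clustering.kmeans.KMeansDriver` with `--help` is just one illustrative driver class and argument.

```shell
# run_mahout_job JAR CLASS [ARGS...]
# Submit a Mahout driver class from the job jar to the Hadoop cluster.
run_mahout_job() {
    if [ "$#" -lt 2 ]; then
        echo "usage: run_mahout_job JAR CLASS [ARGS...]" >&2
        return 2
    fi
    jar_path=$1
    main_class=$2
    shift 2
    hadoop jar "$jar_path" "$main_class" "$@"
}

# Example (placeholder jar name, illustrative class):
# run_mahout_job mahout-cdhx-core-job.jar \
#     org.apache.mahout.clustering.kmeans.KMeansDriver --help
```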



2 comments:

  1. Can we implement this on the OSGi framework?

    1. I think they don't use OSGi at this point. They are primarily producing a library rather than standalone programs. Can you say something about how using OSGi would help?
