Install Apache Tika on Debian

  1. Update Java (your current Java should be Java 7 or higher; if it is already up to date, proceed to step 2).

    echo "deb trusty main" | tee /etc/apt/sources.list.d/webupd8team-java.list

    echo "deb-src trusty main" | tee -a /etc/apt/sources.list.d/webupd8team-java.list

    apt-key adv --keyserver hkp:// --recv-keys EEA14886

    apt-get update

    apt-get install oracle-java8-installer

    Now check the Java version, e.g. java -version

    It should show something like:

    java version "1.8.0_25"

    Java(TM) SE Runtime Environment (build 1.8.0_25-b17)

    Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)

    If it's still an old version, you can follow the link below.
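
The version check in this step can also be scripted. The following is only a rough sketch: the helper name java_major is made up for illustration, and it assumes the version is the quoted string on the first line of the java -version output, as in the sample above.

```shell
#!/bin/sh
# Hypothetical helper: print the major Java version for a version string.
# Old scheme: "1.8.0_25" -> 8. Newer scheme: "11.0.2" -> 11.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so redirect it before parsing.
    version=$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)
    major=$(java_major "$version")
    if [ "$major" -ge 7 ]; then
        echo "Java $version is new enough, proceed to step 2"
    else
        echo "Java $version is too old (or could not be parsed), update it first"
    fi
else
    echo "java not found, install it first"
fi
```

Unusual version strings (for example early-access builds) are not handled here and will fall through to the "too old" branch.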

    2. Download Tika

       b. Unzip: e.g. unzip

    3. Install Maven 2, the build system Tika uses

       b. Unzip: e.g. unzip

       c. Follow the installation guide here->

    4. Install Tika

       a. Change to the Tika base directory, e.g. cd tika-1.6

       b. Build Tika using Maven 2, e.g. mvn install

    5. Finished. Now you can test Tika.

       a. From the Tika base directory, run:

          java -jar tika-app/target/tika-app-1.6.jar -j [file]  

          The output should be the file's metadata encoded as JSON.

          For other options, run java -jar tika-app/target/tika-app-1.6.jar --help
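
To run the -j command from step 5 over a whole directory, a small loop is usually enough. This is a minimal sketch, not part of Tika itself: the function name extract_metadata, the TIKA_JAR variable and the output naming scheme are assumptions made for illustration, and the jar path is the one produced by the build above.

```shell
#!/bin/sh
# Path to the Tika app jar, relative to the Tika base directory (from step 5).
TIKA_JAR="tika-app/target/tika-app-1.6.jar"

# Hypothetical helper: write one JSON metadata file per input file.
#   $1 = input directory, $2 = output directory
extract_metadata() {
    mkdir -p "$2"
    for f in "$1"/*; do
        # Skip subdirectories and an unmatched glob.
        [ -f "$f" ] || continue
        java -jar "$TIKA_JAR" -j "$f" > "$2/$(basename "$f").json"
    done
}
```

For example, extract_metadata ./docs ./metadata would write ./metadata/report.pdf.json for a file named ./docs/report.pdf. Starting a new JVM for every file is slow, so this only suits small batches.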

