1. Update Java (your current Java should be Java 7 or higher; if it is already up to date, proceed to step 2). The commands below write to /etc/apt, so run them as root.
echo "deb http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main" | tee /etc/apt/sources.list.d/webupd8team-java.list
echo "deb-src http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main" | tee -a /etc/apt/sources.list.d/webupd8team-java.list
apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys EEA14886
apt-get update
apt-get install oracle-java8-installer
Now check the Java version, e.g. java -version
It should show something like
java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
If it is still an old version, you can follow the link below
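If you would rather check the version from a script than by eye, here is a minimal sketch. The function names java_major_version and java_is_recent_enough are made up for this post, and the parsing assumes the first line of the output looks like the sample above (a quoted version string after the word "version").

```shell
#!/bin/sh
# Extract the major Java version from a line such as: java version "1.8.0_25"
# Handles the legacy "1.x" scheme (1.7 -> 7, 1.8 -> 8) and plain "N.x" strings.
java_major_version() {
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$version" in
        "") return 1 ;;                  # no quoted version string found
        1.*) version=${version#1.} ;;    # drop the legacy "1." prefix
    esac
    # Keep only the leading digits, e.g. "8.0_25" -> "8"
    printf '%s\n' "${version%%[!0-9]*}"
}

# Succeeds when the given version line reports Java 7 or newer.
java_is_recent_enough() {
    major=$(java_major_version "$1") || return 1
    [ "$major" -ge 7 ]
}
```

Note that java -version prints to stderr, so a call could look like: java_is_recent_enough "$(java -version 2>&1 | head -n 1)" && echo "Java is new enough"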
2. Download Tika
a. Download: e.g. wget http://mirror.sdunix.com/apache/tika/tika-1.6-src.zip
b. Unzip: e.g. unzip tika-1.6-src.zip
3. Install Maven 3 for our Tika build system
a. Download: e.g. wget http://psg.mtu.edu/pub/apache/maven/maven-3/3.2.3/binaries/apache-maven-3.2.3-bin.zip
b. Unzip: e.g. unzip apache-maven-3.2.3-bin.zip
c. Follow the installation guide here: http://maven.apache.org/download.cgi#Installation
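The main thing the installation guide asks for is putting Maven's bin directory on your PATH. A rough sketch of that, assuming the archive was unzipped in the current directory and produced apache-maven-3.2.3/ (the variable name MAVEN_HOME_DIR is just a placeholder chosen here, not something Maven reads):

```shell
# Assumption: apache-maven-3.2.3-bin.zip was unzipped in the current directory.
# Adjust MAVEN_HOME_DIR if you moved the folder somewhere else.
MAVEN_HOME_DIR="$PWD/apache-maven-3.2.3"

# Prepend Maven's bin directory to PATH unless it is already there.
case ":$PATH:" in
    *":$MAVEN_HOME_DIR/bin:"*) ;;
    *) PATH="$MAVEN_HOME_DIR/bin:$PATH" ;;
esac
export PATH
```

After that, mvn -v should print the Maven version. The change only lasts for the current shell session; to keep it, add the same lines to your shell profile.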
4. Install Tika
a. Enter the Tika base directory, e.g. cd tika-1.6
b. Build Tika using Maven, e.g. mvn install
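The build can take a while, so it is worth confirming that it actually produced the jar used in the next step. A small sketch, where tika_jar_built is a hypothetical helper name and the jar path is the one used in step 5:

```shell
# Report whether the Tika app jar exists under the given base directory.
# $1: Tika base directory (defaults to the current directory).
tika_jar_built() {
    jar="${1:-.}/tika-app/target/tika-app-1.6.jar"
    if [ -f "$jar" ]; then
        echo "found $jar"
    else
        echo "missing $jar (did mvn install finish without errors?)" >&2
        return 1
    fi
}
```

Run tika_jar_built from the Tika base directory, or pass the directory as an argument, e.g. tika_jar_built tika-1.6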
5. Finish. Now you can test Tika
a. From the base directory of Tika you can run
java -jar tika-app/target/tika-app-1.6.jar -j [file]
The output should be the file's metadata encoded as JSON
For more options, see http://tika.apache.org/1.6/gettingstarted.html or run java -jar tika-app/target/tika-app-1.6.jar --help
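If you have more than one file, the same command can be wrapped in a loop. This is only a sketch: tika_json and tika_json_all are names invented for this example, and writing the result next to each input as <file>.json is my own choice, not something Tika does by itself.

```shell
#!/bin/sh
# Path to the jar built in step 4, relative to the Tika base directory.
TIKA_JAR=${TIKA_JAR:-tika-app/target/tika-app-1.6.jar}

# Print JSON metadata for one file, using the same command as step 5a.
tika_json() {
    java -jar "$TIKA_JAR" -j "$1"
}

# Write metadata for every file given as an argument into <file>.json.
# Arguments that are not regular files are skipped with a warning.
tika_json_all() {
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "skipping $f: not a regular file" >&2
            continue
        fi
        if ! tika_json "$f" > "$f.json"; then
            echo "tika failed on $f" >&2
            rm -f "$f.json"    # do not leave a partial output file behind
        fi
    done
}
```

For example, tika_json_all docs/*.pdf would leave one .json file beside each PDF. Each call starts a new JVM, so this is fine for a handful of files but slow for thousands.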