#simba
An insert, extraction, and analysis framework for LDM.
#Notice 1: the Scala version must be compatible with your system and with Spark
- spark 1.3.1
- scala 2.10.4
- hadoop 1.2.1
- titan 1.0.0
#Notice 2: required libraries
The `lib` directory under the simba home is assumed to contain the following jars:
- hadoop-client-1.2.1.jar
- hadoop-core-1.2.1.jar
- hadoop-gremlin-3.0.1-incubating.jar
- hbase-client-0.98.2-hadoop1.jar
- hbase-common-0.98.2-hadoop1.jar
- hbase-protocol-0.98.2-hadoop1.jar
- htrace-core-2.04.jar
Otherwise, you need to include these libraries as managed dependencies by modifying `build.sbt`.
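By default, sbt picks up jars placed in `lib/` as unmanaged dependencies, so no extra configuration is needed when the jars above are present. If you prefer managed dependencies instead, a rough `build.sbt` sketch follows; the group IDs are the usual Maven coordinates for these artifacts, not taken from this project's build, so treat them as assumptions:

```scala
// Hypothetical build.sbt fragment: fetch the same libraries from Maven
// instead of shipping them in lib/. Coordinates are assumptions.
libraryDependencies ++= Seq(
  "org.apache.hadoop"    % "hadoop-client"  % "1.2.1",
  "org.apache.hadoop"    % "hadoop-core"    % "1.2.1",
  "org.apache.tinkerpop" % "hadoop-gremlin" % "3.0.1-incubating",
  "org.apache.hbase"     % "hbase-client"   % "0.98.2-hadoop1",
  "org.apache.hbase"     % "hbase-common"   % "0.98.2-hadoop1",
  "org.apache.hbase"     % "hbase-protocol" % "0.98.2-hadoop1",
  "org.cloudera.htrace"  % "htrace-core"    % "2.04"
)
```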
#Notice 3: (for Titan)
- `conf` contains the `conf/titan-hbase-es-simba.properties` configuration file for TitanDB (HBase + Elasticsearch by default)
- `test_input` contains the docs and links data, which can be loaded as `val docRDD = sc.objectFile[Document](...)` and `val linkRDD = sc.objectFile[DocumentLink](...)`
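For reference, a Titan 1.0 graph backed by this configuration can be opened directly with `TitanFactory`; presumably `TitanSimbaDB` does something similar internally, though that is an assumption about this codebase. A minimal sketch:

```scala
import com.thinkaurelius.titan.core.{TitanFactory, TitanGraph}

// Open a TitanGraph from the bundled properties file
// (HBase storage backend + Elasticsearch index backend).
val graph: TitanGraph = TitanFactory.open("conf/titan-hbase-es-simba.properties")
// ... work with the graph ...
graph.close()
```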
####compile
    sbt clean compile
####run
    sbt run
####test
    sbt test
#Simple Example:
    var gDB = TitanSimbaDB(sc, titanConf)
    val docRDD = sc.objectFile[Document](...)
    gDB.insert(docRDD)
    gDB.docs().foreach(s => s.simbaPrint())
    gDB.close()