Commit Graph

348 Commits

Author SHA1 Message Date
yanggang 81343172ea update unGz 2018-12-14 14:55:36 +08:00
yanfqidong0604 0a4e2f24e7 Optimization of HDFS file storage
yang qidong
2018-12-13 22:48:15 +08:00
yanggang bcd658bb11 Optimization and modification of Oracle
yang qidong
2018-12-11 10:52:34 +08:00
judy0131 8b109c6902 fix bug: remove hive-site.xml core-site.xml 2018-11-30 17:07:19 +08:00
yanggang 426bcff6c2 Merge remote-tracking branch 'origin/master'
# Conflicts:
#	piflow-bundle/src/test/scala/cn/piflow/bundle/UrlTest.scala
2018-11-29 17:01:59 +08:00
yanggang 499f70cda1 update unGz 2018-11-29 16:59:43 +08:00
judy0131 972f8af248 fix flow progress bug 2018-11-28 18:17:39 +08:00
judy0131 63e3cdda81 remove common-pool2.jar 2018-11-28 15:34:16 +08:00
judy0131 808a8e1693 fix checkpoint bug 2018-11-28 15:02:52 +08:00
yanggang 7921775508 Merge remote-tracking branch 'origin/master' 2018-11-26 15:43:02 +08:00
yanggang c29f9bcba7 bioProject stop 2018-11-26 15:42:48 +08:00
yanfqidong0604 c8a23fdb72 the test for dblp
yang qidong
2018-11-26 10:25:21 +08:00
yanfqidong0604 892866fd51 the test for dblp
yang qidong
2018-11-26 10:23:26 +08:00
yanfqidong0604 2b46264473 modification of ConvertSchema
yang qidong
2018-11-23 12:56:14 +08:00
yanggang cda81eae5c update describe 2018-11-22 19:56:08 +08:00
yanggang ba4b63ae15 Merge remote-tracking branch 'origin/master' 2018-11-22 19:52:33 +08:00
yanggang 04fe22c69c Merge branch 'master' of /workFtp/1119/piflow with conflicts. 2018-11-22 19:52:30 +08:00
yanggang 73ff82783a update describe 2018-11-22 19:49:06 +08:00
yanggang 95316d516d goldData Parse 2018-11-22 19:28:08 +08:00
yanggang f8f34a4cee update describe 2018-11-22 19:27:04 +08:00
xiaoxiao d02f3da4ee alter email and add description for properties 2018-11-22 17:23:29 +08:00
yanfqidong0604 e97fac9d84 Schema substitution for dataframe
About downloading network files to HDFS
About HDFS file decompression
yang qidong
2018-11-22 16:52:05 +08:00
yanggang 2e26f7cbe3 photo png(hdfs,hbase,hive) 2018-11-22 16:01:13 +08:00
yanggang 81cfdb8953 goldData Parse 2018-11-22 15:42:14 +08:00
yanggang 4bb8abba9c goldData Parse 2018-11-22 14:46:04 +08:00
yanggang 5099851fe7 png photo 2018-11-22 14:32:04 +08:00
yanggang 5828c2b5bc Merge remote-tracking branch 'origin/master' 2018-11-22 14:10:45 +08:00
yanggang 7a329eb7cb goldData Parse 2018-11-22 14:09:43 +08:00
judy0131 dd11c444a2 1.implement checkpoint 2018-11-20 16:46:47 +08:00
yanfqidong0604 ed8d9faaf2 Standardization of historical stop
yang qidong
2018-11-19 22:03:18 +08:00
yanggang 5a3b4d694d Merge remote-tracking branch 'origin/master' 2018-11-19 16:13:26 +08:00
yanggang 9160be777c add maven 2018-11-19 16:13:06 +08:00
xiaoxiao 239786f881 add graphx stops 2018-11-19 15:24:40 +08:00
yanggang 350f74a741 Merge remote-tracking branch 'origin/master' 2018-11-19 15:15:33 +08:00
yanggang a07cca1f8b doMap flatMap executeSql 2018-11-19 15:15:06 +08:00
yanfqidong0604 9fcf843efc Stop name and parameter name adjustment
yang qidong
2018-11-19 14:46:10 +08:00
yanfqidong0604 acf2acf3b3 Modification of special types of Oracle
yang qidong
2018-11-19 10:24:04 +08:00
yanfqidong0604 becfca6b91 Modification of special types of Oracle
yang qidong
2018-11-19 09:43:45 +08:00
yanfqidong0604 7bbfd7d3ee Oracle database read and write and related driver package
yang qidong
2018-11-15 10:13:49 +08:00
yanfqidong0604 32a0ef0f0f Merge remote-tracking branch 'origin/master' 2018-11-13 12:36:22 +08:00
yanfqidong0604 32925d3071 Additional stop reading impala, changes to description information of previous Memcache stop, and changes to logo of spider
yang qidong
2018-11-13 12:35:52 +08:00
yanggang 7f53e25275 netty-all dependency 2018-11-13 12:12:47 +08:00
yanggang 6994550803 Merge remote-tracking branch 'origin/master' 2018-11-12 18:11:53 +08:00
yanggang fd5cfe6b59 hbase 2018-11-12 18:11:11 +08:00
judy0131 2d80db9585 fix bug 2018-11-12 17:19:56 +08:00
judy0131 88def74fa1 Merge remote-tracking branch 'origin/master' 2018-11-12 15:26:27 +08:00
judy0131 5fcb6eff7d fix bug 2018-11-12 15:26:07 +08:00
yanfqidong0604 1360645156 The group name standardization of stop recently completed, and the image of the crawler.
yang qidong
2018-11-12 15:15:04 +08:00
yanfqidong0604 8d198cb392 Maven dependency of Memcache
yang qidong
2018-11-12 14:42:50 +08:00
yanfqidong0604 96abf1b61f Read and write mongodb, read and write Memcache, and complement Memcache.
yang qidong
2018-11-09 15:10:27 +08:00
judy0131 82c64aabae add flow name in flowImpl 2018-11-06 17:07:13 +08:00
judy0131 a4badc9703 1.modify getFlowProgress api
2.add maven-install-plugin to install the databricks file
2018-11-06 13:55:28 +08:00
yanggang 1b956b7cf0 update genbank 2018-11-06 10:01:55 +08:00
yanfqidong0604 32749fa906 The first version of web spider stop
yang qidong
2018-11-04 18:31:42 +08:00
yanggang 163acb2525 incremental download from ftpUrl 2018-11-02 14:50:30 +08:00
yanggang 707505979f update Es 2018-11-02 10:40:11 +08:00
yanggang d14b8ebc1f update Es 2018-11-02 10:10:08 +08:00
yanggang f8dd478f4d genbank002 2018-11-01 17:37:49 +08:00
yanggang 1d623d70b7 genbank001 2018-11-01 17:33:40 +08:00
xiaoxiao 06bdb859d2 add gaussion mixture stops 2018-11-01 17:17:21 +08:00
yanggang 778d41744e Merge remote-tracking branch 'origin/master' 2018-11-01 17:13:21 +08:00
yanggang f3f492e75b GenBankParse 2018-11-01 17:13:05 +08:00
xiaoxiao 6b2698b94f add lda stops 2018-11-01 16:48:43 +08:00
yanggang d8ba5012e7 GenBankParse 2018-11-01 16:44:25 +08:00
yanggang dfcc2a997e GenBankParse 2018-11-01 16:33:16 +08:00
yanggang 45319224a5 ungzUtil 2018-10-31 16:08:03 +08:00
yanggang 7ef5206305 New put to es,fetch from es.query from es 2018-10-31 10:40:11 +08:00
xiaoxiao 84768e8abe add Word2Vec stop 2018-10-26 11:21:22 +08:00
yanggang 0af6b10fb3 Merge remote-tracking branch 'origin/master' 2018-10-25 14:09:13 +08:00
yanggang def72cf714 hdfs(delete,get,list,put),http(get,post,invoke) 2018-10-25 14:05:48 +08:00
coco11563 2105731818 fix to String bug 2018-10-24 17:09:03 +08:00
xiaoxiao 8d626a876a add BisectingKMeans stops 2018-10-24 14:41:32 +08:00
yanfqidong0604 e3d1dda0e0 Merge remote-tracking branch 'origin/master' 2018-10-23 18:30:41 +08:00
yanfqidong0604 8d83adecba hdfs(delete,get,list,put),http(get,post,invoke) 2018-10-23 18:29:25 +08:00
xiaoxiao 6965a70a19 add Kmeans stops 2018-10-23 11:17:47 +08:00
coco11563 f9763670ca revise an little bug, which will cause the Array type reduce to single string 2018-10-23 11:07:39 +08:00
coco11563 2a4d33ac48 Merge remote-tracking branch 'origin/master' 2018-10-22 15:24:06 +08:00
coco11563 7d0fdaae35 revise an little bug, which will cause the Array type reduce to single string 2018-10-22 15:23:53 +08:00
judy0131 706517fef1 Merge remote-tracking branch 'origin/master' 2018-10-22 15:13:29 +08:00
judy0131 c329a8459f fix bug: merge and fork 2018-10-22 15:13:14 +08:00
coco11563 3327f58662 done the 28 parameter desc stop
i feel frustrated
2018-10-22 14:36:23 +08:00
judy0131 554cd25dc4 fix checkpoint bug 2018-10-22 12:22:55 +08:00
yanfqidong0604 a70fe6a989 New field expansion function has been added.
QiDong Yang
2018-10-22 11:03:35 +08:00
judy0131 d8d0768eb6 add image 2018-10-19 15:16:54 +08:00
judy0131 416594d594 fix image bug 2018-10-19 14:37:17 +08:00
yanfqidong0604 b2943ce928 GetHttp ,Posthttp 2018-10-19 08:57:19 +08:00
yanfqidong0604 e0d20b842b GetHttp ,Posthttp 2018-10-18 21:32:14 +08:00
xiaoxiao 9b621d4db1 add GBT stops 2018-10-18 17:19:39 +08:00
xiaoxiao fc83efec28 add Random Forest stops 2018-10-18 16:51:58 +08:00
coco11563 fcf0f40da8 add the structure of the csv to neo4j stop
it's so boring to finish this stop
so I left it to next week
2018-10-18 16:30:36 +08:00
coco11563 01d32397be modify the RdfToCsv stop
now we have the RdfToDF stop
all the process will encapsulate by the Row\StructField\StructType
there are no more CSV stuff in this stop
Since the new outport design, I revise the code
2018-10-18 16:07:38 +08:00
coco11563 172d1c7d61 Merge remote-tracking branch 'origin/master'
# Conflicts:
#	piflow-bundle/src/main/scala/cn/piflow/bundle/rdf/RdfToCsv.scala
2018-10-18 16:01:51 +08:00
coco11563 007d0e67d0 modify the RdfToCsv stop
now we have the RdfToDF stop
all the process will encapsulate by the Row\StructField\StructType
there are no more CSV stuff in this stop
2018-10-18 16:01:14 +08:00
xiaoxiao b38038285c add MLP stops 2018-10-18 15:35:15 +08:00
yanfqidong0604 a19e66e3da For JSON file read, folder read and file merge
QiDong Yang
2018-10-18 14:22:57 +08:00
judy0131 c696d1366a add stop ports name 2018-10-18 13:10:19 +08:00
coco11563 3e69e944a7 Merge remote-tracking branch 'origin/master' 2018-10-18 11:01:15 +08:00
coco11563 b19cb3be9f done the Entity test set 2018-10-18 11:00:55 +08:00
judy0131 46f70e6c6d add more stop info 2018-10-18 09:49:43 +08:00
coco11563 281fbb14ef done the Rdf2Csv stop 2018-10-17 18:14:47 +08:00
judy0131 0d98554e13 fix bug 2018-10-17 18:13:37 +08:00
xiaoxiao a2b4633f45 delete ml group 2018-10-17 17:31:46 +08:00
xiaoxiao 655a6d6eb7 fix an error 2018-10-17 16:00:32 +08:00
xiaoxiao b878944846 add decision tree classification stops 2018-10-17 15:47:25 +08:00
judy0131 2354b9b0c7 Merge remote-tracking branch 'origin/master' 2018-10-17 14:34:40 +08:00
judy0131 890e444141 select nested dataframe 2018-10-17 14:34:22 +08:00
xiaoxiao e0d36774e6 add Logistic Regression Classification stops 2018-10-17 11:07:22 +08:00
judy0131 0639ee8716 1.fix find configurableStop in classpath bug
2.get all stop with groups
2018-10-16 17:27:02 +08:00
yanfqidong0604 c3b8f7d8b4 QiDong Yang 2018-10-16 16:35:45 +08:00
yanfqidong0604 bc0c097908 QiDong Yang 2018-10-16 14:30:54 +08:00
yanfqidong0604 2cab1d07d1 putEs,qurryEs(*),fetchEs(*) 2018-10-16 09:45:06 +08:00
yanfqidong0604 67f64396a0 solr yang qi dong 2018-10-13 11:56:25 +08:00
yanfqidong0604 43ebe287ff csv
yang qi dong
2018-10-13 11:53:10 +08:00
yanfqidong0604 7ba388f396 solr
yang qi dong
2018-10-13 11:31:49 +08:00
yanfqidong0604 13d321d689 solr
yang qi dong
2018-10-12 21:47:51 +08:00
judy0131 7b61f021dc add stop description field 2018-10-12 15:27:09 +08:00
judy0131 7e05a0696e add stop description field 2018-10-12 14:51:46 +08:00
judy0131 b9a4fec6cd add stop description field 2018-10-12 14:51:22 +08:00
judy0131 a884c598eb add json group ,script group and xml group's getPropertyDescriptor api 2018-10-12 14:35:45 +08:00
judy0131 058325d089 add hive group and jdbc group's getPropertyDescriptor api 2018-10-12 14:17:24 +08:00
judy0131 7ebaba5df2 add common group and csv group's getPropertyDescriptor api 2018-10-12 14:01:07 +08:00
xiaoxiao 8427853df0 add NaiveBayesPrediction stop 2018-10-11 10:39:50 +08:00
xiaoxiao 4b854b9a1e add smoothing factor and model persistence for NaiveByesTraining Stop 2018-10-10 15:37:44 +08:00
xiaoxiao 763644cfda Merge branch 'master' of https://github.com/cas-bigdatalab/piflow 2018-10-10 10:49:04 +08:00
xiaoxiao 9550f60d70 add NaiveBayesTraining Stop 2018-10-10 10:42:21 +08:00
judy0131 de2efc9a24 fix bug: getGroups api guava version conflict 2018-10-08 17:56:00 +08:00
judy0131 f577b6ad79 run application on yarn by SparkLauncher(Now checkpoint.path is fixed, we can not read this config in config.properties, because yarn will find config.properties in path /opt/hadoop-2.6.0/tmp/nm-local-dir/usercache/root/appcache/appId/containerId/conf/config.properties) 2018-10-08 13:26:00 +08:00
xiaoxiao a987093467 fix the bug with data consumption from kafka 2018-09-28 16:14:04 +08:00
xiaoxiao 1b63b70c4d alter getGroup method and add authorEmail for stops 2018-09-28 13:31:27 +08:00
xiaoxiao 952f0cefea add ftp group(LoadFromFtp and UploadToFtp) 2018-09-28 13:24:32 +08:00
xiaoxiao 97e2600fb8 fix a bug for ReadFromKafka Stop 2018-09-28 12:47:07 +08:00
xiaoxiao caf919bae5 fix a bug 2018-09-28 10:50:37 +08:00
xiaoxiao 38cd7de567 fix a bug for ReadFromKafka stop 2018-09-28 10:40:57 +08:00
xiaoxiao 54b7a8509b Merge branch 'master' of https://github.com/cas-bigdatalab/piflow 2018-09-27 12:23:48 +08:00
xiaoxiao 9fd6688dd7 add redis and kafka group 2018-09-27 12:20:07 +08:00
xiaoxiao 7ae7e6a888 add redis group(WriteToRedis and ReadFromRedis) 2018-09-27 12:16:24 +08:00
judy0131 dcd8777558 add bundle for getStopInfo 2018-09-27 11:31:24 +08:00
judy0131 6804e91a59 fix bug: can not find user defined Configurable Stop 2018-09-27 11:23:25 +08:00
judy0131 9102bf97bb fix bug: can not find user defined Configurable Stop 2018-09-26 17:49:32 +08:00
judy0131 fd2ef07acd add stop info api 2018-09-25 18:07:34 +08:00
judy0131 e2b2df0418 1.modify ConfigurableStop getGroup api
2.modify findAllConfigurableStop
3.add findAllGroups
2018-09-25 16:31:55 +08:00
judy0131 e6c4b22e5a add stop's author 2018-09-17 14:10:51 +08:00
judy0131 c142c03529 add get stop properties api 2018-09-14 12:58:50 +08:00
judy0131 66b3175ef7 fix bug 2018-09-11 17:13:05 +08:00
judy0131 ac8ad1a92e Merge branch 'master' of https://github.com/cas-bigdatalab/piflow 2018-09-11 14:53:54 +08:00
judy0131 06f9a772dc remove core-site.xml hdfs-site.xml hive-site.xml yarn-site.xml 2018-09-11 14:51:24 +08:00
xiaoxiao bb2ddc60d3 Merge branch 'master' of https://github.com/cas-bigdatalab/piflow 2018-09-11 14:24:50 +08:00
xiaoxiao 3732d0bcef add ftp group(LoadFromFtp and UploadToFtp) 2018-09-11 14:11:46 +08:00
judy0131 60e4f3310e run on yarn 2018-09-10 15:26:51 +08:00
judy0131 88471237ad fix bug: can not run flow when package code into piflow.jar 2018-09-06 17:09:35 +08:00
judy0131 2ab05fccdd 1.modify piflow module,delete piflow-conf 2.deploy jar on server 2018-09-05 17:54:21 +08:00
lj044500 e4d51735cc XmlTesst 2018-09-04 10:48:35 +08:00
lj044500 ed1c4e47fa add folder xml file dispose 2018-09-04 10:46:01 +08:00
lj044500 1250dee09f Process all .xml files in the folder 2018-09-04 10:40:06 +08:00
judy0131 c347710084 fix bug 2018-09-03 14:50:55 +08:00
judy0131 6db6d54e56 add httpClient 2018-08-31 13:07:44 +08:00
xiaoxiao 14ba86908e 8.28 2018-08-28 11:39:22 +08:00
xiaoxiao 590cba100e Merge branch 'master' of https://github.com/cas-bigdatalab/piflow 2018-08-28 11:34:45 +08:00
xiaoxiao 7d7d4e53a8 8.28 2018-08-28 11:34:38 +08:00
xiaoxiao a8bf668aec 8.28 2018-08-28 11:33:46 +08:00
xiaoxiao ec0f29aaf4 8.28 2018-08-28 11:33:30 +08:00
xiaoxiao 7fcd3756db 8.27 2018-08-27 18:50:00 +08:00
xiaoxiao f0637eef9b 8.27 2018-08-27 18:49:42 +08:00
xiaoxiao 4025108491 8.27 2018-08-27 18:49:28 +08:00
xiaoxiao 5b3d2b79fc 8.27 2018-08-27 18:49:09 +08:00
judy0131 e5d9882eaa add startFlow & stopFlow Api 2018-08-23 17:16:28 +08:00
xiaoxiao bf7b5d9ea4 Merge branch 'master' of https://github.com/cas-bigdatalab/piflow 2018-08-23 13:36:17 +08:00
judy0131 221171e388 Merge branch 'master' of https://github.com/cas-bigdatalab/piflow 2018-08-23 13:13:18 +08:00
xiaoxiao 947b606abd 8.16 2018-08-17 11:24:30 +08:00
xiaoxiao e41c826592 8.15 2018-08-15 18:28:14 +08:00
xiaoxiao 235f4087c8 8.15 2018-08-15 16:35:39 +08:00
xiaoxiao 0a6b4190aa 8.14 2018-08-14 13:54:53 +08:00
xiaoxiao 873b8a5586 8.14 2018-08-14 13:54:43 +08:00
xiaoxiao 21b52578d6 8.13 2018-08-14 10:05:32 +08:00
xiaoxiao 35ab13085c 8.13 2018-08-13 17:08:15 +08:00
xiaoxiao 712fcbc0ec 8.13 2018-08-13 17:02:14 +08:00
xiaoxiao 6c82b4c452 8.13 2018-08-13 17:02:00 +08:00
xiaoxiao e7014c206d 8.13 2018-08-13 17:01:49 +08:00
xiaoxiao b2b23ddedf 8.10 2018-08-10 17:26:35 +08:00
xiaoxiao 0d7d203f71 8.10 2018-08-10 10:35:01 +08:00
xiaoxiao 3172e18a9a 8.10 2018-08-10 10:34:31 +08:00
xiaoxiao 3bc955bee8 8.10 2018-08-10 10:34:06 +08:00
xiaoxiao 65d73aa9cb 8.10 2018-08-10 10:33:46 +08:00
xiaoxiao 1b898fd827 8.10 2018-08-10 10:33:21 +08:00
judy0131 f6223baf68 add CsvSave 2018-08-02 17:24:47 +08:00
judy0131 a7a11bd5dc add findAllGroup api 2018-07-31 17:18:48 +08:00
judy0131 ddf12a0e67 add checkpoint 2018-07-31 14:16:56 +08:00
judy0131 20a9bdd42e add checkpoint 2018-07-31 14:15:02 +08:00
judy0131 67d09e945e add fork test 2018-07-31 10:22:18 +08:00
judy0131 898889ab6d add inportCount & outportCount 2018-07-20 16:56:24 +08:00
judy0131 309e14819d delete unuse code 2018-07-18 14:21:51 +08:00
judy0131 faf07b49a0 add DataFrameRowParser Stop for ShellExecutor 2018-07-18 14:04:50 +08:00
judy0131 576fbd039e add parameters for shell script 2018-07-18 10:13:51 +08:00
judy0131 2a2b6bd351 add ShellExecutor Stop 2018-07-17 17:47:46 +08:00
judy0131 8af62a6d5e add Stop Group 2018-07-17 15:23:14 +08:00
judy0131 9f743c3a12 modify Stop name 2018-07-17 14:43:34 +08:00
judy0131 9100fb9e26 add getIcon for ConfigurableStop 2018-07-16 17:18:40 +08:00
judy0131 1440871961 add piflow bundle module 2018-07-13 10:42:48 +08:00