![](https://github.com/cas-bigdatalab/piflow/blob/master/doc/piflow.png)

**Piflow** is an easy-to-use, powerful big data pipeline system.

## Table of Contents

- [Features](#features)
- [Requirements](#requirements)
- [Getting Started](#getting-started)
- [Getting Help](#getting-help)
- [Documentation](#documentation)

## Features

- Easy to use
  - Provides a WYSIWYG web interface for configuring data flows
  - Monitors big data flow status
  - Lets you check big data flow logs
  - Provides checkpoints
- Strong scalability
  - Supports custom-developed data processing components
- Superior performance
  - Based on the distributed computing engine Spark
- Powerful
  - 100+ data processing components available
  - Includes Spark, MLlib, Hadoop, Hive, HBase, Solr, Redis, Memcache, Elasticsearch, JDBC, MongoDB, HTTP, FTP, XML, CSV, JSON, etc.

## Requirements

* JDK 1.8 or newer
* Apache Maven 3.1.0 or newer
* Git client (used during the build process by the 'bower' plugin)
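
The minimum versions above can be checked mechanically. The following is only a minimal sketch, not part of Piflow: the helper names `parse_version` and `meets_minimum` are hypothetical, and the code assumes that the first dotted number in a tool's version output is its version (for example `1.8.0` in `java version "1.8.0_181"`).

```python
import re


def parse_version(text):
    """Return the first dotted numeric version found in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(found, minimum):
    """Return True if the version in `found` is at least the version in `minimum`."""
    a, b = parse_version(found), parse_version(minimum)
    # Pad the shorter tuple with zeros so that 3.1 and 3.1.0 compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    # Sample strings only; in practice, paste the real output of each tool.
    print(meets_minimum('java version "1.8.0_181"', "1.8"))
    print(meets_minimum("Apache Maven 3.0.5", "3.1.0"))
```

To use it, pass in the version line printed by `java -version` or `mvn -version` together with the minimum listed above. No minimum Git version is stated here, so Git only needs to be installed and on the `PATH`.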
## Getting Started