![](https://github.com/cas-bigdatalab/piflow/blob/master/doc/piflow.png)
**Piflow** is an easy-to-use, powerful big data pipeline system.
## Table of Contents
- [Features](#features)
- [Requirements](#requirements)
- [Getting Started](#getting-started)
- [Getting Help](#getting-help)
- [Documentation](#documentation)
## Features
- Easy to use
  - Provides a WYSIWYG web interface for configuring data flows
  - Monitors big data flow status
  - Lets you check big data flow logs
  - Provides checkpoints
- Strong scalability
  - Supports custom-developed data processing components
- Superior performance
  - Based on the distributed computing engine Spark
- Powerful
  - 100+ data processing components available
  - Includes Spark, MLlib, Hadoop, Hive, HBase, Solr, Redis, Memcache, Elasticsearch, JDBC, MongoDB, HTTP, FTP, XML, CSV, JSON, etc.
## Requirements
* JDK 1.8 or newer
* Apache Maven 3.1.0 or newer
* Git Client (used during build process by 'bower' plugin)
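
To confirm that the tools listed above are installed, a quick check along the following lines may help. This is only a sketch and not part of the Piflow build: the `version_ge` helper is a hypothetical name introduced here, it assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS versions do), and it assumes the first line of `mvn -v` has the form `Apache Maven <version> ...`.

```shell
#!/usr/bin/env bash

# version_ge A B: succeeds when version A >= version B.
# Hypothetical helper; relies on version-aware sorting via `sort -V`.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print the version banner of each required tool, or note that it is missing.
for tool in java mvn git; do
  if command -v "$tool" >/dev/null 2>&1; then
    case "$tool" in
      java) java -version 2>&1 | head -n 1 ;;  # java prints its version to stderr
      mvn)  mvn -v 2>&1 | head -n 1 ;;
      git)  git --version ;;
    esac
  else
    echo "$tool: not found on PATH"
  fi
done

# Compare the installed Maven version against the 3.1.0 minimum.
# Assumes output of the form "Apache Maven 3.x.y (...)".
mvn_version=$(mvn -v 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')
if [ -n "$mvn_version" ]; then
  if version_ge "$mvn_version" 3.1.0; then
    echo "Maven $mvn_version meets the 3.1.0 minimum"
  else
    echo "Maven $mvn_version is older than 3.1.0"
  fi
fi
```

The JDK version is only printed, not compared, because its version string format differs between releases (for example `1.8.0_x` versus `17.0.x`); check that line by eye against the 1.8 minimum.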
## Getting Started