Built on open source Apache Hadoop, Rapids Hadoop helps enterprises stand up data lakes in a short time by shipping compact, strictly size-controlled installation packages. Hadoop comes preinstalled with optimized configurations, so large-scale clusters can be deployed quickly.
Through the federated HDFS Connector, data in CSV (delimited), Parquet, and ORC formats can be extracted from Hadoop and consolidated with other federated data sources, including streaming data, for real-time big data analysis in RapidsDB.
- Based on open source Hadoop technology
- Supports batch processing and interactive SQL queries
- Supports real-time analysis of heterogeneous big data
- Provides various SQL-on-Hadoop analytical application tools
- Supports cloud computing configurations including IaaS, YARN, and Mesos
- Integrates with and authenticates various open source ETL and BI applications
- Provides a rich collection of APIs