Case Study: RDP Drives Port Digital Transformation


With the development of economic globalization, consumers enjoy a large variety of products that are produced overseas and sold in their local markets. Ports have become an important component of the global economic system. They serve as marine logistics and service centers that transit, store, and deliver imported and exported goods, supporting the economic development of a nation. Container shipping is the standardized method of international freight transportation in the modern world. Since the development of container shipping in the mid-20th century, global shipping volumes have increased tremendously as world trade has flourished. According to Statista, global container throughput reached approximately 802 million 20-foot equivalent units (TEUs) in 2019. As many ports handle ever-higher container throughput, maintaining seamless and efficient operations has become increasingly challenging for traditional port infrastructure.

The good news is that by taking advantage of fast-developing technologies such as big data, AI, the Internet of Things (IoT), and blockchain, ports can start a digital transformation to automate work processes, increase operational efficiency, and build economic competitiveness. A smart port, which emerged in the digital era, is not only a physical location for cargo handling but also an intelligent hub that collects, processes, analyzes, and shares information to enhance collaboration between humans and machines.



Rapids Data has successfully completed a project to upgrade a traditional port in China with an automatic guided vehicle system (AGVS). The main challenges that the port faced before the project began were:

  • A manually operated truck scheduling and dispatching system reduced terminal efficiency

Before the Rapids Data Platform was implemented, the scheduling process at the port was largely controlled by human workers. The old database system would calculate the average operation hours of the cranes and trucks based on historical data. A human coordinator would then schedule truck appointments and send drivers and trucks to the container terminals based on these static, precalculated time intervals. In reality, however, cargo handling conditions vary from day to day, and the actual loading and unloading status was very difficult to communicate effectively. As a result, truck delays and congestion often occurred and created critical problems for port operations.

  • Legacy data systems could not keep up with the volume, velocity, and variety of modern data or bring automation to the port

With the advent of 5G wireless technology in China, network capacity has improved dramatically, with very high data speeds and ultra-low latency. Sensors and actuators embedded in port equipment can connect through the wireless network and send real-time streaming data that reflects the current operational status. However, the unprecedented volume, velocity, and variety of this data created new bottlenecks for the legacy database system deployed with the old port infrastructure. That system was transaction-oriented and could process only certain types and amounts of data in batches, so it could not meet the high-throughput, low-latency requirements of a modern data management system.

What the port needed was an intelligent data management platform that could not only process and analyze massive amounts of data to support day-to-day operational tasks but also capture valuable insights and detect abnormalities, so that humans and machines could respond in real time and keep the automation process running smoothly.



Rapids Data implemented the Rapids Data Platform (RDP) for this project, with a package that includes the Rapids StreamDB, RapidsDB, and Rapids ParallelAI modules. This intelligent big data analytics platform provides a high-performance and robust foundation for the automatic guided vehicle (AGV) scheduling and dispatching system and for terminal automation. The AGVs operated at the port are computer-controlled, battery-powered, wheel-based load carriers that leverage electromagnetic induction technology to run without onboard drivers. Once a cargo vessel arrives at the port, dockside cranes automatically unload the containers from the vessel and load them onto AGVs, which transport them to the container yard for storage.

  • Modularized product system to promote simplicity and cost efficiency

The Rapids Data Platform is a unified real-time big data analytics platform dedicated to real-time AI analysis, action, and adaptation using insights from big data. Its modularized architecture gives a customer the flexibility to choose a personalized package, deploying the combination of RDP modules that fits the actual application needs. The platform offers the expandability and simplicity to help enterprises build future-oriented data pipelines with high performance and cost efficiency.

  • Streaming database to meet the challenge of data velocity

Rapids StreamDB is an in-memory, distributed stream database. It continuously processes and analyzes streaming data sent by sensors embedded in port equipment such as cranes, trucks, AGVs, and container spreaders to detect bottlenecks and ensure smooth cargo handling flows. Information sent by AGV sensors may include loading weight, driving route, driving distance, stop duration, battery capacity, and so forth. Collecting this data in real time is essential to support decision-making. For example, the received data shows whether a vehicle is overloaded or needs a battery replacement.
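As a rough illustration of the kind of check described above, the following Python sketch scans a sequence of AGV sensor readings and flags vehicles that are overloaded or low on battery. It is not Rapids StreamDB code: the `AgvReading` fields, the `MAX_LOAD_KG` and `MIN_BATTERY_PCT` thresholds, and the `check_reading` and `monitor` helpers are all hypothetical names chosen for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Iterator

# Hypothetical thresholds; real limits depend on the vehicle model.
MAX_LOAD_KG = 65_000.0
MIN_BATTERY_PCT = 20.0


@dataclass
class AgvReading:
    """One sensor message from an AGV (field names are assumptions)."""
    vehicle_id: str
    load_kg: float
    battery_pct: float


def check_reading(reading: AgvReading) -> list[str]:
    """Return the alerts raised by a single sensor reading."""
    alerts = []
    if reading.load_kg > MAX_LOAD_KG:
        alerts.append("overloaded")
    if reading.battery_pct < MIN_BATTERY_PCT:
        alerts.append("battery_replacement")
    return alerts


def monitor(stream: Iterable[AgvReading]) -> Iterator[tuple[str, list[str]]]:
    """Yield (vehicle_id, alerts) for every reading that raises an alert."""
    for reading in stream:
        alerts = check_reading(reading)
        if alerts:
            yield reading.vehicle_id, alerts
```

Because `monitor` is a generator, it handles readings one at a time as they arrive rather than waiting for a complete batch, which is the property that matters for streaming data.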

  • In-memory computing technology and easy scale-out architecture to guarantee high performance at scale

RapidsDB is a fully parallel, distributed, in-memory federated query system designed to support complex analytical SQL queries running against a set of heterogeneous data stores. As a distributed, MPP (massively parallel processing), shared-nothing in-memory database, RapidsDB supports horizontal expansion, so cluster capacity can grow with the needs of a business.

  • Fast integration of different types of data across various data sources

Rapids Federation is a core structural technology of RapidsDB. It provides a federated connector system to access various data sources and consolidate various types of data without the need for an expensive and time-consuming ETL (Extract, Transform, Load) process. It enables the integration of static historical data and real-time streaming data, which can be further processed and analyzed by RapidsDB. Ad hoc queries can now be answered in real time, and operation reports, which provide a holistic view of the port operation, can now be generated within milliseconds through BI tools.
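The idea of querying historical and live data together without first copying them into one store can be sketched with Python's standard `sqlite3` module. This is only an analogy, not RapidsDB syntax or its connector API: two separate in-memory SQLite databases stand in for the historical and streaming sources, and the table names, column names, sample rows, and the `cranes_below_baseline` helper are hypothetical.

```python
import sqlite3


def build_demo_connection() -> sqlite3.Connection:
    """Create two separate in-memory stores reachable from one connection."""
    conn = sqlite3.connect(":memory:")  # stands in for the historical store
    # A second, independent in-memory database stands in for the live store.
    conn.execute("ATTACH DATABASE ':memory:' AS live")
    conn.execute(
        "CREATE TABLE main.crane_history (crane_id TEXT, avg_moves_per_hour REAL)"
    )
    conn.execute(
        "CREATE TABLE live.crane_status (crane_id TEXT, moves_last_hour REAL)"
    )
    # Made-up sample rows for illustration only.
    conn.executemany(
        "INSERT INTO main.crane_history VALUES (?, ?)",
        [("QC01", 30.0), ("QC02", 28.0)],
    )
    conn.executemany(
        "INSERT INTO live.crane_status VALUES (?, ?)",
        [("QC01", 31.0), ("QC02", 14.0)],
    )
    return conn


def cranes_below_baseline(conn: sqlite3.Connection, ratio: float = 0.75):
    """Join both stores in one query and return cranes running below baseline."""
    query = """
        SELECT h.crane_id, h.avg_moves_per_hour, s.moves_last_hour
        FROM main.crane_history AS h
        JOIN live.crane_status AS s ON s.crane_id = h.crane_id
        WHERE s.moves_last_hour < ? * h.avg_moves_per_hour
        ORDER BY h.crane_id
    """
    return conn.execute(query, (ratio,)).fetchall()
```

Calling `cranes_below_baseline(build_demo_connection())` runs a single SQL statement that spans both stores, which mirrors, at toy scale, the single-query view over heterogeneous sources that the paragraph above attributes to a federated query layer.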

  • AI-in-database to generate real-time insights and realize automation

ParallelAI is the in-memory, distributed, parallel implementation of the R language and the R operating environment integrated within a RapidsDB cluster. The AGV scheduling and dispatching system built on top of RDP leverages ParallelAI’s machine learning algorithms to determine the best dispatching time and the optimal driving route for the vehicles. It can detect operational abnormalities in real time, enabling the dispatching system to respond to a problem more than 20 times faster than manual operation. Maintenance needs such as battery replacement can also be predicted and scheduled in advance to prevent unplanned downtime. As a result, transportation vehicle delays and congestion inside the container terminals have been prevented, and vehicle turnaround time has been minimized. Vehicle breakdowns and machinery malfunctions have been largely reduced. The occasional traffic accidents caused by careless drivers or confusing intersections have been completely eliminated.
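The source does not describe the algorithms ParallelAI uses for route selection. As a plain stand-in for the general idea of choosing an optimal driving route, the Python sketch below applies Dijkstra's shortest-path algorithm to a small weighted graph. The `YARD` layout, its node names, the travel costs, and the `shortest_route` helper are hypothetical and do not reflect the port's real layout or the system's actual method.

```python
import heapq

# Hypothetical yard layout: node -> {neighbor: travel cost in seconds}.
YARD = {
    "quay": {"lane_a": 40.0, "lane_b": 25.0},
    "lane_a": {"yard": 30.0},
    "lane_b": {"junction": 20.0},
    "junction": {"yard": 35.0},
}


def shortest_route(graph, start, goal):
    """Return (total_cost, path) for the cheapest route, or (inf, []) if none."""
    queue = [(0.0, start, [start])]  # min-heap ordered by accumulated cost
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue  # a cheaper route to this node was already expanded
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []
```

In a live setting the edge costs would presumably be refreshed from current traffic and equipment status rather than fixed in advance, which is where real-time data would feed into the routing decision.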


Value Realization

  • The AGV scheduling and dispatching system significantly increases the container handling efficiency and the productivity of terminal operations.
  • The hourly container loading rate and the annual average number of containers loaded per vessel have improved substantially.
  • The 24/7 operation of the port has also been simplified, which further cuts labor costs while maintaining a safer working environment.
  • Workflow has been optimized based on real-time and historical data.
  • Resources are better managed, enabling the port to provide more value-added services that maximize revenue while satisfying customers’ growing needs.



RDP provides a modern data architecture and data management system that is real-time and agile and that drives the port’s digital transformation. The port now has an upgraded, intelligent big data analytics infrastructure that works like a human brain to support data-driven decision-making and make the port truly smart.