MQTT Performance Benchmark Testing: EMQX-Kafka Integration
IoT scenarios often involve large numbers of devices, high data generation rates, and huge accumulated data volumes. How to ingest, store, and process this data has therefore become a critical issue.
EMQX, a highly scalable and feature-rich MQTT broker for the IoT, can handle up to 100 million concurrent connections and millions of messages per second in a single cluster. Its built-in Data Integration functionality provides an out-of-the-box solution for seamlessly integrating IoT data with more than 40 cloud services and enterprise systems, including Kafka, SQL, NoSQL, and time-series databases.
This blog series presents benchmark results for these integrations against a single-node EMQX server.
In this first post, we present the benchmark result for the Kafka integration: a single EMQX node processes and bridges 100,000 QoS 1 messages per second to Kafka.
Test Scenario
This benchmark simulates 100,000 MQTT clients connecting to EMQX at a rate of 5,000 connections per second. After all connections are established, each client publishes one QoS 1 message per second with a 1 KB (1,024-byte) payload, and the rule engine forwards every message to Kafka.
- Concurrent connections: 100,000
- Topics: 100,000
- CPS (newly established connections per second): 5,000
- QoS: 1
- Keep alive: 300s
- Payload: 1024 bytes
- Message publish TPS: 100,000/second
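The load implied by these parameters can be worked out with a little arithmetic. The sketch below is only a back-of-the-envelope calculation, not part of the test tooling: the function name `load_profile` is hypothetical, and the bandwidth figure counts payload bytes only, ignoring MQTT and TCP header overhead.

```python
# Back-of-the-envelope load figures derived from the scenario parameters.
CLIENTS = 100_000              # concurrent connections
CONNECT_RATE = 5_000           # newly established connections per second
PAYLOAD_BYTES = 1024           # payload size of each message
MSGS_PER_CLIENT_PER_SEC = 1    # each client publishes once per second


def load_profile(clients, connect_rate, payload_bytes, msgs_per_client_per_sec):
    """Return ramp-up time, message rate, and payload bandwidth for a scenario."""
    messages_per_second = clients * msgs_per_client_per_sec
    return {
        # Time needed to establish all connections at the given rate.
        "ramp_up_seconds": clients / connect_rate,
        # Aggregate publish rate once every client is connected.
        "messages_per_second": messages_per_second,
        # Payload bytes only; protocol headers are not included.
        "payload_mb_per_second": messages_per_second * payload_bytes / 1_000_000,
    }


profile = load_profile(CLIENTS, CONNECT_RATE, PAYLOAD_BYTES, MSGS_PER_CLIENT_PER_SEC)
for name, value in profile.items():
    print(f"{name}: {value}")
```

Under these assumptions, connecting all clients takes about 20 seconds, and the steady-state payload traffic into EMQX is roughly 102 MB per second.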
Testbed
The test environment runs on Alibaba Cloud, and all virtual machines are in the same VPC (virtual private cloud) subnet.
Machine Details
Service | Deployment | Version | OS | CPU | Memory | Cloud host model |
---|---|---|---|---|---|---|
EMQX | single node | 5.1.0 | CentOS 7.8 | 32C | 64G | c6.8xlarge |
Kafka | standalone | 2.6.0 (Scala 2.13) | CentOS 7.8 | 32C | 64G | c6.8xlarge |
Test Tool
XMeter is used in this benchmark test to simulate MQTT clients. XMeter is built on top of JMeter but with enhanced scalability and more capabilities. It provides comprehensive and real-time test reports during the test. Additionally, its built-in monitoring tools are used to track the resource usage of the EMQX and Kafka machines.
XMeter is available as a private (on-premise) deployment and as a public cloud SaaS. In this test, a private XMeter deployment runs in the same VPC as EMQX and Kafka.
Preparation
For detailed steps on configuring the EMQX-Kafka integration, please refer to the EMQX documentation. The three figures below show the Kafka Bridge settings used in this benchmark.
Kafka Bridge & Rule Config
Once the bridge and rule are created, the data flow below appears in the Dashboard.
Kafka Topic
The Kafka topic used in this test has 16 partitions and a replication factor of 1, since Kafka runs in standalone mode.
System Tuning
Please refer to the EMQX documentation for Linux kernel tuning.
Benchmark Results
Observations
- CPU and memory usage remained stable
- Average CPU user: 76%
- Memory used: ~11 GB
- Average publish response time: 3.8 ms
- After the test completed, we compared the Data Bridge statistics in the EMQX Dashboard with the total offset of the Kafka topic reported by the Kafka CLI. The figures matched, confirming that all messages were written to Kafka in real time.
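The offset comparison in the last observation can be sketched as follows. This is a minimal illustration, not the procedure used in the test: it assumes the per-partition end offsets are available as `topic:partition:offset` lines (the format printed by Kafka's `GetOffsetShell` tool), that the topic was empty before the run, and that no records were removed by retention. The topic name `mqtt-bench`, the sample offsets, and the function names are hypothetical.

```python
def total_offset(get_offset_output: str) -> int:
    """Sum the end offsets from lines of the form `topic:partition:offset`."""
    total = 0
    for line in get_offset_output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split from the right: the last two fields are partition and offset.
        _topic, _partition, offset = line.rsplit(":", 2)
        total += int(offset)
    return total


def all_messages_delivered(bridge_success_count: int, get_offset_output: str) -> bool:
    """True when the bridge's success counter equals the records stored in Kafka."""
    return bridge_success_count == total_offset(get_offset_output)


# Hypothetical sample with three partitions (the test topic has 16).
sample = """\
mqtt-bench:0:10
mqtt-bench:1:12
mqtt-bench:2:8
"""

print(total_offset(sample))
print(all_messages_delivered(30, sample))
```

Because the end offset of a partition that started empty equals the number of records appended to it, the sum across partitions gives the total number of messages Kafka received.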
Result Charts
Screenshots of EMQX Dashboard & Rule Engine during the test
The two screenshots above show that both the incoming message rate and the Data Bridge processing rate exceed 100,000 per second, and that all messages matched by the rule are written to Kafka in real time.
Screenshots after the test completed
The screenshots above show that all messages received by EMQX were successfully forwarded to Kafka.
XMeter chart
Wrapping up
From the EMQX rule engine interface, it is easy to integrate EMQX with Kafka. You only need to:
- Set up the MQTT-to-Kafka topic mapping and the MQTT topic/message filtering;
- Select synchronous or asynchronous write mode based on actual use cases;
- Determine the caching mode to prevent data loss from network disturbances or service outages.
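To make the topic filtering in the first step concrete, here is a minimal sketch of how MQTT topic filters with the `+` and `#` wildcards select topics. It illustrates the matching rules from the MQTT specification only and says nothing about how EMQX implements them; the function name and the topic names are hypothetical, and the special handling of topics beginning with `$` is deliberately left out.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Check whether an MQTT topic matches a topic filter.

    Simplified sketch: '+' matches exactly one level, '#' matches the
    parent level and any number of child levels. Topics starting with
    '$' are NOT treated specially here.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # The multi-level wildcard is only valid as the last level.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False  # the filter has more levels than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level does not match
    # Without '#', filter and topic must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


# Hypothetical topics: one per simulated device.
print(topic_matches("bench/#", "bench/device-42"))
print(topic_matches("bench/+/status", "bench/device-42/status"))
print(topic_matches("bench/+", "bench/device-42/status"))
```

A single filter such as `bench/#` is enough to select messages from many per-device MQTT topics and route them to one Kafka topic.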
By leveraging Kafka's strengths in data storage and stream processing, this out-of-the-box solution provides a robust and scalable infrastructure for IoT scenarios. It enables bi-directional communication between devices and enterprise applications, as well as real-time processing of large-scale data. Organizations can make timely, informed decisions, maximize the value of their data, and drive innovation in their business.
This benchmark demonstrates the capability of a single-node EMQX deployment. Next, we'll test an EMQX cluster to demonstrate its ability to bridge 1 million QoS 1 messages per second to Kafka. Stay tuned for the report.