Benchmarking MongoDB from 3.6 to 8.0 - Part 1

  • Introduction

This is a two-part post in which I will share my personal benchmarking results from MongoDB 3.6 to 8.0.

I broke the article into two parts to make the reading more digestible. In this first part, I explain the reasons behind the benchmark and how the process was done, while Part 2 focuses on the results themselves.

If you want to jump to Part 2, you can click on the link below. Otherwise, you can follow along with this first part for more details and to understand the objective, test suite, and how the benchmark was performed.

  • Objective

The objective and motivation are pretty simple, I would say:

Understand if MongoDB's performance has changed over the years (releases).

That’s because this question resurfaces with every new major release: customers regularly get in touch to share concerns about possible performance regressions when moving to a newer release.

Although this information is not always presented in a clear and organized way, other users have performed benchmarks and reported issues about performance loss when moving across releases:

That being said, this benchmark aims to address these concerns by evaluating the performance of different MongoDB versions in a single, controlled environment.

While this test is not intended to be the definitive word on MongoDB performance, it aims to serve as a well-organized effort to illuminate that matter across all the tested versions, contributing to a broader conversation about it.

  • Test Suite

The database server had the following configuration profile:

  • CPU: AMD Ryzen(TM) 7 5800X (8C/16T)
  • RAM: 16 GB DDR4 at 3200 MHz (2 x 8 GB)
  • Disks: 512 GB SSD for the OS and database installation; 512 GB NVMe for the MongoDB dbpath
  • OS: Oracle Linux 8.10
  • Kernel: 4.18.0-553.el8_10.x86_64

The client server had the following configuration profile:

  • CPU: Intel(R) Core(TM) i7-9750H
  • RAM: 32 GB DDR4 at 2666 MHz (2 x 16 GB)
  • OS: Oracle Linux 8.10
  • Kernel: 4.18.0-553.el8_10.x86_64
  • Topology

Single-node Replica Set.

This topology was chosen because the focus isn’t on replication latency, flow control, scatter-gather queries, secondary reads, or any other feature that could influence raw performance. Instead, the objective is to evaluate how MongoDB’s performance has changed across versions without the influence of additional complexities.

  • Configuration

On the OS side of the database server, all the production notes and best practices were applied accordingly:

On the database side, the instance ran with default parameters, with only replica set mode and authorization enabled:
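Such a setup would correspond to a mongod.conf along these lines (a minimal sketch: the paths and the replica set name "rs0" are illustrative, not the exact values used in the test):

```yaml
# Minimal mongod.conf sketch: defaults everywhere, with only
# replication and authorization enabled.
storage:
  dbPath: /data/mongodb            # the NVMe-backed dbpath
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
replication:
  replSetName: rs0                 # single-node replica set
security:
  authorization: enabled
```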

  • Benchmarking tool

The tool used for this test was mongo-perf.

  • Driver version: PyMongo 4.4.0
  • Python version: Python 3.8

As per the compatibility documentation, PyMongo 4.4.0 on Python 3.8 provides the compatibility needed for all server versions from 3.6 to 8.0.

The mongo-perf tool uses Python scripts to conduct specific test runs that evaluate various aspects of performance. Around 34 test case scripts are currently available, but I did not use all of them.

For the tests, the following scripts were selected:

These scripts were selected because, together, they comprehensively test the basics of CRUD operations.

simple_insert.js, simple_update.js, simple_query.js, and simple_remove.js cover the basics of document creation, modification, retrieval, and deletion. complex_update.js adds complexity by involving advanced update operations and multiple indexes. complex_insert.js and partial_index.js evaluate performance under more demanding scenarios, such as sequential and random inserts, with and without contention, involving large values and multikey indexes.

  • Methodology and Final Consideration

This test was conducted at the dawn of MongoDB 8.0’s release, and the latest available patch releases of the other versions were used accordingly:

  • 3.6.23 -> 4.0.28  -> 4.2.25 -> 4.4.29 -> 5.0.29  -> 6.0.18 -> 7.0.14  -> 8.0.0.

Red Hat 8 (via Oracle Linux 8.10) was chosen as the operating system exclusively due to its high compatibility across the tested versions. Later in this section, I detail the tests further, but the diagram below shows how the MongoDB versions were tested.

  • Each release was tested following the upgrade path approach: it all started with 3.6.23.

Here is how the benchmark process took place:

The driver is a shell script that runs the test cases from the mongo-perf tool listed before. From the benchmark tool, the options worth mentioning are:

-t $th:

  • This defines the number of threads to use in the benchmark.
  • The $th variable is dynamically replaced by values in the loop (e.g., 1, 2, 4, 8, 16).

-trialTime 1:

  • This defines the duration of each trial in seconds; in this test, each trial runs for 1 second.

-writeCmd true:

  • This enables the use of write commands (e.g., insert, update, delete) instead of legacy write operations.
  • MongoDB introduced write commands in newer versions for more efficient write operations.

-readCmd true:

  • This enables the use of read commands (e.g., find, getMore) instead of legacy read operations.

-w 1:

  • Specifies the write concern level; w: 1 requires acknowledgment from the primary only.

To summarize the script: each CRUD test ran ten times for each thread count given (1, 2, 4, 8, 16), with each trial lasting one second.

  • For example: simple_insert.js tests different insert operations; each of those operations ran for 1 second, 10 times, using 1 thread. After that sequence finishes, the loop restarts, this time testing the simple_insert.js operations with 2 threads, then 4, and so on until the thread list is exhausted.
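The nested loop described above can be sketched as follows (benchrun.py is mongo-perf's entry point; the file paths, and repeating runs in the driver rather than via a tool option, are my illustrative assumptions, not the exact script used in the test):

```python
# Sketch of the benchmark driver loop: every thread count is combined
# with every selected test case, and each combination is run 10 times.
TESTCASES = [
    "simple_insert.js", "simple_update.js", "simple_query.js",
    "simple_remove.js", "complex_update.js", "complex_insert.js",
    "partial_index.js",
]
THREADS = [1, 2, 4, 8, 16]
RUNS_PER_COMBINATION = 10

def benchmark_commands():
    """Yield one benchrun invocation per thread count / test case / run."""
    for threads in THREADS:
        for testcase in TESTCASES:
            for _ in range(RUNS_PER_COMBINATION):
                yield (
                    f"python benchrun.py -f testcases/{testcase} "
                    f"-t {threads} -trialTime 1 "
                    f"-writeCmd true -readCmd true -w 1"
                )

# Dry run: print the first command that the real script would execute.
print(next(benchmark_commands()))
```

With 5 thread counts, 7 scripts, and 10 runs each, this yields 350 benchrun invocations per MongoDB release.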

After executing all the CRUD scripts mentioned for a release, a data cleanup was performed. Each tested operation produced 10 distinct outputs. The outliers (the fastest and slowest) were removed, and the remaining 8 results were averaged to produce a single value for that operation.

  • For example: simple_insert.js has the operation Insert.EmptyCapped, which tests the insertion of an empty document into a capped collection. We then have 10 distinct results for runs with 1 thread: 547.96, 552.54, 561.64, 562.18, 562.73, 564.83, 567.14, 570.20, 542.68, 567.62. The outliers would be the highest number (570.20) and the lowest number (542.68); the average operations per second would be 560.83.
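That averaging step can be sketched in a few lines of Python (the function name is mine, not part of the mongo-perf tooling):

```python
# Trimmed mean: drop the single fastest and slowest of the 10 runs,
# then average the remaining 8.
def trimmed_average(samples):
    """Average after removing the highest and lowest outlier."""
    if len(samples) < 3:
        raise ValueError("need at least 3 samples to trim outliers")
    trimmed = sorted(samples)[1:-1]  # discard min and max
    return sum(trimmed) / len(trimmed)

# The Insert.EmptyCapped runs from the example above (ops/sec, 1 thread):
runs = [547.96, 552.54, 561.64, 562.18, 562.73,
        564.83, 567.14, 570.20, 542.68, 567.62]
print(round(trimmed_average(runs), 2))  # → 560.83
```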

 

This calculation was applied to every operation at every thread count, and the results were organized into tables, which you can follow in Part 2, where the finalized report is presented.

 

See you there!
