How to use MongoDB to build a high-performance, highly available active-active application architecture

As enterprises continue to extend their service windows, a business interruption can be a devastating event for many of them. Deploying applications across multiple data centers has therefore become one of the hottest topics today.

In today's best practices for deploying applications across multiple data centers, the database is usually responsible for handling reads and writes from multiple geographic regions, replicating data changes, and providing the highest possible availability, consistency, and durability.

However, not all technologies make the same trade-offs. For example, one database technology may provide a stronger availability guarantee while offering only weaker consistency and durability guarantees than another.

This article first analyzes the requirements that modern multi-data-center applications place on the database architecture, then discusses the categories of database architecture along with their advantages and disadvantages, and finally examines how MongoDB fits into these categories to deliver an active-active application architecture.

The demand for active-active architectures

When organizations consider deploying applications across multiple data centers (or cloud regions), they usually want an "active-active" architecture, in which application servers in every data center handle all requests concurrently.


Figure 1: Active-active application architecture

As shown in Figure 1, this architecture achieves the following goals:

• Serve requests from around the world through local processing, keeping latency low.
• Maintain high availability even when an entire region goes down.
• Achieve the best utilization of platform resources by handling application requests on servers in multiple data centers in parallel.

The alternative to the active-active architecture is a primary-DR (also known as active-passive) architecture, consisting of a primary data center (region) and one or more disaster recovery (DR) regions (as shown in Figure 2).


Figure 2: Primary-DR architecture

Under normal operating conditions, the primary data center processes requests while the DR site sits idle. If the primary data center fails, the DR site immediately begins processing requests (becoming active at the same time).

Under normal circumstances, data is replicated from the primary data center to the DR site, so that takeover can happen quickly if the primary data center fails.

The industry has not universally agreed on a definition of the active-active architecture, and the primary-DR application architecture described above is sometimes also counted as "active-active."

The distinction lies in whether failover from the primary site to the DR site is fast enough (usually a few seconds) and can be automated (with no human intervention). Under this interpretation, an active-active architecture means application downtime is close to zero.

A common misconception is that an active-active application architecture requires a multi-master database. This is wrong, because it misunderstands the data consistency and durability guarantees that multi-master databases actually provide.

Consistency ensures that previously written results can be read back; durability ensures that committed writes are stored permanently, without conflicting writes or data lost to node failures.

Database requirements for active-active applications

When designing an active-active application architecture, the database tier must meet the following four architectural requirements (in addition, of course, to standard database capabilities: a query language with rich secondary indexing, low-latency access to data, native drivers, comprehensive operational tooling, and so on):

• Performance: low-latency reads and writes, meaning reads and writes are processed on nodes in the application's local data center.
• Durability: writes are replicated to multiple nodes so that the data survives a system failure.
• Consistency: previously written results can be read back, and reads in different regions and on different nodes return the same results.
• Availability: the database must keep running when a node, a data center, or a network connection fails, and recovery from such failures should be as short as possible, typically a few seconds.

Distributed database architecture

For active-active application architectures, there are generally three categories of database architecture:

• Distributed transactions using two-phase commit.
• Multi-master databases, sometimes called "masterless" databases.
• Partitioned (sharded) databases with multiple primaries, where each primary is responsible for a unique partition of the data.

Let's look at the advantages and disadvantages of each.

Distributed transactions with two-phase commit

The distributed-transaction approach updates all nodes that hold a record within a single transaction, instead of writing to one node and then (asynchronously) replicating to the others.

The transaction guarantees that all nodes receive the update; otherwise, if the transaction fails, all nodes roll back to their previous state.

Although the two-phase commit protocol can ensure durability and multi-node consistency, it sacrifices performance.

The protocol requires two rounds of communication among all nodes participating in the transaction: at each phase, requests and acknowledgments must be exchanged to ensure that every node commits the same write at the same time.

When the database nodes are distributed across multiple data centers, query latency often stretches from milliseconds to seconds.

For most applications, especially those whose clients are user devices (mobile devices, web browsers, client applications, and so on), this level of response time is unacceptable.
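To make the cost concrete, here is a minimal two-phase commit coordinator sketch (hypothetical; it assumes participant objects exposing prepare/commit/abort methods, which are not part of any real library). Each phase is a blocking round trip to every participant, so a cross-data-center transaction pays at least two WAN round trips:

```python
# Minimal two-phase commit coordinator sketch (illustrative only).
def two_phase_commit(participants, write):
    # Phase 1 (prepare): one round trip to every node; across data centers,
    # each round trip can cost tens to hundreds of milliseconds.
    votes = [node.prepare(write) for node in participants]
    if all(votes):
        # Phase 2 (commit): a second round trip to every node.
        for node in participants:
            node.commit(write)
        return True
    # Any "no" vote (or timeout) aborts the transaction on all nodes.
    for node in participants:
        node.abort(write)
    return False
```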

Multi-master database

A multi-master database is a distributed database that allows a record to be updated on any one of several cluster nodes. Writes are usually replicated to copies of the record on nodes in multiple data centers.

On the surface, a multi-master database seems to be the ideal solution for implementing an active-active architecture: it allows each application server to read and write a local copy of the data without restriction. It has, however, serious limitations in data consistency.

Two (or more) copies of the same record may be updated concurrently by different sessions in different locations. This produces two different versions of the same record, so the database (and sometimes the application itself) must restore consistency by resolving the conflict.

Common conflict-resolution strategies are "the most recent update wins" and "the record with more revisions wins"; more sophisticated strategies can be used, but they significantly hurt performance.

This also means that, in the window between a write and the completion of conflict resolution, different data centers can read different, conflicting values of the same record.
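As an illustration, a minimal "last write wins" resolver might look like the sketch below (hypothetical; real systems usually compare vector clocks or hybrid logical timestamps rather than raw wall-clock time):

```python
from dataclasses import dataclass

@dataclass
class Version:
    value: str        # the record's payload
    timestamp: float  # wall-clock time of the write

def resolve_last_write_wins(a: Version, b: Version) -> Version:
    # Keep the version with the later timestamp; the losing write is silently
    # discarded, which is exactly the data-loss risk discussed above.
    return a if a.timestamp >= b.timestamp else b
```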

Partitioned (sharded) database

A partitioned database divides the data set into partitions, or shards. Each shard is implemented by a group of servers, each of which holds a complete copy of the partition's data. The key point is that each shard maintains exclusive control over its partition of the data.

At any given time, one server in each shard acts as the primary and the other servers act as its replicas. Reads and writes for the partition are issued to the primary.

If the primary fails for any reason (such as a hardware or network failure), one of the replica servers automatically takes over the primary role.

Each record in the database belongs to exactly one partition and is managed by exactly one shard, so it can be written only by that shard's primary. Mapping each record to a single shard, with a single primary per shard, is what guarantees consistency.

Since a cluster contains multiple shards, it contains multiple primaries (multiple primary partitions), and these primaries can be distributed across the data centers so that writes can occur locally in each data center, as shown in Figure 3:


Figure 3: Partitioned database

A sharded database can be used to implement an active-active application architecture by deploying at least as many shards as there are data centers and placing the primaries so that every data center has at least one, as shown in Figure 4:


Figure 4: Active-active architecture with sharded database


In addition, the shards can be configured so that each shard keeps at least one replica (a copy of the data) in every data center.

For example, Figure 4 depicts a distributed database architecture spanning three data centers:

  • New York (NYC)

  • London (LON)

  • Sydney (SYD)

The cluster has three shards, and each shard has three replicas:

  • The NYC shard has its primary in New York and replicas in London and Sydney.

  • The LON shard has its primary in London and replicas in New York and Sydney.

  • The SYD shard has its primary in Sydney and replicas in London and New York.

In this way, each data center holds a replica of every shard, so local application servers can read the entire data set, and can write locally to the primary of their own shard.

The sharded database can meet the consistency and performance requirements of most use cases. Because reads and writes are served by local servers, performance is very good.

When reading from the primaries, consistency is guaranteed because each record is assigned to exactly one primary.

For example, suppose we have two data centers in the United States, one in New Jersey and one in Oregon. We can partition the data set by geography (East and West) and route East Coast users' traffic to the New Jersey data center, which holds the primary of the East shard, while West Coast users' traffic goes to the Oregon data center, which holds the primary of the West shard.
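The routing logic itself can be very simple. Here is a minimal sketch (the region names and data-center mapping are made up for illustration):

```python
# Hypothetical mapping from a user's region (the shard key) to the data
# center that hosts the primary of that region's shard.
PRIMARY_DATACENTER = {
    "east": "new-jersey",
    "west": "oregon",
}

def route_write(user_region: str) -> str:
    # Writes must go to the primary of the shard that owns this region's data.
    return PRIMARY_DATACENTER[user_region]

print(route_write("east"))  # -> new-jersey
```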

As we can see, the sharded database gives us all the benefits of a multi-master database while avoiding the complexity caused by inconsistent data.

Application servers can read and write to their local primary, and because each primary exclusively owns its own records, no inconsistencies can arise. By contrast, a multi-master solution may lose data and serve inconsistent reads.

Database architecture comparison


Figure 5: Comparison of database architecture

Figure 5 summarizes the advantages and disadvantages of each database architecture against the requirements of active-active applications. When choosing between a multi-master database and a partitioned database, the deciding factor is whether the application can tolerate potentially inconsistent reads and data loss.

If it can, a multi-master database may be slightly easier to deploy. If it cannot, a sharded database is the better choice.

Since inconsistent reads and data loss are unacceptable for most applications, a sharded database is usually the best choice.

MongoDB active-active application

MongoDB is an example of a sharded database architecture. In MongoDB, the grouping of a primary server with its secondary servers is called a replica set, and a replica set provides high availability for each shard.

A mechanism called zone sharding is used to configure the set of data managed by each shard. As mentioned earlier, zone sharding makes geographic partitioning possible.
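As a sketch of what the Figure 4 layout could look like in practice, the zones can be configured with MongoDB's standard sharding admin commands, shown here through PyMongo (the host, database, collection, shard key, shard names, and region values are all assumptions for illustration):

```python
from pymongo import MongoClient
from bson.min_key import MinKey
from bson.max_key import MaxKey

# Connect to a mongos router (hypothetical host name).
client = MongoClient("mongodb://mongos.example.net:27017")
admin = client.admin

# Shard the collection on a region-based shard key (assumed schema).
admin.command("enableSharding", "app")
admin.command("shardCollection", "app.users", key={"region": 1, "userId": 1})

# Assign each shard to a geographic zone (shard names are assumptions).
admin.command("addShardToZone", "shardNYC", zone="NYC")
admin.command("addShardToZone", "shardLON", zone="LON")
admin.command("addShardToZone", "shardSYD", zone="SYD")

# Pin each region's key range to its zone so that the region's primary
# (and therefore its writes) stays in the local data center.
for region, zone in [("US-East", "NYC"), ("EU", "LON"), ("APAC", "SYD")]:
    admin.command(
        "updateZoneKeyRange",
        "app.users",
        min={"region": region, "userId": MinKey()},
        max={"region": region, "userId": MaxKey()},
        zone=zone,
    )
```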

White paper "MongoDB Multi-Data Center Deployments":

https://www.mongodb.com/collateral/mongodb-multi-data-center-deployments?utm_medium=dzone-synd&utm_source=dzone&utm_content=active-application&jmp=dzone-ref

Zone sharding documentation:

https://docs.mongodb.com/manual/tutorial/sharding-segmenting-data-by-location/

The "Partitioned (sharded) database" section above describes the implementation and operational details that apply to MongoDB.

In fact, many organizations, including eBay, YouGov, and Ogilvy & Mather, are using MongoDB to implement active-active application architectures.

In addition to standard sharded database capabilities, MongoDB provides fine-grained control over write durability and read consistency, making it an ideal choice for multi-data-center deployments. For writes, a write concern can be specified to control the durability of the write.

The write concern allows the application to specify how many replicas a write must reach before MongoDB acknowledges it, so that a write can be required to complete on one or more servers in remote data centers. This ensures that database changes are not lost when a node or a data center fails.
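For example, with PyMongo a write can be acknowledged only after it has reached a majority of replica set members (the connection string, collection, and document contents are assumptions):

```python
from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://mongos.example.net:27017")

# Require acknowledgment from a majority of replica set members, journaled,
# before the write is confirmed; wtimeout bounds the wait (milliseconds).
users = client.app.get_collection(
    "users",
    write_concern=WriteConcern(w="majority", j=True, wtimeout=5000),
)
users.insert_one({"region": "US-East", "userId": 1001, "name": "Ada"})
```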

In addition, MongoDB compensates for a potential shortcoming of sharded databases: write availability cannot reach 100%.

Since each record has only one primary, if that primary fails, there is a period during which the partition cannot be written to.

MongoDB greatly shortens this window through retryable writes. By automatically retrying write operations, MongoDB copes with write failures caused by transient problems such as network errors, which greatly reduces the amount of error-handling code applications need.
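Retryable writes are enabled with a single connection option; a minimal PyMongo sketch (hypothetical connection string and document):

```python
from pymongo import MongoClient

# retryWrites=True lets the driver transparently retry a failed write once,
# e.g. against a newly elected primary after a transient error or failover.
client = MongoClient("mongodb://mongos.example.net:27017", retryWrites=True)
client.app.users.update_one(
    {"userId": 1001},
    {"$set": {"lastLogin": "2018-04-09"}},
)
```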

Another distinguishing feature that makes MongoDB well suited to multi-data-center deployments is the speed of its automatic failover.

When a node or data center fails or the network is interrupted, MongoDB fails over within 2 to 5 seconds (depending on its configuration and on the reliability of the network itself).

After a failure, the remaining members of the replica set elect a new primary according to their configuration, and the MongoDB drivers automatically discover the new primary. Once failover is complete, the recovery process automatically carries on with subsequent write operations.

For reads, MongoDB provides two features for specifying the desired level of consistency.

First, when reading from secondaries, the application can specify a maximum staleness value (maxStalenessSeconds).

This ensures that a secondary whose replication lag behind the primary exceeds the specified staleness is never used, so the data returned by a secondary is acceptably fresh.
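In PyMongo this is expressed through the read preference (values are assumptions; note that MongoDB requires maxStalenessSeconds to be at least 90):

```python
from pymongo import MongoClient
from pymongo.read_preferences import SecondaryPreferred

client = MongoClient("mongodb://mongos.example.net:27017")

# Prefer secondaries, but never one lagging more than 90 seconds
# behind the primary.
users = client.app.get_collection(
    "users",
    read_preference=SecondaryPreferred(max_staleness=90),
)
doc = users.find_one({"userId": 1001})
```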

In addition, a read can be associated with a read concern (readConcern) to control the consistency of the data returned by the query.

For example, a readConcern of "majority" tells MongoDB to return only data that has been replicated to a majority of the nodes in the replica set.

This ensures that the query reads only data that cannot be lost to a node or data-center failure, and it gives the application a consistent view of the data over time.
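A PyMongo sketch (names assumed as before):

```python
from pymongo import MongoClient
from pymongo.read_concern import ReadConcern

client = MongoClient("mongodb://mongos.example.net:27017")

# Return only data acknowledged by a majority of replica set members,
# i.e. data that cannot be rolled back after a failover.
users = client.app.get_collection(
    "users",
    read_concern=ReadConcern("majority"),
)
doc = users.find_one({"userId": 1001})
```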

MongoDB 3.6 also introduces "causal consistency," which guarantees that every read operation within a client session sees the effects of the session's previous write operations, no matter which replica is serving the request.

By strictly ordering the operations in a session according to their causal relationships, causal consistency ensures that every read follows logically from what came before, achieving monotonic reads in a distributed system. This is exactly what many multi-node databases cannot satisfy.

Causal consistency lets developers retain the strict data consistency of traditional single-node relational databases while taking full advantage of a scalable, highly available distributed data platform.
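In PyMongo, causal consistency is scoped to a client session; a minimal sketch (names assumed as before):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://mongos.example.net:27017")

# Reads in a causally consistent session are guaranteed to observe the
# session's earlier writes, on whichever replica serves the request.
with client.start_session(causal_consistency=True) as session:
    users = client.app.users
    users.update_one(
        {"userId": 1001}, {"$set": {"plan": "premium"}}, session=session
    )
    # Guaranteed to see the update above, even if served by a secondary
    # in another data center.
    doc = users.find_one({"userId": 1001}, session=session)
```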



Originally published: 2018-04-09. Author: Chen Jun. This article comes from "Data and Cloud" (数据和云), a Yunqi Community partner; for more information, follow the "Data and Cloud" WeChat official account.