Dragon Pastoral

凯龘牧歌

Chapter23 - Hotel Reservation System

Posted at # SystemDesign

Hotel Reservation System

In this chapter, we’re designing a hotel reservation system, similar to Marriott International.

Applicable to other types of systems as well - Airbnb, flight reservation, movie ticket booking.

Step 1 - Understand the Problem and Establish Design Scope

Before diving into designing the system, we should ask the interviewer questions to clarify the scope:

Non-functional requirements

Back-of-the-envelope estimation

Let’s estimate the QPS. If we assume that there are three steps to reach the reservation page and there is a 10% conversion rate per page, we can estimate that if there are 3 reservations, then there must be 30 views of reservation page and 300 views of hotel room detail page. qps-estimation

Step 2 - Propose High-Level Design and Get Buy-In

We’ll explore - API Design, Data model, high-level design.

API Design

This API Design focuses on the core endpoints (using RESTful practices), we’ll need in order to support a hotel reservation system.

A fully-fledged system would require a more extensive API with support for searching for rooms based on lots of criteria, but we won’t be focusing on that in this section. Reason is that they aren’t technically challenging, so they’re out of scope. UI elements you “just get” without guidance—think autocomplete, swipe-to-delete, pull-to-refresh, and filter chips.

Hotel-related API

Room-related API

Reservation-related API

Here’s an example request to make a reservation:

{
  "startDate":"2021-04-28",
  "endDate":"2021-04-30",
  "hotelID":"245",
  "roomID":"U12354673389",
  "reservationID":"13422445"
}

Note that the reservationID is an idempotency key to avoid double booking. Details explained in concurrency section

Data model

Before we choose what database to use, let’s consider our access patterns.

We need to support the following queries:

From our estimations, we know the scale of the system is not large, but we need to prepare for traffic surges.

Given this knowledge, we’ll choose a relational database because:

Here is our schema design: schema-design

Most fields are self-explanatory. Only field worth mentioning is the status field which represents the state machine of a given room: status-state-machine

This data model works well for a system like Airbnb, but not for hotels where users don’t reserve a particular room but a room type. They reserve a type of room and a room number is chosen at the point of reservation.

This shortcoming will be addressed in the Improved Data Model section.

High-level Design

We’ve chosen a microservice architecture for this design. It has gained great popularity in recent years: high-level-design

Inter-service communication can be facilitated via a RPC framework, such as gRPC.

Step 3 - Design Deep Dive

Let’s dive deeper into:

Improved data model

As mentioned in a previous section, we need to amend our API and schema to enable reserving a type of room vs. a particular one.

For the reservation API, we no longer reserve a roomID, but we reserve a roomTypeID:

POST /v1/reservations
{
  "startDate":"2021-04-28",
  "endDate":"2021-04-30",
  "hotelID":"245",
  "roomTypeID":"12354673389",
  "roomCount":"3",
  "reservationID":"13422445"
}

Here’s the updated schema: updated-schema

Let’s take a look at the room_type_inventory columns as that table is more interesting:

There are alternative ways to design this table, but having one room per (hotel_id, room_type_id, date) enables easy reservation management and easier queries.

The rows in the table are pre-populated using a daily CRON job.

Sample data:

hotel_idroom_type_iddatetotal_inventorytotal_reserved
21110012021-06-0110080
21110012021-06-0210082
21110012021-06-0310086
2111001
21110012023-05-311000
21110022021-06-0120016
22101012021-06-013023
22101012021-06-023025

Sample SQL query to check the availability of a type of room:

SELECT date, total_inventory, total_reserved
FROM room_type_inventory
WHERE room_type_id = ${roomTypeId} AND hotel_id = ${hotelId}
AND date between ${startDate} and ${endDate}

How to check availability for a specified number of rooms using that data (note that we support overbooking):

if (total_reserved + ${numberOfRoomsToReserve}) <= 110% * total_inventory

Now let’s do some estimation about the storage volume.

73 million rows is not a lot of data and a single database server can handle it. It makes sense, however, to setup read replication (potentially across different zones) to enable high availability.

Follow-up question - if reservation data is too large for a single database, what would you do?

Concurrency issues

Another important problem to address is double booking.

There are two issues to address:

Here’s a visualization of the first problem: double-booking-single-user

There are two approaches to solving this problem:

Here’s how this flow works:

What if there are multiple users making the same reservation? double-booking-multiple-users

This problem can be solved using some form of locking mechanism:

Here’s the SQL we use to reserve a room:

# step 1: check room inventory
SELECT date, total_inventory, total_reserved
FROM room_type_inventory
WHERE room_type_id = ${roomTypeId} AND hotel_id = ${hotelId}
AND date between ${startDate} and ${endDate}

# For every entry returned from step 1
if((total_reserved + ${numberOfRoomsToReserve}) > 110% * total_inventory) {
  Rollback
}

# step 2: reserve rooms
UPDATE room_type_inventory
SET total_reserved = total_reserved + ${numberOfRoomsToReserve}
WHERE room_type_id = ${roomTypeId}
AND date between ${startDate} and ${endDate}

Commit

Option 1: Pessimistic locking

Pessimistic locking prevents simultaneous updates by putting a lock on a record while it’s being updated.

This can be done in MySQL by using the SELECT... FOR UPDATE query, which locks the rows selected by the query until the transaction is committed. pessimistic-locking

Pros:

Cons:

The author doesn’t recommend this approach due to its scalability issues.

Option 2: Optimistic locking

Optimistic locking allows multiple users to attempt to update a record at the same time.

There are two common ways to implement it - version numbers and timestamps. Version numbers are recommended as server clocks can be inaccurate. optimistic-locking

Optimistic locking is usually faster than pessimistic locking as we’re not locking the database. Its performance tends to degrade when concurrency is high, however, as that leads to a lot of rollbacks.

Pros:

Cons:

Optimistic locking is a good option for our system as reservation QPS is not extremely high.

Option 3: Database constraints

This approach is very similar to optimistic locking, but the guardrails are implemented using a database constraint:

CONSTRAINT `check_room_count` CHECK((`total_inventory - total_reserved` >= 0))

database-constraint

Pros:

Cons:

This is another good option for a hotel reservation system due to its ease of implementation.

Scalability

Usually, the load of a hotel reservation system is not high.

However, the interviewer might ask you how you’d handle a situation where the system gets adopted for a larger, popular travel site such as booking.com In that case, QPS can be 1000 times larger.

When there is such a situation, it is important to understand where our bottlenecks are. All the services are stateless, so they can be easily scaled via replication.

The database, however, is stateful and it’s not as obvious how it can get scaled.

One way to scale it is by implementing database sharding - we can split the data across multiple databases, where each of them contain a portion of the data.

We can shard based on hotel_id as all queries filter based on it. Assuming, QPS is 30,000, after sharding the database in 16 shards, each shard handles 1875 QPS, which is within a single MySQL cluster’s load capacity. database-sharding

We can also utilize caching for room inventory and reservations via Redis. We can set TTL so that old data can expire for days which are past. inventory-cache

The way we store an inventory is based on the hotel_id, room_type_id and date:

key: hotelID_roomTypeID_{date}
value: the number of available rooms for the given hotel ID, room type ID and date.

Data consistency happens async and is managed by using a CDC streaming mechanism - database changes are read and applied to a separate system. Debezium is a popular option for synchronizing database changes with Redis.

Using such a mechanism, there is a possibility that the cache and database are inconsistent for some time. This is fine in our case because the database will prevent us from making an invalid reservation.

This will cause some issue on the UI as a user would have to refresh the page to see that “there are no more rooms left”, but that is something which can happen regardless of this issue if eg a person hesitates a lot before making a reservation.

Caching pros:

Caching cons:

Data consistency among services

A monolithic application enables us to use a shared relational database for ensuring data consistency.

In our microservice design, we chose a hybrid approach where some services are separate, but the reservation and inventory APIs are handled by the same servicefor the reservation and inventory APIs.

This is done because we want to leverage the relational database’s ACID guarantees to ensure consistency.

However, the interviewer might challenge this approach as it’s not a pure microservice architecture, where each service has a dedicated database: microservices-vs-monolith

This can lead to consistency issues. In a monolithic server, we can leverage a relational DBs transaction capabilities to implement atomic operations: atomicity-monolith

It’s more challenging, however, to guarantee this atomicity when the operation spans across multiple services: microservice-non-atomic-operation

There are some well-known techniques to handle these data inconsistencies:

It’s worth noting that addressing data inconsistencies across microservices is a challenging problem, which raise the system complexity. It is good to consider whether the cost is worth it, given our more pragmatic approach of encapsulating dependent operations within the same relational database.

Step 4 - Wrap Up

We presented a design for a hotel reservation system.

These are the steps we went through: