Vartalap: Chat Server Architecture V2
This is third blog post in series of blogs posts, where I will be sharing details about Vartalap. How it started, decision I make while development, technical details, problem with current system and how I am planning to solve it. In this blog I will be sharing the new backend server architecture, I come up with to overcome the existing limitations. If you haven’t already read the first and second blog, I highly recommend reading them. Links to previous blog posts:
Quick Recap: Limitation of Current Architecture
- No proper acknowledgement system for message delivery.
- Only web socket gateway for events, making it difficult for background event syncing.
- Router publishing directly into gateway topic, making message delivery logic to be duplicated among services.
- Group operations are synchronous making it not useable offline.
- Base message template for events.
Architecture V2:
New Architecture update:

In V2 version of the system external interface of the system remains the same, however there are new components are introduced and the responsibilities has been distributed/ transferred within the components.
New Components:
We have two new services
-
Rest Gateway (HTTP)
- Provides the rest endpoint
/sendto client to send the messages via rest API call instead of just relying on the Web Socket.
- Provides the rest endpoint
-
Persistence Storage Rest Endpoint (HTTP)
- Provide the rest endpoint
/messagesto client to fetch the messages.
- Provide the rest endpoint
New Responsibility of existing components:
-
Web Socket Gateway
-
Instead of listening to a
Kafka topicfor messages to send how it will expose an internal endpoint/send. Which will be used to push messages to client. Advantage of using Sync communication (endpoint) instead of Async communication (Kafka topic)- Real time update on message status, if message is sent to client or not.
- Better retry logic.
-
-
Session MS
As
Session MSalready has the user connection state, it is an ideal candidate for handling routing message to gateway’s instead of just act as store.- Route message to respective gateway, include routing offline messages.
- Retry failed message.
-
Handle message acknowledgment.
Instead of using persistence storage service to store every message and delete them when ack is received. It will be more efficient for
session msto wait message ack and act accordingly like retrying or treating it as offline message assession msis storing the message in memory store likeRedisinstead of on disk stores. - Maintain user connection state.
-
Message Router
Message router responsibilities as now slim down, new responsibilities are:
- Message parsing and decoding.
- Forwarding message to respective services like group messages to
group message router, one to one tosession ms, etc.
-
Group Message Router
Group message router now instead of completely relying on
group msfor getting group data, as they both belongs to same category, it can show access data directly from the database. (Note: access is read only).-
Message parsing and decoding.
This includes updating message with users of the group, and decoding message to get relevant info.
-
Forwarding message to respective services like group upsert message to
group ms, chat message tosession ms, etc.
-
-
Persistence Storage Service
- Handle ack message as forwarded by
session ms.
- Handle ack message as forwarded by
Updated Request Flow
Send Message (1 to 1)
Client Side
- Send message against a chat.
- Store message against chat in local store.
- Publish message to web socket. (if connected).
- Publish message using rest endpoint from background service.
- Send message ack against received message.
Server Side
GatewaypublishonMessageevent to message broker to meta info like from.Message routerlisten foronMessage.- Parse the message and extract destination information.
- Publish message to
send messageevent to message broker. Session MSlisten tosend message.- If user is connected.
- It will send message to gateway using
Gatewaysend endpoint. Gatewaytries to send message to user socket.- If failed, return failure response.
Session MSwaits to ack message if not received within a time limit, it retry again.- If retry count exceeds publish message to
offline message. - If Ack received after timeout,
offline messageis published for Ack. - If user is not connected.
- Publish message to
offline message. Push notification mslisten for offline message and send push notification.Persistence mslisten for offline message.- If message is not ack, Store in the database until user comes online.
- If message is ack, remove the stored messages.
Group Chat
Create New Group
Client Side
- Choose the participants and group name.
- Generate an unique group id.
- Save group info in local store.
- Send http request to create group. (if connected).
- Publish message using rest endpoint from background service.
Server Side
- Sync/Online flow will remain the same as old one.
-
Async/Offline flow
GatewaypublishonMessageevent to message broker to meta info like from.Message Routerlisten foronMessageand forward toGroup message Router.Group Message Routerparse message and callGroup MSto create group.Group MSverify participants and create new group.- Publish new
send messagefor participants. - Further flow will be same as 1 to 1 chat.
Update Group
Client Side
- Choose participants to add or remove.
- Save group info in local store.
- Send http request to add/remove from group. (if connected).
- Publish message using rest endpoint from background service.
Server Side
- Sync/Online flow will remain the same as old one.
-
Async/Offline flow
- It will be handled similar to create group flow, with exception that instead
create groupcall by message handle it will beupdate group.
- It will be handled similar to create group flow, with exception that instead
Send Message (Group Message).
Client Side
This will be same as 1 to 1 chat flow.
Server Side
All the steps are similar to 1 to 1 chat flow, other than the following mentioned below
Message Routerpublish message togroup messageforGroup Message Router.Group Message Routerparse the message- Fetch the users for the group
- Update the message META with target users.
- Publish message to
send messageevent to message broker.
Let’s see how new version of the system solves the existing problems.
-
No proper acknowledgement system
Now the client needs to send the ack message against the message id to mark the message as delivered else the system will retry to send the message. If failed it will be treated as offline message and will be stored in database and push via firebase.
-
Only web socket gateway for events, making it difficult for background event syncing.
With new
Rest GatewayandPersistence store rest endpointbackground event syncing is possible. -
Router publishing directly into gateway topic, making message delivery logic to be duplicated among services.
Responsibility to route message to gateway now has been move to
Session MSwhich will act as the common point of contact to send the message to client. -
Group operations are synchronous making it not useable offline.
Group operation as now async using the
Rest GatewayandGroup message routernew update.
I have start working on incorporating the required changes in the system, It will be available soon.
If you have any have suggestions or questions feel free to reach out to me. My contact links are available on my bio page Raman.