Design Decision: Async Query Based Change Capture Vs Event Based Message Queue
type
status
date
slug
summary
tags
category
icon
password
Tweet
Problem Statement
I have a in-memory cache that needs invalidation when the underlying source data is updated. The underlying source data is stored in a postgres database
Characteristic
- Higher read rate than write rate
- data is rarely update
- read performance is priority
- parallel read and write
Proposal 1 - Async Query Based Change Capture
- For each data row, keep a
updated_at
date field, update that field whenever the data row is updated.
- At every set interval, query the database for
updated_at >= x
- where
x
could be - the last seen latest
updated_at
from the previous query - the time of the previous query
Characteristics
- pull based
- cron
- async
Pros
- simple - no additional infrastructure other than existing database
Cons
- increases database load - compared to an event based push approach
- could be mitigated by using indexes - trade off between write and read latency
- no eventual consistency guarantees between parallel caches - happens when a write updates
updated_at
to an earlier time and a read happens in between theupdated_at
and the actual write x
could be negative offset to reduce chances of inconsistencies - trade off between query performance and inconsistency
- the length of the interval determines the longest possible lag between data update and data syncing across the servers’ in-memory cache
Proposal 2 - Event based message queue
- On database update, sends a message to queue that represents the change event
- consumers consume from the queue and update their cache accordingly
Variations
- consumers: listen vs poll
- write-back vs write-through
- write-back might result in loss of data
- write-through increases write latency
Pros
- takes load off database
- guaranteed eventual consistency
- less lag than proposal 1
Cons
- additional infrastructure
References
- ‣