Autoprewarm
PostgreSQL 11 adds a new feature, autoprewarm, to the contrib module pg_prewarm. It automatically warms the shared buffers with the pages they held before the last server restart. To do this, a background worker periodically records the contents of the shared buffers in a file named "autoprewarm.blocks" and reloads those pages after the server restarts.
Now that we know the basic functionality of autoprewarm, let's dive into the details of how to use this feature.
How to enable Autoprewarm?
To enable autoprewarm, add pg_prewarm to "shared_preload_libraries". This parameter requires a server restart to take effect.
postgres=# alter system set shared_preload_libraries = 'pg_prewarm';
ALTER SYSTEM
After the restart, a new background worker process, "autoprewarm master", handles automatic prewarming of the shared buffers.
$ ps -aef | grep autoprewarm
mithuncy 5453 5446 8 02:56 ? 00:01:15 postgres: autoprewarm master
To be precise, "autoprewarm master" periodically records information about the pages in shared buffers in the file "$PGDATA/autoprewarm.blocks". The configuration parameter pg_prewarm.autoprewarm_interval decides how often "autoprewarm.blocks" is updated. When the server restarts, the master reads "autoprewarm.blocks" and sorts the list of pages to be prewarmed. It then launches a worker for each database, one at a time. Each per-database worker, the autoprewarm worker, loads the pages that belong to its database. Once prewarming is complete, the master resumes updating "autoprewarm.blocks" periodically.
$ ps -aef | grep autoprewarm
mithuncy 6377 6370 5 03:50 ? 00:00:00 postgres: autoprewarm master
mithuncy 6393 6370 15 03:50 ? 00:00:00 postgres: autoprewarm worker
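The restart sequence described above (read the file, sort the page list, then hand each database's pages to one worker at a time) can be pictured with a small Python sketch. This is only an illustration of the ordering idea, not the extension's C code. Each record is assumed to be a tuple of (database OID, tablespace OID, relfilenode, fork number, block number); the function name plan_prewarm and the sample values are made up for the example.

```python
from itertools import groupby
from operator import itemgetter
from typing import List, Tuple

# One record per page:
# (database OID, tablespace OID, relfilenode, fork number, block number)
BlockRecord = Tuple[int, int, int, int, int]


def plan_prewarm(records: List[BlockRecord]) -> List[Tuple[int, List[BlockRecord]]]:
    """Sort the records and split them into one batch per database.

    Hypothetical helper: each batch stands in for the work handed to one
    per-database worker, and the batches are processed one after another.
    """
    # Plain tuple ordering compares the database OID first, block number last.
    ordered = sorted(records)
    return [(db, list(batch)) for db, batch in groupby(ordered, key=itemgetter(0))]


if __name__ == "__main__":
    # Made-up records from two databases, deliberately out of order.
    sample = [
        (16384, 1663, 16400, 0, 7),
        (13307, 1663, 16391, 0, 2),
        (16384, 1663, 16400, 0, 3),
        (13307, 1663, 16391, 0, 1),
    ]
    for database_oid, batch in plan_prewarm(sample):
        blocks = [record[4] for record in batch]
        print(f"database {database_oid}: {len(batch)} pages, blocks {blocks}")
```

Sorting before grouping keeps all pages of one database together, which is what lets a single worker connect to one database and load its pages in one pass.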
To see the effects of prewarming, I re-ran the following query after a server restart.
postgres=# explain (analyze, buffers) select count(*) from apw_tests;
QUERY PLAN
------------------------------------------------------------------------------
Aggregate (cost=188.44..188.45 rows=1 width=8) (actual time=1.804..1.804 rows=1 loops=1)
Buffers: shared hit=45
-> Seq Scan on apw_tests (cost=0.00..159.75 rows=11475 width=0) (actual time=0.013..1.071 rows=10000 loops=1)
Buffers: shared hit=45
Here all the buffers are hits: every page was found in the shared buffer cache and none was loaded from disk. Without autoprewarm, those pages have to be read from disk after a restart.
postgres=# explain (analyze, buffers) select count(*) from apw_tests;
QUERY PLAN
------------------------------------------------------------------------------
Aggregate (cost=170.00..170.01 rows=1 width=8) (actual time=1.798..1.798 rows=1 loops=1)
Buffers: shared read=45
-> Seq Scan on apw_tests (cost=0.00..145.00 rows=10000 width=0) (actual time=0.021..1.118 rows=10000 loops=1)
Buffers: shared read=45
Performance
Test setup: pgbench prepared read-only tests with 1 client.
Machine: x86_64 8-core Intel machine with 16GB RAM.
Server setup: shared_buffers = 8GB, pgbench scale factor = 300 (the entire data set fits into shared buffers)
TPS was measured every 5 seconds during the run. In these tests, the system with autoprewarm reached peak performance immediately after the restart. With autoprewarm disabled, it took almost 300 seconds to reach the same peak TPS.
Inside "autoprewarm.blocks"
The contents of the file are in a human-readable format.
<<524288>>
13307,1663,16391,0,524065
13307,1663,16391,0,524066
13307,1663,16391,0,524067
13307,1663,16391,0,524068
13307,1663,16391,0,524069
………………….
The first line gives the total number of pages, and each line after that describes one page. Each page is uniquely identified by the database OID, tablespace OID, relfilenode of the relation, fork number, and block number.
Utility Functions
- autoprewarm_start_worker() RETURNS void
Use this to launch the autoprewarm worker if autoprewarm was not configured at server startup.
- autoprewarm_dump_now() RETURNS int8
This updates autoprewarm.blocks immediately. It is useful when the autoprewarm worker is not currently running but prewarming is expected at the next server restart.
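To check what a dump contains, for example after calling autoprewarm_dump_now(), the file format described earlier can be read with a few lines of Python. The sketch below assumes the layout shown in this post (a "<<N>>" header followed by comma-separated records). The helper names parse_blocks and pages_per_database are hypothetical, and the on-disk format may differ in other PostgreSQL versions.

```python
import re
from collections import Counter
from typing import List, NamedTuple, Tuple


class BlockRecord(NamedTuple):
    database: int
    tablespace: int
    relfilenode: int
    forknum: int
    blocknum: int


def parse_blocks(text: str) -> Tuple[int, List[BlockRecord]]:
    """Parse dump text into (page count from the header, list of records)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty dump")
    header = re.fullmatch(r"<<(\d+)>>", lines[0])
    if header is None:
        raise ValueError(f"unexpected header line: {lines[0]!r}")
    # Every remaining line is assumed to hold exactly five integer fields.
    records = [
        BlockRecord(*(int(field) for field in line.split(",")))
        for line in lines[1:]
    ]
    return int(header.group(1)), records


def pages_per_database(records: List[BlockRecord]) -> Counter:
    """Count how many dumped pages belong to each database OID."""
    return Counter(record.database for record in records)


if __name__ == "__main__":
    # Shortened sample in the same shape as the excerpt above; the header is
    # adjusted to match the three records kept here.
    sample = """<<3>>
13307,1663,16391,0,524065
13307,1663,16391,0,524066
13307,1663,16391,0,524067
"""
    declared, records = parse_blocks(sample)
    print(declared, len(records), dict(pages_per_database(records)))
```

Comparing the declared page count with the number of parsed records is a quick sanity check that the file was read completely.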
Finally, I would like to thank Robert Haas and Amit Kapila, who guided me in this work, and my employer EnterpriseDB, which supported it.