A detailed guide to using pg_stat_replication in PostgreSQL

(Editor: jimmy, Date: 2025/1/10)

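pg_stat_replication is a system view used mainly to monitor a streaming-replication setup, and it is the first place to look when checking replication health. (Note: this article uses PostgreSQL 10. In versions before 10 some column names differ; for example, the *_lsn columns were named *_location.) The view contains the following information: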

\d pg_stat_replication

[Figure: output of \d pg_stat_replication; image not preserved]

Meaning of each column:

[Figure: image not preserved]

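On Linux, the process listing shows not only the role of each process (in this case, wal sender) but also the name of the connecting user and the related network connection information. In the original figure, someone has connected to the master from 192.168.47.127 (the client_addr column of pg_stat_replication) through port 51519 (the client_port column of pg_stat_replication).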

bonus:

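As mentioned above, replay_lsn is the last transaction-log (WAL) position replayed on the slave.

The pg_current_wal_lsn() function returns the current WAL write position.

The pg_wal_lsn_diff() function computes the difference, in bytes, between two WAL positions.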
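To make the arithmetic behind pg_wal_lsn_diff() concrete, here is a minimal Python sketch. It relies on the fact that an LSN is displayed as two hexadecimal numbers separated by a slash (the high and low 32 bits of a 64-bit byte position). The function names parse_lsn, lsn_diff_bytes and lsn_diff_mb, as well as the sample LSN values, are hypothetical and only serve as illustration; the server does this work itself.

```python
def parse_lsn(lsn: str) -> int:
    """Convert an LSN in text form 'hi/lo' (two hex numbers) to a byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_diff_bytes(lsn_a: str, lsn_b: str) -> int:
    """Byte distance lsn_a - lsn_b, the quantity pg_wal_lsn_diff() reports."""
    return parse_lsn(lsn_a) - parse_lsn(lsn_b)


def lsn_diff_mb(lsn_a: str, lsn_b: str) -> float:
    """Same distance expressed in MB, matching the / (1024 * 1024) in the query."""
    return lsn_diff_bytes(lsn_a, lsn_b) / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical values: current WAL position on the master vs. replay_lsn.
    print(lsn_diff_mb("0/4000000", "0/3000000"))
```

A positive result means the slave has not yet replayed that much WAL; a result of 0 means it has caught up with the position that was sampled.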

So, in a high-availability setup, we can get the replication delay of a slave with the following query, run on the master:

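 -- The two %s placeholders are filled in by the calling program with the
 -- slave's client_addr and application_name.
 SELECT
   pg_wal_lsn_diff(A.c1, replay_lsn) / (1024 * 1024) AS slave_latency_MB
 FROM
   pg_stat_replication,
   pg_current_wal_lsn() AS A(c1)
 WHERE client_addr = '%s' AND application_name = '%s'
 ORDER BY
   slave_latency_MB
 LIMIT 1;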

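Addendum: an introduction to sync_state in PostgreSQL's pg_stat_replication

Since synchronous replication was introduced (in PostgreSQL 9.1), the sync_state column of pg_stat_replication has had three states. (PostgreSQL 10 added a fourth, quorum, for quorum-based synchronous replication; it is not covered by the test below.)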

sync

async

potential

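These denote, respectively, a synchronous standby, an asynchronous standby, and a standby that can be promoted to synchronous.

The state is produced by the following function: pg_stat_get_wal_senders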

[Test]

Environment:

1 primary, 3 standbys.

Configuration 1:

Primary configuration

postgresql.conf
synchronous_standby_names = 'test1,test2,test3'

standby1 configuration

primary_conninfo = 'application_name=test1 host=127.0.0.1 port=1999 user=postgres keepalives_idle=60'

standby2 configuration

primary_conninfo = 'application_name=test2 host=127.0.0.1 port=1999 user=postgres keepalives_idle=60'

standby3 configuration

primary_conninfo = 'application_name=test3 host=127.0.0.1 port=1999 user=postgres keepalives_idle=60'

Query on the primary

digoal=# select pid,application_name,client_addr,sync_state from pg_stat_replication;
 pid  | application_name | client_addr | sync_state 
------+------------------+-------------+------------
 6311 | test1            | 127.0.0.1   | sync
 6321 | test2            | 127.0.0.1   | potential
 6391 | test3            | 127.0.0.1   | potential
(3 rows)

If the sync node goes down, the first potential node, in the order listed in synchronous_standby_names, becomes sync.

pg_ctl stop -m fast -D /pgdata11999
digoal=# select pid,application_name,client_addr,sync_state from pg_stat_replication;
 pid  | application_name | client_addr | sync_state 
------+------------------+-------------+------------
 6564 | test2            | 127.0.0.1   | sync
 6568 | test3            | 127.0.0.1   | potential
(2 rows)

Once test1 is started again, it returns to the sync state.

pg93@db-172-16-3-33-> pg_ctl start -D /pgdata11999
server starting
digoal=# select pid,application_name,client_addr,sync_state from pg_stat_replication;
 pid  | application_name | client_addr | sync_state 
------+------------------+-------------+------------
 6564 | test2            | 127.0.0.1   | potential
 6605 | test1            | 127.0.0.1   | sync
 6568 | test3            | 127.0.0.1   | potential
(3 rows)

Configuration 2:

Primary configuration

synchronous_standby_names = 'test1,test2'

standby1 configuration: unchanged

standby2 configuration: unchanged

standby3 configuration: unchanged

Query on the primary

digoal=# select pid,application_name,client_addr,sync_state from pg_stat_replication;
 pid  | application_name | client_addr | sync_state 
------+------------------+-------------+------------
 6470 | test1            | 127.0.0.1   | sync
 6472 | test3            | 127.0.0.1   | async
 6474 | test2            | 127.0.0.1   | potential
(3 rows)

test3 is now async, because it is not listed in the primary's synchronous_standby_names.

Configuration 3:

Primary configuration

synchronous_standby_names = 'test1'

standby1 configuration: unchanged

standby2 configuration: unchanged

standby3 configuration: unchanged

Query on the primary

digoal=# select pid,application_name,client_addr,sync_state from pg_stat_replication;
 pid  | application_name | client_addr | sync_state 
------+------------------+-------------+------------
 6519 | test2            | 127.0.0.1   | async
 6521 | test3            | 127.0.0.1   | async
 6523 | test1            | 127.0.0.1   | sync
(3 rows)

test2 and test3 are now async, because neither is listed in the primary's synchronous_standby_names.

1. src/backend/replication/walsender.c

/*
 * Returns activity of walsenders, including pids and xlog locations sent to
 * standby servers.
 */
Datum
pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
{
... (omitted)
        /*
         * More easily understood version of standby state. This is purely
         * informational, not different from priority.
         */
        if (sync_priority[i] == 0)
            values[7] = CStringGetTextDatum("async");
        else if (i == sync_standby)
            values[7] = CStringGetTextDatum("sync");
        else
            values[7] = CStringGetTextDatum("potential");
... (omitted)
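The branch above, together with the three test configurations, can be mimicked in a few lines of Python. This is a simplified sketch, not the server's implementation. It assumes a plain comma-separated synchronous_standby_names, takes a standby's priority to be its 1-based position in that list (0 if absent), and treats the connected standby with the lowest non-zero priority as the single sync standby. It ignores the * wildcard, the FIRST/ANY syntax of later versions, and the walsender's streaming state. The names sync_priorities and classify are hypothetical.

```python
from typing import Dict, List, Optional


def sync_priorities(standby_names: str, connected: List[str]) -> Dict[str, int]:
    """Priority = 1-based position in synchronous_standby_names, 0 if not listed."""
    names = [n.strip() for n in standby_names.split(",") if n.strip()]
    return {app: (names.index(app) + 1 if app in names else 0) for app in connected}


def classify(standby_names: str, connected: List[str]) -> Dict[str, str]:
    """Assign async / sync / potential to each connected standby."""
    prio = sync_priorities(standby_names, connected)
    candidates = [app for app in connected if prio[app] > 0]
    # Assumption: the listed standby with the lowest priority number is the sync one.
    sync_standby: Optional[str] = (
        min(candidates, key=lambda app: prio[app]) if candidates else None
    )
    states = {}
    for app in connected:
        if prio[app] == 0:
            states[app] = "async"
        elif app == sync_standby:
            states[app] = "sync"
        else:
            states[app] = "potential"
    return states


if __name__ == "__main__":
    all_up = ["test1", "test2", "test3"]
    print(classify("test1,test2,test3", all_up))            # configuration 1
    print(classify("test1,test2,test3", ["test2", "test3"]))  # test1 stopped
    print(classify("test1,test2", all_up))                  # configuration 2
    print(classify("test1", all_up))                        # configuration 3
```

Under these assumptions the sketch yields the same sync_state values as the psql output in each of the configurations tested above, including the case where test1 is stopped and test2 takes over as sync.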

The above is based on personal experience and is offered as a reference. If anything is wrong or incomplete, corrections are welcome.