PostgreSQL - setup, admin and developer notes
Overview
These are notes that I have been keeping since 2023. I hope you find them useful.
Installation
Installation on Ubuntu
-
Point to the latest repository:
12.Oct.2025: OLD installation steps
sudo sh -c 'echo "deb https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt update
apt show postgresql
Notes: watch out for the trailing dash symbol in the wget line!
Once you have confirmed that the latest version is shown, you may start the installation with the following command:
sudo apt install postgresql postgresql-contrib
12.Oct.2025: NEW installation guide: https://www.postgresql.org/download/linux/ubuntu/
sudo apt install -y postgresql-common
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
sudo apt install postgresql-18
Then, continue with configuring pg_hba.conf as explained in the 'Security settings' section below.
-
To view the daemon status:
sudo systemctl status post*
Notes: Postgres has 2 daemons. The 'cluster' daemon is the real process that handles the database requests.
-
To restart the daemon:
sudo systemctl restart postgresql -
To access Postgres:
First, you have to switch to the 'postgres' OS user ID. Then, run the psql command to access Postgres:
sudo -i -u postgres
psql
Then, check the Postgres version:
select version();
Make sure that the database encoding is UTF-8. This should be the default; if it is not, you will have to find a way to change it.
\l
-
To remove Postgres from Ubuntu:
sudo apt-get --purge remove postgresql postgresql-*
Installation on Windows
-
Run Powershell without admin permission.
-
To setup the server:
cd 'd:\pgsql16\bin'
.\initdb.exe -D 'd:\pgsql16\data' --encoding=utf8
Another way to do the same thing:
pg_ctl -D 'd:\pgsql16\data' initdb
-
To create database
cd 'd:\pgsql16\bin'
createdb mydb
Using the CLI to include the encoding:
createdb --encoding=utf8 --owner=my_user_id my_db1
In case the above fails with the following message:
createdb: error: database creation failed: ERROR: new encoding (UTF8) is incompatible with the encoding of the template database (WIN1252) HINT: Use the same encoding as in the template database, or use template0 as template.
you will have to specify template0:
createdb -E UTF8 -O my_user_id -T template0 myDB
Using SQL:
CREATE DATABASE my_db1 OWNER=my_user_id ENCODING='UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8' IS_TEMPLATE = False;
References:
https://www.postgresql.org/docs/9.6/tutorial-createdb.html
http://mysqltopgsql.com/post/unicode/
-
To change the default encoding after the setup:
UPDATE pg_database SET datistemplate=FALSE WHERE datname='template1';
DROP DATABASE template1;
CREATE DATABASE template1 WITH owner=postgres template=template0 encoding='UTF8';
UPDATE pg_database SET datistemplate=TRUE WHERE datname='template1';
References: https://coderwall.com/p/j-_mia/make-postgres-default-to-utf8
Running a server instance
Notes: you may download the latest installer (from the link below) and install it. In that case, you may skip the following manual steps.
https://www.postgresql.org/download/windows/
If you want to try to set it up manually so that you can understand more about the Postgres runtime, please follow the steps below:
-
To start the server:
cd 'd:\pgsql16\bin'
pg_ctl -D 'd:\pgsql16\data' --log='d:\pgsql16\log\my1.log' start
or
.\pg_ctl -D 'd:\pgsql16\data' -l logfile start
Notes:
- For logging, search for 'log_destination' in 'postgresql.conf'.
Ref: file:///d:\pgsql16\doc/postgresql/html/creating-cluster.html file:///d:\pgsql16\doc/postgresql/html/app-postgres.html file:///d:\pgsql16\doc/postgresql/html/server-start.html
-
To register pgsql as Windows Service manually,
pg_ctl register -D 'd:\pgsql16\data' -N pgsql_01 -U localsystem -e 'pgsql_01'
With 'localsystem', Postgres will have full access to the entire server. Another user ID is "NT AUTHORITY\NetworkService", which is able to access network services.
.\pg_ctl.exe register -D "D:\pgsql16\data" -N "pgsql16" -U "NT AUTHORITY\NetworkService" -e "pgsql16"
-
Log to file:
c:\pgsql16\bin\pg_ctl register -D 'd:\pgsql16\data' -N postgresql-x64-16 -U my_user_id -o 'C:\pgsql16\data\log'
-
-
To unregister the Windows Service,
pg_ctl unregister -N pgsql_01
Using cli
-
To access the database
-
To start the server.
pg_ctl -D 'd:\pgsql16\data' --log='d:\pgsql16\log' start
OR simply
cd d:\pgsql
start-pgsql.bat
-
Run the following command to access the cli,
cd 'd:\pgsql16\bin' psql -d mydb
-
-
To look for help
mydb=> \h -
To find out the log location
SHOW log_destination; -
To switch database in psql
\c mydb2 -
To list all schemas,
SELECT schema_name FROM information_schema.schemata; -
To view the encoding
show server_encoding;
show client_encoding;
-
To set the communication encoding:
set client_encoding to utf8; -
Find out the config file location
SHOW config_file; -
Find out the data location
SHOW data_directory; -
In Powershell, to change the current code page to UTF-8 (before running psql):
chcp 65001
psql -d mydb
After that, set the communication encoding to utf8:
set client_encoding to utf8;
Without executing the above command, the following error will appear when displaying Chinese characters:
ERROR: character with byte sequence 0xe5 0xa6 0x96 in encoding "UTF8" has no equivalent in encoding "WIN1252"
In Powershell, the display font must be 'MS Gothic'. This will display the Chinese characters correctly.
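The error above is an encoding mismatch, not data corruption: the byte sequence 0xe5 0xa6 0x96 is a valid UTF-8 CJK character that simply has no slot in WIN1252. A standalone Python sketch (my own illustration, independent of Postgres) reproduces the failure:

```python
# The exact bytes from the error message above.
raw = bytes([0xE5, 0xA6, 0x96])

# In UTF-8 these three bytes decode to a single CJK character...
ch = raw.decode("utf-8")
print(ch)  # 妖

# ...but WIN1252 (cp1252) has no representation for it, so re-encoding
# fails -- the same mismatch psql hits when client_encoding is WIN1252.
try:
    ch.encode("cp1252")
except UnicodeEncodeError as e:
    print("no equivalent in cp1252:", e.reason)
```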
-
To execute shell command. For example, view the current directory content:
\! ls -
To execute a SQL script file in the current directory:
\i fn_my_function.sql
This command has another usage: if your database backup was dumped into a SQL script file, you will have to use this command to rebuild the database.
Security settings
-
For the detail explanations, visit the following URL:
https://www.postgresql.org/docs/current/auth-pg-hba-conf.html
-
To find the location, run the following query in psql:
show hba_file;
The result on my server:
/etc/postgresql/17/main/pg_hba.conf -
This file has 5 columns as shown below:
# TYPE DATABASE USER ADDRESS METHOD
- TYPE: 'local' (Unix-domain socket on the Postgres server) or 'host' (TCP/IP, including remote computers).
- METHOD: 'trust' means skip asking for the password. 'scram-sha-256' means ask for the password and protect it with a challenge-response exchange during transmission.
-
To allow the 'my_user_id' database user to log on with a password from the local computer, add a new line:
# "local" is for Unix domain socket connections only
local all my_user_id scram-sha-256
local all all trust
-
To allow the 'my_user_id' database user to log on with a password from a remote computer (remote IP address: 192.168.68.54), add a new line:
# IPv4 local connections:
host all my_user_id 192.168.68.54/32 scram-sha-256
-
To allow any database user from any remote computer, add a '0.0.0.0/0' line with 'scram-sha-256':
# IPv4 local connections:
host all all 127.0.0.1/32 trust
host all all 0.0.0.0/0 scram-sha-256
-
Notes: 'md5' has known weaknesses. Use 'scram-sha-256', which is more secure.
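To build intuition for how the ADDRESS column is matched, here is a simplified Python sketch of first-match-wins CIDR matching for 'host' lines (my own toy model using the stdlib ipaddress module, not actual Postgres code; real pg_hba matching also checks the TYPE, DATABASE and USER columns):

```python
import ipaddress

# Toy 'host' rules mirroring the example above: (database, user, address, method).
rules = [
    ("all", "all", "127.0.0.1/32", "trust"),
    ("all", "all", "0.0.0.0/0",    "scram-sha-256"),
]

def method_for(client_ip):
    """Return the auth method of the FIRST rule whose CIDR contains the client IP.

    Like pg_hba.conf, order matters: the first matching record wins.
    """
    ip = ipaddress.ip_address(client_ip)
    for _db, _user, cidr, method in rules:
        if ip in ipaddress.ip_network(cidr):
            return method
    return None  # no matching record -> connection is rejected

print(method_for("127.0.0.1"))      # trust (first rule wins)
print(method_for("192.168.68.54"))  # scram-sha-256 (catch-all rule)
```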
-
In postgresql.conf, make Postgres listen on all addresses that have been assigned:
listen_addresses = '*'
-
ODBC driver
Server admin
-
Show database server version
select version();
-
To view database size
select pg_size_pretty(pg_database_size('mydb')); -
Show all databases
select * from pg_database; -
Show all tables
select * from pg_catalog.pg_tables;
To view the user tables only:
select * from pg_catalog.pg_stat_user_tables;
Backup and restore
-
To do a hot/online backup of "all" databases (pg_basebackup is not able to back up a single database):
.\pg_basebackup.exe -D 'D:\pgsql16\backup' -X fetch -Ft --compress=gzip:9 -
To do a hot/online backup of a single database, run the following command:
.\pg_dump.exe -d mydb -f 'd:\pgsql16\backup\mydb.sql' -
Back up the database in directory format ('-Fd') in parallel ('-j2': 2 workers):
.\pg_dump.exe -Fd -d mydb -j2 -f 'd:\pgsql16\backup\mydb'
To create a Wednesday backup, you will have to specify another directory:
.\pg_dump.exe -Fd -d mydb -j2 -f 'd:\pgsql16\backup\mydb-wed'
You may use /tools/db-script/db-backup.bat to do the job. For example, to back up mydb2, the script will create a backup directory ({yyyymmdd}-0) in D:\pgsql16\backup\. A second run on the same day will create another directory, {yyyymmdd}-1. So, it is safe to run it multiple times on the same day.
cd tools/db-script/
./db-backup.bat mydb2
Notes: this db-backup.bat does not remove any old backups!
-
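The {yyyymmdd}-N naming can be sketched as a small Python helper (my own re-implementation of the idea, not the actual db-backup.bat): it probes for the first suffix that does not exist yet, so repeated runs on the same day never overwrite an earlier backup.

```python
import os
from datetime import date

def next_backup_dir(base, today=None):
    """Return the first '{yyyymmdd}-N' path under `base` that doesn't exist yet."""
    stamp = (today or date.today()).strftime("%Y%m%d")
    n = 0
    while os.path.exists(os.path.join(base, f"{stamp}-{n}")):
        n += 1  # that suffix is taken; probe the next one
    return os.path.join(base, f"{stamp}-{n}")

# First run of the day yields ...\20250101-0, the second run -1, and so on.
target = next_backup_dir(r"D:\pgsql16\backup")
```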
23.Oct.2024
Disabled the compression ('-Z0') in pg_dump and compressed the 'dumpdir' manually. Uses '-j 10' because the server has 10 cores.
$ pg_dump -Z0 -j 10 -Fd database_name -f dumpdir
$ tar -cf - dumpdir | pigz -p 10 > dumpdir.tar.gz
$ rm -r dumpdir
The time has dropped from ~70 minutes to ~5 minutes (for a 100GB database with some bytea fields storing images). Quite amazing.
You can restore it like this:
$ mkdir -p dumpdir
$ pigz -p 10 -dc dumpdir.tar.gz | tar -C dumpdir --strip-components 1 -xf -
$ pg_restore -j 10 -Fd -O -d database_name dumpdir
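The same pack/unpack round-trip can be sketched with Python's standard tarfile module (single-threaded gzip here, standing in for the parallel pigz pipeline; a sketch of the idea, not a replacement for it):

```python
import os
import tarfile

def pack(dumpdir, archive):
    """Rough equivalent of: tar -cf - dumpdir | gzip > archive"""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(dumpdir, arcname=os.path.basename(dumpdir))

def unpack(archive, dest):
    """Rough equivalent of: gzip -dc archive | tar -x -C dest"""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
```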
-
-
Restore 'mydb' as 'mydb2':
.\dropdb.exe mydb2
.\createdb.exe mydb2
.\pg_restore.exe -Fd -j2 --no-owner -d mydb2 'd:\pgsql16\backup\mydb'
- Skip 'dropdb.exe' in case the database does not exist.
- --no-owner means skip setting the object owner.
- -d is the database to restore into.
- The last parameter is the directory that contains the backup (created with the -Fd parameter).
Notes: drop the database and create a new one before restoring the database.
-
In case you are migrating the database from an older Postgres version to a newer version, it is best to dump the database into SQL script format and then restore it in the newer version. This avoids any incompatibility in the backup structure.
-
13.Oct.2024 - to speed up the database migration, use the following command:
pg_dump -d 'postgres://username:password@hostname:port/source_database' -Fc | pg_restore --no-acl --no-owner -d 'postgres://username:password@hostname:port/target_database' --data-only
The following uses a Python script to download the data and upload it to another server using multiple threads:
https://github.com/microsoft/OrcasNinjaTeam/tree/master/azure-postgresql/data_migration
The idea of this Python script is that it runs multiple COPY commands in different threads:
COPY (SELECT * FROM metrics WHERE id < 25000000) TO 'metrics_1.csv' DELIMITER ',' CSV;
COPY (SELECT * FROM metrics WHERE id >= 25000000) TO 'metrics_2.csv' DELIMITER ',' CSV;
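The id-range splitting behind those two COPY statements generalizes to any worker count. A hypothetical Python helper (my own sketch, not taken from the linked script) that turns an id range and a worker count into per-thread WHERE clauses:

```python
def id_range_predicates(min_id, max_id, workers):
    """Split [min_id, max_id) into `workers` contiguous id-range predicates."""
    step = (max_id - min_id) // workers
    preds = []
    for i in range(workers):
        lo = min_id + i * step
        if i == workers - 1:
            # last worker takes any remainder via an open upper bound
            preds.append(f"id >= {lo}")
        elif i == 0:
            preds.append(f"id < {lo + step}")
        else:
            preds.append(f"id >= {lo} AND id < {lo + step}")
    return preds

# Matches the 2-way split shown above for ids in [0, 50000000):
print(id_range_predicates(0, 50_000_000, 2))
# ['id < 25000000', 'id >= 25000000']
```

Each predicate can then be interpolated into its own `COPY (SELECT ... WHERE <pred>) TO ...` statement and run on a separate connection.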
-
References:
https://www.enterprisedb.com/postgresql-database-backup-recovery-what-works-wal-pitr
- Backup to Google drive: https://snapshooter.com/learn/backup-postgresql-to-google-drive-bash
Server configuration
-
Enable the server logging:
logging_collector = on
log_destination = 'stderr'
log_directory = log
log_file_mode = 0640
log_rotation_age = 1d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_hostname = on
log_connections = on
log_disconnections = on
log_duration = on
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '
log_lock_waits = on
- Log the statement if it required more than 3 seconds to complete:
log_min_duration_statement = 3000
Reference:
https://github.com/darold/pgbadger#postgresql-configuration
https://betterstack.com/community/guides/logging/how-to-start-logging-with-postgresql/#log-formatting
-
Log down all queries
log_statement = 'all' -
Use the following calculator to configure Postgres based on the server specification.
Extensions
-
The following extensions are useful:
- pg_trgm - for high-performance case-insensitive text searching.
- pg_stat_statements - for tracking query statistics.
- postgres_fdw - for accessing a database on another server.
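To see why pg_trgm speeds up '%trans%'-style searches, here is a rough Python approximation of trigram extraction (pg_trgm pads each word with two leading blanks and one trailing blank; this sketch ignores its full normalization rules):

```python
def trigrams(word):
    """Approximate pg_trgm: lowercase, pad to '  word ', slide a 3-char window."""
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

print(sorted(trigrams("apple")))
# ['  a', ' ap', 'app', 'le ', 'ple', 'ppl']

# A GIN index over these trigrams is what lets ILIKE '%ppl%' find candidate
# rows by set intersection instead of a sequential scan:
print(trigrams("ppl") & trigrams("apple"))
```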
Accessing another database
-
Setup the necessary:
CREATE EXTENSION postgres_fdw;
CREATE SERVER server1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', port '5432', dbname 'mydb2');
CREATE USER MAPPING FOR my_user_id SERVER server1 OPTIONS (user 'my_user_id', password 'ciysys123');
CREATE SCHEMA mydb2;
IMPORT FOREIGN SCHEMA public FROM SERVER server1 INTO mydb2;
-
Run the query in 'mydb' context:
select * from mydb2.sales;
select coalesce(a.id, b.id) as id, coalesce(a.amt+b.amt, a.amt, b.amt)
from sales a
full outer join mydb2.sales b on b.id = a.id;
-
07.Jan.2025 - connecting two servers (a local server and a remote server):
To check if the FDW has been created:
select * from pg_extension
To create the FDW:
--drop server server_002 cascade;
create extension postgres_fdw;
CREATE SERVER server_002 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'another_server', port '5432', dbname 'my_db1');
CREATE USER MAPPING FOR postgres SERVER server_002 OPTIONS (user 'my_user_id', password 'ciysys123');
CREATE SCHEMA from_another_server;
IMPORT FOREIGN SCHEMA public FROM SERVER server_002 INTO from_another_server;
Notes: the USER MAPPING links the local 'postgres' user ID to 'my_user_id' in the remote server.
After that, execute the following query on the local server and it will retrieve the data from the remote server:
select * from from_another_server.sales
-
References
Logical replication
Date: 23.Sep.2024 Information:
- Primary server/publisher: 192.168.68.55
- Secondary server/subscriber: 192.168.68.54
-
Edit postgres.conf in the primary server:
listen_addresses = '*'
wal_level = logical
max_wal_senders = 10
For better security, use this setting:
listen_addresses = 'localhost, db_master_private_ip_address'
-
Edit the pg_hba.conf:
# TYPE DATABASE USER ADDRESS METHOD
# IPv4 local connections:
host all all 127.0.0.1/32 trust
host all all 0.0.0.0/0 scram-sha-256
For better security, replace '0.0.0.0/0' with the remote IP address:
host all all db_master_private_ip_address/32 scram-sha-256
host all all 192.168.68.54/32 scram-sha-256
-
In the firewall, open port 5432.
In Linux
sudo ufw allow from db_replica_private_ip_address to any port 5432
In Windows, open Windows Defender Firewall with Advanced Security to open the port.
-
In the primary server, create a role specifically for the replication process:
CREATE ROLE my_user_id WITH REPLICATION LOGIN PASSWORD 'ciysys123';
GRANT ALL PRIVILEGES ON DATABASE my_db1 TO my_user_id;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO my_user_id;
Grant CREATE TABLE permission:
GRANT ALL ON SCHEMA public TO my_user_id;
-
Restart primary server.
-
Now, we are going to publish the 'sales' table, which has the following structure:
create table sales (prod_id int not null, qty int, unit_price numeric(15,2));
NOTES: this table does not have a primary key yet! We will create it later.
Do this in primary server:
CREATE PUBLICATION my_publication; ALTER PUBLICATION my_publication ADD TABLE sales; -
In the secondary server,
CREATE SUBSCRIPTION my_subscription CONNECTION 'host=192.168.68.55 port=5432 password=ciysys123 user=my_user_id dbname=test_repl application_name=mysub1' PUBLICATION my_publication; -
To test it out, run the following command in the primary server,
insert into sales (prod_id, qty, unit_price) values (1, 10, 1.2) , (2, 10, 1.2) , (3, 10, 1.2) , (4, 10, 1.2); -
In the secondary server,
select * from sales; -
Monitoring the replication.
-
In the primary server,
select * from pg_publication
select * from pg_publication_tables
--this returns the current LSN value which appears in pg_stat_replication
-- & also pg_stat_subscription (in the secondary server).
select * from pg_current_wal_lsn()
For monitoring who has subscribed and the replication state:
select * from pg_stat_replication -
In the secondary server,
What has been subscribed:
select * from pg_subscription
For monitoring:
select * from pg_stat_subscription
-
FAQ for replication
-
If a table does not have a primary key and it has been published, running the DELETE command will fail:
delete from sales where prod_id = 2
This is the error message:
SQL Error [55000]: ERROR: cannot delete from table "sales" because it does not have a replica identity and publishes deletes Hint: To enable deleting from the table, set REPLICA IDENTITY using ALTER TABLE.
-
Solution 1: add primary key to the table.
Let's add a primary key in both the primary server and the secondary server:
alter table sales add primary key (prod_id)
Run the command again and the DELETE will be successfully replicated to the secondary server.
delete from sales where prod_id = 1 -
Solution 2: set the replica identity of the table.
create unique index ix_sales on sales (prod_id);
alter table sales replica identity using index ix_sales;
-
-
Stop Postgres in the secondary server and then run the following query in the primary server:
insert into sales (prod_id, qty, unit_price) values (31, 10, 1.2);
In the secondary server, start Postgres and the new data will be replicated automatically.
-
You may turn off the subscription with the following command:
DROP SUBSCRIPTION my_subscription;
To recreate the subscription, use 'copy_data=false' to prevent starting the data initialization process. This works perfectly if you restored the table from a backup and you only want to get the new records.
CREATE SUBSCRIPTION my_subscription CONNECTION 'host=192.168.68.55 port=5432 password=ciysys123 user=my_user_id dbname=test_repl application_name=mysub1' PUBLICATION my_publication WITH (copy_data = false)
-
Adding new table:
Run the following query in both the primary and the secondary server:
create table prod (prod_id int primary key, name text);
In the primary server, add the new table to the publication:
ALTER PUBLICATION my_publication ADD TABLE prod;
Add some test data:
insert into prod (prod_id, name) values (11, 'apple'), (12, 'banana');
In the secondary server, pull the data from the newly added table using 'REFRESH PUBLICATION':
ALTER SUBSCRIPTION my_subscription REFRESH PUBLICATION; -
References:
https://www.digitalocean.com/community/tutorials/how-to-set-up-logical-replication-with-postgresql-10-on-ubuntu-18-04
https://www.postgresql.org/docs/current/logical-replication-subscription.html
https://www.postgresql.org/docs/current/logicaldecoding-example.html
https://www.postgresql.org/docs/current/sql-createsubscription.html
FAQ
-
To set the statement_timeout
-
At the database level,
ALTER DATABASE mydatabase SET statement_timeout = '60s'; -
For a specific user
ALTER ROLE guest SET statement_timeout='5min';
-
-
To see the runtime value:
SHOW lock_timeout;
Or
SHOW ALL;
OR
SELECT name, setting, boot_val FROM pg_settings WHERE context = 'postmaster';
Reference: https://www.postgresql.org/docs/current/config-setting.html
-
To change the config at runtime:
SET configuration_parameter TO DEFAULT;
SET lock_timeout TO DEFAULT;
Or
UPDATE pg_settings SET setting = reset_val WHERE name = 'configuration_parameter'; -
To extract the function definition:
select * from pg_proc where proname like 'sales%'; select pg_get_functiondef(16401); -
Case insensitive matching:
select 'apple' = 'Apple' "FALSE"
, 'apple' ~* 'Apple' "TRUE"
, 'apple' LIKE 'Apple' "FALSE"
, 'apple' ILIKE 'Apple' "TRUE"
For more information, refer to /rnd/sql/pg-index-test.sql.
-
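The truth table above maps cleanly onto case-folded matching. A small Python sketch of ILIKE semantics (my own translation of '%' and '_' into a regex; it ignores ESCAPE handling and assumes Python 3.7+, where re.escape leaves '%' and '_' untouched):

```python
import re

def ilike(text, pattern):
    """Rough ILIKE: '%' -> '.*', '_' -> '.', matched case-insensitively."""
    # Escape regex metacharacters first, then expand the SQL wildcards.
    regex = re.escape(pattern).replace("%", ".*").replace("_", ".")
    return re.fullmatch(regex, text, re.IGNORECASE) is not None

print("apple" == "Apple")       # False, like the SQL '=' operator
print(ilike("apple", "Apple"))  # True
print(ilike("apple", "a%E"))    # True
print(ilike("apple", "b%"))     # False
```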
A better setup for a development database that will be restored on the production server.
-
Create a standard user role for all projects (creating the database user):
CREATE ROLE my_user_id WITH LOGIN SUPERUSER CREATEDB CREATEROLE INHERIT REPLICATION CONNECTION LIMIT -1;
For testing, you may set the 'my_user_id' (Postgres user) password to 'ciysys123'. Then, test the psql connection with the password.
-
To change the Postgres user password:
ALTER ROLE my_user_id WITH PASSWORD 'ciysys123'; -
Create a Windows user ID 'my_user_id' with password 'ciysys123'. You may want to use the same password as shown above if it is not configured as 'trust' in the pg_hba.conf, which will ease the support job.
-
21.Sep.2024 - you may consider creating 'postgres' with a password of 'postgres123' in both Windows & Postgres.
-
To register pgsql as Windows Service,
pg_ctl register -D 'd:\pgsql16\data' -N pgsql_01 -U my_user_id -e 'pgsql_01' -P ciysys123 -
Using cli to connect to the database:
.\psql -d mydb my_user_id
Where 'my_user_id' is the user ID.
-
For more information, please review the
pg_hba.confin the 'Security settings'.
-
-
Transfer all objects from the old user to the new owner:
reassign owned by my_user_id to my_user_id
The above command will fail if a system table is owned by the old user. In that case, you have to transfer table by table:
alter table sales owner to my_user_id;
Transfer the 'public' schema to the new owner:
alter schema public owner to my_user_id;
-
13.Jul.2024 - to track the query activities
-
Enable the extension. Once enabled, it will track all query activities for the entire server.
create extension pg_stat_statements; -
Then, edit the postgresql.conf file and append the following:
compute_query_id = on
pg_stat_statements.max = 10000
pg_stat_statements.track = all
pg_stat_statements.track_planning = on
-
Finally, restart the server to take the effect.
-
Run some queries and see what is happening in the blackbox.
select * from pg_stat_statements; -
To clear the stats, run the following command.
select pg_stat_statements_reset();
Reference: https://www.postgresql.org/docs/current/pgstatstatements.html
-
-
19.Jul.2024 - full text search vs LIKE/ILIKE search (using Trigram index)
-
Using FTS:
select audit_log_id
, created_on
, msg
--, to_tsvector(msg)
, ts_rank(to_tsvector(msg), websearch_to_tsquery('transaction ref')) r
from tb_audit_log
where
--to_tsvector(msg) @@ phraseto_tsquery('transaction') -- it returned some records.
--to_tsvector(msg) @@ phraseto_tsquery('trans') -- this won't work (zero matches).
to_tsvector(msg) @@ websearch_to_tsquery('transaction ref') -- 'ref' is a word in the 'msg'. So, it returns some records.
--to_tsvector(msg) @@ tsquery('trans:* & 11') -- 'trans:*' means string prefix. It returns some records.
--to_tsvector(msg) @@ tsquery('*:trans & 11') -- '*:x' is invalid/not supported.
order by r desc
websearch_to_tsquery() converts the query_text into a tsquery which looks like this (refer to https://www.instaclustr.com/blog/postgresql-full-text-search/):
+-------------------------+-------------------------------------+
|query_text               |websearch_to_tsquery                 |
+-------------------------+-------------------------------------+
|the quick brown fox      |'quick' & 'brown' & 'fox'            |
|"the quick brown fox"    |'quick' <-> 'brown' <-> 'fox'        |
|brown or fox             |'brown' | 'fox'                      |
|"quick brown" fox -jumped|'quick' <-> 'brown' & 'fox' & !'jump'|
+-------------------------+-------------------------------------+
FTS features:
-
FTS has many advanced operators, such as:
- 'a & b' where both 'a' AND 'b' must exist.
- 'a | b' where either 'a' OR 'b' must exist.
- '!a' where 'a' must NOT exist.
- 'a <-> b' which matches where 'b' appears right after 'a'.
- 'SELECT tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10);' - 'fat' has a 10-lexeme distance from 'cat'.
- 'a:*' - prefix matching on 'a'.
- It allows you to create a synonym dictionary.
- It allows you to create a thesaurus dictionary.
-
It removes common words such as 'the', 'is', etc. This can be configured by the programmer.
-
You can set a weight ({D-weight, C-weight, B-weight, A-weight}, with the values {0.1, 0.2, 0.4, 1.0}) for each keyword, and the weight affects the ranking. The default weight is D.
For example, the system allows the 'title' and 'body' fields to be searchable using FTS, and the title has a heavier weight.
CREATE FUNCTION messages_trigger() RETURNS trigger AS $$
begin
  new.tsv := setweight(to_tsvector('pg_catalog.english', coalesce(new.title,'')), 'A') ||
             setweight(to_tsvector('pg_catalog.english', coalesce(new.body,'')), 'D');
  return new;
end
$$ LANGUAGE plpgsql;
CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE ON messages FOR EACH ROW EXECUTE FUNCTION messages_trigger();
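The effect of those weights on ranking can be sketched numerically (a toy model of my own, not the actual ts_rank algorithm): a query word matched in the A-weighted title contributes ten times more than the same word in the D-weighted body.

```python
WEIGHTS = {"A": 1.0, "B": 0.4, "C": 0.2, "D": 0.1}  # Postgres default weight values

def toy_rank(query_words, weighted_lexemes):
    """Toy rank: sum the weight of each query word found in the document.

    weighted_lexemes maps a lexeme to its weight label 'A'..'D'.
    """
    return sum(WEIGHTS[weighted_lexemes[w]]
               for w in query_words if w in weighted_lexemes)

# 'invoice' came from the title (A), the rest from the body (D).
doc = {"invoice": "A", "overdue": "D", "payment": "D"}
print(toy_rank(["invoice"], doc))  # 1.0  (title hit)
print(toy_rank(["overdue"], doc))  # 0.1  (body hit)
```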
Advantages:
- Allows setting up a complex tsquery to match the document.
- Allows grouping different words into the same group using a synonym dictionary and a thesaurus dictionary.
Disadvantages:
- Not able to handle partial string matching where the user may be expecting something like: 'msg like %trans%'.
- The tsquery is complex to tune, but it is useful if the search is simple matching.
- Chinese and Japanese characters have been dropped since v15.
Reference
-
-
For simple wildcard searches, it is recommended to use the ILIKE/LIKE operator with a Trigram index. First, create an index:
create index ix_tb_audit_log_2 on tb_audit_log using gin (msg gin_trgm_ops);
Then, the following will work. But the Trigram index only solves the partial matching issue, not the other issues that can be solved by full text search.
select audit_log_id
, created_on
, msg
from tb_audit_log
where msg ilike '%trans%'
Advantages:
- Easier to implement using ILIKE/LIKE with a Trigram index.
- Easier for the user to use and understand.
Disadvantages:
- Not able to rank the results.
- No weights on the search words.
Reference:
-
-
12.Aug.2024
-
To find out the performance of the functions that you have developed, you will have to rely on the following view:
select * from pg_stat_user_functions
By default, this view contains no data until you turn it on in 'postgresql.conf':
track_functions = all
For example, the following result shows that 'fn_blackhole' has been called 405 times while the total_time spent is 1.31ms. On the other hand, the total_time is higher than the self_time for 'fn_tx_il_run' and 'fn_tx_knockoff'. You might want to investigate further to find out which other functions called within them are slow.
funcid schemaname funcname calls total_time self_time 36857 public fn_tx_outst_amt_ref 61 42.788 42.788 45808 public fn_invt_update 55 27.775 5.282 45811 public fn_invt_update_qoh 55 16.538 16.538 45820 public fn_tx_summ_run 140 3.799 3.799 45828 public fn_tx_il_run 55 89.444 19.813 37536 public fn_tx_il_capture_cost 40 5.788 5.788 45837 public fn_tx_summ_run_daily 49 17.063 17.063 36040 public fn_blackhole 405 1.31 1.31 36629 public fn_ctc_update_bal 46 1.9 1.9 36790 public fn_tx_knockoff 14 20.838 2.935 -
To reset the statistics,
select pg_stat_reset(); -
21-Feb-2025 - to view the table column statistics:
SELECT * FROM pg_stats WHERE tablename = 'test_stats';
'correlation' - if the value is close to zero, the planner tends to avoid the index. If the value is close to -1 or +1, it tends to use the index.
To view the extended statistics created with CREATE STATISTICS:
select * from pg_stats_ext
For example, when querying the documents in a multi-tenant database, co_id and tx_type are compulsory. On the other hand, we will most likely filter the documents by date range (dt) and customer/supplier (ctc_id). In this case, we need to create extended statistics for the query planner:
create statistics if not exists st_tb_tx_master_1 on co_id, tx_type, dt, ctc_id from tb_tx_master;
References: https://aws.amazon.com/blogs/database/understanding-statistics-in-postgresql/ https://www.postgresql.org/docs/current/sql-createstatistics.html
-
-
14.Aug.2024
To increase the client connections, edit the postgresql.conf file and append the following:
max_connections = 200
shared_buffers = 128MB
Where 'shared_buffers' should be around 25% of the RAM size.
In summary:
- Reaching 200 connections will get the max TPS. Increasing the number of connections further does not help much.
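The 25%-of-RAM rule above is easy to script when provisioning servers. A hedged Python sketch (the 25% figure follows the note above; it is a starting point for tuning, not an official formula):

```python
def suggest_shared_buffers(ram_gb, fraction=0.25):
    """Suggest shared_buffers as ~25% of RAM, expressed in MB."""
    return f"{int(ram_gb * 1024 * fraction)}MB"

print(suggest_shared_buffers(16))   # 4096MB on a 16GB server
print(suggest_shared_buffers(0.5))  # 128MB -- matches the example above
```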
Reference:
-
https://www.postgresql.org/docs/current/runtime-config-resource.html
-
To increase the TPS, you may consider reducing the fillfactor (default is 100; suggest reducing to 75) on busy tables.
https://scaleflux.com/blog/improve-tps-performance-postgresql/
-
29.Aug.2024
To get the view definition, run the following command:
select pg_get_viewdef('vw_tx_full'::regclass)
OR
select definition from pg_views where viewname = 'vw_tx_full';
Note: the result is not the original statement.
-
01.Sep.2024
Check table size and record count:
select schemaname, relname, n_live_tup
, pg_size_pretty(pg_total_relation_size(schemaname || '.' || relname::text)) tb_size
from pg_stat_user_tables
In case n_live_tup is zero, run the following command (from the OS shell, not psql) against your database:
vacuumdb -d mydb
To view the database size:
select pg_size_pretty(pg_database_size('mydb'));
-
01.Sep.2024-for busy and large table (> 1 million rec), recommended to have the following auto vacuum setting:
-
Change the table auto vacuum settings (without affecting the global settings):
ALTER TABLE tb_tx_detail SET (
  autovacuum_vacuum_threshold = 1000,
  autovacuum_vacuum_scale_factor = 0.0,
  autovacuum_analyze_threshold = 1000,
  autovacuum_analyze_scale_factor = 0.0
);
- autovacuum_vacuum_threshold = 1000 - means 1000 INSERTed/UPDATEd records will trigger autovacuum.
- autovacuum_vacuum_scale_factor = 0.0 - this is the percentage of total records that will trigger autovacuum.
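Autovacuum fires when dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples. The sketch below (my own calculator based on that documented formula) compares the global defaults (50, 0.2) with the per-table override above, showing why the override matters on large tables:

```python
def vacuum_trigger_point(n_live_tup, threshold=50, scale_factor=0.2):
    """Dead-tuple count at which autovacuum kicks in for a table."""
    return threshold + scale_factor * n_live_tup

big_table = 1_000_000
# With the global defaults, roughly 200,050 dead tuples must pile up first;
# with the per-table override (1000, 0.0) it triggers at a flat 1000.
print(vacuum_trigger_point(big_table))
print(vacuum_trigger_point(big_table, threshold=1000, scale_factor=0.0))
```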
-
To check the table level settings:
SELECT relname, reloptions FROM pg_class WHERE reloptions IS NOT null; -
To check if the global setting is effective, run the following query. The n_live_tup is the estimated row count. For small tables, the default auto vacuum settings work properly. For large tables, you will need to change the settings mentioned above.
select schemaname, relname, n_live_tup, last_autovacuum, last_autoanalyze
from pg_stat_user_tables
where last_autovacuum is not null
order by last_autovacuum
-
References
https://www.enterprisedb.com/blog/postgresql-vacuum-and-analyze-best-practice-tips
https://www.cybertec-postgresql.com/en/reasons-why-vacuum-wont-remove-dead-rows/
https://www.percona.com/blog/tuning-autovacuum-in-postgresql-and-autovacuum-internals/
https://www.percona.com/blog/importance-of-postgresql-vacuum-tuning-and-custom-scheduled-vacuum-job/
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.Autovacuum.html
-
To understand the cost based vacuum formula, see the following article:
https://pganalyze.com/docs/vacuum-advisor/how-does-the-vacuum-cost-model-work
-
A utility to analyze what needs to be improved:
-
-
Supplementary Scheduled Vacuum Job as recommended in:
https://www.percona.com/blog/importance-of-postgresql-vacuum-tuning-and-custom-scheduled-vacuum-job/
/tools/db-script/vacuumjob.sql will vacuum 100 tables in each run based on the table age (relfrozenxid).
20 11 * * * /full/path/to/psql -X -f /path/to/vacuumjob.sql > /tmp/vacuumjob.out 2>&1
In Windows, comment out `set ECHO all` and change `gexec` to `\gexec`. Then, execute it against a database:
psql -X -f .\vacuumjob.sql mydb
-
19.Sep.2024 - find out how many roles have been granted in the current database:
select distinct grantee from information_schema.table_privileges order by grantee
If, given the result of the above query, a running app still prompts about a missing role, it could be caused by the following command (which you might want to trace in your app source code or SQL scripts):
set role my_user_id;
-
11.Nov.2024
-
16.Dec.2024
-
Issue: one thread is updating a record while another thread is reading the same record nanoseconds later. The reader might not get the latest version of the record.
Run in console #1:
begin;
set local lock_timeout=2000;
select * from sales where prod_id=2;
update sales set price=price*0.15 where prod_id=2;
In console #2:
-
Expecting the result without any error BUT the old record version! //<<==========
begin;
set local lock_timeout=2000;
select * from sales where prod_id=2;
Then, abort the transaction:
rollback;
If we query for prod_id=1, the result is returned without any error:
begin;
set local lock_timeout=2000;
select * from sales where prod_id=1;
rollback;
This means thread 1, which is updating prod_id=2, locks the row but not the table.
-
Next test - expecting a "canceling statement due to lock timeout" error, and getting the latest version of the record upon console #1 commit or rollback.
begin;
set local lock_timeout=2000;
select * from sales where prod_id=2 for update;
-
What if we use LOCK TABLE and not SELECT..FOR UPDATE? - expecting a "canceling statement due to lock timeout" error upon executing LOCK TABLE.
begin;
set local lock_timeout=2000;
lock table sales in share mode;
select * from sales where prod_id=2;
-
What if we use SET TRANSACTION instead of LOCK TABLE?
begin;
set local lock_timeout=2000;
set transaction isolation level serializable;
select * from sales where prod_id=2;
Well, this does not detect the UPDATE (in progress) made by the other session, and the result is the older data version.
-
-
-
25.Dec.2024
- To enable or disable automatic startup of the Postgres daemon, edit /etc/postgresql/9.1/main/start.conf (adjust the version number in the path) and replace 'auto' with 'manual' or 'disabled'.
- We hit a situation where the server was installed with v16, then v17 was installed automatically and ended up running. In this case, we do the above.
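The start.conf change can be scripted. A minimal sketch, demonstrated on a temporary copy of the file (on a real server the file is /etc/postgresql/<version>/main/start.conf and the sed needs sudo):

```shell
# Flip a cluster's start.conf from 'auto' to 'manual' so it no longer
# starts at boot. Demonstrated on a temp copy; on a real server edit
# /etc/postgresql/<version>/main/start.conf with sudo.
conf=$(mktemp)
printf '%s\n' '# Automatic startup configuration' 'auto' > "$conf"

# start.conf holds one effective word: auto | manual | disabled
sed -i 's/^auto$/manual/' "$conf"

grep -v '^#' "$conf"   # manual
rm -f "$conf"
```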
-
19.Jan.2025: timezone
View the current timezone value:
select * from pg_settings where name ilike 'timezone'
To change the timezone for the current transaction (SET LOCAL lasts until commit or rollback):
set local timezone to 'GMT'; show timezone;
OR
set local time zone 'GMT';
To reset the timezone:
set local timezone to default;
OR
reset timezone;
Get the supported timezones:
select * from pg_timezone_names where name ilike '%kua%'
-
15.Mar.2025: sample cron job bash file:
#!/bin/bash
. ~/.profile
echo "db-cleanup.sh:start"
vacuumdb -j2 some_db
echo "db-cleanup.sh:done"
Where,
- The first line selects the shell (bash).
- The second line loads the profile of the current OS user (this user must also be a Postgres user!!).
Then, in your cron job file (kept in /etc/cron.d), you may want to save the output to a log file:
* * * * * /path/to/your/script.sh >> /path/to/logfile.log 2>&1
* * * * * /path/to/your/script.sh >> /path/to/output.log 2>> /path/to/error.log
Where,
`>>` means append while `>` means overwrite the log file. `2>&1` redirects stderr into the same destination as stdout.
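These redirection rules can be verified in any shell; a small self-contained demonstration (a temp file stands in for the log file):

```shell
# Demonstrate append vs truncate, and stderr redirection.
log=$(mktemp)

echo "first"  >  "$log"   # '>'  truncates the file, then writes
echo "second" >> "$log"   # '>>' appends
echo "third"  >  "$log"   # '>'  again: previous content is gone

# Send both stdout and stderr of a command group to the same file.
{ echo "out"; echo "err" >&2; } >> "$log" 2>&1

cat "$log"    # third, out, err
rm -f "$log"
```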
-
27.Apr.2025: upgrade from v16 to v17:
-
Stop both the v16 and v17 services.
-
Create a working directory 'd:\temp4' and allow 'everyone' to access this directory.
-
Go to this directory:
cd d:\temp4
-
Start the upgrade.
The following command will upgrade v16 to v17 with the database user 'postgres'.
D:\pgsql17\bin\pg_upgrade.exe --old-datadir 'D:\pgsql16\data' --new-datadir 'D:\pgsql17\data' --old-bindir 'D:\pgsql16\bin' --old-port=5432 --new-bindir 'D:\pgsql17\bin' --new-port=5433 -U postgres --check
In case of the following error during the check:
database user "postgres" is not the install user
You will have to create a new database cluster (v17):
.\initdb.exe -D 'd:\pgsql17\data' --encoding=utf8 -U my_user_id
Then, run the upgrade with database user 'my_user_id'.
D:\pgsql17\bin\pg_upgrade.exe --old-datadir 'D:\pgsql16\data' --new-datadir 'D:\pgsql17\data' --old-bindir 'D:\pgsql16\bin' --old-port=5432 --new-bindir 'D:\pgsql17\bin' --new-port=5433 -U my_user_id --check -
'd:\temp4\update_extensions.sql' will be created. Copy this file into 'd:\pgsql17\bin' and run this SQL script in v17:
cd d:\pgsql17\bin
psql -d postgres
\i update_extensions.sql
-
Start the v17 Postgres service and run `vacuumdb --all` against v17.
-
Finally, remove the old v16 cluster.
-
-
12.Jun.2025: to view the table storage details:
To get the average record size, we need the `query_to_xml()` function, which is able to execute a dynamic query. As a result, we don't need to write a stored function to achieve that.
select schemaname, relname
, pg_size_pretty(pg_total_relation_size(schemaname || '.' || relname::text)) tb_storage_size
, pg_size_pretty(pg_indexes_size(relid)) idx_storage_size
, n_live_tup, n_dead_tup
, pg_size_pretty((xpath('/row/cnt/text()', query_to_xml(format('select sum(pg_column_size(%I.*)) as cnt from %I.%I', relname, schemaname, relname), false, true, '')))[1]::text::bigint) as actual_tb_size
, pg_size_pretty((xpath('/row/cnt/text()', query_to_xml(format('select sum(pg_column_size(%I.*)) / count(*) as cnt from %I.%I', relname, schemaname, relname), false, true, '')))[1]::text::bigint) as avg_rec_size
from pg_stat_user_tables
where n_live_tup > 0
order by schemaname, relname
-
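`pg_size_pretty` renders raw byte counts human-readable inside SQL; if you need a similar rendering on the command line (for example when post-processing psql output), GNU coreutils' `numfmt` gives a comparable 1024-based view (its IEC unit labels differ from pg_size_pretty's kB/MB style):

```shell
# Render byte counts human-readable, similar in spirit to pg_size_pretty.
# numfmt --to=iec uses 1024-based units labelled K, M, G, ...
numfmt --to=iec 512        # 512
numfmt --to=iec 2048       # 2.0K
numfmt --to=iec 5242880    # 5.0M
```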
03.May.2026
To add a new database user, first make sure you are connected to my_db1, then run the following commands:
CREATE ROLE my_user_id WITH SUPERUSER LOGIN PASSWORD 'ciysys123';
GRANT ALL PRIVILEGES ON DATABASE my_db1 TO my_user_id;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO my_user_id;
Grant CREATE TABLE permission - this is not required if the user is a SUPERUSER.
GRANT ALL ON SCHEMA public TO my_user_id;
To remove a user from the database:
REVOKE ALL PRIVILEGES ON DATABASE my_db1 FROM my_user_id;
DROP ROLE my_user_id;
References
-
The Part of PostgreSQL We Hate the Most:
https://www.cs.cmu.edu/~pavlo/blog/2023/04/the-part-of-postgresql-we-hate-the-most.html
-
To shrink the database file size:
https://github.com/reorg/pg_repack
https://github.com/cybertec-postgresql/pg_squeeze
-
To run multiple instance/cluster in same/different version:
https://dzone.com/articles/managing-multiple-postgresql-instances-on-ubuntude
https://dev.to/titon/how-to-set-up-multiple-postgresql-instances-on-a-single-server-15gg
-
Formula for creating an index:
https://www.depesz.com/2021/10/22/why-is-it-hard-to-automatically-suggest-what-index-to-create/
-
Steps for upgrading Postgres on Windows:
https://gist.github.com/truemogician/610300dd8e86f7a8ad01f73b9da1ff1a