The PostgreSQL Timeline Concept

1. Introduction

In my previous blog here, I discussed PostgreSQL’s point-in-time recovery (PITR), where PostgreSQL supports the ability to recover your database to a specific time, recovery point or transaction ID in the past. However, I did not discuss in detail the concept of the timeline ID, which is also important in database recovery.

2. What Is a Timeline ID and Why Is It Important?

A timeline ID is basically a point of divergence in WAL. It represents a point, or to be exact, the LSN of the WAL at which the database starts to diverge. Divergence happens when a user performs a point-in-time recovery or when a standby server is promoted. The timeline ID is encoded in the first 8 hexadecimal digits of the WAL segment file names under the pg_wal/ directory.

For example:
pg_wal/000000010000000000000001 indicates that this WAL segment belongs to timeline ID = 1

and

pg_wal/000000020000000000000001 indicates that this WAL segment belongs to timeline ID = 2

A timeline ID behaves somewhat like a git branch, except without the ability to move forward in parallel or to merge back into the master branch. Your development starts from a master branch, and you are able to create a new branch (A) from the master branch to continue a specific feature development. Let’s say the feature also involves several implementation approaches, so you create additional branches (B, C and D) to implement each approach.

This is a simple illustration of git branch:

With timeline IDs, your database starts from timeline ID 1 and stays at 1 for all subsequent database operations. Timeline ID 2 is created when the user performs a point-in-time recovery on timeline 1, and all of the subsequent database operations from that point belong to timeline ID 2. While at 2, the user could perform more PITR to create timelines 3, 4 and 5 respectively.

In the previous PITR blog, I mentioned that you could do PITR based on a time, a recovery point, an LSN or a transaction ID, but all of these can only apply to one particular timeline. In postgresql.conf, you can select a desired recovery timeline with the recovery_target_timeline parameter. This parameter can be 'latest', 'current', or a particular timeline ID value.

With this configuration, a user is able to recover the database to a particular point of a particular timeline in the past.
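
For illustration, here is a hedged sketch of what such a recovery setup could look like in postgresql.conf (paths and values are examples only):

restore_command = 'cp /path/to/archive/%f %p'
recovery_target_timeline = 'latest'     # or 'current', or an explicit timeline ID such as '3'
recovery_target_lsn = '0/3002B08'       # optional: a stopping point within that timeline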

This is a simple illustration of timeline ID:

3. The History File Associated with a Timeline ID

The history files are created under the pg_wal/ directory with a .history suffix when a new timeline ID is created. This file describes all the past divergence points that have to be replayed in order to reach the current timeline. Without this file, it is impossible to tell where a timeline came from, and therefore impossible to do PITR.

For example, a history file 00000003.history may contain the following contents:

cat pg_wal/00000003.history
1 0/30000D8 no recovery target specified

2 0/3002B08 no recovery target specified

which means that timeline 3 comes from LSN(0/3002B08) of timeline 2, which comes from LSN(0/30000D8) of timeline 1.

4. Importance of Continuous Archiving

With the concept of timeline ID, it is possible that the same LSN or the same WAL segments exist in multiple timelines.

For example, WAL segments 3, 4, 5 and 6 exist in both timeline 1 and timeline 2 but with different contents. Since the current timeline is 2, the segments in timeline 2 will continue to grow forward.

000000010000000000000001
000000010000000000000002
000000010000000000000003
000000010000000000000004
000000010000000000000005
000000010000000000000006
000000020000000000000003
000000020000000000000004
000000020000000000000005
000000020000000000000006
000000020000000000000007
000000020000000000000008

With more timelines created, the number of WAL segment files may also increase. Since PG only keeps a certain number of WAL segment files before deleting them, it is very important to archive all the WAL segments to a separate location, either by enabling the continuous archiving function or by using the pg_receivewal tool. With all WAL segment files archived in a separate location, the user is able to perform a successful point-in-time recovery to any timeline and any LSN.
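
For example, a hedged sketch of streaming WAL segments into an archive directory with the pg_receivewal tool (host, port, user and path are placeholders):

$ pg_receivewal -h 127.0.0.1 -p 5432 -U replicator -D /path/to/archive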

PostgresConf.CN and PGConf.Asia 2021 (China Branch) Join Forces Again to Bring You the Best!

About the Event

In November 2020, PostgresConf.CN and PGConf.Asia 2020 cooperatively organized the 2020 PG Asia Conference online for the very first time! This conference attracted over 100 participating experts, scholars and speakers from around the world to deliver a grand technical feast for all the participants. The event was broadcast exclusively via the Modb Technology Community platform in China with a record-high number of viewers streaming the conference channels. Each channel accumulated over 30,000 active live streams on average, the official conference blog accumulated over 50,000 views, and news reports and articles from the media exceeded 3,000 entries.

This year, both PostgresConf.CN and PGConf.Asia 2021 (China Branch) will join forces again to bring you the best PostgreSQL conference in Asia!

Official Event Website - 2021

Time and Rough Agenda

The 2021 PG Asia Conference will be a 4-day event from December 14 to 17, 2021.

The rough schedule is below. Please note that the final schedule is subject to change before the date of the meeting. All agenda times are in Beijing Time (GMT+8).

  • December 14 - Day 1
    • 09:00-18:00 Main Conference
  • December 15 - Day 2
    • 09:00-17:00 Mandarin Sub-Forum - Comprehensive Forum
    • 09:00-17:00 Indonesian Sub-Forum
    • 09:00-17:00 Korean Sub-Forum
    • 09:00-17:00 English Sub-Forum
    • 09:00-12:00 Special Training A
    • 14:00-16:30 Special Training B
  • December 16 - Day 3
    • 09:00-17:00 Mandarin Sub-Forum - Application Forum
    • 09:00-17:00 Japanese Sub-Forum
    • 09:00-17:00 Thai Sub-Forum
    • 09:00-12:00 Special Training C
    • 10:00-11:30 Developer Unconference
    • 14:00-16:30 Special Training D
  • December 17 - Day 4
    • 09:00-12:00 Special Training E
    • 14:00-16:30 Special Training F

About PostgresConf.CN

PostgresConf.CN is an annual event held by the China PostgreSQL Association for users and developers of PostgreSQL, a leading open-source relational database with an active and vibrant community. PostgresConf.CN 2020 took place as an online conference for the very first time; it was very well attended by PostgreSQL users and community members across the globe. It is also one of the conference series held by the PostgresConf Organization.

Official Event Website - 2020

About PGConf.Asia

PGConf.Asia is shorthand for the Asian PostgreSQL Conference, which was hosted in Tokyo from 2016 until 2018. The conference was held in Bali for the first time in September 2019, and it is intended as the Asia-level PostgreSQL international conference. It co-hosted the 2020 PG Asia Conference together with PostgresConf.CN as an online event and accumulated an unprecedented number of participants worldwide. The organization is often regarded as the hub and the summit of the PostgreSQL community throughout Asian countries.

Call for Paper

Each speaker is given a 30-minute session (25 minutes of speech + 5 minutes of Q&A). The topic of the speech can be: kernel development, application practice, operation and maintenance management, data migration, distributed databases, high availability, security control, optimization, cloud databases, open source upstream and downstream ecology, etc.

Please submit your speech topic first. The deadline for topic submission is October 31, 2021. The organizing committee will complete the topic review before November 15, 2021. You will be notified of the result of your topic selection at that time; please then start preparing the speech content. The deadline for PPT submission is November 30, 2021.

Please download the PPT template here.

You will have to create a free Postgres Conference account in order to submit your topic.

Access the account registration here

Contacts

Global Sponsor Interface: grantzhou@postgresconf.org
About Speech submission and other inquiries: meeting@postgresqlchina.com

PostgreSQL 14 Continuous archiving and Point In Time Recovery Tutorial

1. Introduction

Recently I have been studying the internals of PostgreSQL’s continuous archiving and point-in-time recovery features, which are required for my current development work. Today I would like to demonstrate these important features with the recently released PostgreSQL 14 on Ubuntu 18.04.

2. What is Write Ahead Log (WAL)?

Before going into the details of continuous archiving, it is important to understand the concept of the Write Ahead Log (WAL). WAL files are generated by PG and contain all of the operations done on the database since the beginning. Operations such as INSERT, UPDATE, DELETE, VACUUM, etc. are captured in WAL files. Having these WAL files, it is possible to recreate the database simply by replaying them, allowing a user to recover the database to a certain state in case of a fault. This is the basis of continuous archiving and point-in-time recovery.

3. What is Continuous Archiving?

The generated WAL files are normally stored in the pg_wal directory within the PG database cluster, but they will not be kept forever. The configuration parameters max_wal_size and min_wal_size control how many WAL files can be kept in the pg_wal directory. The checkpointer process will periodically purge the old WAL files, leaving only the recent ones.

So, it is important to set up continuous archiving so that all these WAL files can be archived to a location outside of the PG cluster. Then, when you need the old WAL files for recovery, PG can restore them from the archive.

To enable WAL archiving and restoring, set these parameters in postgresql.conf:

archive_mode = on
archive_command = 'cp %p /path/to/archive/%f'
restore_command = 'cp /path/to/archive/%f %p'

where you should replace /path/to/archive with your own archive path on your system. %p and %f will be replaced with the path to the WAL segment and the WAL segment file name, respectively, to complete the command.
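
As a safer variant, the official documentation suggests an archive_command that refuses to overwrite an existing archive file, for example:

archive_command = 'test ! -f /path/to/archive/%f && cp %p /path/to/archive/%f'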

When a WAL segment is ready to be archived, PG will create a signal file in pg_wal/archive_status to indicate a particular WAL segment is ready for archive.

In the example below, the segment 00000001000000000000000E is ready for archiving, indicated by the .ready suffix, while all of the previous segments have been successfully archived and are therefore indicated by the .done suffix.

$ ls pgtest/pg_wal/archive_status/
000000010000000000000002.done 000000010000000000000005.done 00000001000000000000000A.done 00000001000000000000000E.ready
000000010000000000000003.done 000000010000000000000007.done 00000001000000000000000B.done
000000010000000000000004.done 000000010000000000000008.done 00000001000000000000000D.done

The PG archiver process will then be woken up to perform the archiving by running the configured archive_command.

$ps -ef | grep postgres

caryh 1487 1 0 11:10 ? 00:00:00 postgres -D /home/caryh/pgtest
caryh 1510 1487 0 11:10 ? 00:00:00 postgres: checkpointer
caryh 1511 1487 0 11:10 ? 00:00:00 postgres: background writer
caryh 1512 1487 0 11:10 ? 00:00:00 postgres: walwriter
caryh 1516 1487 0 11:10 ? 00:00:00 postgres: autovacuum launcher
caryh 1520 1487 0 11:10 ? 00:00:00 postgres: archiver archiving 00000001000000000000000E
caryh 1521 1487 0 11:10 ? 00:00:00 postgres: stats collector
caryh 1522 1487 0 11:10 ? 00:00:00 postgres: logical replication launcher

Note that the archiver (PID=1520) also shows its progress on the ps display.

When done successfully, the signal file in pg_wal/archive_status will be renamed with a .done suffix.

$ ls pgtest/pg_wal/archive_status/
000000010000000000000002.done 000000010000000000000005.done 00000001000000000000000A.done 00000001000000000000000E.done
000000010000000000000003.done 000000010000000000000007.done 00000001000000000000000B.done
000000010000000000000004.done 000000010000000000000008.done 00000001000000000000000D.done

At the next checkpoint, these .done files will be removed, so the status files will not keep growing either.

4. What is Point In Time Recovery (PITR)?

Having all of the WAL segments backed up in a separate archive, we gain the ability to recover the database up to a certain point in time in the past or to completely recover the whole database. This depends on your use case: if you made a major mistake and need to start again from a point of time in the past, you can have PG recover to that particular time in recovery mode and continue database operation from that point. This is also referred to as switching to a new timeline ID, and we will discuss this more in the next blog.

Let’s continue from the above example (which already has 1 million rows of data) and do a point in time recovery.

  1. Make a basebackup of the current database; we can use pg_basebackup to achieve this

    $ pg_basebackup -U caryh -h 127.0.0.1 --progress -D pgtest-back
  2. Go back to the database and continue inserting some more data, then use pg_switch_wal to immediately write out the current WAL segment and obtain an LSN. LSN stands for Log Sequence Number, and it logically represents a WAL record’s location within a WAL segment. Refer to the documentation here for more information.

After obtaining the LSN, we again insert some more rows of data.

insert into test values(generate_series(1,1000000), 'asdas');
insert into test values(generate_series(1,1000000), 'asdas');

select pg_switch_wal();
pg_switch_wal
---------------
0/13DAC308

insert into test values(generate_series(1,1000000), 'asdas');
insert into test values(generate_series(1,1000000), 'asdas');

So, in total, this table test should have 5 million rows of data, because it started with 1 million and we just inserted 4 million more in the example above.

The WAL location indicated by LSN 0/13DAC308 corresponds to a time when the table only contained 3 million rows, and this is the point in time that we would like to recover to in our example.

  3. Stop the database server

    pg_ctl -D pgtest stop
  4. Wipe out everything in the database cluster pgtest

$ rm -rf pgtest/

I know it sounds crazy, but remember: we made a basebackup back in step 1, plus we have all the WAL segments in the archive, so technically we still have everything.

  5. Copy everything from our basebackup back to pgtest

    cp -r pgtest-back/* pgtest/
  6. Edit pgtest/postgresql.conf and set your recovery target

Since we are using an LSN as our target, we can simply put the LSN we captured into the recovery_target_lsn configuration:

recovery_target_lsn = '0/13DAC308'

PG also supports other ways to define a recovery target, based on a timestamp, a named restore point or a transaction ID. Refer to this documentation for the other options.

  7. Signal the database to run in recovery mode by creating a recovery.signal file under the pgtest cluster

    $ touch pgtest/recovery.signal
  8. Start the server

    $ pg_ctl -D pgtest start

The server will now start in recovery mode, restore WAL files from the archive and perform the recovery. You may log in with psql and check that the table now contains only 3 million rows instead of 5.
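
For example, a quick sanity check (the expected count assumes the exact insert sequence used above):

select count(*) from test;
  count
---------
 3000000
(1 row)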

You may notice that even though the database has been recovered to a point of time in the past, you will encounter a “database in recovery” or read-only database error if you try to insert additional data. This is because the server is still in recovery mode, which is currently paused.

This is configured by the recovery_target_action option, which defaults to pause. This is intentional: it gives you a moment to check your database and confirm that it is indeed the database state that you would like to recover to. If it is wrong, you can simply shut down the database and reconfigure recovery_target_lsn until you reach the desired state of the database.

  9. Exit recovery mode
    Once you have confirmed that the database is recovered correctly, you can exit recovery mode with this psql command:
    select pg_wal_replay_resume();

This command will end the recovery mode and you should be able to insert additional data to the database. The recovery.signal file will be removed, and the future WAL segments will have a new timeline ID.

Timeline ID is also an important aspect of the recovery and we will discuss more on timeline ID in my next post.

Implement Foreign Scan With FDW Interface API

1. Introduction

Recently I have been tasked to familiarize myself with the Foreign Data Wrapper (FDW) interface API to build a new FDW capable of doing vertical / columnar sharding, meaning that the FDW is capable of collecting column information from multiple sources and combining it into a single query result. I will document and blog about vertical sharding in later posts. In this blog, I would like to share some of the key findings and my understanding of the FDW interface related to foreign scans.

The FDW foreign scan API routines have some similarities to the table access method API’s sequential scan routines. You can learn more about those in the blog post here. The most noticeable difference is that the FDW routines require us to take part during the planning stage, so we have to come up with a foreign plan for the executor so it knows what and how to perform the foreign scan.

2. Foreign Scan Routines

In order to complete a foreign scan, the following FDW routines are invoked in order:

  • IsForeignScanParallelSafe
  • GetForeignRelSize
  • GetForeignPaths
  • GetForeignPlan
  • BeginForeignScan
  • IterateForeignScan (called repeatedly, once per tuple, until there are no more tuples to return)
  • EndForeignScan

2.1 IsForeignScanParallelSafe

In this routine, you have to tell PG whether or not the foreign scan supports parallel scan by returning true to indicate that it is supported, or false otherwise. Parallel scan is invoked by the planner when a scan query involves retrieving a large amount of data. Depending on the foreign server and your implementation, you have to decide whether or not parallel scan is supported.
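
As a minimal sketch (the function name is hypothetical and the decision is entirely up to your FDW), the routine can be as simple as:

/* Hypothetical FDW callback: declare whether this FDW's foreign scans are parallel safe. */
static bool
myIsForeignScanParallelSafe(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
{
    /* Return true only if the scan state (e.g. a remote connection) can be
     * used safely from a parallel worker; return false to disable it. */
    return false;
}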

2.2 GetForeignRelSize

Similar to the table access method’s relation_estimate_size() API, this routine requires you to provide an estimate of the number of tuples/rows that would be involved during this round of scanning. This information is very important because it directly affects the planner’s decisions about how to execute the query. For example, if you say the foreign relation has 1 million tuples to be fetched, then the planner is likely to use a parallel scan to complete the scan for performance reasons, provided you answered “true” in the IsForeignScanParallelSafe() function call.
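
A minimal, hedged sketch (the hard-coded estimate is purely illustrative; a real FDW would ask the remote source or use local statistics):

static void
myGetForeignRelSize(PlannerInfo *root, RelOptInfo *baserel, Oid foreigntableid)
{
    /* Tell the planner roughly how many rows this foreign table will return. */
    baserel->rows = 1000000;
}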

2.3 GetForeignPaths

In this routine, we are required to provide a list of possible paths to complete the required foreign scan. Paths are simply ways to execute the scan. For example, you can complete a foreign scan by doing a sequential scan, fetching one tuple at a time, or you can complete the same scan using an index to look up a target tuple if the table has an index defined and the user gives a specific WHERE clause to pinpoint a tuple. If the user provides a WHERE clause to filter the result, you also have the option to decide whether the WHERE clause should be evaluated by the foreign server or by the local server as it fetches tuples.

All of these possible path nodes are required by the planner to decide on the optimal path to execute the scan. Each path node requires you to provide the startup cost, total cost, number of rows and path keys, if any. The official documentation suggests that you should use the add_path and create_foreignscan_path routines to prepare the list of path nodes.
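
A hedged sketch of offering a single full-scan path (the cost numbers are made up, and the create_foreignscan_path argument list shown here follows roughly PostgreSQL 12-15 and may differ in other versions):

static void
myGetForeignPaths(PlannerInfo *root, RelOptInfo *baserel, Oid foreigntableid)
{
    Cost startup_cost = 10;                         /* made-up cost numbers */
    Cost total_cost = startup_cost + baserel->rows;

    /* Offer one plain full-scan path; a smarter FDW could add more paths,
     * e.g. pre-sorted paths with pathkeys. */
    add_path(baserel,
             (Path *) create_foreignscan_path(root, baserel,
                                              NULL,                     /* default pathtarget */
                                              baserel->rows,
                                              startup_cost,
                                              total_cost,
                                              NIL,                      /* no pathkeys */
                                              baserel->lateral_relids,
                                              NULL,                     /* no outer plan */
                                              NIL));                    /* no fdw_private */
}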

2.4 GetForeignPlan

Having been given all the possible path nodes, the planner will pick the one it considers most optimal, pass that path node to this routine and ask you to make a foreign scan plan for the executor. This is a very important function because whatever plan we provide here directly affects how the executor is going to function, so be cautious here.

In this function, the planner will give you everything about the query such as the target attribute list and the restricting clauses (WHERE clauses) and also the information about the foreign relation, such as its OID, and each column’s data types. Official PostgreSQL documentation suggests using make_foreignscan() routine with the desired arguments to create the foreign plan.

Here you also have the option to push down the restricting clauses or not. In other words, you can choose to send all the WHERE clauses to the foreign server to process, if it can support that. This results in much less data communication. You can also choose to have all the WHERE clauses processed locally, but this requires the FDW to fetch every single row from the foreign server, resulting in more data communication. This decision is controlled by the List *qpqual argument to the make_foreignscan() function. If this list contains your restricting clauses, the local server will perform the WHERE evaluation locally; if this list is empty, then the local server assumes the FDW will do the WHERE evaluation on the foreign server.
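
A hedged sketch of the local-evaluation choice described above (the function name is hypothetical; a push-down capable FDW would pass NIL as qpqual and ship the clauses to the remote side instead):

static ForeignScan *
myGetForeignPlan(PlannerInfo *root, RelOptInfo *baserel, Oid foreigntableid,
                 ForeignPath *best_path, List *tlist, List *scan_clauses,
                 Plan *outer_plan)
{
    /* Strip the RestrictInfo wrappers; keeping the clauses in qpqual means
     * the local server will evaluate the WHERE conditions itself. */
    scan_clauses = extract_actual_clauses(scan_clauses, false);

    return make_foreignscan(tlist,
                            scan_clauses,   /* qpqual: evaluated locally */
                            baserel->relid,
                            NIL,            /* no expressions to evaluate remotely */
                            NIL,            /* no private state */
                            NIL,            /* no custom scan tlist */
                            NIL,            /* no remote quals to recheck */
                            outer_plan);
}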

2.5 BeginForeignScan

This routine will be called before starting the foreign scan, giving you a chance to allocate all the necessary control information required for your foreign scan. You can define your own control information, for example, cursor location, remote connection structure, current offset, etc., store it in node->fdw_state, and it will be passed down to IterateForeignScan as the scan is being executed.
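
A small, hypothetical sketch of what that control information could look like (the struct fields depend entirely on your FDW):

typedef struct MyFdwScanState
{
    void   *remote_conn;    /* e.g. a connection or cursor handle */
    int     current_row;    /* e.g. the current fetch position */
} MyFdwScanState;

static void
myBeginForeignScan(ForeignScanState *node, int eflags)
{
    MyFdwScanState *fstate = palloc0(sizeof(MyFdwScanState));

    /* open connections / declare cursors here, then stash the state
     * so IterateForeignScan can find it later */
    node->fdw_state = fstate;
}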

2.6 IterateForeignScan

This is the main routine that performs the actual scan. You need to put the proper logic here, depending on your use case, to fetch one or more rows from the foreign server. When you have the row data, you will need to convert it to a TupleTableSlot structure that the PG internals can understand. We normally use the ExecStoreHeapTuple() routine to convert a HeapTuple into a TupleTableSlot. This routine will be called continuously by the PG executor as long as you still have data to fetch, and it will only stop once you return an empty TupleTableSlot.
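
Continuing the hypothetical sketch from above (fetch_next_remote_row() is a stand-in for your own fetching logic, not a real API):

static TupleTableSlot *
myIterateForeignScan(ForeignScanState *node)
{
    MyFdwScanState *fstate = (MyFdwScanState *) node->fdw_state;
    TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
    HeapTuple       tuple;

    ExecClearTuple(slot);

    tuple = fetch_next_remote_row(fstate);      /* your own logic */
    if (tuple == NULL)
        return slot;            /* empty slot: tells the executor we are done */

    ExecStoreHeapTuple(tuple, slot, false);     /* hand the row back in TTS form */
    return slot;
}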

2.7 EndForeignScan

This routine will be called at the end of foreign scan, giving you the opportunity to clean up the control information allocated in BeginForeignScan(). This marks the end of foreign scan via FDW.

3. Summary

There is a lot of information and many articles out there related to PostgreSQL’s FDW API interface. This blog is merely a quick summary of what I have learned during the FDW evaluation, and there is much more to it than what is discussed here. For more detail, you can refer to the official documentation about the FDW callback functions here.

Understanding the Security Around PostgreSQL

1. What Is Security?

The word “security” is a very broad concept and could refer to completely different procedures and methodologies. Knowing what security means to your application is very important, so you can apply the proper security practices and procedures to ensure the safety of your company’s assets. Data compromises can often lead to financial loss, reputation damage, disintegration of consumer confidence, brand erosion, and non-compliance with government and industry regulations.

For this reason, security in infrastructure software such as PostgreSQL is even more important, because any data compromise could have nation- or city-wide impacts, which are often difficult to completely recover from.

2. Common Database Compromises

User Compromises:

  • Excessive privileges
  • Privilege abuse
  • Weak user authentication
  • Weak password
  • Default privilege too open

Data Compromises:

  • Unmanaged and unprotected sensitive data
  • Backup data exposure
  • Stolen hard disks
  • Unmanaged encryption keys

Network Compromises:

  • Firewall rules
  • Deep Packet Inspection (DPI)
  • Vulnerability prevention
  • Denial of Service (DoS) attack

Vulnerability:

  • Software bug
  • Buffer overflow
  • SQL injection
  • Privilege escalation

3. The Big Picture

img

This picture shows different types of “security” around a PostgreSQL server and there are roughly 5 types of security concepts involved here:

3.1 Data Security (Over Network)

This is the security of the communication between the PostgreSQL client and server. In a production environment, we almost always want to use TLS to encrypt this data communication. TLS guarantees mutual trust between the client and the server, so each side is sure that it is communicating with the right entity instead of a rogue server. SSH tunneling is also a common option to secure a psql connection when TLS is not fully set up. SSH tunneling is also very secure, as each connection forces the client and server to generate and agree on an encryption key that is valid only for that session. Furthermore, SSH tunneling can be made more secure by setting up a public and private key pair between the client and server to ensure the authenticity of the two entities.
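
A minimal sketch of enabling TLS (the file names and the pg_hba.conf rule are examples; the certificates have to be generated and distributed separately):

# postgresql.conf
ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'

# pg_hba.conf: require TLS for remote clients
hostssl  all  all  0.0.0.0/0  scram-sha-256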

3.2 Data Security

This is the security between PostgreSQL and the disk to which it writes data. This type of security is often referred to as “Cluster Data Encryption” or “Transparent Data Encryption”. The current version of PostgreSQL does not support this feature, but a handful of talented people are working on it right now. This security is designed to prevent data compromises done directly on the hard disk. By encrypting the data on the disk, a hard disk thief will not be able to extract useful information from it.

3.3 Network Security

This is the security that will most likely involve a firewall between a connecting client and the server. The purpose of a firewall is to block most of the malicious connections coming from the public network and prevent unauthorized access to the server. The most advanced firewalls, such as an IPS, can block DoS attacks and perform deep packet examination against a database of known malicious packets and attacks. There are also firewalls, such as an IDS, that perform network monitoring only and will raise an alert to the operator should they detect an attack attempt.

3.4 Vulnerability

This is the security issue that is mostly caused by a software bug that allows an attacker to take advantage of the server, steal data, or simply put a stop to the server and cause damage. The best way to prevent this is to upgrade your PostgreSQL server to the latest version, which has addressed most of the known vulnerabilities.

3.5 User Security

This is the security that relates mostly to user management, sometimes called Role-Based Access Control (RBAC). This is where a database administrator manages each database user and sets the right privileges for the right users. Excessive privileges, weak passwords and privilege abuse are very common if this is not done correctly. Making sure the right users get the right privileges, and using third-party authentication servers such as LDAP or Kerberos instead of simple passwords, can significantly increase the security of your database infrastructure.
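
As an illustration, a minimal least-privilege setup could look like the following sketch (role and schema names are examples; ideally the password would be replaced by LDAP or Kerberos authentication in pg_hba.conf):

CREATE ROLE app_readonly LOGIN PASSWORD 'change-me';
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
GRANT USAGE ON SCHEMA public TO app_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_readonly;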

Using GDB To Trace Into a Parallel Worker Spawned By Postmaster During a Large Query

1. Introduction

I am working on a new PostgreSQL feature that redefines the way a tuple’s visibility status is determined. The feature was working very nicely until I started doing a large SELECT query, which triggers PostgreSQL to spawn multiple parallel workers to process the request. When this happens, the feature I am working on starts to yield incorrect results. A good portion of the returned data tuples are missing because they are considered invisible, while some portion remains visible. It immediately came to my attention that the new feature does not work in parallel worker mode, and somehow I needed to find a way to debug a spawned parallel worker to examine how it computes the visibility and what is missing inside.

In this blog, I would like to share with you how I use GDB to debug and trace into a new parallel worker spawned by Postmaster in order to fix the visibility issue.

2. GDB Basics

I wrote another blog previously that shows how to use GDB to trace and debug PostgreSQL issues and shares some of the most common commands that I use every day to resolve software issues. If you are new to GDB, I suggest giving that blog a read here.

3. How and When does PG Spawn A New Parallel Worker

When you use psql to connect to a PostgreSQL database, it will spawn a new backend process to serve this connecting client. Most of the queries you issue will be processed by this backend process, including SELECT, UPDATE, INSERT, etc. By default, if your SELECT query requires a sequential scan over more than 8MB of data, PG will try to use parallel workers to help speed up the processing. This 8MB threshold can be configured by the min_parallel_table_scan_size parameter in postgresql.conf. There is another configuration parameter, max_parallel_workers, that controls the maximum number of parallel workers allowed to be spawned. The default is 8.
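
You can check these settings from psql; the values below are the defaults:

postgres=# show min_parallel_table_scan_size;
 min_parallel_table_scan_size
------------------------------
 8MB
(1 row)

postgres=# show max_parallel_workers;
 max_parallel_workers
----------------------
 8
(1 row)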

Technically, I could avoid my visibility issues simply by either setting min_parallel_table_scan_size to a huge number, or setting max_parallel_workers to 0. But this is really not my style; I would like to keep all these goodies that PG provides while being able to solve the problem.

To spawn a parallel worker, the psql backend will initialize a parallel worker context in the global process table and a message queue based on shared memory for communication with the backend. Then it sends a signal to postmaster to notify that the global process table has been updated.

When the postmaster receives the signal, it will load the global process table and find that it needs to spawn a new parallel worker. It will proceed to fork a new parallel worker according to the context information supplied. This information determines the entry point for the parallel worker and what it should do once spawned. During processing, the parallel worker and the psql backend use the message queue to communicate tuples back and forth, and finally the psql backend gathers all the data tuples together and produces a final result for the user.

4. Can We Use GDB to attach This Parallel Worker’s PID When Spawned?

Technically yes, but the lifetime of this parallel worker may be very short; by the time you see its PID from the ps -ef command, the worker may have already done its job and exited. This means it is too late for me to start GDB and attach to its PID.

Instead, the technique I am going to show you today will trace the parallel worker from the moment it starts.

5. Tracing the Parallel Worker

I will be using this instance of PostgreSQL server (version 12.5) as an example where PID 11976 is the psql backend process serving the psql client.

img

Pre-Condition:

Connect psql to the PostgreSQL server, create an example table and insert about 2.5M rows of data. This will definitely trigger parallel workers.

$ psql -d postgres -U postgres -p 6660
psql (12.5)
Type "help" for help.

postgres=# create table test(a int, b int);
CREATE TABLE
postgres=# insert into test values(generate_series(1,500000),1);
INSERT 0 500000
postgres=# insert into test values(generate_series(1,500000),1);
INSERT 0 500000
postgres=# insert into test values(generate_series(1,500000),1);
INSERT 0 500000
postgres=# insert into test values(generate_series(1,500000),1);
INSERT 0 500000
postgres=# insert into test values(generate_series(1,500000),1);
INSERT 0 500000
postgres=#

Step 1: Attach GDB to the psql backend having PID = 11976 and Set a Break Point

I am setting a break point at the function RegisterDynamicBackgroundWorker. This is called when parallel worker is required to complete the query. Setting a breakpoint allows us more control as to when to proceed with a parallel worker spawn.

gdb postgres
(gdb) attach 11976
(gdb) b RegisterDynamicBackgroundWorker

Step 2: Attach GDB to the Postmaster having PID = 11959 and Set 2 Break Points

We are using a second GDB to attach to the postmaster and set 2 break points there. fork_process is the function before postmaster actually spawns a new parallel worker using the system fork() call. ParallelWorkerMain is the main function for the parallel worker after it has been spawned.

gdb postgres
(gdb) attach 11959
(gdb) b fork_process
(gdb) b ParallelWorkerMain

Step 3: Execute a Large SELECT Query On psql To Trigger the Break Points

postgres=# select count(*) from test;

The RegisterDynamicBackgroundWorker break point will be hit in the first GDB session, the one attached to the psql backend with PID = 11976.

Use the continue or c GDB command to continue to spawn the worker

Breakpoint 1, RegisterDynamicBackgroundWorker (worker=0x7ffd867f3c80, handle=0x55a009b77388) at bgworker.c:1002
1002 bool success = false;
(gdb) c
Continuing.

As you continue the first GDB session, the second GDB session will pause due to receipt of a SIGUSR1 signal. This signal tells postmaster to reload the global process table and then spawn a parallel worker. Using the continue command will hit the first break point at fork_process

Program received signal SIGUSR1, User defined signal 1.
0x00007f301b97d0f7 in __GI___select (nfds=5, readfds=0x7ffd867f47d0, writefds=0x0, exceptfds=0x0, timeout=0x7ffd867f4740)
at ../sysdeps/unix/sysv/linux/select.c:41
41 in ../sysdeps/unix/sysv/linux/select.c
(gdb) c
Continuing.

Breakpoint 1, fork_process () at fork_process.c:47
47 fflush(stdout);

(gdb)

Step 4: Tell Postmaster GDB To Follow Child Process Instead Of Parent

At this point, the postmaster GDB is now waiting at the fork_process call, which is right before spawning a parallel worker. This is a good time to tell GDB to follow the child process instead of staying with the parent when the process calls fork(). The reason we set this as late as this moment is that the postmaster occasionally spawns other backend processes such as walsender and walreceiver. Setting GDB to follow the child process too early may cause it to follow another backend process that we are not interested in.

(gdb) set follow-fork-mode child

You may use the continue command after setting it to follow the child. Then GDB will immediately switch to the new child process having PID = 12198 below and hit our second break point, ParallelWorkerMain. Now GDB is debugging the parallel worker process instead of the original postmaster.

(gdb) c
Continuing.
[New process 12198]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Switching to Thread 0x7f301ca79740 (LWP 12198)]

Thread 2.1 "postgres" hit Breakpoint 2, ParallelWorkerMain (main_arg=1544458234) at parallel.c:1207
1207 {

Step 5: Continue To Debug The Parallel Process

Using the ps -ef | grep postgres command, we can see a new parallel worker being spawned having PID = 12198

img

At this point, you are free to explore the process flow of the parallel worker. For me, I am debugging the visibility issues, so I will set additional break points at HeapTupleSatisfiesMVCC and TransactionIdIsCurrentTransactionId. In your case, you may be debugging some other functionalities.
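
For example:

(gdb) b HeapTupleSatisfiesMVCC
(gdb) b TransactionIdIsCurrentTransactionId
(gdb) c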

Being able to debug a parallel worker with GDB allows me to see the problems I was having and to fix them quickly.

If you are having trouble tracing into parallel workers spawned by PostgreSQL at run time, I hope this blog will be helpful to you.

How PostgreSQL Handles Sub Transaction Visibility In Streaming Replication Setup?

1. Introduction

As an experienced PostgreSQL user, you may have a lot of experience in setting up streaming replication in your database clusters to make multiple backups of your data. But have you wondered how the standby is able to correctly determine if a tuple sent from the primary should be visible to the user or not? In the case where a transaction contains multiple subtransactions, how does the standby determine the visibility? You might say… well, it is PostgreSQL, so it will just work… This is true. If you are someone who is curious and interested in knowing how PostgreSQL does certain things internally, then this blog may be interesting for you, as we will talk about normal transactions and subtransactions and how both affect the standby’s ability to determine tuple visibility.

2. How is Tuple Visibility Determined?

PostgreSQL determines a tuple’s visibility based on the concept of the transaction ID (a.k.a. txid), which is normally stored in a tuple’s header as either t_xmin or t_xmax, where t_xmin holds the txid of the transaction that inserted this tuple and t_xmax holds the txid of the transaction that deleted or updated this tuple. If a tuple has not been deleted, its t_xmax is 0. There is also another mechanism called the commit log (a.k.a. clog), which records the commit status of transactions. When an inserted tuple has a valid t_xmin and an invalid t_xmax, PostgreSQL has to consult the clog first to determine if the txid stored in t_xmin is visible or not; the result is then stored in the tuple’s hint bits so that it does not always have to consult the clog, which could be resource intensive. If a tuple has a valid t_xmax, then there is no need to consult the clog, as the tuple must be invisible.

This is just the basic ideas of how visibility is determined but it is enough for me to continue with this blog. For more information about visibility, you could refer to this resource

3. What is a Subtransaction?

As the name suggests, it is a smaller transaction that exists within a regular transaction, and it can be created using the SAVEPOINT keyword after your BEGIN statement. Normally, when you have issued a BEGIN statement, all the tuples created by the subsequent DML statements such as INSERT or UPDATE will have the same txid associated with them.

When you issue a SAVEPOINT statement within this transaction, all the tuples created from the subsequent DML statements will have a different txid that is normally larger than the main transaction’s txid. When you issue the SAVEPOINT statement again, the txid changes again.

The advantage of subtransactions is that, in a very large transaction, the entire transaction will not be aborted due to an error; you can roll back to the previously created SAVEPOINT and continue to execute the main transaction.
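
For example, a small sketch of how SAVEPOINT and ROLLBACK TO SAVEPOINT behave (assuming a simple two-column test table):

BEGIN;
INSERT INTO test VALUES(1, 1);      -- runs under the main transaction's txid
SAVEPOINT a;
INSERT INTO test VALUES(2, 2);      -- runs under a new, larger subtransaction txid
ROLLBACK TO SAVEPOINT a;            -- only the insert after SAVEPOINT a is undone
INSERT INTO test VALUES(3, 3);
COMMIT;                             -- rows 1 and 3 become visible; row 2 does not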

The biggest difference between a subtransaction and a regular transaction is that there is no separate concept of commit associated with subtransactions, so a subtransaction commit is not recorded in the WAL file. However, if a subtransaction is aborted, that is recorded in the WAL file and subsequently sent to the standby.

4. How Does a Standby Determine Visibility Information In a Regular Transaction?

Here’s an example of a regular transaction without any subtransactions:

BEGIN;
INSERT INTO test VALUES(111,111111); => a heap_redo WAL record having t_xmin = 500 is sent to standby
INSERT INTO test VALUES(111,222222); => a heap_redo WAL record having t_xmin = 500 is sent to standby
INSERT INTO test VALUES(111,333333); => a heap_redo WAL record having t_xmin = 500 is sent to standby
COMMIT; => a XLOG_XACT_COMMIT WAL record having t_xmin = 500 is sent to standby

In the general case, the above queries will send at least 4 WAL records to the standby during streaming replication. The first 3 contain the actual tuple information with t_xmin set to the current transaction ID, which is 500 in the example. In other words, before the COMMIT is issued, the standby should have already received 3 WAL records from the primary and completed the redo, meaning that the 3 tuples already exist in the standby’s memory buffer. However, they are not visible yet because transaction ID 500 does not exist on the standby. Only when the COMMIT is issued on the primary is a XLOG_XACT_COMMIT WAL record generated and sent to the standby to notify it that transaction ID 500 is now committed and valid. When the standby receives the XLOG_XACT_COMMIT WAL record, it will eventually update its clog, so subsequent SELECT queries can do their visibility checks against it.

5. How Does a Standby Process Visibility Information In a Regular Transaction Containing Subtransactions?

Now, let’s use the same example but add some SAVEPOINT to create subtransactions

BEGIN;
INSERT INTO test VALUES(111,111111); => a heap_redo WAL record having t_xmin = 500 is sent to standby
INSERT INTO test VALUES(111,222222); => a heap_redo WAL record having t_xmin = 500 is sent to standby
INSERT INTO test VALUES(111,333333); => a heap_redo WAL record having t_xmin = 500 is sent to standby
SAVEPOINT A;
INSERT INTO test VALUES(111,444444); => a heap_redo WAL record having t_xmin = 501 is sent to standby
INSERT INTO test VALUES(111,555555); => a heap_redo WAL record having t_xmin = 501 is sent to standby
INSERT INTO test VALUES(111,666666); => a heap_redo WAL record having t_xmin = 501 is sent to standby
SAVEPOINT B;
INSERT INTO test VALUES(111,777777); => a heap_redo WAL record having t_xmin = 502 is sent to standby
INSERT INTO test VALUES(111,888888); => a heap_redo WAL record having t_xmin = 502 is sent to standby
INSERT INTO test VALUES(111,999999); => a heap_redo WAL record having t_xmin = 502 is sent to standby
COMMIT; => a XLOG_XACT_COMMIT WAL record having t_xmin = 500 is sent to standby

As you can see, after the creation of a SAVEPOINT, the tuples’ t_xmin value changes and is no longer equal to the main transaction ID, which is 500 in the example. Before the COMMIT is issued, the standby should have received 9 WAL records: 3 having t_xmin = 500, 3 having t_xmin = 501 and 3 having t_xmin = 502. None of them is visible yet on the standby because these transaction IDs do not exist there yet. When the primary issues the COMMIT statement, it will generate a XLOG_XACT_COMMIT WAL record and send it to the standby containing only the current transaction ID 500, without 501 and 502.

After the commit, the standby is able to see all 9 tuples that were replicated from the primary. Now, you may be asking…. why? How does the standby know that txid = 501 and 502 are also visible, even though the primary never sent a XLOG_XACT_COMMIT WAL record to tell the standby that 501 and 502 are also committed? Keep reading to find out.

Using a similar example, if we roll back to one of the SAVEPOINTs, the primary will send a XLOG_XACT_ABORT WAL record to notify the standby that a transaction ID has become invalid.

BEGIN;
INSERT INTO test VALUES(111,111111); => a heap_redo WAL record having t_xmin = 500 is sent to standby
INSERT INTO test VALUES(111,222222); => a heap_redo WAL record having t_xmin = 500 is sent to standby
INSERT INTO test VALUES(111,333333); => a heap_redo WAL record having t_xmin = 500 is sent to standby
SAVEPOINT A;
INSERT INTO test VALUES(111,444444); => a heap_redo WAL record having t_xmin = 501 is sent to standby
INSERT INTO test VALUES(111,555555); => a heap_redo WAL record having t_xmin = 501 is sent to standby
INSERT INTO test VALUES(111,666666); => a heap_redo WAL record having t_xmin = 501 is sent to standby
SAVEPOINT B;
INSERT INTO test VALUES(111,777777); => a heap_redo WAL record having t_xmin = 502 is sent to standby
INSERT INTO test VALUES(111,888888); => a heap_redo WAL record having t_xmin = 502 is sent to standby
INSERT INTO test VALUES(111,999999); => a heap_redo WAL record having t_xmin = 502 is sent to standby
ROLLBACK TO SAVEPOINT B; => a XLOG_XACT_ABORT WAL record having t_xmin = 502 is sent to standby
COMMIT; => a XLOG_XACT_COMMIT WAL record having t_xmin = 500 is sent to standby

In this example, when doing a ROLLBACK TO SAVEPOINT B, a XLOG_XACT_ABORT WAL record will be sent to the standby to notify it that transaction ID 502 is invalid. So when the primary commits, records having txid = 502 will not be visible, as the primary has notified the standby.

6. The Secret of Standby’s KnownAssignedTransactionIds

In addition to handling the XLOG_XACT_COMMIT and XLOG_XACT_ABORT WAL records to determine if a txid is valid or not, the standby actually manages a global txid list called KnownAssignedTransactionIds. Whenever the standby receives a heap_redo WAL record, it will save that record’s xmin to KnownAssignedTransactionIds. Using the same example as above again here:

BEGIN;
INSERT INTO test VALUES(111,111111); => a heap_redo WAL record having t_xmin = 500 is sent to standby, and standby puts 500 in KnownAssignedTransactionIds
INSERT INTO test VALUES(111,222222); => a heap_redo WAL record having t_xmin = 500 is sent to standby, and standby puts 500 in KnownAssignedTransactionIds
INSERT INTO test VALUES(111,333333); => a heap_redo WAL record having t_xmin = 500 is sent to standby, and standby puts 500 in KnownAssignedTransactionIds
SAVEPOINT A;
INSERT INTO test VALUES(111,444444); => a heap_redo WAL record having t_xmin = 501 is sent to standby, and standby puts 501 in KnownAssignedTransactionIds
INSERT INTO test VALUES(111,555555); => a heap_redo WAL record having t_xmin = 501 is sent to standby, and standby puts 501 in KnownAssignedTransactionIds
INSERT INTO test VALUES(111,666666); => a heap_redo WAL record having t_xmin = 501 is sent to standby, and standby puts 501 in KnownAssignedTransactionIds
SAVEPOINT B;
INSERT INTO test VALUES(111,777777); => a heap_redo WAL record having t_xmin = 502 is sent to standby, and standby puts 502 in KnownAssignedTransactionIds
INSERT INTO test VALUES(111,888888); => a heap_redo WAL record having t_xmin = 502 is sent to standby, and standby puts 502 in KnownAssignedTransactionIds
INSERT INTO test VALUES(111,999999); => a heap_redo WAL record having t_xmin = 502 is sent to standby, and standby puts 502 in KnownAssignedTransactionIds
COMMIT; => a XLOG_XACT_COMMIT WAL record having t_xmin = 500 is sent to standby and standby submits 500, 501 and 502 to clog as valid.

Before the COMMIT is issued on the primary, the standby already has txids 500, 501, and 502 present in KnownAssignedTransactionIds. When the standby receives the XLOG_XACT_COMMIT WAL record, it will actually submit txid 500 from the WAL plus 501 and 502 from KnownAssignedTransactionIds to the clog, so that when a subsequent SELECT query consults the clog, it will see that txids 500, 501, and 502 are all visible. If the primary sends a XLOG_XACT_ABORT as a result of rolling back to a previous SAVEPOINT, that invalid txid will be removed from KnownAssignedTransactionIds, so when the transaction commits, the invalid txid will not be submitted to the clog.

7. Summary

So this is how PostgreSQL determines a tuple’s visibility within subtransactions in a streaming replication setup. Of course, there are more details than what is discussed here in this blog, but I hope today’s discussion can help you get a general understanding of how PostgreSQL determines visibility, and maybe it will help you in your current work on PostgreSQL.

How PostgreSQL Inserts A New Record With The help of Table Access Method API and Buffer Manager

1. Introduction

This blog gives a high-level description of the mechanism PostgreSQL uses to execute an INSERT query. This process involves many steps of processing before the data is put in the right place. These steps normally involve several catalog cache lookups to determine if the destination table exists, several checks for constraint violations, etc. This blog will mainly focus on the part where the processing is handed to PostgreSQL’s table access method API and its interaction with the buffer manager and WAL routines. This is also the core area where an INSERT query is actually executed. If you are a developer looking to understand how PostgreSQL works internally, this blog may be helpful to you.

2. Table Access Method APIs in PostgreSQL

Pluggable table access method API has been made available since PostgreSQL 12, which allows a developer to redefine how PostgreSQL stores / retrieves table data. This API contains a total of 42 routines that need to be implemented in order to complete the implementation and honestly it is no easy task to understand all of them and to implement them. This API structure is defined in tableam.h under the name typedef struct TableAmRoutine

Today I will describe the routines related to INSERT.

3. INSERT Query Overall Call Flow

A few of the 42 routines will be called by executor just to complete an INSERT query. This section will describe these routines in the order they are called.

3.1 slot_callbacks

const TupleTableSlotOps *(*slot_callbacks) (Relation rel);

The executor needs to find out which set of tuple table slot (TTS) callback operations this table access method is compatible with. TTS is a set of routines that ensures the tuple storage is compatible between the executor and your access method. The executor will execute the TTS callbacks to translate your tuple structure to the TupleTableSlot format that the executor understands. The default heap access method uses TTSOpsBufferHeapTuple, defined in execTuples.c, to handle this operation.
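
For reference, the default heap implementation of this callback in heapam_handler.c is essentially just:

static const TupleTableSlotOps *
heapam_slot_callbacks(Relation relation)
{
    return &TTSOpsBufferHeapTuple;
}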

3.2 heap_insert

void
heap_insert(Relation relation, HeapTuple tup, CommandId cid,
int options, BulkInsertState bistate)

heap_insert is the entry point to perform the actual data insertion and it will undergo several other routines provided by buffer manager and WAL module in order to complete the insertion.

heap_prepare_insert
static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup,
TransactionId xid, CommandId cid, int options);

This is a subroutine for heap_insert where it will initialize the tuple header contents such as relation OID, infomasks, xmin, xmax values. It will also determine if the tuple is too big that TOAST is required to complete the insertion. These terms and parameters are very technical in PostgreSQL. If you are not sure what exactly they are, you could refer to resources here and here.

RelationGetBufferForTuple
extern Buffer RelationGetBufferForTuple(Relation relation, Size len,
Buffer otherBuffer, int options,
BulkInsertStateData *bistate,
Buffer *vmbuffer, Buffer *vmbuffer_other);

This is an entry function to access buffer manager resources and all it is doing is ask the buffer manager to return a buffer ID that can be used to store the target tuple. This may sound very straightforward, but there is quite a lot of processing on the buffer manager side to properly determine a desired buffer location.

First, it will do a quick size check. If the input tuple is larger than the size of a buffer block, it will return immediately with an error, as TOAST has to be used in this case. Then it will try to put the tuple on the same page the system last inserted a tuple on, to see if it will fit there. If not, it will utilize the free space map to find another page that could fit the tuple. If that does not work out, the buffer manager will allocate a new data page (also referred to as extending the relation) to hold this new tuple. As soon as a desired buffer page is determined, the buffer manager will cache this page in the relation structure, so that the next time the same relation visits the buffer manager, it immediately knows the reference to the last inserted block.

RelationPutHeapTuple
extern void RelationPutHeapTuple(Relation relation, Buffer buffer,
HeapTuple tuple, bool token);

Once we have identified the location of the buffer to store the tuple, the insert routine will then call RelationPutHeapTuple to actually put the tuple in the specified buffer location. This routine will again ask the buffer manager to get a pointer reference to the data page using the buffer ID we obtained from RelationGetBufferForTuple, then add the tuple data using PageAddItem() routine. Internally in buffer manager, it manages the relationship between buffer ID, buffer descriptor and the actual pointer to the data page to help us correctly identify and write to a data page. After a successful write, the routine will save a CTID of the inserted tuple. This ID is the location of this tuple and it consists of the data page number and the offset. For more information about how buffer manager works, you can refer to the resource here

Mark buffer dirty
extern void MarkBufferDirty(Buffer buffer);

At this point, the tuple data is already stored in the buffer manager, referenced by a particular data page plus an offset, but it is not flushed to disk yet. In this case, we almost always have to call the MarkBufferDirty function to signal the buffer manager that there are tuples on the page that have not been flushed to disk, so that at the next checkpoint it will ensure the new tuples are flushed to disk.

[Insert WAL Record]

Last but not least, after doing all the hassle of finding a buffer location to put our tuple in and mark it as dirty, it is time for the heap_insert routine to populate a WAL record. This part is not the focus of this blog so I will skip the high level details of WAL writing.

3.3 End of the insertion

At this point the insertion of a new tuple data has finished and proper WAL record has been written. The routine will once again save the CTID value that we derived during the data insertion and save this value to the TTS structure so the executor also gets a copy of the location of the tuple. Then it will clean up the local resources before returning.

4. Summary

What we have discussed here is the basic call flow of a simple INSERT. If we were to visualize the process, it should look something like this:
tuple insert

How PostgreSQL Executes Sequential Scans with the Help of Table Access Methods APIs

1. Introduction

There are many approaches for PostgreSQL to retrieve the data back to the user. Depending on the user’s input query, the planner module is responsible for selecting the most optimum approach to retrieve the requested data. Sequential scan is one of these approaches that is mostly selected when the user requests a large volume of data (for example, “SELECT * from tablename;”) or when a table has no index declared. Sequential scan is mostly handled by the table access method API within PostgreSQL and heap access method is the default one PostgreSQL uses today. In this short post, I will show you how sequential scan is done in the table access method API.

2. Table Access Method APIs in PostgreSQL

Pluggable table access method API has been made available since PostgreSQL 12, which allows a developer to redefine how PostgreSQL stores / retrieves table data. This API contains a total of 42 routines that need to be implemented in order to complete the implementation and honestly it is no easy task to understand all of them and to implement them. This API structure is defined in tableam.h under the name typedef struct TableAmRoutine

Today I will describe the routines related to sequential scan and I hope it could help you if you are someone looking to create your own table access method.

3. Sequential Scan Overall Call Flow

Few of the 42 routines will be called by executor just to complete a sequential scan request. This section will describe these routines in the order they are called.

3.1 relation_size

uint64		(*relation_size) (Relation rel, ForkNumber forkNumber);

relation_size is the first routine to be called and it is relatively simple. The caller will expect the routine to return the total size of the relation described by rel and forkNumber. The default heap access method will simply invoke the storage manager smgr to find the number of data blocks this particular relation physically occupies on disk and multiplies that number with the size of each block BLCKSZ (default is 8k). If you are not sure about the relationship between relation and its fork number, you could refer to this blog to get more information.

The size returned by this routine basically sets the boundary of our sequential scan.
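
Roughly, a heap-like implementation boils down to the following sketch (assuming the RelationGetSmgr() helper from PostgreSQL 14; the real code also handles InvalidForkNumber by summing the sizes of all forks):

static uint64
my_relation_size(Relation rel, ForkNumber forkNumber)
{
    /* number of blocks on disk for this fork times the block size */
    return (uint64) smgrnblocks(RelationGetSmgr(rel), forkNumber) * BLCKSZ;
}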

3.2 slot_callbacks

const TupleTableSlotOps *(*slot_callbacks) (Relation rel);

Next, the executor needs to find out which set of tuple table slot (TTS) callback operations this table access method is compatible with. TTS is a set of routines that ensures the tuple storage is compatible between the executor and your access method. The executor will execute the TTS callbacks to translate your tuple structure to the TupleTableSlot format that the executor understands. The default heap access method uses TTSOpsBufferHeapTuple, defined in execTuples.c, to handle this operation.

3.3 scan_begin

TableScanDesc (*scan_begin) (Relation rel,
Snapshot snapshot,
int nkeys, struct ScanKeyData *key,
ParallelTableScanDesc pscan,
uint32 flags);

Now the scan can officially begin. This is the sequential scan’s initialization routine, in which it will allocate a new scan descriptor using the parameters passed in by the executor. The purpose of the scan descriptor structure is to keep track of the sequential scan while it is being executed: for example, where the scan should begin, the block number of the last scan, which block to resume scanning from, and how many blocks have been scanned, etc. The scan descriptor will be destroyed once the sequential scan has completed.

The executor expects the routine to return a fully allocated and initialized pointer to a TableScanDesc struct.
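A common pattern, which the heap access method also follows, is to embed the generic TableScanDescData as the first member of an access-method-specific descriptor and extend it with your own bookkeeping fields. The sketch below is hypothetical and heavily simplified; the MyScanDescData fields are only meant to illustrate the idea.

typedef struct MyScanDescData
{
    TableScanDescData rs_base;      /* AM-independent part, must be first */

    /* access-method-specific bookkeeping (illustrative only) */
    BlockNumber       rs_nblocks;   /* total number of blocks to scan */
    BlockNumber       rs_cblock;    /* block currently being scanned */
    int               rs_ntuples;   /* number of tuples on current block */
    int               rs_cindex;    /* next tuple to return from the block */
} MyScanDescData;
typedef MyScanDescData *MyScanDesc;

static TableScanDesc
my_scan_begin(Relation rel, Snapshot snapshot,
              int nkeys, struct ScanKeyData *key,
              ParallelTableScanDesc pscan, uint32 flags)
{
    MyScanDesc  scan = (MyScanDesc) palloc0(sizeof(MyScanDescData));

    /* fill in the generic part that the executor relies on */
    scan->rs_base.rs_rd       = rel;
    scan->rs_base.rs_snapshot = snapshot;
    scan->rs_base.rs_nkeys    = nkeys;
    scan->rs_base.rs_key      = key;
    scan->rs_base.rs_parallel = pscan;
    scan->rs_base.rs_flags    = flags;

    /* initialize our own bookkeeping; the scan has not started yet */
    scan->rs_nblocks = RelationGetNumberOfBlocks(rel);
    scan->rs_cblock  = InvalidBlockNumber;
    scan->rs_ntuples = 0;
    scan->rs_cindex  = 0;

    return (TableScanDesc) scan;
}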

3.4 scan_getnextslot

bool		(*scan_getnextslot) (TableScanDesc scan,
                                 ScanDirection direction,
                                 TupleTableSlot *slot);

This is the main routine for sequential scan, where the caller expects the routine to fetch one tuple via the buffer manager, convert it to the TTS format the executor understands, and save it in the input pointer called slot. Each call to this routine results in one tuple being returned; if a table contains 1000 tuples, this function will be called 1000 times. The boolean return code indicates to the caller whether the routine has more tuples to return. As soon as false is returned, it signals the executor that we have exhausted all the tuples and it should stop calling this function.

In the normal sequential scan case, this routine works in per-page mode. This means it reads one full block from the buffer manager and records all the tuple addresses and their offsets in the scan descriptor, so subsequent calls to the same function do not have to load the full page from the buffer manager every time; it only loads the next block once all the tuples on the current block have been scanned and returned.

As you can see, the scan descriptor plays an important role here, as most of the control information is saved there and is updated whenever a call to scan_getnextslot is made.
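Building on the hypothetical MyScanDescData above, the skeleton of such a routine could look roughly like this, with my_load_next_block and my_store_tuple standing in for helpers you would have to write yourself:

static bool
my_scan_getnextslot(TableScanDesc sscan, ScanDirection direction,
                    TupleTableSlot *slot)
{
    MyScanDesc  scan = (MyScanDesc) sscan;

    /*
     * Per-page mode: when every tuple of the current block has been
     * returned, read the next block from the buffer manager and record
     * its tuple offsets in the scan descriptor.
     */
    while (scan->rs_cindex >= scan->rs_ntuples)
    {
        if (!my_load_next_block(scan))      /* hypothetical helper */
        {
            /* no more blocks: tell the executor the scan is exhausted */
            ExecClearTuple(slot);
            return false;
        }
    }

    /* copy the next tuple into the executor's slot in TTS format */
    my_store_tuple(scan, scan->rs_cindex, slot);    /* hypothetical helper */
    scan->rs_cindex++;

    return true;
}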

3.5 scan_end

void		(*scan_end) (TableScanDesc scan);

This is the last routine to be called; it basically cleans up the table scan descriptor, which was used heavily during the sequential scan. At this point, the executor should already have received all the tuples from the sequential scan routines.
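For the hypothetical descriptor used in the sketches above, the cleanup can be as simple as releasing whatever per-scan resources are still held and freeing the descriptor itself:

static void
my_scan_end(TableScanDesc sscan)
{
    MyScanDesc  scan = (MyScanDesc) sscan;

    /* release any remaining buffer pins or other per-scan resources (omitted) */

    pfree(scan);
}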

4. Prepare the Data to Return

Now the executor is finished with the table access method and has access to all the tuples for a particular relation. It then needs to go through another round of filtering to determine which of these tuples satisfy the condition set by the user (for example, when the user gives a WHERE clause to limit the scan results). This is done in another loop in execScan.c, which performs ExecQual on each TTS. Finally, the end results are sent to the user.
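A condensed sketch of that loop, simplified from ExecScan() in execScan.c (the real code also handles projection and EvalPlanQual rechecks), looks roughly like this:

for (;;)
{
    TupleTableSlot *slot;

    /*
     * Fetch the next tuple; for a sequential scan this ends up calling
     * scan_getnextslot through the node's access method (SeqNext).
     */
    slot = ExecScanFetch(node, accessMtd, recheckMtd);

    /* scan_getnextslot returned false: the scan is exhausted */
    if (TupIsNull(slot))
        return NULL;

    econtext->ecxt_scantuple = slot;

    /* keep the tuple only if it satisfies the WHERE clause qualification */
    if (qual == NULL || ExecQual(qual, econtext))
        return slot;

    /* tuple rejected: reset per-tuple memory and try the next one */
    ResetExprContext(econtext);
}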

5. Summary

What we have discussed here is the basic call flow of a simple sequential scan. If we were to visualize the process, it should look something like this:
(Figure: Sequential Scan call flow)

2020 PG Asia Conference Ended Successfully at an Unprecedented Scale!

Introduction

On November 17-20, 2020, PostgresConf.CN & PGConf.Asia 2020 (referred to as the 2020 PG Asia Conference) was held online for the very first time! This conference was jointly organized by the PG China Open Source Software Promotion Alliance, the PostgresConf International Conference Organization, and the PGConf.Asia Asian Community. The conference was broadcast exclusively via the Modb Technology Community platform in China, with a record-high number of viewers streaming the conference events. With the great support of these PostgreSQL communities, the conference was a great success, bringing together the Chinese PG community, major Asian PG contributors and many PostgreSQL experts worldwide to build the largest PG ecosystem in Asia.

About the Conference

Also known as Asia’s largest open source relational database ecosystem conference.

PostgresConf.CN and PGConf.Asia, for the very first time, were hosted together as one conference online accompanied by additional offline sessions hosted at several reputable university campuses in China.

PostgresConf.CN is an annual conference held by the China PostgreSQL Association for PostgreSQL users and developers. It is also one of the conference series held by the PostgresConf Organization. PostgresConf.CN 2019 took place in Beijing and was very well attended by PostgreSQL users and community members across the globe.

PGConf.Asia is also an annual PostgreSQL event; it took place in Bali, Indonesia in 2019 as a continuation of the PGConf.Asia event held in Tokyo, Japan in 2018. The first PGConf.Asia conference took place in 2016 in Tokyo. This conference acts as a hub of PostgreSQL-related development and technical discussion among PostgreSQL users and developers in the region, as well as experts from around the globe.

Learn more about these conferences and the organizers from these resources:

Sponsors

This conference was sponsored by:

Platinum:

Golden:

Silver:

14 Conference Channels over 4 Days!

The conference lasted 4 days at an unprecedented scale, with a total of 14 channels: 5 technical training channels plus 9 main/sub-forum channels.

  • Alibaba Cloud Database Training Session
  • Tencent Cloud Database Training Session
  • HighGo Software Training Session
  • PG Training Institution Charity Session
  • PostgresConf Organization English Training Session (1 day)
  • Main forum (2 days)
  • Chinese sub-forum (2 days)
  • English sub-forum A (2 days)
  • English sub-forum B (2 days)
  • CCF advanced academic forum (1 day)

Over 100 Participating Experts and Scholars Around the World

The number of invited speakers at this conference set a new record. With the theme of “Born into the World”, the conference gathered 112 technical presentations and more than 100 well-known experts and scholars from around the world to provide a grand technical feast for all the participating PGers.

  • Guangnan Ni, Fellow of the Chinese Academy of Engineering
  • Peng Liu, Vice chairman of China Open Source Software Promotion Alliance and researcher of the Chinese Academy of Sciences
  • Zhiyong Peng, deputy director of the database committee of the Chinese Computer Society
  • Bruce Momjian, co-founder of PostgreSQL international community and vice president of EDB
  • Peter Zaitsev, Founder and CEO of Percona
  • Tatsuo Ishii, the original author of Pgpool-II and founder of PGConf.Asia and the Japanese PG User Association
  • Experts from Alibaba, Tencent, Amazon, JD.com, Inspur, Ping An, Suning, ZTE, HornetLab, Equnix, VMware Greenplum, yMatrix, HighGo Software, Yunhe Enmo, Percona, EDB, NTT, Postgres Professional, Fujitsu, AsiaInfo, Giant Sequoia, Mechuang, Wenwu, Guoxin Sinan, Hytera, Airwallex, Ottertune and many others.
  • Professors from Wuhan University, East China Normal University, Harbin Institute of Technology, Shandong University and Tianjin University, including CCF (China Computer Society) database committee members
  • Professional lecturers from 10 well-known authorized PG training service providers
  • And many, many more!

Record High LIVE Streaming Viewers

The number of LIVE streaming viewers at this conference also hit a new record. Each channel accumulated over 30,000 active LIVE streams, the official conference blog accumulated over 50,000 views, and news reports and articles from the media exceeded 3,000 entries.

Conference Highlights

PostgresConf is an annual event for PostgreSQL developers and users worldwide. This conference attracted core members from the global PostgreSQL community, as well as corporate and individual users who use PostgreSQL.

The PGConf.Asia conference ran smoothly for 4 days and attracted many domestic and foreign audiences worldwide to the LIVE streaming channels. The conference had 100+ presentation sessions, and each session on average accumulated 30,000+ active streams.

During the first two days of the conference, several leading internet service vendors such as Alibaba Cloud and Tencent Cloud, along with database vendor HighGo Software, brought a series of rich technical content related to PostgreSQL, providing a tremendous amount of value to audiences who want to learn PostgreSQL database technology.

The third and fourth days of the conference consisted of numerous technical presentations covering a wide range of technical interests.

Let’s take a look at some of the highlights of the conference!

Opening Speech by Fellow Guangnan Ni

The conference was kick-started by an opening speech by Guangnan Ni, fellow of the Chinese Academy of Engineering and the leader of China’s basic software industry. Fellow Ni first expressed his warm congratulations on the holding of the conference and sincere greetings to the representatives of the participating countries and open source experts. He pointed out that open source is an important trend in today’s world development, and it is profoundly changing the global digital economy and information industry.

This conference brings great value to the development of the open source industry, promotes the open source movement and the popularization of open source culture in China, and also strengthens international exchanges and cooperation. Fellow Ni hopes to use the PGConf Asia Conference platform to realize intellectual exchanges between Asia and the world, achieve higher and higher levels of global cooperation, and contribute to the development of global open source technology. Finally, Fellow Ni wished the conference a complete success, and wished the PG open source ecosystem more prosperity! *”I wish the Chinese PG branch and the Asian PG community prosper and get better and better!”*

Peng Liu

Peng Liu, executive vice chairman of the China Open Source Software Promotion Alliance and researcher at the Institute of Software, Chinese Academy of Sciences, delivered a speech on behalf of COPU. He pointed out that, based on the open BSD license and a decentralized ecological architecture, PostgreSQL has a permanent security guarantee in the past, present and future. Following the spirit of free software and complying with the BSD open source license, PostgreSQL meets the most stringent security compliance audits and is not subject to US technology export control jurisdiction restrictions (long-arm jurisdiction limitations / US export controls). PostgreSQL is a recognized global technology with public property rights, and countless global PG contributors are committed to promoting the free and democratic software movement to protect the rights of anyone who obtains and uses such software.

Bruce Momjian

Bruce Momjian, co-founder of the PostgreSQL international community and vice president of EDB, shared his “Will Postgres Live Forever” speech. He said that any business has its life cycle, and open source PG is no exception, but compared to closed source commercial software, open source software has more vitality over its life cycle. As long as the source code is valuable, it can always get a new life. In 1996, Postgres got a new life with the introduction of the SQL standard and the improvement of its functionality, and its development trend continues to rise today.


Tatsuo Ishii

Tatsuo Ishii, founder of PGConf.Asia and the Japanese PG User Association and the original author of Pgpool-II, shared “Wonderful PostgreSQL!”. Tatsuo Ishii wrote the WAL system by himself based on Gray’s transaction thesis, and his creativity is admirable.

Peter Zaitsev

Percona CEO Peter Zaitsev gave a presentation on “The Changing Landscape of Open Source Databases”. He summarized several key trends such as distributed databases, cloud native architectures, separation of storage and compute, and hardware acceleration, which basically cover the main focuses of current database technology development.


Zhiyong Peng

Professor Zhiyong Peng, deputy director of the database committee of the CCF (China Computer Society) and deputy dean of the School of Computer Science of Wuhan University, presented “My Way from PG to TOTEM”. Professor Peng talked about his 30 years of database research, from leading students through in-depth study of database models in the PG source code, to compiling PG kernel analysis textbooks, to the development of the TOTEM database.

As we all know, PG originated from the University of California, Berkeley. This conference was fortunate to have invited Professor Peng from Wuhan University, who has been deeply involved in the PG database for more than 30 years, leading students through in-depth study of database models in the PG source code, writing a PG kernel analysis book, and eventually creating the TOTEM database based on PG. Peng has made outstanding contributions to PostgreSQL talent training and research!


Guoqiang Gai

Guoqiang Gai, the founder of Enmotech, gave a speech on “Observing the elephant: the skills migration from Oracle to PostgreSQL DBA”. Gai walked through a principle analysis of a typical PG rollback problem from a source code perspective, encouraging everyone to dig into open source databases and learn the source code deeply, and emphasizing that DBAs are essential to enterprise data management and must keep training themselves to demonstrate their personal value.

Whether it is PG or PG-related database products, their value must always be reflected in helping companies manage core data assets. Here, Gai said that the role of the DBA is crucial for the last mile from database products to users; DBAs are the closest partners of databases, whether in client companies, database cloud vendors, software suppliers or integrators. Gai encouraged DBAs to take advantage of open source and analyze the source code in depth to solve problems.


Julyanto SUTANDANG

Julyanto SUTANDANG, CEO of Equnix Business Solutions, gave a very detailed presentation about professional PostgreSQL certifications and how these certifications can help an individual advance his or her career in PostgreSQL-related fields. Certification is one of the best ways to tell your client, or your boss, that you are the right person for the job. Whether you are a regular user, a government regulator, a professor or a subcontractor, there will always be a suitable level of PostgreSQL certification for your needs!


Lei Feng

Lei Feng, founder and general manager of Greenplum China, shared his “Greenplum’s open source journey: ecology and community leadership” speech.

The AI-enabled database is also leading the future development direction of databases. Lei shared the trilogy of his core team’s digital advancement in the cloud era and emphasized that the core competitiveness of database companies in the future will be the exploration and practice of AI models.


Shuxin Fang

Shuxin Fang, general manager of the technical support department of Inspur Business Machines, shared his presentation on “K1 power and PostgreSQL help enterprises to build new normal and new core”.

Zhengsheng Ye

Zhengsheng Ye, general manager of Alibaba Cloud Database Products and Head of Database Products and Operations, shared a speech on “Database Development Trend”.

Yicheng Wang

Yicheng Wang, Deputy General Manager of Tencent Cloud Database, shared “The Database Behind Every WeChat Payment”.

Tencent Cloud uses TBase’s distributed solution to carry the WeChat payment business, and at the same time continues to enhance the core value of the product through cluster scalability, enhanced security, and efficient data compression, providing DBaaS for more enterprise users.


Xiaojun Zheng

Xiaojun Zheng, Chief Scientist of HighGo Software, shared his “Review of Commercial Database Technology Innovation”. With 30 years of senior management experience at several well-known database companies, he elaborated on major innovations in commercial databases over the past 30 years, which has important guiding significance for future database development.


HighGo Database (HGDB) places great importance on data security when fulfilling the demands of enterprise users, and it has enhanced its security functions through a variety of technical means, including separation of powers, FDE full disk encryption, security auditing, security marking, etc. More features of the security version are shown in the figure below. Not only that, the HighGo enterprise cluster database also offers flexible scalability, high availability, and effective load balancing.


Chaoqun Zhan

Chaoqun Zhan, a researcher of Alibaba Group and head of the OLAP product department of the database product division, shared “Opportunities and Challenges of Cloud Native Data Warehouse”.


The development of database technology is mainly driven by business scenario requirements and the development of hardware technology. From the perspective of business scenario requirements, Alibaba Cloud, as a leading domestic cloud vendor, mainly integrates the full PolarDB + ADB database product line and integrates PG to respond to users’ various business scenarios.

Zhan shared a cloud-native integrated solution within Alibaba Group, which provides extreme performance and extremely fast, flexible expansion for cloud-native DBaaS.


Bohan Zhang

Bohan Zhang, co-founder of Ottertune, shared effective solutions from the Carnegie Mellon University laboratory for automatically tuning PG parameters, and provided DBA recruitment information. If you are interested, please contact Zhang directly.

Closing the Conference

On the last day of the conference, Peter Zaitsev, CEO of Percona, shared the topic “17 Things Developers Need to Know About Databases” to help database developers out there (both PG and non-PG developers) increase their productivity and quality of work, maintain a good relationship with DevOps and, most importantly, avoid deadly and expensive mistakes!


About the China PostgreSQL Association

The China PostgreSQL Association is a legitimate and non-profit organization under the China Open Source Software Promotion Alliance.

The association’s main focus is to conduct activities around PostgreSQL, organize operations, promote PostgreSQL, host trainings, and contribute to technological innovations and implementations. In addition, the association aims to promote the localization of database development by bridging the PostgreSQL Chinese community with the international community.

Official Website http://www.postgresqlchina.com
