1. Introduction
This is part 2 of the blog “Understanding Security Features in PostgreSQL”, in which I will be discussing TLS in greater details. I will begin by going over some of the most important security concepts around TLS before jumping into enabling TLS on PostgreSQL server. I believe it is crucial to have sufficient background information on TLS before tweaking the TLS settings in both client and server sides.
In part 1 of this blog, we mostly discussed about authentication and authorization (AA), which is important to identify which client is permitted to connect and which table or column he/she is permitted to operate. Even with the strongest authentication and authorization, the actual communication between client and server will not be encrypted unless Transport Layer Security (TLS) is specifically enabled in the database server. TLS is one of the least understood but commonly used security protocol that ensures the security of many HTTPS sites and other services. TLS is a big protocol and this blog will describe how it works and how to enable TLS in your PostgreSQL server.
Here is the overview of the security
topics that will be covered in all parts of the blog:
Part 1:
- PostgreSQL Server Listen Address
- Host-Based Authentication
- Authentication with LDAP Server
- Authentication with PAM
- Role-Based Access Control
- Assign Table and Column Level Privileges to Users
- Assign User Level Privileges as Roles
- Assign and Column Level Privileges via Roles
- Role Inheritance
Part 2:
- Security Concepts around TLS
- Symmetrical Encryption
- Asymmetrical Encryption (a.k.a Public Key Cryptography)
- Block Cipher Mode of Operation (a.k.a Stream Cipher)
- Key Exchange Algorithm
- TLS Certificate and Chain of Trust
- Data Integrity Check / Data Authentication
- TLS Cipher Suite and TLS handshake
- TLS versions
Part 3:
- Preparing TLS Certificates
- Enabling Transport Layer Security (TLS) to PostgreSQL Server
- Enabling Transport Layer Security (TLS) to PostgreSQL Client
- TLS Connect Examples
- Transparent Data Encryption (TDE)
- Security Vulnerability
2. Security Concepts around TLS
Before we jump into configuring TLS in PostgreSQL. It is super important to have some background information on the following security topics build around TLS.
2.1 Symmetrical Encryption
Symmetrical Encryption is a type of encryption where only one secret key is used to encrypt and decrypt a message. In other words, the connecting client will use the secret key to encrypt the message and send to server, the server uses the same key to decrypt the ciphered message and obtain the original message. This is a very fast encryption operation and may sound simple, but the challenge here is how to securely share this one and only secret key between the client and server, how long should the secret key be used before next rotation? Should the secret key be pre-configured on both client and server sides? Should third-party key management software be integrated? These are some of the common challenges with symmetrical encryption.
The following is some of the most common symmetrical encryption algorithms today with the AES being the most popular: (reference: https://en.wikipedia.org/wiki/Symmetric-key_algorithm). Each algorithm supports key lengths having multiple sizes and normally is denoted after the encryption algorithm name, for example, AES-128, AES-256…etc.
- AES (Advanced Encryption Standard)
- DES (Data Encryption Standard)
- Triple DES
- Blowfish
Symmetrical encryption is normally paired with a Block Cipher Mode of Operation
to encrypt or decrypt a stream of data. Imagine there is a data stream of size 30GB that needs to be encrypted. Without Block Cipher Mode, we will have to load all 30GB of data into memory and encrypt it with (say AES128) and most likely we do not large enough memory to load all the data stream. This is where Block Cipher Mode of Operations come in handy, it encrypts the data stream block by block (most likely 16 byte block) until the entire block is encrypted. We basically can encrypt the 30GB data stream without having to have at least 30GB of memory.
2.2 Asymmetrical Encryption (a.k.a Public Key Cryptography)
Unlike symmetrical encryption, asymmetrical encryption uses two distinct keys called public and private keys; Public key is used for encryption and private key is used for decryption. Both keys are different but related by math and it is much slower than symmetrical encryptions. As name implies, public key can be distributed publicly while private key is to be kept private as it is the only key that is able to decrypt the messages encrypted by public key. This essentially forms a secured one-way communication.
Generally, asymmetrical encryption is not desirable to be used as stream data encryption algorithm though it is more secured; it requires more computational power to perform encryption and decryption and this is a major drawback. Asymmetrical encryption is commonly used as authentication protocol for a client to verify that server is indeed valid. During a TLS handshake for example, server will present its TLS certificate, which contains a public key, to the client, client uses the public key to encrypt a message and asks the server to decrypt with its private key and send back the result. If message match, then client is sure that the server possess the private key and therefore is valid.
The following is some of the most common asymmetrical encryption algorithms today with the RSA and Elliptic curve being the most popular: (reference: https://en.wikipedia.org/wiki/Public-key_cryptography).
- RSA
- DSS
- Elliptic curve
2.3 Block Cipher Mode of Operation (a.k.a Stream Cipher)
Block Cipher Mode of Operation is normally used with Symmetrical encryption to encrypt or decrypt a stream of data block by block. There are several available modes of block cipher operations that have different strengths and weaknesses. Most modes require a complete block of 16 bytes to be able to encrypt. In the case where the input stream is not in multiple of 16, padding is normally use to fill the block.
The following is some of the most common block cipher mode of operations today with the CBC and CTR being the most popular and ECB being the least secured: (reference: https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation).
- Cipher Block Chaining (CBC)
- Counter (CTR)
- Cipher Feedback (CFB)
- Output Feedback (OFB)
- Electronic Codebook (ECB)
The mode is normally denoted with the desired symmetrical encryption algorithm, for example, AES-128-CBC or AES-256-CTR are quite common.
2.4 Key Exchange Algorithm
Key exchange algorithm is a math algorithm designed to make both client and server agree on a secret key without actually sending the key to each other. This is done by pure math equations and require several steps of intermediate token exchange. In the end both client and server will end up with the same value, which is to be used as the secret key for symmetrical encryption algorithms
. Key exchange algorithm is common used in many services such as SSH and TLS. Services like these normally have a handshake
stage where both client and server have to agree on the subsequent algorithms to use for encryption and perform key exchange algorithm to get the secret key for symmetrical encryption.
The following is some of the most common key exchange algorithms with diffie-hellman and elliptic curve diffie-hellman being the most popular (reference: https://en.wikipedia.org/wiki/Key_exchange).
- Diffie-Hellman (DH)
- Elliptic Curve Diffie-Hellman (ECDH)
- Ephemeral Diffie-Hellman (DHE)
- RSA
2.6 Data Integrity Check / Authentication
The data integrity authentication is not to be confused with host-based or role-based authentication mentioned in part 1. Data integrity authentication refers to the methods to ensure that the data stream has been received without being altered during transmission. Think of it as a data checksum. In addition to encryption, ensuring data integrity is also very important security measure to avoid man-in-the-middle attack. Please note that data integrity check and data encryption are 2 separate processes, meaning that you can have data authentication without encryption, or encryption without authentication. SNMPv3 is a good example that treats data authentication and encryption separately while TLS requires both at the same time.
The following is some of the most common hash algorithms with SHA1 and MD5 being the most common (reference: https://en.wikipedia.org/wiki/Message_authentication).
- SHA1
- SHA2
- MD5
- BLAK2
2.5 TLS Certificate and Chain of Trust
TLS certificate and chain of trust are the core concepts in TLS to ensure maximum trust between a client and a server. The certificate used by PostgreSQL is X509 version 3 certificate, which has extension support to further refine the purpose of the certificate issued.
The certificates are created and signed in hierarchy. The certificate created at the top hierarchy is called a root CA
(root Certificate Authority) and is normally created by a trusted organization. This root CA
is able to sign additional Intermediate CA
that can be distributed to other organizations. The intermediate CA can then be used to create and sign individual certificates to be used by services like HTTPS, FTPS…etc.
There are several types of TLS certificate and each has its own place in the certificate hierarchy and serve different purposes. A TLS certificate is a small data file that contains the public key, organization details, trustee’s digital signature, extensions and validity dates. Normally a TLS certificate is generated with a private key. The key pair bounded with the certificate is important as they are required for authentication when a TLS client wishes to connect to the server.
The following image illustrates the idea of certificate trust chain:
As you can see, the root CA is on top of hierarchy and is able to generate and sign additional intermediate CA
and issue to several organizations. The organization then is able to take the intermediate CA and generate additional CA-signed certificates
and matching private keys
to use in their services such as PostgreSQL server, FTPS and HTTPS server.
A CA certificate can be purchased from a trusted organization or generated by oneself using openssl and java key tool. We will go over the procedure to generate these certificates using OpenSSL as examples in part 3 of this blog.
2.7 TLS versions
TLS is a newer protocol that replaces its predecessor, Secured Sockets Layer (SSL). Below is a list of TLS versions that we should use as of today:
- TLSv1.0
- TLSv1.1
- TLSv1.2
- TLSv1.3
TLSv1.3 is the newest TLS version that has significant improvement in the handshake process and introduces many more cipher suites specifically designed for TLSv1.3. Before TLSv1.3, TLSv1.2 is the most popular TLS version deployed in the world today. PostgreSQL server defaults to accept client connection that supports minimum TLS version to be TLSv1.2 and will reject any connection in the versions earlier than v1.2
2.8 TLS Cipher Suite and TLS handshake
TLS cipher suite refers to a set of algorithms that help secure a network connection. The suite of algorithms normally contains
- Key exchange algorithm
- Authentication algorithm
- Asymmetrical encryption algorithm
- Message authentication algorithm
for example, a TLSv1.2 cipher suite TLS_DHE_RSA_WITH_AES_256_CBC_SHA256 indicates the following
- DHE - use Ephemeral Diffie-Hellman key exchange algorithm
- RSA - use RSA asymmetrical keys for authentication
- AES_256_CBC - use AES-256 symmetrical encryption with CBC block cipher mode
- SHA256 - use SHA-256 as message authentication algorithm to make sure exchanged messages are not tempered with.
When a TLS client initiates a TLS connection to a server, a TLS handshake
process takes place that roughly performs the following:
- Agree on the TLS version to use. Abort if version cannot be agreed
- Agree on the cipher suite to use. Abort if cipher suite cannot be agreed
- Certificate exchange
- Client authenticates the server using agreed algorithm
- perform key exchange using agreed algorithm
- ensure handshake message is not tempered with the agreed message authentication algorithm
- secured communication then begins.
7. Summary
TLS is a big protocol involving a lot of steps including certificate exchange, chain of trust verification, key exchange, cipher suite exchange, authentication, data integrity check and finally symmetrical encryption of application data with appropriate block cipher mode. Having adequate fundamental understanding to TLS is crucial to ensure a correct and secured database environment setup. Of course there is more to what we have discussed here so far and I will be producing more articles in the near future to address some of the advanced TLS related practices.