7 Nov 2017

Prewarming / Initializing an RDS Postgres instance (from S3)

As many of you know, that AWS RDS Postgres uses EBS which has an interesting feature called Lazy Loading that allows it to instantiate a disk (the size of which can be mostly anything from 10GB to 6TB) and it comes online within a matter of minutes. Although a fantastic feature, this however, can lead to unexpected outcomes when high-end production load is thrown at a newly launched RDS Postgres instance immediately after Restoring from a Snapshot.

One possible solution is to use the pg_prewarm Postgres Extension that is well supported in RDS Postgres, immediately after Restoring from a Snapshot, thereby reducing the side-effects of Lazy Loading.

Although pg_prewarm was originally meant for populating buffer-cache, this extension (in this specific use-case) is heaven-sent to initialize (fetch), (almost) the entire snapshot from S3 on to the RDS EBS volume in question. Therefore, even if you use pg_prewarm to run through all tables etc., thereby effectively evicting the recent run for the previous table from buffer-cache, it still does the job of initializing all disk-blocks with respect to the EBS volume.

I've just checked in the SQL to this repository that seems to do this magic pretty well. It also enlists why this would only take you ~70% of the way owing to restrictions / limitations (as per my current understanding).

In the Sample below, I restored a new RDS Postgres instance from a Snapshot and immediately thereafter ran this SQL on it.


  • Notice that the first table (pgbench_accounts) takes about 22 seconds to load the first time, and less than a second to load the second time.
  • Similarly the second table (pgbench_history) takes 15 seconds to load the first time and less than a second, the second time :) !



pgbench=>       SELECT clock_timestamp(), pg_prewarm(c.oid::regclass),
pgbench->       relkind, c.relname
pgbench->       FROM pg_class c
pgbench->         JOIN pg_namespace n
pgbench->           ON n.oid = c.relnamespace
pgbench->         JOIN pg_user u
pgbench->           ON u.usesysid = c.relowner
pgbench->       WHERE u.usename NOT IN ('rdsadmin', 'rdsrepladmin', ' pg_signal_backend', 'rds_superuser', 'rds_replication')
pgbench->       ORDER BY c.relpages DESC;
        clock_timestamp        | pg_prewarm | relkind |        relname
-------------------------------+------------+---------+-----------------------
 2017-11-07 11:41:44.341724+00 |      17903 | r       | pgbench_accounts
 2017-11-07 11:42:06.059177+00 |       6518 | r       | pgbench_history
 2017-11-07 11:42:17.126768+00 |       2745 | i       | pgbench_accounts_pkey
 2017-11-07 11:42:21.406054+00 |         45 | r       | pgbench_tellers
 2017-11-07 11:42:21.645859+00 |         24 | r       | pgbench_branches
 2017-11-07 11:42:21.757086+00 |          2 | i       | pgbench_branches_pkey
 2017-11-07 11:42:21.757653+00 |          2 | i       | pgbench_tellers_pkey
(7 rows)

pgbench=>
pgbench=>       SELECT clock_timestamp(), pg_prewarm(c.oid::regclass),
pgbench->       relkind, c.relname
pgbench->       FROM pg_class c
pgbench->         JOIN pg_namespace n
pgbench->           ON n.oid = c.relnamespace
pgbench->         JOIN pg_user u
pgbench->           ON u.usesysid = c.relowner
pgbench->       WHERE u.usename NOT IN ('rdsadmin', 'rdsrepladmin', ' pg_signal_backend', 'rds_superuser', 'rds_replication')
pgbench->       ORDER BY c.relpages DESC;
        clock_timestamp        | pg_prewarm | relkind |        relname
-------------------------------+------------+---------+-----------------------
 2017-11-07 11:42:33.914195+00 |      17903 | r       | pgbench_accounts
 2017-11-07 11:42:33.917725+00 |       6518 | r       | pgbench_history
 2017-11-07 11:42:33.918919+00 |       2745 | i       | pgbench_accounts_pkey
 2017-11-07 11:42:33.919412+00 |         45 | r       | pgbench_tellers
 2017-11-07 11:42:33.919427+00 |         24 | r       | pgbench_branches
 2017-11-07 11:42:33.919438+00 |          2 | i       | pgbench_branches_pkey
 2017-11-07 11:42:33.919443+00 |          2 | i       | pgbench_tellers_pkey
(7 rows)


14 Oct 2017

First alpha release of PsqlForks - Menon

Primer: PsqlForks aims to support all DB Engines that (even partially) speak Postgres (psqlforks = psql for Postgres forks).

Given that PsqlForks has been in development for a few weeks, it's time to stabilize a bit and towards that, we finally have Menon, PsqlForks first Alpha Release. Being an alpha, by definition it isn't ready for production, but it feels stable enough ... feel free to test it out!

Importantly, this fork is synced with postgres/master regularly, and should ideally sport all recent psql developments. Further, I am not a C expert and am just barely comprehending Postgres, so let me know of any 18-wheelers that I didn't see.

The release title - 'Menon', is a common sub-Caste in South-Indian state of Kerala. Selecting this nomenclature emanates from the idea of popularizing (heh!) common names and places from Kerala... and that it doesn't hurt to have an identifiable name (and while at it, add character) to a Release :) 

This release includes: 

  • Decent support for Redshift:
    • SQL tab completion for Redshift related variations
    • \d etc. now support Redshift specifics - ENCODINGs / SORTKEYs / DISTKEY / COMPRESSION etc.
    • Support Temporary Credentials using IAM Authentication (via AWS CLI)
    • View detailed progress here.
  • Basic support / Recognition semantics for:
    • CockroachDB - view progress here
    • PipelineDB
    • PgBouncer
    • RDS PostgreSQL
You could read more here:

For the interested:

13 Oct 2017

PsqlForks supports AWS IAM authentication for Redshift

With this commit, PsqlForks ( http://psqlforks.com ) can now fetch credentials from AWS IAM. Read more about Redshift's support for generating database credentials using IAM authentication feature, here.

Since the entire AWS CLI isn't baked into PsqlForks (yet!), you'd need a working copy of AWS CLI installed / working on the host (from where psql is called).

This took a while, since I missed the basic assumption that Redshift enforces SSL and psql doesn't attempt SSLMODE by default in the first try. The fact that CYGWIN wasn't super-smooth with AWS CLI in my test installation, didn't help either.

But as they say, all's well that ends well. There are few obvious additions that are possible (such as expiration validation / re-use unexpired credentials on re-connect etc.) but this should get merged in the forks mainline soon.

I guess it's time to begin thinking of releases, instead of making the mainline jittery with feature additions such as this one.

Yenjoy!


$ psql "sslmode=require host=redshift_cluster port=5439 dbname=redshift2" -U testing1
Password for user testing1:
psql: fe_sendauth: no password supplied

$ psql -I "sslmode=require host=redshift_cluster port=5439 dbname=redshift2" -U testing1

CLI: aws redshift get-cluster-credentials --auto-create --db-user testing1 --cluster-identifier redshift2 # Informational / testing output

psql (client-version:11devel, server-version:8.0.2, engine:redshift)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: on)
Type "help" for help.

redshift2=> select current_user;
 current_user
--------------
 testing1
(1 row)

redshift2=> \du
                     List of roles
 Role name |          Attributes           | Member of
-----------+-------------------------------+-----------
 redshift2 | Superuser, Create DB         +|
           | Password valid until infinity |
 testing1  |                               |

redshift2=> \q

$ ./psql --help | grep -i iam
  -I, --aws-iam-redshift   use temporary database credentials from AWS IAM Service

12 Oct 2017

PsqlForks now recognizes RDS PostgreSQL separately

With this commit, PsqlForks ( http://psqlforks.com ) now recognizes RDS PostgreSQL separately.

This isn't utilized much yet, but the infrastructure is going to be helpful in skipping / avoiding some commands that are defunct / not possible in the RDS PostgreSQL offering.

28 Sep 2017

PsqlForks now recognizes PgBouncer

With this commit, PsqlForks knows when it's talking to PgBouncer (and not Postgres).

Down the line, this should pave way for PsqlForks to more cleanly convey why (most of) the given psql shortcut(s) don't work (and what else does).

As always, the psql/README always has the most updated status of any engine support.

$ psql -h localhost -E -p6543 -U postgres pgbouncer
psql (client-version:11devel, server-version:1.7.1/bouncer, engine:pgbouncer)
Type "help" for help.

pgbouncer=# show version;
NOTICE:  pgbouncer version 1.7.1
SHOW
pgbouncer=#