That Guy From Delhi

15 Sept 2017

PsqlForks now supports PipelineDB

After working on this PSQL variant that intends to support all Postgres forks, I finally narrowed down to naming it.

Since this was essentially Psql (for) Forks, quite intuitively, I chose to name it PsqlForks.

Considering that until recently this fork just supported Amazon Redshift, this naming didn't make much sense if it wasn't supporting at least 2 forks :) !

Thus, PsqlForks now supports PipelineDB!

$ /opt/postgres/master/bin/psql -U pipeline -p 5434 -h localhost pipeline
psql (client-version:11devel, server-version:9.5.3, engine:pipelinedb)
Type "help" for help.

pipeline=# \q

2 Sept 2017

psql \d now supports Interleaved / Compound SORTKEYs (in Redshift)

As a side-note, there is a consideration as to whether this should be on a separate section of its own (and not under Indexes, which it clearly isn't). May be another day. Happy Redshifting :) !

Update (15th Sep 2017):
This project has now been named PsqlForks!

31 Aug 2017

psql \d now supports DISTKEY / SORTKEY / ENCODING (in Redshift)

This is in continuation of my work for (my forked version of) psql to better support Redshift (read more here).

Now \d table provides some additional Redshift specific table properties such as:

DISTKEY
SORTKEY
COMPRESSION (ENCODING)
ENCRYPTION

Sample:

t3=# CREATE TABLE customer(

custkey SMALLINT ENCODE delta NOT NULL,

custname INTEGER DEFAULT 10 ENCODE raw NULL,

gender BOOLEAN ENCODE RAW,

address CHAR(5) ENCODE LZO,

city BIGINT identity(0, 1) ENCODE DELTA,

state DOUBLE PRECISION ENCODE Runlength,

zipcode REAL,

tempdel1 DECIMAL ENCODE Mostly16,

tempdel2 BIGINT ENCODE Mostly32,

tempdel3 DATE ENCODE DELTA32k,

tempdel4 TIMESTAMP ENCODE Runlength,

tempdel5 TIMESTAMPTZ ENCODE DELTA,

tempdel6 VARCHAR(MAX) ENCODE text32k,

start_date VARCHAR(10) ENCODE TEXT255

)

DISTSTYLE KEY

DISTKEY (custname)

INTERLEAVED SORTKEY (custkey, custname);

CREATE TABLE

t3=# \d customer

TABLE "public.customer"

------------+-----------------------------+-----------+---------+---------+---------+------------+-----------+----------+------------------------------------

custname | integer | none | t | 2 | f | none | | | 10

gender | boolean | none | f | 0 | f | none | | |

address | character(5) | lzo | f | 0 | f | none | | |

state | double precision | runlength | f | 0 | f | none | | |

zipcode | real | none | f | 0 | f | none | | |

tempdel1 | numeric(18,0) | mostly16 | f | 0 | f | none | | |

tempdel2 | bigint | mostly32 | f | 0 | f | none | | |

tempdel3 | date | delta32k | f | 0 | f | none | | |

tempdel4 | timestamp without time zone | runlength | f | 0 | f | none | | |

tempdel5 | timestamp with time zone | delta | f | 0 | f | none | | |

tempdel6 | character varying(65535) | text32k | f | 0 | f | none | | |

start_date | character varying(10) | text255 | f | 0 | f | none | | |

Now that a few 'ToDos' are listed on Github Issues, next would probably involve working on this ticket, which aims at elaborate SORTKEY details (such as INTERLEAVED / COMPOUND) etc. when using Describe Table.

Update (15th Sep 2017):
This project has now been named PsqlForks!

12 Aug 2017

Redshift support for psql

Am sure you know that psql doesn't go out of it's way to support Postgres' forks natively. I obviously understand the reasoning, which allowed me to find a gap that I could fill here.

The existing features (in psql) that work with any Postgres fork (like Redshift) are entirely because it is a fork of Postgres. Since I use psql heavily at work, last week I decided to begin maintaining a Postgres fork that better supports (Postgres forks, but initially) Redshift. As always, unless explicitly mentioned, this is entirely an unofficial effort.

The 'redshift' branch of this Postgres code-base, is aimed at supporting Redshift in many ways:

Support Redshift related artifacts

Redshift specific SQL Commands / variations
Redshift Libraries

Support AWS specific artifacts

For e.g. AWS Regions

Support Redshift specific changes

For e.g. "/d table" etc.

The idea is:

Maintain this branch for the long-term

At least as long as I have an accessible Redshift cluster

Down the line look at whether other Postgres forks (for e.g. RDS Postgres) need such special attention

Although nothing much stands out yet

Except some rare exceptions like this or this, which do need to go through an arduous long wait / process of refinement.

Change the default port to 5439 (or whatever the flavour supports)

...with an evil grin ;)

Additionally, as far as possible:

Keep submitting Postgres related patches back to Postgres master
Keep this branch up to date with Postgres master

Update (31st August 2017)

Currently this branch supports most Redshift specific SQL commands such as

CREATE LIBRARY
CREATE TABLE (DISTKEY / DISTSTYLE / ...)
Returns non-SQL items like

ENCODINGs (a.k.a. Compressions like ZSTD / LZO etc )
REGIONs (for e.g. US-EAST-1 etc.)

Of course some complex variants (for e.g. GRANT SELECT, UPDATE ON ALL TABLES IN SCHEMA TO GROUP xxx ) don't automatically come up with tab-complete feature. This is primarily because psql's tab-complete feature isn't very powerful to cater to all such scenarios which in turn is because psql's auto-complete isn't a full-fledged parser to begin with.
In a nutshell, this branch is now in a pretty good shape to auto-complete the most common Redshift specific SQL Syntax.
The best part is that this still merges perfectly with Postgres mainline!

Let me know if you find anything that needs inclusion, or if I missed something.

====================================

$ psql -U redshift_user -h localhost -E -p 5439 db
psql (client-version:11devel, server-version:8.0.2, engine:redshift)
Type "help" for help.

db=#

3 Aug 2017

Reducing Wires

Recently got an additional monitor for my workstation@home and found that the following wires were indispensable:

USB Mouse
Monitor VGA / HDMI / DVI cable
USB Hub cable (Pen Drive etc.)

I was lucky that this ($20 + used) Dell monitor was an awesome buy since it came with a Monitor USB Hub (besides other goodies such as vertical rotate etc).

After a bit of rejigging, this is how things finally panned-out:

1 USB Wire (from the laptop) for the MUH (Monitor USB Hub)

This is usually something like this.

Use a USB->DVI converter and use that to connect MUH -> Monitor DVI port

This is usually something like this.

Plug USB Mouse to MUH
With things working so well, I also plugged a Wireless Touchpad dongle to the MUH

So now when I need to do some office work, connecting 1 USB wire gets me up and running!

#LoveOneWires :)

Now only if I could find a stable / foolproof Wireless solution here ;)

29 Jul 2017

Symbols in Redshift User Passwords work just fine

Recently read a few posts / discussions about people doubting Redshift not accepting (working-well with) ASCII symbols in User-Passwords.

It felt like a good time to write this short post showing that Redshift (Engine) seems to work fine with (non-alphanumeric) (printable) ASCII symbols.

You can see a few things (in the sample output given below):

All non-alphanumeric printable ASCII characters worked fine (at least all that my US-International / QWERTY keyboard could throw at it)
For those who also need ' (single-quote) and " (double-quote) you could always use $$ as quote-delimiters
You still need at least One of each of the following:

Upper-Case English-Letter
Lower-Case English-Letter
One Digit / Numeral

------------------------------------------------------------
# psql -U adminuser -h rs_cluster -p 5439 db

psql (9.6.3, server 8.0.2)
Type "help" for help.

rs_cluster adminuser@db-# alter user userb with password 'Aa1~!@#$%^&*()_+-`{}[]|";:,<.>/?';
ALTER USER
Time: 237.012 ms
rs_cluster adminuser@db-# \q

# psql -U userb -h rs_cluster -p 5439 db
Password for user userb:

psql (9.6.3, server 8.0.2)
Type "help" for help.

rs_cluster userb@db-# alter user userb with password $$Aa1~!@#$%^&*()_+-`{}[]|";:,<.>/?'"$$;
ALTER USER
Time: 191.505 ms
rs_cluster adminuser@db-# \q

# psql -U userb -h rs_cluster -p 5439 db
Password for user userb:

psql (9.6.3, server 8.0.2)
Type "help" for help.

rs_cluster userb@db-#
------------------------------------------------------------

21 Jul 2017

Using generate_series() in Redshift

Considering that Redshift clearly states that it doesn't support (the commonly used postgres function) generate_series(), it gets very frustrating if you just want to fill a table with a lot of rows and can't without a valid data-source.

Solution (Generates a billion integers on my test-cluster):

--INSERT INTO tbl
WITH x AS (
SELECT 1
FROM stl_connection_log a, stl_connection_log b, stl_connection_log c
-- LIMIT 100
)
SELECT row_number() over (order by 1) FROM x;

For a Redshift server with even a basic level of login activity, this should generate enough rows. For e.g. On my test cluster, where I am the only user, this currently generates 4034866688 (4 billion) rows :) !

Interestingly, irrespective of the document, generate_series() actually does work on Redshift:

# select b from generate_series(1,3) as a(b);
┌───┐
│ b │
├───┤
│ 1 │
│ 2 │
│ 3 │
└───┘
(3 rows)

The reason why this wouldn't let you insert any rows to your table though, is that this is a Leader-Node-Only function, whereas INSERTs (on any non-single Redshift Cluster) are run on the Compute Nodes (which don't know about this function).

The reason why the above works, is ROW_NUMBER() and CROSS JOIN allow us to generate a large number of rows, but for that, the initial data-set (here the STL_CONNECTION_LOG System Table) should have at least some rows to multiply on! You could use any other system table (that is available on Compute Nodes) if required, for some other purpose.

Play On!