Solution (Generates a billion integers on my test-cluster):
--INSERT INTO tbl
WITH x AS (
SELECT 1
FROM stl_connection_log a, stl_connection_log b, stl_connection_log c
-- LIMIT 100
)
SELECT row_number() over (order by 1) FROM x;
Interestingly, irrespective of the document, generate_series() actually does work on Redshift:
# select b from generate_series(1,3) as a(b);
┌───┐
│ b │
├───┤
│ 1 │
│ 2 │
│ 3 │
└───┘
(3 rows)
The reason why this wouldn't let you insert any rows to your table though, is that this is a Leader-Node-Only function, whereas INSERTs (on any non-single Redshift Cluster) are run on the Compute Nodes (which don't know about this function).
The reason why the above works, is ROW_NUMBER() and CROSS JOIN allow us to generate a large number of rows, but for that, the initial data-set (here the STL_CONNECTION_LOG System Table) should have at least some rows to multiply on! You could use any other system table (that is available on Compute Nodes) if required, for some other purpose.
Play On!
2 comments:
generate_series is working with Redshift now.
SELECT CURRENT_DATE::TIMESTAMP - (i * interval '1 day') as date_datetime
FROM generate_series(1,31) i
ORDER BY 1
This will generate last 30 days date
Ref: https://stackoverflow.com/q/22554398/1939568
generate_series is working with Redshift only with select..... not with insert or else
Post a Comment