Mishaps in Redshift Temporary Tables

Max-Ganz-II · 2024-05-08T13:37:48 1715175468

1. Temp tables are a bad idea, last I knew, because they increase the amount of data in the system tables (every time you create a temp table, you create records for it, it's MVCC, so they don't go away until VACUUMed) and system table bloat has been a serious problem for Redshift, since it seems only to go away on reboot. With a client now, lots of tables, takes 60 seconds to issue a system command. A previous client had to reboot daily. I simply advise clients not to use them.

2. Temp tables do not participate in k-safety, so you avoid that performance cost.

3. I suspect I've seen something odd happening with compression with temp tables. I've not investigated.

4. Using `CREATE TABLE AS` is, for me, verboten. Absolutely forbidden. This is because Redshift selects column encodings, and does a very, very poor job of doing so. Never let Redshift select column encodings (or sort keys, or distribution keys, or rely on auto-vacuum, or auto-analyze, and above all, never use AutoWLM).

sschnei8 · 2024-05-08T16:29:16 1715185756

All excellent points. The best is when most workloads on the cluster use a temp table with an identical name. Then someone comes along and accidentally drops the TEMP from the their temp table ctas and it gets smashed into the public schema.

“Table already exists…”scheduled job run failure galore, fun times.