In the application I work on, we have a materialized view built from several joined tables. It is meant to speed up searches by avoiding a join across seven or eight tables in every query. At least, it should have made searching faster, but in practice it didn't, and I will describe why.
Queries against that view were really slow, but only on the production server; on staging everything was fine. The view was defined exactly the same way on both servers, yet queries were ~50 times slower on production.
The first step was to compare execution plans for the same query on both servers.
EXPLAIN ANALYZE SELECT COUNT(*) FROM table_name;
On staging the execution plan involved a parallel scan; on production it used an index scan.
When I discovered that, the first thing that came to mind was that production might have a different configuration that disabled parallel queries. I checked, and that turned out to be false. My next step was to look through various Postgres views that keep statistics about tables. After digging through many numbers that didn't help, I finally discovered that our table (a materialized view actually, but in this context it makes no difference) took 7 GB of disk space on the production server and only 300 MB on staging.
"SELECT pg_size_pretty( pg_total_relation_size('table_name') );"
I expected the row counts to be totally different, but unexpectedly they were ~320k on production and ~310k on staging.
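For completeness, the row counts themselves can be compared with a plain count and cross-checked against the planner's estimate; a minimal sketch, with table_name as the placeholder used throughout this post:

SELECT COUNT(*) FROM table_name;

SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'table_name';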
After a really long investigation, I found the reason why those numbers were so different: the table on production contained an enormous number of dead rows.
"SELECT n_dead_tup
FROM pg_stat_all_tables
WHERE relname = 'table_name'"
At this point I was almost sure that was the reason for our whole problem (or at least I hoped so). I started researching those dead rows: what they are and why they appear. In our application, the major source is the way the view is kept up to date: after every single update to a record in one of the joined tables, the whole materialized view is refreshed. Refreshed, i.e. the whole view's contents are discarded and the query that builds it is run again. PostgreSQL has two types of refresh:
- non-concurrently
This refresh type does not produce any dead rows, but for the duration of the refresh the view is locked and no data can be read from it. This is an unacceptable solution for my project.
- concurrently
This type allows reading data during the process. It works by keeping the old row versions while the new data is written. During the refresh, all SELECT queries still see the old data; after the process, all queries see the newly built view, and the old row versions remain as dead rows. That is how the view gets bloated with tons of unnecessary data. Both refresh commands are sketched below.
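As a minimal sketch (table_name is the placeholder used throughout this post; the concurrent variant additionally requires a unique index on the materialized view):

-- Non-concurrent refresh: blocks reads for the duration, leaves no dead rows behind.
REFRESH MATERIALIZED VIEW table_name;

-- Concurrent refresh: allows reads during the process, but leaves old row versions as dead rows.
REFRESH MATERIALIZED VIEW CONCURRENTLY table_name;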
And here comes the VACUUM mechanism, which is used to remove dead rows from a table or materialized view.
"VACUUM table_name"
This command removes dead rows from the given table, but it has to be run 'manually' or by some application code. There is also a mechanism called autovacuum: PostgreSQL runs a number of workers (the quantity is set in the configuration) that continuously scan the database and run VACUUM on tables that need it. So what do we know now? We have many dead rows in the materialized view, and we also have a mechanism that should clean them up. Why isn't it doing so? To check that, I added a query that logs the last autovacuum and vacuum times together with the dead row count.
"SELECT n_dead_tup, last_autovacuum, last_vacuum
FROM pg_stat_all_tables
WHERE relname = 'table_name'"
After a few days I checked those logs. The first thing to notice was that the autovacuum process was working, but it was triggered only in the evenings. All dead rows were cleaned up, and the next day, starting around 7 a.m., they appeared again and their count kept growing. I checked our search in the morning and indeed it was as fast as it should be, but over the course of the day it got much slower.

After more long research into why this was happening, I found that autovacuum needs a SHARE UPDATE EXCLUSIVE lock on the table, and it cannot get one while the table is held by a conflicting lock. What locks a table that way? REFRESH MATERIALIZED VIEW CONCURRENTLY does. When is that view refreshed in our application? More logging gave the answer: the view is being refreshed almost all the time (during a workday, from morning to evening). As I mentioned, the refresh is triggered by every data update on each of the tables the problematic view is built from. And here is our final answer: dead rows aren't cleaned because autovacuum cannot run during a refresh, and refreshes run continuously from morning to evening.
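To see the blocking for yourself while a refresh is running, a query along these lines (names are placeholders) lists the locks currently held on the view together with the sessions holding them:

SELECT l.mode, l.granted, a.query
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE c.relname = 'table_name';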
Possible solutions:
1. Architectural changes that would prevent the application from refreshing the whole materialized view on every data update. It is definitely possible to update only the rows that actually changed, without dropping and rebuilding the whole table from scratch.
This is definitely the best solution, and I would have chosen it if I had had much more time to spend on the problem, but it would take too long to refactor every place in the application that can update one of the included tables.
2. The second possible fix is to add some breaks between refreshes so that the autovacuum process can run on our view. Theoretically it is a fine idea: it would reduce database load and should be quite fast to implement, but I think there is too much room for unforeseen consequences, so in the end I gave up on it. Even though adding breaks would not take much time to implement, testing it carefully would, and as I noted above, there was no more time left after that really long investigation.
3. The third and final solution I considered was the simplest one. In the job used to refresh the materialized view, I added code that 'manually' vacuums the view of dead rows. During the whole investigation I had run VACUUM by hand many times, so I knew it worked. Going this way guarantees that dead rows cannot pile up in the view, because they are vacuumed as soon as the refresh that creates them ends (a minimal sketch follows below).
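What the job effectively runs, as a minimal sketch (table_name is the placeholder; note that VACUUM cannot run inside a transaction block, so the two statements have to be issued separately):

-- Refresh without blocking readers; this leaves old row versions behind as dead rows.
REFRESH MATERIALIZED VIEW CONCURRENTLY table_name;

-- Clean up those dead rows immediately, since autovacuum had no chance to run during the refresh.
VACUUM table_name;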
Weighing up the pros and cons, I finally chose the third solution. Time was a constraint, and it was the only one that could be tested immediately. It worked, but I still think refactoring the application so it does not refresh the whole view would be the best option.