Massive Technical Interviews Tips: Postgresql Advanced Usage

Tuesday, November 13, 2018

Postgresql Advanced Usage

https://medium.com/namely-labs/syncing-cache-with-postgres-7a4d78cec022

Since we really only care about updating the cache when the database is updated, we can let the database itself update the caches by broadcasting whenever a change is made. PostgreSQL provides a publish-subscribe mechanism called LISTEN/NOTIFY. Like any pub-sub implementation, it lets you define channels on which the database broadcasts a text payload; other sessions can listen on those channels and receive the messages asynchronously. PostgreSQL holds pending notifications in a queue and drops them only once every registered listener has received them. Keep this in mind: if a listener stops consuming, the queue can fill up, and once it is full further NOTIFY calls fail with an error. Lastly, we can build a simple trigger in PostgreSQL that sends a NOTIFY on every insert into a table.

For example, let’s say we have an application that keeps track of employees and the departments they belong to. Each department has an employee designated as the manager of that department. For processing purposes, it’d be helpful if we kept a directory in memory of all the employees and who their department manager is.

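CREATE OR REPLACE FUNCTION new_hire_notify() RETURNS trigger AS $$
  DECLARE
    payload varchar;
    mid uuid;
  BEGIN
    -- Look up the manager of the new employee's department.
    SELECT manager_id INTO mid FROM departments
    WHERE id = NEW.department;
    -- Payload format: "<employee id>, <manager id>".
    payload := CAST(NEW.id AS text) ||
      ', ' || CAST(mid AS text);
    -- Broadcast the payload on the new_hire channel.
    PERFORM pg_notify('new_hire', payload);
    RETURN NEW;
  END;
$$ LANGUAGE plpgsql;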

CREATE TRIGGER value_insert
AFTER INSERT
ON employees
FOR EACH ROW
  EXECUTE PROCEDURE new_hire_notify();

Then we create a new listener connection, which is a separate TCP connection to PostgreSQL. On that connection, we specify the channels to listen to; we can subscribe to multiple channels on the same listener by calling listener.Listen once per channel. Finally, we pass the listener to the Cache.Listen method and spin it off into a goroutine.
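The cache side of this can be sketched roughly as follows. This is a minimal illustration, not the original article's code: the Directory type, its method names, and the use of a plain channel of strings are all assumptions made here. In a real program the payloads would come from the driver's notification channel (for example the Extra field of each notification, if lib/pq is the driver in use); the only thing taken from the source is the "<employee id>, <manager id>" payload format built by new_hire_notify.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Directory is a hypothetical in-memory cache mapping
// employee ID -> manager ID. The name is an assumption for this sketch.
type Directory struct {
	mu       sync.RWMutex
	managers map[string]string
}

func NewDirectory() *Directory {
	return &Directory{managers: make(map[string]string)}
}

// Apply parses one payload of the form "<employee id>, <manager id>"
// (the format produced by the new_hire_notify trigger) and stores it.
func (d *Directory) Apply(payload string) error {
	parts := strings.SplitN(payload, ",", 2)
	if len(parts) != 2 {
		return fmt.Errorf("malformed payload %q", payload)
	}
	emp := strings.TrimSpace(parts[0])
	mgr := strings.TrimSpace(parts[1])
	if emp == "" {
		return fmt.Errorf("malformed payload %q", payload)
	}
	d.mu.Lock()
	defer d.mu.Unlock()
	d.managers[emp] = mgr
	return nil
}

// ManagerOf returns the cached manager ID for an employee, if known.
func (d *Directory) ManagerOf(emp string) (string, bool) {
	d.mu.RLock()
	defer d.mu.RUnlock()
	mgr, ok := d.managers[emp]
	return mgr, ok
}

// Listen applies payloads until the channel is closed. The channel stands
// in for the database listener's notification stream (an assumption).
func (d *Directory) Listen(payloads <-chan string) {
	for p := range payloads {
		if err := d.Apply(p); err != nil {
			fmt.Println("skipping notification:", err)
		}
	}
}

func main() {
	dir := NewDirectory()
	payloads := make(chan string)
	done := make(chan struct{})

	// Run the listener loop in its own goroutine, as the text describes.
	go func() {
		dir.Listen(payloads)
		close(done)
	}()

	payloads <- "e-1, m-9" // simulated new_hire notification
	close(payloads)
	<-done

	mgr, ok := dir.ManagerOf("e-1")
	fmt.Println(mgr, ok)
}
```

The read-write mutex is there because the listener goroutine writes to the map while request handlers read from it; a production cache would also need to reload its state after a dropped listener connection, since notifications sent while disconnected are not replayed.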

https://simongui.github.io/2016/12/02/improving-cache-consistency.html

MySQL has a binlog replication protocol which is used for primary/secondary replication. This is essentially a replicated queue that has all the transactions recorded in-order as shown in Figure 4.

This isn’t a popular solution but I say, why not? It works very well. You can write an application that speaks the MySQL binlog replication protocol, consumes the binlog entries, and executes SET operations against the cache service(s). There are two ways you could consume the binlog data.

1. Interpret the raw SQL syntax and issue SET operations.
2. The web application embeds cache keys as a comment in the SQL.

Both of these options work well, because the binlog statements also expose the scope of each transaction, which you can use if you need it and the target system supports atomic multi-set operations. I prefer the second option because it’s easier to parse and the application already has this information in most cases.
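To make the second option concrete, here is a minimal sketch of pulling cache keys out of a statement read from the binlog. The comment marker (`/* cache-keys: ... */`), the comma-separated key format, and the CacheKeys function name are all hypothetical conventions chosen for this example; the source only says that keys are embedded as a SQL comment. It also assumes the statement text is available to the consumer.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// keyComment matches a hypothetical marker such as
//   /* cache-keys: user:42, dept:7 */
// The marker name and format are assumptions for this sketch.
var keyComment = regexp.MustCompile(`/\*\s*cache-keys:\s*([^*]*?)\s*\*/`)

// CacheKeys returns the cache keys embedded in a SQL statement,
// or nil if the statement carries no cache-keys comment.
func CacheKeys(stmt string) []string {
	m := keyComment.FindStringSubmatch(stmt)
	if m == nil {
		return nil
	}
	var keys []string
	for _, k := range strings.Split(m[1], ",") {
		if k = strings.TrimSpace(k); k != "" {
			keys = append(keys, k)
		}
	}
	return keys
}

func main() {
	stmt := "UPDATE employees SET department = 7 WHERE id = 42 " +
		"/* cache-keys: user:42, dept:7 */"
	for _, key := range CacheKeys(stmt) {
		// A real consumer would issue a SET (or delete) against the
		// cache service here instead of printing.
		fmt.Println("refresh", key)
	}
}
```

This avoids parsing SQL at all: the consumer only needs to recognize one comment pattern, which is the reason the text gives for preferring this option.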

https://stackoverflow.com/questions/1772810/is-there-any-way-to-let-mysql-server-push-db-updates-to-a-client-program

For an actively connected client interested in cache invalidation techniques:

SQL Server supports Query Notifications.
Oracle supports Continuous Query Notifications.
MySQL supports replication streams, but they're not the same as update notifications: they cannot be set up and torn down dynamically for a client, and they do not work well for monitoring the individual rows a particular application is interested in.

For a disconnected client interested in data sync, all vendors support some sort of replication.
