Turning on query logging for PostgreSQL in Amazon RDS

This is a very specific topic, but I needed to do it to track down performance of a slow web service call.

First, go to your Amazon RDS instances tab and select the database that needs logging turned on, then go to “See details” under instance actions.

Find the Parameter Group attached to the database. Click on it to edit it, or go to “Parameter groups” in the sidebar and find the corresponding parameter group to edit.

There are two parameters you want to look for: log_statement and log_min_duration_statement (see update below). log_statement accepts four values: none (default), ddl, mod, and all. For nearly all purposes, you want to change this to all. log_min_duration_statement sets a duration threshold (in milliseconds): any SQL query that runs at least that long gets logged, which lets you specifically target slow queries. The default value is -1 (disabled). In my case, most queries were in the 3-5ms range, so I set it to 10.
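
If you want to confirm the new values actually took effect, you can query pg_settings from any client. Here's a small sketch using Npgsql; the connection string is a placeholder:

using System;
using Npgsql;

class CheckLoggingParameters
{
    static void Main()
    {
        // Placeholder connection string; point it at your RDS endpoint.
        var connString = "Host=my-instance.abc123.us-east-1.rds.amazonaws.com;" +
                         "Username=postgres;Password=secret;Database=mydb";
        using (var conn = new NpgsqlConnection(connString))
        {
            conn.Open();
            var sql = "SELECT name, setting FROM pg_settings " +
                      "WHERE name IN ('log_statement', 'log_min_duration_statement')";
            using (var cmd = new NpgsqlCommand(sql, conn))
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine($"{reader.GetString(0)} = {reader.GetString(1)}");
                }
            }
        }
    }
}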

Once you save the parameters, it’ll take about a minute or so for them to be applied to your database(s) and logging to begin. Once they are done applying, go back to “See details” for your database instance, and scroll down to the “Logs” section. Logs (postgres.log files) are displayed from oldest to newest, so you actually have to click through to the last page (for me, page 15) to see the newest log files.

It turns out we had very few queries that were over 10ms, and the slowest one was 68ms, so the bottleneck was elsewhere.

UPDATE 2019-03-17

The original entry was written for Postgres 9.4 in 2018. It looks like I was out of my mind at the time: the setting is called “log_min_duration_statement”. As a helpful person pointed out in the comments, log_statement = all will log all statements, not just schema updates and data modifications.

Here’s a summary of what the Postgres 9.4 documentation says about these variables:

log_min_duration_statement logs the duration of any statement that runs longer than the given time span; setting it to 0 logs all statement durations. A Stack Overflow answer also indicates that the log_statement setting has no effect on which statements are logged by log_min_duration_statement.

While we’re at it, let’s talk about log_duration, too. This boolean value causes every duration to be logged, but differs from log_min_duration_statement in that log_duration does not log query text, while log_min_duration_statement will log query text of statements that exceed the duration.

Source: https://www.postgresql.org/docs/9.4/runtime-config-logging.html

Receiving and processing emails in Azure vs. AWS

One big difference I noticed between Amazon Web Services and Windows Azure is that Azure doesn’t have built-in support for sending and receiving emails, while AWS has Simple Email Service (SES). (Aside: one of my friends at Amazon says they’re running out of words to put between “simple” and “service”.) This is probably because the power of the cloud has been harnessed to send spam in the past, and indeed, Azure compute IP blocks have been added to spam blacklists. To send emails from Azure, Microsoft recommends using an external service like SendGrid.

However, what if you want to receive emails in Azure? I did some digging and there isn’t really a well-defined way. The only plausible results I found were setting up a compute role with a third-party SMTP server, or using an external service like Postmark, which cleans up email for you before calling a user-provided webhook. Postmark is nice because you can configure a spam filter before your webhook is called, which you would have to do manually if you were to run your own SMTP server.

Now let’s take a look at Amazon SES. SES has been around for a while now (it launched in 2011). You can create a rule set that chains multiple actions together when an email is received. For example, the first action can copy the email to an S3 (Simple Storage Service) bucket, followed by triggering a Lambda function (a serverless piece of code) to process the email, and the final action can be an SNS (Simple Notification Service) call to send a push notification to a mobile app, all from within AWS.
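
To make that concrete, here’s a hedged sketch of what creating such a rule could look like with the AWS SDK for .NET. The rule set name, recipient, bucket, and ARNs are placeholders I made up, and the rule set must already exist and be active:

using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.SimpleEmail;
using Amazon.SimpleEmail.Model;

class CreateEmailReceivingRule
{
    static async Task Main()
    {
        // Credentials and region come from the default AWS configuration.
        var ses = new AmazonSimpleEmailServiceClient();

        // Chain of actions for an incoming email: copy to S3, invoke a Lambda,
        // then publish an SNS notification. All names and ARNs are placeholders.
        var rule = new ReceiptRule
        {
            Name = "process-incoming-mail",
            Enabled = true,
            Recipients = new List<string> { "inbox@example.com" },
            Actions = new List<ReceiptAction>
            {
                new ReceiptAction { S3Action = new S3Action { BucketName = "my-incoming-mail-bucket" } },
                new ReceiptAction { LambdaAction = new LambdaAction { FunctionArn = "arn:aws:lambda:us-east-1:123456789012:function:ProcessEmail" } },
                new ReceiptAction { SNSAction = new SNSAction { TopicArn = "arn:aws:sns:us-east-1:123456789012:email-received" } }
            }
        };

        await ses.CreateReceiptRuleAsync(new CreateReceiptRuleRequest
        {
            RuleSetName = "my-rule-set",
            Rule = rule
        });
    }
}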

Amazon can even register your domain name using Amazon Route 53, and you probably should use that as well, because transferring an existing domain is a 7-step hassle. Also, if you registered or transferred your domain recently, ICANN rules prohibit you from transferring it again for 60 days. Don’t shoot yourself in the foot by registering a domain name with another registrar within 60 days of when you plan to use SES, or you won’t be able to use that domain for receiving email until the prohibition expires.

So if you’re trying to send and receive emails in Windows Azure, you’re pretty much forced to use a third party service, whereas if you’re with Amazon, you can do everything from within AWS.

Getting Started with Azure Functions

I was investigating Azure Functions not so long ago when I came across the Azure Function challenge – a set of tasks posted by Microsoft for people to try out and learn about Azure Functions. Azure Functions is a relatively new (at time of writing) service offered by Microsoft, and is Microsoft’s version of AWS Lambda functions. It allows you to write a snippet of code which is triggered by certain events, and you’re charged by the number of times your code is executed and the memory consumption * time (measured in GB-seconds) of your executions. Here’s how I got started.

Editing tools

The first method of actually getting the code into Azure is editing it directly through portal.azure.com. The second method is GitHub combined with Visual Studio Tools for Azure Functions, which is still in preview, has a particular set of prerequisites to install, and has a few known issues. However, its biggest advantages are that you get IntelliSense and you can run the code locally in an Azure simulator.

Triggers

Triggers are what cause your Azure Function to run. You select a trigger either through the Azure Portal or by editing the function.json file in the Azure Function’s project directory, which is where triggers and data bindings are configured. A few example triggers are BlobTrigger, HttpTrigger, QueueTrigger, and TimerTrigger.
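
For instance, a timer-triggered function’s function.json might look something like this (a sketch; the binding name and the five-minute schedule are just examples):

{
  "bindings": [
    {
      "type": "timerTrigger",
      "direction": "in",
      "name": "myTimer",
      "schedule": "0 */5 * * * *"
    }
  ],
  "disabled": false
}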

Input and output binding

Once you get the trigger set up, you most likely will want to get at the input parameters, and also read or write to Azure Storage. For HTTP triggers, the HTTP request is bound by default to a request parameter called req, and the HTTP response is the return value. The example code provided by Microsoft does a good job of illustrating how to use both of them.
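
As a minimal sketch of that pattern (not the Microsoft sample itself), an HTTP-triggered run.csx that echoes the request body might look like this:

using System.Net;

// Minimal HTTP-triggered function: "req" is bound to the incoming request,
// and the returned HttpResponseMessage becomes the HTTP response.
public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    log.Info("HTTP trigger fired.");

    // Read the request body (if any) and echo it back.
    string body = await req.Content.ReadAsStringAsync();

    return req.CreateResponse(HttpStatusCode.OK, $"Received: {body}");
}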

Accessing Azure Storage

Sometimes you want to access Azure Storage to… well, to store something from the request. I found the documentation here kind of lacking (Azure Functions hit v1.0 in November 2016) and figured this out through a lot of trial and error, so I hope someone finds this useful.

To write out to an Azure Table, first you need an actual Azure Storage Table… Assuming you have one set up, you need the account name, account key, the storage account’s connection string, and the target table name. Give the table parameter a name like tableBinding, which you will use in code. This is what my configuration looks like in the Azure portal:

This is what the corresponding function.json looks like:

{
  "bindings": [
    {
      "type": "httpTrigger",
      "direction": "in",
      "webHookType": "genericJson",
      "name": "req"
    },
    {
      "type": "http",
      "direction": "out",
      "name": "res"
    },
    {
      "type": "table",
      "name": "tableBinding",
      "tableName": "AzureChallengeTable",
      "connection": "azurefunctionsc99a8a83_STORAGE",
      "direction": "out"
    }
  ],
  "disabled": false
}

Note the “connection” property: it names an app setting that holds the storage account’s connection string. Locally, you can define it in your appsettings.json file; in Azure, it goes under Function app settings > Configure app settings > App settings. This setting is what allows you to access the Azure Storage account when you’re running locally.

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "",
    "AzureWebJobsDashboard": "",
    "azurefunctionsc99a8a83_STORAGE": "ConnectionStringHere"
  }
}

We configured the table binding to be called tableBinding, so our function signature now looks like this:

public static HttpResponseMessage Run(HttpRequestMessage req, CloudTable tableBinding, TraceWriter log)

You will also need to add #r "Microsoft.WindowsAzure.Storage" to the top of the run.csx file and add using Microsoft.WindowsAzure.Storage.Table; (and install the NuGet package if you’re coding from VS). Now, we can insert objects that are a subclass of TableEntity (make sure you define how PartitionKey and RowKey are calculated; see the sketch below) with a line like this:

tableBinding.Execute(TableOperation.InsertOrReplace(myEntity));
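
Putting those pieces together, here’s a minimal sketch of what such a TableEntity subclass might look like near the top of run.csx, above the Run method (MyEntity, its key fields, and the Payload property are illustrative names, not from the original function):

#r "Microsoft.WindowsAzure.Storage"

using Microsoft.WindowsAzure.Storage.Table;

// Hypothetical entity type; the class and property names are illustrative.
public class MyEntity : TableEntity
{
    // Parameterless constructor is required for TableOperation.Retrieve<MyEntity>.
    public MyEntity() { }

    public MyEntity(string userId, string itemId)
    {
        // Decide up front how the keys are derived from your data.
        PartitionKey = userId;
        RowKey = itemId;
    }

    public string Payload { get; set; }
}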

To read:

var tableResult = tableBinding.Execute(TableOperation.Retrieve<MyEntity>(partitionKey, rowKey));
var myEntity = tableResult.Result as MyEntity;

Deploying

If you wrote your code in VS, you can upload the project to GitHub, then use the Azure Functions portal to deploy the code from GitHub. One very important thing I noticed here is that the *.json and run.csx files must be at the top level of your repo; if you have multiple functions residing in multiple directories in the repo, Azure won’t find the code. So this basically means you need one repo per Azure Function.

One other quirk I noticed was that sometimes I got an HTTP 4xx response when testing, which was caused by an Azure bug with function keys. The workaround for this is to call the Azure Function with the key for the function, not the admin key (which is shared by all your Azure Functions).

In conclusion

Azure Functions (like AWS Lambda) is promising because it is simple, covers some very useful and common scenarios, and is easy to scale, but I found the documentation kind of spotty and hope it will improve in the coming months.