Azure function bindings

I covered the basics of a Python Azure Function before; now I’m going to look at the bindings in function.json as a way to get additional settings or storage into the function. Most of the information can be found in the Python developer guide.

The guide recommends that the Azure functions are kept within their own folder (called __app__ but you can change this), with tests and other files outside of this directory so they are not packaged up with the deployment. This is not the default if using the wizard in VS Code; if you move your code into a sub-directory afterwards you will need to re-initialise in VS Code in order to do local testing. Despite what the guide says, the .gitignore file should remain in the root. If you are doing local testing you should also have the Azure Storage Emulator running.

The majority of function.json is taken up with the bindings array. The bindings link your function to other resources. All bindings have the following three fields, with additional fields determined by the type (a minimal example follows this list):

  • name: Name of the binding; this should match the parameter name in your function entry point, apart from $return which binds to the returned output of the function. Note the name cannot contain underscores (unfortunately).
  • type: As a minimum, there will be one binding with the trigger type used to call the function (HTTP, timer, queue etc.). Additional types can also be bound to the function like storage tables.
  • direction: Either in (data is to be passed in to the function) or out (function will write data out to the binding).
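
For reference, here is roughly what function.json looks like for an HTTP-triggered function, along the lines of the VS Code HTTP trigger template (the authLevel and methods values are just the template defaults). Any extra bindings described below go into the same bindings array:

{
    "scriptFile": "__init__.py",
    "bindings": [
        {
            "name": "req",
            "type": "httpTrigger",
            "direction": "in",
            "authLevel": "function",
            "methods": ["get", "post"]
        },
        {
            "name": "$return",
            "type": "http",
            "direction": "out"
        }
    ]
}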

Adding an additional binding to a storage table is a useful way to provide the function with configuration. Function apps are already attached to a storage account (connection string is stored in the AzureWebJobsStorage app setting); you can create a table in this storage account and put the configuration in there. Step 7 in the step-by-step guide above touched on storage account bindings – full details on binding a table can be found here.

When adding the binding in function.json, the partitionKey and rowKey are optional. If you specify both, the binding points to a single entity and the JSON string passed in will be an object with all the fields, including PartitionKey and RowKey. If you specify only one of them, the JSON string passed in will be an array of objects matching the given partitionKey or rowKey. If you specify neither, the array will be the entire contents of the table.

Let’s bind a table entity from the configuration table to our function by adding the following object to our bindings array:

{
      "name": "config",
      "type": "table",
      "direction": "in",
      "connection": "AzureWebJobsStorage",
      "tableName": "configuration",
      "partitionKey": "function",
      "rowKey": "myfunc"
}

We can then load this configuration in our Azure function with the following code. Note the example is for an HTTP trigger function.

import json
import azure.functions as func

def main(req: func.HttpRequest, config: str) -> func.HttpResponse:
    # config is the JSON string for the bound table entity
    configuration = json.loads(config)
    # ... use the configuration dictionary as needed, then return a response
    return func.HttpResponse(f"Loaded {len(configuration)} configuration fields")

That’s it; you now have a configuration dictionary with all the fields from the bound table entity. Also note that because the table is set in function.json, you can have different configuration entities for different functions. It’s a lot neater than having hundreds of app settings.

If you are using table(s) for configuration, you will need to create them beforehand. You can use Azure Storage Explorer to do this, both to manage the tables in Azure and in the storage emulator when running locally.
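
If you prefer to script the setup, here is a rough sketch using the azure-cosmosdb-table package (covered in the table storage section below) to create the configuration table and the entity bound above. The connection string targets the local emulator, and the ApiUrl/RetryCount fields are just made-up example settings:

from azure.cosmosdb.table.tableservice import TableService

# "UseDevelopmentStorage=true" targets the local storage emulator;
# swap in the AzureWebJobsStorage connection string for the real account
ts = TableService(connection_string="UseDevelopmentStorage=true")

ts.create_table("configuration")   # does nothing if the table already exists
ts.insert_or_replace_entity("configuration", {
    "PartitionKey": "function",             # matches partitionKey in the binding
    "RowKey": "myfunc",                     # matches rowKey in the binding
    "ApiUrl": "https://example.com/api",    # example settings for the function
    "RetryCount": 3,
})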

Another common type to bind to is a queue (either a storage queue or a service bus queue). Used as a trigger, the function responds to a message being placed onto the queue. Binding to the function output allows you to write messages into a queue. Combining the two allows one function to call another; this is the Microsoft-preferred way of doing this, rather than invoking the function directly with an HTTP request.

Binding a storage queue to the output is covered in step 7 of the tutorial. Creating a new function with an Azure Queue Storage trigger will create the necessary boilerplate code to use. There is not much more to it than that.
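
For the output side, a rough sketch (not the generated boilerplate itself) of an HTTP-triggered function that drops a message onto a storage queue looks something like this, assuming function.json declares an output queue binding named msg:

import azure.functions as func

def main(req: func.HttpRequest, msg: func.Out[str]) -> func.HttpResponse:
    # msg is the output binding named "msg" in function.json
    # (type "queue", direction "out", pointing at a storage queue)
    msg.set(req.params.get("payload", "ping"))
    return func.HttpResponse("message queued", status_code=200)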

If you get the error message Value 'func.Out' is unsubscriptable when adding the queue to the function parameters, try uninstalling pylint with pip uninstall pylint – thanks to Stack Overflow as usual for this.

If you are looking for samples of other types of bindings check out the following repo.

Azure table storage

Storage accounts in Azure can be used for storing four types of information:

  • Blob storage for data blobs using a REST interface (probably the most common use)
  • File shares for file access via SMB (with caveats)
  • Table storage for storing unstructured JSON documents (Cosmos is a better choice if you need database type functionality)
  • Queue for creating message queues (although Service Bus is probably a better choice)

Using table storage is straightforward with Python; finding the documentation to do this in Python is less so, as search results point to a lot of out-of-date articles. So here is a quick rundown and links.

The module you need is not azure-storage (which is now deprecated – that would be too obvious). Instead you should pip install azure-cosmosdb-table. Once installed you use the TableService constructor to connect to the storage account and query or perform CRUD operations on the table.

A row in table storage is referred to as an entity. You can have any fields you like but each entity must have a PartitionKey and RowKey; combined, the two form a unique key to access the entity. If you are creating or updating an entity, your dictionary must contain these two fields. Hopefully the following example will help.

from azure.cosmosdb.table.tableservice import TableService

# Connect to the local Azure Storage Emulator
ts = TableService(connection_string="UseDevelopmentStorage=true")

tables = [t.name for t in ts.list_tables()]
if "monty" not in tables:
    ts.create_table("monty")
    entity1 = {"PartitionKey": "Countries", "RowKey": "Britain", "Ruler": "King",
               "HowToBecome": "Strange women lying in ponds distributing swords"}
    entity2 = {"PartitionKey": "Countries", "RowKey": "Rome", "Ruler": "Emperor",
               "Benefit": "Better sanitation, medicine, education, wine, public order, irrigation, roads, fresh water system and public health"}
    ts.insert_entity("monty", entity1)            # fails if the entity already exists
    ts.insert_or_merge_entity("monty", entity2)   # upserts, merging with any existing entity

# query_entities returns every entity unless you pass a filter
for entity in ts.query_entities("monty"):
    if entity.Ruler == "King":
        print(entity.RowKey)

# Tidy up the demo table if this run created it
if "entity1" in locals():
    ts.delete_table("monty")

In the above example, I connected to the Azure Storage Emulator (running locally). Replace the connection string with one from Azure to connect to a storage account.
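
For example (with a made-up account name and a placeholder key), either of these would connect to a real storage account:

from azure.cosmosdb.table.tableservice import TableService

# Connection string copied from the storage account's Access keys blade
ts = TableService(connection_string=(
    "DefaultEndpointsProtocol=https;"
    "AccountName=mystorageaccount;"
    "AccountKey=<key>;"
    "EndpointSuffix=core.windows.net"))

# Or pass the account name and key separately
ts = TableService(account_name="mystorageaccount", account_key="<key>")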

Azure functions with Python

There has been a lot of hype over the last few years about serverless computing, an oxymoron as the code is definitely running on servers – you just stop caring because you don’t maintain them. Azure functions have finally matured enough to allow you to write Python functions (at least with Python 3.6 to 3.8) without much effort. If you want some background information on Azure functions and where they fit, check out this blog post.

It helps to have a walk-through introduction, and Microsoft handily provides a step-by-step guide here. This assumes you have VS Code and the necessary modules installed, as it uses these to publish the function up to Azure. VS Code also allows you to run the function locally during development.

Your function app will consist of one or more functions, each of which is a REST endpoint. Each endpoint is organised as a module: it sits in its own folder, named the same as the endpoint, and by default executes the main function inside __init__.py. Also inside the folder is a function.json file which contains all the settings for the function. Notice one of these is scriptFile, which allows you to change the name of the Python file should you wish. Also by convention there is a readme.md file describing the function and a sample.dat containing a sample of the data passed to the function if it accepts POST, PUT or PATCH requests.
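
As a rough picture (the default layout the VS Code wizard produces, before moving anything into an __app__ folder), a function app with a single HttpExample endpoint looks something like this:

<project root>
    HttpExample/
        __init__.py       (entry point; main() by default)
        function.json     (bindings and settings for this function)
        readme.md
        sample.dat
    host.json
    local.settings.json   (local app settings; not deployed)
    requirements.txt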

Notice that the Azure function runs inside a virtual environment when run locally. I’ve covered virtual environments before, but briefly, you can enter the virtual environment from the command line with .venv\Scripts\activate (.bat for command prompt and .ps1 for PowerShell). Do not use this to add modules (that should be done using requirements.txt as normal), but it is a good way to test a bit of code.

You are likely to want to pass a few settings into the function. The easiest way for settings shared across all the functions is through app settings. Like web apps, these settings are passed in as environment variables, so they can be read via os.environ; if you want to see all the environment variables (which include any app settings), try changing the output text of HttpExample to:
", ".join(os.environ.keys())

When running locally, you can add app settings to the Values object in the local.settings.json file (in the root). Oddly, when running on the Azure servers, each setting is passed in twice: once with APPSETTING_ prefixed to the name (key) and again without the prefix. Remember that when running under Windows, Python forces environment variable names to uppercase, but on Linux and other POSIX systems environment variables are case sensitive.
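
A small hypothetical helper (the name get_app_setting is mine, not part of the functions runtime) that copes with both the prefixed and unprefixed forms might look like this:

import os

def get_app_setting(name, default=None):
    # Try the plain name first, then the APPSETTING_-prefixed copy,
    # and finally the uppercase variants seen on Windows hosts
    for key in (name, "APPSETTING_" + name, name.upper(), "APPSETTING_" + name.upper()):
        if key in os.environ:
            return os.environ[key]
    return default

print(get_app_setting("MY_SETTING", "fallback"))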

Using app settings for secrets like passwords or API keys is not a great idea. For better security you can store the value in a key vault and put a reference to this in as the app setting. See this post for details.

The above should be enough information to get a running function in Azure written in Python. I am looking at writing a website monitoring suite of functions (similar to Pingdom or StatusCake, but using the requests module to interact with the website and ensure it is working correctly rather than just checking that a page loads), so no doubt there will be other posts on this soon.

Routing table

I recently created an Azure SQL Managed Instance as a proof of concept for a project I was working on. Creating it through the portal also created a routing table with 31 routes to the Internet, which initially confused me.

On reflection, suspecting a simple answer, I set about working out the routing, which I’ve shared below.

Each combined range below is followed by the individual CIDR routes that make it up:

  • 0.0.0.0 to 9.255.255.255: 0.0.0.0/5 + 8.0.0.0/7
  • 11.0.0.0 to 172.15.255.255: 11.0.0.0/8 + 12.0.0.0/6 + 16.0.0.0/4 + 32.0.0.0/3 + 64.0.0.0/2 + 128.0.0.0/3 + 160.0.0.0/5 + 168.0.0.0/6 + 172.0.0.0/12
  • 172.32.0.0 to 192.167.255.255: 172.32.0.0/11 + 172.64.0.0/10 + 172.128.0.0/9 + 173.0.0.0/8 + 174.0.0.0/7 + 176.0.0.0/4 + 192.0.0.0/9 + 192.128.0.0/11 + 192.160.0.0/13
  • 192.169.0.0 to 255.255.255.255: 192.169.0.0/16 + 192.170.0.0/15 + 192.172.0.0/14 + 192.176.0.0/12 + 192.192.0.0/10 + 193.0.0.0/8 + 194.0.0.0/7 + 196.0.0.0/6 + 200.0.0.0/5 + 208.0.0.0/4 + 224.0.0.0/3

Yes, this basically routes everything apart from the private IP address ranges (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16) to the Internet.
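
You can sanity-check this with Python's ipaddress module (Python 3.7+ for subnet_of): carving the three private ranges out of 0.0.0.0/0 reproduces the same 31 CIDR routes.

import ipaddress

private = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

# Start with all of IPv4 and exclude each private range in turn
remaining = [ipaddress.ip_network("0.0.0.0/0")]
for priv in private:
    nets = []
    for net in remaining:
        if priv.subnet_of(net):
            nets.extend(net.address_exclude(priv))   # split net around priv
        else:
            nets.append(net)
    remaining = nets

for net in sorted(remaining):
    print(net)
print(len(remaining), "routes")   # 31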

Azure resource group deployments

Azure comes with some strange quotas. One of them is a limit on Resource Group Deployments. If you go to a resource group, you can see the number of succeeded deployments. The max allowed is 800. Once you reach it you will see an error message similar to the following:

Creating the deployment 'foobar' would exceed the quota of '800'. The current deployment count is '800', please delete some deployments before creating a new one.

What is also strange is that there is no way to fix this from the portal; instead you have to rely on either the CLI or PowerShell. The following PowerShell script is a quick fix that removes all deployments older than two months. Just replace myRG with your resource group name.

$keepafter = (Get-Date).AddMonths(-2)
Get-AzureRmResourceGroupDeployment -ResourceGroupName myRG |
    Where {$_.Timestamp -lt $keepafter} |
    Remove-AzureRmResourceGroupDeployment

Be warned this can take a long time (as in all day long)! Change the first line to change the max age.