SQL

Storing data between runs

Sometimes there will be a need for your Python scripts to store information between runs. While configuration information is best kept in human readable files (like ini of yaml files) if it is data used internally then a better approach is needed. Unless the data to be stored is just plain text it is usually better long term to use a database for this. On first read, that last statement might seem overkill but a few years down the line when the script has stopped working or you need to extract the data stored within you will curse the decision to go with your own home-rolled file format.

We are only looking for ease of use here, not concurrent access or performance. With this in mind, SQLite is a very mature product and the DLL has been included in the Python standard library since v2.5 so there is nothing to install. It is powerful enough to be used as the data store for the Skype client. A good introduction to the SQLite module in Python (assuming you know a little SQL) can be found on Python Central.

Putting the snippets into a slight more complete example, we can create the cliched task list example with surprisingly little code. I’ve tried to choose the SQL statements to show off a few features you would probably want with your CRUD operations but this is not meant as a SQL tutorial.

import sqlite3

sqlcmds = ["CREATE TABLE IF NOT EXISTS persons (pid INTEGER PRIMARY KEY, givennames VARCHAR, surname VARCHAR);",
    "CREATE TABLE IF NOT EXISTS tasks (tid INTEGER PRIMARY KEY, pid INTEGER, name VARCHAR, percentcomplete INTEGER DEFAULT 0);",
    "INSERT INTO persons (pid,givennames,surname) VALUES (1,'Tom','Smith'),(2,'Dick','Jones'),(3,'Harry','Hat');",
    "INSERT INTO tasks (pid,name) VALUES (1,'Checktest'),(1,'Report Checktest'),(3,'Submit Report');",
    "UPDATE tasks SET percentcomplete = 25 WHERE name = 'Checktest';"]
sqlselect = """SELECT tasks.name, persons.givennames, persons.surname, tasks.percentcomplete
    FROM tasks inner join persons on tasks.pid = persons.pid"""

with sqlite3.connect(r"c:\path\to\sqlite.db") as dbconnection:
    cursor = dbconnection.cursor()
    for statement in sqlcmds:
        cursor.execute(statement)
        db.commit()
    cursor.execute(sqlselect)
    tasklist = ['"%s" with %s %s (%d%% complete)' % row for row in cursor]

print('\n'.join(tasklist))

Why is using SQLite an advantage? For starters you can easily view and execute arbitary SQL with the DB Browser for SQLite utility. When unexpected characters or a dodgy record breaks the code now you have a fighting chance at fixing it. Also SQLite doesn’t create large files, the example above is a mere 3Kb file.

An interesting take on this is y_serial, which basically pickles your Python object and stores it with a label in the SQLite table. I’ve not tested this, so make no recommendation for using it, but it is a novel way to store objects between runs without any SQL knowledge or installing any additional modules.

When I get chance to finish testing and uploading there will be a bigger example of this. I want to monitor the scheduled tasks for failures. There is a command line interface for this called schtasks which I can run using the code for running shell commands. As I am only interested in cases where the task has just failed I need to store the state on the last run and this is where SQLite will come in.

TBC

 

Running SQL scripts

PowerShell has a very useful Invoke-SQLCmd cmdlet to run T-SQL scripts if you install the sqlps module. This module is installed as part of SQL Server Management Studio (SSMS) but you may not want to install the whole package just to run scripts. However you can install just the PowerShell module and a couple of dependencies.

First you need to download module and two dependencies. There are different versions for SQL 2012, 2014 and 2016 but all can manage instances of SQL Server back to 2000 (SQL Server 2000 must be running SP4 or later and SQL Server 2005 SP2 or later) so there is little reason not to just go with SQL 2016. You need the PowerShell Extensions for Microsoft SQL Server (PowerShellTools.msi) module plus the dependencies SQL Server Shared Management Objects (SharedManagementObjects.msi) and CLR Types for Microsoft SQL Server (SQLSysClrTypes.msi). All are available in both 32- and 64-bit depending upon which version of PowerShell you want to run.

SQL Server 2016 Feature Pack (download button and tick packages)
SQL Server 2014 Feature Pack (download button and tick packages)
SQL Server 2012 Feature Pack (links directly to the package download in the Install Instructions tab)

Once downloaded you need to install the CLR Types first, following by the Shared Management Objects and then the PowerShell module. This can be automated with the following commands:

msiexec /i SQLSysClrTypes.msi /qn
msiexec /i SharedManagementObjects.msi /qn
msiexec /i PowerShellTools.msi /qn

Once installed you are ready to go. Later versions of PowerShell will implicitly import the module as needed. Earlier versions will need to add the line Import-Module sqlps

To quickly test everything is working, open a new PowerShell console window and enter the following

Import-Module sqlps
Invoke-Sqlcmd -ServerInstance servername -Query "SELECT @@SERVERNAME"