Modules

Modules for Python

Python SQL Server driver on Linux

So you have packaged your SQL monitoring and maintenance routines into a web server and demonstrated that it all works from your computer. Impressed, they ask for it to be put on a proper server – a Linux box. Five years ago this would have involved using unsupported third-party drivers, and who ran internal Linux servers anyway? Now the request seems almost reasonable, although you will have to jump through more hoops than you would with Windows.

First off I’ll assume you are using the pyodbc module. On Linux, installing it requires a C compiler plus the Python and unixODBC development headers. If you have chosen a minimal install then you’ll need to install these first. This can be done with the following commands (depending upon the flavour):

Redhat (Centos/Fedora)
sudo yum groupinstall 'Development Tools' -y
sudo yum install python-devel unixODBC-devel -y

Debian (Ubuntu)
sudo apt-get install build-essential -y
sudo apt-get install python-dev unixodbc-dev -y

With this done you can now pip install pyodbc. The pyodbc module is a wrapper around the native system drivers so you will need to install a suitable unixODBC driver. Microsoft have produced an official unixODBC driver since 2012 and it has been regularly maintained since. Installation instructions for v13 can be found in this blog post.

With pyodbc and unixodbc set up all you need to change in your actual code is the driver on the ODBC connection string to ‘ODBC Driver 13 for SQL Server’ and away you go. As a quick test, the following example will establish a connection and return the servername through a SQL query.

import pyodbc
cnxnstr = "Driver=ODBC Driver 13 for SQL Server;Server=<yourserver>;Uid=<yourusername>;Pwd=<yourpassword>;database=<yourdatabase>"
cnxn = pyodbc.connect(cnxnstr)
cursor = cnxn.cursor()
cursor.execute("SELECT @@SERVERNAME")
result = cursor.fetchall()
for row in result:
    print(row)
cursor.close()
cnxn.close()
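If the connection string is assembled in more than one place, a small helper can keep the pieces readable. The following is only a sketch: build_connection_string is a hypothetical helper of my own, not something pyodbc provides, and the server, database and credential values are placeholders. It wraps the driver name in braces, which is the usual ODBC way of writing a driver name that contains spaces.

```python
def build_connection_string(server, database, username, password,
                            driver="ODBC Driver 13 for SQL Server"):
    """Assemble an ODBC connection string from its parts (hypothetical helper)."""
    parts = [
        ("Driver", "{%s}" % driver),  # braces protect the spaces in the driver name
        ("Server", server),
        ("Database", database),
        ("Uid", username),
        ("Pwd", password),
    ]
    return ";".join("%s=%s" % (key, value) for key, value in parts)


# Placeholder values; substitute your own server and credentials
cnxnstr = build_connection_string("myserver", "mydatabase", "myuser", "mypassword")
print(cnxnstr)
```

The resulting string can be passed to pyodbc.connect exactly as in the test above. If the connection fails with a driver error, pyodbc.drivers() lists the driver names that unixODBC knows about, which is a quick way to check the name you are passing.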

Virtual environments in Visual Studio

A virtual environment in Python is a folder with everything needed to set up a local configuration isolated from the rest of the system. This allows you to have modules installed locally which are different from, or do not exist in, the global Python configuration. If you have used Node.js then you can think of virtual environments as npm’s default way of working – creating a local install of a package rather than a global one (pip’s default).

If you have multiple versions of Python installed on your machine then you can also specify which version of Python the virtual environment should use. This gives you the ability to test your code against multiple versions of Python just by creating multiple virtual environments.

There are already plenty of good posts out there on virtual environments so the aim of this blog post is not to rehash why you should use virtual environments (see here for a good introductory blog post) or to act as a quick setup guide (see the Hitchhiker’s Guide to Python post). It is a quick guide to using virtual environments within Visual Studio. If you have not used virtual environments before it is worth giving these posts a quick read before continuing.

As an aside, Python 3.3 introduced the venv module as an alternative for creating lightweight virtual environments (although the original wrapper script, pyvenv, has already been deprecated in Python 3.6). While this is the correct way going forward, Visual Studio uses the older virtualenv method, which is what I am concentrating on here.

Once you have created your Python solution expand it until you get to Python Environments. Right-click on this and choose Add Virtual Environment… from the menu list.

 

You can change the name of the folder (defaults to env) which is also used as the name of the virtual environment and the version of Python to use. Click Create to finish and you are ready to go (easy wasn’t it). If you expand the Python Environments node you should see the virtual environment appear.

In the background this has created a folder (the virtual environment) in your working directory with the name given. In case you are unsure, your working directory is the location of the solution, which defaults to X:\Users\me\Documents\VS20xx\Projects\Project Name\Solution Name\ (tip: change the default location). This could have been done manually by changing into the working directory and entering the following command (where X:\Python_xx is the installation directory for the version of Python you want to use and env is the name of the folder / virtual environment; if you just want your default version of Python then just pass the name of the folder).

virtualenv -p X:\Python_xx\python.exe env
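For comparison, the venv module mentioned in the aside above can create an environment from Python itself. This is a minimal sketch of that alternative, not what Visual Studio does behind the scenes. It assumes Python 3.3 or later and builds the environment in a temporary directory (my choice, so it is safe to run anywhere) rather than in your solution folder.

```python
import os
import tempfile
import venv

# Create the environment in a throwaway directory so the sketch is safe to run
working_dir = tempfile.mkdtemp()
env_dir = os.path.join(working_dir, "env")

# with_pip=False keeps creation quick; pass with_pip=True to bootstrap pip as well
venv.create(env_dir, with_pip=False)

# Every environment created by venv has a pyvenv.cfg file at its root
print(os.path.exists(os.path.join(env_dir, "pyvenv.cfg")))
```

The command line equivalent is python -m venv env, run from the working directory with whichever Python version you want the environment to use.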

To install a module into the virtual environment from Visual Studio just right-click on the virtual environment and select Install Python Package… from the menu or if you have a requirements.txt file you can select Install from requirements.txt. If you expand the virtual environment node you will see the modules installed. Once you have all the modules installed you can generate the requirements.txt file from the same menu and it will add the requirements.txt to your project for portability.

What if you want to use this virtual environment from the command line? Inside of the virtual environment is a Scripts directory with a script to make the necessary changes; the trick is to run the correct script from the working directory. The script to run depends upon whether you are running inside a PowerShell console (my recommendation) or from a command prompt. Change into the working directory and type in the following command (where env is the virtual environment folder)

PowerShell: .\env\Scripts\activate.ps1
Command prompt: env\Scripts\activate.bat

The prompt will change to the name of the virtual environment to show activation has succeeded. You can do everything you would normally do from the command line but now you are running against the virtual environment. To confirm the modules installed are only those you have specified type in ‘pip list’, and to confirm the version of Python is the one you specified type in ‘python -V’ (capital V; a lower-case -v turns on verbose output instead).
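If you want a script to check for itself whether it is running against a virtual environment, the interpreter exposes enough information to tell. The sketch below is one way of doing it; in_virtual_environment is a name I have made up. It relies on two facts: older virtualenv releases set sys.real_prefix, while venv and newer virtualenv releases leave sys.base_prefix pointing at the original installation.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases add a real_prefix attribute to sys
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv keep the original install in base_prefix;
    # fall back to prefix on interpreters that have no base_prefix at all
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print(sys.executable)            # path of the interpreter actually in use
print(in_virtual_environment())
```

Run it once before and once after activating the environment to see the difference.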

Update: It appears I’m not the only one to be looking at virtual environments today, see this article if you want a similar introduction but from the command prompt only.

Pip requirements

You should be used to installing new modules using pip. You have probably used a requirements.txt file to install multiple modules together with the following command:

pip install -r requirements.txt

But what if you need more flexibility? Why would you ever need more flexibility? If you look at my introduction to YAML post, the code supports either the yaml or ruamel.yaml module. There is no way to add conditional logic to a requirements.txt file so a different strategy is needed.

pip is just a module so it can be imported like any other module. This not only gives you access to the main method, which takes an argument list just as if you were calling pip from the command line, but also to its various methods and classes. One of these is the WorkingSet class which creates a collection of the installed modules (or active distributions as the documentation calls them). Using this we can create the conditional logic needed to ensure one of the yaml modules is installed as below.

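import pip
# compare distribution names, not import names: the yaml module is distributed as PyYAML
package_names = [ ws.project_name.lower() for ws in pip._vendor.pkg_resources.WorkingSet() ]
if ('pyyaml' not in package_names) and ('ruamel.yaml' not in package_names):
    # note: pip.main is only available in pip versions before 10
    pip.main(['install','ruamel.yaml'])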

Each entry in the WorkingSet has a few other useful properties and methods apart from project_name. The location property returns the path to where the module is installed and the version property naturally returns the version installed. The requires method returns a list of dependencies.
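On newer versions of Python and pip the same check can be written without importing pip at all. What follows is a sketch of that alternative under stated assumptions, not the approach used above: it needs Python 3.8 or later for importlib.metadata, and installed_names and install_command_if_missing are hypothetical helper names. It only builds the pip command rather than running it, so it is safe to try.

```python
import sys
from importlib import metadata  # standard library from Python 3.8


def installed_names():
    """Return the lower-cased name of every installed distribution."""
    names = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata has no name
            names.add(name.lower())
    return names


def install_command_if_missing(candidates, fallback, installed=None):
    """Return the pip command to run, or None if a candidate is already installed."""
    if installed is None:
        installed = installed_names()
    if any(name.lower() in installed for name in candidates):
        return None
    # Running pip through the current interpreter targets the active environment
    return [sys.executable, "-m", "pip", "install", fallback]


command = install_command_if_missing(["PyYAML", "ruamel.yaml"], "ruamel.yaml")
print(command)
```

If command is not None, passing it to subprocess.check_call would perform the install.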

As with most modules, if you’re interested in finding out more dig around in the source code.

Learning the Python library

My motivation for starting this blog was to make a note of useful Python code I’ve discovered or written so I could easily find it again. It also has the added benefit of forcing me to understand the libraries I’m using so I can explain them and write concise examples.

Another person who used a blog to start something is Doug Hellmann. He created the Python Module of the Week blog to get in the habit of writing something on a regular basis. The blog focuses on Python’s standard library and includes some very good examples.

Unzip a file in memory

The zipfile module is fairly flexible but there are occasions when you cannot pass it a filename (as a string) or a file-like object; for example, the open method on AWS S3 buckets does not return a suitable object. What do you do if you can read the zip file into memory? Writing it to disk just to read it back in again seems a waste.

Python, as is often the case, already has a module to solve this problem, in this case StringIO. This allows you to treat a string, or in this case the entire file in memory, as if it were a file. (In Python 3 the equivalent for binary data is io.BytesIO.)

This allows us to write our unzip procedure compactly as

# module imports and S3 connection omitted for brevity (and beyond scope)
s3file = s3connection.get_bucket(bucketname).get_key(filename)
if s3file:
    # read the whole zip file into memory
    s3file.open()
    zipdata = s3file.read()
    s3file.close()
    # wrap the data in StringIO so ZipFile can treat it as a file
    archive = zipfile.ZipFile(StringIO.StringIO(zipdata))
    archive.extractall()
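The same idea can be tried without an S3 bucket. This is a minimal, self-contained sketch for Python 3, where io.BytesIO takes the place of StringIO for binary data. The archive is built in memory purely to stand in for the bytes you would read from S3, and the file name hello.txt is made up for the example.

```python
import io
import zipfile

# Build a small zip archive in memory to stand in for the bytes read from S3
source = io.BytesIO()
with zipfile.ZipFile(source, "w") as archive:
    archive.writestr("hello.txt", "hello from memory")
zip_bytes = source.getvalue()

# Wrap the raw bytes in BytesIO so ZipFile can treat them as a file
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    names = archive.namelist()
    content = archive.read("hello.txt").decode("utf-8")

print(names, content)
```

Nothing touches the disk here; calling extractall on the second ZipFile would write the files out as in the S3 version.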

Windows binaries

One annoyance of using Python in a Windows environment is finding a really useful library only to find out you need to compile everything from source. Building from source is not a strong point of Windows.

A good resource is the Unofficial Windows Binaries for Python Extension Packages maintained by Christoph Gohlke. Chris has done the hard work compiling the libraries and creating an installer. All you have to do is run the correct version (there is often a version for each Python version and 32- and 64-bit versions).

I came across this page after looking at lxml.html. Once you have downloaded and installed the correct library you’ll be able to run the following script, which displays all the links on a page:

import lxml.html
htmlpage = lxml.html.parse("https://quackajack.wordpress.com")
# iterate over the anchor elements only
for item in htmlpage.iter("a"):
    # written so it prints correctly on both Python 2 and Python 3
    print("%s=%s" % (item.text, item.values()))

Modules

At the risk of simply repeating the documentation on modules, to run a method from a different Python file you can use either of the following code snippets. The first gives you access to all the methods and variables inside the file (note you don’t need the file extension) from within its own namespace. The second gives you just the method you requested, inside your own namespace.

# Want access to all of the file's methods
import file
file.method_name()

# Just want a single method without anything else
from file import method_name
method_name()

Both statements allow an optional as clause to change the name. In the first case this changes the namespace (more on this later). In the second it changes the name used to refer to the method or variable. You will see from … import a lot in code on the Internet, although I tend to steer clear of it. There are a few things to be aware of:

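# Imports everything (with caveat) from file
# overwrites any object with the same name you already had
from file import *

# as renames only the name immediately before it; this does not give
# another_method two names, it tries to import a third object called another_name
from file import method_name,another_method as new_name,another_name

# new_name refers to another_method; method_name keeps its own name
# but another_method is not available under its original name
from file import method_name,another_method as new_name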
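To see which names actually end up in your namespace, it helps to try this against a real file. The sketch below writes a throwaway module to a temporary directory so it is self-contained; demo_file and the strings the methods return are made up for the example.

```python
import os
import sys
import tempfile

# Write a throwaway module so the example is self-contained
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_file.py"), "w") as handle:
    handle.write("def method_name():\n    return 'method_name called'\n\n")
    handle.write("def another_method():\n    return 'another_method called'\n")
sys.path.append(module_dir)

from demo_file import method_name, another_method as new_name

print(method_name())  # imported under its own name
print(new_name())     # the alias, bound to another_method
# Check whether the original name another_method was bound in this namespace
print("another_method" in dir())
```

The as clause only renames the name directly in front of it, which is why method_name is still callable here.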

However, how do you get access to a Python file that is not in the same directory as the calling Python script, or in the PYTHONPATH environment variable / registry value? There are two variations.

The search paths are held in a list object called sys.path and can be manipulated at runtime. Just add your required path to this list. Don’t replace the list or you’ll lose access to all your standard libraries. As an example, the following code allows you to import any Python file from either C:\PythonModules or the modules directory off the current working directory.

import sys,os
sys.path.append(r"C:\PythonModules")
# getcwd gets the current working directory; join adds the modules directory to it
sys.path.append(os.path.join(os.getcwd(),"modules"))

If the file you want to import is in a sub-directory of a directory that is already in your search path you can use the package notation. This works no matter how deep inside the directory structure the file is. So you could import a file from the sub-directory modules \ local \ custom with the following code. Notice that as gives you a shortcut rather than typing in the full namespace each time.

import modules.local.custom.file as mymod
mymod.mymethod()

The limitation of this method is that each directory needs an __init__.py file. In the above example there would have to be an __init__.py file in the modules, local and custom directories. This file can be empty or can contain initialisation code where required but if it does not exist, the directory will not be searched (Python 3.3 and later relax this rule through namespace packages).
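The directory layout is easier to follow when you can build it and import from it in one go. The sketch below does that under a temporary directory (my choice, standing in for your project folder), using the same modules, local and custom names as above. The contents of file.py are made up for the example.

```python
import os
import sys
import tempfile

# Build modules/local/custom under a temporary root (stand-in for your project)
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "modules", "local", "custom")
os.makedirs(package_dir)

# Put an empty __init__.py in modules, modules/local and modules/local/custom
current = root
for part in ("modules", "local", "custom"):
    current = os.path.join(current, part)
    open(os.path.join(current, "__init__.py"), "w").close()

# The file to import, with a single method in it
with open(os.path.join(package_dir, "file.py"), "w") as handle:
    handle.write("def mymethod():\n    return 'hello from custom'\n")

# Only the root goes on the search path; inserting at the front makes sure
# this copy is found before any other package that happens to be called modules
sys.path.insert(0, root)

import modules.local.custom.file as mymod
print(mymod.mymethod())
```

Only the top-level directory is added to sys.path; the package notation takes care of the rest of the path.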

Python 3 users should also note that importing a module creates a __pycache__ directory in the file’s location where it stores the compiled .pyc file, rather than storing it in the same directory as the file, which is what happened previously. So in Python 3 the above would create __pycache__ directories in modules, local and custom.