Month: June 2016

Running PowerShell

After a slow start, PowerShell cmdlets are now available to control most things in Windows. What’s more, cmdlets are sometimes the only programmatic way to control some software. This means that at some point you are likely to need to use cmdlets from Python. Until there is a native way of doing this with a Python module, the easiest way is with subprocess, as done previously for shell commands.

Whole PowerShell scripts, not just single commands, can be run with the powershell.exe command. Three command-line options are useful here: -NoLogo removes the banner at startup; -ExecutionPolicy, if set to Bypass, should run the script regardless of the current execution policy and without changing the settings; and -File specifies the script to run.

So just save the command(s) to execute into a temporary file, then call powershell.exe with the above options and the name of the temporary file. One oddity is that powershell.exe requires the file to have a .ps1 extension or it will refuse to run it. You can do this by passing suffix='.ps1' into NamedTemporaryFile.
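As a small sketch of just the temporary file step (the script text and the variable names here are illustrative assumptions, and nothing is executed by PowerShell), the file can be created, checked and removed like this:

```python
import os
import tempfile

# Illustrative script text; any PowerShell command(s) could go here
script = "Write-Host 'Hello'\n"

# mode="w" opens the file in text mode so a str can be written (the default is binary).
# delete=False keeps the file on disk after the with block so another process can open it.
with tempfile.NamedTemporaryFile(mode="w", suffix=".ps1", delete=False) as f:
    f.write(script)
script_path = f.name

print(script_path)  # a path ending in .ps1

# Read the file back to confirm what was saved, then clean up by hand
with open(script_path) as check:
    contents = check.read()
os.unlink(script_path)
```

Because delete=False is used, removing the file afterwards is the caller's job.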

Putting all this together gives the following:

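import os
import subprocess
import tempfile

def posh(command):
    """Run PowerShell command(s) from a temporary .ps1 file; return (exit code, output)."""
    commandline = ['powershell.exe', '-NoLogo', '-ExecutionPolicy', 'Bypass', '-File']
    # Text mode so a str can be written; delete=False so powershell.exe can open the file
    with tempfile.NamedTemporaryFile(mode='w', suffix='.ps1', delete=False) as f:
        f.write(command)
    commandline.append(f.name)
    try:
        # universal_newlines=True decodes the output to str
        result = subprocess.check_output(commandline, universal_newlines=True)
        exitcode = 0
    except subprocess.CalledProcessError as err:
        # On a non-zero exit code the output is carried on the exception
        result = err.output
        exitcode = err.returncode
    finally:
        # Remove the temporary script whatever happened
        os.unlink(f.name)
    return exitcode, result

retcode, retval = posh("Write-Host 'Hello Python from PowerShell'\nexit 1")
print("Exit code: %d\nReturned: %s" % (retcode, retval))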

Note that when the command exits with an error, check_output raises an exception, so the output has to be read from the error object.
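The same behaviour can be seen without PowerShell. The sketch below uses the Python interpreter itself as a stand-in child process (an assumption made only so the example runs on any platform); the child prints some text and then exits with a non-zero code.

```python
import subprocess
import sys

# Stand-in child process (assumption, used for portability):
# prints a line and then exits with code 3
child = [sys.executable, "-c", "print('partial output'); raise SystemExit(3)"]

try:
    output = subprocess.check_output(child, universal_newlines=True)
    exitcode = 0
except subprocess.CalledProcessError as err:
    # check_output raised, so the captured text is only available on the exception
    output = err.output
    exitcode = err.returncode

print(exitcode, output.strip())
```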

Regular expression substitutions

Following on from my introduction to regular expressions in Python, it is time to substitute the match with something more useful. This is done with the sub function, which takes at least three parameters: the regular expression, the replacement and the text to search. At its most basic you have the following:

re.sub("PERL","Python","I program in PERL!")

This is not very exciting; the replace method on a string does exactly the same. But this basic example hides two powerful features: the first parameter is a regular expression, and the second parameter can also be a function. Put this together with the example I used when introducing regular expressions and we have:

import re

def toupper(matchobj):
    # Return the whole matched text in uppercase
    return matchobj.group().upper()

text = "Welcome to Python's Regular Expressions. I hope you enjoy what you F1nD."
regex = r"([A-Z])\w+"
print(re.sub(regex, toupper, text))

This matches the same words as previously but this time changes them to uppercase. I’ve covered the regex in some detail, but the function parameter needs a bit more explanation. The function is passed the match object for each match, and whatever the function returns is what is substituted into the text.
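To see what the function actually receives, here is a small sketch that records each match before returning the replacement. The names seen and shout, and the sample sentence, are made up for the illustration.

```python
import re

seen = []

def shout(matchobj):
    # Record the whole match and the first group that re.sub hands to the callback
    seen.append((matchobj.group(), matchobj.group(1)))
    # Whatever is returned here replaces the matched text
    return matchobj.group().upper()

text = "I hope you enjoy what you F1nD in Python."
result = re.sub(r"([A-Z])\w+", shout, text)

print(seen)    # [('F1nD', 'F'), ('Python', 'P')]
print(result)  # I hope you enjoy what you F1ND in PYTHON.
```

The function is called once per match, in the order the matches appear in the text.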

In the example above I’ve used the group method with no parameters to return the entire string that was matched. I simply turned this to uppercase, so you can see something happening, before returning it to re.sub. It is not much of a stretch to go from this to basic template functionality.

I am going to look through the template for any substitution variables enclosed in double braces, {{ and }}, and replace each one with the result of a function. My first decision is how to get the name out of the matched string. I know it is two characters in from both ends, so I could use matchobj.group()[2:-2], but this would hard-code the pattern. Instead I’ll use the grouping option of regular expressions: enclose the variable name in parentheses and get it using matchobj.group(1). This way, if I want to change the double braces to something else I can just change the regex pattern.
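A minimal sketch of the two ways of getting at the name. The exact pattern, \{\{(\w+)\}\}, is an assumption (the braces are escaped and the name is taken to be one or more word characters), and the <% %> delimiters are only there to show what changes.

```python
import re

# Assumed pattern: literal {{, a captured run of word characters, literal }}
braces = r"\{\{(\w+)\}\}"
match = re.search(braces, "Hello {{name}}!")

whole = match.group()              # '{{name}}'
by_slicing = match.group()[2:-2]   # 'name', but tied to two-character delimiters
by_group = match.group(1)          # 'name', whatever the delimiters are

# Switching delimiters only means changing the pattern; group(1) still works
other = re.search(r"<%(\w+)%>", "Hello <%name%>!")
other_name = other.group(1)        # 'name'

print(whole, by_slicing, by_group, other_name)
```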

Then I need a way to map the variable name to the output I want. For this example I will just create a dictionary with the variable names as the keys and the functions to call as the values. This way, if the variable name exists in the dictionary, I can simply return the result of the function.

To demonstrate, I’ve created this example. I’ve included the template as a variable to make the example self-contained. The text it contains should make what is happening self-explanatory. The only other thing to mention is that I change the matched string to lowercase to make the substitution case insensitive.
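A minimal sketch along the lines described above might look like the following. The function names, the dictionary contents and the template text are all hypothetical, and leaving unknown variables untouched is an assumption about how missing names should be handled.

```python
import re

# Hypothetical functions that produce the replacement text
def greeting():
    return "Hello"

def language():
    return "Python"

# Variable names (lowercase) mapped to the function to call
VARIABLES = {"greeting": greeting, "language": language}

# Assumed pattern: the variable name is captured as group 1
PATTERN = r"\{\{(\w+)\}\}"

def substitute(matchobj):
    # Lowercase the name so {{Greeting}} and {{GREETING}} behave the same
    name = matchobj.group(1).lower()
    func = VARIABLES.get(name)
    if func is None:
        # Assumption: unknown variables are left in the text unchanged
        return matchobj.group()
    return func()

def render(template):
    return re.sub(PATTERN, substitute, template)

template = "{{Greeting}} from {{LANGUAGE}}! {{unknown}} is left alone."
print(render(template))  # Hello from Python! {{unknown}} is left alone.
```

Changing the delimiters only requires editing PATTERN, since the lookup relies on group 1 rather than on slicing the match.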

Regular expressions

Regular expressions are a powerful search language for when you can’t rely on the data being at a set position, in a given structure or containing a marker you can look for. At their most basic, entering a word will search for that word, but you get the same with the find method. The power of regular expressions comes from their special characters. You may be used to the wildcards ? and * (_ and % in SQL); the same can be achieved with . and .* in regular expressions. The dot matches any character and an asterisk matches the previous character zero or more times, hence .* matches anything. These can be built up into very complex matches.
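As a rough illustration of these basics (the sample text and variable names are chosen just for this sketch):

```python
import re

text = "Welcome to Python's Regular Expressions."

# A plain word behaves like str.find: both report where the word starts
position_find = text.find("Python")
position_re = re.search(r"Python", text).start()

# The dot stands in for any single character, like ? in a file wildcard
one_char = re.search(r"P.thon", text).group()

# .* stands in for any run of characters, like * in a file wildcard
anything = re.search(r"to .* Expressions", text).group()

print(position_find, position_re)  # 11 11
print(one_char)                    # Python
print(anything)                    # to Python's Regular Expressions
```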

Python supports regular expressions with the re module. The Python docs provide a brief summary of regular expressions and the methods Python provides, but they do not attempt to teach regular expressions and neither does this short blog post. Entire books have been dedicated to the subject, but there are plenty of decent tutorials on the web to get you started. If you have a particular favourite you want to share, put it in the comments.

As explained in the docs, unless you want to type in a lot of backslashes, use the raw string format, r'' or r"", when entering regular expressions in Python.
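A short sketch of why this matters, using \b (a word boundary in a regular expression, but a backspace character in an ordinary Python string); the sample text is only for illustration:

```python
import re

text = "Welcome to Python's Regular Expressions."

# In an ordinary string "\b" is one character (backspace);
# in a raw string it stays as a backslash followed by b
print(len("\b"), len(r"\b"))  # 1 2

# These two spellings give the regex engine the same pattern
raw = re.findall(r"\bto\b", text)
escaped = re.findall("\\bto\\b", text)

# Without the r prefix or doubled backslashes the pattern contains
# backspace characters, so nothing in the text matches
unescaped = re.findall("\bto\b", text)

print(raw, escaped, unescaped)  # ['to'] ['to'] []
```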

If you want to test whether your regular expression works, try regexr.com: put the regular expression in the top box and the sample text you want it to search through in the bottom box. It uses the Perl syntax, in which a regular expression starts with a slash (/) and continues to the last slash. You can then specify options (expression flags) after the last slash to control how the search works. So /Python/ is the regular expression to search for the word Python. At the time of writing, regexr.com defaulted to ([A-Z])\w+ which basically matches all words that begin with a capital letter (more later), storing the capital letter in question in group 1.
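Python has no slash syntax, so the flags have to be expressed another way. The mapping below is a rough sketch rather than an exact equivalence: it assumes the i flag corresponds to re.IGNORECASE and that the g (global) flag corresponds to asking for every match with finditer.

```python
import re

text = "Welcome to Python's Regular Expressions. I hope you enjoy what you F1nD."

# /Python/ : a plain search for the word
plain = re.search(r"Python", text)

# /python/i : case-insensitive search, here via the re.IGNORECASE flag
ignore_case = re.search(r"python", text, re.IGNORECASE)

# /([A-Z])\w+/g : every match rather than just the first
all_matches = [m.group() for m in re.finditer(r"([A-Z])\w+", text)]

print(plain.group(), ignore_case.group())
print(all_matches)
```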

You can get a similar result to regexr.com in Python with the following code.

import re
text="Welcome to Python's Regular Expressions. I hope you enjoy what you F1nD."
regex=r"([A-Z])\w+"
for matchobj in re.finditer(regex,text):
    print('Matched %s with groups [%r]' % (matchobj.group(), matchobj.groups()))

I’ve skipped over what a match object is and the methods it provides, but this should match Welcome, Python, Regular, Expressions and F1nD. Not quite what you were expecting when I said it matches all words that begin with a capital? The \w class doesn’t include any punctuation, so it stops at the apostrophe in Python's. But \w does include digits and the underscore, which is why it matches F1nD. Finally, because + requires one or more \w characters after the capital, it doesn’t match the capital I at the start of the second sentence.
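The behaviour of \w and + is easy to check in isolation. The sample strings in this sketch are made up for the purpose:

```python
import re

# \w covers letters, digits and the underscore, but not the apostrophe
words = re.findall(r"\w+", "Python's F1nD snake_case")
print(words)  # ['Python', 's', 'F1nD', 'snake_case']

# + needs at least one \w character after the capital, so a lone I is skipped
lone_capital = re.findall(r"[A-Z]\w+", "I am")
print(lone_capital)  # []
```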

If you wanted to include words with apostrophes and single letters, you might be tempted to try ([A-Z])[a-z']* but this matches any single capital letter, so it will also include F and D at the end (try it). As the only single-letter words are A and I, the pattern ([AI])|([A-Z])[a-z']+ is a working solution for this sample text.
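The three patterns can be compared side by side with a small helper (matched_words is a hypothetical name used only here). The last two lines go slightly beyond the text above: Python tries alternatives from left to right, so with ([AI]) first a word such as Also would match only its A. Putting the longer alternative first and adding \b around the single letter is one possible adjustment, offered as a sketch rather than a definitive fix.

```python
import re

text = "Welcome to Python's Regular Expressions. I hope you enjoy what you F1nD."

def matched_words(pattern, text):
    # Return the full text of every match, in order
    return [m.group() for m in re.finditer(pattern, text)]

first = matched_words(r"([A-Z])\w+", text)
tempting = matched_words(r"([A-Z])[a-z']*", text)
working = matched_words(r"([AI])|([A-Z])[a-z']+", text)

print(first)     # ['Welcome', 'Python', 'Regular', 'Expressions', 'F1nD']
print(tempting)  # ['Welcome', "Python's", 'Regular', 'Expressions', 'I', 'F', 'D']
print(working)   # ['Welcome', "Python's", 'Regular', 'Expressions', 'I']

# Caveat: the leftmost alternative wins, so a word starting with A or I is cut short
caveat = matched_words(r"([AI])|([A-Z])[a-z']+", "Also")
# One possible adjustment: try the longer alternative first
reordered = matched_words(r"([A-Z])[a-z']+|\b([AI])\b", "Also I")

print(caveat)     # ['A']
print(reordered)  # ['Also', 'I']
```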

Hopefully this shows the power and the pitfalls of regular expressions and why getting matches to work can sometimes be harder than it first seems.