Diff release definition environments in VSTS

Recently I had to update a release definition in VSTS and found that the tasks for releasing to the various environments were slightly different, often in options that were collapsed by default. This made checking them manually both difficult and error-prone. Assuming this was a common problem, I looked for a tool to do the comparison but could not find one.

A release definition is just a JSON file, and if you export one you can see that the environment settings (tasks, variables, etc.) are stored in one big array called, naturally enough, environments. So you could just manually extract the ones you want and use a diff tool.

I thought I could do better. The VSTS API lets you get the release definition, and Python is much better at automating the manipulation of JSON documents. Needless to say it wasn't quite as easy as it sounded, but eventually I got my VSTS diff script working. I've uploaded it to GitHub in case anyone has a similar problem and finds the script useful.
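A minimal sketch of the manual-extraction idea, for anyone who just wants to feed two environments into a diff tool (the helper name is my own; my actual script on GitHub does rather more):

```python
import json

def dump_environments(definition_path, names):
    """Write each named environment of an exported release definition
    to its own pretty-printed JSON file, ready for any diff tool."""
    with open(definition_path) as f:
        definition = json.load(f)
    # The environments array holds one object per environment
    envs = {e["name"]: e for e in definition["environments"]}
    for name in names:
        with open("{}.json".format(name), "w") as f:
            # sort_keys keeps both files in the same order for diffing
            json.dump(envs[name], f, indent=2, sort_keys=True)
```

Pointing your favourite diff tool at `Dev.json` and `Prod.json` then shows exactly which collapsed options differ.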


Azure resource group deployments

Azure comes with some strange quotas. One of them is a limit on Resource Group Deployments. If you go to a resource group, you can see the number of succeeded deployments. The max allowed is 800. Once you reach it you will see an error message similar to the following:

Creating the deployment 'foobar' would exceed the quota of '800'. The current deployment count is '800', please delete some deployments before creating a new one.

What is also strange is that there is no way to fix this from the portal. Instead you have to rely on either the CLI or PowerShell. The following PowerShell script is a quick fix that removes all deployments older than 2 months. Just replace myRG with your resource group name.

$keepafter = (Get-Date).AddMonths(-2)
Get-AzureRmResourceGroupDeployment -ResourceGroupName myRG |
    Where {$_.Timestamp -lt $keepafter} |
    Remove-AzureRmResourceGroupDeployment

Be warned: this can take a long time (as in all day)! Change the first line to adjust the maximum age.

Dates and time

Python has a reasonably good standard library module for handling dates and times, but it can be a little confusing to a beginner, probably because the first code they encounter will look something like the below with very little explanation.

import datetime
print("Running on %s" % (datetime.date.today()))
myDate = datetime.datetime(2018,6,18,16,13,0)

Why is it datetime.datetime? There is a simple explanation, but one I've rarely seen included.

All of Python's classes for handling dates and times live in a module called datetime (naturally enough). This module contains a class for dates with no time element (datetime.date), a class for times (datetime.time) and a class for when you need both, called unsurprisingly (but a little unfortunately) datetime.datetime, hence the code above.

It also contains two more classes: datetime.timedelta, which is the interval between two dates / datetimes (the result of subtracting one datetime from another), and tzinfo, short for time zone info, which is used to handle time zones in the time and datetime classes.

To add to the confusion, if you want to get the date / time / datetime as of now, there is no standard across the three: datetime uses the now() method, date uses the today() method and time does not have one at all! You have to use datetime and extract the time part, as below:

import datetime
# Get the date and time as of now as a datetime
print(datetime.datetime.now())
# Get the date as of now (today)
print(datetime.date.today())
# Get the time as of now - have to use datetime!
print(datetime.datetime.now().time())

The confusion does not end there. If you want to format the date / time / datetime in a particular way you can use the strftime() method – probably short for string format time. The same method exists in all three classes. Why it is called time and not date or something more generic is beyond me; datetime.date.strftime() makes little sense.

If you are reading in strings and need them parsed into a date / time / datetime, there is the strptime() method – probably short for string parse time – but this only exists in the datetime class. So you have to use a similar trick to the one above: create a datetime and extract just the date or time part.
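Putting the two quirks together:

```python
import datetime

# Parse a date-only string: strptime only exists on datetime,
# so parse to a datetime first and extract the date part
d = datetime.datetime.strptime("2018-06-18", "%Y-%m-%d").date()
print(d)    # 2018-06-18

# The same trick for a time-only string
t = datetime.datetime.strptime("16:13:00", "%H:%M:%S").time()
print(t)    # 16:13:00

# strftime exists on all three classes, odd name and all
print(d.strftime("%d %B %Y"))    # 18 June 2018
```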

Once you get past the quirks above, you should find the datetime module straightforward to use. However, if you do find yourself needing a library with more power, try the dateutil library. It can be installed with the usual pip install python-dateutil command.

PowerShell parameters

There is no question that PowerShell is a big improvement over the old DOS shell scripting. If you want to do anything more complex than piping output from one cmdlet to another, you are probably going to end up using functions or passing parameters into your scripts. While you can add parameters to a function in the usual way by including them in parentheses after the function name, if you want your function or script to behave like a cmdlet you are going to have to learn a new way – advanced functions.

The standard way to work with parameters is to include a param() block. This does little more than move the parameters from the parentheses to further down the script, but it does make splitting parameters over multiple lines more elegant – which you are probably going to do.

PowerShell automatically uses the name of the variable as the parameter identifier, so param($Query) will allow you to call the function with -Query "To seek the Holy Grail".

The advanced features are activated by including [Parameter(…)] before the parameter name. The Parameter attribute accepts multiple advanced features separated by commas. For example, if you want to make sure the parameter is supplied, add Mandatory = $true.

After the Parameter attribute, you can optionally add validation in the format [Type(condition)]. A complete list can be found in the link above. For example, to ensure you don't get an empty parameter, include [ValidateNotNullOrEmpty()] before the parameter name.

There are also common parameters that most cmdlets accept. Rather than declare these each time, just include [CmdletBinding()] before the param() block. The function will then automatically accept the following common parameters:

  • Verbose
  • Debug
  • ErrorAction
  • WarningAction
  • ErrorVariable
  • WarningVariable
  • OutVariable
  • OutBuffer
  • PipelineVariable

These common parameters also propagate down through cmdlets and other functions that your function calls. So for instance, if you pass -Verbose into your function and your function uses Invoke-RestMethod, this cmdlet will be called with -Verbose automatically and you will see the details of the HTTP request and response.

You can also add help to your function or script by including a formatted <# #> block directly after your function definition, or at the very start of your script, similar to a docstring in Python. For details about how to format the comment block see this Microsoft blog post.
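Putting the pieces above together, a minimal sketch of an advanced function (the function name and parameter are purely for illustration):

```powershell
function Get-Answer {
    <#
    .SYNOPSIS
    Example advanced function demonstrating the features above.
    .PARAMETER Query
    The question to answer. Mandatory and must not be empty.
    #>
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)]
        [ValidateNotNullOrEmpty()]
        [string]$Query
    )
    Write-Verbose "Answering: $Query"
    return 42
}

# Accepts the common parameters thanks to CmdletBinding
Get-Answer -Query "To seek the Holy Grail" -Verbose
```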

A working example is always better than dry text and I’ll be uploading an example soon.

pip + virtualenv = pipenv

I have long argued that one of the reasons Node took off so quickly was the inclusion of npm for package management. For all its faults, it allows anyone to quickly get up and working with a project and to build powerful applications by utilising other libraries. What’s more, by being local first, it avoids some of the dependency problems caused by different applications requiring different versions of the same library (at the expense of disk space and a little RAM).

Python did not initially have a package manager, but pip has evolved into the de facto standard and is now included with the installer. All packages are installed globally on the machine; this makes sense given Python's history but is not ideal. To have local packages just for your app, you needed virtualenv or a similar tool.

The obvious next step to close the gap with npm would be a single tool that sets up a local environment and installs the modules into it. And that is exactly what pipenv does. It was created by Kenneth Reitz (the author of the requests module, which I've used in several posts) and has quickly gained popularity in the last year.

Lacey has done a good write-up of the history that led to pipenv in this blog post and there is a full guide available here, but it is just as simple to show you with an example. First install pipenv with pip install pipenv

Then you can create a project with a virtual environment and install the requests module with the following:

mkdir pipenvproject
cd pipenvproject
pipenv install requests

That’s it (although personally I would have liked to see a pipenv init command). To prove there is a virtual environment, use the shell option to switch to it (no more remembering the path to the batch file). Try the following:

pipenv shell
pip list
exit
pip list

The first pip list should just show requests and its dependencies. After exiting out of the virtual environment shell, the second pip list will list all of the packages installed on your system.

Log Parser GUI

Log Parser is an old but still incredibly useful utility which I covered way back in this blog post. If you are fighting log files then I still recommend you give the post a read.

Since that post, v2 of a GUI for Log Parser has been released. For those who are more accustomed to using SSMS or similar to write queries, this may be more to your taste. It can be downloaded from here. See this Microsoft blog post for a summary of what has been added in v2.

There is already a decent tutorial from Lizard Labs on using the GUI, but it is not very clear about where the options are, so refer to the image below if you struggle to get started.

LogParserGUI

A little aside for Windows 8 / Server 2012 and above when accessing the event log files. Don’t try to open the event logs directory (%SystemRoot%\system32\winevt\logs by default) directly. You will probably be unable to open it because the folder does not have All Application Packages in its security permissions.

There is no need to do this. Log Parser already knows how to access event logs; just use the event log name – Application, Security or System – as shown in the tutorial and example above.

Mr Popularity

David Robinson has done some good analysis of searches on Stack Overflow showing the popularity of languages. It shows that Python is on track to be the most searched-for programming language. He followed this up with further analysis showing that the increase appears to be coming from data science and machine learning. This follows on from IEEE Spectrum putting Python as the most popular programming language for 2017 among developers.

Apart from giving me an excuse to put lots of links in the first paragraph, what does this show? Probably little more than that Python is flexible, which we knew already (it’s why we’ve been using it). You can learn to program with it, produce a web service with it, do data analysis with it, as well as automate all of your system administration jobs (which is what this blog mostly deals with).

Rather than starting a flame war over which language is the most popular or useful, a better takeaway is that Python really is a first-class language. It is not just an alternative to Perl; you can use it as your go-to language for everything and only change if a reason to do so appears.

AsciiDoc and DocBook

I’ve covered Markdown (.md) in other posts, but another text format gaining popularity is AsciiDoc, which is a plain text interpretation of DocBook XML. These files generally use the verbose .asciidoc file extension, but you can sometimes see them using the text file (.txt) extension.

The main AsciiDoc program is written in Python, but there is no pip install method. Instead you need to get it from GitHub directly. It was also written for Python v2, although a fork exists for Python v3. For Python v2, clone the repository using a Git client from GitHub (Python v2), or alternatively use the Download ZIP option from the Clone or download button and unpack the zip file. Once downloaded, you can build the documentation for the AsciiDoc program with the following command:

python asciidoc.py doc\asciidoc.txt

For Python v3, either clone or download and unzip from the Python v3 GitHub site. The programs have all gained a 3 suffix to their names, so the equivalent build command for the AsciiDoc documentation is:

python asciidoc3.py doc\asciidoc.txt

This also acts as a way to test the basic setup. If all goes well you should see no error messages and it should create a doc\asciidoc.html file which you can open with any web browser.

To get from AsciiDoc to most other formats, the program converts the text file to the DocBook XML format and then acts as a wrapper around DocBook to create the necessary file. DocBook is not aimed at Windows users, so getting it installed is not straightforward. Thankfully, combining this blog post and this SO post gives us the installation steps below.

First go to DocBook’s SourceForge site and download the zip file. Unpack this to the C: drive (or wherever you want it) and optionally rename the directory to docbook-xsl, that is, remove the version number from the folder name. Add this folder to your path environment variable.

Now you need libxml2, libxslt, libxmlsec, zlib and iconv. Windows builds of all of these can be obtained from ftp://ftp.zlatkovic.com/libxml/. Download the latest version of the zip file for each library and extract the contents of the bin directory of each into the docbook-xsl directory created above.

To avoid calling several programs to create the other formats, a2x.py is also provided. This is a wrapper around the various programs that need to be called. To create an epub ebook of the documentation above, the command becomes:

python a2x.py -L -f epub -v doc\asciidoc.txt

or for the v3 fork:

python a2x3.py -L -f epub -v doc\asciidoc.txt

Testing websites (headless)

I covered using Selenium to test websites in previous posts (starting with this one, which covers installation and a first test). Using a full browser ensures real-world testing and can be done interactively. However, a full web browser comes with a performance penalty and may make CD integration tricky.

Partly for these reasons, PhantomJS was developed. It is based on WebKit and, as well as offering a JavaScript API, it can be controlled using Selenium.

Installation is easy. Download the Windows zip file from the downloads page. There are no dependencies; you just need the phantomjs.exe file from the bin folder. Move this to a directory that is on your path, as you did for the other webdrivers. You can then use it as opposed to one of the other drivers simply by creating the PhantomJS webdriver with

driver = webdriver.PhantomJS()

All of the Selenium examples will work with this one change.

Async requests

The advantages of asynchronous network requests (or any high latency requests) over synchronous is easy to explain with a little example. However as I’ll explain at the end, my little example hit a problem.

Doing the requests in a synchronous manner, one after the other, means you make the first one and wait for its response before moving on to the next, and so on until all the requests are completed. The time taken is the sum of all the requests. If you did 20 requests where the longest took 1 second and the average took half a second, doing this synchronously would take 10 seconds.

Doing the same in an asynchronous manner, you create a listener or a callback to handle the responses, then start the first request, followed immediately by the second, and so on without waiting for a response (as you have already created another piece of code to handle the responses). This should only take as long as the longest response, plus a little overhead, or about 1 second in our example above.

So why don’t we do everything asynchronously? Well, JavaScript does; it’s one of the reasons Node became popular. However, if you have done any serious amount of coding in JavaScript you will know the added complexity this brings: because you do not wait for a response, you either have to get the callback to update you, which gets tricky once callbacks get nested, or have some sort of polling mechanism to wait for all the responses before continuing. There are solutions to these issues, promises in JavaScript being one, and the advantages are such that the async and await keywords were added from Python 3.5 onwards (similar to other languages, see PEP 492).

However, if you just have a batch of HTTP requests you want to run asynchronously, Kenneth Reitz, who wrote the excellent requests module I used in the Posting to Slack blog entry, has released grequests; basically a monkey patch for requests that uses the gevent module to make asynchronous calls.

You use the same get, post, put, head and delete request functions and get back the same response object. The difference is that the asynchronous way has an extra line to set up a tuple of requests before calling grequests.map (or imap) to poll until all the requests are complete, whereas the synchronous way just maps the get calls directly onto the URLs. I created a little program to demonstrate this and uploaded it to BitBucket. It makes 10 GET requests asynchronously first and then synchronously and displays the timings. Putting the async batch first should eliminate any possibility of cached responses making the later requests faster and skewing the results.
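The timing argument itself can be demonstrated without grequests at all. The sketch below uses the standard library's concurrent.futures (threads, not gevent, so this is not how grequests works under the hood) and simulates each request with a short sleep:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(seconds):
    """Stand-in for an HTTP request: just sleeps for its latency."""
    time.sleep(seconds)
    return seconds

latencies = [0.2] * 5   # five "requests" of 0.2s each

# Synchronous: total time is the sum of the latencies (~1s)
start = time.time()
sync_results = [fake_request(s) for s in latencies]
sync_elapsed = time.time() - start

# Asynchronous: total time is roughly the longest latency (~0.2s)
start = time.time()
with ThreadPoolExecutor(max_workers=len(latencies)) as pool:
    async_results = list(pool.map(fake_request, latencies))
async_elapsed = time.time() - start

print("sync %.2fs, async %.2fs" % (sync_elapsed, async_elapsed))
```

Both runs return the same results; only the elapsed time differs.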

So to the problem. The asynchronous calls on my Windows machine did not end up faster; if anything the average was slower. Confused, I tested the same code on a Linux box, which produced the expected results, with the async batch completing in a quarter of the time. At a guess, there may be a problem either with gevent or with the greenlet module grequests depends upon for performance. I will do some more investigation and let you know.