GitHub API

In the old days of centralised version control systems, the number of repositories tended to be small, as creating a new one usually involved convincing the admin to provide (usually expensive database) space for it. In these decentralised days, where anyone can create or clone a repo, the number of repos the admin is responsible for has ballooned to hundreds or sometimes even thousands.

At this scale, manually configuring repos is time consuming and error prone. Thankfully GitHub provides an API for management, and there is a Python wrapper around the API called PyGithub. I’ve had a few tasks to do in GitHub recently, so I’ve come up with a few automation scripts.

From the documentation, you first need to authenticate using either a username and password or, preferably, an access token. There is also a variation if you have an enterprise account with its own domain name; as I don’t have one I’ve ignored this option, but I wanted to write the scripts in such a way as to make adding it easy.

Once authenticated, you are most likely to want to limit the repos to those within your organisation using the get_organization method. That is already 4 options (ignoring enterprise accounts) just to list the repos. As I intend to have several scripts, it makes sense to standardise this with the following 3 functions:

githuboptparse: Create an optparse (parameter) parser to read in all the standard options. Returns the parser so that additional options can be added.

githublogin: Authenticate with GitHub. The different methods are provided by githuboptparse, so the options need to be passed into this function.

get_filtered_repos: Return all the repos in the organisation (if the organisation was specified at the command line), otherwise return all the repos the user has access to. The organisation is set by githuboptparse, so the options need to be passed into this function. You also need to be authenticated, so the return value of githublogin also needs to be passed in. Returns an iterator just like get_repos() does.
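To make the three functions concrete, here is a rough sketch of how they might look. It is not the original code: the option names (-u, -p, -t, -o), the argument order and the main() helper are my own assumptions, and it assumes a recent PyGithub release that provides the github.Auth classes.

```python
from optparse import OptionParser


def githuboptparse():
    """Build a parser holding the options every script shares."""
    parser = OptionParser()
    # Option names below are assumptions for illustration only.
    parser.add_option("-u", "--username", help="GitHub username")
    parser.add_option("-p", "--password", help="GitHub password")
    parser.add_option("-t", "--token", help="access token (preferred)")
    parser.add_option("-o", "--organisation",
                      help="limit repos to this organisation")
    return parser  # callers can add their own options before parsing


def githublogin(options):
    """Return an authenticated client, preferring the token if given."""
    # Imported here so the parser can be built without PyGithub installed.
    from github import Auth, Github

    if options.token:
        return Github(auth=Auth.Token(options.token))
    return Github(auth=Auth.Login(options.username, options.password))


def get_filtered_repos(g, options):
    """Return the organisation's repos, or all repos the user can access."""
    if options.organisation:
        return g.get_organization(options.organisation).get_repos()
    return g.get_user().get_repos()


def main(argv=None):
    """List repo names: the boilerplate every script starts from."""
    parser = githuboptparse()
    options, _args = parser.parse_args(argv)  # None means sys.argv[1:]
    g = githublogin(options)
    for repo in get_filtered_repos(g, options):
        print(repo.name)
```

Calling main(["-t", "<your token>", "-o", "<your organisation>"]) would then print the name of every repo in that organisation. Supporting an enterprise domain should only need one more option and a change inside githublogin.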

In order to share these between the scripts I created a file. With the standard options now abstracted, the boilerplate code to list all the repos reduces to just 6 lines: import, create the parser with githuboptparse, parse the options, authenticate with githublogin, iterate through the repos with get_filtered_repos, and print each repo name.

Most organisations will have a naming convention, so it is likely you are going to want to filter the repos further based on some criteria. This will involve modifying the iterator returned, which sounds tricky but is in fact fairly easy to achieve, as this contrived example shows:

names = ('title','first','middle','last','suffixes')
for n in names: # default iterator returns all elements
    print(n)

def no_suffixes(iteratee):
    # ensure no suffixes are returned by iterator
    for i in iteratee:
        if i != 'suffixes':
            yield i

for n in no_suffixes(names):
    print(n) # look, suffixes has gone

Using this principle I added the ability to include only repos that match a given regex and to exclude repos that match a given regex. I added the options to the githuboptparse function and then filtered the iterator returned from get_filtered_repos in the same way as above. Use to test out this filtering.
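The include/exclude filtering can be sketched with the same generator trick. This is only an illustration: filter_repos and its include and exclude parameters are names I have made up here, and the namedtuple stands in for real PyGithub repository objects, which also expose a name attribute.

```python
import re
from collections import namedtuple


def filter_repos(repos, include=None, exclude=None):
    """Yield repos whose name matches include and does not match exclude."""
    include_re = re.compile(include) if include else None
    exclude_re = re.compile(exclude) if exclude else None
    for repo in repos:
        if include_re and not include_re.search(repo.name):
            continue  # an include pattern was given and this name misses it
        if exclude_re and exclude_re.search(repo.name):
            continue  # this name matches the exclude pattern
        yield repo


# Stand-in for the repository objects returned by get_repos()
Repo = namedtuple("Repo", "name")
repos = [Repo("web-frontend"), Repo("web-legacy"), Repo("api-server")]

for repo in filter_repos(repos, include=r"^web-", exclude=r"legacy"):
    print(repo.name)  # prints only web-frontend
```

Because the filter is a generator wrapped around another iterator, it never has to load the full list of repos first, which matters once an organisation has thousands of them.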

Role-based Azure Certification

Over the summer, Microsoft announced they would be retiring some of their existing Azure exams (70-532/3/5) and replacing them with role-oriented exams, which test the functionality of the resources typically used by a role rather than all of the functionality those resources have to offer. For more information see this post on Build Azure.

Apart from the announcement coming just after I started studying for the old exams, there is too little information at present to know if this is a good step forwards or just a silly rebranding exercise like renaming VSTS to Azure DevOps. So I’ve just started studying for the Administrator Certification; I’ll let you know how I get on.

The natural end goal will be the DevOps Certification when it is released.