Tracking WAN Uptime to My ASUS/Merlin Router – Lambda Plot

Posted On December 26, 2017

The introduction and architecture to the WAN uptime system that I have added for my home ASUS/Merlin router has been described in a previous post. This post will focus on the Lambda function that plots the graph of the outages. A number of issues with this functionality were worked out in this post about the Lambda function that plots the LAN uptime. The code for this lambda function can be found at the bottom of this post.

The DynamoDB WAN table keeps a moving 31 days of data. I didn’t feel the need to mess with the partial day at the beginning and end of the window so I decided to use 29 days in the middle. Yes, I could go 30 days, but I wanted to be sure about edge cases.

The function starts by initializing variables and the data frame. Each row of the dataframe will represent the status in a 5 minute window. The index will be the unchanged integer values and the two columns are:

DayHourId – This started out as <day><hour> but since the hour range is from 0 to 23, values progress from 623 to 700 and that leaves a gap in the plot. So I converted the hours to hour * (100/24) which spreads the 24 hours out across 100 values evenly. This value is later used to sum outages by.
Wan – The WAN outages. Use a value of 1 for outage during this 5 minute window or 0 for no outage. Initialized to a value of 1.

Now for each 29 days, the code is going to:

Add to the xlabel the current day’s date in the format of ‘MMM DD’. xlabel will be used as the labels for the plot X-axis since DayHourID is meaningless.
Read a days worth of status from the table. Not, to keep costs down, the table has only 1 RCU allocated, so reading by day should prevent spikes in RCU usage.
Add the statuses to the dataframe. Calculate the index and assign the values to that index. This way a missing status leaves the dataframe values set to their default, which is 1 (Outage).
Increment the timestamp to the next day.

Now the dataframe can be aggregated by DayHourId which will result in 29 * 24 = 696 rows which will be a good amount of detail in a plot that fits on the typical screen.

I added a step where the dataframe was exported as a CSV file to /tmp. I used that file with my own local copy of Jupyter notebooks to tweak the plot into a form I liked.

For the plot itself the first line of that section of the function creates the plot. The rest is used to create the axis labels and replace the X axis values with the xlabel array of values (MMM DD). The last line saves the plot into a PNG file in /tmp.

The final step is to move the CSV and PNG files to S3.

Here’s the code!

from datetime import datetime as dt, timedelta
from json import dumps
from boto3.dynamodb.conditions import Key, Attr
import pandas as pd
import numpy as np
import matplotlib
matplotlib.use('AGG')
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator
import boto3
import json

print('Loading function')

def handler(event, context):
    print("Received event: " + dumps(event, indent=2))

    # initialize

    ddb = boto3.resource('dynamodb')

    tablename = 'mcdeath-wan'
    tstartdt = dt.today().date() - timedelta(days=29)
    tenddt = dt.today().date() - timedelta(days=1)
    tdt = tstartdt
    xlabel = ['']

    # initialize the raw dataframe

    rdf = pd.DataFrame(index=np.arange(0, 29 * 24 * 12), columns=['DOWHourId', 'Wan'])
    rdf['DOWHourId'] = rdf.apply(lambda x: ((int((x.name) / (24 * 12)) + 1) * 100) + (((int((x.name) / 12)) % 24) *(100.0/24.0)), axis = 1)
    rdf.fillna(1, inplace=True)

    # loop through the days

    for d in range(1,30):
        # add to xlabel

        xlabel = xlabel + [tdt.strftime('%b %d')]

        # read from dynamodb table

        dayid = tdt.strftime('%d%m%y')
        tbl = ddb.Table(tablename)
        resp = tbl.query (TableName=tablename, KeyConditionExpression=Key('DayId').eq(dayid))

        # add to dataframe

        for itm in resp['Items']:
            h = float(itm['CreateTS'][8:10])
            m = float(itm['CreateTS'][10:12])
            idx = ((int(d) - 1) * (24 * 12)) + int(h * 12) + int(m / 5)
            rdf.at[idx, 'Wan'] = 1 - int(itm['AliveInd'])

        # increment the timestamp

        tdt = tdt + timedelta(days=1)

    # aggregate the week and add to the chart dataframe

    cdf = rdf.groupby('DOWHourId').sum()

    # save the data

    cdf.to_csv('/tmp/mcdeath-wan-graph.csv')

    # build the plot

    ax = cdf.plot(kind='line', xlim=[100,3000], figsize=[11,6])
    ax.set_xlabel('Day of the Week')
    ax.set_ylabel('Full or Partial Outages')
    plt.xticks(rotation=-45)
    if (cdf['Wan'].min() + cdf['Wan'].max()) == 0:
        plt.ylim(-0.01, 1.01)
    majorLocator = MultipleLocator(100)
    minorLocator = MultipleLocator(100)
    ax.xaxis.set_major_locator(majorLocator)
    ax.xaxis.set_minor_locator(minorLocator)
    ax.set_xticklabels(xlabel)
    for tick in ax.xaxis.get_major_ticks():
        tick.label1.set_horizontalalignment('left')

    plt.tight_layout(rect=[0.03, 0, 1, 1])

    # save the plot

    plt.savefig('/tmp/mcdeath-wan-graph.png')

    # move file to s3

    s3 = boto3.client('s3')
    s3.upload_file ('/tmp/mcdeath-wan-graph.png', 'crinc', 'mcdeath/export/mcdeath-wan-graph.png')
    s3.upload_file ('/tmp/mcdeath-wan-graph.csv', 'crinc', 'mcdeath/export/mcdeath-wan-graph.csv')
    s3.upload_file ('/tmp/mcdeath-wan-graph.png', 'jimburnham.cloud', 'mcdeath-wan-graph.png')

    return