Tracking WAN Uptime to My ASUS/Merlin Router – Lambda Check

A previous post introduced the WAN uptime system that I added for my home ASUS/Merlin router and described its architecture.  This post focuses on the Lambda function that checks for connections from the house router via the LAN status data.  The code can be found at the end of this post.

This Lambda function checks for a new record in the LAN table within the last 5 minutes.  Such a record indicates that the home router was connected to the internet at some point in the last 5 minutes.  The function starts by initializing some variables and retrieving the router's IP address from Route 53.  A separate post describes the DDNS system I created to keep that value up to date.
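Route 53 returns record sets in sorted order starting at the requested name, so the first entry is not guaranteed to belong to the host itself. Here is a minimal sketch of a more defensive parse, assuming the usual shape of a `list_resource_record_sets` response. The helper name `extract_ip` and the sample values are hypothetical and are not part of the function at the end of this post.

```python
def extract_ip(rrs, host, default="0.0.0.0"):
    """Return the first A-record value for host, or default if none matches."""
    wanted = host.rstrip(".") + "."  # Route 53 reports fully qualified names
    for record_set in rrs.get("ResourceRecordSets", []):
        if record_set.get("Type") == "A" and record_set.get("Name") == wanted:
            values = record_set.get("ResourceRecords", [])
            if values:
                return values[0]["Value"]
    return default


# Hypothetical response, shaped like the output of list_resource_record_sets
sample = {"ResourceRecordSets": [
    {"Name": "router.example.com.", "Type": "A",
     "ResourceRecords": [{"Value": "203.0.113.7"}]},
]}
print(extract_ip(sample, "router.example.com"))  # 203.0.113.7
```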

Next, the function reads the most recent record in the DynamoDB LAN table.  The partition key for the LAN table is based on the date, so there may be no records for today yet (just after midnight, for example).  In that case yesterday is checked as well.  If there is a last record in the LAN table and it has a CreateTS within the last 5 minutes, the WAN is considered alive (newalive = 1); otherwise it is considered dead (newalive = 0).
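The freshness test can be sketched as a small pure function. This is an illustration rather than the exact code used below; the name `is_alive` and the `window_secs` parameter are made up for the example. It uses `total_seconds()` because `timedelta.seconds` drops whole days, which would make a record from a day ago look recent.

```python
from datetime import datetime

TS_FORMAT = "%Y%m%d%H%M%S"  # same layout as the CreateTS attribute


def is_alive(last_create_ts, now, window_secs=300):
    """True if the last LAN record was written within window_secs of now."""
    if last_create_ts is None:  # no record found for today or yesterday
        return False
    last = datetime.strptime(last_create_ts, TS_FORMAT)
    return (now - last).total_seconds() < window_secs


now = datetime(2024, 1, 2, 0, 3, 0)
print(is_alive("20240102000100", now))  # 2 minutes old -> True
print(is_alive("20240101000100", now))  # 1 day and 2 minutes old -> False
```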

Then the Lambda function reads the DynamoDB WAN table for the last record.  The partition key for the WAN table is also based on the date, so today and possibly yesterday are checked for the last WAN record.  This record provides the previous AliveInd (oldalive) and CreateTS values.  As described in the intro, if the previous CreateTS value is more than 6 minutes old, then the new CreateTS value is set to the old value plus 5 minutes.
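The timestamp adjustment amounts to the following sketch. The function name `next_create_ts` and its parameters are hypothetical; the 6 minute threshold and 5 minute step are the values described above.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%Y%m%d%H%M%S"  # same layout as the CreateTS attribute


def next_create_ts(prev_create_ts, now, max_gap_secs=360, step_minutes=5):
    """Return the CreateTS to store for this run.

    If the previous run was more than max_gap_secs ago, pretend this run
    happened step_minutes after the previous one; otherwise use now.
    """
    prev = datetime.strptime(prev_create_ts, TS_FORMAT)
    if (now - prev).total_seconds() > max_gap_secs:
        return (prev + timedelta(minutes=step_minutes)).strftime(TS_FORMAT)
    return now.strftime(TS_FORMAT)


# On time (5 min 10 s after the previous run): keep the real timestamp
print(next_create_ts("20240101120000", datetime(2024, 1, 1, 12, 5, 10)))
# Late (9 min after the previous run): store previous + 5 minutes
print(next_create_ts("20240101120000", datetime(2024, 1, 1, 12, 9, 0)))
```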

If the newalive value does not equal the oldalive value, the function creates a message indicating the change in status and publishes it to an SNS topic that texts the message to my cell phone.
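The notify-on-change rule can be isolated from SNS so that it is easy to try locally. In this sketch `notify_if_changed` and its `publish` argument are hypothetical; in the Lambda, `publish` would wrap the `sns.publish` call with the topic ARN.

```python
def status_message(newalive):
    """Build the text that is sent when the status changes."""
    status = "up" if newalive == 1 else "down"
    return "Router connection to the internet is " + status + "."


def notify_if_changed(oldalive, newalive, publish):
    """Call publish(message) only when the status flipped; return True if sent."""
    if newalive == oldalive:
        return False
    publish(status_message(newalive))
    return True


sent = []  # stands in for the SNS topic
notify_if_changed(1, 0, sent.append)  # was up, now down: one message
notify_if_changed(0, 0, sent.append)  # still down: nothing sent
print(sent)
```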

The function finishes by saving the latest data into the WAN table.
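For reference, the saved record has the shape sketched below. The helper `build_wan_item` is hypothetical. The ExpirationEpochSecs value is 31 days in the future and is presumably meant for the table's DynamoDB TTL setting, although that is an assumption on the part of this example.

```python
import time


def build_wan_item(day_id, create_ts, alive, ip, now_epoch=None, ttl_days=31):
    """Assemble the item written to the WAN table for this run."""
    now_epoch = int(time.time()) if now_epoch is None else now_epoch
    return {
        "DayId": day_id,            # partition key, based on the date
        "CreateTS": create_ts,      # possibly adjusted timestamp of this run
        "ExpirationEpochSecs": now_epoch + ttl_days * 86400,
        "AliveInd": alive,          # 1 = up, 0 = down
        "ID": ip,                   # router IP address from Route 53
    }


item = build_wan_item("010124", "20240101120000", 1, "203.0.113.7", now_epoch=1000)
print(item["ExpirationEpochSecs"])  # 1000 + 31 days in seconds
```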

Here is the code.


from time import localtime, strftime, time
from json import dumps
from boto3.dynamodb.conditions import Key
from datetime import datetime, timedelta
import boto3

print('Loading function')

dns = boto3.client('route53')
ddb = boto3.resource('dynamodb')
sns = boto3.client('sns')


def respond(err, res=None):
    return {
        'statusCode': '400' if err else '200',
        'body': str(err) if err else res
    }


def lambda_handler(event, context):
    print("Received event: " + dumps(event, indent=2))

    # initialize

    host='<domain name of host>'
    wanTableName = 'wan'
    lanTableName = 'lan'
    todayid = strftime('%d%m%y', localtime())
    timestamp = strftime('%Y%m%d%H%M%S', localtime())
    yesterdayid = strftime('%d%m%y', localtime(time() - 86400))
    expiration = int(time()) + 2678400 # plus 31 days in seconds
    error = False

    # get ip
    
    rrs = dns.list_resource_record_sets(HostedZoneId='<hosted zone ID>', StartRecordName=host, StartRecordType='A')
    ip = "0.0.0.0" if len(rrs['ResourceRecordSets']) == 0 else rrs['ResourceRecordSets'][0]['ResourceRecords'][0]['Value']

    # read from dynamodb table lan to see if we have a signal in the last 5 minutes
    
    tbl = ddb.Table(lanTableName)
    resp = tbl.query (TableName=lanTableName, Limit=1, ScanIndexForward=False, KeyConditionExpression=Key('DayId').eq(todayid))
    if len(resp['Items']) == 0:
        resp = tbl.query (TableName=lanTableName, Limit=1, ScanIndexForward=False, KeyConditionExpression=Key('DayId').eq(yesterdayid))
    
    newalive = 0
    now = datetime.strptime(timestamp, '%Y%m%d%H%M%S')
    if len(resp['Items']) > 0:
        last = datetime.strptime(resp['Items'][0]['CreateTS'], '%Y%m%d%H%M%S')
        diff = now - last
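        # total_seconds() counts whole days too; .seconds alone would treat a day-old record as fresh
        if diff.total_seconds() < 300: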
            newalive = 1

    # read from dynamodb table wan
    
    tbl = ddb.Table(wanTableName)
    resp = tbl.query (TableName=wanTableName, Limit=1, ScanIndexForward=False, KeyConditionExpression=Key('DayId').eq(todayid))
    if len(resp['Items']) == 0:
        resp = tbl.query (TableName=wanTableName, Limit=1, ScanIndexForward=False, KeyConditionExpression=Key('DayId').eq(yesterdayid))
    oldalive = 1 if len(resp['Items']) == 0 else int(resp['Items'][0]['AliveInd'])

    # may need to adjust CreateTS due to cloudwatch events running many minutes late.
    # if the difference between the last run and this run is greater than 6 minutes, use 5 minutes
    # yes, this is ugly, but the plotted results look bad with a gap of more than 5 minutes. 
    # this isn't a perfect solution, but it should work through most problems.
 
    if len(resp['Items']) > 0:
        last = datetime.strptime(resp['Items'][0]['CreateTS'], '%Y%m%d%H%M%S')
        diff = now - last
        if diff.total_seconds() > 360:
            timestamp = (last + timedelta(minutes=5)).strftime('%Y%m%d%H%M%S')
 
    # determine new status
    
    status = 'up' if newalive == 1 else 'down'
    msg = 'Router connection to the internet is ' + status + '.'
    
    # check for need to notify
    
    if newalive != oldalive:
        sns.publish(TopicArn='arn:aws:sns:<region>:<account #>:<topic>', Message=msg)

    # write a new record to the table wan
    
    tbl.put_item (Item={
        'DayId': todayid,
        'CreateTS': timestamp,
        'ExpirationEpochSecs': expiration,
        'AliveInd': newalive,
        'ID': ip
    })
    
    return respond(None, msg)