Using AWS Lambda with API Gateway

AWS Lambda is a “serverless compute” service from AWS that lets you run Node.js, Python, or other code. The documentation is extensive, but I found a disconnect: I was struggling to get API Gateway to map requests into Lambda functions. I put together a YouTube video, and some of the highlights are below. The source code for the project is on GitHub.


Lambda

The Lambda function I demo is very simple, returning a URL based on a given input. When creating it, filter for “hello-world”, and adjust the following:

Name: bsdfinder
Description: BSD Finder
Role: lambda_basic_execution

For the others, just accept the defaults. The IAM Role can be tweaked to allow your lambda function to access S3 as well.

Your handler needs to accept event and context objects. The event object is how properties are fed into your function from API Gateway, and the context object is how execution is stopped and response values are passed out.
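The demo in the video is written in Node.js, where the response goes out through the context object; in Python, Lambda simply uses the return value. Purely as an illustration of that contract, here is a minimal Python sketch along the same lines. The URL table and handler are guesses based on the test data below, not the code from the repo:

# Minimal sketch of a Python Lambda handler along the lines of the demo.
# The URL table is illustrative, inferred from the test data below.
BSD_SITES = {
    "open": "http://www.openbsd.org",
    "free": "http://www.freebsd.org",
    "net": "http://www.netbsd.org",
}

def lambda_handler(event, context):
    # event carries whatever properties the API Gateway maps in
    location = BSD_SITES.get(event.get("bsd", ""))
    if location is None:
        # raising marks the invocation as failed
        raise ValueError("unknown BSD flavour: %s" % event.get("bsd"))
    # the return value is what the Integration Response maps back out
    return {"code": 302, "location": location}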

Test it

Once it’s saved, click Actions → Configure test event. Use this test data:

{
  "bsd": "open"
}

The execution should succeed and the results look like:

{
  "code": 302,
  "location": "http://www.openbsd.org"
}

If you got this far, the lambda function works.

API Gateway

API Gateway fronts requests from the internet and passes them into the Lambda function. The workflow is:

Method Request → Integration Request → Lambda → Integration Response → Method Response

Method Request: Specify which properties (query string, headers) will be accepted
Integration Request: Map method request properties into the Lambda function
Lambda: Properties are passed in through the event object and passed out through the context object
Integration Response: Map Lambda context properties to the response
Method Response: The response is finally returned to the user agent (UA)
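Once everything is wired up and deployed, the whole chain can be exercised from any HTTP client. A minimal sketch, assuming the Python requests package, a placeholder invoke URL, and an Integration/Method Response configured to turn the Lambda output into a 302 redirect:

# Hedged sketch: call the deployed API Gateway endpoint from Python.
# The URL is a placeholder; the redirect behavior depends on how you map
# the Lambda output in the Integration Response and Method Response.
import requests

resp = requests.get(
    "https://abc123.execute-api.us-east-1.amazonaws.com/prod/bsdfinder",
    params={"bsd": "open"},
    allow_redirects=False,
)
print(resp.status_code, resp.headers.get("Location"))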


I deliberately kept this document thin. Check out the YouTube video to see it all come together.


AD authentication for RavenDB

“RavenDB 2.5.2750, IIS 7.0.” Wow, I’ve started a lot of posts in the RavenDB Google group like that. This is what I’ve learned about RavenDB AD authentication.

  1. You need a valid commercial license.
  2. You need to enable both Windows authentication and Anonymous.
  3. Your web.config must have Raven/AnonymousAccess set to none
  4. Users need explicit access to the <system> DB to create databases (all won’t cut it)
  5. Putting users in Backup Operators allows backups (who knew)
  6. Local admins are always admins!

I’m not going to cover the commercial license bit, that’s easy enough.

Authentication modes

It seems obvious that you would enable Windows Authentication and disable Anonymous in IIS. Turns out, this is not the case. According to Oren:

Here is what happens.
We make a Windows Auth request to get a single use token, then we make another request with that token, which is the actual request.
The reason we do this (for streaming as well as a bunch of other stuff) is that windows auth can do crazy things like replay requests, and there is the payload to consider.
You still keep Windows auth enabled, so that IIS will handle dealing with the creds, but raven will set the header.

The web.config

The only thing that needs to be set in here is:
    <add key="Raven/AnonymousAccess" value="None" />
You can also set this to bypass authorization if you’re local to the box:
    <add key="Raven/AllowLocalAccessWithoutAuthorization" value="True" />

Permissions

Permissions are set in the system database → Settings → Windows Authentication. There are user and group tabs. Once you’ve added a group, push the little green plus icon to add new DBs to that user/group.
[Screenshot: Raven group permissions]
In Raven, all the DB permissions live in the system DB, not the DB itself.

Gotchas

Local admins

Yeah, that’s a few weeks we’ll never get back. Regardless of domain group membership, local admins on the server get admin access. That means if you put contoso\Everyone into SERVER\Administrators, then everyone in contoso gets admin access. Surprise!

Backup Operators

This is another loosely documented feature. If you want non-admin users to be able to run backups, make them Backup Operators. Seems obvious, but it’s not written down.

Testing

Raven has a special URL, https://ravendb.example.com/debug/user-info, which will present an authentication challenge and report the user’s permissions. You’ll get something like:
{"Remark":"Using anonymous user","User":null,"IsAdminGlobal":false,"IsAdminCurrentDb":false,"Databases":null,"Principal":null,"AdminDatabases":null,"ReadOnlyDatabases":null,"ReadWriteDatabases":null,"AccessTokenBody":null}

Splunk user agent string lookups with TA-browscap_express

I got a requirement to find out what browsers our clients are using. We run a SaaS product, and every client is clientname.ourdomain.com, so I could use the cs_hostname field in the log. Using a third-party analytics tool was totally out of the question; all I had to go on were the IIS logs.

We’re already getting the IIS logs into Splunk, so with a bit of Googling I found the TA-browscap app by Dave Shpritz. It’s powered by the browscap project and it works. The problem is that the browscap file is now 18MB, and searching it has become very slow. What started as a hack to cache matches in a separate file has turned into a total fork and rewrite of most of the app, and has become TA-browscap_express.

There are installation instructions on the application page at Splunk.com and in the GitHub repo, so I won’t rehash them here.

The Browscap file

The Browser Capabilities Project (browscap) is an effort to identify all known User Agent (UA) strings, which regrettably are a total mess. The project is active, and the data is accurate. They provide the data in a number of formats: the legacy INI file still used by PHP and ASP, a CSV file, and others. The file is 18MB and 58,000 lines long.

The structure of the file is a name pattern for a UA string, followed by all the known properties. My UA string is:

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0

And the matching name pattern is:

Mozilla/5.0 (*Windows NT 6.1*WOW64*) Gecko* Firefox/31.0*

The example above matches Firefox 31 on Windows 7 x64. Now here is an interesting challenge:

Mozilla/5.0 (*Windows NT 6.1*) Gecko* Firefox/31.0*

This name pattern matches Firefox 31 on Windows 7 x86. It also matches x64. If you take the first match, you’ll get the wrong information. To get an accurate lookup, you need to compare all 58,000 name patterns, and the longest one that matches is the most correct. As you can imagine, this is quite a challenge.

Parsing the browscap.csv file

The TA-browscap app uses pybrowscap, a Python library for parsing and managing the browscap.csv file. The library returns an object with properties for all the fields in the browscap file. I didn’t want to check 58,000 name patterns every time, so I wanted the successful pattern as well. pybrowscap doesn’t provide it, and it’s actually hard to re-create because the library uses Python’s built-in CSV parser.

The solution was to lift the core logic from pybrowscap and rewrite it myself, parsing strings as CSV data instead of files. The first thing you have to do is convert the name pattern into a regex, which is easy, then compare your challenge string against it. As described above, you then loop through every name pattern and keep the longest match.
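As a rough illustration of that idea (not pybrowscap’s or the app’s actual code), the pattern-to-regex conversion and longest-match loop look something like this:

# Rough illustration of longest-pattern matching; not the app's actual code.
import re

def pattern_to_regex(pattern):
    # Escape regex metacharacters, then turn the browscap wildcards into
    # regex: '*' matches anything, '?' matches a single character.
    escaped = re.escape(pattern)
    return re.compile("^" + escaped.replace(r"\*", ".*").replace(r"\?", ".") + "$")

def best_match(ua, patterns):
    best = None
    for pattern in patterns:                    # all 58,000 in the worst case
        if pattern_to_regex(pattern).match(ua):
            if best is None or len(pattern) > len(best):
                best = pattern                  # the longest match wins
    return best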

Knowing what to cache

The last entry in the file is “*”, which will match anything. It returns a set of properties called “Default Browser” where everything is false. The idea is that you’ll always get some response; you won’t get null. I didn’t want to cache these “Generic” or “Default” browsers, because once they’re in the cache, they’ll come up for every new UA string, and the data will be junk.

How it works

The app (TA-browscap_express) caches matched UA strings in a file and searches it first. During a query it also keeps matches in memory, using the memory cache before the disk cache. It also supports blacklisting obviously bad UA strings and storing the cache file on a network share to help with distributed search.

When a UA string is passed into the app, it works through the following checks in order (sketched in code after the list):

  1. Is it blacklisted? If yes, return default browser.
  2. Is it in the memory cache? If yes, return the entry.
  3. Is it in the browscap_lite.csv file? If yes, add to memory cache and return.
  4. Is it in browscap.csv? If yes, add it to browscap_lite.csv and the memory cache, then return it.
  5. If totally unidentifiable, return the default browser.
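A sketch of that flow, with illustrative helper names rather than the actual functions in TA-browscap_express:

# Sketch of the lookup order above. Helper and field names are illustrative
# stand-ins, not the actual code in TA-browscap_express.
DEFAULT_BROWSER = {"browser": "Default Browser"}

def is_generic(props):
    # catch-all patterns like "*" yield "Default Browser"/"Generic" entries,
    # which should never be cached (see "Knowing what to cache" above)
    browser = props.get("browser", "")
    return "Default" in browser or "Generic" in browser

def lookup(ua, memory_cache, blacklist, search_csv, append_csv):
    """search_csv(path, ua) returns a property dict or None;
    append_csv(path, ua, props) appends a row to a cache file."""
    if ua in blacklist:                              # 1. blacklisted
        return DEFAULT_BROWSER
    if ua in memory_cache:                           # 2. memory cache
        return memory_cache[ua]
    props = search_csv("browscap_lite.csv", ua)      # 3. disk cache
    if props is None:
        props = search_csv("browscap.csv", ua)       # 4. full browscap file
        if props is not None and not is_generic(props):
            append_csv("browscap_lite.csv", ua, props)
    if props is None:                                # 5. unidentifiable
        return DEFAULT_BROWSER
    memory_cache[ua] = props
    return props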


Browscap_lite.csv

The browscap_lite.csv file is the cache file which is checked before browscap.csv. It’s in the same format, and has the same fields. Matched UA strings are written to it.

The default location for the file is the app’s bin directory. In a Splunk distributed environment, that’s not really a good idea: you never know which search head or indexer the app will run on, and you’ll end up rebuilding the cache over and over. The browscap_lookup.ini file lets you specify a location for the file.

The blacklist

Some UA strings are junk and won’t ever be added to the browscap file. Others could be added, and I suggest you report any new strings to browscap.org, but that may take a while, or you may just not care. The blacklist.txt file is used to weed out garbage UA strings so that you don’t waste time looking them up in browscap.csv, only to get the “*” Default Browser.


The fields

The app returns all fields available in the browscap file. Two I find especially interesting are:

ua_comment: A combination of the browser name and version, so that Internet Explorer 11 becomes IE 11.

ua_platform: A combination of the operating system and version, so that Windows 7 becomes Win7.


There is one additional field which I added, ua_fromcache, which returns true, false, or blacklist, depending on where the data came from.


The demo

Sorry in advance for my droning voice, but if you want to see the setup and usage in action, check out my YouTube video.


Getting started in ASP.Net debugging with WinDbg

We’ve all faced the dreaded “the application is slow” or “it eats all my RAM” complaint. You take a memory dump and recycle the app pool. But then what? Some of the people I work with are memory dump gurus, but the barrier to entry seemed pretty high. This is a quick primer on how to dive into a dump file. I’m not an expert, and by no means will you be one after reading this. “Advanced .Net debugging in under an hour” is how our PFE explained it.

Note: The “bitness” of the dump file is important throughout. A 64-bit dump requires the 64-bit debugger and the 64-bit symbols.

First install WinDbg. It’s a good idea to put it in C:\WinDbgx86 or C:\WinDbgx64, since the default path is a bit unwieldy.

Next create a directory called c:\symbols. This will become your symbol cache.

Now run WinDbg.exe. It will be in the x86 or x64 directory.

In WinDbg, pick File → Symbol File Path and enter the following string:

srv*c://symbols*http://msdl.microsoft.com/download/symbols

Note the forward slashes in the path.

Next you’ll need the symbol files. If you have the exact .Net version to match the dump, you can run

.loadby sos clr

The symbol file is included in the .Net install and will be imported automatically. If your version does not match, you can use psscor. There are psscor2 and psscor4, depending on the framework version. Search Google for the download, and save the DLL file in C:\WinDbg(bit)\(bit) (i.e. c:\WinDbgx64\x64). Import psscor by running

.load psscor4

Now your debugging environment is ready. Open a dump file. I’ve got a (short) reference table below, but the way I work through it is this:

!ASPXPages → Will dump all ASPX requests and their corresponding threads. If the thread ID is “XXX” then it’s already completed and there is no stack.

!runaway → Shows all threads and their runtime. The slowest thread is on top. Once you’ve picked a thread ID, change to its context by executing:

~XXs → Where XX is the thread ID, e.g. ~29s switches to the context of thread 29. In a thread context, run:

!CLRStack → Lists the .Net call stack for the thread

!dso → Dump Stack Objects, will show all the objects on the stack.

That’s pretty well it. From here, you use !do <addr> to dump objects located at the specified address. Run !dso, take any object address, and run !do on it. You’ll get the idea.

!sym noisy: Print output about automatic symbol file fetching
!sym quiet: Undo !sym noisy
~: List threads
~##s: Switch to a thread, where ## is the thread number, e.g. ~10s switches to thread #10
!threads: Lists managed (.Net) threads
!ASPXPages: Lists threads handling ASPX pages. If the return code is XXX then the thread has completed.
!runaway: Shows how long threads have been executing. Look at the top entries for obvious slowdowns.
!CLRStack: Once you’re in a thread context, shows the .Net call stack for that thread
!dso: Dump Stack Objects; this is the good stuff. It shows all the objects on the stack.
!do ADDR: Dump the object at the specified address
!da ADDR: Dump the array at the specified address
!dumpheap [-stat] [-min ###]: Dumps all the objects on the heap. Adding -stat shows a count by type and a sum of sizes, which helps identify the large objects in the heap. Adding -min filters out objects smaller than ###.
!dae: Dump All Exceptions
!pe: Print the exception, if any, for the current thread
!GCRoot ADDR: Finds all the objects which reference the object at ADDR

Dumping an object

Suppose you run !CLRStack and see the following:

000000001432d078 0000000077746eba [NDirectMethodFrameStandalone: 000000001432d078] <Module>.SNIReadSync(SNI_Conn*, SNI_Packet**, Int32)
000000001432d040 000007fee89669b7 DomainNeutralILStubClass.IL_STUB_PInvoke(SNI_Conn*, SNI_Packet**, Int32)
000000001432d120 000007fee894e9cf SNINativeMethodWrapper.SNIReadSync(System.Runtime.InteropServices.SafeHandle, IntPtr ByRef, Int32)
000000001432d190 000007fee894e6cb System.Data.SqlClient.TdsParserStateObject.ReadSni(System.Data.Common.DbAsyncResult, System.Data.SqlClient.TdsParserStateObject)
000000001432d230 000007fee894e587 System.Data.SqlClient.TdsParserStateObject.ReadNetworkPacket()

The top three calls are native calls and can’t be debugged. They’re invoked by the .Net runtime to do some work. The last .Net call was SqlClient.TdsParserStateObject.ReadSni.

Now run !dso and scroll up to the top.

0:068> !dso
OS Thread Id: 0x5084 (68)
RSP/REG          Object           Name
000000001432CB00 000000024ff60b58 System.Reflection.RtFieldInfo
000000001432CBF8 000000024ff9ce90 System.Collections.Generic.Stack`1[[System.Byte[], mscorlib]]
000000001432CC40 000000016feb5318 System.Byte[][]
000000001432CC50 000000016feb0c48 System.Byte[]
000000001432CC60 000000012fd89b00 System.DefaultBinder
000000001432CC70 000000024ff9ce90 System.Collections.Generic.Stack`1[[System.Byte[], mscorlib]]
000000001432CC88 000000016feb0c48 System.Byte[]
000000001432CC90 000000016feb0c48 System.Byte[]
000000001432CCA0 000000024ff9ce70 System.Runtime.SynchronizedPool`1+GlobalPool[[System.Byte[], mscorlib]]
000000001432CCB0 000000024ff602f0 System.ServiceModel.Description.MessageDescription
000000001432CCB8 000000016feb0c48 System.Byte[]
000000001432CCC0 000000024ff9ce18 System.Runtime.SynchronizedPool`1[[System.Byte[], mscorlib]]
000000001432CCD8 000000024ff9ce70 System.Runtime.SynchronizedPool`1+GlobalPool[[System.Byte[], mscorlib]]
000000001432CCE0 00000000ffd40e40 System.Collections.Generic.Stack`1[[System.ServiceModel.Channels.TextMessageEncoderFactory+TextMessageEncoder+UTF8BufferedMessageData, System.ServiceModel]]
000000001432CD70 00000000ffd40e68 System.ServiceModel.Channels.TextMessageEncoderFactory+TextMessageEncoder+UTF8BufferedMessageData
000000001432D0C0 00000001dfefba18 System.Data.SqlClient.SqlCommand

Again, the top few lines are system level and can be ignored. Near the bottom you see the System.Data.SqlClient.SqlCommand object.

Use !do to dump that object

!do 00000001dfefba18

From here the results will be specific to the object. Its properties and values will be listed. There are two types of variables: reference and value. Value variables are almost always integers; the Value column of the results makes it clear which is which.

              MT    Field   Offset                 Type VT     Attr            Value Name
000007feeab15a48  40001e0        8        System.Object  0 instance 0000000000000000 __identity
000007feefb915c8  40002c3       10 ...ponentModel.ISite  0 instance 0000000000000000 site
000007feefb90280  40002c4       18 ....EventHandlerList  0 instance 0000000000000000 events
000007feeab15a48  40002c2      190        System.Object  0   shared           static EventDisposed
                                 >> Domain:Value  0000000001f7e030:NotInit  0000000003b89f20:000000010fe0b520 <<
000007feeab1c7d8  4001733       b0         System.Int32  1 instance           621735 ObjectID
000007feeab168f0  4001734       20        System.String  0 instance 000000020fe27720 _commandText
000007fee89a1818  4001735       b4         System.Int32  1 instance                4 _commandType
000007feeab1c7d8  4001736       b8         System.Int32  1 instance               30 _commandTimeout
000007fee8ea1668  4001737       bc         System.Int32  1 instance                3 _updatedRowSource
000007feeab1d608  4001738       d0       System.Boolean  1 instance                0 _designTimeInvisible
000007fee8ed46c0  4001739       28 ...ent.SqlDependency  0 instance 0000000000000000 _sqlDep
000007fee89a1b78  400173d       30 ...rameterCollection  0 instance 00000001dfefbaf8 _parameters

Use !do on the value to get the object referenced by that address. In the case of an SqlCommand object, _parameters and _commandText will be the most interesting.

Finding unused app pools with PowerShell

I’ve been tasked with cleaning up our UAT environment and needed a way to find unused app pools. To make things more exciting, we don’t have Microsoft.Web.Administration.ServerManager installed, so the native PowerShell way wasn’t going to work. This is my solution: invoke appcmd and parse the results as strings.

$command = "c:\windows\system32\inetsrv\appcmd.exe list app"
$apps = Invoke-Expression $command

$command = "c:\windows\system32\inetsrv\appcmd.exe list apppool"
$pools = Invoke-Expression $command
foreach ($pool in $pools)
{
    $used = $false
    foreach ($app in $apps)
    {
        $apppool = $app
        $apppool = $apppool.Substring($apppool.indexof(":")+1)
        $apppool = $apppool.Replace(")","")
        $apppool = $apppool.trim()
        if ($pool.contains($apppool))
        {
            $used = $true
            break
        }

    }
    echo "$used $pool"
}

MediaWiki VisualEditor Parsoid on Windows Server 2012

The steps to get Parsoid working on Windows are here.

The MediaWiki project has been working on a visual text editor. It’s the default editor for the main namespace at mediawiki.org, and it’s in an early trial at Wikipedia.org. They’ve done a great job; I really like it. It also has some serious challenges to overcome, as outlined in a blog post by project lead Gabriel Wicke. Their solution is a project called Parsoid, which stands between VisualEditor and the wikitext that powers the project.

Parsoid is a Node.js project. I needed to get it running on a Windows server, and I figured it would be pretty easy (Node runs on Windows). I followed the instructions and quickly ran up against some red errors. The discussion page for the project had numerous complaints about it not working on Windows, and Google was no help. I installed it on a Linux box and grepped the entire tree for “windows”, and lo, the last result revealed unto me the truth:

### Windows

* A recent copy of the *x86* version of [Node.js for Windows](http://nodejs.org/download/), *not* the x64 version.
* A copy of [Visual C++ 2010 Express](http://www.microsoft.com/visualstudio/eng/downloads#d-2010-express).
* A copy of [Python 2.7](http://www.python.org/download/), installed in the default location of `C:\Python27`.

So that was it: with the dependencies satisfied, Parsoid installed correctly and ran normally. It turns out that Parsoid has a deeply nested dependency on a module called “contextify” (parsoid → html5 → jsdom → contextify). Contextify has to compile something (honestly I have no idea what) and expects Python and a C compiler. These are standard tools on a Linux system, but not on Windows.

Server side browser features with phpcaniuse

The Browser Capabilities Project is an effort to determine browser capabilities based on the User-Agent string. It’s a great project which has kept up to date with UA strings. The data is used to power the PHP function get_browser(), and is used by the phpbrowscap standalone project. The challenge with the project is that its feature list hasn’t kept up with the rapid development of web technology.

The CanIUse.com project catalogs granular data about the capabilities of web browsers. They’ve done an excellent job of tracking emerging features like HTML5 canvas and web audio. They also make all of the data available as a json file on github. Thanks!

So what do you do when you have two different projects, with different goals but some overlap? You create a mashup! Enter phpcaniuse, which will let you check browser capabilities from the caniuse.com project, based on the decoded UA string from the browscap project.

jQuery and other JavaScript frameworks have client-side feature detection, but there are a few scenarios where you might want to do it server-side:

  • Embed a Flash video player or an HTML 5 video
  • Use native websockets or flash based (or, ew, timer based)
  • Send SVG as native markup, or pre-render it as a PNG

DISCLAIMER: Users can theoretically change their UA string, so it can’t be trusted 100%. It’s an advanced feature, though, so anyone who does deserves the broken web experience they get.

Using phpcaniuse

phpcaniuse is a Composer project. You’ll need the following require:

"robertlabrie/phpcaniuse": "dev-master"

Then run composer install as usual. The class is a single file, so if you hate Composer, you can just include CanIUse.php. Then instantiate the object:

$can = new phpcaniuse\CanIUse($browser,$data);

The $browser object can come from either get_browser() or phpbrowscap. CanIUse expects the object, not the associative array. $data is a string containing the contents of data.json from the caniuse.com project. Pass it in as a string, do not decode it.

The methods are pretty straightforward:

check(list): Checks whether the browser supports the specified list of features. Takes a string naming a single feature, or an array of features. Returns the lowest level of support for the list.
featureList(): Returns an associative array of all the features tracked by caniuse. The key is the key used to check a feature, and the value is a friendly description.
featureGet(feature): Returns the set of JSON data for the specified feature.
browserCan(): Returns an associative array of all features and their status within the browser. The key is the feature, the value is the status.
agentMapAdd(browscapName, caniuseName): Adds a mapping between a browscap name and a caniuse name.
browserSet(browser): Sets the browser object. Must come from get_browser() or phpbrowscap.
dataSet(data): Sets the JSON data from caniuse. Must be a string, not a decoded JSON object or array.
dataGet(): Returns the caniuse JSON data as an array.

phpcaniuse maps the browser name returned by browscap to the one used by caniuse. This is currently done for Firefox, Internet Explorer, Opera, Safari, and Chrome. Mobile browsers are not done, but if you can identify a mapping, please tell me.

The package includes a demo.php file which will tell you all about your browser.

Understanding the CanIUse.com data.json

The caniuse.com data.json file is broken up into several sections. I’ve laid it out below with my working notes, but the best way to understand it is print_r().

agents: array of browsers. The major ones are listed here; many others are supported.
    ie
    firefox
        browser: "Firefox". Seems to match the ->Browser property from browscap.
        versions: …,"24","25". Array of browser versions, where the index is used by caniuse.
    chrome
    safari
    opera
statuses: status codes for features
    rec: Recommendation
    pr: Proposed Recommendation
    cr: Candidate Recommendation
    wd: Working Draft
    other: Other
    unoff: Unofficial / Note
cats: categories for capabilities
updated: date(), the date the data was last updated
data: one entry per feature
    title: a friendly title
    description: a good description
    spec: URL to the specification
    status: relates to the status codes above
    stats: associative array of browser, then version, then feature support
        firefox: or any agent name
            23: version 23
                x: supported or not

The actual status codes are outlined below. They’re based on reverse engineering and might not be 100% accurate:

n: Not supported
a: Partially supported
a x: Partially supported, with prefix
p: Polyfill required
y x: Supported, with prefix
y: Supported
u: Unknown
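To make the structure concrete, here is a minimal sketch of looking up one feature’s status straight from data.json. It’s written in Python just to show the shape of the data, independent of phpcaniuse; the browser, version, and feature key are example values:

# Minimal sketch: read caniuse.com's data.json directly to show its shape.
# The browser, version, and feature key below are example values.
import json

with open("data.json") as f:
    data = json.load(f)

browser, version, feature = "firefox", "23", "canvas"
status = data["data"][feature]["stats"][browser][version]  # e.g. "y", "n", "a x"
print("%s support in %s %s: %s" % (feature, browser, version, status))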