Vimana Diaries: Recon Flights in Python CGI landscapes - part IV

Mapping Application Structure Through Exception-Driven Insights

Sep 17, 2024

Understanding Multi-Type Trigger Exceptions: Key to Mapping Application Behavior

In this research, multi-type trigger exceptions (as presented in part II) play a pivotal role in understanding and mapping the internal workings of an application. These types of exceptions occur when different types of input parameters trigger various error states, allowing the attacker or researcher to infer details about the expected data structures and the flow of data within the application.

In this part of our journey through Python CGI Landscapes, we explore how multi-type trigger exceptions serve as a mechanism for revealing hidden aspects of an application's internal logic. We also explain how these exceptions can be systematically exploited to map out the behavior of an application. By leveraging different inputs, we can identify the types of data the application expects, how it processes them, and what vulnerabilities might arise when these expectations are unmet.

The Role of Controlled Fuzzing

Multi-type trigger exceptions arise when an application encounters unexpected data types in its input, which results in errors specific to how those data types are processed. These errors provide valuable feedback that can be used to deduce the structure and flow of data in the application.
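
As a minimal local sketch of this feedback (no server involved), the snippet below indexes several JSON-decoded values with a string key, the way a handler expecting an object would, and records which exception each one raises. The helper name `probe` and the default key `"config"` are illustrative assumptions.

```python
import json

def probe(raw, key="config"):
    """Decode raw JSON text, index the result with a string key, and
    return the name of the exception raised (or None on success)."""
    try:
        json.loads(raw)[key]
    except Exception as exc:  # broad on purpose: every error type is a signal
        return type(exc).__name__
    return None

# Each input type fails in its own way, which is the signal being mapped.
for raw in ['[]', '""', '123', '"', '{}', '{"config": 1}']:
    print(f"{raw!r:16} -> {probe(raw)}")
```

Lists, strings, and integers fail with a TypeError, malformed text with a JSONDecodeError, and an object without the key with a KeyError; only the last input passes.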

An earlier analysis of this kind triggered and captured a number of concrete exceptions for further investigation. To reproduce the technique, consider the Python-based web application script below.

Create a directory test_cgitb, a subdirectory named cgi-bin inside it, and a Python module called script.py inside cgi-bin:

mkdir -p ~/test_cgitb/cgi-bin && cd ~/test_cgitb/ && vim cgi-bin/script.py

Now add the code below to script.py (the cgi and cgitb modules were removed in Python 3.13, so run it with Python 3.12 or earlier):

#!/usr/bin/env python3

import cgi
import json
import cgitb

cgitb.enable()

print("Content-Type: text/html")
print()  

def process_payload(payload):
    config = payload['config']
    settings = config['settings']
    level = settings['level']
    user_access = level['user_access']
    admin_status = user_access['admin']
    permissions = user_access['permissions']
    print(f"Admin status: {admin_status}")
    print(f"Permissions: {permissions}")
    
    data = payload['data']
    users = data['users']
    
    for user in users:
        username = user['username']
        tasks = user['tasks']
        
        print(f"User: {username}")
        for task in tasks:
            task_id = task['task_id']
            description = task['description']
            print(f"  Task {task_id}: {description}")

state = 'set'
form = cgi.FieldStorage()
payload_raw = form.getfirst("payload", '')
payload = json.loads(payload_raw)
process_payload(payload)

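Then, from the parent directory of cgi-bin (test_cgitb), run the following command to start the server. If the server later answers with a 403 "CGI script is not executable" error, run chmod +x cgi-bin/script.py first: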

python3 -m http.server --cgi 8000

Now, access the application URL:

http://localhost:8000/cgi-bin/script.py

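This should produce a cgitb traceback page reporting a JSONDecodeError, because json.loads() receives an empty string when no payload parameter is supplied.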

In this initial phase of manual testing (which in an automated process would occur independently), it’s crucial to note that when accessing the script directly, we haven’t yet identified any parameters. At this point, by observing the JSONDecodeError traceback, we see the reference to form.getfirst("payload", ''), which reveals the payload parameter as a potential target for further exploration. This marks the turning point where payload becomes our primary vector for the controlled exception-driven fuzzing process.
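
A minimal sketch of how that observation could be automated: scan the traceback text for FieldStorage accessor calls (`getfirst`, `getvalue`, `getlist`) and collect their first argument. The function name `find_parameters` is hypothetical, and the sample line is the one from script.py; the sketch assumes the response shows the source line, possibly HTML-escaped.

```python
import html
import re

# FieldStorage accessors whose first argument is a parameter name.
ACCESSOR = re.compile(r"""\.(?:getfirst|getvalue|getlist)\(\s*(['"])(.+?)\1""")

def find_parameters(traceback_text):
    """Return parameter names referenced in source lines shown by a traceback."""
    text = html.unescape(traceback_text)  # HTML pages escape quotes as &quot; etc.
    names = []
    for _, name in ACCESSOR.findall(text):
        if name not in names:  # keep first-seen order, drop duplicates
            names.append(name)
    return names

sample = """payload_raw = form.getfirst("payload", '')"""
print(find_parameters(sample))  # ['payload']
```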

Mapping the Input Structure Through Multi-Type Trigger Exceptions

We will begin with the simplest inputs and gradually refine the payload structure based on the exceptions triggered, moving from a black-box perspective where the input structure is unknown, to a fully reconstructed payload.

Step 1: Sending an Empty List

The first input we test is an empty list (payload=[]):

# → script.py?payload=[]
curl "http://localhost:8000/cgi-bin/script.py?payload=%5B%5D"

This input triggers a TypeError, indicating that the application was expecting a dictionary-like structure, but received a list instead:

TypeError: list indices must be integers or slices, not str
      args = ('list indices must be integers or slices, not str',)

From this, we can deduce that the application is trying to access a key in the payload, but lists are indexed by integers, not strings.
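
The probes in this walkthrough are JSON values percent-encoded into the payload query parameter. A small helper, sketched here under the assumption that the target is the local test URL used in this post, can generate those URLs instead of encoding them by hand; `probe_url` is an illustrative name.

```python
import json
from urllib.parse import quote

BASE = "http://localhost:8000/cgi-bin/script.py"  # local test URL from this post

def probe_url(value, param="payload"):
    """Serialize value as JSON and percent-encode it into the query string."""
    return f"{BASE}?{param}={quote(json.dumps(value), safe='')}"

print(probe_url([]))   # ...script.py?payload=%5B%5D
print(probe_url(""))   # ...script.py?payload=%22%22
print(probe_url({}))   # ...script.py?payload=%7B%7D
```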

Step 2: Sending an Empty String

Next, we send an empty string (payload=""):

curl "http://localhost:8000/cgi-bin/script.py?payload=%22%22"

This input also results in a TypeError, confirming that the application indexes the payload with string keys, which a string does not support:

TypeError: string indices must be integers
      args = ('string indices must be integers',)

At this point, we can be fairly certain that the payload is expected to be a dictionary (or JSON object).

Step 3: Sending an Integer

To further verify our hypothesis, we send an integer (payload=123):

curl "http://localhost:8000/cgi-bin/script.py?payload=123"

This results in another TypeError ('int' object is not subscriptable), reinforcing that the input must be an object that can be indexed by keys.

Step 4: Sending an Incomplete JSON String

We now send an incomplete JSON string to see how the application handles malformed input:

curl "http://localhost:8000/cgi-bin/script.py?payload=%22"

This results in a JSONDecodeError, which reveals that the input must be properly formatted JSON.

We can now confidently assume that the payload must be a valid JSON object.
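
The inferences drawn in Steps 1 through 4 can be sketched as a small lookup from exception name to hypothesis. This is an illustration only: `classify` and `HINTS` are hypothetical names, and the sketch assumes the response body contains a plain `ExceptionName: message` line like the ones quoted above.

```python
import re

# Matches lines such as "TypeError: list indices must be integers or slices, not str",
# optionally prefixed by a module path (e.g. "json.decoder.JSONDecodeError: ...").
EXC_LINE = re.compile(r"^\s*(?:[\w.]+\.)?(\w+(?:Error|Exception)): (.*)$", re.MULTILINE)

HINTS = {
    "JSONDecodeError": "payload is not valid JSON",
    "TypeError": "payload is valid JSON but the wrong type for string-key indexing",
    "KeyError": "payload is an object that lacks the reported key",
}

def classify(body):
    """Return (exception name, message, hint) for the last exception line in body."""
    matches = EXC_LINE.findall(body)
    if not matches:
        return None
    name, message = matches[-1]
    return name, message, HINTS.get(name, "unmapped exception")

sample = """TypeError: list indices must be integers or slices, not str
      args = ('list indices must be integers or slices, not str',)"""
print(classify(sample))
```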

Step 5: Sending an Empty JSON Object

Armed with the knowledge that payload must be JSON, we send an empty JSON object:

curl "http://localhost:8000/cgi-bin/script.py?payload=%7B%7D"

This triggers KeyError: 'config', indicating that the application expects a key named config inside the payload.

We now know that the payload should contain at least the config key.

Step 6: Incrementally Adding Keys

We progressively add keys to the payload based on the exceptions generated:

Adding config {"config":{}}:

curl "http://localhost:8000/cgi-bin/script.py?payload=%7B%22config%22:%7B%7D%7D"

Resulting exception: KeyError: 'settings'

Now we know that config contains settings.

Adding settings {"config":{"settings":{}}}:

curl "http://localhost:8000/cgi-bin/script.py?payload=%7B%22config%22:%7B%22settings%22:%7B%7D%7D%7D"

Resulting exception: KeyError: 'level'

We now add level inside settings.

Adding level and user_access (lines 15 and 16 of script.py) {"config":{"settings":{"level":{"user_access":{}}}}}:

curl "http://localhost:8000/cgi-bin/script.py?payload=%7B%22config%22:%7B%22settings%22:%7B%22level%22:%7B%22user_access%22:%7B%7D%7D%7D%7D%7D"

Exception: KeyError: 'admin'

At each step, the triggered exception reveals the next key that the application expects, allowing us to progressively reconstruct the payload.
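
The incremental process can be sketched as a loop. The version below is a local simulation, not the tooling used in this research: `target` is a hypothetical stand-in that mirrors only the config lookups of process_payload, and `discover` places each key reported by a KeyError into the most recently added object, backing up one level when the same key is still reported missing.

```python
def target(payload):
    """Local stand-in for the handler: mirrors the config lookups in process_payload."""
    user_access = payload["config"]["settings"]["level"]["user_access"]
    return user_access["admin"], user_access["permissions"]

def missing_key(payload):
    """Return the key named by the KeyError, or None if target() succeeds."""
    try:
        target(payload)
    except KeyError as exc:
        return exc.args[0]
    return None

def discover(max_steps=20):
    payload = {}
    stack = [payload]  # dicts from the root down to the newest one
    for _ in range(max_steps):
        key = missing_key(payload)
        if key is None:
            return payload
        # Try the newest dict first; if the same key is still reported
        # missing, undo the insertion and retry one level up.
        while stack:
            stack[-1][key] = {}
            if missing_key(payload) != key:
                stack.append(stack[-1][key])
                break
            del stack[-1][key]
            stack.pop()
        else:
            raise RuntimeError(f"could not place key {key!r}")
    return payload

print(discover())
```

Against the live script, `missing_key` would instead send the payload and read the key name from the traceback. The discovered leaf values are empty-object placeholders; their real types still have to be probed as in Steps 1 to 3.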

Step 7: Reaching Deeper into the Structure

We continue fuzzing until we build the following payload, which now successfully processes without errors:

curl "http://localhost:8000/cgi-bin/script.py?payload=%7B%22config%22:%7B%22settings%22:%7B%22level%22:%7B%22user_access%22:%7B%22admin%22:true,%22permissions%22:%22full%22%7D%7D%7D%7D,%22data%22:%7B%22users%22:%5B%7B%22username%22:%22user1%22,%22tasks%22:%5B%7B%22task_id%22:1,%22description%22:%22Sample%20task%22%7D%5D%7D%5D%7D%7D"

Output:

Admin status: True
Permissions: full
User: user1
  Task 1: Sample task

Below is the raw payload:

script?payload={
    "data": {
        "users": [
            {
                "username": "user1", 
                "tasks": [
                    {
                        "description": "Sample task", 
                        "task_id": 1
                    }
                ]
            }
        ]
    }, 
    "config": {
        "settings": {
            "level": {
                "user_access": {
                    "admin": true, 
                    "permissions": "full"
                }
            }
        }
    }
}

Note: in “Mapping the Input Structure Through Multi-Type Trigger Exceptions” we added keys one at a time, based on each KeyError, to gradually reconstruct the expected input structure. That step matters mainly when we cannot see the source code that accesses the keys; the point is the technique. In our example, cgitb prints snippets of script.py in every traceback, so the key names can be read directly from those snippets without triggering new exceptions.

For instance, when sending the following payload:

http://0.0.0.0:8000/cgi-bin/script.py?payload={}

We encounter a KeyError, which reveals the missing key:

KeyError: 'config'
      args = ('config',)
      with_traceback = <built-in method with_traceback of KeyError object>

However, we could have already known the key config (and others) by reviewing the code in the process_payload function:

/tmp/cgitb_test/cgi-bin/script.py in process_payload(payload={})
     11 
     12 def process_payload(payload):
=>   13     config = payload['config']
     14     settings = config['settings']
     15     level = settings['level']

An automated variant would sniff the stack traces and construct the object from the first KeyError.
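
A minimal sketch of that idea, assuming the traceback shows source lines of the form `name = parent['key']` as in the excerpt above: follow the chain of lookups starting from the `payload` variable and rebuild the nested keys. The function name `build_from_context` is hypothetical, and this is one possible reading of the approach rather than the tooling used in this research.

```python
import re

# Matches source lines of the form:  name = parent['key']
LOOKUP = re.compile(r"(\w+)\s*=\s*(\w+)\[['\"](\w+)['\"]\]")

def build_from_context(context, root_var="payload"):
    """Rebuild nested keys from the lookup chain visible in a traceback excerpt."""
    payload = {}
    scopes = {root_var: payload}  # variable name -> dict it refers to
    for name, parent, key in LOOKUP.findall(context):
        if parent in scopes:
            # Placeholder value: the excerpt reveals key names, not value types.
            scopes[name] = scopes[parent].setdefault(key, {})
    return payload

excerpt = """\
/tmp/cgitb_test/cgi-bin/script.py in process_payload(payload={})
     11
     12 def process_payload(payload):
=>   13     config = payload['config']
     14     settings = config['settings']
     15     level = settings['level']
"""
print(build_from_context(excerpt))
# {'config': {'settings': {'level': {}}}}
```

A single KeyError already yields three levels of the structure here, because the traceback includes the lines surrounding the failing statement.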

This illustrates that while in some situations (like this one) we can directly obtain the necessary information from the code, in other scenarios where the code is not available, we would need to incrementally build the payload based on the exceptions until we meet the expected object structure without encountering further errors. This case serves as a good illustration of both approaches.

In this section, we explored how exception-triggered insights can be systematically used to map an application’s internal structure. By analyzing exceptions such as KeyError and TypeError, we were able to incrementally reveal the expected input format and reconstruct the application's data handling patterns. Although access to code snippets can expedite this process, in scenarios where such access is unavailable, exception mapping remains a powerful strategy for understanding application behavior.

In Part V: Talking to the Ghosts—Exploiting Transient Data, we will explore a different dimension of exception-driven security research, focusing on how seemingly ephemeral data, which should remain internal, can be exposed through errors. This next chapter will reveal how transient data, such as dynamically loaded credentials or sensitive configuration details, can leak in unintended ways during exception handling, opening doors to critical vulnerabilities. Through real-world examples, we’ll uncover how this phenomenon plays into the broader landscape of security threats like privilege escalation and unauthorized data exposure.

s4dhu’s Substack
