Mona, tellme - AI-assisted analysis 🧠


Introduction

Hi everyone! Thank you for checking in again.

On May 1st, 2026 we released a new version of Mona.

Meanwhile, we've continued polishing the code and added support for Python 3.14.4 and Pykd-ext bootstrapper 2.0.0.25.
Please read the README in the mona3 GitHub repo for more information about running mona with Python 3.14.4.

I would also like to take this opportunity to highlight the wiki linked to the GitHub repo. It contains information about every available command in mona3.

One of the new additions in mona v3, added after the initial release, is a command called tellme. The command is also available via its shorter alias ai.

If you haven't done so recently, make sure to run !mona up to get the latest version, including the tellme/ai command.

During crash triage or exploit development, people often end up manually analysing debugger output, stack dumps, disassembly, heap information, or copy/pasting information into external tools or AI platforms.
As demonstrated in previous posts (and of course, by mona.py itself), I am a big fan of automation.
As crash analysis can be repetitive and time consuming, it would make a perfect target for automation.
What if we could have a tool collect some context and submit it to an AI engine for a first triage?
Of course, you'd still have to apply your understanding of memory corruptions and the exact case you're working on to determine exploitability... but we could use AI for its ability to see links and relationships, and have it go hand in hand with human experience and expertise.

By generating reusable requests from debugger state directly, tellme helps automate a part of the crash analysis and triage workflow.
Long story short, the tellme command was designed to collect relevant debugger context, package it into a reusable request, and optionally submit that request to an AI provider.

In practice, mona can gather crash state, registers, stack contents, memory mappings, heap information, cyclic pattern analysis, call stacks, heapdynamics logs, PoC files, and other debugger artifacts, and combine them into a request file.

That request file has a prompt.

Mona currently implements 3 request prompts or profiles:
one to perform crash analysis, another one to explain what functions/routines do (useful for reverse engineering or just annotating routines), and a third one to reason about whether a controlled heap chunk can steer execution or influence meaningful write/dispatch paths.

You can review and modify the request file manually, feed it into another workflow or AI platform yourself, or let mona submit it automatically through a supported API.
And you can make your own request templates and customize pretty much everything.

This makes tellme useful both as an AI integration layer and as a debugger-context export mechanism.

Overview

At this time, tellme supports a number of platforms: openai, openaiagents, anthropic (Claude), ollama and customai. It also supports an offline mode, which won't actually submit a request to an API. Offline is the default mode unless you choose and configure another engine.
(If anyone needs support for other platforms, please reach out.)

Most engines use a simple request/response model. The openaiagents engine is a little different: mona launches a small bridge helper outside the debugger, passes the generated request to that helper, and the helper submits it through the OpenAI Agents SDK. That bridge supports reasoning/thinking-style runs, writes its own diagnostic files, and can show progress such as when the model is reasoning, generating final answer text, or returning a partial answer before hitting output limits.

ollama has its own little twist too. mona can talk to it either through the native /api/generate endpoint or through an OpenAI-style /v1/responses endpoint. Both can work, but they don't always feel equally boring and predictable, which in this context is actually what you want.

Anyway, if you'd like mona to submit requests to an AI automatically, you'll need:

  • a supported AI engine,
  • if needed, a valid API key and a paid API subscription or usage plan,
  • network access from the debugger host,
  • configuration via mona config or environment variables,
  • and the required Python dependency for the provider you want to use.

We'll talk about configuration in a moment.

As hinted already, you can also use tellme without automated API integration.
In that case, mona only creates the request file.
You can then inspect it, edit it, sanitize it, or copy/paste it manually into tools such as ChatGPT, Claude, Grok, Gemini, Copilot, a local LLM, internal analysis tooling, or any other workflow that accepts text-based requests.

Offline mode is the default behavior when no engine is configured. And even when an engine is configured, mona will ask you to confirm submitting the request, just to be sure. (You can provide the -submit flag if you prefer to submit the request without confirmation.)

In short, there are 6 engines (5 'online' and 1 'offline'):

  • offline
  • openai
  • openaiagents
  • anthropic
  • ollama
  • customai

Using tellme / ai

Offline requests

As explained, you do not need automated API integration to benefit from tellme.

A practical workflow is to let mona gather the relevant debugger context, generate a reusable request file, open that file, review or tweak the prompt, and copy/paste the request into another tool manually. (You can copy/paste the request into ChatGPT or another tool and have it perform the requested analysis)

This gives you control over what gets submitted, which provider or model you use, which additional instructions you want to add, and whether sensitive information should be removed first.

It may also allow you to get AI-assisted analysis without spending API credits.

Example:

!mona tellme -q 1

If no engine is specified using -e, and no default engine is configured in mona.ini or via environment variables, tellme uses offline mode automatically to avoid accidental token usage.

And if you run !mona tellme without a query profile, mona now tells you which default engine/model/settings it would use, where they came from (mona.ini, environment variables, or built-in defaults), and then nudges you to pick a -q profile instead of just staring at you in silence.

However, if a default engine and corresponding API key are found, mona will offer to submit the request (after asking for confirmation, or immediately when -submit is used) unless you explicitly specify the -offline flag.

As indicated earlier, it is probably a good idea to test the request first, inspect the generated file, and only submit it when you are happy with the contents.

!mona tellme -q 1 -offline
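
The decision flow described above can be summarised in a few lines of Python. This is only a sketch of the behaviour as described in this article, not mona's actual code. The function name decide_mode and its parameters are hypothetical, and "engine_configured" stands for "an engine was given with -e or found as a default, together with whatever key or URL it needs".

```python
def decide_mode(engine_configured, offline_flag=False, submit_flag=False):
    """Return 'offline', 'confirm' or 'submit' (illustrative helper, not mona code)."""
    # -offline always wins: only the request file is generated.
    if offline_flag:
        return "offline"
    # No engine on the command line, in mona.ini or in the environment.
    if not engine_configured:
        return "offline"
    # An engine is available: submit right away with -submit, otherwise ask first.
    return "submit" if submit_flag else "confirm"


print(decide_mode(False))                    # offline
print(decide_mode(True))                     # confirm
print(decide_mode(True, submit_flag=True))   # submit
print(decide_mode(True, offline_flag=True))  # offline
```

The point of the ordering is that -offline is checked first, so it stays a reliable way to avoid token usage regardless of what else is configured.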

Automated AI requests

Requirements & dependencies

If you want mona to automatically submit requests and retrieve responses, please check what the platform requires, and configure mona accordingly.

Depending on the platform, you may need:

  • a valid API key,
  • a paid API subscription or usage plan,
  • network access from the debugger host,
  • the appropriate Python SDK or dependency for the selected provider.

We recommend running !mona up on a regular basis to get the latest updates and features.

Regarding the Python dependency: the current version prefers the provider library for both openai and anthropic, and falls back to direct HTTPS if that library is unavailable or if the library-side transport/setup path fails. It uses the openai-agents Python library for openaiagents, while ollama and customai continue to use direct HTTPS/JSON requests.
That means, if you plan on using openai, openaiagents and/or anthropic, you'll still want to install the corresponding libraries for each Python version you plan on using with mona, even though openai and anthropic now have a direct HTTPS fallback path.

For example:

py -3.14-32 -m pip install openai
py -3.14 -m pip install openai
py -3.14-32 -m pip install openai-agents
py -3.14 -m pip install openai-agents
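
If you want to check up front whether a provider library is importable for a given interpreter, a small Python check along these lines can help. It is a sketch only: pick_transport is a hypothetical helper that mirrors the "library first, direct HTTPS as fallback" idea described above, and it is not a function from mona. It only tests whether the module can be found, not whether the transport actually works.

```python
import importlib.util


def pick_transport(module_name):
    """Return 'library' if module_name can be imported, else 'https'."""
    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec(module_name) is not None:
        return "library"
    return "https"


for name in ("openai", "anthropic"):
    print(name, "->", pick_transport(name))
```

Run it once with each interpreter you use with mona (for example via py -3.14-32 and py -3.14), since packages are installed per Python version.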

Before submitting automated requests to external providers, always consider the sensitivity of the data being shared.
Debugger output may contain proprietary code, internal symbols, memory contents, credentials, customer data, or other sensitive information, depending on the target and environment.

If necessary, use offline mode first, review the generated request manually, and sanitize the contents before submission.

It's also worth noting that some providers apply additional verification requirements before allowing API-assisted offensive security or vulnerability research workflows.
This is particularly relevant for prompts involving exploit development, shellcode, memory corruption, crash analysis, reverse engineering, malware analysis, or offensive security research.

At the time of writing, both OpenAI and Anthropic apply additional safeguards and verification mechanisms for certain classes of cybersecurity-related prompts and workflows.

For OpenAI, access to advanced cybersecurity-related workflows may require Cyber Safety Evaluations and additional account verification steps.
For Anthropic, additional safeguards and approval requirements may apply to offensive-security related API usage as well.

For more info, check out the following links:

Depending on the provider, account tier, organization status, API history, region, or trust level, automated requests may be delayed, blocked, rejected, rate-limited, or require additional approval steps before they are accepted.

This is not specific to mona. The same restrictions generally apply to any automation or tooling that attempts to submit offensive-security related prompts through those APIs.

If automated requests fail unexpectedly, verify that:

  • your API account is fully verified,
  • your account tier allows cybersecurity-related workflows,
  • the correct SDKs or dependencies are installed,
  • the API key is valid,
  • your network allows outbound API communication,
  • the selected model supports the requested workflow,
  • and you still have funds or available credits.

Submitting a request to an AI engine

Choose your engine

If you want mona to submit the request, you need to specify or configure an AI engine.

As shared earlier, tellme currently supports openai, openaiagents, anthropic, ollama and customai.

You can specify the desired engine directly on the command line using the -e flag, followed by one of the supported engines.

For example:

!mona tellme -e openai
!mona tellme -e openaiagents
!mona tellme -e anthropic
!mona tellme -e ollama
!mona tellme -e customai

The -e argument only affects the current request.

Selecting a model

You can also specify a model for a single request:

!mona tellme -e openai -model gpt-5.4-mini
!mona tellme -e openaiagents -model gpt-5-mini
!mona tellme -e ollama -model llama3

For more info about available models, check the corresponding pages:

You can also ask tellme to list the models that are visible to the currently selected AI backend.

The quickest way is to pass -model without a value. mona will treat that as a model-list request, query the provider, print the available model IDs, and exit without sending an analysis request.

Examples:

!mona ai -e ollama -model
!mona ai -e openai -model
!mona ai -e anthropic -model

Timeouts, retries and output limits

If a request takes longer than expected, you can increase the request timeout.

By default, mona uses a timeout of 300 seconds and will try up to 5 times, adding up to 120 seconds of extra timeout each time (the extra time is capped at the value of the initial timeout, with a maximum of 120 seconds).
You can also set the initial timeout at the command line:

!mona tellme -e openai -timeout 600

(We also have the option to set a default timeout for each engine using mona config or via environment variable. I'll explain in a moment how to do so.)
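
To make the retry rule a bit more tangible, here is one possible reading of it in Python. Treat it as an assumption-laden sketch: the helper name timeout_schedule is made up, and I'm assuming the extra time accumulates with every retry. mona's real implementation may compute the steps differently.

```python
def timeout_schedule(initial=300, attempts=5, max_step=120):
    """Return the timeout (in seconds) used for each attempt."""
    # Extra time per retry: never more than the initial timeout,
    # and never more than max_step seconds.
    step = min(initial, max_step)
    return [initial + n * step for n in range(attempts)]


print(timeout_schedule())       # [300, 420, 540, 660, 780]
print(timeout_schedule(60, 3))  # [60, 120, 180]
```

Under this reading, a worst-case run with the defaults could keep the debugger waiting for quite a while, which is worth keeping in mind before raising -timeout even further.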

For response size, openai and anthropic use a configurable max_tokens budget.
The openaiagents engine also supports openaiagents.max_tokens. If you do not set it explicitly, mona derives a default output-token budget from the size of the final request. The derived budget uses a floor of 8192 and a cap of 24576.
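
The floor and cap simply clamp whatever budget was derived. A minimal sketch in Python, assuming some estimate has already been computed from the request size (how mona derives that estimate is not covered here, and clamp_output_budget is a hypothetical name):

```python
FLOOR = 8192   # lowest derived budget
CAP = 24576    # highest derived budget


def clamp_output_budget(estimate):
    """Clamp a derived output-token estimate to the floor and cap."""
    return max(FLOOR, min(CAP, int(estimate)))


print(clamp_output_budget(1000))    # 8192
print(clamp_output_budget(10000))   # 10000
print(clamp_output_budget(100000))  # 24576
```

Setting openaiagents.max_tokens explicitly bypasses the derived value altogether.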

If the bridge detects that the model stopped early because max_output_tokens was reached, it will say so explicitly and point you at openaiagents.max_tokens / OPENAI_MAX_TOKENS. If partial answer text was already returned before truncation, that partial answer is preserved in the output file as well.

Submitting with or without confirmation

In any case, by default, mona will ask you to confirm before actually submitting the request. The -submit flag will tell mona to submit without asking.
If you're not using the -submit flag, you'll get the opportunity to review the request (tellme_request.md) and edit the prompt.
As soon as you confirm the submission, mona will read the request from file again (reading any changes you have made) before sending it to the engine.

Configuring AI parameters

Configuring defaults in mona.ini

Instead of specifying the engine, model, timeout, and provider-specific settings every time, you can store default values in mona.ini by using the config command.

mona supports five online AI engines:

  • openai
  • openaiagents
  • anthropic
  • ollama
  • customai

For cloud providers such as OpenAI and Anthropic, you typically configure the default engine, API key, model, timeout, and, where applicable, the maximum number of response tokens.

For OpenAI:

!mona config -set mona.ai.engine openai
!mona config -set openai.key <your OpenAI API key>
!mona config -set openai.model gpt-5.4
!mona config -set openai.timeout 900
!mona config -set openai.max_tokens 4096

For OpenAI Agents:

!mona config -set mona.ai.engine openaiagents
!mona config -set openaiagents.key <your OpenAI API key>
!mona config -set openaiagents.model gpt-5-mini
!mona config -set openaiagents.url http://127.0.0.1:8765
!mona config -set openaiagents.bridge.python py -3.14
!mona config -set openaiagents.reasoning_effort high
!mona config -set openaiagents.verbosity low
!mona config -set openaiagents.max_turns 8
!mona config -set openaiagents.max_tokens 8192

The optional openaiagents.bridge.python setting allows you to override which Python command is used to launch the helper outside the debugger. If you do not set it, mona derives the Python command from the current environment.

For Anthropic:

!mona config -set mona.ai.engine anthropic
!mona config -set anthropic.key <your Anthropic API key>
!mona config -set anthropic.model claude-opus-4-20250514
!mona config -set anthropic.timeout 900
!mona config -set anthropic.max_tokens 4096

For local or self-hosted engines, mona does not require an API key. Instead, you configure a URL, model, and timeout.

For Ollama:

!mona config -set mona.ai.engine ollama
!mona config -set ollama.url http://127.0.0.1:11434/api/generate
!mona config -set ollama.model llama3
!mona config -set ollama.timeout 900
!mona config -set ollama.response_field response

This is one of those small details that can save you a surprising amount of head scratching.

If your goal is simply: "collect debugger context, send it to the local model, get plain answer text back", then /api/generate is usually the nicer fit. It is Ollama's native endpoint, the returned payload is straightforward, and mona tends to have a much easier time extracting the answer cleanly.

You can also point mona at http://127.0.0.1:11434/v1/responses. That route is handy when you specifically want OpenAI-style API compatibility, or when you're trying to keep multiple tools speaking roughly the same dialect. But in practice, some local models are much more enthusiastic about producing metadata, reasoning-ish structures, or half-helpful API-shaped objects there than they are about returning one simple final answer string.

So the short version is:

  • prefer /api/generate when you want dependable plain-text output from a local model,
  • use /v1/responses when compatibility matters more and you've confirmed that the model behaves well there.

If you only provide the plain base URL, mona will treat that as native Ollama and route it to /api/generate for you.
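
As an illustration of that base-URL rule, the following Python sketch appends /api/generate only when the URL has no path of its own. The function name normalise_ollama_url is hypothetical, and mona's actual routing logic may look at more than just the path.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_ollama_url(url):
    """Route a bare base URL to the native /api/generate endpoint."""
    parts = urlsplit(url)
    # A bare base URL (no path, or just "/") is treated as native Ollama.
    if parts.path in ("", "/"):
        parts = parts._replace(path="/api/generate")
    return urlunsplit(parts)


print(normalise_ollama_url("http://127.0.0.1:11434"))
# http://127.0.0.1:11434/api/generate
print(normalise_ollama_url("http://127.0.0.1:11434/v1/responses"))
# http://127.0.0.1:11434/v1/responses
```

An explicit path such as /v1/responses is left untouched, so you remain in control of which endpoint is used.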

Note: Ollama offers free cloud models (with limited usage). You can 'run' them locally (for instance, ollama run gemma4:31b-cloud), and access the model via your local URL.

You can find an up-to-date list of cloud models here.
The list includes:

  • gpt-oss: ollama run gpt-oss:120b-cloud
  • kimi-k2.6: ollama run kimi-k2.6:cloud
  • gemma4: ollama run gemma4:31b-cloud
  • deepseek-v4-flash: ollama run deepseek-v4-flash:cloud
  • deepseek-v4-pro: ollama run deepseek-v4-pro:cloud
  • qwen3.5: ollama run qwen3.5:397b-cloud
  • glm-5.1: ollama run glm-5.1:cloud

For a generic JSON-based POST endpoint:

!mona config -set mona.ai.engine customai
!mona config -set customai.url http://127.0.0.1:8080/api/generate
!mona config -set customai.model llama3
!mona config -set customai.timeout 900
!mona config -set customai.response_field choices.0.message.content

The optional response_field setting is useful for ollama and especially customai when the returned text is nested somewhere inside the JSON response. The value uses a dotted path syntax, for example response, message.content, or choices.0.message.content.
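
If the dotted path syntax is new to you, this short Python sketch shows the general idea: each segment is either a dictionary key or, when the current value is a list, a numeric index. The function extract_field and the two sample payloads are illustrative only and are not taken from mona.

```python
def extract_field(payload, dotted_path):
    """Walk a parsed JSON object using a dotted path such as 'choices.0.message.content'."""
    current = payload
    for part in dotted_path.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # numeric segments index into lists
        else:
            current = current[part]       # other segments are dictionary keys
    return current


openai_style = {"choices": [{"message": {"content": "final answer"}}]}
ollama_style = {"response": "final answer", "done": True}

print(extract_field(openai_style, "choices.0.message.content"))  # final answer
print(extract_field(ollama_style, "response"))                   # final answer
```

A quick way to find the right path for a custom endpoint is to inspect one raw JSON response and follow the nesting down to the text you care about.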

Once these defaults are configured, you can run tellme requests without specifying the engine every time.

For example:

!mona tellme -q 1

Mona will use the configured engine and settings from mona.ini.

Configuring defaults with environment variables

You can also configure tellme by using environment variables.

This is useful if you do not want to store API keys or other engine settings inside mona.ini, or if you want to manage secrets and defaults outside of mona.

The default engine can be selected with:

set MONA_AI_ENGINE=openai

For OpenAI, you can use:

set OPENAI_API_KEY=<your OpenAI API key>
set OPENAI_MODEL=gpt-5.4
set OPENAI_TIMEOUT=900
set OPENAI_MAX_TOKENS=4096

For OpenAI Agents, you can use:

set MONA_AI_ENGINE=openaiagents
set OPENAI_API_KEY=<your OpenAI API key>
set OPENAI_MODEL=gpt-5-mini
set OPENAIAGENTS_URL=http://127.0.0.1:8765
set OPENAIAGENTS_REASONING_EFFORT=high
set OPENAIAGENTS_VERBOSITY=low
set OPENAIAGENTS_MAX_TURNS=8
set OPENAI_MAX_TOKENS=8192

For Anthropic, you can use:

set ANTHROPIC_API_KEY=<your Anthropic API key>
set ANTHROPIC_MODEL=claude-opus-4-20250514
set ANTHROPIC_TIMEOUT=900
set ANTHROPIC_MAX_TOKENS=4096

For Ollama, no API key is required. Instead, configure the server URL, model, timeout, and optionally the response field:

set OLLAMA_URL=http://127.0.0.1:11434/api/generate
set OLLAMA_MODEL=llama3
set OLLAMA_TIMEOUT=900
set OLLAMA_RESPONSE_FIELD=response

Same idea here: for most people, /api/generate is the sensible default. If you explicitly want the OpenAI-flavoured path, set OLLAMA_URL to http://127.0.0.1:11434/v1/responses instead.

For a generic JSON-based POST endpoint, you can use:

set CUSTOMAI_URL=http://127.0.0.1:8080/api/generate
set CUSTOMAI_MODEL=llama3
set CUSTOMAI_TIMEOUT=900
set CUSTOMAI_RESPONSE_FIELD=choices.0.message.content

For ollama and customai, the optional RESPONSE_FIELD variable tells mona where to find the returned text inside the JSON response. The value uses a dotted path syntax, such as response, message.content, or choices.0.message.content.

Configuration precedence

Command-line arguments such as -e, -model, and -timeout override both mona.ini and environment variables, but only for the current request.

If no command-line option is provided for a setting, mona first checks mona.ini, then environment variables. In other words, if both are present, mona.ini takes precedence.

If no engine is configured anywhere, tellme switches to offline mode automatically.
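
The lookup order can be sketched as follows in Python. This is a simplified model with hypothetical names: resolve_setting is not a mona function, and the three dictionaries use one neutral key per setting, whereas the real names differ per source (for example mona.ai.engine in mona.ini versus MONA_AI_ENGINE in the environment).

```python
def resolve_setting(name, cli=None, ini=None, env=None, default=None):
    """Return (value, source) using command line > mona.ini > environment > default."""
    sources = (
        ("command line", cli or {}),
        ("mona.ini", ini or {}),
        ("environment", env or {}),
    )
    for label, values in sources:
        if values.get(name):
            return values[name], label
    return default, "built-in default"


ini = {"engine": "openai"}
env = {"engine": "ollama"}

print(resolve_setting("engine", ini=ini, env=env))
# ('openai', 'mona.ini')
print(resolve_setting("engine", cli={"engine": "anthropic"}, ini=ini, env=env))
# ('anthropic', 'command line')
print(resolve_setting("engine", default="offline"))
# ('offline', 'built-in default')
```

Returning the source alongside the value mirrors what !mona tellme reports when you run it without a -q profile.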

Note: keep in mind that mona tellme will automatically attempt to submit your request if it finds a default engine and API key. It will ask for confirmation (unless you have specified the -submit flag). If you want to be sure not to burn tokens while you're merely trying things out, you always have the option to use the -offline flag.

Configuring engine-specific options

Generic engine options

Some AI engines accept additional request parameters besides the usual settings such as engine, model, timeout, URL, or API key. Ollama is a good example: it allows extra runtime settings to be passed inside an options object.

mona supports this in a generic way for all engines. You can store engine-specific options in mona.ini with this syntax:

!mona config -set <engine>.options.<name> <value>

For example, to configure a larger context window for Ollama:

!mona config -set ollama.options.num_ctx 256000

mona will then include that value in the request body under an options object:

"options": {
  "num_ctx": 256000
}

You can define multiple options in the same way:

!mona config -set ollama.options.num_ctx 256000
!mona config -set ollama.options.temperature 0.2
!mona config -set ollama.options.repeat_penalty 1.1

mona collects these entries automatically and adds them to the request for the selected engine.

Model-specific option overrides

Sometimes an option should only apply to one model. mona supports that as well.

To make an option model-specific, place the model name between options and the actual option name:

!mona config -set ollama.options.dolphin-llama3.num_ctx 256000

With this configuration, num_ctx will only be included when the selected model is dolphin-llama3.

Generic options and model-specific options can be combined. If both exist, the model-specific value overrides the generic one for that model.

For example:

!mona config -set ollama.options.num_ctx 131072
!mona config -set ollama.options.dolphin-llama3.num_ctx 256000

In that case:

  • all Ollama models will use num_ctx=131072 by default
  • dolphin-llama3 will use num_ctx=256000 instead

This mechanism is generic, so it is not limited to Ollama. The same pattern can also be used with openai, openaiagents, anthropic, or customai, as long as the target API understands an options object.
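
To illustrate how generic and model-specific entries combine, here is a minimal Python sketch. The function build_options and the flat config dictionary are assumptions made for the example. mona's own parsing may differ, and type conversion of the values (strings to numbers) is left out on purpose.

```python
def build_options(config, engine, model):
    """Merge <engine>.options.<name> with <engine>.options.<model>.<name> entries."""
    prefix = engine + ".options."
    model_prefix = prefix + model + "."
    generic, specific = {}, {}
    for key, value in config.items():
        if key.startswith(model_prefix):
            # Entry for the selected model.
            specific[key[len(model_prefix):]] = value
        elif key.startswith(prefix) and "." not in key[len(prefix):]:
            # Generic entry: nothing between 'options' and the option name.
            generic[key[len(prefix):]] = value
    generic.update(specific)  # the model-specific value wins
    return generic


config = {
    "ollama.options.num_ctx": "131072",
    "ollama.options.temperature": "0.2",
    "ollama.options.dolphin-llama3.num_ctx": "256000",
}

print(build_options(config, "ollama", "llama3"))
# {'num_ctx': '131072', 'temperature': '0.2'}
print(build_options(config, "ollama", "dolphin-llama3"))
# {'num_ctx': '256000', 'temperature': '0.2'}
```

Entries that belong to a different model are skipped, so they never leak into requests for the model you selected.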

What does the tellme command do?

Debugger context

At a high level, tellme collects debugger context, generates a request file, and optionally submits the request to a configured AI engine.

The command creates files inside the mona working folder. This includes tellme_request.md, and, when an API response is received, tellme_response.md.

For openaiagents, additional bridge-specific files may be created as well, such as a bridge log, status file, raw bridge request file, and raw bridge result/debug file.

Before using tellme, I recommend configuring a mona working folder:

!mona config -workingfolder c:\logs\%p

This stores all mona output in process-specific subfolders under c:\logs.
For example, if the debugged process is called target.exe, mona will create and use a dedicated folder called c:\logs\target.
This keeps output for different targets separated, including normal mona logs, generated request files, response files, reusable AI templates, and bridge diagnostics. It also makes it easier to find the files created by mona.
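
The %p placeholder behaves roughly like this small Python sketch. The helper expand_working_folder is hypothetical and only covers the process-name substitution shown in the example above.

```python
import ntpath


def expand_working_folder(template, process_path):
    """Replace %p with the process name, without its extension."""
    name = ntpath.splitext(ntpath.basename(process_path))[0]
    return template.replace("%p", name)


print(expand_working_folder(r"c:\logs\%p", r"C:\apps\target.exe"))
# c:\logs\target
```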

Request and response files

Every tellme run starts by creating a request file called tellme_request.md.

This file contains a small metadata header, followed by the exact prompt body that was prepared for the AI engine.
The metadata section identifies the selected engine, model, question profile, request id when available, template file when one was used, and target address information when applicable.

A typical request file starts like this:

AI engine : openai
Model     : gpt-5.4
Question  : 1
Target    : 41414141
Target src: -a

PROMPT BEGIN
------------
...
PROMPT END

The content between PROMPT BEGIN and PROMPT END is the actual request body.
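
If you want to post-process request files in your own tooling, splitting the header from the prompt body is straightforward. The Python sketch below is based only on the layout shown above. The helper name split_request is made up, the sample text is invented for the example, and real request files may contain additional header fields.

```python
def split_request(text):
    """Split a tellme request into (metadata dict, prompt body)."""
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    begin = stripped.index("PROMPT BEGIN")
    # Use the last PROMPT END, in case the prompt itself mentions the marker.
    end = len(stripped) - 1 - stripped[::-1].index("PROMPT END")
    metadata = {}
    for line in lines[:begin]:
        if ":" in line:
            key, value = line.split(":", 1)
            metadata[key.strip()] = value.strip()
    body = lines[begin + 1:end]
    # Drop the dashed underline that follows PROMPT BEGIN, if present.
    if body and set(body[0].strip()) == {"-"}:
        body = body[1:]
    return metadata, "\n".join(body)


sample = """AI engine : openai
Model     : gpt-5.4
Question  : 1
Target    : 41414141
Target src: -a

PROMPT BEGIN
------------
Analyse this crash.
PROMPT END
"""

meta, prompt = split_request(sample)
print(meta["AI engine"], meta["Model"], meta["Target src"])  # openai gpt-5.4 -a
print(prompt)                                                # Analyse this crash.
```

This is handy when you want to feed the same prompt body to several tools and compare the answers.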

For -q 1 and -q 2, that body contains prompt instructions, followed by structured JSON data under the variables object.

That JSON data contains the debugger context collected by mona. Depending on the question profile and supplied arguments, this can include registers, exception details, disassembly, stack contents, nearby memory, module information, call stack output, VAD/page metadata, heap/chunk context, cyclic pattern analysis, heapdynamics information, additional context files, and PoC/trigger file contents.

When automated submission is enabled, mona submits this request body to the configured AI engine.

If the request succeeds, mona creates a response file called tellme_response.md.
The response file contains a similar metadata header, followed by the output returned by the AI engine.

For openaiagents, mona submits the job to the local bridge helper, which writes the result file asynchronously. Depending on the run, the bridge may also write a status file such as tellme_<requestid>.status.json, a raw bridge request file such as tellme_<requestid>.bridge_request.json, a raw bridge result/debug file such as tellme_<requestid>.bridge_result.json, and a bridge log file.

In other words: tellme_request.md is the input to the AI engine, and tellme_response.md is the output from the AI engine.

This separation is useful because you can archive request/response pairs, compare results from different models, rerun the same request later, sanitize a request before sharing it, or manually copy/paste the request into another AI tool.

Common request options

Before looking at the individual question profiles, it is useful to highlight a few options that are not tied to custom templates.

The tellme command can add extra addresses, supporting files, PoC files, and uploaded evidence to the generated request.
These options work with the normal built-in profiles as well as with custom request templates, although the exact meaning of an option can depend on the selected profile.

Adding an extra target with -a

The -a option allows you to point mona at an additional address, register, symbol, or expression.

The exact interpretation depends on the selected question profile:

  • With -q 1, -a is treated as an extra heap or memory-analysis target.
  • With -q 2, -a is treated as an extra code address or function location to analyze.

Examples:

!mona tellme -q 1 -a eax
!mona tellme -q 2 -a kernel32!CreateFileW
!mona tellme -q 2 -a rip

The profile-specific sections below explain what mona collects for those targets.

Adding context files with -l

The -l option allows you to add one or more supporting files to a request.

Example:

!mona tellme -e openai -q 1 -l alloc.txt,triage.txt

This is generally useful when you already have relevant notes, debugger output, heap traces, allocation logs, command output, or other supporting material that should be included in the AI request.

Files that contain alloc() and free() lines are treated as heapdynamics logs.
Ideally, these lines are formatted like this:

alloc(0xsize) = 0xaddress from saved_return_pointer (heapHandle)
free(0xaddress) from saved_return_pointer (heapHandle)

Mona understands that format and uses the fields as heap-allocation evidence.

Other files are added as supporting context.
If alloc/free lines are found that contain one of the addresses referenced by the current EIP/RIP instruction, mona includes those matching lines plus 5 lines before and after.

If no heapdynamics log is supplied via -l, tellme still looks for c:\alloc.txt automatically.
It only uses content from that file when it finds an address that is actively referenced in the current instruction at EIP/RIP.
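
For those who generate or filter heapdynamics logs themselves, the following Python sketch parses the alloc/free format shown above and keeps matching lines plus 5 lines of context on either side. It is a simplified illustration, not mona's parser: the function names are hypothetical, the sample addresses are invented, and matching is done on the exact chunk address only.

```python
import re

ALLOC_RE = re.compile(
    r"alloc\((0x[0-9a-fA-F]+)\)\s*=\s*(0x[0-9a-fA-F]+)\s+from\s+(\S+)\s+\(([^)\s]+)\)"
)
FREE_RE = re.compile(r"free\((0x[0-9a-fA-F]+)\)\s+from\s+(\S+)\s+\(([^)\s]+)\)")


def parse_heap_line(line):
    """Return a dict describing an alloc/free line, or None for other lines."""
    m = ALLOC_RE.search(line)
    if m:
        return {"op": "alloc", "size": int(m.group(1), 16),
                "address": int(m.group(2), 16),
                "caller": m.group(3), "heap": m.group(4)}
    m = FREE_RE.search(line)
    if m:
        return {"op": "free", "address": int(m.group(1), 16),
                "caller": m.group(2), "heap": m.group(3)}
    return None


def select_context(lines, addresses, context=5):
    """Keep lines that touch one of the addresses, plus surrounding context."""
    keep = set()
    for i, line in enumerate(lines):
        event = parse_heap_line(line)
        if event and event["address"] in addresses:
            keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    return [lines[i] for i in sorted(keep)]


log = ["alloc(0x10) = 0x%08x from 0x7710abcd (0x00a50000)" % (0x1000 + i * 0x10)
       for i in range(20)]
log.append("free(0x000010a0) from 0x7710beef (0x00a50000)")

picked = select_context(log, {0x10A0})
print(len(picked))  # 16
```

In this invented log the address 0x10a0 shows up twice (line 10 as an alloc, line 20 as a free), and the two context windows overlap into one block of 16 lines.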

Adding PoC or trigger files with -p

The -p option allows you to add a PoC, trigger file, test case, or reproducer script to the request.

Example:

!mona tellme -e openai -q 1 -p poc.py

The full contents of the PoC file are added under the [poc_file] context entry.
This helps correlate the debugger state with the input that triggered the crash or behavior being analyzed.

You can combine -p with -l:

!mona tellme -e openai -q 1 -l alloc.txt,triage.txt -p poc.py

Uploading the request and supporting files with -upload

For the openai and anthropic engines, there is also an optional -upload mode.

Normally, mona renders the full request as one large text prompt.
If you provide extra context files via -l or a PoC file via -p, those contents are embedded into that prompt as well.

With -upload, mona behaves differently.
It first writes the generated request to disk, then uploads that saved request file together with any -l and -p files, and finally sends a short inline instruction that tells the selected provider to treat the uploaded request file as the authoritative prompt source.

Example:

!mona tellme -e openai -q 1 -l alloc.txt,triage.txt -p poc.py -upload
!mona tellme -e anthropic -q 1 -l alloc.txt,triage.txt -p poc.py -upload

This is particularly useful when the combined context is large, or when you prefer the saved request file to act as the single primary instruction instead of flattening everything into one long prompt.

The uploaded request file remains the main source of truth.
The additional uploaded files are supporting evidence.

After the upload finishes, mona prints the provider-assigned file IDs and also writes them into the Uploaded file IDs section inside tellme_response.md. Those IDs can be reused later with the -id flag.

For example:

!mona tellme -e openai -q 9 -f my_q1_request.md -id file_abc123,file_def456

This works with any request profile, including -q 9 templates. Mona simply passes the IDs through to the selected provider, so it is up to you to reuse IDs that belong to the same backend you selected with -e.

For openai and anthropic, mona tries the provider library first and only falls back to direct HTTPS when the library import or transport path is unavailable. That keeps upload mode flexible without making raw HTTPS the default path.

Combining request options

These options can be combined with the normal profile and engine options.

For example:

!mona tellme -e openai -q 1 -a eax -l alloc.txt,triage.txt -p poc.py -upload

That command asks mona to:

  • use the crash-triage profile,
  • investigate eax as an additional target,
  • include supporting context from alloc.txt and triage.txt,
  • include the PoC file poc.py,
  • and upload the generated request and supporting files through the openai engine.

The sections below describe how each question profile uses the collected context.

Requests and question profiles

The tellme command uses question profiles.

These profiles define what context mona collects and what kind of analysis the AI engine is asked to perform.

Mona currently provides 3 predefined profiles: -q 1, -q 2 and -q 3.

A fourth mode, -q 9, allows you to load a custom request template from a file.

Crash triage profile: -q 1

-q 1 is focused on crash triage and immediate exploit-relevant observations.

It collects crash context, cyclic-pattern hints, and extra heap/pointer context for the current registers.

This includes instruction windows around the current program counter, pointer dumps and nearby memory for registers, heap chunk/VAD metadata when register values point into known heap-managed regions, silent findmsp results, and the SEH chain on 32-bit targets.

tellme -q 1 enriches findmsp results with a first trampoline candidate for each register that points into the cyclic pattern. Mona will query the usual non-ASLR, non-rebase module set, and if you provide -cpb it applies the badchar filter before suggesting the trampoline. In practice, that makes the AI context more actionable and reduces the chance that a "good-looking" jump target is unusable in the real exploit path.

So if you already know the badchars for the target, it is a good idea to pass them along in the request, for example with -cpb '\x00\x0a\x0d'. That gives mona a better chance of already proposing usable trampoline pointer(s) in the generated context.

Examples:

!mona tellme -q 1

or

!mona ai -q 1 -e anthropic -cpb '\x00\x20\x0a\x0d\x3f'
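
The idea behind the badchar filter is easy to demonstrate in Python: a pointer is only usable if none of the bytes in its in-memory (little-endian) representation is a bad character. The helper pointer_has_badchars and the candidate addresses below are invented for the example and do not come from mona or from a real module.

```python
import struct


def pointer_has_badchars(pointer, badchars, width=4):
    """True if the little-endian bytes of pointer contain a bad character."""
    fmt = "<I" if width == 4 else "<Q"
    raw = struct.pack(fmt, pointer)
    return any(byte in badchars for byte in raw)


badchars = b"\x00\x0a\x0d"
candidates = [0x10012000, 0x625011AF, 0x6250120A]

usable = [hex(p) for p in candidates if not pointer_has_badchars(p, badchars)]
print(usable)  # ['0x625011af']
```

The first candidate contains a null byte and the third one contains 0x0a, so only the second pointer survives the filter.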

You can also add an extra address, register, symbol, or expression to investigate:

!mona tellme -q 1 -a eax

With -q 1, the address supplied via -a is treated as an extra heap target to investigate.

If heap walking or the process layout cannot resolve the address supplied with -a, mona still collects fallback data. This includes !heap -p -a, !heap -x, or the WinDBGX equivalents, plus a memory dump around the address.

If heapdynamics files are supplied with -l, or if c:\alloc.txt exists, the heapdynamics contents are added to the request.

Function-understanding profile: -q 2

-q 2 is the function-understanding profile in mona's AI workflow. Instead of treating the current instruction as an isolated line of disassembly, it takes the current EIP or RIP, resolves the function that location belongs to, and builds context around the whole routine. The goal is to answer a higher-level question: what does this code actually do?

For that analysis, mona collects the current function disassembly, symbol and boundary information when available, and fallback code windows when a full function cannot be resolved reliably.
It also reports invalid locations cleanly, so if execution is no longer pointing at real code, the result says so instead of forcing a weak guess.
When you provide -a, mona keeps the live EIP/RIP function as the primary context and can analyze the function containing the additional address as well, unless both locations resolve to the same place.

A useful detail is that -q 2 does not stop at the current function body. If the function contains calls or unconditional jumps, mona also grabs disassembly at those target locations and includes that extra context in the request. That gives the AI enough evidence to produce better pseudocode, explain branch structure, and describe the real role of important helper routines instead of summarizing a single basic block in isolation. By default, mona follows these targets one level deep. You can add more levels with the -d <number> argument, but keep in mind that this makes the context a lot bigger as well.
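The effect of the depth setting can be pictured as a depth-limited walk over call and jump targets. The sketch below is only an illustration of that idea, not mona's code: the TARGETS map and the function names in it are hypothetical stand-ins for what a debugger would resolve from real disassembly.

```python
from collections import deque

# Hypothetical map: function name -> direct call / unconditional jump targets.
TARGETS = {
    "parse_packet": ["read_header", "copy_body"],
    "read_header": ["checksum"],
    "copy_body": ["memcpy_wrapper"],
    "checksum": [],
    "memcpy_wrapper": ["memcpy"],
    "memcpy": [],
}


def collect_context(start: str, depth: int) -> list:
    """Return the functions reachable from start within `depth` hops (BFS order)."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        name, level = queue.popleft()
        if level == depth:
            continue  # do not follow targets beyond the requested depth
        for target in TARGETS.get(name, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, level + 1))
    return order


print(collect_context("parse_packet", 1))
# ['parse_packet', 'read_header', 'copy_body']
print(collect_context("parse_packet", 2))
# ['parse_packet', 'read_header', 'copy_body', 'checksum', 'memcpy_wrapper']
```

Each extra level pulls in the targets of every function found at the previous level, which is why the request grows quickly as the depth increases.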

In practice, tellme -q 2 is meant for reverse-engineering and code comprehension: understanding what a handler, parser, allocator wrapper, dispatch routine, or security check is doing, in plain human language, with decompiled-style logic and function-level context rather than raw assembly alone.

Example:

!mona tellme -q 2

You can also provide a function, symbol, register, or address explicitly:

!mona tellme -e openai -q 2 -a kernel32!CreateFileW

or:

!mona tellme -e openai -q 2 -a eip

With -q 2, the address supplied via -a is treated as the code address or function location to analyse.

This is especially useful when EIP or RIP is already corrupted and no longer points to a useful code location.

Controlled heap-chunk reachability profile: -q 3

-q 3 is meant for cases where you already know which heap chunk you control and want mona to gather enough evidence to reason about what that chunk can realistically influence from the current debugger snapshot.
You'd begin the analysis by making the debugger break at the location where you know you can control the contents of a chunk.
You provide the address of that chunk using -c <address>. Of course, you can simply provide a register name instead.
Mona then resolves the owning chunk, collects heap metadata, records registers and stack slots that already point into the chunk, and includes a pointer-sized dump of the entire chunk when the chunk size can be determined.
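As a rough illustration of what "registers that already point into the chunk" means, the sketch below checks a register snapshot against a chunk's address range and reports the offset of each hit. The function name, register values, and chunk address are assumptions made up for this example. Mona collects this information from the live debugger state.

```python
def references_into_chunk(registers: dict, chunk_start: int, chunk_size: int) -> dict:
    """Map each register that points into the chunk to its offset inside the chunk."""
    chunk_end = chunk_start + chunk_size  # first address past the chunk
    return {
        name: value - chunk_start
        for name, value in registers.items()
        if chunk_start <= value < chunk_end
    }


# Hypothetical snapshot: a 0x40-byte chunk at 0x00A41000.
registers = {
    "eax": 0x00A41000,  # points at the start of the chunk
    "ebx": 0x00A41010,  # points 0x10 bytes into the chunk
    "ecx": 0x7FFE0000,  # unrelated address
    "esi": 0x00A41040,  # one byte past the end, so not inside
}

hits = references_into_chunk(registers, 0x00A41000, 0x40)
print({name: hex(offset) for name, offset in hits.items()})
# {'eax': '0x0', 'ebx': '0x10'}
```

The same range check applies to pointer-sized stack slots: any slot whose value falls inside the chunk is a live reference worth reporting.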

If you also provide -t <address>, -q 3 switches into targeted mode. In that mode, the generated request asks whether execution can plausibly reach that concrete code address by changing only bytes inside the controlled chunk.

If you omit -t, -q 3 switches into discovery mode. In discovery mode, the generated request asks the AI to look for the most promising reachable paths that:

  • modify the controlled chunk itself,
  • use values from the chunk, or chunk-derived pointers, as destination pointers or write targets elsewhere,
  • or consume chunk-controlled data in a way that could influence control flow, such as vftable dispatch, callback use, indirect calls, or indirect jumps.

An important detail is that -q 3 does not only look at the current function body. It also follows direct calls/jumps, and it treats return-resume paths as first-class context. So if the current function can return, mona also captures the concrete caller resume address from the call stack, grabs forward disassembly from that resume site, and includes the immediate caller function context in the saved request. That is especially useful in cases where the strongest sink only becomes visible after the current function returns.

Just like with -q 2, you can use -d <number> to follow direct calls/jumps more deeply. With -q 3, the same depth value is also used to control how many caller frames mona inspects for return-resume analysis. Mona uses a default depth of 2, and you can use values up to 4. Keep in mind that the amount of information to gather can grow exponentially with depth, and the request file may become too big to submit automatically. In the worst case, you'd have to copy/paste the file into an engine manually.

Examples:

!mona tellme -q 3 -c eax
!mona tellme -e openai -q 3 -c poi(esp+4) -t kernel32!CreateFileW
!mona tellme -e openai -q 3 -c eax -t kernel32!VirtualProtect -d 2

Custom template profile: -q 9

-q 9 loads a request template from a file.

This is useful when you want to write your own prompt, reuse a saved request, or modify one of the generated template files.

The -f argument is required when using -q 9.

Example:

!mona tellme -e openai -q 9 -f request.md

If the file contains [variable] placeholders, mona resolves them against the debugger context variables.

If the file already contains a built request, for example a file with PROMPT BEGIN / PROMPT END markers or a raw prompt containing Debugger request JSON:, and no unresolved placeholders remain, mona reuses that request body directly instead of rebuilding debugger context.
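The placeholder logic can be sketched in a few lines of Python. This is an illustration under assumptions, not mona's actual template engine: the regular expression, the function names, and the sample variables are all hypothetical, and mona's real matching rules may differ.

```python
import re

# Assumed placeholder syntax: lowercase names with digits/underscores in brackets.
PLACEHOLDER = re.compile(r"\[([a-z0-9_]+)\]")


def resolve(template: str, variables: dict) -> str:
    """Replace known [variable] placeholders and leave unknown ones untouched."""
    def substitute(match):
        name = match.group(1)
        return str(variables.get(name, match.group(0)))
    return PLACEHOLDER.sub(substitute, template)


def unresolved(text: str) -> list:
    """List placeholder names that are still present in the text."""
    return sorted(set(PLACEHOLDER.findall(text)))


template = "Process: [processname], arch: [architecture], extra: [unknown_var]"
variables = {"processname": "target.exe", "architecture": "x86"}

resolved = resolve(template, variables)
print(resolved)              # Process: target.exe, arch: x86, extra: [unknown_var]
print(unresolved(resolved))  # ['unknown_var']
```

A file for which unresolved() returns an empty list corresponds to the "already built" case described above: there is nothing left to fill in, so the text can be reused as-is.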

Customizing requests

Generated template files

When you run -q 1, -q 2 or -q 3, mona creates a reusable template based on the internal q1, q2 or q3 profile.

The template files are called ai.q1, ai.q2 and ai.q3 respectively.

If a working folder is configured, the files are created in the working folder. Otherwise, they are created next to mona.ini.
These template files are provided for inspection, customization, and reuse. They are not applied automatically when you run -q 1, -q 2 or -q 3.
Keep in mind that these files will be overwritten every time you run !mona tellme.
If you plan on modifying one of them to make your own custom requests, create a copy first and edit that copy instead of the original.

The templates contain the prompt, as well as [variable] placeholders.

When you use -q 9 and feed it a template with -f template_file, mona fills in the variables with actual context from the current process.

Using customized templates

To use one of those templates, run -q 9 and point -f to the template file:

!mona tellme -e openai -q 9 -f custom_ai.q1

or:

!mona tellme -e openai -q 9 -f custom_ai.q2

or:

!mona tellme -e openai -q 9 -f custom_ai.q3

You can customize these templates by changing the wording, adding extra instructions, removing instructions, forcing a specific output style, requesting JSON output, focusing more strongly on heap analysis, focusing on stack corruption, or adding organization-specific analysis requirements.

Once customized, the templates become reusable across sessions and targets.

As stated earlier, you have to run them manually with -q 9. They won't be used automatically if you run -q 1, -q 2 or -q 3.

If you would like me to add specific or additional variables in mona, reach out and we'll discuss.

Building a custom request file

Recommended custom request structure

Although the predefined -q 1, -q 2 and -q 3 profiles, together with their corresponding ai.q1, ai.q2 and ai.q3 template files, are useful starting points, you can also build your own request files from scratch and feed them to -q 9 and -f:

!mona tellme -e openai -q 9 -f myrequest.md

A custom request file can be plain text with [variable] placeholders anywhere in the file.

However, the recommended format is the same structure used by the generated ai.q1 and ai.q2 files: first write the instructions for the AI engine, then include a Debugger request JSON: block.

This gives the AI engine a clear set of instructions and a structured variables object to work with.

This same structure is also what mona's openaiagents bridge expects. The bridge uses the text before Debugger request JSON: as agent instructions and the JSON block itself as the actual debugger-data payload.
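To make that split concrete, here is a minimal sketch of how a request could be divided at the Debugger request JSON: marker. It only illustrates the structure described above. The function name and the sample request are made up, and the real bridge may handle this differently.

```python
import json

MARKER = "Debugger request JSON:"


def split_request(text: str):
    """Split a request into (instructions, payload) at the marker line."""
    instructions, found, payload = text.partition(MARKER)
    if not found:
        raise ValueError("marker not found: " + MARKER)
    return instructions.strip(), json.loads(payload)


# Hypothetical, heavily shortened request file.
request = """You are analyzing a debugger snapshot.
Focus on crash triage.

Debugger request JSON:
{"mode": "profile", "question_type": "1", "variables": {"processname": "target.exe"}}
"""

instructions, payload = split_request(request)
print(instructions.splitlines()[0])        # You are analyzing a debugger snapshot.
print(payload["variables"]["processname"])  # target.exe
```

Keeping the instructions and the data in separate, predictable parts of the file is what makes the same request usable both for manual copy/paste and for automated submission.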

Example custom q1 request

Example custom q1 request:

You are analyzing a debugger snapshot from mona.py running under WinDBG.

Focus on crash triage and immediate exploit-relevant observations.
Use the entries under the 'variables' object as the debugger context.
Explain what stands out in the registers, instruction pointer, stack, nearby memory, heap context, and call stack.
If cyclic pattern data is present, correlate it with the crash state.
If heapdynamics data is present, use it to reason about allocation/free history.

Debugger request JSON:
{
  "mode": "profile",
  "question_type": "1",
  "variables": {
    "debugger": "[debugger]",
    "debugger_flavor": "[debugger_flavor]",
    "processname": "[processname]",
    "modules": "[modules]",
    "architecture": "[architecture]",
    "pointer_size": "[pointer_size]",
    "python_version": "[python_version]",
    "timestamp": "[timestamp]",
    "registers": "[registers]",
    "program_counter": "[program_counter]",
    "stack_pointer": "[stack_pointer]",
    "pc_disasm": "[pc_disasm]",
    "pc_page": "[pc_page]",
    "pc_module": "[pc_module]",
    "pc_memory": "[pc_memory]",
    "stack_page": "[stack_page]",
    "stack_memory": "[stack_memory]",
    "ntglobal_flag": "[ntglobal_flag]",
    "seh_chain": "[seh_chain]",
    "findmsp": "[findmsp]",
    "call_stack": "[call_stack]",
    "heapdynamics": "[heapdynamics]",
    "heapdynamics_mini": "[heapdynamics_mini]",
    "additional_context_files": "[additional_context_files]",
    "poc_file": "[poc_file]",
    "instruction_heap_references": "[instruction_heap_references]",
    "heap_analysis_target": "[heap_analysis_target]"
  }
}

Save that file as, for example:

my_q1_request.md

Then run:

!mona tellme -e openai -q 9 -f my_q1_request.md

Mona will replace the placeholders with live debugger values, write the final request to tellme_request.md, submit it to the selected engine if automation is enabled, and write the returned output to tellme_response.md.

The template engine is not limited to JSON-only files. The Debugger request JSON: block is recommended because it mirrors the default template structure and gives the AI engine predictable, structured input.

That means you can add your own content, context, questions, information, or instructions. Anything that would be accepted by the AI engine and could serve as useful input can be added to the file.

Available template variables

The following variables can be used in custom request templates.

Variable | Replaced with | Typical use
[debugger] | The debugger backend name. | Identify the debugger context.
[debugger_flavor] | The debugger flavor or presentation layer name. | Differentiate between WinDBG variants or debugger front-ends.
[processname] | The debugged process image name. | Identify the target process.
[architecture] | The target architecture. | Indicate whether the analysis applies to a 32-bit or 64-bit target.
[pointer_size] | The pointer width in bytes. | Interpret pointer-sized values, stack entries, and memory dumps.
[python_version] | The active Python runtime version used by mona. | Capture the exact Python environment that built the request.
[timestamp] | The local timestamp when the request was built. | Match request and response files later.
[registers] | The current register set and values. | Core crash triage and code-analysis input.
[program_counter] | The current instruction pointer. | Identify where execution currently is, or where it crashed.
[stack_pointer] | The current stack pointer. | Anchor stack analysis.
[pc_disasm] | The current instruction plus nearby disassembly. | Instruction-level crash analysis or control-flow reasoning.
[pc_module] | Module summary for the current instruction pointer. | Identify the module that owns the current code location.
[pc_page] | Memory page summary for the current instruction pointer. | Check whether the instruction pointer points into expected mapped memory.
[stack_page] | Memory page summary for the current stack pointer. | Validate whether the stack pointer still points into expected stack memory.
[pc_memory] | Raw bytes near the current instruction pointer. | Inspect shellcode-like bytes, data-as-code, or corruption near the current PC.
[stack_memory] | Raw bytes near the current stack pointer. | Inspect overwritten values, pointers, strings, or cyclic patterns on the stack.
[modules] | The compact crash-focused module summary used by default for -q 1. | Reason about loaded modules and exploit-relevant module properties.
[modules_mini] | Explicit alias of the compact crash-focused module summary. | Use the reduced module view explicitly in templates.
[modules_full] | The full loaded-module listing. | Use broader module inventory when the compact summary is not enough.
[call_stack] | The WinDBG call stack output. | Explain how execution reached the current location.
[windbg_analyze] | The compact !analyze -v crash summary used by default for -q 1. | Add WinDBG's crash heuristics without sending the full raw output.
[windbg_analyze_mini] | Explicit alias of the compact !analyze -v summary. | Use the reduced crash summary explicitly in templates.
[windbg_analyze_full] | The full raw !analyze -v output. | Use broader WinDBG diagnostics when the compact summary is insufficient.
[findmsp] | Cyclic-pattern analysis results. | Identify pattern offsets in registers, stack, SEH, or nearby memory.
[findseh] | Automatic findSEH search results when findmsp confirms an overwritten SEH record. | Surface candidate SEH-handler pivots, filtered with the same -cpb badchars when supplied.
[seh_chain] | The 32-bit Structured Exception Handling chain summary. | Useful for 32-bit SEH overwrite analysis.
[instruction_heap_references] | Heap and pointer context related to the current instruction. | Heap-aware crash triage and pointer interpretation.
[heap_details] | Heap, segment, VAD, and chunk summary. | Give AI a broader view of heap layout and chunk state.
[heap_analysis_target] | The optional extra heap-focused target supplied with -a when using -q 1. | Add a specific heap pointer, register, or address to the investigation.
[heapdynamics] | The focused heapdynamics matches used by default for -q 1. | Correlate a crash with relevant allocation/free history.
[heapdynamics_mini] | Explicit alias of the focused heapdynamics matches. | Use the reduced heap-history view explicitly in templates.
[heapdynamics_full] | The larger raw heapdynamics context, including file-backed evidence when retained. | Use broader alloc/free history when the focused view is not enough.
[evidence] | Deduplicated shared heap and alloc/free evidence records. | Reuse normalized heap-related evidence across summaries or templates.
[size_budget] | The final -q 1 request size and optional requested -maxsize target. | Explain how aggressively mona trimmed or preserved evidence.
[omitted_sections] | Sections dropped or blanked when mini evidence omits data or -maxsize forces reduction. | Understand what context was removed from a size-constrained request.
[additional_context_files] | User-supplied supporting files from -l that are not heapdynamics logs. | Add notes, triage output, command output, or other supporting material.
[poc_file] | The optional PoC or trigger file contents supplied with -p. | Correlate crash state with the input or trigger logic.
[analysis_target] | The live EIP/RIP address and source used as the primary -q 2 context. | Show which runtime location was selected for code analysis.
[current_function] | The function context for the live EIP/RIP location. | Explain what the current function does.
[additional_function] | An extra -q 2 function context collected from -a when it differs from the live PC function. | Compare or supplement the primary function analysis.
[additional_function_note] | A note explaining cases where -a matched the live EIP/RIP location. | Clarify why only one function analysis was collected.
[function_analyses] | The ordered list of -q 2 function analyses, including invalid-location reports. | Drive function-level explanation and decompilation-style output.
[ntglobal_flag] | The current NtGlobalFlag value and decoded process-heap debugging flags. | Identify whether debug heap features, page heap, heap tail checking, parameter validation, or other diagnostic heap behaviors are enabled.
[q3_goal] | The controlled-chunk analysis mode: targeted reachability or untargeted discovery. | Tell the model whether it should prove reachability to a specific target or look for promising sinks.
[reachability_target] | The optional target address context supplied with -t for controlled-chunk analysis. | Anchor a targeted controlled-chunk reachability question.
[controlled_chunk] | The resolved controlled chunk metadata and pointer-sized dump. | Show the bytes and layout of the attacker-controlled chunk.
[controlled_chunk_references] | Registers and stack locations that already point into the controlled chunk. | Identify live references that could carry chunk control into code or data flow.
[target_function] | The function context for the optional -t target address in controlled-chunk mode. | Explain the destination routine the chunk may need to reach.
[return_context] | The current frame return address context. | Reason about what happens after the current function returns.
[return_resume_analysis] | The summarized post-return path analysis built from the call stack and caller resume site. | Evaluate whether the best sink appears only after returning to the caller.
[controlled_object_callees] | First-hop callees that likely receive controlled object or chunk-derived pointers. | Highlight callee bodies that deserve sink analysis.
[control_flow_disasm_map] | The deduplicated disassembly snippets for reachable control-flow targets. | Keep larger q3 templates readable while still exposing branch and callee evidence.
[control_flow_target_map] | The deduplicated catalog of reachable control-flow target entries. | Reference resolved branches, calls, and jumps without repeating them inline everywhere.
[caller_function] | The caller function context reached via the current frame return address. | Inspect caller-side logic that may become relevant after return.
[caller_chain] | The compact caller-chain analysis across additional return frames. | Extend controlled-chunk reasoning beyond the immediate caller when needed.
[caller_resume_window] | The forward disassembly window at the caller resume site. | Show the first instructions and branches executed after the current function returns.
[post_return_constraints] | Conditions that must still hold after return for a candidate path to continue. | Identify blockers or external state requirements on caller-side routes.
[reachable_functions] | The flattened list of reachable branch, call, and jump targets collected from the visible scope. | Provide concrete breakpoint candidates and sink-hunting targets.
[rop_target_modules] | The q8 ROP module scope, IAT metadata, and compact return-ending windows. | Drive ROP-primitive quality analysis from a curated module set.

If you need specific context from the process that is not currently available as a variable, please don't hesitate to contact me to discuss feasibility and options.

Final thoughts

The tellme command was designed to reduce repetitive work during debugger-driven analysis and crash triage.

Instead of manually collecting registers, stack dumps, disassembly, heap context, mappings, heapdynamics logs, and supporting files, mona can prepare reusable analysis requests directly from the debugger state.

Whether you use automated APIs, local models, OpenAI Agents reasoning runs, or fully manual workflows, the goal remains the same: reduce friction, improve consistency, and make complex analysis workflows easier to reproduce and extend.

Mona's AI support can speed up crash triage and code understanding, but it is only one step in the workflow.
It helps summarize evidence, suggest likely interpretations, and explain low-level behavior in human language.
That is useful, but it is not the same as proving that an interpretation is correct.

AI works from the debugger context it is given. If that context is incomplete, ambiguous, or misleading, the answer can be incomplete or wrong as well.
A strong-looking explanation is still just a hypothesis until it is checked against the real debugger state, disassembly, memory, control flow, and runtime behavior.

That is why human validation still matters. A skilled analyst must confirm whether the crash classification is accurate, whether a function really behaves the way the AI claims, and whether any suggested exploitability is actually supported by the evidence.

The value of AI is speed and assistance, not authority. It helps the analyst get to plausible conclusions faster, but the analyst still has to verify what is true.

If mona AI helped you understand a crash, untangle a function, or move forward on an exploit-development problem, share your experience.
I'm especially interested in real examples of custom prompts, workflows, or prompt profiles that produced useful results during debugging, reversing, or triage.
Practical feedback like that helps improve the feature and makes it easier to see which approaches actually work in the field.

Drop it in the comments below or - even better - share your experiences with the world and let me know!

Have a great day!


I hope you found this useful 🙏🏻 🤗

© Corelan Consulting BV. All rights reserved. The contents of this page may not be reproduced, redistributed, or republished, in whole or in part, for commercial or non-commercial purposes without prior written permission from Corelan Consulting bv. See our Terms of Use & Privacy Policy (https://www.corelan.be/index.php/legal) for more details.




About the author