Your agent does not have to stop at conversation. With AnySoul’s browser runtime, your agent can open tabs, read pages, click through flows, fill forms, upload files, and continue multi-step web tasks.

There are currently two ways to do that:

  • Web + browser extension — use your current browser and current signed-in browser identity
  • Desktop app — use the AnySoul desktop browser runtime with the richest current browser capability surface

The important part is choosing the right runtime for the kind of browsing work you want your agent to do.

What This Use Case Looks Like

Imagine you ask your agent to help with a real browser task:

  • open a site
  • search for something
  • click into a result
  • fill a form
  • upload a file
  • wait for the next page
  • extract the final result

That is no longer theoretical. AnySoul already supports this family of browser workflows today through explicit, structured browser actions.

Two Runtime Paths

Runtime | Best For | What It Uses
Web + browser extension | Acting inside the browser you already use every day | Your current browser tabs and current browser profile
Desktop app | Richer browser workflows and the fullest current capability surface | The AnySoul desktop browser runtime

What Both Paths Can Do

Today, both paths support the explicit browser workflow family:

  • open and activate tabs
  • navigate, go back, go forward, reload
  • read pages and extract structured data
  • scroll, focus, hover, click, double-click, right-click
  • type, paste, clear, copy text
  • set checked state
  • select dropdown options
  • submit forms
  • upload files
  • wait for selectors, text, or URL changes

This is enough for a large class of deterministic browser tasks.
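
The wait actions in the list above (selectors, text, URL changes) are the kind of step usually built on polling. The following is a generic sketch of that idea, not AnySoul's implementation: `wait_until` and `FakePage` are hypothetical names introduced here, and the URLs are placeholders.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout_s: float = 5.0,
               interval_s: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


class FakePage:
    """Hypothetical stand-in for a page whose URL changes on the third poll."""

    def __init__(self) -> None:
        self.polls = 0

    def url(self) -> str:
        self.polls += 1
        if self.polls >= 3:
            return "https://example.com/done"
        return "https://example.com/form"


page = FakePage()

# Wait for a URL change: succeeds once the fake page reports the new URL.
url_changed = wait_until(lambda: page.url().endswith("/done"),
                         timeout_s=1.0, interval_s=0.01)

# A condition that never holds returns False after the timeout instead of hanging.
never_true = wait_until(lambda: False, timeout_s=0.05, interval_s=0.01)
```

Returning a boolean rather than raising lets the caller decide whether a timeout is an error or just a signal to try a different step.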

Where the Difference Really Matters

The biggest difference between the two paths is support for semantic browser actions (semantic_act and semantic_extract).

Browser Extension Path

The extension path is the right choice when:

  • you want the agent to continue in your real browser
  • you want to reuse your current signed-in browser identity
  • your workflow can be expressed with explicit, structured browser steps

But the extension path currently does not support:

  • semantic_act
  • semantic_extract

So it should be treated as an explicit-action browser agent.
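
One way to apply this in practice is to check a planned workflow against the runtime before running it. This is a minimal sketch under stated assumptions: only semantic_act and semantic_extract come from the text above, while the runtime labels, the `unsupported_steps` helper, and the other action names are hypothetical.

```python
from typing import Dict, List, Set

# The two semantic action names listed above.
SEMANTIC_ACTIONS: Set[str] = {"semantic_act", "semantic_extract"}

# Assumed runtime labels, used only for this illustration.
RUNTIME_SUPPORTS_SEMANTIC: Dict[str, bool] = {
    "extension": False,
    "desktop": True,
}


def unsupported_steps(runtime: str, plan: List[str]) -> List[str]:
    """Return the action names in `plan` that `runtime` cannot run."""
    if RUNTIME_SUPPORTS_SEMANTIC[runtime]:
        return []
    return [action for action in plan if action in SEMANTIC_ACTIONS]


# "open_tab" and "click" are illustrative names, not confirmed identifiers.
plan = ["open_tab", "click", "semantic_extract"]

blocked_on_extension = unsupported_steps("extension", plan)
blocked_on_desktop = unsupported_steps("desktop", plan)
```

A non-empty result on the extension path means the step needs to be rewritten as explicit actions or the task moved to the desktop app.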

Desktop App Path

The desktop app path is the right choice when:

  • you want the richest current browser capability surface
  • you want a desktop-managed browser runtime
  • your task benefits from semantic browser actions in addition to explicit actions

The desktop app path can expose richer semantic browser actions, but they come with tradeoffs:

  • they are usually slower than explicit actions
  • they usually cost more tokens
  • they add a model-mediated reasoning layer on top of the browser runtime

So even on desktop, the best default is still:

  • use explicit actions first
  • use semantic actions when the page is too hard to express with selectors alone
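
The explicit-first default above can be sketched as a small fallback wrapper. Everything named here is hypothetical (`SelectorNotFound`, `run_with_fallback`, and the two click functions); the sketch only illustrates the ordering, not how AnySoul dispatches actions.

```python
from typing import Callable, Tuple


class SelectorNotFound(Exception):
    """Hypothetical error raised when an explicit selector matches nothing."""


def run_with_fallback(explicit: Callable[[], str],
                      semantic: Callable[[], str],
                      semantic_available: bool) -> Tuple[str, str]:
    """Try the explicit action first; fall back to the semantic one only
    if the explicit action fails and the runtime supports semantic actions."""
    try:
        return "explicit", explicit()
    except SelectorNotFound:
        if not semantic_available:
            raise
        return "semantic", semantic()


def click_by_selector() -> str:
    # Simulates a page where the selector does not match.
    raise SelectorNotFound("#submit not found")


def click_by_description() -> str:
    # Simulates a slower, model-mediated semantic action.
    return "clicked the submit button"


mode, result = run_with_fallback(click_by_selector,
                                 click_by_description,
                                 semantic_available=True)
```

With `semantic_available=False`, as on the extension path, the original error propagates so the caller can adjust the selector instead.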

A Practical Example

Example: Fill a Real Web Form

You ask your agent to help submit an application or upload a document.

With the current browser runtime, the agent can:

  1. open the target page
  2. read the visible controls
  3. focus the right field
  4. type or paste values
  5. select dropdown options
  6. upload a file
  7. submit the form
  8. wait for the confirmation state
  9. extract the result
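
The nine steps above can be written down as an ordered plan of explicit actions. This is an illustrative sketch only: the `Step` type, the action names, the URL, the selectors, and the values are all placeholders, not confirmed AnySoul identifiers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Step:
    action: str  # illustrative action name
    args: Dict[str, str] = field(default_factory=dict)


# One entry per numbered step in the list above; all values are placeholders.
FORM_PLAN: List[Step] = [
    Step("open", {"url": "https://example.com/apply"}),
    Step("read_page"),
    Step("focus", {"selector": "#full-name"}),
    Step("type", {"selector": "#full-name", "text": "Ada Lovelace"}),
    Step("select_option", {"selector": "#country", "value": "UK"}),
    Step("upload_file", {"selector": "#resume", "path": "resume.pdf"}),
    Step("submit", {"selector": "form#application"}),
    Step("wait_for_text", {"text": "Application received"}),
    Step("extract", {"selector": "#confirmation-number"}),
]


def describe(plan: List[Step]) -> List[str]:
    """Render a plan as numbered lines without touching a browser."""
    lines = []
    for number, step in enumerate(plan, start=1):
        if step.args:
            lines.append(f"{number}. {step.action} {step.args}")
        else:
            lines.append(f"{number}. {step.action}")
    return lines


lines = describe(FORM_PLAN)
```

Keeping the plan as plain data makes it easy to review each step before any real form is submitted.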

This is exactly the kind of browser workflow AnySoul is already good at today.

Which One Should You Pick?

Use the browser extension if:

  • you want the agent to act in your current browser
  • you are fine with explicit structured browser workflows
  • you do not need semantic browser actions

Use the desktop app if:

  • you want the richest browser runtime
  • you want a local, desktop-managed browser runtime
  • you want semantic browser actions when available

Get Started

  • Read the full Browser Runtime manual
  • Install the desktop app if you want the richest browser runtime
  • Use the browser extension path if you want to keep everything in your real browser