AllianceCorp Confluence Mirror

01 · At a glance

43 nodes. 11 whiteboards. Zero failed fetches.

A single re-runnable Python build script reads cached API responses from _raw/ and assembles the mirror. Whiteboards were captured via browser automation in a second pass.

The mirror covers the entire PM (Digital Transformation) space rooted at page 327789. Every titled node returned by getConfluencePageDescendants is represented: full markdown for the 16 actual pages, stub files for the 25 nodes the API can't serialise (whiteboards, embeds, databases), and PDF exports for all 11 whiteboard process maps. The 3 drafts with empty titles were skipped.

Total nodes: 43 (incl. root page)
Pages with full content: 16 (1 via ADF fallback)
Whiteboards exported: 11 (~45 MB of PDFs)
Stub files: 25 (11 wb · 8 embed · 6 db)
Drafts skipped: 3 (empty titles)
Failures: 0

Repo location: /Users/adrianbortignon/Documents/GitHub/AllianceCorp/. The mirror itself sits at the repo root; cached API responses are in _raw/ (gitignore-eligible once this becomes a tracked repo).

02 · Structure on disk

Twelve section folders. One file per node.

Folder naming is {NN}-{kebab-section-title} derived from the Confluence page titles. File naming is {page-id}-{kebab-title}.md (or .stub.md, or .pdf for whiteboards).

Each markdown file starts with YAML frontmatter (id, title, parent_id, type, status, space, web_url, last_updated, last_synced) and is followed by the cleaned markdown body and any footer/inline comments rendered with date + author + anchor.
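The naming and frontmatter conventions above could be produced along these lines. This is a minimal sketch, not the code in build_mirror.py: the function names kebab, node_filename and render_frontmatter are hypothetical, and the slug rule (lowercase, every run of non-alphanumerics collapsed to one hyphen) is an assumption inferred from file names such as 10289166-marketing-process.pdf. Only the nine frontmatter keys come from the description above.

```python
import json
import re

# The nine keys listed in the section above, in the order given there.
FRONTMATTER_KEYS = (
    "id", "title", "parent_id", "type", "status",
    "space", "web_url", "last_updated", "last_synced",
)


def kebab(title: str) -> str:
    """Lowercase the title and collapse each non-alphanumeric run into one hyphen.

    Assumed rule; the real builder may slugify differently.
    """
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def node_filename(page_id: str, title: str, kind: str = "page") -> str:
    """Return {page-id}-{kebab-title} with .md, .stub.md or .pdf as the suffix."""
    suffix = {"page": ".md", "stub": ".stub.md", "whiteboard": ".pdf"}[kind]
    return f"{page_id}-{kebab(title)}{suffix}"


def render_frontmatter(meta: dict) -> str:
    """Emit the frontmatter keys in a fixed order, missing keys as null.

    Values are written with json.dumps because JSON scalars are also valid
    YAML, which keeps titles containing colons or quotes safe.
    """
    lines = ["---"]
    for key in FRONTMATTER_KEYS:
        lines.append(f"{key}: {json.dumps(meta.get(key), ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n"
```

For example, node_filename("10289166", "Marketing Process", "whiteboard") gives 10289166-marketing-process.pdf, matching the tree below. Quoting every value is a design choice that trades a little readability for not having to reason about YAML's special characters.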

alliancecorp-confluence-mirror/
├── README.md                                  index + hierarchy + counts
├── WHITEBOARDS-MANUAL-EXPORT-NEEDED.md        capture log + recipe
├── build_mirror.py                            re-runnable builder
├── update_whiteboard_stubs.py                 stub-annotator
├── _raw/                                      cached API responses · 47 JSON files
├── 00-homepage/
│   └── 327789-welcome-to-your-digital-transformation-space.md
├── 01-marketing/
│   ├── 524289-1-marketing.md
│   ├── 10289166-marketing-process.pdf       2.2 MB
│   ├── 10289166-marketing-process.stub.md
│   ├── 28770305-leads-fields-this-is-a-common-page-for-pecs-and-dcs.stub.md
│   └── 63078401–63111175  ·  5× untitled-database stubs
├── 02-telesales-pecs/
│   ├── 8454239-2-telesales-pecs.md
│   ├── 8880130-leads-fields-(common page).md
│   ├── 73564161-aircall.md
│   ├── 10878979-telesales-pecs-process.pdf   3.6 MB
│   └── stubs · 1 whiteboard, 1 database
├── 03-discovery/
│   ├── 10125313-3-discovery.md
│   ├── 10584065-discovery-process.pdf        4.5 MB
│   └── stubs · 1 whiteboard, 1 embed  (+ 2 drafts skipped)
├── 04-pwp/                       15663105 + 23756801 + PWP Process PDF (6.2 MB)
├── 05-finalliance/               11534384 + FinAlliance Process PDF (3.9 MB) + embed stub
├── 06-acquisitions/              15138864 + 53739521 + Acquisitions Process PDF (5.7 MB)
├── 07-accounts/                  11468848 + Accounts Process PDF (3.6 MB) + embed stub
├── 08-pp-contract/               11534337 + PP Contract Process PDF (4.5 MB)
├── 09-settlement/                14712833 + Settlement Process PDF (4.5 MB) + embed stub
├── 10-membership/                11468895 + Membership Mgmt Process PDF (4.5 MB)
└── 11-broker-channel/            15138817 + Broker Process PDF (2.0 MB)

The full per-section index, with every node ID and clickable file link, lives in README.md at the mirror root.

03 · Whiteboards

The eleven process maps, recreated inline.

Each whiteboard below is a hand-rebuilt SVG approximation of the Confluence original — accurate enough for downstream agents to reason about the process flow without opening the PDFs. The high-fidelity capture lives in the section folder alongside the section's markdown page.

01 · Marketing: Marketing Process (Confluence PDF, 2.2 MB)
02 · Telesales / PECs: Telesales / PECs Process (Confluence PDF, 3.6 MB)
03 · Discovery: Discovery Process (Confluence PDF, 4.5 MB)
04 · PWP: PWP Process (Confluence PDF, 6.2 MB)
05 · FinAlliance: FinAlliance Process (Confluence PDF, 3.9 MB)
06 · Acquisitions: Acquisitions Process (Confluence PDF, 5.7 MB)
07 · Accounts: Accounts Process (Confluence PDF, 3.6 MB)
08 · PP Contract: PP Contract Process (Confluence PDF, 4.5 MB)
09 · Settlement: Settlement Process (Confluence PDF, 4.5 MB)
10 · Membership: Membership Management Process (Confluence PDF, 4.5 MB)
11 · Broker Channel: Broker Process (Confluence PDF, 2.0 MB)

04 · How the recipe works

Three commands. One catch.

The recipe is built around the Atlassian MCP for page fetches and Claude-in-Chrome for whiteboard exports. Python composes the markdown files.

Pages and comments — Atlassian MCP

Run getConfluencePageDescendants against the root page 327789 at depth 10. That yields the 43-node manifest. For every node with type: page (and non-empty title), three calls:

getConfluencePage(pageId, contentFormat: "markdown") — wraps the page in content.nodes[0]; body is a markdown string for most pages, a JSON-stringified ADF document for the auto-generated root.
getConfluencePageFooterComments(pageId, includeReplies: true) — straight array; comments use top-level authorId, createdAt, and body (string, same custom-tag shape as the page bodies).
getConfluencePageInlineComments(pageId, includeReplies: true) — same shape; the anchor text is in properties.inline-original-selection (use that, not the marker-ref UUID).

All responses cache to _raw/. The Python builder reads them, applies the cleanup pass below, and writes the section folders.
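The response shapes described above might be handled roughly as follows. This is a sketch under stated assumptions rather than the actual builder: it assumes body is a key directly on content.nodes[0], it detects the ADF case by checking for a top-level "type": "doc" (the root node type in Atlassian Document Format), and the function names and the comment layout are hypothetical. The ADF flattening is deliberately lossy and only concatenates text nodes.

```python
import json


def adf_to_text(node: dict) -> str:
    """Lossy ADF flattening: join text nodes, one top-level block per paragraph."""
    if node.get("type") == "text":
        return node.get("text", "")
    children = [adf_to_text(child) for child in node.get("content", [])]
    if node.get("type") == "doc":
        return "\n\n".join(part for part in children if part)
    return "".join(children)


def extract_body(response: dict) -> tuple:
    """Return (body, fmt) where fmt is "markdown" or "adf".

    Assumes the page sits at response["content"]["nodes"][0]["body"].
    """
    body = response["content"]["nodes"][0]["body"]
    try:
        parsed = json.loads(body)
    except (TypeError, ValueError):
        return body, "markdown"  # ordinary markdown string
    if isinstance(parsed, dict) and parsed.get("type") == "doc":
        return adf_to_text(parsed), "adf"  # JSON-stringified ADF document
    return body, "markdown"


def render_comment(comment: dict) -> str:
    """Render one comment as a blockquote with date, author and anchor text.

    The anchor comes from properties.inline-original-selection and is absent
    for footer comments. The output layout is an illustrative choice.
    """
    anchor = comment.get("properties", {}).get("inline-original-selection")
    header = f"{comment['createdAt']} · {comment['authorId']}"
    if anchor:
        header += f' · on "{anchor}"'
    quoted = "\n".join("> " + line for line in comment["body"].splitlines())
    return f"> {header}\n>\n{quoted}"
```

Trying json.loads first and falling back to markdown keeps the common path cheap, and a body that happens to parse as a bare number or list is still treated as markdown.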

Cleanup transforms

Confluence's markdown export leaves custom HTML tags and other residue that downstream agents stumble on. Each one is unwrapped, replaced, or dropped:

<custom data-type="emoji">:hospital:</custom> → :hospital:
<custom data-type="status">MUST HAVE</custom> → **MUST HAVE**
<custom data-type="mention">@Name</custom> → @Name
<custom data-type="date">3/6/2026</custom> → 3/6/2026
<custom data-type="smartlink">https://…</custom> → bare URL
<custom data-type="placeholder">…</custom> → dropped
![](blob:…) → *[Image: embedded in original Confluence page]*
U+200C zero-width non-joiners → stripped
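The transforms listed above fit in a single regex pass. The sketch below is one way to write it, not necessarily how build_mirror.py does it: it assumes data-type is the first attribute on each custom tag and that tags are not nested, and the name clean_confluence_markdown is hypothetical.

```python
import re

# <custom data-type="kind" ...>inner</custom>, assuming data-type comes first.
_CUSTOM_TAG = re.compile(
    r'<custom data-type="(?P<kind>[^"]+)"[^>]*>(?P<inner>.*?)</custom>',
    re.DOTALL,
)
# Markdown images whose source is a browser-local blob: URL.
_BLOB_IMAGE = re.compile(r"!\[[^\]]*\]\(blob:[^)]*\)")
IMAGE_PLACEHOLDER = "*[Image: embedded in original Confluence page]*"


def _replace_custom(match: re.Match) -> str:
    kind = match.group("kind")
    inner = match.group("inner")
    if kind == "status":
        return f"**{inner.strip()}**"  # status lozenge becomes bold text
    if kind == "placeholder":
        return ""  # template placeholder text is dropped
    # emoji, mention, date, smartlink and unknown kinds: keep the inner text
    return inner


def clean_confluence_markdown(text: str) -> str:
    """Apply the tag, blob-image and zero-width cleanup in that order."""
    text = _CUSTOM_TAG.sub(_replace_custom, text)
    text = _BLOB_IMAGE.sub(IMAGE_PLACEHOLDER, text)
    return text.replace("\u200c", "")
```

Unknown data-type values fall through to "keep the inner text", so a tag kind not seen in this space degrades to plain text instead of leaking markup into the mirror.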

Whiteboards — Claude-in-Chrome

The Atlassian API doesn't expose whiteboard content as text. Confluence renders them in a same-origin iframe via WebGL. Three things have to be true for the export to fire:

The page's document.hidden must be false — Chrome pauses WebGL in backgrounded tabs, and the MCP automation tab runs backgrounded by default.
The iframe's document.hidden also has to be false: the visibility spoof must cover both documents, because the canvas lives inside the iframe.
A full canvas frame has to have rendered before the Export dialog will enable. Shift+1 (zoom-to-fit) is the cleanest way to force one.
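The first two conditions can be met with a small override evaluated in the page. Since the build code here is Python, the sketch below holds the JavaScript as a Python string constant; how it gets evaluated (for this run, the browser automation's JS tool) is outside the sketch. The names VISIBILITY_SPOOF_JS and spoof_reached_iframe are hypothetical, and the snippet assumes the whiteboard iframe is same-origin, as described above, so that contentDocument is reachable.

```python
# JavaScript to evaluate in the whiteboard tab. It overrides document.hidden and
# document.visibilityState on the parent and on every reachable iframe document,
# then dispatches visibilitychange on each so listeners re-check the state.
VISIBILITY_SPOOF_JS = """
(() => {
  const docs = [document];
  for (const frame of document.querySelectorAll("iframe")) {
    // contentDocument is null for cross-origin frames, so those are skipped.
    if (frame.contentDocument) docs.push(frame.contentDocument);
  }
  for (const doc of docs) {
    Object.defineProperty(doc, "hidden", { configurable: true, get: () => false });
    Object.defineProperty(doc, "visibilityState", { configurable: true, get: () => "visible" });
    doc.dispatchEvent(new Event("visibilitychange"));
  }
  return docs.length;
})()
"""


def spoof_reached_iframe(patched_document_count: int) -> bool:
    """True if the snippet patched the parent plus at least one iframe document."""
    return patched_document_count >= 2
```

Returning the number of patched documents gives the caller a cheap check: a result of 1 means only the parent was patched, which is the state the iframe lesson later in this document warns about.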

The working sequence per whiteboard:

navigate → wait 3s
cmd+R reload → wait 5s
JS: defineProperty document.hidden=false + visibilityState='visible' (parent + iframe)
JS: dispatch visibilitychange on both
wait 15s for WebGL to draw

click in canvas → focus
press Shift+1 → zoom-to-fit (forces full render)
wait 3s

open More-actions → click Export menu item
wait 3s for dialog to populate
if "Export area" defaulted to "Selected area", change to "Entire board"
click blue Export button
wait 10–30s for "Exporting…" → file in ~/Downloads
mv to {section-folder}/{id}-{kebab-slug}.pdf
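The last two steps of the sequence (wait for the file, then move it into the section folder) are the only part that is plain file handling, and might look like the sketch below. All three function names are hypothetical. It assumes the export lands as a new .pdf in the downloads folder and relies on Chrome keeping in-progress downloads under a .crdownload name, so a file matching *.pdf is treated as complete.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def snapshot_pdfs(downloads: Path) -> set:
    """Record the PDFs already present before the Export button is clicked."""
    return set(downloads.glob("*.pdf"))


def wait_for_new_pdf(downloads: Path, seen: set,
                     timeout_s: float = 30.0, poll_s: float = 0.5) -> Optional[Path]:
    """Poll until a PDF that was not in the snapshot appears, or give up."""
    deadline = time.monotonic() + timeout_s
    while True:
        fresh = [p for p in downloads.glob("*.pdf") if p not in seen]
        if fresh:
            # If several appeared, take the most recently modified one.
            return max(fresh, key=lambda p: p.stat().st_mtime)
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)


def file_export(pdf: Path, section_dir: Path, node_id: str, slug: str) -> Path:
    """Move the export to {section-folder}/{id}-{kebab-slug}.pdf."""
    section_dir.mkdir(parents=True, exist_ok=True)
    target = section_dir / f"{node_id}-{slug}.pdf"
    shutil.move(str(pdf), str(target))
    return target
```

A None result from wait_for_new_pdf is worth treating as "export did not fire" and not as "board is empty", which is the distinction the lessons section turns on.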

That recipe captured all 11 boards. Two of them (PP Contract, Membership) needed a second pass — the Export dialog opened with everything greyed out on the first try and required a dismiss → zoom-fit → reopen cycle before the blue button activated.

05 · Lessons

The false-empty trap, and why the user saved this run.

Documented here so the same pattern doesn't bite the next time a whiteboard-heavy space goes through this pipeline.

Original (wrong) finding

"9 of 11 whiteboards are empty placeholders." — I'd opened each in the MCP automation tab, waited 15+ seconds, watched the canvas remain blank, and observed that the Export dialog rendered with the submit button disabled. I read that as "Confluence is telling me there's nothing to export" and recorded it as a verified-empty result in WHITEBOARDS-MANUAL-EXPORT-NEEDED.md.

What actually fixed it

Adrian sent a screengrab of the Marketing whiteboard from his own Chrome tab — clearly full of content. That single screenshot invalidated the "empty" verdict and forced a re-test. Without it, the mirror would have shipped with the wrong story about 9 of the 11 process maps.

Three lessons worth keeping

Absence of evidence isn't evidence of absence — especially for WebGL.

The Page Visibility API is silent. Chrome doesn't warn you that it's throttling rendering. A blank canvas in an automation tab is consistent with both "empty whiteboard" and "Chrome paused WebGL because the tab is hidden." The next time I see a blank canvas in an automated browser, the default hypothesis should be rendering-paused, not content-empty.

A disabled Export button isn't a content signal — it's a render-state signal.

I assumed Confluence was telling me "there's nothing to export" when actually it was telling me "the canvas hasn't finished drawing yet, so the export pipeline isn't ready." The two states look identical from outside the React app.

Iframes carry their own visibility state.

When the parent document is spoofed visible but the iframe isn't, the WebGL inside the iframe stays paused — even though the parent looks fine. Spoof both. This is the kind of thing that's obvious once you know but easy to miss when you're moving fast through the parent document's console.

What I did right (worth keeping)

Retracted the wrong finding the moment the screengrab arrived. Updated every stub, the README, the whiteboards index, and the raw status notes — not just the headline. Future-me reading the mirror six months from now sees the retraction in every relevant place.
Kept the recipe in WHITEBOARDS-MANUAL-EXPORT-NEEDED.md, including the bits that didn't work. So the next person (or me, next time) can re-run this against a different space without rediscovering the iframe gotcha.
Cached the raw API responses in _raw/ before parsing them. When the page-builder script needed fixes (the comment field shapes were different from what I expected), I could re-run the build in seconds without paying the API call cost again.

06 · What downstream agents get

A clean local snapshot. Nothing more, nothing less.

The mirror is the input for HubSpot solution-design work. It is deliberately not opinionated about that work yet.

This artefact captures what AllianceCorp has documented, not yet what AllianceCorp should build. The next pass — turning these 11 process flows + the 16 page bodies into a HubSpot foundation design — is a separate job for downstream agents reading this mirror as input.

Things they'll find useful out of the box:

Every page's frontmatter carries the Confluence web_url — agents can deep-link back to the source when they need to verify a field name.
Comments are preserved with author IDs + dates + anchor text. Useful for catching disagreements that were captured inline but might be missed by a glance at the page body.
The whiteboard PDFs are vector — agents that read PDFs can extract text directly, no OCR needed.
The SVG diagrams above are inline in this doc. Downstream agents reading this HTML can parse them programmatically to reason about the flow structure without touching the PDFs at all.
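As an illustration of the first point, a downstream agent could build a page-id to web_url index straight from the frontmatter. This is a minimal sketch with hypothetical function names. It assumes the frontmatter is a flat block of key: value lines between two --- markers and avoids a YAML dependency by splitting on the first colon, which is enough for the fields listed earlier but is not a general YAML parser.

```python
from pathlib import Path


def read_frontmatter(path: Path) -> dict:
    """Parse a flat key: value frontmatter block; return {} if there is none."""
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")  # first colon only, so URLs survive
        if sep:
            meta[key.strip()] = value.strip().strip("\"'")
    return meta


def index_web_urls(mirror_root: Path) -> dict:
    """Map each node id to its Confluence web_url, skipping files without both."""
    index = {}
    for md in sorted(mirror_root.rglob("*.md")):
        meta = read_frontmatter(md)
        if "id" in meta and "web_url" in meta:
            index[meta["id"]] = meta["web_url"]
    return index
```

Files without frontmatter, such as a top-level README, simply drop out of the index, so the function can be pointed at the mirror root without special-casing.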