A local snapshot of the Digital Transformation Confluence space — 43 nodes, 11 process whiteboards, every page body cleaned and frontmatter-tagged. Built so downstream agents can read it offline while we design the HubSpot solution.
build_mirror.py reads the cached API responses in _raw/ and assembles the mirror. Whiteboards were captured via browser automation in a second pass.
The mirror covers the entire PM (Digital Transformation) space rooted at page 327789. Every page returned by getConfluencePageDescendants is represented: full markdown for the 16 actual pages, stub files for the 25 nodes the API can't serialise (whiteboards, embeds, databases), and PDF exports for all 11 whiteboard process maps.
Repo location: /Users/adrianbortignon/Documents/GitHub/AllianceCorp/. The mirror itself sits at the repo root; cached API responses are in _raw/ (gitignore-eligible once this becomes a tracked repo).
Section folders are named {NN}-{kebab-section-title}, derived from the Confluence page titles. File naming is {page-id}-{kebab-title}.md (or .stub.md, or .pdf for whiteboards).
Each markdown file starts with YAML frontmatter (id, title, parent_id, type, status, space, web_url, last_updated, last_synced), followed by the cleaned markdown body and any footer or inline comments rendered with date, author, and anchor.
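The naming and frontmatter conventions above could be sketched as follows. This is an illustration, not the code in build_mirror.py: the helper names (kebab, mirror_filename, render_frontmatter) and the slug rule are assumptions, and the real builder evidently slugs some titles differently (one file in the tree keeps its parentheses). Values are written as JSON scalars, which are also valid YAML.

```python
import json
import re

# Frontmatter keys in the order listed in this write-up.
FRONTMATTER_KEYS = (
    "id", "title", "parent_id", "type", "status",
    "space", "web_url", "last_updated", "last_synced",
)

# Hypothetical mapping from node kind to file suffix.
SUFFIXES = {"page": ".md", "stub": ".stub.md", "whiteboard_pdf": ".pdf"}


def kebab(title: str) -> str:
    """Lowercase the title and collapse every non-alphanumeric run to '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def mirror_filename(page_id: str, title: str, kind: str = "page") -> str:
    """Build {page-id}-{kebab-title} plus the suffix for this kind of node."""
    return f"{page_id}-{kebab(title)}{SUFFIXES[kind]}"


def render_frontmatter(meta: dict) -> str:
    """Render the frontmatter block; missing keys are written as null."""
    lines = ["---"]
    for key in FRONTMATTER_KEYS:
        lines.append(f"{key}: {json.dumps(meta.get(key))}")
    lines.append("---")
    return "\n".join(lines) + "\n"
```

For example, mirror_filename("524289", "1. Marketing") gives 524289-1-marketing.md under this slug rule.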
alliancecorp-confluence-mirror/
├── README.md
├── WHITEBOARDS-MANUAL-EXPORT-NEEDED.md
├── build_mirror.py
├── update_whiteboard_stubs.py
├── _raw/
├── 00-homepage/
│   └── 327789-welcome-to-your-digital-transformation-space.md
├── 01-marketing/
│   ├── 524289-1-marketing.md
│   ├── 10289166-marketing-process.pdf
│   ├── 10289166-marketing-process.stub.md
│   ├── 28770305-leads-fields-this-is-a-common-page-for-pecs-and-dcs.stub.md
│   └── 63078401–63111175 · 5× untitled-database stubs
├── 02-telesales-pecs/
│   ├── 8454239-2-telesales-pecs.md
│   ├── 8880130-leads-fields-(common page).md
│   ├── 73564161-aircall.md
│   ├── 10878979-telesales-pecs-process.pdf
│   └── stubs · 1 whiteboard, 1 database
├── 03-discovery/
│   ├── 10125313-3-discovery.md
│   ├── 10584065-discovery-process.pdf
│   └── stubs · 1 whiteboard, 1 embed (+ 2 drafts skipped)
├── 04-pwp/
├── 05-finalliance/
├── 06-acquisitions/
├── 07-accounts/
├── 08-pp-contract/
├── 09-settlement/
├── 10-membership/
└── 11-broker-channel/
The full per-section index, with every node ID and clickable file link, lives in README.md at the mirror root.
Each whiteboard below is a hand-rebuilt SVG approximation of the Confluence original — accurate enough for downstream agents to reason about the process flow without opening the PDFs. The high-fidelity capture lives in the section folder alongside the section's markdown page.
Run getConfluencePageDescendants against the root page 327789 at depth 10. That yields the 43-node manifest. For every node with type: page (and non-empty title), three calls:
getConfluencePage(pageId, contentFormat: "markdown"): wraps the page in content.nodes[0]; body is a markdown string for most pages, a JSON-stringified ADF document for the auto-generated root.
getConfluencePageFooterComments(pageId, includeReplies: true): straight array; comments use top-level authorId, createdAt, and body (string, same custom-tag shape as the page bodies).
getConfluencePageInlineComments(pageId, includeReplies: true): same shape; the anchor text is in properties.inline-original-selection (use that, not the marker-ref UUID).
All responses cache to _raw/. The Python builder reads them, applies the cleanup pass below, and writes the section folders.
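A minimal sketch of how a builder might read these cached responses, under stated assumptions: the function names are hypothetical, the body is assumed to sit under a "body" key of content.nodes[0], and the rendered comment layout is only one possible choice. The ADF check relies on ADF documents having a top-level "type": "doc".

```python
import json


def extract_body(page_response: dict) -> tuple[str, object]:
    """Return ("markdown", text) or ("adf", parsed_document) for one cached page."""
    node = page_response["content"]["nodes"][0]
    body = node.get("body", "")  # assumed key name
    if isinstance(body, str) and body.lstrip().startswith("{"):
        try:
            doc = json.loads(body)
        except json.JSONDecodeError:
            return "markdown", body  # just markdown that happens to start with "{"
        if isinstance(doc, dict) and doc.get("type") == "doc":
            return "adf", doc  # the auto-generated root page case
    return "markdown", body


def render_comment(comment: dict) -> str:
    """Render one footer or inline comment as a quoted block with date, author, anchor."""
    anchor = comment.get("properties", {}).get("inline-original-selection")
    header = f"{comment['createdAt']} · {comment['authorId']}"
    if anchor:  # only inline comments carry the selected text
        header += f' · on "{anchor}"'
    return f"> {header}\n> {comment['body']}"
```

Keeping the parse step separate from the API calls means a wrong guess about a field shape only costs a rebuild from _raw/, not another round of requests.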
Confluence's markdown export leaves a handful of custom HTML tags that downstream agents stumble on. Strip them all:
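One way to implement the tag rewrites this section lists is a single regular expression with a dispatch on data-type. This is a sketch rather than the actual cleanup pass in build_mirror.py: it assumes the tags are never nested and always carry a double-quoted data-type attribute, it leaves unknown types untouched, and it does not cover the image placeholder rewrite.

```python
import re

# Matches <custom data-type="...">inner</custom>, tolerating extra attributes.
CUSTOM_TAG = re.compile(
    r'<custom\s+data-type="([\w-]+)"[^>]*>(.*?)</custom>',
    re.DOTALL,
)

# Types whose inner text is kept unchanged.
KEEP_INNER = {"emoji", "mention", "date", "smartlink"}


def _rewrite(match: re.Match) -> str:
    kind, inner = match.group(1), match.group(2).strip()
    if kind == "status":
        return f"**{inner}**"  # status lozenges become bold text
    if kind == "placeholder":
        return ""  # placeholders are dropped
    if kind in KEEP_INNER:
        return inner
    return match.group(0)  # unknown type: leave it for manual review


def clean_body(markdown: str) -> str:
    """Strip Confluence's custom tags from an exported markdown body."""
    return CUSTOM_TAG.sub(_rewrite, markdown)
```

Leaving unknown types in place, rather than stripping them, makes any tag the sketch does not know about easy to spot in the output.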
<custom data-type="emoji">:hospital:</custom> → :hospital:
<custom data-type="status">MUST HAVE</custom> → **MUST HAVE**
<custom data-type="mention">@Name</custom> → @Name
<custom data-type="date">3/6/2026</custom> → 3/6/2026
<custom data-type="smartlink">https://…</custom> → bare URL
<custom data-type="placeholder">…</custom> → dropped
embedded images → *[Image: embedded in original Confluence page]*
The Atlassian API doesn't expose whiteboard content as text. Confluence renders whiteboards in a same-origin iframe via WebGL. Three things have to be true for the export to fire:
The parent page's document.hidden must be false: Chrome pauses WebGL in backgrounded tabs, and the MCP automation tab runs backgrounded by default.
The iframe's document.hidden also has to be false: the spoof needs both, because the canvas lives inside the iframe.
The canvas needs a full render: Shift+1 (zoom-to-fit) is the cleanest way to force one.
The working sequence per whiteboard:
navigate → wait 3s
cmd+R reload → wait 5s
JS: defineProperty document.hidden=false + visibilityState='visible' (parent + iframe)
JS: dispatch visibilitychange on both
wait 15s for WebGL to draw
click in canvas → focus
press Shift+1 → zoom-to-fit (forces full render)
wait 3s
open More-actions → click Export menu item
wait 3s for dialog to populate
if "Export area" defaulted to "Selected area", change to "Entire board"
click blue Export button
wait 10–30s for "Exporting…" → file in ~/Downloads
mv to {section-folder}/{id}-{kebab-slug}.pdf
That recipe captured all 11 boards. Two of them (PP Contract, Membership) needed a second pass — the Export dialog opened with everything greyed out on the first try and required a dismiss → zoom-fit → reopen cycle before the blue button activated.
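The final step of the recipe, moving the exported file into its section folder, could be scripted along these lines. The function name and the "newest PDF in the downloads folder" heuristic are assumptions for illustration; the original run may well have used a plain mv.

```python
import shutil
from pathlib import Path


def file_whiteboard_pdf(downloads: Path, section_dir: Path,
                        page_id: str, slug: str) -> Path:
    """Move the most recently modified PDF in downloads to {section}/{id}-{slug}.pdf."""
    pdfs = sorted(downloads.glob("*.pdf"), key=lambda p: p.stat().st_mtime)
    if not pdfs:
        raise FileNotFoundError(f"no exported PDF found in {downloads}")
    section_dir.mkdir(parents=True, exist_ok=True)
    target = section_dir / f"{page_id}-{slug}.pdf"
    shutil.move(str(pdfs[-1]), str(target))  # newest file is last after the sort
    return target
```

Raising when no PDF is present keeps a silently failed export from being mistaken for a filed one.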
"9 of 11 whiteboards are empty placeholders." — I'd opened each in the MCP automation tab, waited 15+ seconds, watched the canvas remain blank, and observed that the Export dialog rendered with the submit button disabled. I read that as "Confluence is telling me there's nothing to export" and recorded it as a verified-empty result in WHITEBOARDS-MANUAL-EXPORT-NEEDED.md.
Adrian sent a screengrab of the Marketing whiteboard from his own Chrome tab — clearly full of content. That single screenshot invalidated the "empty" verdict and forced a re-test. Without it, the mirror would have shipped with the wrong story about 9 of the 11 process maps.
Absence of evidence isn't evidence of absence — especially for WebGL.
The Page Visibility API is silent. Chrome doesn't warn you that it's throttling rendering. A blank canvas in an automation tab is consistent with both "empty whiteboard" and "Chrome paused WebGL because the tab is hidden." The next time I see a blank canvas in an automated browser, the default hypothesis should be rendering-paused, not content-empty.
A disabled Export button isn't a content signal — it's a render-state signal.
I assumed Confluence was telling me "there's nothing to export" when actually it was telling me "the canvas hasn't finished drawing yet, so the export pipeline isn't ready." The two states look identical from outside the React app.
Iframes carry their own visibility state.
When the parent document is spoofed visible but the iframe isn't, the WebGL inside the iframe stays paused — even though the parent looks fine. Spoof both. This is the kind of thing that's obvious once you know but easy to miss when you're moving fast through the parent document's console.
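A sketch of a spoof that covers both documents, written as a Python helper that hands a JavaScript snippet to whatever automation layer is in use. The evaluate callable is a stand-in for that layer's "run this script in the page" function and is an assumption, as are the names below; the snippet itself uses only standard DOM calls (Object.defineProperty, dispatchEvent, contentDocument).

```python
from typing import Any, Callable

# Overrides document.hidden / visibilityState on the parent document and on
# every same-origin iframe, then fires visibilitychange on each of them.
VISIBILITY_SPOOF_JS = """
(() => {
  const spoof = (doc) => {
    Object.defineProperty(doc, 'hidden',
      { configurable: true, get: () => false });
    Object.defineProperty(doc, 'visibilityState',
      { configurable: true, get: () => 'visible' });
    doc.dispatchEvent(new doc.defaultView.Event('visibilitychange'));
  };
  spoof(document);
  let spoofed = 1;
  for (const frame of document.querySelectorAll('iframe')) {
    // contentDocument is null for cross-origin frames, so those are skipped.
    if (frame.contentDocument) {
      spoof(frame.contentDocument);
      spoofed += 1;
    }
  }
  return spoofed;
})()
"""


def spoof_visibility(evaluate: Callable[[str], Any]) -> Any:
    """Run the spoof through the caller-supplied script runner.

    Returns whatever the runner returns; with the snippet above that is the
    number of documents patched (the parent plus reachable iframes).
    """
    return evaluate(VISIBILITY_SPOOF_JS)
```

A return value of 1 would mean only the parent document was patched, which is exactly the half-spoofed state described above.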
Documented the recipe in WHITEBOARDS-MANUAL-EXPORT-NEEDED.md, including the bits that didn't work, so the next person (or me, next time) can re-run this against a different space without rediscovering the iframe gotcha.
Cached the API responses in _raw/ before parsing them. When the page-builder script needed fixes (comment field shapes were different from what I expected), I could re-run the build in seconds without re-paying the API call cost.
This artefact captures what AllianceCorp has documented, not yet what AllianceCorp should build. The next pass, turning these 11 process flows and the 16 page bodies into a HubSpot foundation design, is a separate job for downstream agents reading this mirror as input.
Things they'll find useful out of the box:
web_url — agents can deep-link back to the source when they need to verify a field name.