Looker Studio Technical Implementation

Overview

42crawl provides a "Community Connector"-style integration for Looker Studio, allowing users to build live SEO dashboards. Unlike a traditional direct export, this feature uses an Edge Function as a data provider that Looker Studio can query.

Why This Exists

  • Data Persistence: SEO crawls are transient by default. The Looker Studio integration allows users to "pin" a specific crawl result for reporting.
  • Reporting Automation: Once connected, Looker Studio handles the visualization, allowing for custom agency-branded reports without building a custom UI.

How It Works

Architecture

The integration relies on the looker-data-source Supabase Edge Function and Supabase Storage.

  1. Selection: The user selects "Export to Looker Studio" in the dashboard (LookerStudioModal.tsx).
  2. Snapshotting: The frontend sends the crawl results and stats to the looker-data-source Edge Function.
  3. Storage: The Edge Function serializes the data into a standardized JSON format and uploads it to a Supabase Storage bucket named looker-exports.
  4. Retrieval: The function generates a unique URL (or signed URL) that Looker Studio uses to fetch the JSON data.
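The snapshotting and storage steps (2–3) can be sketched as a pure function. The names below (`buildSnapshot`, `CrawlSnapshot`) are illustrative assumptions, not taken from the actual Edge Function source:

```typescript
// Illustrative sketch of the snapshotting step; field and function
// names are assumptions, not the real looker-data-source internals.
interface CrawlSnapshot {
  id: string;
  domain: string;
  createdAt: string;
  pages: unknown[];
  stats: Record<string, unknown>;
}

// Build the JSON object the function would upload to the
// `looker-exports` bucket, keyed by a generated snapshot id.
function buildSnapshot(
  id: string,
  domain: string,
  pages: unknown[],
  stats: Record<string, unknown>,
  now: Date = new Date(),
): { path: string; body: string } {
  const snapshot: CrawlSnapshot = {
    id,
    domain,
    createdAt: now.toISOString(),
    pages,
    stats,
  };
  // One object per snapshot: <bucket>/<id>.json
  return { path: `${id}.json`, body: JSON.stringify(snapshot) };
}
```

The storage path convention here (one `<id>.json` per snapshot) is an assumption; the key point is that each export is a self-contained, immutable JSON object.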

Standardized Schema

The Edge Function (supabase/functions/looker-data-source/index.ts) enforces a strict schema for the JSON output to ensure compatibility with Looker Studio fields:

  • url: STRING
  • healthScore: NUMBER
  • performanceScore: NUMBER
  • geoScore: NUMBER
  • criticalIssues: NUMBER
  • wordCount: NUMBER
  • (and 15+ other SEO metrics)
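A TypeScript shape for these rows might look like the following. The `toLookerRow` helper and its defaulting behavior are assumptions for illustration; only the listed field names and types come from the schema above:

```typescript
// Sketch of the per-page row schema; the real output has 15+ more metrics.
interface LookerRow {
  url: string;
  healthScore: number;
  performanceScore: number;
  geoScore: number;
  criticalIssues: number;
  wordCount: number;
}

// Hypothetical coercion of a raw crawl result into the strict schema,
// defaulting missing numerics to 0 so Looker Studio columns stay typed.
function toLookerRow(raw: Record<string, unknown>): LookerRow {
  const num = (v: unknown) =>
    typeof v === "number" && Number.isFinite(v) ? v : 0;
  return {
    url: String(raw.url ?? ""),
    healthScore: num(raw.healthScore),
    performanceScore: num(raw.performanceScore),
    geoScore: num(raw.geoScore),
    criticalIssues: num(raw.criticalIssues),
    wordCount: num(raw.wordCount),
  };
}
```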

Configuration

Required Setup

  • Supabase Storage: A bucket named looker-exports must be created and set to public or managed via signed URLs.
  • Service Role: The Edge Function uses the SUPABASE_SERVICE_ROLE_KEY to bypass RLS and manage storage objects.

Environment Variables

  • SUPABASE_URL: The project's Supabase URL.
  • SUPABASE_SERVICE_ROLE_KEY: Required for storage management.
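A minimal guard for these variables could look like this. `requireEnv` is a hypothetical helper; inside the actual Edge Function the values would come from `Deno.env`:

```typescript
// Illustrative startup check: fail fast if a required env var is absent.
function requireEnv(
  env: Record<string, string | undefined>,
  keys: string[],
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of keys) {
    const value = env[key];
    if (!value) throw new Error(`Missing required env var: ${key}`);
    out[key] = value;
  }
  return out;
}

// In the Edge Function (Deno runtime), this would be called as:
// const config = requireEnv(Deno.env.toObject(), ["SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY"]);
```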

User Flow

  1. User clicks Export > Looker Studio.
  2. The modal (LookerStudioModal.tsx) triggers a POST request to the Edge Function.
  3. The Edge Function returns a dataUrl.
  4. The user receives a "Deployment URL" for the Looker Studio Connector and configures the connector with the returned dataUrl.
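The request in step 2 might be assembled like this. `buildSnapshotRequest` is a hypothetical helper (the endpoint path and body shape match the API section below; everything else is an assumption):

```typescript
// Illustrative helper: build the POST request the modal would send.
function buildSnapshotRequest(
  baseUrl: string,
  domain: string,
  pages: unknown[],
  stats: Record<string, unknown>,
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  return {
    url: `${baseUrl}/functions/v1/looker-data-source`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ action: "create", domain, pages, stats }),
    },
  };
}

// Usage sketch:
// const { url, init } = buildSnapshotRequest(SUPABASE_URL, "example.com", pages, stats);
// const { url: dataUrl } = await (await fetch(url, init)).json();
```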

API

looker-data-source (Edge Function)

POST - Create Snapshot

  • Endpoint: /functions/v1/looker-data-source
  • Body:

    {
      "action": "create",
      "domain": "example.com",
      "pages": [...],
      "stats": {...}
    }

  • Returns: a JSON body of the form { success: true, url: string, id: string, expiresAt: string }
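The success response can be narrowed with a small type guard. The field names come from the return shape above; the guard itself is illustrative:

```typescript
// Shape of a successful create-snapshot response.
interface SnapshotResponse {
  success: true;
  url: string;
  id: string;
  expiresAt: string;
}

// Runtime check before trusting a parsed response body.
function isSnapshotResponse(v: unknown): v is SnapshotResponse {
  if (typeof v !== "object" || v === null) return false;
  const r = v as Record<string, unknown>;
  return (
    r.success === true &&
    typeof r.url === "string" &&
    typeof r.id === "string" &&
    typeof r.expiresAt === "string"
  );
}
```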

GET - Retrieve Snapshot

  • Endpoint: /functions/v1/looker-data-source?id=<UUID>
  • Returns: The full JSON data in Looker-compatible format.
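Constructing the retrieval URL is a one-liner; `snapshotUrl` is an illustrative helper name:

```typescript
// Build the GET endpoint for a stored snapshot, given its UUID.
function snapshotUrl(baseUrl: string, id: string): string {
  const u = new URL(`${baseUrl}/functions/v1/looker-data-source`);
  u.searchParams.set("id", id);
  return u.toString();
}
```

This is the URL a Looker Studio connector would be configured with to fetch the snapshot JSON.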

Edge Cases & Limitations

  • Data Expiration: Snapshots expire after 24 hours (configurable via the EXPIRY_SECONDS constant in the function).
  • Storage Limits: Large crawls can generate multi-megabyte JSON files; the function enforces no size limit of its own and relies on Supabase Storage quotas.
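The expiry timestamp can be derived as follows. The 24-hour default comes from the text above; the helper name and injected clock are assumptions:

```typescript
// Assumption: mirrors the function's EXPIRY_SECONDS default of 24 hours.
const EXPIRY_SECONDS = 24 * 60 * 60;

// Compute the ISO-8601 expiry timestamp for a snapshot created at `now`.
function expiresAt(now: Date, seconds: number = EXPIRY_SECONDS): string {
  return new Date(now.getTime() + seconds * 1000).toISOString();
}
```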

Security Considerations

  • Signed URLs: The function prefers signed URLs for retrieval to prevent unauthorized access to private SEO data.
  • Service Role Scoping: The service role is only used within the Edge Function to facilitate storage operations.
