SEO and Performance Self-Improvement Automation Workflow

This file is an implementation specification for an IDE coding agent working inside the repository for https://scottcoff.in.

The goal is to build a repeatable diagnose → improve → diagnose workflow for a Jekyll/GitHub Pages site using:

Local Lighthouse / Lighthouse CI for fast pre-deployment checks
PageSpeed Insights API for public deployed URL checks
Optional Google Search Console API for indexing/canonical/search-performance data
Optional GTmetrix API for an independent external performance check
Local HTML/SEO validation scripts for canonical URLs, metadata, sitemap, internal-link leakage, and JSON-LD syntax

This workflow should prioritize correctness, canonical identity, crawlability, accessibility, and stable layout over chasing small Lighthouse score changes.

0. Operating principles for the IDE agent

Follow these rules throughout the implementation.

Core rules

Make small, reviewable changes.
Run diagnostics before making improvements.
Make exactly one improvement per iteration unless explicitly instructed otherwise.
Rerun diagnostics after each improvement.
Compare before/after results.
Revert changes that break the build, degrade SEO/accessibility, or introduce visual regressions.
Do not remove content.
Do not remove analytics or third-party scripts unless there is explicit evidence they are unused and the user approves.
Do not make Search Console, DNS, indexing, or GTmetrix claims unless the corresponding external API/tool was actually run successfully.
Treat Lighthouse/PageSpeed performance scores as noisy. Prefer concrete audit failures and measurable byte/time savings.

SEO priority order for this site

Prioritize issues in this order:

Canonical/internal-link leakage from scottcoff.in to scottcoffin.github.io
Broken internal links
Missing or duplicate titles/descriptions
Missing canonical tags or wrong canonical tags
Sitemap/robots issues
Missing or invalid structured data
Accessibility issues
Missing image dimensions and layout shift risks
Unoptimized images
Render-blocking resources
Unused CSS/JavaScript
Third-party script cost
Marginal performance score improvements

Stop and ask for review if

Stop and request human review if any proposed fix requires:

Changing theme architecture substantially
Deleting existing content
Removing analytics or required theme scripts
Changing visual layout substantially
Adding a complex build pipeline
Storing secrets in the repo
Improving performance at the cost of accessibility or SEO
Adding high code complexity for less than 2% expected improvement

1. Initial repository inspection

First inspect the repo. Do not edit files in this step.

Agent task

Inspect the repository and report:

Whether this appears to be Jekyll/GitHub Pages
Whether these files exist:
- _config.yml
- Gemfile
- package.json
- _includes/head.html
- _layouts/default.html
- _data/navigation.yml
- robots.txt
- sitemap.xml
- SEO_CHECKLIST.md
Where these pages live:
- homepage
- /Research/
- /Data_Science/
- /Media/
- /Expertise/
- /CV/
Whether `{% seo %}` is used
Whether jekyll-seo-tag is configured
Whether jekyll-sitemap is configured
All hard-coded occurrences of:
- scottcoffin.github.io
- http://scottcoffin.github.io
- https://scottcoffin.github.io
Existing build commands from README, Gemfile, GitHub Actions, or package scripts

Required output

Return a concise file-by-file implementation plan before editing.

2. Add local Node-based diagnostic tooling

If no package.json exists, create one. If it exists, extend it conservatively.

Desired dev dependencies

Add these development dependencies:

{
  "@lhci/cli": "latest",
  "lighthouse": "latest",
  "http-server": "latest"
}

Only add concurrently or wait-on if genuinely useful.

Desired `package.json` scripts

Adapt commands to the actual repo if needed.

{
  "scripts": {
    "build:site": "bundle exec jekyll build",
    "serve:site": "npx http-server _site -p 4000",
    "lhci": "lhci autorun",
    "psi": "node scripts/pagespeed_insights.mjs",
    "audit:tasks": "node scripts/audit_to_tasks.mjs",
    "diagnose": "bash scripts/diagnose_seo_perf.sh",
    "diagnose:live": "RUN_PSI=1 bash scripts/diagnose_seo_perf.sh",
    "jsonld": "node scripts/validate_jsonld.mjs",
    "gsc": "node scripts/search_console_check.mjs",
    "gtmetrix": "node scripts/gtmetrix_check.mjs"
  },
  "devDependencies": {
    "@lhci/cli": "latest",
    "lighthouse": "latest",
    "http-server": "latest"
  }
}

Add `.gitignore` entries

Add report outputs to .gitignore unless the user explicitly wants to version reports:

reports/lighthouse/
reports/lhci/
reports/pagespeed/
reports/search-console/
reports/gtmetrix/
reports/*.json
reports/*.csv
reports/*.html
reports/*.log

Keep curated Markdown summaries trackable only if desired. If uncertain, ignore all generated reports and let the user decide.

3. Create Lighthouse CI configuration

Create:

lighthouserc.js

Use this as the starting configuration. Adjust paths only if the site uses different URL paths.

module.exports = {
  ci: {
    collect: {
      staticDistDir: './_site',
      url: [
        'http://localhost:4000/',
        'http://localhost:4000/Research/',
        'http://localhost:4000/Data_Science/',
        'http://localhost:4000/Media/',
        'http://localhost:4000/Expertise/'
      ],
      numberOfRuns: 3,
      settings: {
        formFactor: 'mobile',
        screenEmulation: {
          mobile: true,
          width: 390,
          height: 844,
          deviceScaleFactor: 3,
          disabled: false
        },
        throttlingMethod: 'simulate'
      }
    },
    assert: {
      assertions: {
        'categories:performance': ['warn', { minScore: 0.70 }],
        'categories:accessibility': ['error', { minScore: 0.90 }],
        'categories:best-practices': ['warn', { minScore: 0.90 }],
        'categories:seo': ['error', { minScore: 0.90 }]
      }
    },
    upload: {
      target: 'filesystem',
      outputDir: './reports/lhci'
    }
  }
};

If staticDistDir does not serve correctly in this repo, switch to a startServerCommand approach:

module.exports = {
  ci: {
    collect: {
      startServerCommand: 'bundle exec jekyll build && npx http-server _site -p 4000',
      startServerReadyPattern: 'Available on',
      url: [
        'http://localhost:4000/',
        'http://localhost:4000/Research/',
        'http://localhost:4000/Data_Science/',
        'http://localhost:4000/Media/',
        'http://localhost:4000/Expertise/'
      ],
      numberOfRuns: 3,
      settings: {
        formFactor: 'mobile',
        screenEmulation: {
          mobile: true,
          width: 390,
          height: 844,
          deviceScaleFactor: 3,
          disabled: false
        },
        throttlingMethod: 'simulate'
      }
    },
    assert: {
      assertions: {
        'categories:performance': ['warn', { minScore: 0.70 }],
        'categories:accessibility': ['error', { minScore: 0.90 }],
        'categories:best-practices': ['warn', { minScore: 0.90 }],
        'categories:seo': ['error', { minScore: 0.90 }]
      }
    },
    upload: {
      target: 'filesystem',
      outputDir: './reports/lhci'
    }
  }
};

Validation command

After setup, run:

npm run lhci

If this fails, fix the configuration before proceeding.

4. Create PageSpeed Insights API script

Create:

scripts/pagespeed_insights.mjs

Purpose: run Google PageSpeed Insights against deployed public URLs and save both raw JSON and summary CSV.

Requirements

Use Node.js built-in fetch.
If Node lacks global fetch, print a clear message requiring Node 18+.
Do not require an API key by default.
Support optional API key through:

PAGESPEED_API_KEY=...

Test these deployed URLs:

[
  'https://scottcoff.in/',
  'https://scottcoff.in/Research/',
  'https://scottcoff.in/Data_Science/',
  'https://scottcoff.in/Media/',
  'https://scottcoff.in/Expertise/'
]

Test both strategies:
- mobile
- desktop
Request categories:
- performance
- accessibility
- best-practices
- seo
Save raw JSON results under:

reports/pagespeed/YYYY-MM-DDTHH-mm-ss/

Create:

reports/pagespeed/latest-summary.csv
reports/pagespeed/latest-summary.md

Fields to extract into CSV

Use NA for missing values.

timestamp
url
strategy
performance_score
accessibility_score
best_practices_score
seo_score
lcp_ms
cls
inp_ms_or_na
tbt_ms
fcp_ms
speed_index_ms
total_byte_weight
render_blocking_savings_ms
unused_css_savings_bytes
unused_js_savings_bytes
image_savings_bytes
canonical_audit_score
viewport_audit_score
meta_description_audit_score
crawlable_anchors_score
http_status_code
final_url
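A minimal sketch of the field extraction, assuming the response carries a Lighthouse result under lighthouseResult as PageSpeed Insights v5 does. The names summarizePsi, toCsvLine, and pick are hypothetical. It covers only a subset of the columns above, and the audit IDs are assumptions that can vary between Lighthouse versions.

```javascript
// Hypothetical helper: return 'NA' for anything missing, as the spec requires.
function pick(value) {
  return value === undefined || value === null ? 'NA' : value;
}

// Flatten one PageSpeed Insights response into a summary row.
// Assumption: audit IDs below match the Lighthouse version behind the API.
function summarizePsi(response, strategy) {
  const lhr = response.lighthouseResult ?? {};
  const categories = lhr.categories ?? {};
  const audits = lhr.audits ?? {};
  const numeric = (id) => pick(audits[id]?.numericValue);
  const score = (id) => pick(categories[id]?.score);

  return {
    timestamp: pick(lhr.fetchTime),
    url: pick(lhr.requestedUrl),
    strategy,
    performance_score: score('performance'),
    accessibility_score: score('accessibility'),
    best_practices_score: score('best-practices'),
    seo_score: score('seo'),
    lcp_ms: numeric('largest-contentful-paint'),
    cls: numeric('cumulative-layout-shift'),
    tbt_ms: numeric('total-blocking-time'),
    fcp_ms: numeric('first-contentful-paint'),
    speed_index_ms: numeric('speed-index'),
    total_byte_weight: numeric('total-byte-weight'),
    canonical_audit_score: pick(audits['canonical']?.score),
    final_url: pick(lhr.finalDisplayedUrl ?? lhr.finalUrl),
  };
}

// Serialize a row as one CSV line, quoting values that contain commas or quotes.
function toCsvLine(row) {
  return Object.values(row)
    .map((value) => {
      const text = String(value);
      return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
    })
    .join(',');
}
```

Returning NA from a single helper keeps the CSV rectangular even when a page omits an audit.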

API endpoint pattern

Use:

https://www.googleapis.com/pagespeedonline/v5/runPagespeed

Parameters:

url=<encoded URL>
strategy=mobile|desktop
category=performance
category=accessibility
category=best-practices
category=seo
key=<optional API key>
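The parameter list above can be assembled with URLSearchParams, as in this sketch. The function name buildPsiUrl is hypothetical. The category values follow this specification; if the API rejects them, check the enum names in the official API reference.

```javascript
const PSI_ENDPOINT = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed';
const PSI_CATEGORIES = ['performance', 'accessibility', 'best-practices', 'seo'];

// Build one request URL. The category parameter is repeated once per category,
// and the key parameter is added only when an API key is supplied.
function buildPsiUrl(pageUrl, strategy, apiKey) {
  const params = new URLSearchParams({ url: pageUrl, strategy });
  for (const category of PSI_CATEGORIES) {
    params.append('category', category);
  }
  if (apiKey) {
    params.set('key', apiKey);
  }
  return `${PSI_ENDPOINT}?${params.toString()}`;
}
```

In the script, the key would come from process.env.PAGESPEED_API_KEY and may be undefined.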

Implementation notes

The script should:

Create report directories if missing.
Rate-limit requests modestly to avoid quota problems.
Retry once on transient HTTP 429/5xx.
Save raw API responses even if a page has a poor score.
Fail gracefully if a URL is unreachable.
Print a concise summary table to stdout.
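The retry rule above could look like the following sketch. The name fetchWithRetry, the injectable fetchImpl option, and the 2000 ms default delay are assumptions made for illustration and testability; they are not part of this specification.

```javascript
// Retry on HTTP 429 or 5xx, up to `retries` extra attempts (one by default).
// Network errors thrown by fetch are not caught here, so the caller can
// record the URL as unreachable and continue with the next one.
async function fetchWithRetry(url, options = {}) {
  const { fetchImpl = fetch, retries = 1, delayMs = 2000 } = options;
  for (let attempt = 0; ; attempt += 1) {
    const response = await fetchImpl(url);
    const transient = response.status === 429 || response.status >= 500;
    if (!transient || attempt >= retries) {
      return response;
    }
    // Wait before retrying to stay within quota.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}
```

The same delay can double as the modest rate limit between consecutive requests.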

5. Create report parser: audit-to-tasks

Create:

scripts/audit_to_tasks.mjs

Purpose: read latest Lighthouse/PageSpeed reports and turn them into actionable engineering tasks.

Inputs

Search for latest available reports from:

reports/lhci/
reports/lighthouse/
reports/pagespeed/

Outputs

Create:

reports/audit-summary.md
reports/audit-tasks.json

Extract from each report

For each URL:

Category scores:
- performance
- accessibility
- best practices
- SEO
Core metrics:
- LCP
- CLS
- TBT
- FCP
- Speed Index
- total byte weight
Failed SEO audits
Failed accessibility audits
Failed best-practices audits
High-impact performance opportunities

Prioritize performance opportunities

Order issues by:

Largest Contentful Paint
render-blocking resources
unoptimized images
missing image width/height
cumulative layout shift
unused CSS
unused JavaScript
total byte weight
third-party scripts
font loading issues
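One way to apply this ordering is a lookup table of audit IDs, sketched below. The mapping from each item to a Lighthouse audit ID is an assumption (IDs change between Lighthouse versions), and prioritizeTasks is a hypothetical name.

```javascript
// Assumed audit IDs, listed in the priority order given above.
const AUDIT_PRIORITY = [
  'largest-contentful-paint',
  'render-blocking-resources',
  'uses-optimized-images',
  'unsized-images',
  'cumulative-layout-shift',
  'unused-css-rules',
  'unused-javascript',
  'total-byte-weight',
  'third-party-summary',
  'font-display',
];

// Unknown audit IDs sort after every known one.
function priorityRank(auditId) {
  const index = AUDIT_PRIORITY.indexOf(auditId);
  return index === -1 ? AUDIT_PRIORITY.length : index;
}

// Return a new array; the input is left untouched.
function prioritizeTasks(tasks) {
  return [...tasks].sort((a, b) => priorityRank(a.auditId) - priorityRank(b.auditId));
}
```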

Each generated task must include

{
  "page": "string",
  "source": "lighthouse|pagespeed|custom",
  "category": "seo|performance|accessibility|best-practices",
  "auditId": "string",
  "title": "string",
  "description": "string",
  "score": "number|null",
  "numericSavings": "object|null",
  "affectedAssets": ["string"],
  "likelySourceFiles": ["string"],
  "recommendedFix": "string",
  "risk": "low|medium|high",
  "canAttemptAutomatically": true
}

Risk classification

Use these default risk rules:

Low risk

Fixing canonical URLs
Fixing internal links
Adding missing meta descriptions
Adding missing alt text
Adding width/height to images
Optimizing images without changing rendered dimensions
Fixing typos
Adding loading="lazy" to below-the-fold images

Medium risk

Deferring scripts
Moving CSS/JS
Changing image formats in templates
Modifying navigation templates
Adding JSON-LD sitewide

High risk

Removing scripts/styles
Rewriting theme layout
Introducing complex build tooling
Changing visual design
Removing content

Markdown summary format

reports/audit-summary.md should include:

Timestamp
Pages tested
Score summary table
Top 10 actionable tasks
SEO/canonical warnings
Accessibility blockers
Performance opportunities
Recommended next single fix

6. Create local SEO/HTML validation script

Create:

scripts/local_seo_check.mjs

Purpose: inspect built HTML directly for site-specific SEO problems that Lighthouse may not fully catch.

Build expectation

This script assumes _site exists. The diagnose script will build before running it.

Checks

For these generated files if present:

_site/index.html
_site/Research/index.html
_site/Data_Science/index.html
_site/Media/index.html
_site/Expertise/index.html

Check:

Exactly one <h1> or a clearly acceptable theme equivalent
<title> exists
<meta name="description"> exists
<link rel="canonical"> exists
Canonical starts with https://scottcoff.in
No canonical contains scottcoffin.github.io
Internal links do not point to scottcoffin.github.io
No obviously broken internal links to missing _site paths
Images have alt attributes
Images have width and height where feasible
robots.txt exists
sitemap.xml exists
sitemap URLs use https://scottcoff.in
sitemap does not use scottcoffin.github.io
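A subset of the per-page checks above can be sketched with regular expressions. This is a rough illustration, not a full HTML parser: checkPageHtml is a hypothetical name, and a real script may prefer a proper parser if the theme emits unusual markup.

```javascript
const CANONICAL_ORIGIN = 'https://scottcoff.in';

// Return a list of human-readable problems for one built HTML page.
function checkPageHtml(html) {
  const problems = [];

  const h1Count = (html.match(/<h1[\s>]/gi) ?? []).length;
  if (h1Count !== 1) problems.push(`expected 1 <h1>, found ${h1Count}`);

  if (!/<title>[^<]+<\/title>/i.test(html)) problems.push('missing <title>');

  if (!/<meta\s[^>]*name=["']description["'][^>]*>/i.test(html)) {
    problems.push('missing meta description');
  }

  // Locate the canonical link tag first, then read its href attribute.
  const canonicalTag = html.match(/<link\s[^>]*rel=["']canonical["'][^>]*>/i)?.[0];
  const canonicalHref = canonicalTag?.match(/href=["']([^"']+)["']/i)?.[1];
  if (!canonicalHref) {
    problems.push('missing canonical');
  } else if (!canonicalHref.startsWith(`${CANONICAL_ORIGIN}/`)) {
    problems.push(`canonical is ${canonicalHref}`);
  }

  return problems;
}
```

An empty array means the page passed these checks; the script can treat a wrong canonical as critical and the rest as warnings.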

Outputs

Write:

reports/local-seo-check.md
reports/local-seo-check.json

Add a package script:

"seo:check": "node scripts/local_seo_check.mjs"

7. Create JSON-LD syntax validator

Create:

scripts/validate_jsonld.mjs

Purpose: validate local JSON-LD syntax without claiming Google rich-result eligibility.

Requirements

Scan built HTML files under _site/.
Extract all blocks:

<script type="application/ld+json">
...
</script>

Parse each block as JSON.
Report:
- file
- number of JSON-LD blocks
- parse success/failure
- error message if invalid
Save:

reports/jsonld-validation.md
reports/jsonld-validation.json
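The extraction and parsing steps above can be sketched as a pure function over one HTML string. The name validateJsonLd and the shape of its return value are assumptions for illustration.

```javascript
// Find every JSON-LD script block in an HTML string and try to parse it.
// This checks JSON syntax only, not schema.org validity.
function validateJsonLd(html) {
  const pattern = /<script\b[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks = [];
  for (const match of html.matchAll(pattern)) {
    try {
      JSON.parse(match[1]);
      blocks.push({ ok: true, error: null });
    } catch (error) {
      blocks.push({ ok: false, error: error.message });
    }
  }
  return {
    blockCount: blocks.length,
    valid: blocks.every((block) => block.ok),
    blocks,
  };
}
```

A page with no JSON-LD blocks reports blockCount 0 and valid true, so absence is not treated as a syntax failure.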

Important limitation

The script validates JSON syntax only. It must not claim that a page is eligible for Google rich results. Google Rich Results Test remains a manual check.

8. Create full diagnostic shell script

Create:

scripts/diagnose_seo_perf.sh

Make it executable if possible.

Behavior

Stop on errors:

set -euo pipefail

Create report directories:

mkdir -p reports reports/lhci reports/pagespeed

Print environment info:
- Node version
- npm version
- Ruby version
- Bundler version if available
Build site:

bundle exec jekyll build

Run local SEO check:

node scripts/local_seo_check.mjs

Run JSON-LD validation:

node scripts/validate_jsonld.mjs

Run Lighthouse CI:

npx lhci autorun

If RUN_PSI=1, run:

node scripts/pagespeed_insights.mjs

Run audit-to-tasks:

node scripts/audit_to_tasks.mjs

Print final locations:

reports/audit-summary.md
reports/audit-tasks.json
reports/local-seo-check.md
reports/jsonld-validation.md
reports/lhci/
reports/pagespeed/latest-summary.md

Script success criteria

The script should exit nonzero if:

Jekyll build fails
Local SEO check finds critical canonical leakage
JSON-LD is invalid
Lighthouse CI assertion fails for SEO or accessibility

The script should not fail solely for performance warnings.

9. Optional: Google Search Console API script

Only implement this after Search Console ownership verification is complete.

Create:

scripts/search_console_check.mjs

Purpose

Query Google Search Console for:

URL Inspection data
Search Analytics query/page data
Sitemap status if feasible

Credentials

Do not hard-code credentials.

Support:

Google Application Default Credentials, or
OAuth flow documented in comments, or
Service account only if the Search Console property grants the service account access

If credentials are missing, exit with a clear setup message.

Environment variables

GSC_SITE_URL="sc-domain:scottcoff.in"
GSC_DAYS=28

Default to:

sc-domain:scottcoff.in

URLs to inspect

[
  'https://scottcoff.in/',
  'https://scottcoff.in/Research/',
  'https://scottcoff.in/Data_Science/',
  'https://scottcoff.in/Media/',
  'https://scottcoff.in/Expertise/'
]

Outputs

reports/search-console/url-inspection-latest.json
reports/search-console/search-analytics-latest.csv
reports/search-console/search-console-summary.md

URL Inspection output should summarize

Inspection URL
Coverage/indexing state
Google-selected canonical
User-declared canonical
Last crawl time if available
Crawl status
Robots/indexing issues if present
Mobile usability if available

Search Analytics output should include

For the past GSC_DAYS:

top queries
top pages
clicks
impressions
CTR
average position

Classify queries as:

branded
topical
other

Use simple rules:

Branded if query contains scott, coffin, scott coffin, plastiverse
Topical if query contains microplastic, pfas, toxicology, toxicokinetic, risk assessment, environmental toxicology
Otherwise other
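These rules translate directly into a small function. The sketch below assumes case-insensitive substring matching and checks branded terms first; classifyQuery is a hypothetical name.

```javascript
const BRANDED_TERMS = ['scott', 'coffin', 'scott coffin', 'plastiverse'];
const TOPICAL_TERMS = [
  'microplastic',
  'pfas',
  'toxicology',
  'toxicokinetic',
  'risk assessment',
  'environmental toxicology',
];

// Branded wins over topical when a query matches both lists.
function classifyQuery(query) {
  const normalized = query.toLowerCase();
  if (BRANDED_TERMS.some((term) => normalized.includes(term))) return 'branded';
  if (TOPICAL_TERMS.some((term) => normalized.includes(term))) return 'topical';
  return 'other';
}
```

Substring matching is deliberately simple, so a query such as "scottsdale water" would count as branded; tighten the rule only if that becomes a problem in practice.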

Add package script:

"gsc": "node scripts/search_console_check.mjs"

Do not include this in default npm run diagnose.

10. Optional: GTmetrix API script

Only implement this if the user wants an independent external performance service.

Create:

scripts/gtmetrix_check.mjs

Credentials

Do not hard-code credentials.

Use:

GTMETRIX_API_KEY=...

If missing, exit with clear setup instructions.

Test URLs

[
  'https://scottcoff.in/',
  'https://scottcoff.in/Research/'
]

Outputs

reports/gtmetrix/raw/
reports/gtmetrix/summary.md

Summary should include

URL
test status
Lighthouse performance score if available
Web Vitals if available
page weight
request count
top issues if available
report URL if provided by API

Add package script:

"gtmetrix": "node scripts/gtmetrix_check.mjs"

Do not include this in default npm run diagnose.

11. Main improvement-loop prompt for the IDE agent

After the tooling exists and npm run diagnose works, use this process.

Agent instruction

Use the latest:

reports/audit-summary.md
reports/audit-tasks.json
reports/local-seo-check.md
reports/jsonld-validation.md

Choose exactly one low-risk, high-impact improvement.

Selection rules

Prefer issues affecting /Research/ first.

Use this order:

Canonical/internal link problems
Missing or duplicate meta descriptions
Broken internal links
Missing image width/height causing CLS
Unoptimized images
Render-blocking local scripts/styles
Unused CSS/JS
Accessibility issues
Best-practices issues
Performance-only score improvements

Improvement workflow

For the selected issue:

State the single issue selected and why.
Identify source files to modify.
Make the minimal fix.
Run local build.
Run:

npm run diagnose

Compare before/after:
- performance score
- accessibility score
- best-practices score
- SEO score
- LCP
- CLS
- TBT
- failed audit count
- local SEO check warnings/errors
If the fix worsened the page, broke build, or created SEO/accessibility regressions, revert it.
If successful, report:
- files changed
- diff summary
- before/after metrics
- next recommended single issue
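The before/after comparison can be reduced to a small decision function. In this sketch the metric key names and compareRuns are hypothetical, and the rule that a performance-score change alone never triggers a revert is an assumption drawn from the guidance to treat performance scores as noisy.

```javascript
const COMPARED_KEYS = [
  'performance', 'accessibility', 'bestPractices', 'seo',
  'lcpMs', 'cls', 'tbtMs', 'failedAudits', 'localSeoErrors',
];

// Compute after-minus-before deltas and decide whether to revert.
function compareRuns(before, after) {
  const deltas = {};
  for (const key of COMPARED_KEYS) {
    deltas[key] = after[key] - before[key];
  }

  const reasons = [];
  if (!after.buildOk) reasons.push('build failed');
  if (deltas.seo < 0) reasons.push('SEO score dropped');
  if (deltas.accessibility < 0) reasons.push('accessibility score dropped');
  if (deltas.localSeoErrors > 0) reasons.push('new local SEO errors');

  return { deltas, revert: reasons.length > 0, reasons };
}
```

The reasons array can go straight into the iteration report next to the deltas.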

Stop after

Stop as soon as any of these is true:

3 improvements have succeeded
Remaining issues are medium/high risk
Remaining improvements are marginal and complex
Build/diagnostics cannot run reliably

12. Canonical-domain improvement task

This task is especially important for this site.

Agent instruction

Audit and fix canonical domain leakage.

Problem to look for:

The public site should consistently use:

https://scottcoff.in

Internal navigation, canonical tags, feeds, sitemap, footer links, and social/profile site links should not point to:

https://scottcoffin.github.io

except where intentionally linking to GitHub-hosted source code or a GitHub profile/repository.

Steps

Search repo for:

grep -RIn -F --exclude-dir=.git --exclude-dir=_site --exclude-dir=node_modules "scottcoffin.github.io" .

Replace internal/canonical/navigation/feed links with:

https://scottcoff.in

Do not replace intentional GitHub source-code repository links.
Build site.
Search built output:

grep -RIn -F "scottcoffin.github.io" _site || true

Explain each remaining occurrence.

Success criteria

_site contains no unwanted internal links to scottcoffin.github.io.
All canonical tags use https://scottcoff.in.
Sitemap uses https://scottcoff.in.
Internal navigation links use relative URLs or https://scottcoff.in.
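To explain each remaining occurrence, it helps to record where it sits and what kind of reference it is. The sketch below works on the text of one file; findDomainLeaks and the kind labels are hypothetical, and the classification is a line-level heuristic.

```javascript
const LEGACY_HOST = 'scottcoffin.github.io';

// List every line that mentions the legacy host, with a rough classification.
function findDomainLeaks(text, file = '(unknown)') {
  const leaks = [];
  text.split(/\r?\n/).forEach((line, index) => {
    if (!line.toLowerCase().includes(LEGACY_HOST)) return;
    let kind = 'text';
    if (/rel=["']canonical["']/i.test(line)) kind = 'canonical';
    else if (/<loc>/i.test(line)) kind = 'sitemap';
    else if (/href=/i.test(line)) kind = 'link';
    leaks.push({ file, line: index + 1, kind, text: line.trim() });
  });
  return leaks;
}
```

Canonical and sitemap hits are the critical ones; link and text hits still need a human-readable justification if they stay.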

13. Research page improvement task

Prioritize /Research/ because it is a high-value SEO page.

Agent instruction

Improve /Research/ while preserving publication content.

Required checks

Page has a unique title:
- Research | Scott Coffin, PhD
Page has a unique meta description:
- Research by Scott Coffin, PhD on microplastics in drinking water, ecotoxicology, PFAS, New Approach Methodologies, toxicokinetics, computational toxicology, and regulatory risk assessment.
Canonical URL is:
- https://scottcoff.in/Research/
H1 exists and is clear:
- Research
Add or preserve a short intro describing:
- microplastics risk assessment
- drinking water
- PFAS toxicokinetics
- computational toxicology / NAMs
- regulatory science
Fix obvious typos and broken DOI links.
Do not remove publication entries.

Known typo fixes to check

Replace:

little or not toxicology testing

with:

little or no toxicology testing

Replace:

https.://doi.org/

with:

https://doi.org/

Replace malformed concatenations such as:

ShareMicroplastics

with:

Share Microplastics
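The literal replacements above can be applied and counted in one pass, as in this sketch. applyKnownTypoFixes is a hypothetical name, and only the exact strings listed here are handled; other malformed concatenations still need manual review.

```javascript
// Pairs of [wrong, right], copied from the list above.
const KNOWN_FIXES = [
  ['little or not toxicology testing', 'little or no toxicology testing'],
  ['https.://doi.org/', 'https://doi.org/'],
  ['ShareMicroplastics', 'Share Microplastics'],
];

// Replace each known typo literally (no regex) and report what changed.
function applyKnownTypoFixes(text) {
  let fixed = text;
  const applied = [];
  for (const [wrong, right] of KNOWN_FIXES) {
    const parts = fixed.split(wrong);
    if (parts.length > 1) {
      fixed = parts.join(right);
      applied.push({ wrong, right, count: parts.length - 1 });
    }
  }
  return { fixed, applied };
}
```

Reporting the count per replacement makes the resulting diff easy to review.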

14. Image optimization task

Agent instruction

Optimize images conservatively.

Steps

Identify images loaded on /Research/, especially author/profile/sidebar images.
Record original:
- file path
- dimensions
- byte size
- rendered size
If an image is much larger than rendered size, create optimized derivatives.
Prefer WebP if the site/theme can support fallback safely.
Preserve visual appearance.
Add width and height attributes where feasible.
Add meaningful alt text.
Use loading="lazy" only for below-the-fold images.
Do not lazy-load above-the-fold author/profile/hero images if they contribute to LCP.
Build and diagnose again.
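Before and after each image change, the built HTML can be scanned for missing attributes. This is a regex-based sketch; findImageIssues is a hypothetical name, and an empty alt attribute counts as present because it is valid for decorative images.

```javascript
// Report <img> tags that lack alt, width, or height attributes.
function findImageIssues(html) {
  const issues = [];
  for (const [tag] of html.matchAll(/<img\b[^>]*>/gi)) {
    // Require whitespace before the name so data-alt or srcset do not match.
    const hasAttr = (name) => new RegExp(`\\s${name}\\s*=`, 'i').test(tag);
    const src = tag.match(/\ssrc\s*=\s*["']([^"']*)["']/i)?.[1] ?? '(no src)';
    const missing = ['alt', 'width', 'height'].filter((name) => !hasAttr(name));
    if (missing.length > 0) {
      issues.push({ src, missing });
    }
  }
  return issues;
}
```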

Success criteria

Image byte weight reduced.
No broken images.
CLS does not worsen.
LCP does not worsen.
Accessibility does not worsen.

15. Render-blocking resources task

Agent instruction

Reduce render-blocking resources conservatively.

Steps

Identify local CSS and JS loaded on /Research/.
Identify whether JavaScript can safely use defer.
Do not defer scripts required before rendering.
Do not remove scripts unless clearly unused and low risk.
Avoid complex critical-CSS tooling.
Prefer minification if already supported by Jekyll/theme.
Build and diagnose again.

Success criteria

No visual regression.
No JavaScript errors in browser console if checked.
Lighthouse render-blocking opportunity improves or remains stable.
Accessibility and SEO remain passing.

16. Accessibility task

Agent instruction

Fix accessibility failures before marginal performance score issues.

Common low-risk fixes

Add missing alt text.
Fix empty links/buttons.
Improve ambiguous link text.
Ensure heading order is logical.
Ensure icons have accessible names.
Avoid using color alone to communicate meaning.

Do not

Redesign layout without review.
Remove content.
Hide content from assistive technologies unless clearly decorative.

17. Structured data task

Agent instruction

Validate JSON-LD syntax locally and add structured data conservatively.

Rules

Use scripts/validate_jsonld.mjs for syntax.
Do not claim Google rich-result eligibility from local syntax validation.
Use Google Rich Results Test manually for deployed pages.
Do not invent publication metadata.
Prefer page-level Person, ProfilePage, or WebPage JSON-LD over marking up every publication unless data are complete and accurate.

Recommended structured data targets

Homepage:

ProfilePage
Person

Research page:

WebPage
Person as author/mainEntity
about topics:
- Microplastics
- Environmental toxicology
- PFAS
- Risk assessment
- Toxicokinetics
- Computational toxicology
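As an illustration of the Research page target, the sketch below builds a WebPage object with a Person author and the about topics listed above, then serializes it into a script tag. The function names are hypothetical, the property choices are one reasonable reading of schema.org rather than a confirmed design for this site, and only values that appear in this specification are used.

```javascript
const RESEARCH_TOPICS = [
  'Microplastics',
  'Environmental toxicology',
  'PFAS',
  'Risk assessment',
  'Toxicokinetics',
  'Computational toxicology',
];

// Build the JSON-LD object for /Research/. No publication metadata is included.
function buildResearchJsonLd() {
  return {
    '@context': 'https://schema.org',
    '@type': 'WebPage',
    name: 'Research | Scott Coffin, PhD',
    url: 'https://scottcoff.in/Research/',
    author: { '@type': 'Person', name: 'Scott Coffin', url: 'https://scottcoff.in/' },
    about: RESEARCH_TOPICS.map((name) => ({ '@type': 'Thing', name })),
  };
}

// Serialize for embedding in HTML; "<" is escaped so the data cannot close the tag.
function toScriptTag(data) {
  const json = JSON.stringify(data, null, 2).replace(/</g, '\\u003c');
  return `<script type="application/ld+json">\n${json}\n</script>`;
}
```

In a Jekyll site the resulting JSON would normally live in an include or layout; the output still needs the syntax check and a manual Rich Results Test.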

18. Search Console workflow after deployment

This requires site ownership verification first.

Manual prerequisites

Verify scottcoff.in as a Domain property in Google Search Console using DNS TXT.
Submit:

https://scottcoff.in/sitemap.xml

Use URL Inspection for:
- https://scottcoff.in/
- https://scottcoff.in/Research/
- https://scottcoff.in/Data_Science/
- https://scottcoff.in/Media/
- https://scottcoff.in/Expertise/

API diagnostics

After credentials are configured, run:

npm run gsc

What to check

Google-selected canonical equals user-declared canonical.
Indexed status is valid or improving.
Sitemap is discovered.
Search Analytics shows impressions for:
- Scott Coffin
- Scott Coffin PhD
- Scott Coffin microplastics
- microplastics risk assessment
- environmental toxicologist
- PFAS toxicokinetics

19. Live PageSpeed workflow after deployment

After deploying changes to GitHub Pages/custom domain, run:

npm run psi

or:

npm run diagnose:live

Compare live vs local

If local Lighthouse improves but PageSpeed does not:

Confirm deployment actually includes the changes.
Check whether PageSpeed tested the correct final URL.
Check redirects.
Check whether server/CDN caching is stale.
Wait and rerun.
Do not immediately undo local improvements if HTML/source confirms they are correct.

Treat PSI field data carefully

If PageSpeed says insufficient real-user data are available, rely on lab diagnostics and Search Console until field data accumulate.

20. Optional GitHub Actions workflow

Only add CI after local diagnostics run reliably.

Create:

.github/workflows/seo-performance.yml

Suggested workflow:

name: SEO and Performance Diagnostics

on:
  pull_request:
  workflow_dispatch:

jobs:
  seo-performance:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Ruby
        uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true

      - name: Set up Node
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - name: Install npm dependencies
        run: npm install

      - name: Run diagnostics
        run: npm run diagnose

      - name: Upload reports
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: seo-performance-reports
          path: reports/

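Do not include PageSpeed Insights, Search Console, or GTmetrix in PR CI unless credentials/secrets and rate limits are configured safely.

The cache: npm setting in actions/setup-node needs a committed lockfile such as package-lock.json. Commit the lockfile, or remove the cache line if the repo does not track one.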

21. Final implementation report format

After implementing the workflow, report:

# SEO/performance automation implementation report

## Files changed

| File | Change |
|---|---|

## Commands added

| Command | Purpose |
|---|---|

## Diagnostics run

| Command | Result |
|---|---|

## Current baseline

| Page | Performance | Accessibility | Best practices | SEO | LCP | CLS | TBT |
|---|---:|---:|---:|---:|---:|---:|---:|

## Critical issues

## Low-risk next fixes

## Medium/high-risk issues requiring review

## Manual follow-up

- [ ] Verify Domain property in Google Search Console
- [ ] Submit sitemap
- [ ] Run URL Inspection
- [ ] Run Google Rich Results Test manually
- [ ] Run live PageSpeed after deployment

22. Quick-start command sequence

After implementation, the normal workflow should be:

npm install
npm run diagnose

After deployment:

npm run psi

After Search Console credentials are configured:

npm run gsc

For one improvement iteration, instruct the IDE agent:

Use reports/audit-tasks.json and reports/audit-summary.md to select exactly one low-risk, high-impact issue affecting /Research/. Make the minimal fix, rerun npm run diagnose, compare before/after metrics, and revert if SEO/accessibility/build status worsens.

23. Expected repo additions

At minimum, this implementation should add or modify:

package.json
lighthouserc.js
scripts/pagespeed_insights.mjs
scripts/audit_to_tasks.mjs
scripts/local_seo_check.mjs
scripts/validate_jsonld.mjs
scripts/diagnose_seo_perf.sh
.gitignore

Optional additions:

scripts/search_console_check.mjs
scripts/gtmetrix_check.mjs
.github/workflows/seo-performance.yml
SEO_CHECKLIST.md

24. Notes on external tooling

Lighthouse / Lighthouse CI

Use local Lighthouse and Lighthouse CI for repeatable pre-deployment diagnostics. They are suitable for an IDE agent because they can run against the locally built site and produce machine-readable reports.

PageSpeed Insights API

Use PageSpeed Insights API for deployed public URLs. It can be automated and can return Lighthouse-based lab diagnostics and, when available, field data.

Google Search Console API

Use only after site ownership is verified. This is the best programmatic source for Google-selected canonicals, indexing status, sitemap status, and actual search query/page performance.

GTmetrix API

Use optionally as a second external performance opinion. It requires API credentials and should not be part of the default local loop.

Rich Results Test

Use manually for Google-specific structured-data eligibility. Locally, only validate JSON-LD syntax.

25. Definition of done

The workflow is complete when:

npm run diagnose builds the site and produces reports.
reports/audit-summary.md exists.
reports/audit-tasks.json exists.
Local SEO checks detect canonical leakage if introduced.
JSON-LD syntax validation works.
Lighthouse CI runs against the key pages.
npm run psi can test deployed URLs.
The IDE agent can make one improvement, rerun diagnostics, and compare before/after.
The workflow avoids secrets in the repo.
The workflow does not require manual browser use except for Search Console setup and Rich Results Test verification.
