Live-site checks
The categories of checks SiteCMD runs against your URL, what each one looks for, and why it matters.
The live-site engine runs hundreds of checks against your URL on every scan. This page is a reference for what those checks cover, organized by category.
For the engine architecture, see How a scan works. For the source-folder audit, see Source audit.
Categories at a glance
| Category | What it covers |
|---|---|
| Security | Headers, SSL/TLS, exposed files, CSRF/XSS protections, dependency CVEs surfaced by your site’s HTTP responses. |
| Performance | Compression, caching, render-blocking resources, response times, image sizing. Optionally Core Web Vitals via headless Chrome. |
| SEO | Meta tags, structured data, canonical URLs, sitemap presence, robots directives, indexability. |
| Accessibility | Heading hierarchy, ARIA usage, color contrast, form labels, alt text. Optionally axe-core deep scan. |
| Compliance | Cookie banners, privacy policy presence, GDPR signals, tracker disclosure. |
| Polish | The vibe-code signals: AI-aesthetic patterns, default favicons, missing Open Graph tags, source maps shipped to production, framework boilerplate left in. |
| Config | Operational gaps surfaced from your live site: missing security headers in unusual places, deploys without redirects, CDN behavior. |
A single scan runs every category unless you’ve passed a focused scan_type (security, accessibility, polish). Most users run the full pass.
Security
This is usually the highest-impact category for newly launched sites. SiteCMD checks:
- HTTPS and SSL/TLS configuration. Certificate validity, redirect from http:// to https://, HSTS, modern cipher suites.
- Security headers. Content-Security-Policy, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, Permissions-Policy, Cross-Origin-* headers.
- Exposed files. /.env, /.git/, /.DS_Store, /backup.zip, common admin endpoints, source maps that leak source code.
- CORS. Overly permissive Access-Control-Allow-Origin values, especially * paired with credentials.
- Server fingerprinting. Version banners that leak the exact server, framework, or runtime version (Apache 2.4.41, PHP 7.4.3, etc.).
- TLS subdomain coverage. Whether common subdomains (www., mail., etc.) have the same TLS posture as your main domain.
Most of these are confirmed checks: the header is either present or not. A few are needs-review (e.g. “your CSP allows unsafe-inline, this might be intentional, review”).
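As a rough illustration of the confirmed versus needs-review split, the sketch below classifies a set of response headers. The function name, the finding strings, and the exact rule set are assumptions made for this example; they are not SiteCMD's actual implementation.

```python
# Hypothetical sketch: names and rules are illustrative, not SiteCMD internals.
EXPECTED_HEADERS = [
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
    "Permissions-Policy",
]


def classify_security_headers(headers):
    """Return (confirmed, needs_review) lists of finding strings."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Confirmed findings: the header is simply absent.
    confirmed = [
        f"missing {name}"
        for name in EXPECTED_HEADERS
        if name.lower() not in normalized
    ]

    # Needs-review findings: present, but possibly weaker than intended.
    needs_review = []
    if "'unsafe-inline'" in normalized.get("content-security-policy", ""):
        needs_review.append("CSP allows 'unsafe-inline'")
    return confirmed, needs_review
```

A missing header is a yes/no fact, so it lands in the first list. A permissive CSP directive may be deliberate, so it is only flagged for review.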
Performance
What you’d find with a Lighthouse-style scan, with some additions:
- Response time and TTFB. From the time SiteCMD’s request goes out to the first byte coming back.
- Compression. Whether gzip or Brotli is actually applied to the responses that should be compressed.
- Caching headers. Cache-Control, Expires, ETag, and whether they’re set appropriately for static vs. dynamic content.
- Render-blocking resources. Synchronous scripts in <head>, blocking stylesheets, non-async font loads.
- Image hygiene. Images without dimensions (causes layout shift), modern formats not in use (no WebP/AVIF), oversized images served at small render sizes.
- HTTP/2 and HTTP/3. Protocol version, whether keep-alive is working.
If you run sitecmd scan --cwv from the CLI, the engine also measures Largest Contentful Paint, First Input Delay, Cumulative Layout Shift, Time to First Byte, and Interaction to Next Paint via a headless Chrome instance. These take 5-15s per page and aren’t on by default.
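To make the render-blocking check concrete, here is a minimal sketch that finds synchronous scripts in <head> using Python's standard html.parser. The class and function names are hypothetical, and the real engine may apply different or additional rules.

```python
from html.parser import HTMLParser


class RenderBlockingFinder(HTMLParser):
    """Collects external scripts in <head> that lack async, defer, or type=module."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.blocking = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)  # boolean attributes appear with a value of None
        if tag == "head":
            self.in_head = True
        elif tag == "script" and self.in_head and "src" in attr_map:
            non_blocking = (
                "async" in attr_map
                or "defer" in attr_map
                or attr_map.get("type") == "module"  # module scripts defer by default
            )
            if not non_blocking:
                self.blocking.append(attr_map["src"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_blocking_scripts(html):
    finder = RenderBlockingFinder()
    finder.feed(html)
    return finder.blocking
```

Scripts in <body> are ignored here because only scripts in <head> are named in the list above.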
SEO
The basics that any indexable site needs:
- <title> and meta description. Present, length reasonable, not duplicated across pages.
- Canonical URL. Self-referential where it should be, points elsewhere where it should.
- Open Graph and Twitter Card tags. For sites that get shared on social.
- Structured data. JSON-LD presence and validity for common types (Organization, Product, Article).
- Sitemap. /sitemap.xml present, valid, references URLs that actually return 200.
- Robots directives. /robots.txt consistent with what <meta name="robots"> claims on individual pages.
- HTTP status hygiene. No accidental 404s, redirect chains, soft-404s.
- Indexability signals. Pages that should be indexed are indexable; staging-flavor noindex tags don’t leak into production.
SEO findings often correlate with Search Console impressions data. When Core or above is active, the two surface together as combined issues.
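One robots inconsistency is easy to show in code: a page that carries a noindex meta tag but is also disallowed in /robots.txt, so crawlers never fetch the page and never see the tag. The sketch below uses Python's standard urllib.robotparser. The function name and the shape of the pages input are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser


def robots_conflicts(robots_txt, pages, user_agent="*"):
    """Return URLs whose noindex tag is hidden behind a robots.txt disallow.

    pages maps each URL to the content of its <meta name="robots"> tag,
    or None when the page has no such tag (hypothetical input shape).
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    conflicts = []
    for url, meta in pages.items():
        has_noindex = meta is not None and "noindex" in meta.lower()
        if has_noindex and not parser.can_fetch(user_agent, url):
            conflicts.append(url)
    return conflicts
```

Example use, with made-up URLs:

```python
robots = "User-agent: *\nDisallow: /private/\n"
pages = {
    "https://example.com/private/a": "noindex, nofollow",
    "https://example.com/public": "noindex",
}
print(robots_conflicts(robots, pages))
```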
Accessibility
Both quick-pass checks and (optionally) a deeper axe-core run:
- Heading hierarchy. <h1> present, levels don’t skip (<h2> followed by <h4>), only one <h1> per page.
- Language declaration. <html lang> set.
- Image alt text. Either present or explicitly empty for decorative images.
- Form labels. Every input has a programmatic label.
- Color contrast. WCAG AA contrast ratios on text and interactive elements.
- ARIA usage. aria-* attributes used correctly (not on elements they don’t apply to, not duplicating native semantics).
- Interactive element semantics. Clickable <div>s vs. proper <button> and <a>.
The axe-core deep scan (available on Core and above) catches additional accessibility issues that need DOM-level analysis.
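The color contrast check rests on the WCAG contrast-ratio formula, which the sketch below implements for six-digit hex colors. The function names are illustrative. The thresholds are the WCAG AA values: 4.5:1 for normal text and 3:1 for large text.

```python
def _linearize(channel):
    # Convert an 8-bit sRGB channel to linear light, per the WCAG definition.
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground, background):
    # The lighter color always goes in the numerator.
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)
```

For example, passes_aa("#767676", "#ffffff") holds, while the slightly lighter #777777 on white falls just under 4.5:1 and only passes as large text.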
Compliance
Signals that affect GDPR, CCPA, and similar:
- Cookie consent. Banner present, dismissable, respects “decline”.
- Privacy policy link. Present in footer or somewhere reachable from the homepage.
- Tracker disclosure. Third-party scripts (analytics, advertising, social) disclosed.
- Sensitive data exposure. Forms that collect personal information served over HTTPS, with appropriate autocomplete attributes.
Compliance is a low-weight category by default (10% of the raw category score). It’s not a substitute for actual legal review.
Polish
This is the category that catches the patterns of “this site was vibe-coded and shipped fast.” Specifically:
- CSS architecture. Inline style density, Tailwind class density, missing CSS architecture, utility-to-custom ratio.
- HTML quality. Div soup ratio, heading hierarchy, form accessibility, <button> vs clickable <div>, missing lang.
- Copy & content. Em-dash density (AI tell), AI buzzword dictionary, AI header formulas (“Embark on a journey”, “In today’s fast-paced…”), inclusive framing, emoji-as-icons, the rule-of-three columns layout.
- AI aesthetic. Gradient backgrounds, glassmorphism, scroll animations, excessive border-radius, glow shadows, floating blob decorations.
- Meta & infrastructure. Default page titles (“Untitled”, “Document”), missing OG tags, default favicons, missing /sitemap.xml and /robots.txt, source maps in production, console.log in production.
- Framework defaults. Default deployment subdomain still in use, boilerplate HTML left in, default error pages.
A high vibe-probability score isn’t necessarily bad. It’s a signal that the site looks like one of thousands of AI-generated landing pages. Some of those signals are visual style choices; others (default favicon, missing OG tags, source maps in prod) are genuine issues to fix.
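Two of the signals above, div soup ratio and em-dash density, are simple enough to sketch. The definitions used here (share of start tags that are <div>, and em-dashes per 100 words of text) are assumptions for this example. How SiteCMD actually measures or weights these signals is not specified on this page.

```python
from html.parser import HTMLParser


class _TagCounter(HTMLParser):
    """Counts start tags and collects text nodes."""

    def __init__(self):
        super().__init__()
        self.total_tags = 0
        self.div_tags = 0
        self.text_chunks = []

    def handle_starttag(self, tag, attrs):
        self.total_tags += 1
        if tag == "div":
            self.div_tags += 1

    def handle_data(self, data):
        self.text_chunks.append(data)


def polish_signals(html):
    counter = _TagCounter()
    counter.feed(html)
    words = " ".join(counter.text_chunks).split()
    # "\u2014" is the em-dash character.
    em_dashes = sum(chunk.count("\u2014") for chunk in counter.text_chunks)
    return {
        "div_ratio": counter.div_tags / counter.total_tags if counter.total_tags else 0.0,
        "em_dashes_per_100_words": 100 * em_dashes / len(words) if words else 0.0,
    }
```

This simple version also counts text inside <script> and <style> elements, which a production check would likely exclude.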
Config
Operational signals that don’t fit neatly elsewhere:
- Redirects. Excessive redirect chains, redirect loops, missing canonical redirects.
- HTTP/HTTPS consistency. http:// vs. https://, with vs. without www.
- Subdomain hygiene. Whether your www. subdomain redirects appropriately, whether legacy subdomains still resolve.
- Robots/sitemap consistency. Sitemap URLs that conflict with /robots.txt exclusions.
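Redirect chains and loops can be detected by walking a map of observed redirects. The sketch below assumes the redirect targets were already collected into a dictionary. The function name, the status labels, and the rule that more than one hop counts as a chain are all hypothetical choices for this example.

```python
def follow_redirects(start, redirects, max_hops=10):
    """Walk redirects (URL -> Location target) and classify the result.

    Returns (chain, status), where status is "ok", "chain", "loop", or "too_long".
    """
    chain = [start]
    current = start
    while current in redirects:
        current = redirects[current]
        if current in chain:
            # Seeing a URL twice means the redirects never settle.
            return chain + [current], "loop"
        chain.append(current)
        if len(chain) - 1 > max_hops:
            return chain, "too_long"
    hops = len(chain) - 1
    return chain, "ok" if hops <= 1 else "chain"
```

A single hop from http:// to https:// is treated as normal. Two or more hops are reported as a chain so they can be collapsed into one redirect.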
Probe vs. parse checks
Internally, the engine has two kinds of checks:
- HTML parse checks look at the page you already fetched. They run in-memory and are nearly free.
- Probe checks make their own follow-up requests (fetching /robots.txt, alternate URLs for security headers, checking SSL). They take longer but run concurrently.
For the engine internals, see How a scan works.
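The difference between the two kinds of checks can be sketched with asyncio. Parse checks are plain functions over the HTML already in memory, while probes are awaited together so the total wait is roughly that of the slowest probe. Everything here is a stand-in: the function names are hypothetical and asyncio.sleep simulates network latency instead of making real requests.

```python
import asyncio


def check_has_title(html):
    # Parse check: a pure function over the page that was already fetched.
    return "<title>" in html.lower()


async def probe(path, delay):
    # Stand-in for a follow-up HTTP request; the sleep simulates latency.
    await asyncio.sleep(delay)
    return path, "done"


async def run_checks(html):
    parse_results = {"has_title": check_has_title(html)}
    # gather() starts all probes at once instead of one after another.
    probe_results = await asyncio.gather(
        probe("/robots.txt", 0.02),
        probe("/sitemap.xml", 0.02),
        probe("/.env", 0.02),
    )
    return parse_results, dict(probe_results)
```

Calling asyncio.run(run_checks(html)) returns both result sets. The three simulated probes overlap, so they finish in about 0.02 seconds rather than 0.06.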
What gets skipped
Checks are skipped in a few situations:
- scan_type filters. If you pass --type security, non-security categories are skipped entirely. Useful for fast focused scans.
- Pre-deploy scan mode. Some checks need a live site (uptime probes, certain header checks). Pre-deploy scans skip them. Mainly used by the CLI in CI when scanning a build artifact.
- Tier gating. Polish signals run on every tier; the deep accessibility axe-core scan needs Core or above.
The list of skipped checks appears at the bottom of the scan summary, with the reason each was skipped.
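The three skip rules can be modeled as a small planning step that records a reason for every check it drops. This is a sketch under stated assumptions: the Check fields, the check names, and the tier name "free" (a placeholder for whatever tier sits below Core) are invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical tier ordering; "free" is a placeholder name.
TIER_RANK = {"free": 0, "core": 1}


@dataclass
class Check:
    name: str
    category: str
    needs_live_site: bool = False
    min_tier: str = "free"


def plan_checks(checks, scan_type=None, pre_deploy=False, tier="free"):
    """Return (names to run, {skipped name: reason})."""
    to_run, skipped = [], {}
    for check in checks:
        if scan_type and check.category != scan_type:
            skipped[check.name] = f"scan_type filter: {scan_type}"
        elif pre_deploy and check.needs_live_site:
            skipped[check.name] = "pre-deploy scan: needs a live site"
        elif TIER_RANK[tier] < TIER_RANK[check.min_tier]:
            skipped[check.name] = f"tier gating: needs {check.min_tier} or above"
        else:
            to_run.append(check.name)
    return to_run, skipped
```

Keeping the reason next to each skipped check is what makes a summary like the one described above possible.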