Overview
This page documents common issues encountered during SanMarcSoft development and operations, with root causes and resolution steps.
1. Bun vs Node.js Runtime Mismatch
Symptoms
- Service crashes on startup with “bun: not found” or “node: not found”
- Different behavior between local development and container
Root Cause
Nix-built containers may not have the expected runtime in PATH. Shebangs like #!/usr/bin/env bun fail because /usr/bin/env does not exist in the Nix sandbox, and a minimal Nix-built image contains only the paths that were explicitly added to it.
Fix
Use absolute Nix store paths in entrypoints:
For container config:
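To find scripts that still depend on /usr/bin/env before the image is built, a small shell check can help. This is a minimal sketch, not part of the SanMarcSoft tooling: the function name find_env_shebangs is hypothetical, and it only inspects the first line of each file.

```shell
# Hypothetical helper: print every file under a directory whose first line
# is an env-style shebang (for example "#!/usr/bin/env bun").
find_env_shebangs() {
  find "$1" -type f | while IFS= read -r file; do
    # Only the first line can be a shebang; unreadable files are skipped.
    if head -n 1 "$file" 2>/dev/null | grep -q '^#!/usr/bin/env '; then
      printf '%s\n' "$file"
    fi
  done
}

# Usage (not run here):
#   find_env_shebangs ./scripts
```

Any file it prints is a candidate for an absolute Nix store path in its shebang or in the entrypoint that calls it.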
2. fakeroot Error on macOS
Symptoms
error: builder for '/nix/store/...-...oci-image.drv' failed: fakeroot: not found
Root Cause
pkgs.dockerTools.buildLayeredImage uses fakeroot internally, which is not available on macOS (aarch64-darwin).
Fix
Build for the target architecture explicitly:
Ensure the flake outputs are under packages.x86_64-linux, not packages.aarch64-darwin:
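A preflight guard can make the failure mode obvious before Nix reports the fakeroot error. The sketch below is an assumption about how such a guard might look: check_build_host is a hypothetical name, and it only inspects the kernel name reported by uname.

```shell
# Hypothetical preflight: succeed on Linux hosts, otherwise point at the
# Linux flake output. The OS name can be passed in to make testing easy.
check_build_host() {
  os="${1:-$(uname -s)}"
  case "$os" in
    Linux)
      return 0
      ;;
    *)
      echo "host is $os: build .#packages.x86_64-linux.oci-image on a Linux builder" >&2
      return 1
      ;;
  esac
}

# Usage (not run here):
#   check_build_host && nix build .#packages.x86_64-linux.oci-image
```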
3. FOD Hash Mismatch
Symptoms
error: hash mismatch in fixed-output derivation
specified: sha256-AAAA...
got: sha256-BBBB...
Root Cause
The Fixed-Output Derivation hash no longer matches the build output. This happens when:
- Dependencies changed (bun.lock, go.sum, package.json)
- Build-time environment variables changed
- Source files included in the FOD changed
Fix
- Set hash to pkgs.lib.fakeHash in flake.nix
- Build and capture the correct hash:
    nix build .#packages.x86_64-linux.oci-image 2>&1 | grep "got:"
- Replace fakeHash with the new hash
- Rebuild
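The hash can also be pulled out of the build output directly instead of copying it by hand. A minimal sketch, assuming the error output has the "got:" layout shown under Symptoms; extract_got_hash is a hypothetical helper name.

```shell
# Read Nix build output on stdin and print the hash from the first
# line whose first field is "got:".
extract_got_hash() {
  awk '$1 == "got:" { print $2; exit }'
}

# Usage (not run here):
#   nix build .#packages.x86_64-linux.oci-image 2>&1 | extract_got_hash
```

The printed value is what replaces fakeHash in flake.nix.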
4. Cloudflare Worker Versioning Issues
Symptoms
- Deploy succeeds but old code still serves
- Dashboard shows multiple versions with gradual rollout percentages
- /__debug endpoint returns old version timestamp
Root Cause
Cloudflare's Worker versioning system keeps multiple versions of a script. A new deployment may be created as a new version at 0% traffic instead of replacing the active version.
Fix
Check versions:
    CF_TOKEN=$(pass cloudflare/api-token)
    ACCOUNT_ID=$(pass cloudflare/account-id)
    curl -s "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/workers/scripts/<worker-name>/versions" \
      -H "Authorization: Bearer ${CF_TOKEN}" | jq '.result'
Set 100% traffic to the latest version via dashboard or API
If stuck, delete and recreate the worker:
    curl -s -X DELETE \
      "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/workers/scripts/<worker-name>" \
      -H "Authorization: Bearer ${CF_TOKEN}"
    npx wrangler deploy
Re-set all secrets after recreation (secrets are deleted with the worker)
5. Scaleway Container Stuck in Error State
Symptoms
- Container status shows “error”
- Requests return 502 or timeout
- Repeated pulumi up does not fix it
Root Cause
Common causes:
- Image not found in registry (wrong tag, deleted image)
- Port mismatch between container config and application
- Entrypoint script crash (missing binary, permission error)
- Memory exhausted during startup
Fix
Check error message:
    SCW_TOKEN=$(pass sanmarcsoft/scaleway/api-secret)
    curl -s -H "X-Auth-Token: ${SCW_TOKEN}" \
      "https://api.scaleway.com/containers/v1beta1/regions/fr-par/containers/<id>" | jq '.error_message'
Verify image exists:
    skopeo inspect "docker://rg.fr-par.scw.cloud/sanmarcsoft/<name>:<tag>" \
      --creds "nologin:$(pass sanmarcsoft/scaleway/api-secret)"
Delete and recreate via Pulumi:
    pulumi destroy --stack <env>
    pulumi up --stack <env>
6. Clerk SSL Certificate Not Provisioning
Symptoms
- Clerk dashboard shows “DNS verification pending” for custom domain
- Custom domain returns SSL error
- Works on default Clerk domain but not custom domain
Root Cause
Cloudflare DNS records for Clerk are set with proxied: true. Clerk needs direct access to the DNS records to provision SSL certificates.
Fix
Set all 5 Clerk DNS records to proxied: false:
After fixing, wait up to 24 hours for Clerk to verify and provision the SSL certificate.
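While waiting, it can help to confirm from the outside that a record is really DNS-only. The sketch below rests on two assumptions: that the Clerk records are CNAMEs, and that a proxied Cloudflare CNAME is not visible as a CNAME in public DNS. The function name and the example hostname are hypothetical.

```shell
# Classify the output of `dig +short CNAME <host>`.
# Non-empty answer: the CNAME is publicly visible, so the record is DNS-only.
# Empty answer: the record is proxied, missing, or not a CNAME.
classify_cname_answer() {
  answer="$1"
  if [ -n "$answer" ]; then
    echo "dns-only"
  else
    echo "proxied-or-missing"
  fi
}

# Usage (not run here; clerk.example.com is a placeholder):
#   classify_cname_answer "$(dig +short CNAME clerk.example.com)"
```

Running it once per Clerk record gives a quick view of which ones still need proxied: false.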
7. tsconfig Test File Leaks into Build
Symptoms
- tsc --noEmit fails with errors in test files
- Build includes test files that should be excluded
- Type errors from test utilities (jest, vitest) in production build
Root Cause
tsconfig.json does not properly exclude test files, or a wildcard include ("include": ["src/**/*"]) captures test files.
Fix
Add explicit exclusions to tsconfig.json:
Or create a separate tsconfig.build.json for production builds:
Then build with:
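To confirm that a build config no longer picks up test files, the compiler's file list can be filtered for test-like paths. This is a hedged sketch: filter_test_files is a hypothetical helper, and the patterns it matches (*.test.ts, *.spec.ts, __tests__/) are assumptions about the project's naming.

```shell
# Read file paths on stdin and print only those that look like tests.
# "|| true" keeps the exit status at 0 when nothing matches.
filter_test_files() {
  grep -E '(\.(test|spec)\.tsx?$|/__tests__/)' || true
}

# Usage (not run here):
#   npx tsc -p tsconfig.build.json --listFilesOnly | filter_test_files
```

Empty output means no test files are part of that compilation.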
8. Docker Network DNS Failure (NAS)
Symptoms
- Inter-container communication fails with connection refused or DNS resolution error
- curl http://container-name:port fails from another container
- HTTP 502 from reverse proxy
Root Cause
Containers on the NAS are on different Docker networks. Docker DNS only resolves container names within the same network.
Fix
Ensure all related containers are on the same network:
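Whether two containers share a network can be checked from the shell. A minimal sketch under stated assumptions: the helper names are hypothetical, the container names web and api are placeholders, and network names are assumed to contain no whitespace.

```shell
# Print the networks a container is attached to, one per line.
container_networks() {
  docker inspect -f '{{range $k, $v := .NetworkSettings.Networks}}{{println $k}}{{end}}' "$1"
}

# Print every network name that appears in both whitespace-separated lists.
shared_networks() {
  for net in $1; do
    for other in $2; do
      [ "$net" = "$other" ] && printf '%s\n' "$net"
    done
  done
  return 0
}

# Usage (not run here):
#   shared_networks "$(container_networks web)" "$(container_networks api)"
```

Empty output means the two containers have no network in common, so Docker DNS will not resolve one from the other. A running container can be attached to an existing network with docker network connect <network> <container>.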
This was identified as a root cause for 502 errors in the Phenom Drop ecosystem. Added as preflight check #11.
9. Synology Proxy vs Container Nginx Confusion
Symptoms
- Debugging nginx configuration but changes have no effect
- Wrong nginx version reported
- Reverse proxy rules seem to not apply
Root Cause
The Synology NAS has its own reverse proxy (DSM built-in) in addition to any nginx running inside Docker containers. Changes to the container’s nginx have no effect if the Synology proxy is the one handling the request.
Fix
Identify which nginx is serving the request:
    curl -sI https://site.matthewstevens.org | grep -i "^server:"
If the Synology proxy is involved, configure it in DSM > Control Panel > Application Portal > Reverse Proxy
If the container nginx should handle requests directly, ensure the container port is exposed and the Synology proxy is not intercepting the traffic
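Header names arrive in different cases depending on the HTTP version, so a case-insensitive extraction is more reliable than a plain grep. The helper below is a sketch with a hypothetical name, server_header, that reads response headers on stdin.

```shell
# Print the value of the first Server header, whatever its letter case.
server_header() {
  tr -d '\r' | awk 'tolower($0) ~ /^server:/ {
    sub(/^[^:]*:[ \t]*/, "")   # drop the header name and separator
    print
    exit
  }'
}

# Usage (not run here):
#   curl -sI https://site.matthewstevens.org | server_header
```

Comparing the printed value with the nginx version inside the container shows which of the two is answering.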
10. PostCSS in Nix Sandbox (Hugo Docs)
Symptoms
- Hugo build fails with “PostCSS not found”
- Docsy theme fails to compile SCSS/CSS
Root Cause
The Nix sandbox does not have /usr/bin/env, so PostCSS CLI shebangs fail. Additionally, npm-installed binaries in the sandbox may have broken shebangs.
Workaround
Create a wrapper script in the Nix build:
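One possible shape for such a wrapper, sketched in shell: it calls the interpreter explicitly so the script's own #!/usr/bin/env shebang is never consulted. The helper name write_wrapper and the placeholder paths are hypothetical, and the sketch assumes /bin/sh exists where the wrapper runs.

```shell
# Write an executable wrapper at $1 that runs script $3 with interpreter $2,
# forwarding all arguments unchanged.
write_wrapper() {
  out="$1"
  interpreter="$2"
  script="$3"
  printf '#!/bin/sh\nexec "%s" "%s" "$@"\n' "$interpreter" "$script" > "$out"
  chmod +x "$out"
}

# Usage (not run here; both paths are placeholders):
#   write_wrapper ./bin/postcss /path/to/node /path/to/postcss-cli-entry.js
```

In a Nix build the interpreter argument would normally be an absolute store path, which avoids any PATH lookup as well.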
Status: This is a known workaround. The Hugo docs build in verifieddit-www currently skips PostCSS processing.