{"id":3924,"date":"2026-04-10T16:10:02","date_gmt":"2026-04-10T13:10:02","guid":{"rendered":"https:\/\/upcloud.com\/global\/?p=3924"},"modified":"2026-04-13T14:57:05","modified_gmt":"2026-04-13T13:57:05","slug":"from-zero-to-56-resources-deploying-cratis-studio-on-upcloud","status":"publish","type":"post","link":"https:\/\/upcloud.com\/global\/blog\/from-zero-to-56-resources-deploying-cratis-studio-on-upcloud\/","title":{"rendered":"From Zero to 56 Resources: Deploying Cratis Studio on UpCloud"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Data sovereignty is something I genuinely care about. Not as a compliance checkbox or a talking point, but as a first-principles belief that the organizations and people you work with should have meaningful control over where their data lives, who can access it, and under what legal regime.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">That belief has become harder to maintain comfortably. Most cloud infrastructure \u2014 even for European companies \u2014 runs on American platforms, subject to American law. The CLOUD Act means that data stored on US-owned infrastructure can be demanded by US authorities regardless of where the servers physically sit. GDPR gives European citizens rights over their data; it doesn&#8217;t give European companies rights over their infrastructure. Those are different things.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So when I advise customers on cloud architecture, I increasingly feel the weight of recommending platforms I can&#8217;t fully vouch for from a data sovereignty standpoint. That needed to change \u2014 and that required a hands-on answer, not a vendor comparison spreadsheet. Real infrastructure, real workloads, real friction.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">UpCloud was our answer. 
This is an honest account of what it took to go from zero to a fully running Kubernetes deployment \u2014 mistakes and all \u2014 because the only way I can confidently recommend a platform to customers is if I&#8217;ve done it myself.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What we were deploying<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/cratis.studio\" target=\"_blank\" rel=\"noopener\">Cratis Studio<\/a> is a collaborative tool for software development built around <a href=\"https:\/\/eventmodeling.org\" target=\"_blank\" rel=\"noopener\">Event Modeling<\/a>. Think Miro, but purpose-built for designing event-driven systems \u2014 not generic whiteboards. You sketch out the full lifecycle of your system as a timeline of commands, events and read models, and the tool helps your whole team stay aligned on what you&#8217;re actually building.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The deployment we needed wasn&#8217;t trivial. Beyond the application itself, we required:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A managed Kubernetes cluster<\/li>\n\n\n\n<li>A self-hosted Docker registry (to keep images European)<\/li>\n\n\n\n<li>A MongoDB replica set with high-speed storage<\/li>\n\n\n\n<li>Automated backups to UpCloud object storage with rolling retention (hourly \/ daily \/ weekly \/ monthly)<\/li>\n\n\n\n<li>NGINX ingress + cert-manager + Let&#8217;s Encrypt for TLS<\/li>\n\n\n\n<li>Grafana + Loki for observability<\/li>\n\n\n\n<li>OpenTelemetry Collector<\/li>\n\n\n\n<li>A Kubernetes dashboard<\/li>\n\n\n\n<li>OAuth2-based access control via GitHub teams<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Two environments: <code>development<\/code> (from the <code>development<\/code> branch) and <code>production<\/code> (from <code>main<\/code>).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Here&#8217;s how it all fits together:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" 
src=\"https:\/\/upcloud.com\/media\/cratis-studio-infrastructure-deployment-1-1024x675.png\" alt=\"-\" class=\"wp-image-78867\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Why Pulumi and why C#<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">We&#8217;ve used Pulumi before and we like what it gives us: real programming languages, real abstractions, real tests. The alternative \u2014 YAML files describing desired state \u2014 works fine until it doesn&#8217;t, at which point you&#8217;re reading documentation for a DSL instead of writing code your team already knows.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Since our team lives in C#, that was the natural choice. Pulumi&#8217;s C# support is solid. The codebase ended up as a proper .NET project with components like <code>KubernetesCluster<\/code>, <code>MongoDeployment<\/code>, <code>OAuth2Proxy<\/code>, <code>Ingresses<\/code>, each with typed arguments. It reads like an application, not a configuration file.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We also had GitHub Copilot along for the whole ride. More on that in a moment.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The journey<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">First check: verify your auth setup against the current dashboard<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The very first thing we hit wasn&#8217;t really an UpCloud problem \u2014 it was ours. The initial Pulumi code that had been scaffolded for us was using an older subaccount-based credential approach. UpCloud&#8217;s API token documentation was current and easy to find; we just hadn&#8217;t cross-checked the generated code against it before running. Once we did, switching to API tokens was straightforward.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The lesson isn&#8217;t about UpCloud&#8217;s documentation drifting \u2014 it&#8217;s a reminder to verify your tooling&#8217;s assumptions against the current platform before you start. 
Especially when AI-generated scaffolding is involved: it reflects what the model was trained on, not necessarily what shipped yesterday.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Setting up the registry<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Before the Kubernetes cluster could do anything useful, we needed a Docker registry to serve images from. UpCloud gives you straightforward VM-based infrastructure to build on, so the registry runs on a dedicated VM with a floating IP and a self-signed TLS certificate provisioned by Pulumi.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Getting Docker to trust that certificate across machines turned out to be the trickiest part. The process of distributing the certificate so that <code>docker login<\/code> would stop complaining about <code>x509: certificate signed by unknown authority<\/code> is genuinely tedious \u2014 it needs to happen on every machine that pushes or pulls. We documented it carefully so the second person through didn&#8217;t have to figure it out from scratch.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Kubernetes: CIDR surprises<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Standing up the cluster itself was mostly smooth once we had the configuration right. We hit one bump worth noting.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Network CIDR overlap. UpCloud has a default private network in the <code>10.0.0.0\/8<\/code> range, and the initial cluster network config used <code>10.0.0.0\/24<\/code>, which overlapped directly with it. We had to choose a non-conflicting range. Simple in hindsight, but it cost a failed deploy to find out.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The backup chicken-and-egg problem<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">This one was interesting. Our backup system writes to UpCloud Object Storage. To configure that in Pulumi we needed an access key. But the access key can only be created after the bucket exists. 
And the bucket is what Pulumi is supposed to create.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The naive answer is: create the bucket manually first, get your credentials, then paste them into config. We didn&#8217;t love that.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The better answer \u2014 which we implemented \u2014 was to have Pulumi create the bucket <em>and<\/em> generate the access keys as part of the same deployment. Pulumi&#8217;s output system handles this elegantly: the keys become <code>Output&lt;string&gt;<\/code> properties on the storage component and flow directly into the downstream configuration. No manual steps, no credentials to manage, no chicken, no egg.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">var accessKey = new ManagedObjectStorageUserAccessKey(\n    $\"backup-object-storage-access-key-{args.Environment}-g{args.AccessKeyGeneration}\",\n    new ManagedObjectStorageUserAccessKeyArgs\n    {\n        ServiceUuid = managedObjectStorage.Id,\n        Username = userName,\n        Status = \"Active\",\n    },\n    new CustomResourceOptions { DependsOn = [backupUser] });\n\nAccessKeyId = Output.CreateSecret(accessKey.AccessKeyId);\nSecretAccessKey = Output.CreateSecret(accessKey.SecretAccessKey);<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The <code>AccessKeyGeneration<\/code> config value is worth calling out: incrementing it rotates the credentials on the next deploy without touching the resource name prefix \u2014 a simple but explicit rotation mechanism.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">DNS: CNAME, not A record<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">UpCloud&#8217;s managed Kubernetes doesn&#8217;t give you a static IP for the load balancer. It gives you a hostname like <code>lb-0afa36aa50364d92baa35e47ef2e72c5-1.upcloudlb.com<\/code>. That means your DNS records need to be CNAMEs, not A records. 
If your DNS provider (or your registrar&#8217;s control panel) nudges you towards entering an IP address, you need to push back.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We updated our stack outputs and documentation to make this explicit. It&#8217;s a detail, but it&#8217;s the kind of detail that wastes an hour if you don&#8217;t know it.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Let&#8217;s Encrypt and the load balancer TLS trap<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">This was our most involved debugging session and probably the most instructive.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We had cert-manager issuing Let&#8217;s Encrypt certificates correctly. Certs were appearing in Kubernetes secrets. NGINX had access to them. But when we hit a domain in the browser, the certificate we were getting back was issued for the UpCloud load balancer&#8217;s own hostname, not our domain.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">What was happening: UpCloud&#8217;s cloud controller manager (CCM) was provisioning the load balancer with an HTTPS frontend \u2014 Layer 7 \u2014 using UpCloud&#8217;s own default certificate. Traffic from the browser was being decrypted at the load balancer before it ever reached NGINX. 
Let&#8217;s Encrypt was doing its job; NGINX was serving the right cert; the load balancer just wasn&#8217;t letting it through.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/upcloud.com\/media\/load-balancer-enable-tcp-passthrough-1-1024x514.png\" alt=\"-\" class=\"wp-image-78869\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The fix was setting TCP mode via an annotation on the NGINX controller&#8217;s <code>LoadBalancer<\/code> service, expressed in Pulumi C# as:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">[\"service\"] = new Dictionary&lt;string, object&gt;\n{\n    [\"type\"] = \"LoadBalancer\",\n    [\"annotations\"] = new Dictionary&lt;string, string&gt;\n    {\n        [\"service.beta.kubernetes.io\/upcloud-load-balancer-config\"] =\n            \"\"\"{\"frontends\":[\n                {\"name\":\"https\",\"port\":443,\"mode\":\"tcp\",\"default_backend\":\"port-443\"},\n                {\"name\":\"http\", \"port\":80, \"mode\":\"tcp\",\"default_backend\":\"port-80\"}\n            ],\"backends\":[\n                {\"name\":\"port-443\",\"port\":443},\n                {\"name\":\"port-80\", \"port\":80}\n            ]}\"\"\",\n    },\n},<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Setting both frontends to <code>\"mode\": \"tcp\"<\/code> tells the CCM to pass raw TCP connections through to NGINX rather than terminating TLS at the load balancer. cert-manager&#8217;s certificates started being served correctly immediately after.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">One wrinkle: the UpCloud CCM honors this annotation only at load balancer creation time. Modifying an existing service&#8217;s annotations doesn&#8217;t reconfigure the load balancer. We had to delete the Kubernetes service entirely, wait for UpCloud to deprovision the old load balancer, then let Pulumi recreate it with the correct annotation from the start. 
Not complicated, but not obvious.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">There was also a subtlety in the annotation format: the <code>default_backend<\/code> field expects a string (a backend name reference), not an inline object. Getting the JSON structure wrong produced a cryptic error from the CCM and another cycle of delete-and-recreate.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">OAuth2 via GitHub teams<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">We wanted protected endpoints that only members of specific GitHub organization teams could access. The answer here was <a href=\"https:\/\/github.com\/oauth2-proxy\/oauth2-proxy\" target=\"_blank\" rel=\"noopener\">OAuth2-Proxy<\/a>, deployed as two instances via Helm \u2014 one keyed to our <code>admin<\/code> team, one to our <code>operations<\/code> team \u2014 with NGINX forwarding through the appropriate proxy before serving the actual backend.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This part went largely to plan. The one gotcha: newer versions of the NGINX ingress (v1.9+) introduced an additional <code>annotations-risk-level<\/code> field alongside <code>allowSnippetAnnotations<\/code>. Setting only the latter isn&#8217;t enough. We needed:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">[\"config\"][\"allow-snippet-annotations\"] = \"true\",\n[\"config\"][\"annotations-risk-level\"] = \"Critical\"<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The error message when you get this wrong points you at the admission webhook denying the annotation \u2014 helpful enough once you know what you&#8217;re looking for.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The Kubernetes dashboard subdomain issue<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">We added the Kubernetes dashboard for operational visibility. The dashboard&#8217;s Helm chart (version 1.7.0) no longer supports the <code>--base-href<\/code> flag, which broke the subpath approach we&#8217;d initially tried. 
The fix was to give the dashboard its own subdomain (<code>console.dev.cratis.studio<\/code>) covered by our wildcard certificate rather than hosting it under a path. Tidier anyway.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">One final small puzzle: the Helm release name for the dashboard gets a random hash suffix appended \u2014 <code>kubernetes-dashboard-development-fdaae6a3-kong-proxy<\/code> rather than the expected <code>kubernetes-dashboard-kong-proxy<\/code>. You can&#8217;t know this name statically, so you can&#8217;t hardcode it in the ingress backend. The fix is to use Pulumi&#8217;s <code>.Apply()<\/code> to derive it at deploy time from the release name itself:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">\/\/ Service name is the Helm release name + \"-kong-proxy\"\n\/\/ e.g. kubernetes-dashboard-development-fdaae6a3-kong-proxy\nName = dashboardRelease.Name.Apply(n =&gt; $\"{n}-kong-proxy\"),<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This is the kind of thing that only shows up when you actually deploy \u2014 the Pulumi preview passes fine because the release name resolves to the expected value when the resource is live.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The load balancer hostname instability problem<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The CCM-provisioned load balancer has a hidden fragility we discovered the hard way: every time the NGINX Kubernetes service is deleted and recreated \u2014 which is the only way to apply a changed annotation, as noted earlier \u2014 the CCM treats it as a new service and provisions a <em>brand new<\/em> load balancer with a <em>brand new<\/em> random DNS hostname. 
Every CNAME record pointing at the old hostname breaks instantly.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This isn&#8217;t hypothetical: the cycle of delete-and-recreate that the TCP annotation fix required triggered it precisely once during our work, and having all DNS stop resolving mid-debugging session is a memorable experience.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The fix is to invert ownership of the load balancer. Instead of letting the CCM provision it implicitly, pre-create the UpCloud load balancer as an explicit Pulumi <code>ClusterLoadBalancer<\/code> ComponentResource \u2014 complete with TCP frontends, backends, and static backend members pointing at every worker node. Then fix the NGINX NodePorts so the LB backend members&#8217; port references never drift between NGINX recreations:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">[\"nodePorts\"] = new Dictionary&lt;string, object&gt;\n{\n    [\"http\"]  = args.HttpNodePort,   \/\/ fixed, e.g. 32080\n    [\"https\"] = args.HttpsNodePort,  \/\/ fixed, e.g. 32443\n},<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The load balancer&#8217;s <code>DnsName<\/code> is stable because the Pulumi resource itself persists across deploys. NGINX can be recreated as many times as needed without touching the DNS target. Point your CNAME records at the LB hostname once and never worry about it again.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Discovering node IPs automatically<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The Pulumi Kubernetes provider includes a <code>NodeList<\/code> resource type, and it looks like exactly the right tool for discovering worker node IPs at deploy time. It isn&#8217;t: it&#8217;s marked compatibility-only and cannot be used as a data source. 
Attempting to read from it blocks the deploy with a runtime exception about a required field.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The working approach is <code>Pulumi.Command.Local.Command<\/code>, which runs <code>kubectl get nodes<\/code> against the cluster using the UpCloud-fetched kubeconfig. The IPs are sorted before indexing so that node1\/node2 assignments remain stable across runs, and <code>ClusterId<\/code> is passed as a trigger so the command re-runs and returns fresh IPs whenever the cluster is recreated:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">var nodeIPsCommand = new Command(\n    $\"get-node-ips-{args.Environment}\",\n    new CommandArgs\n    {\n        Create = \"kubectl get nodes -o jsonpath='{.items[*].status.addresses\" +\n                 \"[?(@.type==\\\"InternalIP\\\")].address}' --kubeconfig &lt;(echo \\\"$KUBECONFIG_DATA\\\")\",\n        Environment = new InputMap&lt;string&gt;\n        {\n            [\"KUBECONFIG_DATA\"] = args.Kubeconfig,\n        },\n        Triggers = [args.ClusterId],\n    });<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The sorted IPs flow directly into the LB backend member configuration. No hardcoded IP addresses in stack config.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The oauth2-proxy Helm v7.x credentials trap<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">This was the most subtle bug in the entire deployment, and the hardest to diagnose. We spent a full debugging session on it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The standard advice for keeping secrets out of Helm release history is to create a Kubernetes Secret separately and reference it via the <code>existingSecret<\/code> value. We followed this exactly \u2014 created a Pulumi <code>Secret<\/code> resource with the GitHub OAuth credentials, set <code>existingSecret<\/code> to its name. 
Everything deployed without errors.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Then every OAuth authorization attempt returned a 404 from GitHub.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The cause: <code>existingSecret<\/code> <strong>does not work<\/strong> in oauth2-proxy Helm chart v7.x. The chart unconditionally creates its own <code>{release-name}<\/code> Secret seeded from <code>config.clientID<\/code>, <code>config.clientSecret<\/code>, and <code>config.cookieSecret<\/code> \u2014 defaulting those values to the literal placeholder string <code>XXXXXXX<\/code>. The Deployment&#8217;s env vars always reference that chart-managed Secret. Your separately created Secret applies cleanly and is referenced nowhere.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Both proxies were sending <code>client_id=XXXXXXX<\/code> to GitHub&#8217;s authorize endpoint. GitHub accepts the URL (it shows the login page for any <code>client_id<\/code>), but returns 404 after the user logs in because no OAuth App with that ID exists. The error only appears in the browser after a full login cycle \u2014 there&#8217;s nothing in the proxy logs, nothing in the Kubernetes events, nothing to indicate the credentials are wrong until you actually trace the raw redirect URL.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The fix is to abandon <code>existingSecret<\/code> entirely and pass credentials directly in Helm values:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">[\"config\"] = new Dictionary&lt;string, object&gt;\n{\n    [\"clientID\"]     = clientId,\n    [\"clientSecret\"] = clientSecret,\n    [\"cookieSecret\"] = cookieSecret,\n},<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This means credentials appear in the Helm values that Pulumi stores in state. Pulumi marks them as secrets (they&#8217;re <code>Output&lt;string&gt;<\/code> values wrapped with <code>Output.CreateSecret<\/code>), so they&#8217;re encrypted in the state backend \u2014 not plaintext. 
It&#8217;s a less clean separation than a standalone Secret, but it&#8217;s the only option that actually works until the chart addresses this.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you&#8217;re using oauth2-proxy Helm chart v7.x and <code>existingSecret<\/code>, your credentials are <code>XXXXXXX<\/code>. Check this before wondering why GitHub returns 404.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cookie domain scoping for multiple proxies<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">With two oauth2-proxy instances \u2014 one guarding admin endpoints, one guarding operations \u2014 their cookie domains need different scopes. The admin proxy intentionally uses a broad cookie domain (<code>.dev.cratis.studio<\/code>) so its session cookie covers all subdomains, including the dashboard. The operations proxy should use a narrower domain (<code>.operations.dev.cratis.studio<\/code>) to contain its sessions.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A small bug in the Pulumi C# code applied the same subdomain-stripping formula to both:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">\/\/ Bug: both computed to .dev.cratis.studio\nvar adminCookieDomain      = \".\" + string.Join('.', adminDomain.Split('.').Skip(1));\nvar operationsCookieDomain = \".\" + string.Join('.', operationsDomain.Split('.').Skip(1));<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The fix is direct:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">\/\/ Admin: broad, covers all subdomains including the dashboard\nvar adminCookieDomain      = \".\" + string.Join('.', adminDomain.Split('.').Skip(1));\n\/\/ Operations: narrowed to its own subdomain\nvar operationsCookieDomain = \".\" + operationsDomain;<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The practical impact: the operations session cookie was being scoped to the entire <code>.dev.cratis.studio<\/code> domain. 
In a private cluster with carefully controlled access this isn&#8217;t catastrophic, but it&#8217;s wrong \u2014 sessions should be isolated to their respective proxies.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Kubernetes Dashboard: removing the second login prompt<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">After oauth2-proxy validates GitHub team membership and passes the request through, the Kubernetes Dashboard still presents its own login screen asking for a Bearer token. This is a second authentication step with no additional security value \u2014 the GitHub gate already established who the person is and whether they&#8217;re authorized.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The fix is an NGINX <code>configuration-snippet<\/code> annotation on the dashboard ingress that injects the service account Bearer token automatically after oauth2-proxy passes the request:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">[\"nginx.ingress.kubernetes.io\/configuration-snippet\"] =\n    AdminToken.Apply(token =&gt; $\"proxy_set_header Authorization \\\"Bearer {token}\\\";\"),<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This requires <code>allowSnippetAnnotations = true<\/code> and <code>annotations-risk-level = Critical<\/code> in the NGINX Helm values \u2014 already in place from the OAuth2 setup earlier. After this change, authenticated users land directly on the Dashboard UI with no additional prompts. The Bearer token remains accessible as a Pulumi stack output for direct API access and debugging; it&#8217;s simply no longer something users ever need to handle manually.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Automating the deployment<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Getting infrastructure to work manually is only half the job. The other half is making sure it stays working automatically. 
We set up three GitHub Actions workflows that together form the complete CI\/CD pipeline.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"\">flowchart LR\n    subgraph Triggers\n        PUSH[Push to main\\nDeployment\/** changed]\n        DISPATCH_DEV[workflow_dispatch]\n        PUBLISH[Push to main]\n    end\n\n    subgraph Workflows\n        WF_DEV[deploy-development]\n        WF_PROD[deploy-production]\n        WF_PUB[publish]\n    end\n\n    subgraph Targets\n        ENV_DEV[UpCloud development\\npulumi up --stack development]\n        ENV_PROD[UpCloud production\\npulumi up --stack production]\n        DOCKERHUB[DockerHub\\ncore \/ admin \/ lobby images]\n    end\n\n    PUSH --&gt; WF_DEV --&gt; ENV_DEV\n    DISPATCH_DEV --&gt; WF_DEV\n    DISPATCH_DEV --&gt; WF_PROD --&gt; ENV_PROD\n    PUBLISH --&gt; WF_PUB --&gt; DOCKERHUB<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><code>deploy-development<\/code><\/strong> triggers automatically whenever changes to the <code>Deployment\/<\/code> directory land on <code>main<\/code>. It installs the Pulumi CLI, runs <code>pulumi up<\/code> against the <code>development<\/code> stack, and uses GitHub&#8217;s environment protection model to scope the secrets. A <code>concurrency<\/code> group with <code>cancel-in-progress: false<\/code> ensures deploys queue rather than stomp each other.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><code>deploy-production<\/code><\/strong> is intentionally manual \u2014 <code>workflow_dispatch<\/code> only. It accepts an optional tag input; if none is given, it auto-generates a <code>YYYY.MM<\/code> version tag (incrementing a patch number if the month already has a release) and pushes it back to the repository. Production deployments are deliberate, not automatic.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><code>publish<\/code><\/strong> handles Docker images. 
On every push to <code>main<\/code> it builds <code>core<\/code>, <code>admin<\/code>, and <code>lobby<\/code> images for both <code>linux\/amd64<\/code> and <code>linux\/arm64<\/code> and pushes them to DockerHub. The development cluster pulls the <code>latest-development<\/code> tag; production gets versioned tags matched to the release.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">One thing worth calling out: the <code>UPCLOUD_TOKEN<\/code> secret is preferred, but the workflow falls back gracefully to <code>UPCLOUD_USERNAME<\/code> \/ <code>UPCLOUD_PASSWORD<\/code> if the token isn&#8217;t set. That pattern kept us from breaking anything while we were iterating on the auth approach early on.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Documentation as a first-class deliverable<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">One of the better decisions we made was to treat the setup documentation as a first-class part of the deployment \u2014 not something to write at the end, but something to keep accurate throughout.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We keep a <code>setup.md<\/code> file that covers every manual step required to go from a fresh clone to a running system: UpCloud API credentials, Pulumi state backend configuration, GitHub environment secrets, first-time secret seeding, kubeconfig setup, DNS configuration, and CA trust for the private registry. It&#8217;s the kind of guide where one missing step means a confused second person spending an hour debugging what you already solved.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We updated it every time something changed \u2014 and things changed often. When <code>pulumi up<\/code> started automating NGINX and cert-manager setup that had previously been manual, the docs were updated immediately. When we discovered the load balancer uses a hostname rather than an IP, the DNS section was corrected. 
When the dashboard moved from a subpath to its own subdomain, the guide reflected that before the next person touched it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The discipline is simple: if you had to figure something out, document it now, before you forget what the confusion actually was. The person who most benefits from good documentation is you, three months later.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We&#8217;d recommend the same approach to anyone setting up infrastructure like this: write the setup guide alongside the code, treat every debug session as a documentation opportunity, and use the AI assistant to keep things in sync \u2014 it&#8217;s genuinely useful for that specific task.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Copilot as co-pilot<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">We ran GitHub Copilot throughout the entire deployment process. It drove the initial Pulumi structure (from a GitHub Issue we raised with the full requirements), helped debug each error we encountered, and kept the documentation in sync with reality.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The honest report: it was genuinely useful for keeping momentum up, especially when we hit a new UpCloud-specific quirk that required reading API docs and translating that into correct Pulumi C# code. The LB annotation debugging in particular \u2014 reading the CCM source expectations and then generating the correct JSON format \u2014 would have taken considerably longer without that loop.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">It got things wrong on occasion (the annotation format, the base-href approach), but the feedback cycle was tight enough that the corrections came quickly. It&#8217;s a different kind of pair programming, and for infrastructure work on a platform you&#8217;re learning, the pace advantage is real.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The result<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">56 Pulumi resources deployed. 
0 errors.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Both environments \u2014 <code>development<\/code> and <code>production<\/code> \u2014 follow the same code path with different config. The deployment is fully reproducible from a clean checkout. The backup system creates archives per-database and stores them in UpCloud object storage on the rolling schedule we specified. TLS works. OAuth2 works. Observability works.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Lessons learned<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The platform is solid. The Kubernetes offering is straightforward, and the integration with Pulumi via the <a href=\"https:\/\/www.pulumi.com\/registry\/packages\/upcloud\/\" target=\"_blank\" rel=\"noopener\">UpCloud provider<\/a> works well. The documentation covers most of what you need, with some gaps around how the CCM interacts with load balancer configuration \u2014 if you&#8217;re doing anything non-standard with networking, budget some time to read the API reference directly.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The specific things that caught us, so they don&#8217;t catch you:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Check the dashboard for current auth requirements.<\/strong> API credential formats can drift from what older documentation describes. The UpCloud dashboard is the source of truth.<\/li>\n\n\n\n<li><strong>Watch for CIDR conflicts.<\/strong> UpCloud&#8217;s default private network uses <code>10.0.0.0\/8<\/code>. Make sure your cluster network config doesn&#8217;t overlap.<\/li>\n\n\n\n<li><strong>Let&#8217;s Encrypt requires TCP passthrough.<\/strong> The CCM defaults to L7 HTTPS on the load balancer. 
You need to explicitly configure TCP mode via the <code>service.beta.kubernetes.io\/upcloud-load-balancer-config<\/code> annotation, and the JSON must match the expected format exactly.<\/li>\n\n\n\n<li><strong>Load balancer annotations only apply at creation time.<\/strong> If you change the annotation on an existing service, nothing happens. You have to delete the service and let UpCloud reprovision the LB from scratch.<\/li>\n\n\n\n<li><strong>Pre-create the load balancer as a Pulumi resource.<\/strong> Letting the CCM provision it implicitly means every NGINX service deletion creates a <em>new<\/em> LB with a new hostname, breaking DNS. Create it explicitly, pin the NGINX NodePorts to fixed values, and your CNAME records stay stable across service recreation.<\/li>\n\n\n\n<li><strong>Pulumi&#8217;s <code>NodeList<\/code> is not a data source.<\/strong> The Kubernetes provider includes the type for compatibility but it cannot be read at apply time. Use <code>Pulumi.Command.Local.Command<\/code> running <code>kubectl get nodes<\/code> instead \u2014 and pass <code>ClusterId<\/code> as a trigger so node IPs are refreshed when the cluster is recreated.<\/li>\n\n\n\n<li><strong>oauth2-proxy Helm chart v7.x ignores <code>existingSecret<\/code>.<\/strong> The chart creates its own release-named Secret with placeholder <code>XXXXXXX<\/code> values and always mounts that \u2014 not the Secret you created. The symptom is a GitHub 404 after login with nothing in the proxy logs. Pass <code>config.clientID<\/code>, <code>config.clientSecret<\/code>, and <code>config.cookieSecret<\/code> directly in Helm values instead.<\/li>\n\n\n\n<li><strong>Cookie domains for multiple proxy instances need explicit scoping.<\/strong> If you run both a broad admin proxy and a narrower operations proxy, verify the cookie domain formula independently for each. 
It&#8217;s easy to apply the same stripping logic to both and silently produce the wrong scope.<\/li>\n\n\n\n<li><strong>The Kubernetes Dashboard has its own login prompt behind your OAuth proxy.<\/strong> Eliminate it with an NGINX <code>configuration-snippet<\/code> that injects the service account Bearer token automatically. GitHub team membership becomes the only gate.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">None of these are blockers. They&#8217;re the kind of friction you hit once. After that, the deployment runs repeatably and everything behaves as expected.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For teams that take data sovereignty seriously \u2014 and I believe more teams should \u2014 UpCloud delivers what it promises: capable European infrastructure, transparent pricing, and a Kubernetes platform you can build real production workloads on. That&#8217;s what I&#8217;ll be telling my customers.<\/p>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-top is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:66.66%\">\n<p class=\"wp-block-paragraph\"><strong>Guest post by <strong>Einar Ingebrigtsen<\/strong><\/strong><\/p>\n\n\n\n<p class=\"has-small-font-size wp-block-paragraph\"><strong>Einar Ingebrigtsen<\/strong> is a technical advisor at <a href=\"https:\/\/novanet.no\" target=\"_blank\" rel=\"noopener\">Novanet<\/a>, a Norwegian software consultancy, and the creator of <a href=\"https:\/\/cratis.io\" target=\"_blank\" rel=\"noopener\">Cratis<\/a> \u2014 an open-source event sourcing and CQRS platform for .NET. <br><br>He has spent over two decades building distributed systems and advising organizations on cloud architecture, developer experience, and software design. 
<br><br>He writes at <a href=\"https:\/\/ingebrigtsen.blog\" target=\"_blank\" rel=\"noopener\">ingebrigtsen.blog<\/a>.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-bottom is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:33.33%\"><div class=\"wp-block-image is-style-rounded\">\n<figure class=\"aligncenter size-medium is-resized\"><img decoding=\"async\" src=\"https:\/\/upcloud.com\/media\/img_4179-300x300.jpg\" alt=\"-\" class=\"wp-image-77473\" style=\"width:240px;height:auto\" \/><\/figure>\n<\/div><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Data sovereignty is something I genuinely care about. Not as a compliance checkbox or a talking point, but as a first-principles belief that the organizations [&hellip;]<\/p>\n","protected":false},"author":15,"featured_media":78792,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_relevanssi_hide_post":"","_relevanssi_hide_content":"","_relevanssi_pin_for_all":"","_relevanssi_pin_keywords":"","_relevanssi_unpin_keywords":"","_relevanssi_related_keywords":"","_relevanssi_related_include_ids":"","_relevanssi_related_exclude_ids":"","_relevanssi_related_no_append":"","_relevanssi_related_not_related":"","_relevanssi_related_posts":"652,247,772,583,811,289","_relevanssi_noindex_reason":"Blocked by a filter 
function","footnotes":""},"categories":[94,22,13],"tags":[],"class_list":["post-3924","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-guest-stories","category-cloud-infrastructure","category-data-sovereignty"],"acf":[],"_links":{"self":[{"href":"https:\/\/upcloud.com\/global\/wp-json\/wp\/v2\/posts\/3924","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/upcloud.com\/global\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/upcloud.com\/global\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/upcloud.com\/global\/wp-json\/wp\/v2\/users\/15"}],"replies":[{"embeddable":true,"href":"https:\/\/upcloud.com\/global\/wp-json\/wp\/v2\/comments?post=3924"}],"version-history":[{"count":4,"href":"https:\/\/upcloud.com\/global\/wp-json\/wp\/v2\/posts\/3924\/revisions"}],"predecessor-version":[{"id":6007,"href":"https:\/\/upcloud.com\/global\/wp-json\/wp\/v2\/posts\/3924\/revisions\/6007"}],"wp:attachment":[{"href":"https:\/\/upcloud.com\/global\/wp-json\/wp\/v2\/media?parent=3924"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/upcloud.com\/global\/wp-json\/wp\/v2\/categories?post=3924"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/upcloud.com\/global\/wp-json\/wp\/v2\/tags?post=3924"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}