Sometimes the most frustrating bugs are the ones that should have been baseline knowledge. This week we fixed an issue with Memory Lab—our agent memory system—that exposed a gap between local development assumptions and production container behavior.
What Happened
Memory Lab was failing silently on Railway. Two Python modules, memory_lab_client.py and retrieval.py, were reading environment variables at import time by calling os.getenv() at the top level. On Railway's Linux containers, modules sometimes import before the platform injects environment variables. The result? Missing values where API keys should be (os.getenv() returns None for an unset variable, or whatever default you pass, such as an empty string), and failing requests that looked like configuration problems.
Why It Matters
This wasn't just a Railway quirk. It's a pattern that breaks in any environment where module load order and environment initialization aren't perfectly synchronized—Docker, Kubernetes, serverless functions, and most modern container platforms.
The fix is simple: move os.getenv() calls into functions. Read environment variables at call time, not import time. This ensures that by the time your code actually needs the value, the platform has had time to inject it.
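The difference between the two patterns can be sketched as follows. This is a minimal illustration, not the actual Memory Lab code: the variable name MEMORY_LAB_API_KEY is hypothetical, and the late assignment to os.environ only simulates a value that becomes available after the module has already loaded.

```python
import os
from typing import Optional

# Hypothetical variable name, used only for illustration.
VAR_NAME = "MEMORY_LAB_API_KEY"

# Simulate an environment where the variable is not yet set at import time.
os.environ.pop(VAR_NAME, None)

# Import-time pattern: the value is captured once, when this line runs.
API_KEY_AT_IMPORT = os.getenv(VAR_NAME)


def get_api_key() -> Optional[str]:
    """Call-time pattern: look the variable up each time it is needed."""
    return os.getenv(VAR_NAME)


# Simulate the variable becoming available after the module has loaded.
os.environ[VAR_NAME] = "secret-value"

print(API_KEY_AT_IMPORT)  # still None: the early read never sees the update
print(get_api_key())      # "secret-value": read at call time
```

The module-level constant keeps whatever it saw when the module loaded, while the function reflects the environment at the moment it is called.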
Lessons Learned
Container platforms have their own initialization order. What works in local development (where env vars are set before anything runs) may not work in production. Railway, like many platforms, injects secrets just-in-time.
Module-level initialization is risky. Reading config at the top of a Python module is convenient, but it bakes in assumptions about when the module loads. Lazy evaluation—reading values when they're actually used—is more resilient.
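One way to get lazy evaluation without scattering os.getenv() calls everywhere is a small settings object whose attributes read the environment on access. The sketch below is an assumption about how this could look, not our production code: require_env, MissingEnvVar, Settings, and the variable name MEMORY_LAB_API_KEY are all hypothetical. It also raises a clear error on a missing value, so the failure is no longer silent.

```python
import os


class MissingEnvVar(RuntimeError):
    """Raised when a required environment variable is unset or empty."""


def require_env(name: str) -> str:
    """Read a variable at call time and fail loudly if it is missing."""
    value = os.getenv(name)
    if not value:
        raise MissingEnvVar(f"Required environment variable {name!r} is not set")
    return value


class Settings:
    """Each property reads the environment when accessed, not when defined."""

    @property
    def api_key(self) -> str:
        # Hypothetical variable name, used only for illustration.
        return require_env("MEMORY_LAB_API_KEY")


# Safe to create at module level: nothing is read until a property is accessed.
settings = Settings()
```

Callers use settings.api_key at the point where a request is built. If the variable is missing, the error names it directly instead of surfacing later as a failed request.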
This should have been baseline. Honestly, we should have caught this in code review. It's a well-documented pitfall. But it's also a reminder that even experienced teams ship patterns that work until they don't.
What's Next
We're auditing the rest of our Python codebase for similar patterns. Any module that reads env vars at the top level is getting refactored. We're also adding linting rules to catch this pattern in future PRs.
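As a rough idea of what such a check could look like, here is a sketch built on Python's standard ast module. It is not the rule we ship, and the function name find_import_time_env_reads is hypothetical. It flags os.getenv and os.environ references that sit outside any function body, including class bodies, which also execute at import time. It assumes the module is imported as plain "import os", and it does not catch aliases, "from os import getenv", or env reads in default arguments and decorators.

```python
import ast
from typing import List


def find_import_time_env_reads(source: str) -> List[int]:
    """Return line numbers of os.getenv / os.environ uses outside functions."""
    tree = ast.parse(source)
    hits = []

    def visit(node: ast.AST, in_function: bool) -> None:
        # Code inside a function or lambda body runs at call time.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda)):
            in_function = True
        if (
            not in_function
            and isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == "os"
            and node.attr in {"getenv", "environ"}
        ):
            hits.append(node.lineno)
        for child in ast.iter_child_nodes(node):
            visit(child, in_function)

    visit(tree, False)
    return sorted(set(hits))


SAMPLE = "\n".join([
    "import os",
    "KEY = os.getenv('A')",          # line 2: module level, flagged
    "",
    "def f():",
    "    return os.getenv('B')",     # line 5: inside a function, ignored
    "",
    "class C:",
    "    X = os.environ.get('C')",   # line 8: class body, flagged
])

print(find_import_time_env_reads(SAMPLE))  # [2, 8]
```

A check like this could run in CI over changed files and fail the build when the list is non-empty.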
More broadly, this reinforces our commitment to building Strug Works as a platform that works the same way in every environment. Local, staging, production—if the behavior diverges, something's wrong. Infrastructure shouldn't be a guessing game.
If you're deploying Python apps to Railway, Render, or any container platform: check your import-time assumptions. Read your env vars when you use them, not when the module loads. Your future self will thank you.