If you're working in AI, you're probably working in Python. Maybe you have a webapp in whatever JS framework is popular right now, but most of the core tooling in AI is built in and around Python. So maybe it’s time for a Go programmer like me to figure out how the production Python gets made.
Last week I rediscovered Development Containers. When you use them, you do all your development in a container instead of directly on your machine. This container is defined using a devcontainer.json file, and when you create a development container, it's rebuilt from scratch every time. This means that when you get your build working in development, it won't just work on your machine. It'll work on anyone's machine.
Having to use Python shouldn't be that big of a deal; Python was my first programming language. But there's one small problem that has left me thinking I've been cursed by an elder deity: Python environment management tools randomly break for me. I've never been able to figure out why this happens, but in the last three years I have not been able to get basic editing, testing, or other project management tooling to work reliably. I've spent hours debugging weird SIGBUS errors that nobody else can reproduce, and other things that go way above and beyond normal debugging.
The biggest thing that breaks is the language server in my editor. Without a working language server, I don't know what I'm allowed to do with any given thing in a file unless I keep a bunch of documentation tabs open. Combine that with Python not having a standard documentation site the way Go does, and figuring out what I can do is not easy.
Making things worse, there are as many ways to manage Python as there are grains of sand on the planet. Starting to use Python means you get to make a lot of lovely decisions:
- What environment manager are you using? Conda? Virtualenv? uv? Anaconda? Miniconda? Homebrew? Pipenv?
- Which version of Python does your project depend on? Many big libraries like tensorflow hook deeply into the interpreter for performance reasons, and as a result they often can't work on newer versions of it.
- How are you installing your dependencies? pip? pip3? uv?
There has to be some kind of middle path. We should be able to have nice things like the ability to just open a git repo and get a working development environment, right?
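To make that middle path concrete, here's a minimal sketch of a devcontainer.json that pins the Python version for everyone up front. This is an illustrative example, not taken from a real project: the version number is an arbitrary assumption, though the python feature shown is the official devcontainers one.

```json
{
  "name": "Python",
  "image": "mcr.microsoft.com/devcontainers/base:bookworm",
  "features": {
    // official Python feature; "version" pins the interpreter
    // (3.11 is an arbitrary choice for illustration)
    "ghcr.io/devcontainers/features/python:1": {
      "version": "3.11"
    }
  }
}
```

With something like this checked into the repo, "which Python?" stops being a question anyone has to answer twice.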
How it works
When you package your app in a Docker image, you make a Dockerfile manifest with a base image and then list out all the changes you make to that base image to get things working. This could be anything from copying your source code into the image, building that code, installing dependencies, or anything else that boils down to copying files and running commands. When you define a development container, you make a devcontainer.json manifest that specifies the base image you're working from and any features you want to add to it.
For example, let's consider what you need to do in order to get a Node.js environment working. Here's a sample devcontainer.json file for working with Node:
```json
{
  "name": "Node",
  "image": "mcr.microsoft.com/devcontainers/base:bookworm",
  "features": {
    "ghcr.io/devcontainers/features/node:1": {},
    "ghcr.io/devcontainers-extra/features/neovim-apt-get:1": {}
  },
  "postCreateCommand": "npm ci"
}
```
This tells your editor to make a copy of Microsoft’s base Debian image with Node and neovim automatically installed. It also installs all of your Node dependencies so that all you need to do to get up and running is:
- Open the repo in a development container
- Open a terminal
- Run npm run start
- There is no step 4.
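You can trim the workflow down even further with lifecycle and port settings. Here's a sketch, assuming a typical Node web app; the port number and the exact commands are assumptions for illustration, but forwardPorts and postAttachCommand are standard devcontainer.json properties:

```json
{
  "name": "Node",
  "image": "mcr.microsoft.com/devcontainers/base:bookworm",
  "features": {
    "ghcr.io/devcontainers/features/node:1": {}
  },
  // runs once, when the container is first created
  "postCreateCommand": "npm ci",
  // forward the app's port to your machine (3000 is an assumption)
  "forwardPorts": [3000],
  // runs every time your editor attaches to the container
  "postAttachCommand": "npm run start"
}
```

With a setup like that, even the "open a terminal" step disappears: attaching to the container starts the dev server for you.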
Just imagine what that workflow could give you. Spinning people up would be a walk in the park.
What about Python?
You’re probably sitting there asking yourself “yeah, that’s cool, but what about Python?” Python presents a lot of challenges for development use because there are so many variables at play. If you know what you’re doing, this is fine and manageable. If you don’t, you end up in pip hell. You don't want to be in pip hell with me.
One of the big things that development containers give teams that have a mix of Python experts and non-Python experts is the ability to just have a known working setup for people to fall back on in case they aren’t an expert in Python environment metaphysics. It’s great for people like me who care about the end result, but do not care at all how things go about getting done as long as it works (for some reasonable definition of “works”). Even better, you can define editor configuration settings and a list of extensions specifically for that project, meaning that you really can just open a new repo and get up and running within seconds.
This editor preconfiguration means you can fix problems like “What version of Python do I need?” or “How do I just install the dependencies?” forever. Take tigrisdata-community/huggingface-datasets-with-tigris for example. Its devcontainer.json answers those questions for you:
```json
{
  // ...
  "postCreateCommand": "uv python install && uv venv && uv sync",
  "remoteEnv": {
    "UV_LINK_MODE": "copy",
    "UV_PYTHON": "3.10"
  }
  // ...
}
```
When you create a development container with this manifest, it does the following:
- Installs Python 3.10.x with uv
- Creates a Python virtual environment for all of your dependencies
- Installs all of the Python dependencies
And then you can run the code with uv run and things Just Work™. All of that complicated dependency management becomes your environment’s problem. Even better, take a look at this part of the manifest:
```json
{
  // ...
  "customizations": {
    "vscode": {
      "extensions": [
        "ms-python.python",
        "ms-python.vscode-pylance",
        "tamasfe.even-better-toml",
        "ms-toolsai.jupyter",
        "ms-toolsai.vscode-jupyter-cell-tags",
        "ms-toolsai.jupyter-renderers",
        "ms-toolsai.vscode-jupyter-slideshow",
        "ms-python.debugpy",
        "ms-toolsai.jupyter-keymap",
        "amazonwebservices.aws-toolkit-vscode"
      ],
      "settings": {
        "python.defaultInterpreterPath": "./.venv/bin/python"
      }
    }
  }
  // ...
}
```
This makes VS Code install every extension you need for a working development environment, and that python.defaultInterpreterPath setting is the cherry on top that makes the language server integration work. This lets you simply clone a repo and get a working language server.
Conclusion
I realize this sounds like a fairly simple thing (and let's be honest, it should be this simple), but it's taken me three years of experimentation, toil, and suffering to get to the point where you really can just clone a repo and get working language server integration. If you have also been suffering while trying to get Python installed so you can vibe code your way to an IPO, give development containers a try.
This even works if you use GitHub Codespaces, meaning that you don’t even need to install a copy of VS Code to work on the project.