On Frugal AI Usage
I worry a lot about what will happen when the music stops and the bill for the trillions of dollars spent on AI, data centers, chips, and memory comes due. That bill is going to be paid by the organizations that can afford to stay locked in to the providers they’ve chosen and the services they’ve built on them.
And we all know those organizations are not going to be in civil society.
So this raises the question: how do we benefit from the boom in heavily subsidized AI without becoming a victim of it later? How do we avoid getting locked into skyrocketing usage pricing or giving up tools we’ve spent time building our workflows around?
In other words, how do we articulate principles of frugal AI usage?
First, a definition. Frugal AI usage means limiting compute to the minimum required. Right now, with per-user license pricing, this won’t net you direct savings. Later, when pricing shifts more completely to license plus usage-based billing (where you pay by the token, the basic unit of data an AI processes), it will matter a lot. Frugal AI also reduces the environmental harm of AI use.
So back to the principles:
- Use AI to build automations, but keep AI out of the repetitive execution. You don’t want to build a tool that relies on AI for the whole thing. Running every step through AI is slower, more expensive, and less reliable.
- Decompose your process into steps. Determine if a step is best accomplished by deterministic processes (fixed, predictable rules) or probabilistic processes (AI guesswork).
- Use a hybrid approach with local models. When you are iterating, testing, or working on well-documented processes outside your personal skills, move the work to a local model running right on your machine.
- Keep your context, instructions, and common prompts in lightweight, portable structured files. Think markdown or JSON files. They live on your hard drive and are easy to edit and reference.
- Use the principles of progressive disclosure. Only provide the required information to the AI at the required times. Because context equals cost, don’t burden the AI with rules it doesn’t need yet.
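To make the "decompose your process into steps" principle concrete, here is a minimal sketch. It assumes a made-up three-step text workflow; the `Step` class, the step names, and `fake_model` are all hypothetical, and `fake_model` is only a stand-in for a real model call. The point is that each step is labeled as deterministic or AI-backed, so you can see how few steps actually need a model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    run: Callable[[str], str]
    uses_ai: bool = False  # True only where generation or judgment is needed


def fake_model(text: str) -> str:
    """Hypothetical stand-in for a model call; a real one would cost tokens."""
    return f"SUMMARY OF: {text}"


# Hypothetical workflow: only the middle step is probabilistic.
steps = [
    Step("normalize whitespace", lambda t: " ".join(t.split())),
    Step("summarize", fake_model, uses_ai=True),
    Step("add footer", lambda t: t + "\n-- generated by workflow"),
]


def run_pipeline(text: str, steps: list[Step]) -> tuple[str, int]:
    """Run each step in order and count how many of them called the model."""
    ai_calls = 0
    for step in steps:
        text = step.run(text)
        if step.uses_ai:
            ai_calls += 1
    return text, ai_calls


result, ai_calls = run_pipeline("  quarterly   donor report  ", steps)
print(ai_calls)  # 1
```

Writing the flag down per step is a design choice, not a requirement; it simply makes the AI-dependent surface of a workflow visible and easy to shrink.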
A Real-World Example: My Python-PPTX Effort
I can use a small project to demonstrate this. I used a command-line AI tool to build a workflow to help me make brand-compliant PowerPoints according to the rules I follow when making decks. It’s made up of a PowerPoint template in our brand style, a rules document saved as a markdown file, and a Python script that generates the PowerPoint. Simple.
I start the Python script in Terminal. I get a prompt asking for the topic of my presentation or the path to relevant files. Only then, and for the first time, does it invoke GenAI, which generates the content using my rules file. Finally, control returns to the Python script, which puts that text into the approved template and deposits the result in a specific folder on my hard drive. From there, I open it up and edit by hand. The whole process takes about three minutes.
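The flow described above might look roughly like the sketch below. This is not the actual script: every function name, the "Title | Body" line format, and the sample rule text are assumptions made for illustration. `generate_outline` is a hypothetical stub standing in for the single GenAI call. `render_deck` uses the third-party python-pptx library and assumes the template's layout at index 1 has a title placeholder and a body placeholder with idx 1, which may not hold for a given brand template.

```python
from pathlib import Path


def build_prompt(rules: str, topic: str) -> str:
    """Combine the rules file text and the topic into one prompt (deterministic)."""
    return (
        f"{rules.strip()}\n\nTopic: {topic.strip()}\n"
        "Return one slide per line as 'Title | Body'."
    )


def generate_outline(prompt: str) -> str:
    """Hypothetical stub for the one GenAI call; swap in a real client here."""
    return "Why frugal AI | Placeholder body text\nNext steps | Placeholder body text"


def parse_slides(outline: str) -> list[tuple[str, str]]:
    """Turn 'Title | Body' lines into (title, body) pairs (deterministic)."""
    slides = []
    for line in outline.splitlines():
        if "|" in line:
            title, body = line.split("|", 1)
            slides.append((title.strip(), body.strip()))
    return slides


def render_deck(slides: list[tuple[str, str]], template_path: Path, out_path: Path) -> None:
    """Fill the brand template with python-pptx (pip install python-pptx)."""
    from pptx import Presentation  # lazy import so the rest runs without it

    prs = Presentation(str(template_path))
    layout = prs.slide_layouts[1]  # assumption: a title-and-content layout
    for title, body in slides:
        slide = prs.slides.add_slide(layout)
        slide.shapes.title.text = title
        slide.placeholders[1].text = body  # assumption: body placeholder idx is 1
    prs.save(str(out_path))


# Non-interactive demo of the steps before rendering.
rules = "- Headlines are a single sentence."  # made-up rule text
prompt = build_prompt(rules, "Frugal AI usage")
slides = parse_slides(generate_outline(prompt))
print(slides)
```

In this sketch the model is touched exactly once, in `generate_outline`; prompting, parsing, and rendering are ordinary Python. With a real template and python-pptx installed, a call such as `render_deck(slides, Path("brand_template.pptx"), Path("deck.pptx"))` would write the file (both paths are placeholders).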
Here’s how the frugal AI principles apply:
- I used AI to build the automation, not run it entirely. I didn’t ask the AI to handle the deck design or layout. Python is a well-established tool for generating PowerPoints, so I used that. Our template is deterministic. I don’t even use AI to create images for the deck; I just instruct it to put in placeholders, each with a short design brief describing the kind of image that might go there.
- I used a hybrid model approach. I shifted tools based on the phase of the project. I used a big, cloud-based AI for the initial design and brainstorming. But when I went from design to actually building and fixing the script, I switched to a local model on my machine. I could go back and forth, get it working, and never need to touch the internet. Zero paid tokens used for that portion, no internet bandwidth burned, but I still got the benefit of AI.
- My rules live in a markdown file. I can invoke that same file in different processes with different tools—my local model or any of the big commercial LLM tools out there. If I want to change my rules—say, go from a single-sentence headline to a descriptive phrase—I just open that document and edit it. No AI needed. Change the rule once, and I change it everywhere that markdown file is referenced. This keeps me from being locked into any single provider.
- I practice progressive disclosure so I only use what I need, when I need it. Let’s imagine I make my little project more complex: I want a set of rules for how numbers, charts, and graphs are displayed. I could put that all in my main rules document, but then that text is longer, and I’m burdening the AI’s instruction set with things that are only needed some of the time. Instead, I make a separate markdown file for my numbers rules. Then I update my script: if the information is general, just use my global rules. If the information includes sets of numbers, only then do we pull in the numbers rules document.
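The conditional loading described in the last bullet could be sketched as follows. The file names `global_rules.md` and `number_rules.md`, the function names, and the "two or more numbers" heuristic are all assumptions made for illustration; a real script would pick its own trigger for pulling in the extra rules.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical heuristic: two or more numeric tokens suggest data-heavy input.
NUMBER_PATTERN = re.compile(r"\d+(?:[.,]\d+)*%?")


def needs_number_rules(source_text: str, threshold: int = 2) -> bool:
    """Return True when the input contains at least `threshold` numbers."""
    return len(NUMBER_PATTERN.findall(source_text)) >= threshold


def select_rule_files(source_text: str, rules_dir: Path) -> list[Path]:
    """Always include the global rules; add the numbers rules only on demand."""
    files = [rules_dir / "global_rules.md"]  # hypothetical file name
    if needs_number_rules(source_text):
        files.append(rules_dir / "number_rules.md")  # hypothetical file name
    return files


def load_context(source_text: str, rules_dir: Path) -> str:
    """Join the selected rule files into the context sent to the model."""
    paths = select_rule_files(source_text, rules_dir)
    return "\n\n".join(p.read_text(encoding="utf-8") for p in paths)


# Demo with throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    rules_dir = Path(tmp)
    (rules_dir / "global_rules.md").write_text(
        "# Global rules\n- One-sentence headlines.\n", encoding="utf-8"
    )
    (rules_dir / "number_rules.md").write_text(
        "# Number rules\n- Round to whole percentages.\n", encoding="utf-8"
    )
    general_context = load_context("Our mission is to serve the community.", rules_dir)
    numeric_context = load_context("Giving grew 12% to 3,400 donors.", rules_dir)

print("Number rules" in general_context)  # False
print("Number rules" in numeric_context)  # True
```

Because the rules stay in plain markdown files, the same `load_context` output can be handed to a local model or a commercial one without changes.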
While this example is small, scale this approach across an organization running tens, hundreds, or thousands of workflows a day, and the token savings become staggering. You add context only when you need it, which takes less compute, costs less, and can make hallucinations less likely.
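A back-of-the-envelope sketch of that scaling argument follows. Every number here is a hypothetical placeholder, not a measurement: the run count, the token sizes of the two rule files, and the number of runs that need the extra rules are invented to show the arithmetic. It counts only the tokens spent on rule files, not the rest of each prompt.

```python
def daily_rule_tokens(
    runs: int,
    global_tokens: int,
    extra_tokens: int,
    runs_needing_extra: int,
    progressive: bool,
) -> int:
    """Tokens spent per day on rule files alone, with or without on-demand loading."""
    if progressive:
        # Every run pays for the global rules; only some pay for the extra rules.
        return runs * global_tokens + runs_needing_extra * extra_tokens
    # Every run pays for both files, whether it needs them or not.
    return runs * (global_tokens + extra_tokens)


# Hypothetical inputs for illustration only.
RUNS_PER_DAY = 1_000
GLOBAL_RULE_TOKENS = 800
NUMBER_RULE_TOKENS = 1_200
RUNS_WITH_NUMBERS = 200

always = daily_rule_tokens(
    RUNS_PER_DAY, GLOBAL_RULE_TOKENS, NUMBER_RULE_TOKENS, RUNS_WITH_NUMBERS, progressive=False
)
on_demand = daily_rule_tokens(
    RUNS_PER_DAY, GLOBAL_RULE_TOKENS, NUMBER_RULE_TOKENS, RUNS_WITH_NUMBERS, progressive=True
)
saved = always - on_demand

print(always, on_demand, saved)  # 2000000 1040000 960000
print(f"{saved / always:.0%}")   # 48%
```

To turn the token difference into money, multiply `saved` by whatever your provider charges per input token; no price is assumed here.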
I’m still working through these rules, these examples, and how they play out in organizations and across the sector. I welcome any feedback.