Orchestra Conductors are Prompt Engineers
Intro
I’ve played the clarinet since 5th grade, and have spent a lot of time playing music since. I’ve also spent a lot of time writing software. My goal here is to draw an analogy between prompt engineering and music, in hopes of shedding some light on the limitations of today’s AI. I believe that claims about the automation of software engineering, or of white-collar work generally, are overblown.
What Does a Conductor Do
A conductor stands in front of an ensemble of musicians and is primarily responsible for providing feedback and instruction on how to improve their collective performance.
Musicians produce music with some degree of precision. 5th graders make a substantial number of mistakes, even while playing very simple pieces. With years of training, musicians can make fewer mistakes while playing increasingly complex pieces. Professional musicians make very few mistakes, and can successfully perform the most complex pieces composers can come up with.
Conductors communicate with musicians in natural language, hoping to improve their output: “Don’t miss the suspended cymbal entrance at measure 14; remember that we accelerate the tempo at measure 125; measure 230 is the climax of the piece.”
Conductors are also often the ones responsible for choosing which pieces the ensemble will play. Conductors of professional musicians can pick essentially any interesting piece, regardless of difficulty, because the individual performers are so skilled. Conductors of amateurs and beginners have far fewer options. Asking 5th graders to play Shostakovich would be catastrophic; asking them to play a simple arrangement of the Star Wars theme is tractable.
What Does a Prompt Engineer Do
A prompt engineer asks AI models to complete tasks. AI models are prone to making mistakes, so a prompt engineer’s job is to phrase requests in ways that minimize those mistakes, and to find tasks that are tractable for an AI model to complete. That’s it, that’s the analogy!
The models we have today are more talented at white-collar work than a 5th grader is at playing the trombone. The fact that a model can play the intellectual equivalent of the Star Wars theme with very little prompting is basically magic (not sarcasm). In some specific domains, well-prompted AI models can produce truly impressive output.
The problem is that to automate white-collar work (a goal AI companies frequently tout), we need AI models that can perform the intellectual equivalent of Shostakovich across a wide variety of areas. By my rough estimation, today’s AI models are about as talented, and as mistake-prone, as a high schooler, maybe a decent college player. High school and college players can make catastrophic errors while playing music, and it doesn’t take a very difficult piece to break them.
Where the Analogy Breaks
If high schoolers completely botch their playing of a symphony, their parents will still clap.
If a prompt engineer “vibe coding” an app introduces a critical security error, real people’s data will be compromised. If a prompt engineer creates an AI therapist that doesn’t properly detect signs of mental illness, a patient can end up in a psychiatric hospital. You get the point.
Maybe the models will get good! Maybe everything will be fine. There are some truly brilliant people working on this problem, and I wouldn’t be shocked if they improved the models drastically, faster than a 5th grader becomes a member of the New York Philharmonic. But if they don’t, maybe this whole AI thing won’t have as much of an impact as we’ve been led to believe.