LMQL (arxiv) is a tool for steering language model generation. Its language interface neatly stitches together a set of features such as:
- String manipulation: filling holes and substituting variables
- String constraints on output tokens, switching generation on and off
- Tool use
Here’s how it works:
1. String manipulation: [WORDS] denotes a hole for the language model to fill; {WORDS} denotes a variable that exists in the surrounding scope. For instance:
Write a summary of {name}, the singer:
{{
"name": "[STRING_VALUE]",
"age": [INT_VALUE],
"top_songs": [[
"[STRING_VALUE]",
"[STRING_VALUE]"
]]
}}
Given the context {'name': 'Bruno Mars'}, the variable {name} is replaced with Bruno Mars, giving the initial prompt Write a summary of Bruno Mars, the singer: \{\{ "name": ". Generation then produces a few tokens followed by a quote "; LMQL detects the quote, stops the generation, appends ", "age": , and continues.
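To make the stop-and-resume loop concrete, here is a rough Python sketch (not LMQL itself): `generate` is a stand-in for any completion API that takes a prompt plus a stop sequence, and the canned outputs are made up for illustration.

```python
canned = iter(["Bruno Mars", "38", "Just the Way You Are"])

def generate(prompt: str, stop: str) -> str:
    # Stand-in for a real model call that generates until the stop string.
    return next(canned)

context = {"name": "Bruno Mars"}

# Template pieces; each hole ends when the model emits the stop string.
template = [
    ('Write a summary of {name}, the singer:\n{{\n  "name": "', '"'),
    ('",\n  "age": ', ','),
    (',\n  "top_songs": [\n    "', '"'),
]

prompt, values = "", []
for scaffold, stop in template:
    prompt += scaffold.format(**context)  # substitute {name}; '{{' escapes to '{'
    value = generate(prompt, stop=stop)   # model fills the hole up to `stop`
    values.append(value)
    prompt += value                       # filled value stays in the running context

print(values)  # ['Bruno Mars', '38', 'Just the Way You Are']
```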
2. String constraints using masking: constraints and options both rest on the simple idea of limiting the tokens the model can choose from. In the OpenAI API, this is done by setting the logit_bias parameter, with a format like {"50256": -100, ...}.
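As a rough sketch with the OpenAI Python SDK (the model name and token IDs below are placeholders; the real IDs depend on the model's tokenizer, e.g. look them up with tiktoken):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user",
               "content": "Is Paris in France? Answer yes or no."}],
    max_tokens=1,
    # Placeholder token IDs: +100 strongly favors a token, -100 effectively
    # bans it, so biasing only "yes"/"no" tokens restricts the choice to them.
    logit_bias={"13022": 100, "2201": 100},
)
print(resp.choices[0].message.content)
```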
3. Tool use: This is similar to variable substitution, except that a function is called to produce the variable’s value. It differs from the commonly used ‘tool use’ where the model chooses among a set of tools. In theory, though, nothing stops LMQL from switching to structured output to choose a tool, running it, and switching back to the previous generation context.
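A rough sketch of the idea; get_discography and generate are made-up helpers standing in for a real tool and a real model call:

```python
def get_discography(artist: str) -> str:
    # Hypothetical "tool": in practice this could query a music database or API.
    return "Doo-Wops & Hooligans (2010), Unorthodox Jukebox (2012)"

def generate(prompt: str) -> str:
    return "<model completion>"  # stand-in for a real model call

context = {
    "name": "Bruno Mars",                      # plain variable
    "albums": get_discography("Bruno Mars"),   # value produced by a function call
}

prompt = (
    "Known albums: {albums}.\n"
    "Write a summary of {name}, the singer:"
).format(**context)
print(generate(prompt))
```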
Why didn’t it take off?
I suspect there are two reasons: real-world use cases are largely satisfied by structured/JSON output, and instruction following has improved a lot. Combining the two gives a decent alternative to LMQL: plain string interpolation using context pulled from JSON output.
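Roughly, that alternative looks like this (a sketch assuming the OpenAI JSON mode via response_format; the model name and prompts are placeholders):

```python
import json
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{
        "role": "user",
        "content": 'Return JSON with keys "name", "age", "top_songs" '
                   "for the singer Bruno Mars.",
    }],
    response_format={"type": "json_object"},  # ask for well-formed JSON
)
info = json.loads(resp.choices[0].message.content)

# The "LMQL part" collapses to ordinary string interpolation over the JSON.
summary_prompt = (
    f"Write a summary of {info['name']}, who is {info['age']} years old "
    f"and known for {', '.join(info['top_songs'])}."
)
print(summary_prompt)
```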