I've been working on converting all the interpreter work into an interface that can support standalone interpreter contracts. This will bring several benefits:
- Upgradeability by using newer interpreters in old contracts
- Smaller code size for contracts that use interpreters
- Ability to provide more opcodes in the interpreter contract
- Ability to support third party interpreters that match the interface
The basic process to split out inherited interpreters into standalone intereters is to remove all functionality that relies on inheritence then create a single shared interpreter contract for all existing Rain contracts. Notably this means that anything directly referencing contract storage within an expression needs an alternative mechanism. This is because the standalone interpreter will not have access to the calling contract's storage. Usually this problem is solved using the "delegate call" pattern in the EVM but this allows the called contract to modify the storage of the caller. We can't allow this if the calling contract wants to allow arbitrary interpreters to be called. An interpreter MAY be malicious, so we want any undesirable consequences to be limited to the users that decided for themselves to trust a malicious interpreter. That is why we restrict interpreters to read only calls, so that the worst an interpreter can do is miscalculate some value for the caller. Of course even the ability to manipulate read only logic can be quite bad, especially if used downstream as an oracle. We're only aiming to put limits on what a bad interpreter can do, that can be reasoned about and sandboxed, and in that context a delegate call is unacceptable (because writes from the callee modify the caller).
I identified a few different strategies for providing storage data to intepreters.
- Pass any values that would have been read from storage by an opcode to the interpreter as context instead
- Define an external interface for reading the storage and opcodes to use the interface
- Define external opcodes that can be used directly in expressions, treating the caller as an extension of the interpreter
Each of these have clear tradeoffs, so it's likely that all three strategies will exist in the wild. The second two both require the interpreter to reenter the calling contract, which comes with additional gas, but at least we're guaranteed the caller will be "warm" so the additional gas will be limited to passing inputs and outputs, not loading a cold contract.
The big issue with defining external interfaces on the caller is that the opcodes to read them need to exist in the interpreter. This means the interpreter needs to be aware of the contracts that call it. This is a problem because it undermines the kind of decoupling that we'd hope to achieve with a truly standalone interpreter. It would probably solve our short term issues with the code size of the flow contracts, and then we'd have to refactor again shortly afterwards.
That's exactly why I was working on the external opcodes, to function as a kind of "meta interface" that allows expressions to reach into contracts the interpreter doesn't have any knowledge of, but the expression author does know about.
The main problem right now with external opcodes is that they don't exist yet 🙃
I was hoping to split all the intepreters out into a standalone shared interpreter before tackling the external opcodes in full. Right now there's a proof of concept but no production ready external opcode implementation.
Thankfully there appears to be at least a partial way forward by slightly extending the definition of context.
2 dimensions are better than 1
Context already exists in the interpreter, the calling function can pass in an
uint256 values to
eval and then the expression can reference these
There is a problem here in dynamic length arrays. If we have 2 dynamic length arrays of values then a single index becomes ambiguous.
Consider if we had two arrays,
[1 2 3] and
[4 5 6] and we concatenated them
into a single context
[1 2 3 4 5 6]. The original arrays have completely different
meanings, let's say
[1 2 3] is a list of rankings as the outcome of a tournament
[4 5 6] is a list of tournament sponsors. Now imagine an expression writer
wants to reference the users that came in third place, so they use index
their context to read the third value of the first array (index
0 is the first value).
This all works as long as both arrays are always the same size, but imagine there
is some tournament with 2 players and 4 sponsors, the final context is still 6
values but now the context index
2 references a sponsor, not a player! That kind
of thing is a critical bug that makes money go missing.
What we want is to be able to support dynamic lengths of orthogonal concerns and
have the expression authors safely reference these. If an expression author
references "third place" in a tournament and there are only 2 players, we would
prefer that be an "out of bounds" error than silently pass and treat a sponsor as
a player. Of course, this naive example would result in trapped funds, which is
why e.g. the
Sale contract has a hard timeout that refunds everyone if a bad
expression locks up the normal processing. Failsafes may not always be possible,
Flow has no clear auto-timeout process as it is so abstract, but in many
cases it will be possible to implement failsafes that return everyone's funds in
case of a buggy expression. This is preferrable to an expression clearing funds
to the wrong addresses. Once funds move in a decentralised system it is often
impossible to return them as unintended beneficiaries simply keep their funds.
Luckily we have 2 bytes available to
index into context and we're currently only using 1 of them. This means we can
trivially support 2 dimensional contexts by using the first byte as the "column"
and the second byte as the "row", to use spreadsheet terms. In our above example
imagine that the third ranked player is now
A:A column is dedicated to
players only, and all the sponsors go into the
B:B column. Now it is impossible
for an expression author to accidentally reference a sponsor as long as they stick
to values in the
A:A column. Authors can be confident that they will either
reference a valid player or the expression will rollback, a much simpler binary
outcome to plan around.
This new paradigm immediately helps us refactor out storage reads from local opcodes
into context as e.g. we can move all the data from storage reads into
all the user-provided context into
B:B. In doing so, a storage read from index
1 would be equivalent to a context read from
2 dimensional contexts also open up a new pattern for the future where the stack from some expression can become a context column for a subsequent expression. Chained expressions could drastically improve legibility by breaking up complex "if this then that" style checks into a series of logical steps that are each processed individually by the implementing contract.
Gas & lazy context
Simply putting data into a 2 dimensional context doesn't add much gas overhead at
all as a
uint256 is just a list of pointers to the
There's no additional copying or memory usage beyond the pointers, so in that
sense it is lightweight.
Where we run into gas issues is preloading values from storage. In the current setup we only load something from storage if and when the storage opcode is used. If we want to put all our storage data into context then we'd have to load every possible storage value that might be used by the expression, and pay exorbitant gas for every slot.
Luckily there is a worked example in
OrderBook that uses the scratch space to
track which opcodes are used in the integrity check. We can do the same thing for
context. A single
uint256 value can represent a 16x16 grid of 2D context data
where each bit is a simple yes/no flag tracking which slots are used by the
expression. If a slot is not used then we don't need to load the storage, that
position in context can safely remain a
0 for that expression as we know it will
never be read by the expression.
Gas & sparse context
Context size isn't such an issue currently as all the interpreters are still inherited
so the context is just data in memory. A "large" context for the types of expressions
we're writing are still only a dozen or so
Where this may become an issue is passing context from a calling contract to
a standalone interpreter. In that case we have to pay gas for the data that moves
between contracts, and this can easily be thousands or 10s of thousands of gas
if the context is large. Luckily
0 value bytes are cheaper, and doubly luckily
this seems to be an area of active research upstream to bring the cost of L2s
down, by bringing calldata costs down.
A 2 dimensional context array could be a relatively large gas hit for some
expression that bootstraps and executes today in ~5k gas total. We're getting into
edge case hypotheticals here but imagine some simple expression that reads from
A:15, ideally we'd not pad the context with 15
0 values to put the 16th value
in the correct slot. That would require 512 (16x32) bytes in column
A:A to be
passed between caller and interpreter. Better if we had a single byte to represent
the 15 skipped values and can then only pass 33 bytes to achieve the same. Conceptually
this is how sparse matrix data is
usually represented, but we'd need something highly optimised for EVM gas usage
to be worthwhile. We had something similar in the serialized representation of
interpreter state in the past, and it did add significant complexity to the
serialization and deserialization of data. This complexity comes with a breeding
ground for bugs, and its own gas overhead to process the offsets, so we'd have to
carefully measure and assess the cost/benefit of a sparse context representation.
It is worth noting that if we adopted a sparse context representation then ALL intepreters would need to respect the context handling to be compatible with the caller. This sets an even higher "worth it" bar for the sparse handling, as it will bloat the caller code and force many implementers to lean on libraries as additional dependencies in their repositories, increasing audit costs, maintenance, etc.
A new pattern that 2D context opens up trivially is the idea of signed context. Any contract can accept arrays as calldata with an associated signer and signature. An expression can be written to authorize the signer, and if authorized the signature checked against the data array, and if valid the data array can be used as context for the main expression.
The latest flow contracts leverage this pattern to allow end users to provide their own context as a proof to release some flow. The flow author specifies both the authentication of signer and flow logic, the flow contract handles the authorization of the context based on the (in)valid signature.
Note that signed contexts aren't proving provenance of the signed data, only that
the signer agrees the data is authentic. It is trivial for some signed data to be
re-signed by anyone else, unless the signed data specifies who is allowed to sign
it and the verification logic confirms the signed/signer match
Flow does NOT do).
Hopefully this pattern proves to be a nice bridge between web2 and web3 without relying on heavy-handed oracle solutions. For example, this context signing can allow a single proof to be reused across many contracts on many chains in parallel, whereas oracles would need gas, transactions and coordination across all relevant contexts. Data signing is very familiar to web2 developers, such as in the production of JWT tokens, whereas web3 oracle management is an extremely niche and expensive skillset requiring dedicated infrastructure.
Future problem to solve: Dynamic storage reads
One problem that 2D contexts cannot help with is runtime dynamic slot resolution.
For example, imagine in our example from earlier of the expression that reads from
the third ranked player in some tournament, instead the index is only known at
runtime. Perhaps some external oracle/logic tells the expression which position
is "last" so that instead of erroring when there are only 2 participants, the
expression knows to read index
1 or index
In such cases we cannot simply scan the use of context opcodes during the integrity check and preload values from storage.
The problem is even worse when we consider mappings. In this case there are 2^256 possible slots to read from, effectively infinite.
In these cases we are going to need to use one of the other two strategies listed earlier, either define/use an interface (e.g. an ERC token standard) or implement external opcodes that an expression can pass the storage location in via. inputs to lazy-load storage outputs during expression execution.
In the short term (until I implement external opcodes and standalone interpreter)
this is probably going to mean removing some generality from existing contracts
develop branch. For example, the
OrderBook volume tracking could only read
the volume from the current counterparty, not some arbitrary other counterparty
that is not even involved in the current trade.
I expect this loss of generality to be mostly or completely edge-case-y and NOT impact anyone in practise. If it does, those people will need to wait for future releases that introduce external opcodes to move to latest contract versions.
In fact, the
OrderBook example specifically is dubious anyway as the tracking
to storage only tracks the current counterparty anyway and I'm quite sure none of
our current tooling checks external counterparties to see if those orders are
tracking the data the current order is supposed to be reading... That's a problem
worth it's own blog post and rabbit hole, having the expression author specify
additional tracking, an having cross-referencing expressions check each other's
tracking, and displaying that in the self-audit tooling cleanly...
For now it might be safer just to remove such complexities and foot-guns and only revisit when we can really do it robustly.