
2 Dimensional interpreter context

· 13 min read
David Meister

I've been working on converting all the interpreter work into an interface that can support standalone interpreter contracts. This will bring several benefits:

  • Upgradeability by using newer interpreters in old contracts
  • Smaller code size for contracts that use interpreters
  • Ability to provide more opcodes in the interpreter contract
  • Ability to support third party interpreters that match the interface

The basic process to split out inherited interpreters into standalone interpreters is to remove all functionality that relies on inheritance, then create a single shared interpreter contract for all existing Rain contracts. Notably this means that anything directly referencing contract storage within an expression needs an alternative mechanism, because the standalone interpreter will not have access to the calling contract's storage.

Usually this problem is solved in the EVM using the "delegate call" pattern, but that allows the called contract to modify the storage of the caller. We can't allow this if the calling contract wants to allow arbitrary interpreters to be called. An interpreter MAY be malicious, so we want any undesirable consequences to be limited to the users who decided for themselves to trust a malicious interpreter. That is why we restrict interpreters to read-only calls, so that the worst an interpreter can do is miscalculate some value for the caller. Of course even the ability to manipulate read-only logic can be quite bad, especially if used downstream as an oracle. We're only aiming to put limits on what a bad interpreter can do that can be reasoned about and sandboxed, and in that context a delegate call is unacceptable (because writes from the callee modify the caller).
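To make the read-only constraint concrete, here's a minimal sketch of what such an interface could look like. The names (`IInterpreter`, `eval`) and parameters are illustrative only, not the final Rain interface; the point is that eval is a view function, so a malicious interpreter can at worst return a bad stack, never write to the caller's storage.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.17;

/// Illustrative sketch only, not the final Rain interface.
/// The caller hands the interpreter an expression and some context;
/// because `eval` is `view`, the interpreter cannot write to the
/// caller's storage, which is the sandboxing property described above.
interface IInterpreter {
    /// @param expressionPointer Identifies a previously deployed expression.
    /// @param context Caller provided values the expression may reference.
    /// @return stack The final stack produced by evaluating the expression.
    function eval(
        address expressionPointer,
        uint256[] calldata context
    ) external view returns (uint256[] memory stack);
}
```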

I identified a few different strategies for providing storage data to interpreters:

  • Pass any values that would have been read from storage by an opcode to the interpreter as context instead
  • Define an external interface for reading the storage and opcodes to use the interface
  • Define external opcodes that can be used directly in expressions, treating the caller as an extension of the interpreter

Each of these has clear tradeoffs, so it's likely that all three strategies will exist in the wild. The latter two both require the interpreter to reenter the calling contract, which comes with additional gas, but at least we're guaranteed the caller will be "warm", so the additional gas will be limited to passing inputs and outputs, not loading a cold contract.
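As a rough illustration of the second strategy, the caller could expose a read-only interface over its own storage that interpreter opcodes reenter. The name and signature below are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.17;

/// Hypothetical interface for the second strategy: the caller exposes a
/// read-only view over its own storage and interpreter opcodes call back
/// into it. The caller is already "warm" so the reentrant call is
/// relatively cheap, but the interpreter still has to know this interface.
interface IStorageReader {
    /// @param key Identifies which piece of caller storage to read.
    /// @return value The stored value, as resolved by the caller.
    function storageRead(uint256 key) external view returns (uint256 value);
}

/// Inside the interpreter, a storage-reading opcode would do something like:
///   uint256 value = IStorageReader(msg.sender).storageRead(key);
```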

The big issue with defining external interfaces on the caller is that the opcodes to read them need to exist in the interpreter. This means the interpreter needs to be aware of the contracts that call it. This is a problem because it undermines the kind of decoupling that we'd hope to achieve with a truly standalone interpreter. It would probably solve our short term issues with the code size of the flow contracts, and then we'd have to refactor again shortly afterwards.

That's exactly why I was working on the external opcodes, to function as a kind of "meta interface" that allows expressions to reach into contracts the interpreter doesn't have any knowledge of, but the expression author does know about.

The main problem right now with external opcodes is that they don't exist yet 🙃

I was hoping to split all the interpreters out into a standalone shared interpreter before tackling the external opcodes in full. Right now there's a proof of concept but no production-ready external opcode implementation.

Thankfully there appears to be at least a partial way forward by slightly extending the definition of context.

2 dimensions are better than 1

Context already exists in the interpreter: the calling function can pass an array of uint256 values to eval, and the expression can then reference these by index.

There is a problem here with dynamic length arrays: if we have 2 dynamic length arrays of values then a single index becomes ambiguous.

Consider if we had two arrays, [1 2 3] and [4 5 6], and we concatenated them into a single context [1 2 3 4 5 6]. The original arrays have completely different meanings; let's say [1 2 3] is a list of rankings as the outcome of a tournament and [4 5 6] is a list of tournament sponsors. Now imagine an expression writer wants to reference the player that came in third place, so they use index 2 in their context to read the third value of the first array (index 0 is the first value). This all works as long as both arrays are always the same size, but imagine there is some tournament with 2 players and 4 sponsors. The final context is still 6 values, but now context index 2 references a sponsor, not a player! That kind of thing is a critical bug that makes money go missing.

What we want is to be able to support dynamic lengths of orthogonal concerns and have expression authors safely reference these. If an expression author references "third place" in a tournament and there are only 2 players, we would prefer that to be an "out of bounds" error rather than silently passing and treating a sponsor as a player. Of course, this naive example would result in trapped funds, which is why e.g. the Sale contract has a hard timeout that refunds everyone if a bad expression locks up the normal processing. Failsafes may not always be possible, e.g. Flow has no clear auto-timeout process as it is so abstract, but in many cases it will be possible to implement failsafes that return everyone's funds in case of a buggy expression. This is preferable to an expression clearing funds to the wrong addresses. Once funds move in a decentralised system it is often impossible to return them, as unintended beneficiaries simply keep their funds.

Luckily we have 2 bytes available to index into context and we're currently only using 1 of them. This means we can trivially support 2 dimensional contexts by using the first byte as the "column" and the second byte as the "row", to use spreadsheet terms. In our above example imagine that the third ranked player is now A3, as the A:A column is dedicated to players only and all the sponsors go into the B:B column. Now it is impossible for an expression author to accidentally reference a sponsor as long as they stick to values in the A:A column. Authors can be confident that they will either reference a valid player or the expression will roll back, a much simpler binary outcome to plan around.
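A minimal sketch of what a 2D context opcode could look like follows. The exact operand layout (high byte = column, low byte = row) is an assumption for illustration, and the library name is made up; the important property is that out of bounds reads revert instead of silently reading the wrong column.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.17;

/// Illustrative 2D context read. The 2 operand bytes are split into a
/// column byte and a row byte; reading outside either dimension reverts,
/// so an expression can never silently treat a sponsor as a player.
library LibContextRead {
    function read(
        uint256[][] memory context,
        uint256 operand
    ) internal pure returns (uint256) {
        uint256 column = (operand >> 8) & 0xFF;
        uint256 row = operand & 0xFF;
        // Solidity's built in array bounds checks would revert anyway;
        // the explicit requires just document the intent.
        require(column < context.length, "CONTEXT_COLUMN_OOB");
        require(row < context[column].length, "CONTEXT_ROW_OOB");
        return context[column][row];
    }
}
```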

This new paradigm immediately helps us refactor storage reads out of local opcodes and into context, e.g. we can move all the data from storage reads into A:A and all the user-provided context into B:B. In doing so, a storage read from index 1 would be equivalent to a context read from A1.
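On the caller side that refactor might look roughly like the sketch below. The contract and storage variables are hypothetical; the point is that column 0 (A:A) is built from the caller's own storage and column 1 (B:B) is whatever context the caller/user provides.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.17;

/// Hypothetical caller that replaces storage opcodes with context.
/// Column 0 (A:A) carries values read from the caller's own storage,
/// column 1 (B:B) carries caller/user provided values, so an old
/// "storage read from index 1" becomes a context read from A1.
contract ContextCaller {
    uint256 internal totalShares;   // hypothetical storage
    uint256 internal totalReleased; // hypothetical storage

    function buildContext(
        uint256[] memory userContext
    ) internal view returns (uint256[][] memory context) {
        uint256[] memory storageColumn = new uint256[](2);
        storageColumn[0] = totalShares;   // A0
        storageColumn[1] = totalReleased; // A1

        context = new uint256[][](2);
        context[0] = storageColumn; // A:A
        context[1] = userContext;   // B:B
    }
}
```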

2 dimensional contexts also open up a new pattern for the future where the stack from some expression can become a context column for a subsequent expression. Chained expressions could drastically improve legibility by breaking up complex "if this then that" style checks into a series of logical steps that are each processed individually by the implementing contract.

Gas & lazy context

Simply putting data into a 2 dimensional context doesn't add much gas overhead at all as a uint256[][] is just a list of pointers to the uint256[] arrays. There's no additional copying or memory usage beyond the pointers, so in that sense it is lightweight.

Where we run into gas issues is preloading values from storage. In the current setup we only load something from storage if and when the storage opcode is used. If we want to put all our storage data into context then we'd have to load every possible storage value that might be used by the expression, and pay exorbitant gas for every slot.

Luckily there is a worked example in OrderBook that uses the scratch space to track which opcodes are used in the integrity check. We can do the same thing for context. A single uint256 value can represent a 16x16 grid of 2D context data where each bit is a simple yes/no flag tracking which slots are used by the expression. If a slot is not used then we don't need to load the storage; that position in context can safely remain 0 for that expression, as we know it will never be read by the expression.
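A sketch of that bitmap bookkeeping, assuming the 16x16 layout described above (names are illustrative): the integrity check sets one bit per (column, row) pair it sees, and the caller only loads storage for cells whose bit is set.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.17;

/// Illustrative scratch tracking for 2D context usage. A 16x16 grid of
/// (column, row) cells is exactly 256 cells, so one bit per cell fits in
/// a single uint256. The integrity check sets bits as it encounters
/// context opcodes; later the caller checks bits to decide whether a
/// storage load is actually needed.
library LibContextScratch {
    function flag(uint256 column, uint256 row) internal pure returns (uint256) {
        require(column < 16 && row < 16, "SCRATCH_OOB");
        return 1 << (column * 16 + row);
    }

    /// Called during the integrity check for every context opcode seen.
    function set(uint256 scratch, uint256 column, uint256 row) internal pure returns (uint256) {
        return scratch | flag(column, row);
    }

    /// Called by the caller when building context: skip unused cells.
    function isUsed(uint256 scratch, uint256 column, uint256 row) internal pure returns (bool) {
        return scratch & flag(column, row) != 0;
    }
}
```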

Gas & sparse context

Context size isn't such an issue currently as all the interpreters are still inherited, so the context is just data in memory. A "large" context for the types of expressions we're writing is still only a dozen or so uint256 values.

Where this may become an issue is passing context from a calling contract to a standalone interpreter. In that case we have to pay gas for the data that moves between contracts, and this can easily be thousands or tens of thousands of gas if the context is large. Luckily zero-value bytes are cheaper in calldata, and doubly luckily this seems to be an area of active research upstream to bring the cost of L2s down by bringing calldata costs down.

A 2 dimensional context array could be a relatively large gas hit for some expression that bootstraps and executes today in ~5k gas total. We're getting into edge case hypotheticals here, but imagine some simple expression that reads from A:15; ideally we'd not pad the context with 15 zero values just to put the 16th value in the correct slot. That would require 512 (16x32) bytes in column A:A to be passed between caller and interpreter. Better if we had a single byte to represent the 15 skipped values and could then pass only 33 bytes to achieve the same. Conceptually this is how sparse matrix data is usually represented, but we'd need something highly optimised for EVM gas usage to be worthwhile. We had something similar in the serialized representation of interpreter state in the past, and it did add significant complexity to the serialization and deserialization of data. This complexity is a breeding ground for bugs and brings its own gas overhead to process the offsets, so we'd have to carefully measure and assess the cost/benefit of a sparse context representation.
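Purely to illustrate the idea (this is NOT an implemented or benchmarked format), one possible encoding is a run of entries where each entry is a single "rows to skip" byte followed by a 32 byte value; the A:15 example above then becomes one 33 byte entry instead of 16 zero-padded words.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.17;

/// Purely illustrative sparse column decoding: each entry is 1 byte of
/// "rows to skip" followed by a 32 byte value. Decoding expands the
/// entries back into a dense column in memory. Not implemented or
/// benchmarked anywhere; just a sketch of the tradeoff discussed above.
library LibSparseColumn {
    function decode(
        bytes memory encoded,
        uint256 columnLength
    ) internal pure returns (uint256[] memory column) {
        column = new uint256[](columnLength);
        uint256 row = 0;
        uint256 i = 0;
        while (i < encoded.length) {
            // 1 byte skip count moves the cursor down the column.
            row += uint8(encoded[i]);
            i += 1;
            // 32 byte big-endian value.
            uint256 value = 0;
            for (uint256 j = 0; j < 32; j++) {
                value = (value << 8) | uint8(encoded[i + j]);
            }
            i += 32;
            column[row] = value;
            row += 1;
        }
    }
}
```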

It is worth noting that if we adopted a sparse context representation then ALL interpreters would need to respect the context handling to be compatible with the caller. This sets an even higher "worth it" bar for the sparse handling, as it will bloat the caller code and force many implementers to lean on libraries as additional dependencies in their repositories, increasing audit costs, maintenance, etc.

Signed context

A new pattern that 2D context opens up trivially is the idea of signed context. Any contract can accept arrays as calldata with an associated signer and signature. An expression can be written to authorize the signer; if the signer is authorized the signature is checked against the data array, and if the signature is valid the data array can be used as context for the main expression.
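A minimal sketch of the general shape of that check follows, using a raw ecrecover over a hash of the encoded array. This is not necessarily how the Flow contracts implement it; the library name and hashing scheme here are assumptions for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.17;

/// Illustrative signed context check: the caller receives a data array,
/// a signer and a signature, verifies the signature against a hash of
/// the array, and only then exposes the array as a context column for
/// the main expression. Hashing/signing scheme is an assumption.
library LibSignedContext {
    function verify(
        address signer,
        uint256[] memory signedContext,
        bytes memory signature
    ) internal pure returns (bool) {
        require(signature.length == 65, "BAD_SIG_LENGTH");
        bytes32 r;
        bytes32 s;
        uint8 v;
        assembly {
            r := mload(add(signature, 0x20))
            s := mload(add(signature, 0x40))
            v := byte(0, mload(add(signature, 0x60)))
        }
        // Standard "eth signed message" prefix over the hashed array.
        bytes32 digest = keccak256(
            abi.encodePacked(
                "\x19Ethereum Signed Message:\n32",
                keccak256(abi.encodePacked(signedContext))
            )
        );
        return ecrecover(digest, v, r, s) == signer;
    }
}
```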

The latest flow contracts leverage this pattern to allow end users to provide their own context as a proof to release some flow. The flow author specifies both the authentication of the signer and the flow logic; the flow contract handles the authorization of the context based on the (in)valid signature.

Note that signed contexts aren't proving provenance of the signed data, only that the signer agrees the data is authentic. It is trivial for some signed data to be re-signed by anyone else, unless the signed data specifies who is allowed to sign it and the verification logic confirms the signed data and signer match (which Flow does NOT do).

Hopefully this pattern proves to be a nice bridge between web2 and web3 without relying on heavy-handed oracle solutions. For example, this context signing can allow a single proof to be reused across many contracts on many chains in parallel, whereas oracles would need gas, transactions and coordination across all relevant contexts. Data signing is very familiar to web2 developers, such as in the production of JWT tokens, whereas web3 oracle management is an extremely niche and expensive skillset requiring dedicated infrastructure.

Future problem to solve: Dynamic storage reads

One problem that 2D contexts cannot help with is runtime dynamic slot resolution.

For example, take our earlier expression that reads the third ranked player in some tournament, but imagine the index is only known at runtime. Perhaps some external oracle/logic tells the expression which position is "last" so that, instead of erroring when there are only 2 participants, the expression knows to read index 1 or index 2 dynamically.

In such cases we cannot simply scan the use of context opcodes during the integrity check and preload values from storage.

The problem is even worse when we consider mappings. In this case there are 2^256 possible slots to read from, effectively infinite.

In these cases we are going to need to use one of the other two strategies listed earlier: either define/use an interface (e.g. an ERC token standard) or implement external opcodes that an expression can pass the storage location into via inputs, lazy-loading storage outputs during expression execution.
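A hypothetical shape for that second option, purely for illustration (none of these names exist in the current codebase): the expression pushes a key, such as a counterparty address or a dynamic index, as an opcode input, and the caller resolves it against its own storage at runtime.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.17;

/// Hypothetical interface for lazy, runtime-keyed storage reads.
/// Because the key is only known at expression runtime (e.g. a mapping
/// key), the value cannot be preloaded into context during the
/// integrity check; it has to be fetched on demand.
interface IExternalStorageRead {
    function externalRead(uint256 key) external view returns (uint256 value);
}

/// Hypothetical OrderBook-like caller resolving a key against a mapping.
contract OrderBookLike is IExternalStorageRead {
    /// e.g. volume tracked per counterparty (2^256 possible slots).
    mapping(uint256 => uint256) internal volumePerCounterparty;

    function externalRead(uint256 key) external view returns (uint256 value) {
        value = volumePerCounterparty[key];
    }
}
```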

In the short term (until I implement external opcodes and the standalone interpreter) this is probably going to mean removing some generality from existing contracts on the develop branch. For example, the OrderBook volume tracking could only read the volume from the current counterparty, not some arbitrary other counterparty that is not even involved in the current trade.

I expect this loss of generality to be mostly or completely edge-case-y and NOT impact anyone in practice. If it does, those people will need to wait for future releases that introduce external opcodes to move to the latest contract versions.

In fact, the OrderBook example specifically is dubious anyway, as the storage tracking only covers the current counterparty, and I'm quite sure none of our current tooling checks external counterparties to see if those orders are tracking the data the current order is supposed to be reading... That's a rabbit hole worth its own blog post: having the expression author specify additional tracking, having cross-referencing expressions check each other's tracking, and displaying that cleanly in the self-audit tooling...

For now it might be safer just to remove such complexities and foot-guns and only revisit when we can really do it robustly.