A registry for safe transformation of media types

Context

REST APIs are great in that they can offer machine readability (in the form of structured data) and human readability (in the form of plain text or generic formats like HTML) at the same time, thanks to content negotiation.

When a server is not able to offer human readable formats, accessibility suffers. A registry of safe media type transformations could close that gap.

Proposal

Let’s have an online repository of media type translators. The repository can be browsed online, but also queried by debug tools whenever they encounter data in any given media type.

A media type translator is a program in a safely executable language that provides one single pure function that maps input in one media type to output of another media type. It should be annotated with several properties:

Input media type.

For example, this could be application/link-format.
Output media type.

For example, this could be text/html;charset=utf-8.
Language of the translator.

While this can be an arbitrary language in theory, users will only want to process languages that they can execute in a safe way.

A typical example would be WASM (with some convention on the signature of the main function to call); note that this doesn’t need any extensions such as WASI, as a translator describes a pure function. (Languages such as Python may lend themselves to this kind of task for simplicity, but it is doubtful whether typical users would be able to provide a sandboxed execution environment for them.)

Translators could also be described in more high-level languages, e.g. XSLT or CDDL. For some of these, the execution environment will need to choose how to represent the output if the language doesn’t describe a conversion per se, but rather an annotation: in the CDDL case, “executing” a CDDL over CBOR might produce annotated CBOR diagnostic notation (when converting to plain text), but may also produce an interactive diagnostic CBOR explorer that colorizes or otherwise visualizes the CDDL’s annotations.
Other administrative metadata, such as name, author, creation date and size of the program.

This can also contain quality indicators, and attributes that tell whether the conversion is complete (i.e., at least theoretically reversible) or whether it loses (summarizes) data.
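
The annotations listed above could be modelled as one record per translator. The following is a minimal sketch, not a proposed schema: every field name, the `quality` score and the ordering rule (best quality first, then smallest program) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslatorEntry:
    """One registry record; all field names are illustrative assumptions."""
    input_type: str         # e.g. "application/link-format"
    output_type: str        # e.g. "text/html;charset=utf-8"
    language: str           # e.g. "wasm", "xslt", "cddl"
    name: str
    author: str
    created: str            # creation date, e.g. "2024-01-01"
    size_bytes: int         # size of the translator program
    lossless: bool = False  # True if the conversion is at least theoretically reversible
    quality: float = 0.0    # hypothetical quality indicator, higher is better


def find_translators(registry, input_type, runnable_languages):
    """Return entries for input_type that the client can execute safely, best first."""
    matches = [
        entry for entry in registry
        if entry.input_type == input_type and entry.language in runnable_languages
    ]
    # Assumed policy: prefer higher quality, then smaller programs.
    return sorted(matches, key=lambda entry: (-entry.quality, entry.size_bytes))
```

A debug tool that can only sandbox WASM would call `find_translators(registry, media_type, {"wasm"})` and simply never see entries in languages it cannot run safely.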

Example use case

An IoT device may have some debug interface through which an authorized user can ask it to run a ping command. In a RESTful context, this is expressed through a resource to which an address is POSTed through CoAP, and a resource created as a response to that POST that contains the latest state of the ping until it expires or is deleted. (There may be pforms involved in setting up the POST, and while a language for them is still under heavy development, this mechanism might even translate those in the first place).

That resource representing a series of pings returns data in CBOR format, and annotates it as application/vnd.riot-os.pingstate+cbor. It looks roughly like this:

88D8368350FE8000000000020202FFFFFFFE030303F6182A1872187419FFFE03187818C9190960

Without this registry, the best a general command line tool for interacting with REST IoT servers can do is to recognize the +cbor structured syntax suffix, and display the data in diagnostic notation:

[ipv6'fe80::202:2ff:ffff:fe03:0303%42', 114, 116, 65534, 3, 120, 201, 2400]
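
To illustrate what such a generic tool does with nothing but the +cbor suffix, here is a minimal sketch of a CBOR decoder. It covers only the item types that occur in the example (unsigned integers, byte strings, arrays, tags and null); the names `Tag`, `decode_item` and `HEX` are made up for this sketch, and a real tool would use a complete CBOR library instead.

```python
from typing import Any, NamedTuple, Tuple


class Tag(NamedTuple):
    number: int
    value: Any


def decode_item(data: bytes, pos: int = 0) -> Tuple[Any, int]:
    """Decode one CBOR data item at data[pos]; return (value, next position)."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if major == 7:
        if info == 22:  # simple value "null"
            return None, pos
        raise ValueError("only null is supported from major type 7")
    # The additional information either is the argument or gives its length.
    if info < 24:
        argument = info
    elif info <= 27:
        length = 1 << (info - 24)  # 1, 2, 4 or 8 bytes
        argument = int.from_bytes(data[pos:pos + length], "big")
        pos += length
    else:
        raise ValueError("unsupported additional information")
    if major == 0:  # unsigned integer
        return argument, pos
    if major == 2:  # byte string
        return data[pos:pos + argument], pos + argument
    if major == 4:  # array
        items = []
        for _ in range(argument):
            item, pos = decode_item(data, pos)
            items.append(item)
        return items, pos
    if major == 6:  # tag
        inner, pos = decode_item(data, pos)
        return Tag(argument, inner), pos
    raise ValueError(f"major type {major} is not supported in this sketch")


# The example representation from above.
HEX = "88D8368350FE8000000000020202FFFFFFFE030303F6182A1872187419FFFE03187818C9190960"
value, end = decode_item(bytes.fromhex(HEX))
```

The result is an array of eight items whose first element is a tagged IP address and whose remaining elements are bare numbers. Nothing in the encoding says what those numbers mean, which is exactly the gap the registry is meant to fill.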

However, if there were suitable CDDL around, the tool could look up the provided media type in the registry, and the data could be shown in an annotated way:

[
    /destination/ ipv6'fe80::202:2ff:ffff:fe03:0303%42',
    /received/    114,
    /sent/        116,
    /recent/   0xfffe,
    /late/          3,
    /min/         120 /ms/,
    /avg/         201 /ms/,
    /max/        2400 /ms/
]

which is already a great improvement, and only required that someone filed the CDDL that is part of the definition of application/vnd.riot-os.pingstate+cbor into the database.

If, instead, they wrote a small script to convert to plain text, all of a sudden a generic IoT debug tool with no prior knowledge of the precise data format would be able to produce good output without any steps that’d be taxing on the IoT device:

Ping to fe80::202:2ff:ffff:fe03:0303%42

...............>

116 packets transmitted, 114 received, 2% packet loss
rtt min/avg/max = 120/201/2400 ms
3 packets received late (possibly duplicates)
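
Such a script could be very small. The sketch below assumes the CBOR has already been decoded into a Python list (decoding is left out), takes the field order from the annotated notation above, and assumes that the tagged address is an array of address bytes, an unused element and a zone identifier. The function name is made up. The progress line derived from the `recent` field is omitted because its encoding is not described here.

```python
import ipaddress


def ping_state_to_text(state) -> str:
    """Pure function: decoded ping state in, plain text out."""
    (address, _unused, zone), received, sent, _recent, late, rtt_min, rtt_avg, rtt_max = state
    destination = f"{ipaddress.IPv6Address(address)}%{zone}"
    # Rounded percentage of packets that were sent but not received.
    loss = round(100 * (sent - received) / sent) if sent else 0
    lines = [
        f"Ping to {destination}",
        "",
        f"{sent} packets transmitted, {received} received, {loss}% packet loss",
        f"rtt min/avg/max = {rtt_min}/{rtt_avg}/{rtt_max} ms",
        f"{late} packets received late (possibly duplicates)",
    ]
    return "\n".join(lines)
```

Because the function has no side effects and no inputs other than the representation, a client can run it in a sandbox without granting it any capabilities.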

If a similar generic client were implemented in a browser instead of a command line tool, it would prefer translators that produce HTML over those that produce plain text, and might render this in a more visually appealing way.
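
The preference described here can be expressed as an ordered list per client. This is a minimal sketch under assumptions: the helper names are made up, and media types are compared without their parameters (so `text/html;charset=utf-8` counts as `text/html`).

```python
def essence(media_type: str) -> str:
    """Strip parameters such as ';charset=utf-8' and normalize case."""
    return media_type.split(";", 1)[0].strip().lower()


def pick_by_preference(available_outputs, preferences):
    """Return the first offered output type matching the client's ordered preferences."""
    for wanted in preferences:
        for offered in available_outputs:
            if essence(offered) == essence(wanted):
                return offered
    return None  # no translator produces anything this client wants


# Output types of the translators found for one input media type.
offered = ["text/plain;charset=utf-8", "text/html;charset=utf-8"]
browser_choice = pick_by_preference(offered, ["text/html", "text/plain"])
cli_choice = pick_by_preference(offered, ["text/plain"])
```

Here the browser-based client ends up with the HTML translator and the command line tool with the plain text one, from the same set of registry entries.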

Extensions

This could and should be distributed.

Different servers can provide databases with different focus areas (say, IoT focused, image format focused) or different moderation policies.
Provide examples of input media types.

This would make it friendlier for human navigation, when there is no input available yet – just pick one example, and browse through the various visualizations.
For some languages, it may make sense to have a converter that can process different input and output media types.

This works all the same: the function is still pure, it just takes more inputs.
Visualization adapters could be provided in a similar format.

Instead of expecting the clients to provide runtimes for a wide range of languages, language environments could be provided in a similar way to the translators themselves. They would consume two inputs (the higher level translator and the input data).

An example may be a CDDL processor written in Rust and shipped as a WASM binary, or an implementation of pygments that processes syntax highlighting files that were uploaded to the registry. Even a full language runtime environment such as pyodide could be provided, allowing the safe execution of Python programs. (Such adapters would only be loaded if no high quality converters of smaller size are available for download).
Translators can be chained.

When a translator is available for converting some input to plain text with ANSI color escapes (for which a media type has yet to be registered), a web browser based tool may prefer not to use it. In the absence of a direct translator to HTML, it could still use that translator and chain it through a translator converting ANSI color escapes to HTML.
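
Finding such a chain is a shortest-path search over media types. The sketch below uses a breadth-first search; the translator names are invented, and `text/x.ansi-escapes` is a placeholder for the ANSI media type that does not exist yet.

```python
from collections import deque

# Placeholder only: no such media type is registered.
ANSI = "text/x.ansi-escapes"


def find_chain(translators, source, targets):
    """Breadth-first search for the shortest chain of translators.

    translators: iterable of (input_type, output_type, name)
    Returns a list of translator names, [] if source is already acceptable,
    or None if no chain leads to any of the target types.
    """
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        current, path = queue.popleft()
        if current in targets:
            return path
        for input_type, output_type, name in translators:
            if input_type == current and output_type not in seen:
                seen.add(output_type)
                queue.append((output_type, path + [name]))
    return None


TRANSLATORS = [
    ("application/vnd.riot-os.pingstate+cbor", ANSI, "ping-to-ansi"),
    (ANSI, "text/html", "ansi-to-html"),
]
chain = find_chain(TRANSLATORS, "application/vnd.riot-os.pingstate+cbor", {"text/html"})
```

Breadth-first search returns the chain with the fewest steps. A real client might instead weigh the quality and size annotations of each step.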
Translators might produce followable links.

If the execution environment provides the translator with additional context, a translator might use that for more adequate output, or may even produce output that the execution environment can process further.

For example, if a media type uses URI references to indicate further actions (e.g. “POST your data to ./foo”), and the execution environment provides metadata (“The base URI of this representation is https://example.com/hello/, and the user could start POSTs by running curl ${URL} --data ${DATA}”), then a representation could be rendered as
```
The input data has been processed but needs additional sample material.

To provide further data, you can run
$ curl https://example.com/hello/foo --data file-containing-your-data
```
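
How the command line in that output comes about can be sketched as follows, assuming the execution environment passes the base URI and the command template to the translator as plain strings. The function and parameter names are made up for illustration.

```python
from string import Template
from urllib.parse import urljoin


def render_post_hint(base_uri: str, reference: str, command_template: str, data_placeholder: str) -> str:
    """Resolve a URI reference and fill it into the environment's command template."""
    target = urljoin(base_uri, reference)  # "./foo" is resolved against the base URI
    return Template(command_template).substitute(URL=target, DATA=data_placeholder)


hint = render_post_hint(
    "https://example.com/hello/",
    "./foo",
    "curl ${URL} --data ${DATA}",
    "file-containing-your-data",
)
```

The translator itself stays pure: the base URI and the template are just additional inputs, and a browser-based environment could pass a different template that yields a form or a link instead of a curl command.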
Access to the registry can reveal the current actions of the user.

It should be recommended to download the full registry, or to access it in privacy preserving shards (as is done with databases of compromised passwords).
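
Sharded access could work like the prefix queries used by compromised-password databases: the client reveals only a short hash prefix of the media type, downloads every entry in that shard, and filters locally. This is a minimal sketch; the use of SHA-256, the two-character prefix and all function names are assumptions.

```python
import hashlib
from collections import defaultdict


def shard_key(media_type: str, prefix_len: int = 2) -> str:
    """Short hash prefix; this is all the registry would learn about a query."""
    return hashlib.sha256(media_type.encode("utf-8")).hexdigest()[:prefix_len]


def build_shards(entries, prefix_len: int = 2):
    """Registry side: group (input_type, translator_name) pairs by hash prefix."""
    shards = defaultdict(list)
    for input_type, name in entries:
        shards[shard_key(input_type, prefix_len)].append((input_type, name))
    return dict(shards)


def lookup(shards, media_type: str, prefix_len: int = 2):
    """Client side: fetch the whole shard by prefix, then filter locally."""
    shard = shards.get(shard_key(media_type, prefix_len), [])
    return [name for input_type, name in shard if input_type == media_type]


SHARDS = build_shards([
    ("application/vnd.riot-os.pingstate+cbor", "ping-to-text"),
    ("application/link-format", "links-to-html"),
])
```

A shorter prefix puts more media types into each shard, which hides the query better at the cost of larger downloads.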
Going half the way without media types.

For highly bespoke applications, it may not always make sense to register a media type.

In those cases, at the cost of sending an extra link during discovery, the server could indicate a registry of suitable translators for a given resource directly.

Idea incubator summary

I think this could be set up in a weekend. I don’t have a weekend right now.

This page is part of chrysn's public personal idea incubator; go up for its other entries, or read about the idea of having an idea incubator for more information on what this is.