Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semantic Interpretation Revisited #420

Open
mikeday opened this issue Nov 27, 2022 · 0 comments
Open

Semantic Interpretation Revisited #420

mikeday opened this issue Nov 27, 2022 · 0 comments

Comments

@mikeday
Copy link
Contributor

mikeday commented Nov 27, 2022

In issue #166 I brought up the semantic interpretation of binary formats, but should have started with some simple examples such as the many ways in which you can interpret formats that parse a single byte:

format: U8
meaning: 8-bit unsigned integer
type: {n:Int | 0 ≤ n < 256} (or possibly {n:Nat | n < 256})

format: S8
meaning: 8-bit signed integer
type: {n:Int | -128 ≤ n < 128 }

format: if flags.signed then S8 else U8
meaning: an 8-bit integer that is either signed or unsigned depending on a flag bit seen earlier
type: if flags.signed then {n:Int | -128 ≤ n < 128 } else {n:Int | 0 ≤ n < 256}

format: Byte (or Bits8 or BitVector 8 or whatever)
meaning: a vector of eight bits
type: Array 8 Bool

format: ZeroPaddingByte (just a handy typedef)
meaning: eight bits, all zero
type: Unit (conveys no information)

format: ASCIIChar
meaning: an 7-bit ASCII character (high bit must be zero)
type: ASCIIChar (Unicode subtyping questions?)

format: UTF8CodeUnit
meaning: an 8-bit UTF-8 code unit
type: ??? (string questions?)

And so on, clearly there are many formats that match a single byte and they can have many different semantic interpretations and hence types as a result! The type of values parsed by these formats then affects how they can be used, for example whether they can be array lengths or indices.

However if we consider more complex examples, like a partial map or function represented as an array of pairs:

format Map {
    len: U8,
    pairs: Array len Pair
}

format Pair {
    fst: U8,
    snd: U8
}

The interpretation of these formats is records, integers, and arrays, and treating them as a map would require another layer on top of that, like making them an instance of a type class interface with methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant