Map Type Metadata Representation


Hello All,

I would like to start moving forward with Map type support and begin
working on implementations. I believe we just need to define the specifics
of the metadata representation before getting started. Previously, there
was a thread [1] that discussed adding Map as a logical type and I'll try
to summarize where we are currently.

Map has been added as a logical type and defined in the Flatbuffer schema
format with one field, "keysSorted", which indicates whether the child keys
vector has been presorted. A Map is a nested type that is represented as
List<entry: Struct<key: K, value: V>>.

I think these are the two main points of the metadata that need to be agreed
upon:

- A Map has the same memory layout as List<entry: Struct<key: K, value: V>>,
so that implementations lacking Map can alias it as repeated struct values.

- The `Struct` and `K` fields are constrained to be non-nullable; the other
fields can be nullable.
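
To make the aliasing point concrete, here is a rough sketch in plain Python.
It is not taken from any implementation: the buffer contents and the helper
names `as_list_of_structs` and `as_maps` are made up for illustration, and
validity bitmaps are left out. It only shows that the same offsets and child
vectors can be read either as a list of structs or as maps.

```python
# Hypothetical, simplified buffers for a Map<string, int> column with two rows:
# {"a": 1, "b": 2} and {"c": null}. Validity bitmaps are omitted for brevity.
offsets = [0, 2, 3]      # row i covers entries offsets[i] .. offsets[i + 1] - 1
keys = ["a", "b", "c"]   # child "key" vector, never null
values = [1, 2, None]    # child "value" vector, may contain nulls


def as_list_of_structs(offsets, keys, values):
    """Read the buffers as List<Struct<key, value>>, ignoring the Map type."""
    rows = []
    for start, end in zip(offsets, offsets[1:]):
        rows.append([{"key": keys[i], "value": values[i]}
                     for i in range(start, end)])
    return rows


def as_maps(offsets, keys, values):
    """Read the same buffers as maps; only the interpretation changes."""
    return [{entry["key"]: entry["value"] for entry in row}
            for row in as_list_of_structs(offsets, keys, values)]


if __name__ == "__main__":
    print(as_list_of_structs(offsets, keys, values))
    print(as_maps(offsets, keys, values))
```

An implementation without Map support would only ever need the first reader;
the second one adds no new buffers.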


Here is a sample JSON metadata representation:

{
"name" : "MapName",
"nullable" : true|false,
"type" : {
    "name" : "map",
    "keysSorted" : true|false
},
"children" : [{
    "name" : "entry",
    "nullable" : false,
    "type" : {
        "name" : "struct"
    },
    "children" : [{
        "name" : "key",
        "nullable" : false,
        "type" : {
            "name" : K
        },
        "children" : []
    },{
        "name" : "value",
        "nullable" : true|false,
        "type" : {
            "name" : V
        },
        "children" : []
    }]
}]
}
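
For anyone who wants to experiment with the representation above, here is a
small Python sketch that builds the same structure for concrete key and value
types and checks the two nullability constraints. The function names
`map_field` and `check_map_field`, and the example type dicts passed to them,
are my own assumptions for illustration and are not part of the proposal.

```python
import json


def map_field(name, key_type, value_type, nullable=True,
              keys_sorted=False, value_nullable=True):
    """Build the proposed JSON metadata for a Map field as a Python dict."""
    return {
        "name": name,
        "nullable": nullable,
        "type": {"name": "map", "keysSorted": keys_sorted},
        "children": [{
            "name": "entry",
            "nullable": False,          # the entry struct is never null
            "type": {"name": "struct"},
            "children": [
                {"name": "key", "nullable": False,   # keys are never null
                 "type": key_type, "children": []},
                {"name": "value", "nullable": value_nullable,
                 "type": value_type, "children": []},
            ],
        }],
    }


def check_map_field(field):
    """Raise ValueError if the field breaks the proposed Map constraints."""
    if field["type"]["name"] != "map":
        raise ValueError("not a map field")
    children = field["children"]
    if len(children) != 1 or children[0]["type"]["name"] != "struct":
        raise ValueError("map must have exactly one struct child")
    entry = children[0]
    if entry["nullable"]:
        raise ValueError("entry struct must be non-nullable")
    if len(entry["children"]) != 2:
        raise ValueError("entry struct must have a key and a value child")
    key = entry["children"][0]
    if key["nullable"]:
        raise ValueError("key must be non-nullable")


if __name__ == "__main__":
    # The type dicts here are placeholders standing in for K and V.
    field = map_field("MapName", {"name": "utf8"},
                      {"name": "int", "bitWidth": 32, "isSigned": True})
    check_map_field(field)
    print(json.dumps(field, indent=4))
```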


Any concerns or objections to the above? Hopefully that covers what needs
to be discussed; please correct me if I missed something. Thanks!

Bryan


[1]:
https://lists.apache.org/thread.html/d61f21924159718fb31d27f5c85d58d393a88708f76dff510c8da322@%3Cdev.arrow.apache.org%3E