Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table


That's a reasonable alternative.
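
To make the suggestion quoted below concrete, here is a minimal sketch in Python (not Calcite code). It assumes the adapter already knows the set of index names and the type names per index; the function name resolve_table and its parameters are hypothetical, chosen only for illustration. With a separator the name is split once; without one it falls back to the "simple mode" of looking for indexes first, then types.

```python
from typing import Dict, Optional, Set, Tuple


def resolve_table(name: str,
                  indexes: Set[str],
                  types_by_index: Dict[str, Set[str]],
                  separator: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Map a SQL table name to an (index, type) pair; type is None for a bare index."""
    if separator is not None:
        # Separator mode: the user promises the separator never occurs in
        # index or type names, so splitting at its first occurrence is safe.
        index, sep, type_ = name.partition(separator)
        if not sep and name in indexes:
            return name, None
        if sep and type_ in types_by_index.get(index, set()):
            return index, type_
        raise KeyError(name)

    # Simple mode: look for an index first, then for a type in any index.
    # This only works if index and type names are distinct.
    if name in indexes:
        return name, None
    for index in sorted(indexes):
        if name in types_by_index.get(index, set()):
            return index, name
    raise KeyError(name)


indexes = {"x_y", "x"}
types_by_index = {"x_y": {"z"}, "x": {"y_z"}}
print(resolve_table("x_y$z", indexes, types_by_index, separator="$"))
print(resolve_table("x$y_z", indexes, types_by_index, separator="$"))
```

With '$' as separator both tables from the "x_y" / "y_z" example stay distinguishable, which is the point of letting the user pick a character that does not occur in their names.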

On Fri, Jun 29, 2018 at 7:57 PM Julian Hyde <jhyde@xxxxxxxxxx> wrote:

> Maybe there could be a separator char as one of the adapter’s parameters.
> People should choose a value, say ‘$’ or ‘#’, that is legal in an unquoted
> SQL identifier but does not occur in any of their index or type names.
>
> If not specified, the adapter would end up in a simple mode, say looking
> for indexes first, then looking for types, and people would need to make
> sure indexes and types have distinct names. After the transition to
> single-type indexes, people could stop using the parameter.
>
> Julian
>
>
> > On Jun 29, 2018, at 4:43 PM, Andrei Sereda <andrei@xxxxxxxxx> wrote:
> >
> > That's a valid point. Then user would define a different pattern like
> > "i$index_t$type" for his cluster.
> >
> > I think we should first decide whether such scenarios should be
> > supported by calcite (given that they're already deprecated by the
> > vendor). If yes, what should the collision strategy be? A user-defined
> > pattern like above, a failure, or an auto-generated name?
> >
> > On Fri, Jun 29, 2018, 19:14 Julian Hyde <jhyde@xxxxxxxxxx> wrote:
> >
> >>> In elastic the (index/type) pair is guaranteed to be unique, therefore
> >>> "${index}_${type}" will also be unique (as a string). This is only
> >>> necessary when we have several types per index. A valid question is
> >>> whether the user should be allowed such flexibility.
> >>
> >> Uniqueness is not my concern.
> >>
> >> Suppose there is an index called "x_y" with a type called "z", and
> >> another index called "x" with a type called "y_z". If I write "x_y_z"
> >> it's not clear how it should be broken into index/type.
> >>
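The ambiguity described above can be reproduced with a small sketch (Python, purely illustrative; candidate_splits is a hypothetical helper, not part of any adapter). It tries every position of the separator in the table name and keeps the splits that correspond to an existing (index, type) pair.

```python
from typing import List, Set, Tuple


def candidate_splits(name: str,
                     pairs: Set[Tuple[str, str]],
                     separator: str = "_") -> List[Tuple[str, str]]:
    """Return every (index, type) pair that a table name could be split into."""
    matches = []
    pos = name.find(separator)
    while pos != -1:
        candidate = (name[:pos], name[pos + len(separator):])
        if candidate in pairs:
            matches.append(candidate)
        pos = name.find(separator, pos + 1)
    return matches


# The two (index, type) pairs from the example above.
pairs = {("x_y", "z"), ("x", "y_z")}
print(candidate_splits("x_y_z", pairs))
```

Both pairs come back for "x_y_z", so the name alone does not say which index/type is meant.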
> >>
> >> On Fri, Jun 29, 2018 at 3:15 PM, Andrei Sereda <andrei@xxxxxxxxx>
> wrote:
> >>>> Can you show how those examples affect SQL against the ES adapter
> and/or
> >>> how they affect JSON models?
> >>>
> >>> The discussion is how to properly bridge the (index/type) concept from
> >>> ES into the relational world. The proposal to use placeholders
> >>> ($index / $type) affects only how the table is named in calcite. They're
> >>> not used as SQL literals, i.e. it affects only the configuration phase
> >>> of the schema. Pretty much we're doing a string replace to derive the
> >>> table name from ($index/$type).
> >>>
> >>>> You seem to be using '_' as a separator character. Are we sure that
> >>>> people will never use it in index or type name? Separator characters
> >>>> often cause problems.
> >>> In elastic the (index/type) pair is guaranteed to be unique, therefore
> >>> "${index}_${type}" will also be unique (as a string). This is only
> >>> necessary when we have several types per index. A valid question is
> >>> whether the user should be allowed such flexibility.
> >>>
> >>>
> >>>
> >>> On Fri, Jun 29, 2018 at 2:19 PM Julian Hyde <jhyde@xxxxxxxxxx> wrote:
> >>>
> >>>> Andrei,
> >>>>
> >>>> I'm not an ES user so I don't fully understand this issue, but my two
> >>>> cents anyway...
> >>>>
> >>>> Can you show how those examples affect SQL against the ES adapter
> >>>> and/or how they affect JSON models?
> >>>>
> >>>> You seem to be using '_' as a separator character. Are we sure that
> >>>> people will never use it in index or type name? Separator characters
> >>>> often cause problems.
> >>>>
> >>>> Julian
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Jun 29, 2018 at 10:58 AM, Andrei Sereda <andrei@xxxxxxxxx>
> >> wrote:
> >>>>> I agree there should be a configuration option. How about the
> >>>>> following approach?
> >>>>>
> >>>>> Expose both variables ${index} and ${type} in the configuration (JSON),
> >>>>> and the user will use them to generate the table name in the calcite
> >>>>> schema.
> >>>>>
> >>>>> Example
> >>>>> "table_name": "${type}" // current
> >>>>> "table_name": "${index}" // new (default?)
> >>>>> "table_name": "${index}_${type}" // most generic; supports multiple types per index
> >>>>>
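As a rough illustration of the "table_name" patterns above, the substitution could look like the following Python sketch. It uses the standard library's string.Template, which happens to share the ${name} placeholder syntax; the helper table_name and the sample index/type values are assumptions made up for the example, not adapter code.

```python
from string import Template


def table_name(pattern: str, index: str, type_: str) -> str:
    """Expand ${index} and ${type} placeholders into a table name."""
    return Template(pattern).substitute({"index": index, "type": type_})


# Hypothetical index "sales" with a type "order".
for pattern in ("${type}", "${index}", "${index}_${type}"):
    print(pattern, "->", table_name(pattern, "sales", "order"))
```

The three patterns yield "order", "sales" and "sales_order" respectively, matching the current, new and most generic naming options.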
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mmior@xxxxxxxxxx>
> >> wrote:
> >>>>>
> >>>>>> I think it sounds like you and Andrei are in a good position to
> >> tackle
> >>>> this
> >>>>>> one so I'm happy to have you both work on whatever solution you
> >> think is
> >>>>>> best.
> >>>>>>
> >>>>>> --
> >>>>>> Michael Mior
> >>>>>> mmior@xxxxxxxxxx
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Jun 29, 2018 at 04:19, Christian Beikov
> >>>>>> <christian.beikov@xxxxxxxxx> wrote:
> >>>>>>
> >>>>>>> IMO the best solution would be to make it configurable by
> >> introducing
> >>>> a
> >>>>>>> "table_mapping" config with values
> >>>>>>>
> >>>>>>>  * type - every type in the known indices is mapped as table
> >>>>>>>  * index - every known index is mapped as table
> >>>>>>>
> >>>>>>> We'd probably also need a "type_field" configuration for defining
> >>>> which
> >>>>>>> field to use for the type determination as one of the possible
> >> future
> >>>>>>> ways to do things is to introduce a custom field:
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
> >>>>>>>
> >>>>>>> We already detect the ES version, so we can set a smart default for
> >>>> this
> >>>>>>> setting. Let's make the index config param optional.
> >>>>>>>
> >>>>>>>  * When no index is given, we discover indexes, the default for
> >>>>>>>    "table_mapping" then is "index"
> >>>>>>>  * When index is given, then we only discover types according to the
> >>>>>>>    "type_field" configuration and the default for "table_mapping" is
> >>>>>>>    "type"
> >>>>>>>
> >>>>>>> This would also allow discovering indexes while still using "type" as
> >>>>>>> "table_mapping".
> >>>>>>>
> >>>>>>> What do you think?
> >>>>>>>
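The defaulting rule proposed above could be sketched like this (Python, illustrative only). The config keys "index" and "table_mapping" are the ones named in the proposal; the function effective_table_mapping is a hypothetical name, and the ES-version-based default mentioned above is left out because its details are not specified.

```python
from typing import Any, Dict


def effective_table_mapping(config: Dict[str, Any]) -> str:
    """Pick "index" or "type" mapping: explicit setting first, else a default."""
    explicit = config.get("table_mapping")
    if explicit is not None:
        if explicit not in ("index", "type"):
            raise ValueError(f"unknown table_mapping: {explicit!r}")
        return explicit
    # No explicit setting: a configured index means we discover its types,
    # otherwise we discover indexes and expose each one as a table.
    return "type" if config.get("index") else "index"


print(effective_table_mapping({}))
print(effective_table_mapping({"index": "sales"}))
print(effective_table_mapping({"table_mapping": "type"}))
```

The last call covers the case of discovering indexes while still mapping types as tables.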
> >>>>>>> Kind regards,
> >>>>>>>
> >>>>
> ------------------------------------------------------------------------
> >>>>>>> *Christian Beikov*
> >>>>>>> On 29.06.2018 at 02:41, Andrei Sereda wrote:
> >>>>>>>> Yes. There is an API to list all indexes / types in elastic. They
> >>>> can
> >>>>>> be
> >>>>>>>> automatically imported into a schema.
> >>>>>>>>
> >>>>>>>> What needs to be agreed upon is how to expose those elements in
> >>>> calcite
> >>>>>>>> schema (naming / behaviour).
> >>>>>>>>
> >>>>>>>> 1) Many (most?) setups have a single type per index. The natural way
> >>>>>>>> to name tables would be "elastic.$index" (elastic being the schema
> >>>>>>>> name). Multiple indexes would be under the same schema:
> >>>>>>>> "elastic.index1", "elastic.index2", etc.
> >>>>>>>>
> >>>>>>>> 2) What if an index has several types? Should they be exported as
> >>>>>>>> calcite tables "elastic.$index_type1" "elastic.$index_type2"? Or
> >>>>>>>> (current behaviour) as "elastic.type1" and "elastic.type2"? Or as a
> >>>>>>>> subschema "elastic.$index.type1"?
> >>>>>>>>
> >>>>>>>> Now what if one has combination of (1) and (2) ?
> >>>>>>>> Setup (2) is already deprecated (and will be unsupported in next
> >>>>>> version)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov <
> >>>>>>> christian.beikov@xxxxxxxxx>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Is there an API to discover indexes? If there is, I'd suggest we
> >>>>>>>>> allow a config option that makes the adapter discover the possible
> >>>>>>>>> indexes.
> >>>>>>>>> We'd still have to adapt the code a bit, but internally, the
> >> schema
> >>>>>>>>> could just keep a cache of type name to index name map and be
> >> able
> >>>> to
> >>>>>>>>> support both scenarios.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Kind regards,
> >>>>>>>>>
> >>>>>>
> >> ------------------------------------------------------------------------
> >>>>>>>>> *Christian Beikov*
> >>>>>>>>> On 29.06.2018 at 00:12, Andrei Sereda wrote:
> >>>>>>>>>>> 1) What's the time horizon for the current adapter no longer
> >>>> working
> >>>>>>>>> with these
> >>>>>>>>>> changes to ES ?
> >>>>>>>>>> Current adapter will be working for a while with existing
> >> setup.
> >>>> The
> >>>>>>>>>> problem is nomenclature and ease of use.
> >>>>>>>>>>
> >>>>>>>>>> Their new SQL concepts mapping
> >>>>>>>>>> <
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
> >>>>>>>>>> drops
> >>>>>>>>>> the notion of ES type (which before was equivalent of RDBMS
> >> table)
> >>>>>> and
> >>>>>>>>> uses
> >>>>>>>>>> ES index as new table equivalent (before ES index was equal to
> >>>>>>> database).
> >>>>>>>>>> Most users use elastic this way (one type , one index) index ==
> >>>>>> table.
> >>>>>>>>>>
> >>>>>>>>>> Currently calcite requires schema per index. In RDBMS parlance
> >>>>>> database
> >>>>>>>>> per
> >>>>>>>>>> table (I'd like to change that).
> >>>>>>>>>>
> >>>>>>>>>>> 2) Any guess how complicated it would be to maintain code
> >> paths
> >>>> for
> >>>>>>> both
> >>>>>>>>>>> behaviours? I know this is probably really challenging to
> >>>> estimate,
> >>>>>>> but
> >>>>>>>>> I
> >>>>>>>>>>> really have no idea of the scope of these changes. Would it
> >> mean
> >>>> two
> >>>>>>>>>>> different ES adapters?
> >>>>>>>>>> One can just have separate calcite schema implementations (same
> >>>>>>>>>> adapter / module):
> >>>>>>>>>> 1)  LegacySchema (old). Schema can have only one index (but
> >>>> multiple
> >>>>>>>>>> types). Type == table in this case.
> >>>>>>>>>> 2)  NewSchema (new). Single schema can have multiple indexes
> >>>> (type is
> >>>>>>>>>> dropped). Index == table in this case
> >>>>>>>>>>
> >>>>>>>>>>> 3) Do we really need compatibility with the current version of
> >>>> the
> >>>>>>>>>> adapter?
> >>>>>>>>>>> IMO this depends on what versions of ES we would lose support
> >> for
> >>>>>> and
> >>>>>>>>> how
> >>>>>>>>>>> complex it would be for users of the current ES adapter to
> >> make
> >>>>>>> updates
> >>>>>>>>>> for
> >>>>>>>>>>> any Calcite API changes.
> >>>>>>>>>> The issue is not in adapter but how calcite schema exposes
> >> tables.
> >>>>>>>>> Should
> >>>>>>>>>> it expose index as individual table (new), or ES type (old) ?
> >>>>>>>>>>
> >>>>>>>>>> Andrei.
> >>>>>>>>>>
> >>>>>>>>>> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mmior@xxxxxxxxxx
> >>>
> >>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Unfortunately I know very little about ES so I'm not in a great
> >>>>>>>>>>> position to assess the impact of these changes. I will say that
> >>>>>>>>>>> legacy compatibility is great, but maintaining two sets of logic
> >>>>>>>>>>> is always a challenge. A few follow-up questions:
> >>>>>>>>>>>
> >>>>>>>>>>> 1) What's the time horizon for the current adapter no longer
> >>>> working
> >>>>>>>>> with
> >>>>>>>>>>> these changes to ES?
> >>>>>>>>>>>
> >>>>>>>>>>> 2) Any guess how complicated it would be to maintain code
> >> paths
> >>>> for
> >>>>>>> both
> >>>>>>>>>>> behaviours? I know this is probably really challenging to
> >>>> estimate,
> >>>>>>> but
> >>>>>>>>> I
> >>>>>>>>>>> really have no idea of the scope of these changes. Would it
> >> mean
> >>>> two
> >>>>>>>>>>> different ES adapters?
> >>>>>>>>>>>
> >>>>>>>>>>> 3) Do we really need compatibility with the current version of
> >>>> the
> >>>>>>>>> adapter?
> >>>>>>>>>>> IMO this depends on what versions of ES we would lose support
> >> for
> >>>>>> and
> >>>>>>>>> how
> >>>>>>>>>>> complex it would be for users of the current ES adapter to
> >> make
> >>>>>>> updates
> >>>>>>>>> for
> >>>>>>>>>>> any Calcite API changes.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for your continued work on the ES adapter Andrei!
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Michael Mior
> >>>>>>>>>>> mmior@xxxxxxxxxx
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, Jun 28, 2018 at 12:57, Andrei Sereda <andrei@xxxxxxxxx> wrote:
> >>>>>>>>>>>> Hello,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Elastic announced
> >>>>>>>>>>>> <
> >>>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html
> >>>>>>>>>>>> that they will be deprecating mapping types in ES6 and
> >> indexes
> >>>> will
> >>>>>>> be
> >>>>>>>>>>>> single-typed only.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Historical analogy <
> >> https://www.elastic.co/blog/index-vs-type>
> >>>>>>> between
> >>>>>>>>>>>> RDBMS and elastic was that index is equivalent to a database
> >> and
> >>>>>> type
> >>>>>>>>>>>> corresponds to a table in that database. In a couple of releases
> >>>>>>>>>>>> (ES6-8) this will no longer be true.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Recent SQL addition
> >>>>>>>>>>>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released>
> >> to
> >>>>>>> elastic
> >>>>>>>>>>>> confirms
> >>>>>>>>>>>> this trend
> >>>>>>>>>>>> <
> >>>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html
> >>>>>>>>>>>>> .
> >>>>>>>>>>>> Index is equivalent to a table and there are no more ES
> >> types.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I would like to propose to include this logic in Calcite ES
> >>>>>> adapter.
> >>>>>>>>> IE,
> >>>>>>>>>>>> expose each ES single-typed index as a separate table inside
> >>>>>> calcite
> >>>>>>>>>>>> schema. This is in contrast to  current integration where
> >> schema
> >>>>>> can
> >>>>>>>>> only
> >>>>>>>>>>>> have a single index. Current approach forces you to create
> >>>> multiple
> >>>>>>>>>>> schemas
> >>>>>>>>>>>> to query single-typed indexes (on the same ES cluster).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Legacy compatibility can always be controlled with
> >> configuration
> >>>>>>>>>>>> parameters.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Do you agree with such changes ? If yes, would you consider a
> >>>> PR ?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Andrei.
> >>>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
>
>