Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table


That's a valid point. In that case the user would define a different pattern,
such as "i$index_t$type", for their cluster.

I think we should first answer whether such scenarios should be supported
by Calcite at all (given that the vendor has already deprecated them). If so,
what should the collision strategy be: a user-defined pattern like the one
above, a failure, or an auto-generated name?
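
To make the options concrete, here is a minimal Python sketch (not Calcite code) of deriving table names from a pattern and handling collisions. The function names table_name, find_collisions and resolve, and the strategy labels "fail" and "auto", are hypothetical and used only for illustration. With Python's string.Template the placeholders have to be written with braces, e.g. "i${index}_t${type}", because "$index_t" would be read as a single identifier.

```python
from collections import defaultdict
from string import Template


def table_name(pattern, index, type_):
    """Render a table name from a pattern with ${index} / ${type} placeholders."""
    return Template(pattern).substitute(index=index, type=type_)


def find_collisions(pattern, pairs):
    """Return {table_name: [(index, type), ...]} for names produced more than once."""
    by_name = defaultdict(list)
    for index, type_ in pairs:
        by_name[table_name(pattern, index, type_)].append((index, type_))
    return {name: sources for name, sources in by_name.items() if len(sources) > 1}


def resolve(pattern, pairs, strategy="fail"):
    """Map table names to (index, type) pairs.

    strategy="fail": raise on the first name collision.
    strategy="auto": append a numeric suffix until the name is free.
    Both strategy names are assumptions made for this sketch.
    """
    tables = {}
    for index, type_ in pairs:
        name = table_name(pattern, index, type_)
        if name in tables:
            if strategy == "fail":
                raise ValueError(
                    f"{name!r} maps to both {tables[name]} and {(index, type_)}"
                )
            n = 2
            while f"{name}_{n}" in tables:
                n += 1
            name = f"{name}_{n}"
        tables[name] = (index, type_)
    return tables


# The two (index, type) pairs from Julian's example.
pairs = [("x_y", "z"), ("x", "y_z")]
print(find_collisions("${index}_${type}", pairs))
print(find_collisions("i${index}_t${type}", pairs))
print(resolve("${index}_${type}", pairs, strategy="auto"))
```

With those two pairs, "${index}_${type}" produces "x_y_z" twice, while "i${index}_t${type}" keeps them apart for this particular input (it is not a general guarantee).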

On Fri, Jun 29, 2018, 19:14 Julian Hyde <jhyde@xxxxxxxxxx> wrote:

> > In Elastic, the (index/type) pair is guaranteed to be unique, therefore
> > "${index}_${type}" will also be unique (as a string). This is only necessary
> > when we have several types per index. A valid question is whether the user
> > should be allowed such flexibility.
>
> Uniqueness is not my concern.
>
> Suppose there is an index called "x_y" with a type called "z", and
> another index called "x" with a type called "y_z". If I write "x_y_z"
> it's not clear how it should be broken into index/type.
>
>
> On Fri, Jun 29, 2018 at 3:15 PM, Andrei Sereda <andrei@xxxxxxxxx> wrote:
> >> Can you show how those examples affect SQL against the ES adapter and/or
> >> how they affect JSON models?
> >
> > The discussion is about how to properly bridge the (index/type) concept from
> > ES into the relational world. The proposal to use placeholders ($index / $type)
> > affects only how a table is named in Calcite. They're not used as SQL literals,
> > i.e. it affects only the configuration phase of the schema.
> > Pretty much, we're doing a string replace to derive the table name from
> > ($index/$type).
> >
> >> You seem to be using '_' as a separator character. Are we sure that
> >> people will never use it in index or type name? Separator characters
> >> often cause problems.
> > In Elastic, the (index/type) pair is guaranteed to be unique, therefore
> > "${index}_${type}" will also be unique (as a string). This is only necessary
> > when we have several types per index. A valid question is whether the user
> > should be allowed such flexibility.
> >
> >
> >
> > On Fri, Jun 29, 2018 at 2:19 PM Julian Hyde <jhyde@xxxxxxxxxx> wrote:
> >
> >> Andrei,
> >>
> >> I'm not an ES user so I don't fully understand this issue, but my two
> >> cents anyway...
> >>
> >> Can you show how those examples affect SQL against the ES adapter
> >> and/or how they affect JSON models?
> >>
> >> You seem to be using '_' as a separator character. Are we sure that
> >> people will never use it in index or type name? Separator characters
> >> often cause problems.
> >>
> >> Julian
> >>
> >>
> >>
> >>
> >> On Fri, Jun 29, 2018 at 10:58 AM, Andrei Sereda <andrei@xxxxxxxxx> wrote:
> >> > I agree there should be a configuration option. How about the following
> >> > approach.
> >> >
> >> > Expose both variables ${index} and ${type} in configuration (JSON) and the
> >> > user will use them to generate the table name in the Calcite schema.
> >> >
> >> > Example
> >> > "table_name": "${type}" // current
> >> > "table_name": "${index}" // new (default?)
> >> > "table_name": "${index}_${type}" // most generic. supports multiple types
> >> > per index
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mmior@xxxxxxxxxx> wrote:
> >> >
> >> >> I think it sounds like you and Andrei are in a good position to tackle
> >> >> this one so I'm happy to have you both work on whatever solution you
> >> >> think is best.
> >> >>
> >> >> --
> >> >> Michael Mior
> >> >> mmior@xxxxxxxxxx
> >> >>
> >> >>
> >> >>
> >> >> On Fri, 29 June 2018 at 04:19, Christian Beikov <christian.beikov@xxxxxxxxx> wrote:
> >> >>
> >> >> > IMO the best solution would be to make it configurable by introducing a
> >> >> > "table_mapping" config with values
> >> >> >
> >> >> >   * type - every type in the known indices is mapped as a table
> >> >> >   * index - every known index is mapped as a table
> >> >> >
> >> >> > We'd probably also need a "type_field" configuration for defining which
> >> >> > field to use for the type determination, as one of the possible future
> >> >> > ways to do things is to introduce a custom field:
> >> >> >
> >> >> > https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
> >> >> >
> >> >> > We already detect the ES version, so we can set a smart default for this
> >> >> > setting. Let's make the index config param optional.
> >> >> >
> >> >> >   * When no index is given, we discover indexes, and the default for
> >> >> >     "table_mapping" then is "index"
> >> >> >   * When an index is given, we only discover types according to the
> >> >> >     "type_field" configuration, and the default for "table_mapping" is
> >> >> >     "type"
> >> >> >
> >> >> > This would also allow discovering indexes while still using "type" as
> >> >> > "table_mapping".
> >> >> >
> >> >> > What do you think?
> >> >> >
> >> >> > Kind regards,
> >> >> > ------------------------------------------------------------------------
> >> >> > *Christian Beikov*
> >> >> > On 29.06.2018 at 02:41, Andrei Sereda wrote:
> >> >> > > Yes. There is an API to list all indexes / types in Elastic. They can be
> >> >> > > automatically imported into a schema.
> >> >> > >
> >> >> > > What needs to be agreed upon is how to expose those elements in the
> >> >> > > Calcite schema (naming / behaviour).
> >> >> > >
> >> >> > > 1) Many (most?) setups are single type per index. The natural way to name
> >> >> > > them would be "elastic.$index" (elastic being the schema name). Multiple
> >> >> > > indexes would be under the same schema: "elastic.index1", "elastic.index2" etc.
> >> >> > >
> >> >> > > 2) What if an index has several types? Should they be exported as Calcite
> >> >> > > tables "elastic.$index_type1", "elastic.$index_type2"? Or (current
> >> >> > > behaviour) as "elastic.type1" and "elastic.type2"? Or as a subschema,
> >> >> > > "elastic.$index.type1"?
> >> >> > >
> >> >> > > Now what if one has a combination of (1) and (2)?
> >> >> > > Setup (2) is already deprecated (and will be unsupported in the next
> >> >> > > version).
> >> >> > >
> >> >> > >
> >> >> > > On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov <christian.beikov@xxxxxxxxx>
> >> >> > > wrote:
> >> >> > >
> >> >> > >> Is there an API to discover indexes? If there is, I'd suggest we allow a
> >> >> > >> config option to make the adapter discover the possible indexes.
> >> >> > >> We'd still have to adapt the code a bit, but internally, the schema
> >> >> > >> could just keep a cache mapping type name to index name and be able to
> >> >> > >> support both scenarios.
> >> >> > >>
> >> >> > >>
> >> >> > >> Kind regards,
> >> >> > >> ------------------------------------------------------------------------
> >> >> > >> *Christian Beikov*
> >> >> > >> On 29.06.2018 at 00:12, Andrei Sereda wrote:
> >> >> > >>>> 1) What's the time horizon for the current adapter no longer working
> >> >> > >>>> with these changes to ES?
> >> >> > >>> The current adapter will keep working for a while with the existing setup.
> >> >> > >>> The problem is nomenclature and ease of use.
> >> >> > >>>
> >> >> > >>> Their new SQL concepts mapping
> >> >> > >>> <https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html>
> >> >> > >>> drops the notion of an ES type (which before was the equivalent of an RDBMS
> >> >> > >>> table) and uses the ES index as the new table equivalent (before, an ES
> >> >> > >>> index was equal to a database).
> >> >> > >>> Most users use Elastic this way (one type, one index): index == table.
> >> >> > >>>
> >> >> > >>> Currently Calcite requires a schema per index. In RDBMS parlance, a
> >> >> > >>> database per table (I'd like to change that).
> >> >> > >>>
> >> >> > >>>> 2) Any guess how complicated it would be to maintain code paths for both
> >> >> > >>>> behaviours? I know this is probably really challenging to estimate, but I
> >> >> > >>>> really have no idea of the scope of these changes. Would it mean two
> >> >> > >>>> different ES adapters?
> >> >> > >>> One can have just separate Calcite schema implementations (same adapter /
> >> >> > >>> module):
> >> >> > >>> 1)  LegacySchema (old). A schema can have only one index (but multiple
> >> >> > >>> types). Type == table in this case.
> >> >> > >>> 2)  NewSchema (new). A single schema can have multiple indexes (type is
> >> >> > >>> dropped). Index == table in this case.
> >> >> > >>>> 3) Do we really need compatibility with the current version of the
> >> >> > >>>> adapter?
> >> >> > >>>> IMO this depends on what versions of ES we would lose support for and how
> >> >> > >>>> complex it would be for users of the current ES adapter to make updates
> >> >> > >>>> for any Calcite API changes.
> >> >> > >>> The issue is not in the adapter but in how the Calcite schema exposes
> >> >> > >>> tables. Should it expose the index as an individual table (new), or the
> >> >> > >>> ES type (old)?
> >> >> > >>>
> >> >> > >>> Andrei.
> >> >> > >>>
> >> >> > >>> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mmior@xxxxxxxxxx> wrote:
> >> >> > >>>
> >> >> > >>>> Unfortunately I know very little about ES so I'm not in a great position
> >> >> > >>>> to assess the impact of these changes. I will say that legacy
> >> >> > >>>> compatibility is great, but maintaining two sets of logic is always a
> >> >> > >>>> challenge. A few follow-up questions:
> >> >> > >>>>
> >> >> > >>>> 1) What's the time horizon for the current adapter no longer working with
> >> >> > >>>> these changes to ES?
> >> >> > >>>>
> >> >> > >>>> 2) Any guess how complicated it would be to maintain code paths for both
> >> >> > >>>> behaviours? I know this is probably really challenging to estimate, but I
> >> >> > >>>> really have no idea of the scope of these changes. Would it mean two
> >> >> > >>>> different ES adapters?
> >> >> > >>>>
> >> >> > >>>> 3) Do we really need compatibility with the current version of the
> >> >> > >>>> adapter?
> >> >> > >>>> IMO this depends on what versions of ES we would lose support for and how
> >> >> > >>>> complex it would be for users of the current ES adapter to make updates
> >> >> > >>>> for any Calcite API changes.
> >> >> > >>>>
> >> >> > >>>> Thanks for your continued work on the ES adapter Andrei!
> >> >> > >>>>
> >> >> > >>>> --
> >> >> > >>>> Michael Mior
> >> >> > >>>> mmior@xxxxxxxxxx
> >> >> > >>>>
> >> >> > >>>>
> >> >> > >>>>
> >> >> > >>>> On Thu, 28 June 2018 at 12:57, Andrei Sereda <andrei@xxxxxxxxx> wrote:
> >> >> > >>>>> Hello,
> >> >> > >>>>>
> >> >> > >>>>> Elastic announced
> >> >> > >>>>> <https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html>
> >> >> > >>>>> that they will be deprecating mapping types in ES6 and indexes will be
> >> >> > >>>>> single-typed only.
> >> >> > >>>>>
> >> >> > >>>>> The historical analogy <https://www.elastic.co/blog/index-vs-type> between
> >> >> > >>>>> RDBMS and Elastic was that an index is equivalent to a database and a type
> >> >> > >>>>> corresponds to a table in that database. In a couple of releases (ES6-8)
> >> >> > >>>>> this shall no longer be true.
> >> >> > >>>>>
> >> >> > >>>>> The recent SQL addition
> >> >> > >>>>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released> to Elastic
> >> >> > >>>>> confirms this trend
> >> >> > >>>>> <https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html>.
> >> >> > >>>>> An index is equivalent to a table and there are no more ES types.
> >> >> > >>>>>
> >> >> > >>>>> I would like to propose including this logic in the Calcite ES adapter,
> >> >> > >>>>> i.e. expose each ES single-typed index as a separate table inside a
> >> >> > >>>>> Calcite schema. This is in contrast to the current integration, where a
> >> >> > >>>>> schema can only have a single index. The current approach forces you to
> >> >> > >>>>> create multiple schemas to query single-typed indexes (on the same ES
> >> >> > >>>>> cluster).
> >> >> > >>>>>
> >> >> > >>>>> Legacy compatibility can always be controlled with configuration
> >> >> > >>>>> parameters.
> >> >> > >>>>>
> >> >> > >>>>> Do you agree with such changes? If yes, would you consider a PR?
> >> >> > >>>>> Regards,
> >> >> > >>>>> Andrei.