

Re: Elasticsearch Adapter. Removal of Mapping Types (by vendor). Index == Table


I'm not sure what the benefit of allowing users to specify this scheme would be. We'd have to parse it, interpret it, make sure the expressions don't result in conflicting names, etc.

IMO a simple mode configuration would be way easier to implement and probably cover 99% of the use cases.


Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
On 29.06.2018 at 20:19, Julian Hyde wrote:
Andrei,

I'm not an ES user so I don't fully understand this issue, but my two
cents anyway...

Can you show how those examples affect SQL against the ES adapter
and/or how they affect JSON models?

You seem to be using '_' as a separator character. Are we sure that
people will never use it in an index or type name? Separator characters
often cause problems.

Julian




On Fri, Jun 29, 2018 at 10:58 AM, Andrei Sereda <andrei@xxxxxxxxx> wrote:
I agree there should be a configuration option. How about the following
approach?

Expose both variables ${index} and ${type} in the configuration (JSON), and
the user will use them to generate the table name in the Calcite schema.

Example:
"table_name": "${type}" // current
"table_name": "${index}" // new (default?)
"table_name": "${index}_${type}" // most generic; supports multiple types per index
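
A minimal sketch of how such patterns could be expanded, and of the name
conflicts raised elsewhere in this thread. This is an illustration only, not
adapter code: the function names are hypothetical, and Python's
string.Template is used simply because it happens to accept the same
${name} placeholder syntax.

```python
from string import Template


def table_name(pattern, index, type_):
    """Expand ${index} and ${type} placeholders in a table-name pattern."""
    # Hypothetical helper: placeholders absent from the pattern are ignored.
    return Template(pattern).substitute(index=index, type=type_)


def build_table_names(pattern, pairs):
    """Map each generated table name to its (index, type) pair.

    Raises ValueError when two different pairs produce the same name.
    """
    names = {}
    for index, type_ in pairs:
        name = table_name(pattern, index, type_)
        if name in names and names[name] != (index, type_):
            raise ValueError(
                f"table name {name!r} is ambiguous: "
                f"{names[name]} vs {(index, type_)}"
            )
        names[name] = (index, type_)
    return names
```

With the pattern "${index}_${type}", the pairs ("a_b", "c") and ("a", "b_c")
both expand to "a_b_c", so the sketch rejects them. This is the kind of
separator problem that any such scheme would have to check for.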





On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mmior@xxxxxxxxxx> wrote:

I think it sounds like you and Andrei are in a good position to tackle this
one so I'm happy to have you both work on whatever solution you think is
best.

--
Michael Mior
mmior@xxxxxxxxxx



On Fri, 29 June 2018 at 04:19, Christian Beikov <christian.beikov@xxxxxxxxx>
wrote:

IMO the best solution would be to make it configurable by introducing a
"table_mapping" config with values

   * type - every type in the known indices is mapped as table
   * index - every known index is mapped as table

We'd probably also need a "type_field" configuration for defining which
field to use for the type determination as one of the possible future
ways to do things is to introduce a custom field:


https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
We already detect the ES version, so we can set a smart default for this
setting. Let's make the index config param optional.

   * When no index is given, we discover indexes, the default for
     "table_mapping" then is "index"
   * When index is given, then we only discover types according to the
     "type_field" configuration and the default for "table_mapping" is
     "type"

This would also allow discovering indexes while still using "type" as
"table_mapping".

What do you think?
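
The defaulting rules above could be sketched roughly as follows. This is only
an illustration of the proposal: the function name and argument names are
hypothetical, and the ES-version-based default is left out because the thread
does not spell it out.

```python
from typing import Optional

# The two modes proposed for "table_mapping".
VALID_MAPPINGS = ("index", "type")


def resolve_table_mapping(index: Optional[str] = None,
                          table_mapping: Optional[str] = None) -> str:
    """Return the effective "table_mapping" for a schema configuration."""
    if table_mapping is not None:
        # An explicit setting always wins, e.g. discovering indexes
        # while still mapping types as tables.
        if table_mapping not in VALID_MAPPINGS:
            raise ValueError(f"unknown table_mapping: {table_mapping!r}")
        return table_mapping
    # No index configured: discover indexes and map each one as a table.
    # Index configured: discover its types and map each type as a table.
    return "index" if index is None else "type"
```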

Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
On 29.06.2018 at 02:41, Andrei Sereda wrote:
Yes. There is an API to list all indexes / types in elastic. They can be
automatically imported into a schema.

What needs to be agreed upon is how to expose those elements in calcite
schema (naming / behaviour).

1) Many (most?) setups are single type per index. The natural way to name
them would be "elastic.$index" (elastic being the schema name). Multiple
indexes would be under the same schema: "elastic.index1", "elastic.index2",
etc.

2) What if an index has several types? Should they be exported as Calcite
tables "elastic.$index_type1" and "elastic.$index_type2"? Or (current
behaviour) as "elastic.type1" and "elastic.type2"? Or as a subschema
"elastic.$index.type1"?

Now what if one has a combination of (1) and (2)?
Setup (2) is already deprecated (and will be unsupported in the next
version).

On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov <christian.beikov@xxxxxxxxx>
wrote:

Is there an API to discover indexes? If there is, I'd suggest we allow a
config option to make the adapter discover the possible indexes.
We'd still have to adapt the code a bit, but internally, the schema
could just keep a cached map from type name to index name and be able to
support both scenarios.
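
A rough sketch of such a cache, under stated assumptions: `discover` is a
hypothetical callable returning a dict of index name to list of type names
(standing in for whatever discovery API is used), and the class name is
invented for illustration. Rejecting a type name that appears in two indexes
is one possible policy, not something decided in this thread.

```python
class TypeIndexCache:
    """Lazily built lookup from type name to the index that contains it."""

    def __init__(self, discover):
        # discover: hypothetical callable returning {index_name: [type_name, ...]}
        self._discover = discover
        self._type_to_index = None

    def index_for(self, type_name):
        """Return the index holding type_name, running discovery only once."""
        if self._type_to_index is None:
            self._type_to_index = self._load()
        return self._type_to_index[type_name]

    def _load(self):
        type_to_index = {}
        for index, types in self._discover().items():
            for type_name in types:
                if type_name in type_to_index:
                    # Assumed policy: a type name must identify a single index.
                    raise ValueError(
                        f"type {type_name!r} exists in both "
                        f"{type_to_index[type_name]!r} and {index!r}"
                    )
                type_to_index[type_name] = index
        return type_to_index
```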


Kind regards,

------------------------------------------------------------------------
*Christian Beikov*
On 29.06.2018 at 00:12, Andrei Sereda wrote:
1) What's the time horizon for the current adapter no longer working
with these changes to ES?
The current adapter will keep working for a while with the existing setup.
The problem is nomenclature and ease of use.

Their new SQL concepts mapping
<https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html>
drops the notion of ES type (which before was the equivalent of an RDBMS
table) and uses the ES index as the new table equivalent (before, an ES
index was equal to a database).
Most users use elastic this way (one type, one index): index == table.
Currently Calcite requires a schema per index; in RDBMS parlance, a
database per table (I'd like to change that).

2) Any guess how complicated it would be to maintain code paths for both
behaviours? I know this is probably really challenging to estimate, but I
really have no idea of the scope of these changes. Would it mean two
different ES adapters?
One can have just separate Calcite schema implementations (same
adapter / module):
1)  LegacySchema (old). A schema can have only one index (but multiple
types). Type == table in this case.
2)  NewSchema (new). A single schema can have multiple indexes (type is
dropped). Index == table in this case.

3) Do we really need compatibility with the current version of the adapter?
IMO this depends on what versions of ES we would lose support for and how
complex it would be for users of the current ES adapter to make updates for
any Calcite API changes.
The issue is not in the adapter but in how the Calcite schema exposes
tables. Should it expose each index as an individual table (new), or each
ES type (old)?

Andrei.

On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mmior@xxxxxxxxxx> wrote:
Unfortunately I know very little about ES so I'm not in a great position to
assess the impact of these changes. I will say that legacy compatibility is
great, but maintaining two sets of logic is always a challenge. A few
follow-up questions:

1) What's the time horizon for the current adapter no longer working with
these changes to ES?

2) Any guess how complicated it would be to maintain code paths for both
behaviours? I know this is probably really challenging to estimate, but I
really have no idea of the scope of these changes. Would it mean two
different ES adapters?

3) Do we really need compatibility with the current version of the adapter?
IMO this depends on what versions of ES we would lose support for and how
complex it would be for users of the current ES adapter to make updates for
any Calcite API changes.

Thanks for your continued work on the ES adapter Andrei!

--
Michael Mior
mmior@xxxxxxxxxx



On Thu, 28 June 2018 at 12:57, Andrei Sereda <andrei@xxxxxxxxx> wrote:
Hello,

Elastic announced
<https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html>
that they will be deprecating mapping types in ES6 and indexes will be
single-typed only.

The historical analogy <https://www.elastic.co/blog/index-vs-type> between
RDBMS and elastic was that an index is equivalent to a database and a type
corresponds to a table in that database. In a couple of releases (ES6-8)
this will no longer be true.

The recent SQL addition
<https://www.elastic.co/blog/elasticsearch-6-3-0-released> to elastic
confirms this trend
<https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html>.
An index is equivalent to a table and there are no more ES types.

I would like to propose including this logic in the Calcite ES adapter,
i.e., expose each ES single-typed index as a separate table inside the
Calcite schema. This is in contrast to the current integration, where a
schema can only have a single index. The current approach forces you to
create multiple schemas to query single-typed indexes (on the same ES
cluster).

Legacy compatibility can always be controlled with configuration
parameters.

Do you agree with such changes ? If yes, would you consider a PR ?

Regards,
Andrei.




