[jira] [Updated] (ATLAS-2649) Hive Hook should create lineage entities when storage handler mechanism to create hbase tables via hive


     [ https://issues.apache.org/jira/browse/ATLAS-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramesh Mani updated ATLAS-2649:
-------------------------------
    Affects Version/s: trunk

> Hive Hook should create lineage entities when storage handler mechanism to create hbase tables via hive
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ATLAS-2649
>                 URL: https://issues.apache.org/jira/browse/ATLAS-2649
>             Project: Atlas
>          Issue Type: Bug
>    Affects Versions: trunk
>            Reporter: Ramesh Mani
>            Priority: Major
>             Fix For: trunk
>
>
> The Hive hook should create lineage entities when Hive's storage handler mechanism is used to create HBase tables.
> When a Hive table is backed by HBase through Hive's HBaseStorageHandler, a corresponding HBase table is created in HBase and the data is stored in it. For this operation the Hive hook should record a process with the Hive table as its input and the HBase table as its output.
> For example:
> CREATE TABLE hbase_table_emp(id int, name string, role string) 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:role")
> TBLPROPERTIES ("hbase.table.name" = "emp");
> This creates a corresponding HBase table named emp:
> hbase(main):003:0> list
> TABLE
> ATLAS_ENTITY_AUDIT_EVENTS
> atlas_janus
> emp
> 3 row(s)
> Took 0.0127 seconds
> => ["ATLAS_ENTITY_AUDIT_EVENTS", "atlas_janus", "emp"]
> hbase(main):004:0> describe 'emp'
> Table emp is ENABLED
> emp
> COLUMN FAMILIES DESCRIPTION
> {NAME => 'cf1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false',
> KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER',
> MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false',
> IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE',
> BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
> 1 row(s)
> Took 0.1961 seconds
>  
> For this operation the Hive hook should provide lineage information from the Hive table to the HBase table that stores its data.
>  
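The requested lineage (Hive table as input, HBase table as output) might be pictured with the following minimal sketch. It is not the Atlas hook's code. It only builds a plain dictionary shaped like a process entity that links the two tables from the example. The type names (hive_table, hbase_table, hive_process), the qualifiedName formats, the cluster name "primary", and the "default" Hive database and HBase namespace are all assumptions made for illustration; the issue itself does not state them.

```python
# Illustrative sketch only: the expected lineage shape, not the actual Atlas hook code.
# Type names, qualifiedName formats, and the cluster name are assumptions.

CLUSTER = "primary"  # hypothetical cluster name


def expected_lineage(hive_db, hive_table, hbase_table, hbase_namespace="default"):
    """Describe a process whose input is a Hive table and whose output is an HBase table."""
    hive_ref = {
        "typeName": "hive_table",  # assumed type name
        "uniqueAttributes": {
            "qualifiedName": f"{hive_db}.{hive_table}@{CLUSTER}",  # assumed format
        },
    }
    hbase_ref = {
        "typeName": "hbase_table",  # assumed type name
        "uniqueAttributes": {
            "qualifiedName": f"{hbase_namespace}:{hbase_table}@{CLUSTER}",  # assumed format
        },
    }
    return {
        "typeName": "hive_process",  # assumed type name
        "attributes": {
            "name": f"{hive_table} -> {hbase_table}",
            "inputs": [hive_ref],
            "outputs": [hbase_ref],
        },
    }


# Table names come from the CREATE TABLE example; the "default" database is assumed.
lineage = expected_lineage("default", "hbase_table_emp", "emp")
```

In this sketch the Hive table name comes from the CREATE TABLE statement and the HBase table name from its "hbase.table.name" property, which is the mapping the hook would need to read to connect the two entities.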



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)