Re: Hbase state backend in Flink


Hi Naveen,

AFAIK, a typical state backend has two levels of storage (local and
remote). I think it is similar to the relationship between a PC's main
memory and its disk.

Take the RocksDB state backend as an example. Window state (typically a
very large ListState) is persisted in partitioned local RocksDB files, so
adding an element to a window is a local, cheap operation. When a checkpoint
starts, each of those RocksDB instances uploads its files to its
corresponding HDFS directory separately. This works well in the sense that
any intermediate state between two successful checkpoints can be
overwritten, and local snapshots can be taken cheaply and asynchronously.
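
To make the local/remote split concrete, here is a toy sketch in plain
Python. It is not Flink or RocksDB code: the class LocalPartitionStore, the
helpers run_checkpoint and demo, and the chk-N/subtask-N directory layout are
all made up for illustration, and a temporary directory stands in for HDFS.
It only shows the shape of the idea: writes stay local, and at checkpoint
time every partition uploads its own file in parallel.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


class LocalPartitionStore:
    """Hypothetical stand-in for one subtask's local RocksDB instance."""

    def __init__(self, subtask_id: int, local_root: Path):
        self.subtask_id = subtask_id
        self.path = local_root / f"subtask-{subtask_id}.log"
        self.path.touch()

    def append(self, window_key: str, element: str) -> None:
        # Adding an element only touches the local file, no network hop.
        with self.path.open("a") as f:
            f.write(f"{window_key}\t{element}\n")

    def snapshot_to(self, remote_root: Path, checkpoint_id: int) -> Path:
        # Each partition copies its own file into its own remote directory.
        target_dir = remote_root / f"chk-{checkpoint_id}" / f"subtask-{self.subtask_id}"
        target_dir.mkdir(parents=True, exist_ok=True)
        return Path(shutil.copy2(self.path, target_dir / self.path.name))


def run_checkpoint(stores, remote_root: Path, checkpoint_id: int):
    # Uploads are independent, so they can run in parallel, one per partition.
    with ThreadPoolExecutor(max_workers=len(stores)) as pool:
        futures = [pool.submit(s.snapshot_to, remote_root, checkpoint_id) for s in stores]
        return [f.result() for f in futures]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        local_root = Path(tmp) / "local"
        remote_root = Path(tmp) / "remote"  # assumption: stands in for HDFS
        local_root.mkdir()
        stores = [LocalPartitionStore(i, local_root) for i in range(3)]
        for n in range(9):
            stores[n % 3].append("window-0", f"event-{n}")
        uploaded = run_checkpoint(stores, remote_root, checkpoint_id=1)
        # Read the uploaded copies back, one list of lines per partition.
        return [p.read_text().splitlines() for p in uploaded]
```

Note that this sketch copies the files while nothing is writing to them. A
real backend needs a consistent point-in-time view of the local files before
it uploads them in the background.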

I have heard of folks trying to build a MySQL backend (deprecated), a remote
RocksDB-as-a-service backend (hard to scale, and a performance bottleneck),
and a Cassandra backend (hard to snapshot). All of them share the same
trait: they lack local, parallelizable snapshot semantics.

Hope this helps!
Chen

On Thu, Dec 27, 2018 at 8:27 AM miki haiat <miko5054@xxxxxxxxx> wrote:

> Did you try to use RocksDB[1] as the state backend?
>
>
> [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/state_backends.html#the-rocksdbstatebackend
>
>
> On Thu, 27 Dec 2018, 18:17 Naveen Kumar <naveenkumar.g@xxxxxxxxxxxx.invalid> wrote:
>
> > Hi,
> >
> > I am exploring whether we can plug in HBase as a state backend in Flink.
> > We have a need for streaming jobs with large window states, high
> > throughput, and reliability.
> >
> > I wanted to know if implementing a Flink state backend on HBase or
> > another distributed KV store is possible. Any documentation or pointers
> > would be helpful.
> >
> > Thanks,
> > Naveen
> >
>