git.net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

How to read from Cassandra using Apache Flink?


My flink program should do a Cassandra look up for each input record and
based on the results, should do some further processing.

But I'm currently stuck at reading data from Cassandra. This is the code
snippet I've come up with so far.

> ClusterBuilder secureCassandraSinkClusterBuilder = new ClusterBuilder() {
>         @Override
>         protected Cluster buildCluster(Cluster.Builder builder) {
>             return
> builder.addContactPoints(props.getCassandraClusterUrlAll().split(","))
>                     .withPort(props.getCassandraPort())
>                     .withAuthProvider(new DseGSSAPIAuthProvider("HTTP"))
>                     .withQueryOptions(new 
> QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM))
>                    .build();
>        }
>    };

>    for (int i=1; i<5; i++) {
>        CassandraInputFormat<Tuple2&lt;String, String>>
> cassandraInputFormat =
>                new CassandraInputFormat<>("select * from test where
> id=hello" + i, >secureCassandraSinkClusterBuilder);
>        cassandraInputFormat.configure(null);
>        cassandraInputFormat.open(null);
>        Tuple2<String, String> out = new Tuple8<>();
>        cassandraInputFormat.nextRecord(out);
>        System.out.println(out);
>    }
But the issue with this is, it takes nearly 10 seconds for each look up, in
other words, this for loop takes 50 seconds to execute.

How do I speed up this operation? Alternatively, is there any other way of
looking up Cassandra in Flink?





--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/