Fwd: Re: Data model storage optimization



How many rows on average per partition?
Around 10K.


Let me get this straight: you are bifurcating your partitions on either email or username, potentially doubling the data, because you don't have a way to manage a central system of record for users?

We are just analyzing the output logs of a "perfectly" running application, so no one will let me change its data design. I thought this might be a more general problem for Cassandra users, where someone both
1. needed to access an identical set of columns by multiple keys (all the keys should be present in the rows), and
2. had a storage limit (TTL * input rate would come to some TBs).
I know there is a strict rule in Cassandra data modeling: "never use foreign keys and sacrifice disk instead". But has anyone ever been forced to do such a thing, and if so, how?
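
To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It compares writing the full row once per access key against writing it once and adding a small pointer row (key -> primary id) for each extra key. Every number and function name below is a made-up assumption for illustration, not a measurement from this application, and the estimate ignores replication factor, compression, and per-cell overhead.

```python
def denormalized_bytes(rows_per_sec, ttl_sec, row_bytes, n_keys):
    """Live data when the full row is written once per access key."""
    return rows_per_sec * ttl_sec * row_bytes * n_keys


def lookup_bytes(rows_per_sec, ttl_sec, row_bytes, n_keys, pointer_bytes):
    """Live data when the full row is written once, plus one small
    pointer row (key -> primary id) for each additional access key."""
    live_rows = rows_per_sec * ttl_sec
    return live_rows * (row_bytes + (n_keys - 1) * pointer_bytes)


# Hypothetical inputs, chosen only to make the arithmetic concrete.
RATE = 1_000          # rows written per second
TTL = 30 * 24 * 3600  # 30-day TTL, in seconds
ROW = 500             # bytes per full row
POINTER = 50          # bytes per pointer row
KEYS = 2              # e.g. email and username

dup = denormalized_bytes(RATE, TTL, ROW, KEYS)
ptr = lookup_bytes(RATE, TTL, ROW, KEYS, POINTER)
print(f"denormalized: {dup / 1e12:.2f} TB")
print(f"lookup table: {ptr / 1e12:.2f} TB")
```

The pointer layout saves disk only when the pointer row is much smaller than the full row, and it costs a second read for every query that goes through the secondary key.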


