
Fwd: Re: Data model storage optimization

How many rows on average per partition?
Around 10K.

Let me get this straight: you are bifurcating your partitions on either email or username, potentially doubling the data, because you don’t have a way to manage a central system of record for users?

We are only analyzing the output logs of a "perfectly" running application, so no one will let me change its data design. I thought this might be a more general problem for Cassandra users, where someone both:
1. needed to access an identical set of columns by multiple keys (all of the keys should be present in the rows), and
2. had a storage limit (TTL * input rate would come to some TBs).
I know there is a strict rule in Cassandra data modeling: "never use foreign keys and sacrifice disk instead". But has anyone ever been forced to do such a thing, and if so, how?
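
To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python comparing the two layouts: writing every row once per access key, versus storing the rows once under a surrogate id and keeping one small lookup entry per key. Apart from the ~10K rows per partition mentioned above, every number and name here (row size, key size, id size, the function names) is a hypothetical assumption for illustration, and the estimate ignores replication, compression and per-cell overhead.

```python
def denormalized_bytes(partitions, rows_per_partition, row_bytes, num_keys):
    """Layout A: every row is written once per access key (e.g. email and username)."""
    return partitions * rows_per_partition * row_bytes * num_keys


def lookup_bytes(partitions, rows_per_partition, row_bytes, num_keys,
                 key_bytes, id_bytes):
    """Layout B: rows stored once under a surrogate id, plus one lookup entry per key."""
    data = partitions * rows_per_partition * row_bytes
    lookups = partitions * num_keys * (key_bytes + id_bytes)
    return data + lookups


if __name__ == "__main__":
    # Hypothetical sizing: only rows_per_partition comes from the thread.
    partitions = 1_000_000
    rows_per_partition = 10_000
    row_bytes = 100
    a = denormalized_bytes(partitions, rows_per_partition, row_bytes, num_keys=2)
    b = lookup_bytes(partitions, rows_per_partition, row_bytes, num_keys=2,
                     key_bytes=32, id_bytes=16)
    print(f"one copy per key: {a / 1e12:.2f} TB")
    print(f"lookup + one copy: {b / 1e12:.2f} TB")
```

Under these assumptions the lookup layout needs roughly half the disk, but every read by email or username costs two queries (key to id, then id to rows), which is exactly the "foreign key" pattern the rule warns against.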