Data model storage optimization


The current data model, written in the form table name: ((partition_key),cluster_key),other_column1,other_column2,... is:

user_by_name: ((time_bucket, username)),ts,request,email
user_by_mail: ((time_bucket, email)),ts,request,username

Both keys (username and email) are repeated in every table because the same email may appear with different usernames, and the same username may appear with different emails. The queries against this data model are:
1. username = X
2. mail = Y
3. username = X and mail = Y (we query one of the tables and, because the result contains only a small number of records, we filter on the other column)
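
To make the three access patterns concrete, here is a minimal Python sketch. It uses plain in-memory dictionaries as stand-ins for the two tables, not a Cassandra driver. The function names, the time bucket value "2019-01", and the sample rows are all made up for illustration.

```python
from collections import defaultdict

# In-memory stand-ins for the two tables (assumption: not real Cassandra).
# Each maps a partition key to the list of rows stored under it.
user_by_name = defaultdict(list)  # key: (time_bucket, username)
user_by_mail = defaultdict(list)  # key: (time_bucket, email)


def insert(time_bucket, username, email, ts, request):
    """Write the same event to both tables, as the data model requires."""
    user_by_name[(time_bucket, username)].append(
        {"ts": ts, "request": request, "email": email})
    user_by_mail[(time_bucket, email)].append(
        {"ts": ts, "request": request, "username": username})


def by_username(time_bucket, username):
    """Query 1: username = X."""
    return list(user_by_name.get((time_bucket, username), []))


def by_email(time_bucket, email):
    """Query 2: mail = Y."""
    return list(user_by_mail.get((time_bucket, email), []))


def by_username_and_email(time_bucket, username, email):
    """Query 3: read one partition, then filter on the other column."""
    return [row for row in by_username(time_bucket, username)
            if row["email"] == email]


# Hypothetical sample data: one username with two emails,
# and one email shared by two usernames.
insert("2019-01", "alice", "a@example.com", 1, "GET /")
insert("2019-01", "alice", "shared@example.com", 2, "GET /x")
insert("2019-01", "bob", "shared@example.com", 3, "POST /y")

print(by_username_and_email("2019-01", "alice", "shared@example.com"))
```

Every event is written twice, once per table, which is where the duplicated storage comes from.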

This data model wastes a lot of storage.
I thought about using a UUID, hash code, or sequence to handle this, but I can't keep track of old versus new records (the ones that already have a UUID).
Any recommendations on optimizing the data model to save storage?
