Cassandra - Map type read latency


First of all, thanks to everyone here who answers our queries.

I am stuck on one point and hope someone can help me out of this problem.

I have an Apache Cassandra 2.1 cluster with 6 nodes and have created a table with 3 columns: the first column is of text type and the other two are of map type. When I insert data into this table and read it back, fetching one row takes around 20 milliseconds. If I create the table with 3 text columns instead, it takes 5 ms. Kindly suggest what I am missing. Why does it take longer with the map type? I am confused about where to start with map type read latency.

Below are the cfstats output, the table definition, and the query trace:

    Table: product_type
    SSTable count: 1
    SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 0]
    Space used (live): 81458
    Space used (total): 81458
    Space used by snapshots (total): 0
    Off heap memory used (total): 87
    SSTable Compression Ratio: 0.15090414689301526
    Number of keys (estimate): 6
    Memtable cell count: 0
    Memtable data size: 0
    Memtable off heap memory used: 0
    Memtable switch count: 0
    Local read count: 5
    Local read latency: 22.494 ms
    Local write count: 0
    Local write latency: NaN ms
    Pending flushes: 0
    Bloom filter false positives: 0
    Bloom filter false ratio: 0.00000
    Bloom filter space used: 16
    Bloom filter off heap memory used: 8
    Index summary off heap memory used: 15
    Compression metadata off heap memory used: 64
    Compacted partition minimum bytes: 73458
    Compacted partition maximum bytes: 105778
    Compacted partition mean bytes: 91087
    Average live cells per slice (last 5 minutes): 1.0
    Maximum live cells per slice (last 5 minutes): 1.0
    Average tombstones per slice (last 5 minutes): 0.0
    Maximum tombstones per slice (last 5 minutes): 0.0

    CREATE TABLE test.product_type (
        type text PRIMARY KEY,
        col1 map<int, boolean>,
        timestamp_map map<int, timestamp>
    ) WITH bloom_filter_fp_chance = 0.1
        AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
        AND comment = ''
        AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'max_threshold': '32'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND dclocal_read_repair_chance = 0.1
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';

    activity | timestamp | source | source_elapsed
    ---------+-----------+--------+---------------
    Execute CQL3 query | 2015-06-03 21:57:36.841000 | 10.65.133.202 | 0
    Parsing select * from location_eligibility_by_type5; [SharedPool-Worker-1] | 2015-06-03 21:57:36.842000 | 10.65.133.202 | 54
    Preparing statement [SharedPool-Worker-1] | 2015-06-03 21:57:36.842000 | 10.65.133.202 | 86
    Computing ranges to query [SharedPool-Worker-1] | 2015-06-03 21:57:36.842000 | 10.65.133.202 | 165
    Submitting range requests on 1537 ranges with a concurrency of 1 (0.0 rows per range expected) [SharedPool-Worker-1] | 2015-06-03 21:57:36.842000 | 10.65.133.202 | 410
    Enqueuing request to /10.65.137.191 [SharedPool-Worker-1] | 2015-06-03 21:57:36.849000 | 10.65.133.202 | 7448
    Message received from /10.65.133.202 [Thread-15] | 2015-06-03 21:57:36.849000 | 10.65.137.191 | 15
    Submitted 1 concurrent range requests covering 1537 ranges [SharedPool-Worker-1] | 2015-06-03 21:57:36.849000 | 10.65.133.202 | 7488
    Sending message to /10.65.137.191 [WRITE-/10.65.137.191] | 2015-06-03 21:57:36.849000 | 10.65.133.202 | 7515
    Executing seq scan across 0 sstables for [min(-9223372036854775808), min(-9223372036854775808)] [SharedPool-Worker-1] | 2015-06-03 21:57:36.850000 | 10.65.137.191 | 105
    Read 1 live and 0 tombstoned cells [SharedPool-Worker-1] | 2015-06-03 21:57:36.866000 | 10.65.137.191 | 16851
    Read 1 live and 0 tombstoned cells [SharedPool-Worker-1] | 2015-06-03 21:57:36.882000 | 10.65.137.191 | 33542
    Read 1 live and 0 tombstoned cells [SharedPool-Worker-1] | 2015-06-03 21:57:36.899000 | 10.65.137.191 | 50206
    Read 1 live and 0 tombstoned cells [SharedPool-Worker-1] | 2015-06-03 21:57:36.915000 | 10.65.137.191 | 66556
    Read 1 live and 0 tombstoned cells [SharedPool-Worker-1] | 2015-06-03 21:57:36.932000 | 10.65.137.191 | 82814
    Scanned 5 rows and matched 5 [SharedPool-Worker-1] | 2015-06-03 21:57:36.932000 | 10.65.137.191 | 82839
    Enqueuing response to /10.65.133.202 [SharedPool-Worker-1] | 2015-06-03 21:57:36.933000 | 10.65.137.191 | 82878
    Sending message to /10.65.133.202 [WRITE-/10.65.133.202] | 2015-06-03 21:57:36.933000 | 10.65.137.191 | 83054
    Message received from /10.65.137.191 [Thread-151] | 2015-06-03 21:57:36.944000 | 10.65.133.202 | 102134
    Processing response from /10.65.137.191 [SharedPool-Worker-2] | 2015-06-03 21:57:36.944000 | 10.65.133.202 | 102191
    Request complete | 2015-06-03 21:57:36.948916 | 10.65.133.202 | 107916

Thanks in advance for your support and answers.

Thanks, John

Collection types in Cassandra are implemented as blobs under the hood, so there is no real magic here.

To measure the difference, you can enable tracing in C* and see it for yourself:

    create table no_collections(id int, value text, primary key (id));
    create table with_collections(id int, value set<text>, primary key (id));

    cqlsh:stackoverflow> select * from no_collections ;

     id | value
    ----+-------------
      1 | foo,bar,baz
      2 | xxx,yyy,zzz
      3 | aaa,bbb,ccc

    (3 rows)
    cqlsh:stackoverflow> select * from with_collections ;

     id | value
    ----+-----------------------
      1 | {'bar', 'baz', 'foo'}
      2 | {'xxx', 'yyy', 'zzz'}
      3 | {'aaa', 'bbb', 'ccc'}

    (3 rows)

Now let's enable tracing and see what's going on:

    cqlsh:stackoverflow> tracing on ;
    Tracing enabled
    cqlsh:stackoverflow> select * from with_collections where id=3;

     id | value
    ----+-----------------------
      3 | {'aaa', 'bbb', 'ccc'}

    (1 rows)

    Tracing session: 7c3d4ed0-09c8-11e5-b4cd-2988e70b20cb

    activity | timestamp | source | source_elapsed
    ---------+-----------+--------+---------------
    Execute CQL3 query | 2015-06-03 11:13:58.717000 | 127.0.0.1 | 0
    Parsing select * from with_collections where id=3; [SharedPool-Worker-1] | 2015-06-03 11:13:58.718000 | 127.0.0.1 | 72
    Preparing statement [SharedPool-Worker-1] | 2015-06-03 11:13:58.718000 | 127.0.0.1 | 218
    Executing single-partition query on with_collections [SharedPool-Worker-3] | 2015-06-03 11:13:58.718000 | 127.0.0.1 | 547
    Acquiring sstable references [SharedPool-Worker-3] | 2015-06-03 11:13:58.718000 | 127.0.0.1 | 556
    Merging memtable tombstones [SharedPool-Worker-3] | 2015-06-03 11:13:58.719000 | 127.0.0.1 | 574
    Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-3] | 2015-06-03 11:13:58.719000 | 127.0.0.1 | 636
    Merging data from memtables and 0 sstables [SharedPool-Worker-3] | 2015-06-03 11:13:58.719000 | 127.0.0.1 | 644
    Read 1 live and 0 tombstoned cells [SharedPool-Worker-3] | 2015-06-03 11:13:58.719000 | 127.0.0.1 | 673
    Request complete | 2015-06-03 11:13:58.717847 | 127.0.0.1 | 847

As you can see, it took ~800 microseconds (the source_elapsed column is in microseconds) to parse and execute the query that uses a collection. Without collections the situation looks the same:

    cqlsh:stackoverflow> select * from no_collections where id=3;

     id | value
    ----+-------------
      3 | aaa,bbb,ccc

    (1 rows)

    Tracing session: 7e9ac6d0-09c8-11e5-b4cd-2988e70b20cb

    activity | timestamp | source | source_elapsed
    ---------+-----------+--------+---------------
    Execute CQL3 query | 2015-06-03 11:14:02.685000 | 127.0.0.1 | 0
    Parsing select * from no_collections where id=3; [SharedPool-Worker-1] | 2015-06-03 11:14:02.686000 | 127.0.0.1 | 77
    Preparing statement [SharedPool-Worker-1] | 2015-06-03 11:14:02.686000 | 127.0.0.1 | 209
    Executing single-partition query on no_collections [SharedPool-Worker-3] | 2015-06-03 11:14:02.686000 | 127.0.0.1 | 525
    Acquiring sstable references [SharedPool-Worker-3] | 2015-06-03 11:14:02.686000 | 127.0.0.1 | 534
    Merging memtable tombstones [SharedPool-Worker-3] | 2015-06-03 11:14:02.687000 | 127.0.0.1 | 553
    Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-3] | 2015-06-03 11:14:02.687000 | 127.0.0.1 | 598
    Merging data from memtables and 0 sstables [SharedPool-Worker-3] | 2015-06-03 11:14:02.688000 | 127.0.0.1 | 606
    Read 1 live and 0 tombstoned cells [SharedPool-Worker-3] | 2015-06-03 11:14:02.688000 | 127.0.0.1 | 630
    Request complete | 2015-06-03 11:14:02.685789 | 127.0.0.1 | 789

So you can see there is no real difference here.

The timing shown by cqlsh tracing is approximate and not statistically meaningful. To explore the difference, you need to run at least a few dozen experiments and compare their results. The results may depend on several things:

  • Network latency between nodes. It can be a cause of latency problems in shared infrastructure such as AWS.
  • Cluster load. If your cluster is not idle, there may be background work that interferes with your measurements.
  • Background jobs. If you have a dataset with frequent updates/deletes, C* may be doing compaction tasks under the hood, and these can interfere with other queries.
  • Heavy updates/deletes, low memory. If you have (or had in the past) a heavy update/delete workload, your data may be spread among multiple small SSTables that are not yet compacted. C* must read all of them to assemble a row, which leads to high query latency.
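
The advice above to repeat the measurement a few dozen times can be sketched in Python. This is only an illustration under assumptions: `measure_latency` and its `run_query` parameter are hypothetical names, and `run_query` stands for any zero-argument callable that executes your query once (for example, a small wrapper around your driver's execute call). The stand-in at the bottom just sleeps, so the script runs without a cluster.

```python
import statistics
import time


def measure_latency(run_query, runs=50, warmup=5):
    """Call run_query repeatedly and return latency statistics in milliseconds.

    run_query is a hypothetical zero-argument callable that performs one query.
    """
    # Warm-up calls are not timed, so caches and connections settle first.
    for _ in range(warmup):
        run_query()

    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Nearest-rank 95th percentile: ceil(0.95 * n) - 1, in integer arithmetic.
    p95_index = (95 * len(samples_ms) + 99) // 100 - 1
    return {
        "runs": len(samples_ms),
        "min_ms": samples_ms[0],
        "median_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
        "max_ms": samples_ms[-1],
    }


if __name__ == "__main__":
    # Stand-in for a real query: sleeps about 1 ms instead of contacting a cluster.
    print(measure_latency(lambda: time.sleep(0.001), runs=30))
```

Running it once against the map table and once against the text table, and comparing the median and 95th percentile rather than single runs, gives a fairer comparison than one trace of each.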

So I suggest you run your queries with tracing enabled to see where the problem is. I bet it isn't connected to collections at all.
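
When reading a trace, the gaps between consecutive source_elapsed values on the same node show where the time goes. Here is a minimal Python sketch of that arithmetic, using the values reported by node 10.65.137.191 in the trace posted in the question. The helper name `step_gaps` is made up for this example, and it says nothing about why a step is slow, only which step it is.

```python
def step_gaps(events):
    """Return (activity, gap in microseconds) for consecutive events on one node."""
    gaps = []
    for (_, prev_elapsed), (activity, elapsed) in zip(events, events[1:]):
        gaps.append((activity, elapsed - prev_elapsed))
    return gaps


# (activity, source_elapsed in microseconds) copied from the question's trace,
# rows whose source is 10.65.137.191.
replica_events = [
    ("Executing seq scan across 0 sstables", 105),
    ("Read 1 live and 0 tombstoned cells", 16851),
    ("Read 1 live and 0 tombstoned cells", 33542),
    ("Read 1 live and 0 tombstoned cells", 50206),
    ("Read 1 live and 0 tombstoned cells", 66556),
    ("Read 1 live and 0 tombstoned cells", 82814),
    ("Scanned 5 rows and matched 5", 82839),
]

for activity, gap_us in step_gaps(replica_events):
    print(f"{gap_us / 1000:8.3f} ms  {activity}")
```

In that trace each of the five row reads accounts for roughly 16 to 17 ms on the replica, while the remaining steps take well under a millisecond each, so the row read itself is the step to investigate.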

