caching - In-Memory Data Grid for Java Project


I'm looking to use an in-memory data grid in a Java project. I know there are a few relevant products, such as VMware GemFire, GigaSpaces XAP, IBM eXtreme Scale, and others. Can anyone elaborate on their experience with these tools and how they compare to one another?

Thanks, Alex

(Disclaimer: I work for GigaSpaces.)

Hi Alex,

There are many criteria to compare by, and it depends on what you're trying to do. In-memory data grids have a lot of use cases, e.g. caching, OLTP, high-throughput event processing, etc. In general, the main criteria you should be looking at are:

  • Programming model: support for popular Java frameworks such as Spring (XAP and GemFire support it natively).
  • Querying and indexing: this matters if you want more than trivial key/value data access. People often need SQL semantics or full-text search, and if the data grid can provide that out of the box, it's a big advantage.
  • Ability to execute code on the grid nodes, colocate code with them, and handle events injected into the grid (e.g. objects being written or updated). This is a massive scalability benefit and allows you to implement efficient shared-nothing architectures.
  • Languages and API support: every data grid should support at least Java and JVM-based languages (e.g. Scala), but a lot of them also support other languages and let you access the same data from various programming languages. For example, XAP natively supports Java, .NET and C++, and other languages through REST and memcached interfaces. As far as APIs go, the grid should support more than one API. At GigaSpaces we support Map, Spring/POJO, JPA, JDBC and others.
  • Transactions: this is a big one if you want to go anywhere beyond caching. When using memory as the system of record, you should be able to roll back state in case you have an error or a bug, otherwise you end up with corrupt data. Another important thing is the types of transactions supported. A lot of data grids support only "local" transactions, i.e. within the boundaries of a single node / partition / shard (which is what you want in most cases for performance reasons). More advanced grids also support distributed transactions and know how to seamlessly upgrade from local to distributed when needed.
  • Replication: there are various models here (synchronous, asynchronous, hybrid) and you need to decide which one of them is best for your use case. Some grids have explicit support for cross-cluster replication over WAN, which is important if you're implementing DR.
  • Data partitioning and scalability: how the grid partitions the data (fixed / consistent hashing), the level of control the user has over it, and support for dynamic addition of servers to the grid to increase capacity.
  • Administration and monitoring: last but not least, what kind of facilities are provided out of the box, such as monitoring and administration hooks (JMX or an administrative API), user interfaces, and integration with other third-party systems.
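
To make the "fixed / consistent hashing" distinction in the partitioning point concrete, here is a minimal sketch in plain Java. It is not how GemFire, XAP or any other product actually partitions data; the class name `PartitioningSketch`, the `key-N` / `node-N#P` naming, the MD5-based hash and the 100 ring points per node are all assumptions made for illustration. It counts how many keys change owner when a fifth node joins a four-node cluster under each scheme.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the partitioning code of any real grid.
final class PartitioningSketch {

    /** Fixed partitioning: owner = hash(key) mod nodeCount. */
    static int fixedNode(String key, int nodeCount) {
        return (int) Long.remainderUnsigned(hash(key), nodeCount);
    }

    /** Consistent hashing: every node owns several points on a hash ring. */
    static TreeMap<Long, Integer> buildRing(int nodeCount, int pointsPerNode) {
        TreeMap<Long, Integer> ring = new TreeMap<>();
        for (int node = 0; node < nodeCount; node++) {
            for (int p = 0; p < pointsPerNode; p++) {
                ring.put(hash("node-" + node + "#" + p), node);
            }
        }
        return ring;
    }

    /** A key belongs to the first ring point at or after its hash, wrapping around. */
    static int ringNode(TreeMap<Long, Integer> ring, String key) {
        Map.Entry<Long, Integer> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    /** Keys whose owner changes when the cluster grows, fixed scheme. */
    static int movedFixed(int keys, int before, int after) {
        int moved = 0;
        for (int i = 0; i < keys; i++) {
            String k = "key-" + i;
            if (fixedNode(k, before) != fixedNode(k, after)) moved++;
        }
        return moved;
    }

    /** Keys whose owner changes when the cluster grows, consistent hashing. */
    static int movedRing(int keys, int before, int after, int pointsPerNode) {
        TreeMap<Long, Integer> oldRing = buildRing(before, pointsPerNode);
        TreeMap<Long, Integer> newRing = buildRing(after, pointsPerNode);
        int moved = 0;
        for (int i = 0; i < keys; i++) {
            String k = "key-" + i;
            if (ringNode(oldRing, k) != ringNode(newRing, k)) moved++;
        }
        return moved;
    }

    /** First 8 bytes of an MD5 digest packed into a long (assumed hash function). */
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        int keys = 10_000;
        System.out.println("fixed (mod N) moved:      " + movedFixed(keys, 4, 5));
        System.out.println("consistent hashing moved: " + movedRing(keys, 4, 5, 100));
    }
}
```

With modulo partitioning most keys tend to change owner when the node count changes, whereas on the ring only the keys that land on the new node's points move. That difference is what makes dynamic addition of servers cheaper, and it is worth asking each vendor which scheme they use and how much of it you can control.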

The following links are a good place to start:

  1. http://gojko.net/2009/06/01/oracle-coherence-vs-gigaspaces-xap/ (read the comments as well)
  2. http://www.neovise.com/neovise-data-caching-performance-technical-white-paper - a recent comparison between GigaSpaces and GemFire, which I think speaks for itself :)

HTH, Uri

