streaming - Apache Storm compared to Hadoop


How does Storm compare to Hadoop? Hadoop seems to be the de facto standard for open-source, large-scale batch processing. Does Storm have any advantages over Hadoop, or are they completely different?

Here is my opinion.

Twitter Storm has been touted as real-time Hadoop. That is more of a marketing take for easy consumption.

They are superficially similar, since both are distributed application solutions. Apart from the typical distributed architectural elements, such as master/slave nodes and ZooKeeper-based coordination, to me the comparison falls off a cliff.

Twitter Storm is more like a pipeline for processing data as it comes in. The pipe connects various computing nodes that receive data, compute, and deliver output (in Storm lingo, these are spouts and bolts). Extend this analogy to complex pipeline wiring that can be re-engineered when required, and you get Twitter Storm.
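To make the pipe analogy concrete, here is a minimal sketch in plain Python. It does not use Storm's actual API: the names `sentence_spout`, `split_bolt`, `count_bolt`, and `run_pipeline` are hypothetical, and generators stand in for the distributed wiring between nodes. The point is only that each record flows through every stage as soon as it arrives.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def sentence_spout(lines: Iterable[str]) -> Iterator[str]:
    """Spout stand-in (hypothetical): emits each record as soon as it is available."""
    for line in lines:
        yield line


def split_bolt(sentences: Iterator[str]) -> Iterator[str]:
    """Bolt stand-in: receives a sentence and emits its words one by one."""
    for sentence in sentences:
        for word in sentence.lower().split():
            yield word


def count_bolt(words: Iterator[str]) -> Iterator[Tuple[str, int]]:
    """Bolt stand-in: keeps running counts and emits an update per incoming word."""
    counts: Counter = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


def run_pipeline(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Wire spout -> bolt -> bolt and collect every update that comes out."""
    return list(count_bolt(split_bolt(sentence_spout(lines))))


if __name__ == "__main__":
    for update in run_pipeline(["storm storm hadoop"]):
        print(update)
```

Because the stages are chained lazily, an updated count comes out after every single word rather than after the whole input has been read. Re-engineering the pipeline amounts to changing which stages are chained together.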

In a nutshell, it processes data as it comes in, so latency is minimal.

Hadoop, however, is different in this respect because of HDFS. It is a solution geared towards distributed storage and tolerance of outages at many scales (disks, machines, racks, etc.).

M/R is built to leverage data localization on HDFS to distribute computational jobs. Together, they do not provide a facility for real-time data processing. But that is not always a requirement when you are looking through large data (the needle-in-a-haystack analogy).
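The batch nature of M/R can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions, not Hadoop's API: the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `batch_word_count` are hypothetical, everything runs in one process, and HDFS data locality is not modelled at all.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(records: Iterable[str]) -> List[Tuple[str, int]]:
    """Map: turn every stored record into (word, 1) pairs."""
    return [(word, 1) for record in records for word in record.lower().split()]


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all values by key; this needs the complete map output."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def batch_word_count(records: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in order over the whole data set."""
    return reduce_phase(shuffle_phase(map_phase(records)))


if __name__ == "__main__":
    print(batch_word_count(["storm storm hadoop", "hadoop"]))
```

No result exists until every record has passed through all three phases, which is fine for searching a large stored data set but is not real-time processing.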

In short, Twitter Storm is a distributed real-time data processing solution. I don't think we should compare them. Twitter built it because it needed a facility to process tweets that are individually small but arrive in humongous numbers, and to do so in real time.

See HStreaming if you are compelled to compare it with something.

