separate query from distributed computation
Build for high concurrency
Network io most precious resource
Prefer predictable degradation to complex recovery scenarios
Build every component to be distributed, do not reuse old RDBMS architecture
Netty - java network library
ProtoBuff, Thrift faster than RPC in Hbase
Shard - have to ask every server. Every query hits every machine.
Global indexes faster for distributed systems. Tells you which server has data you need based on primary index telling ranges on each server.