Distributed query processing ensures high scalability and low latency when working with Big Data.
With this feature, a database is partitioned ("sharded"), and each partition, or shard, is managed by its own instance of the DBMS server. Shards are typically either placed on a shared storage array (which may be a SAN), with each server instance keeping a CPU core busy, or spread across different physical servers with their own storage systems.
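The routing idea behind sharding can be sketched in a few lines: a deterministic function maps each record's key to the shard (DBMS server instance) that owns it. The sketch below is a generic illustration, not eXtremeDB's actual API; the function name and hashing scheme are assumptions for demonstration.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index deterministically.

    A cryptographic digest is used (rather than Python's built-in hash(),
    which is salted per process) so the mapping is stable across runs
    and across machines -- every node routes a given key the same way.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Example: route a ticker symbol to one of 72 shards (the shard count
# used in the STAC-M3 configuration described later in this article).
owning_shard = shard_for("IBM", 72)
```

Real systems often refine this with consistent hashing or range partitioning so that adding a shard does not remap most keys, but the principle is the same: the key alone determines the owner, so no central lookup is needed on the hot path.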
Each shard can have one or more backup (replica) copies, which deliver high availability via failover (referred to as k-safety in technical jargon) and can also share the query processing load. Distributed query processing across multiple servers, CPUs and/or CPU cores accelerates performance – dramatically, in some cases – through parallel execution of database operations, harnessing the combined processing power, memory and I/O bandwidth of many nodes rather than just one.
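The parallel execution described above typically follows a scatter-gather pattern: a coordinator sends the same query to every shard concurrently, each shard scans only its own partition, and the coordinator merges the partial results. A minimal sketch, with in-memory lists standing in for per-shard DBMS instances (the data and function names are invented for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-ins for three shards, each holding a slice of the quote data.
shards = [
    [{"sym": "IBM", "px": 182.5}, {"sym": "AAPL", "px": 191.2}],
    [{"sym": "IBM", "px": 182.7}],
    [{"sym": "MSFT", "px": 414.0}, {"sym": "IBM", "px": 182.6}],
]

def query_shard(shard, sym):
    """The per-shard work: scan only this shard's slice of the data."""
    return [r["px"] for r in shard if r["sym"] == sym]

def distributed_max_price(sym):
    """Scatter the query to all shards in parallel, then gather and merge."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda s: query_shard(s, sym), shards)
    prices = [p for part in partials for p in part]
    return max(prices) if prices else None

print(distributed_max_price("IBM"))  # prints 182.7
```

Because each shard touches only its own fraction of the data, the scan cost per node shrinks roughly in proportion to the shard count, which is the source of the speedup the article describes.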
The developer specifies the storage (in-memory or persistent) for each record type, which is ideal for handling real-time quote data and historical data within a single database architecture.
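The hybrid storage choice can be pictured as a per-record-type routing decision made once, at schema design time. The sketch below is a generic illustration of the concept only; the dictionary and function names are invented, and eXtremeDB actually declares storage in its own schema definition language, not shown here.

```python
# Hypothetical per-record-type storage declaration (names are assumptions):
# hot real-time quotes stay in memory; historical trades go to disk.
STORAGE_CLASS = {
    "quote": "in-memory",          # lowest latency, volatile
    "trade_history": "persistent", # survives restart, larger capacity
}

memory_store, disk_store = [], []

def store(record_type, record):
    """Route a record to the store declared for its type."""
    if STORAGE_CLASS[record_type] == "in-memory":
        memory_store.append((record_type, record))
    else:
        disk_store.append((record_type, record))

store("quote", {"sym": "IBM", "px": 182.5})
store("trade_history", {"sym": "IBM", "px": 181.9, "date": "2014-01-02"})
```

The point of the single-architecture design is that both record types live in one database and one query language, so an application can join live quotes against history without moving data between separate systems.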
Distributed query processing played a key role in eXtremeDB Financial Edition's record-setting STAC-M3 benchmark implementations. For example, McObject's DBMS on an IBM POWER8 S824L Linux server achieved the best-ever overall processing time and the lowest standard deviation of results (low jitter), with processing spread across 72 shards.