PDW hardware basics
With PDW, Microsoft has delivered a great appliance for MPP architectures. To make all of this work, a lot of different hardware and software components are involved. This article is the foundation for upcoming articles and covers the fundamentals everybody should know about PDW and the MPP architecture. So let's start.

Difference between SMP and MPP

The SMP architecture (symmetric multiprocessing) provides fast performance by making several CPUs, RAM and storage available to multiple processes. This is the default architecture for SQL Server, and it has some limitations with regard to performance. As you can imagine, at some point an SMP server can't execute the statements anymore because one of the components mentioned above reaches its limit. Another bottleneck is the system bus: everything that is executed goes through it.

So what is the alternative? MPP! The MPP architecture (massively parallel processing) is a different approach to the well-known SMP architecture. Instead of having one server do all the work, you have multiple servers doing the work. A special application or service splits the process into several parts and makes sure that the machines can communicate and that the results from the different machines are put back together.

Now that we know the difference between those two types of architecture, we can look deeper into Microsoft's approach, called PDW.

PDW hardware components

Whenever Microsoft sells a PDW, there are always several hardware components included. Besides the four host machines and the JBOD, the so-called Base Unit includes Ethernet and InfiniBand switches so that the appliance can communicate with its components as well as with clients. The picture below shows the so-called "Base Unit". So what is included:
- HST01 holds the necessary services for the Appliance to run
- HST02 is the failover server whenever something goes offline
- HSA01 and HSA02 are workload machines
- JBOD is attached to the HSA machines and provides the storage
- Control Node (CTL) is the main entry point for all client-related requests. Whenever a client sends a query to the Control Node, the MPP Engine takes the query and builds a parallel plan. This plan is executed on every compute node involved. After each node has sent its results back, the Control Node puts the results together and sends them to the client.
- Management Node (MAD) provides all the basic functionality to manage the appliance. This covers everything related to patching and maintaining the appliance, as well as a Management Studio to query the individual compute nodes involved.
- Active Directory (AD) is used to take care of the physical hardware. It is also necessary for the cluster functionality.
- Virtual Machine Manager (VMM) is in charge of the virtual machines. VMM checks that everything runs properly and starts failover images on different nodes if necessary.
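
To make the Control Node's role a bit more tangible, here is a small Python sketch of the "split the work, run it everywhere, put the results back together" idea described above. It is only a toy model and not how the MPP Engine is actually implemented: the node names, the sample rows and the functions `run_on_node` and `control_node_query` are all made up for illustration, and threads stand in for the separate compute nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each list stands for the rows stored on one compute node.
NODE_ROWS = {
    "compute_node_1": [120, 80, 45],
    "compute_node_2": [200, 15],
    "compute_node_3": [60, 60, 60, 20],
}


def run_on_node(node_name):
    """Run one plan step on a single node: a local SUM and COUNT."""
    rows = NODE_ROWS[node_name]
    return sum(rows), len(rows)


def control_node_query():
    """Send the same step to every node in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(NODE_ROWS)) as pool:
        partials = list(pool.map(run_on_node, NODE_ROWS))

    # Merge step: the partial sums and counts are combined into one answer.
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return {"sum": total, "count": count, "avg": total / count}


if __name__ == "__main__":
    print(control_node_query())
```

Note that the average is calculated from the merged sums and counts and not by averaging the per-node averages. This is the kind of detail a parallel plan has to take care of when the data is spread over several machines.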