1. A distributed coherency processor comprising:
a plurality of filter pipes for tracking memory requests from a plurality of caches, a requesting filter pipe in the plurality of filter pipes storing a memory-request entry having a request address for a requested cache line;
a central coherency controller that receives memory requests from the plurality of filter pipes, the central coherency controller generating ordering messages and invalidate messages in response to the memory requests;
a snoop tag directory storing snoop entries that indicate sharing caches having a copy of the requested cache line at the request address;
the central coherency controller further for searching the snoop tag directory using the request address from a memory request from a requesting cache, and for sending invalidate messages to sharing filter pipes for the sharing caches identified by the snoop tag directory;
an ordering message sent from the central coherency controller to the requesting filter pipe, the ordering message indicating an order for processing memory requests, the order determined by the central coherency controller;
an invalidate count in the ordering message, the invalidate count indicating a number of the sharing caches in the plurality of caches that are identified by the snoop tag directory; and
a plurality of invalidate acknowledgement messages, generated by the sharing caches in response to the invalidate messages from the central coherency controller, each of the invalidate acknowledgement messages verifying invalidation of the copy of the requested cache line by a respective one of the sharing caches;
wherein the requesting filter pipe receives the plurality of invalidate acknowledgement messages and releases data in the requested cache line for processing after a number of the plurality of invalidate acknowledgement messages received by the requesting filter pipe matches the invalidate count from the ordering message,
whereby coherency order is determined by the central coherency controller and coherency tracking is performed by the requesting filter pipe in the plurality of filter pipes.
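
For illustration only, the message flow recited above can be sketched as a small Python model. This is a minimal sketch under stated assumptions, not the claimed hardware: the class names (OrderingMessage, FilterPipe, CentralCoherencyController), the dictionary used as the snoop tag directory, the exclusion of the requesting cache from the sharer set, the directory update after a request, and the tolerance for acknowledgements arriving before the ordering message are all hypothetical choices that the claim does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class OrderingMessage:
    address: int           # request address of the requested cache line
    sequence: int          # order assigned by the central controller
    invalidate_count: int  # number of sharing caches being invalidated


@dataclass
class FilterPipe:
    """Tracks outstanding requests for one cache (hypothetical model)."""
    cache_id: int
    pending: dict = field(default_factory=dict)   # address -> tracking entry
    released: list = field(default_factory=list)  # addresses released so far

    def request(self, address):
        # Store a memory-request entry; the expected count is unknown until
        # the ordering message arrives.
        self.pending[address] = {"expected": None, "acks": 0}

    def on_ordering(self, msg):
        self.pending[msg.address]["expected"] = msg.invalidate_count
        self._maybe_release(msg.address)

    def on_invalidate_ack(self, address):
        self.pending[address]["acks"] += 1
        self._maybe_release(address)

    def _maybe_release(self, address):
        entry = self.pending[address]
        # Release only when the acknowledgements received match the
        # invalidate count carried by the ordering message.
        if entry["expected"] is not None and entry["acks"] == entry["expected"]:
            del self.pending[address]
            self.released.append(address)


class CentralCoherencyController:
    """Determines order and looks up sharers (hypothetical model)."""

    def __init__(self, snoop_tag_directory):
        # Assumption: the directory maps an address to a set of cache ids.
        self.directory = snoop_tag_directory
        self.sequence = 0

    def handle_request(self, requester_id, address):
        # Assumption: the requester itself is not sent an invalidate.
        sharers = sorted(self.directory.get(address, set()) - {requester_id})
        # Assumption: the requester becomes the only recorded holder.
        self.directory[address] = {requester_id}
        self.sequence += 1
        ordering = OrderingMessage(address, self.sequence, len(sharers))
        # The caller sends `ordering` to the requesting filter pipe and an
        # invalidate message to the filter pipe of each cache in `sharers`.
        return ordering, sharers
```

In this sketch, a line shared by two other caches yields an ordering message with an invalidate count of 2, and the requesting filter pipe releases the line only after the second acknowledgement is received. A line with no sharers is released as soon as the ordering message arrives.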