01 Interconnect Fabric
Why do we need it?
The interconnect built in 00 Basic 1 to 1 Interconnect only accommodates one requestor and one completer.
How can we connect multiple completers to multiple Requestors?
The naive method would be to give each requestor a dedicated interface to every completer, but that would incur a lot of logic and chip area. As you increase the number of requestors, things get problematic: you would need a complete bipartite graph topology between requestors and completers, which does not scale well.
Thus we need something to multiplex requestor signals to the addressed completer. To do this, we introduce an interconnect fabric.
Proposed Design
This is an elegant solution that simply lets a requestor address the completers using an ADDR
signal to the interconnect fabric. This is called the address space. All interconnects contain an address decoder which takes care of address resolution; this is just a bunch of comparators.
This arrangement of address decoders is bulky and introduces a long critical path, reducing the interconnect's maximum clock frequency. We can adopt a better address mapping to simplify the address decoder logic.
Let's propose placing each completer's address space at a power-of-2-aligned base address, with a power-of-2 size.
This way we can simplify our logic: instead of using comparators for both the upper bound and lower bound of the address space, we can just compare the upper address bits against the region's base.
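The two decoding schemes can be sketched in software. This is an illustrative model, not tied to any real bus protocol; the memory map below is hypothetical.

```python
# Naive decoder: compare ADDR against BOTH bounds of each completer's
# address region (two comparators per completer in hardware).
REGIONS = [  # hypothetical memory map: (name, lower bound, upper bound)
    ("ROM",  0x0000_0000, 0x0000_FFFF),
    ("RAM",  0x2000_0000, 0x2000_FFFF),
    ("UART", 0x4000_0000, 0x4000_00FF),
]

def decode_naive(addr):
    for name, lo, hi in REGIONS:
        if lo <= addr <= hi:           # upper AND lower bound comparison
            return name
    return None                         # unmapped address

# Power-of-2-aligned decoder: each region is a 2**n-byte block whose base
# is aligned to its size, so selecting a completer reduces to a single
# equality check on the upper address bits.
ALIGNED = [  # (name, base address, log2(region size in bytes))
    ("ROM",  0x0000_0000, 16),
    ("RAM",  0x2000_0000, 16),
    ("UART", 0x4000_0000, 8),
]

def decode_aligned(addr):
    for name, base, bits in ALIGNED:
        if (addr >> bits) == (base >> bits):  # compare upper bits only
            return name
    return None

print(decode_naive(0x2000_1234))    # RAM
print(decode_aligned(0x4000_0042))  # UART
```

In hardware, the aligned version replaces two magnitude comparators per region with one equality comparator on a few upper bits, which shortens the critical path.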
Arbitration
When dealing with multiple requestors, it is not uncommon for two requestors to issue a request to the same completer at the same time. Therefore there must be an intermediary entity, an arbiter, that decides which requestor is granted access.
Shared Bus Architecture
Shared bus architectures are viable when all completers and requestors share the same interconnect protocol interfaces.
All requestor and completer interconnect interfaces have to be guarded by a tri-state gate because only a single requestor–completer pair can be connected to the bus at any given time.
Instead, we need an additional out-of-protocol signal between the requestors and the arbiter for this purpose (thus, shared buses use requestor-side arbitration).
The arbiter receives all requestor requests and sequentially grants bus access by enabling the specific requestor and completer’s tri-state gates.
- Easy to implement
- Can operate at high clock speeds
- Economical in hardware resources
Does not support any notion of concurrency, i.e., requestors cannot communicate with different completers in parallel.
Shared-bus architectures can therefore cause requestors in a system to spend a significant amount of time waiting for access to the bus, and waiting times get worse the more requestors exist in the system. This is unfavorable for SoCs, which often contain many independent accelerator units that act as requestors, so shared-bus architectures are rarely used in them.
Full Crossbar-Switch Architecture
Full Crossbar-Switch Architectures use completer-side arbitration, which allows for maximum concurrency in the system.
(Figure: on the left, the completers' switches are closed as there is no requestor collision; on the right, a completer's switch (2-SW) is open as there is a requestor collision.)
"full crossbar switches do not require any extra out-of-protocol signals for arbitration purposes because each switch can see when multiple transactions are simultaneously directed at the same completer and can automatically provide backpressure on all requestors except the one that was granted access to the completer. Therefore, the interconnect protocol itself acts as the arbitration mechanism"
The crossbar logic itself defines the arbitration policy: this can be round-robin among requestors or fixed-priority access; it is basically a scheduler.
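A round-robin policy can be modeled with a few lines of software. This is a hypothetical sketch of the scheduling behavior a crossbar's arbitration logic might implement, not a description of any particular interconnect.

```python
class RoundRobinArbiter:
    """Software model of a round-robin arbiter for one completer port.
    The requestor granted last has the lowest priority next cycle."""

    def __init__(self, n_requestors):
        self.n = n_requestors
        self.last = self.n - 1          # index of the last granted requestor

    def grant(self, requests):
        """requests: list of bools, one per requestor.
        Returns the index of the granted requestor, or None if idle."""
        for offset in range(1, self.n + 1):
            idx = (self.last + offset) % self.n  # search starts AFTER last winner
            if requests[idx]:
                self.last = idx
                return idx
        return None

# Two requestors contending for the same completer alternate fairly:
arb = RoundRobinArbiter(3)
print(arb.grant([True, True, False]))  # 0
print(arb.grant([True, True, False]))  # 1
print(arb.grant([True, True, False]))  # 0
```

In a real crossbar, every requestor except the granted one would see backpressure (e.g. a deasserted ready signal) until its turn comes.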
Full crossbar switches require a lot of die area: the area of the crossbar grows quadratically with the number of requestors and completers in the system (as it is basically a square matrix of switches).
Partial Crossbar-Switch Architectures
In real-world SoCs, some requestors may only ever need connectivity to a subset of the system’s available completers. For example, while a CPU may need to be connected to most peripherals, a DMA unit may only need interconnection with memories.
Each completer switch is sized to only accommodate the requestors that communicate with it.
- Offers all the advantages of Full Crossbar-Switch Architecture
- Less die area than that of the Full Crossbar-Switch
Design for Flexibility
Just like in software, your interconnect fabric can act as an adapter, following the adapter design pattern. For example, if a completer does not support burst transactions, the interconnect fabric should be tasked with converting burst transactions into single-word transactions the completer can understand, while stalling the requestor through its interface if needed.
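The burst-splitting adapter can be sketched as follows. All names and interfaces here are hypothetical; the model assumes an incrementing burst of 4-byte words.

```python
WORD_BYTES = 4  # assumed bus word size

def burst_to_singles(base_addr, burst_len):
    """Expand one incrementing burst into the per-word addresses
    of the single-word transactions the completer will see."""
    return [base_addr + i * WORD_BYTES for i in range(burst_len)]

def forward_burst(completer_read, base_addr, burst_len):
    """Adapter: issue single-word reads on behalf of a bursting requestor.
    In hardware, the requestor would be stalled (backpressured) while the
    adapter walks through the beats one by one."""
    return [completer_read(addr) for addr in burst_to_singles(base_addr, burst_len)]

# Toy completer: a word-addressable memory that only supports single reads.
mem = {0x1000 + 4 * i: i * 10 for i in range(8)}
print(forward_burst(mem.__getitem__, 0x1000, 4))  # [0, 10, 20, 30]
```

The same idea generalizes to other protocol mismatches (data-width conversion, clock-domain crossing), with the fabric absorbing the difference so neither side needs to change.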
For real-world examples of interconnects, see 02 AHB-Lite.