
I am looking for more info on AI hardware architectures, but I am a bit confused. Here are my questions:

  • Does it all come down to MAC (Multiply-and-Accumulate) units?
  • Do MACs usually integrate into the ALU like this: MAC

or is the ALU in this case itself a MAC, as shown here:

ALU

In case anyone is interested, I got this from an MIT ISCA 2017 presentation (page 124 and onward).

Thanks!

winny

2 Answers


You seem to be mixing different presentations and different domains.

"MAC" in this context means Multiply-Accumulate, also known as Fused Multiply and Add (FMA). It's indeed the core operation for highly-connected neural networks including fully-connected networks and CNN's. The other main operation is the non-linear operation done on the result of all those MAC's. But since that's often a ReLu these days, that operation is practically free. (essentially one gate delay)

ALU (Arithmetic Logic Unit) is an outdated CPU concept. Modern CPU architectures don't have them anymore; the logical equivalent would be the FP Execution Units. They're definitely not connected to memory as in figure 1; they operate on registers. Memory wouldn't be able to keep up, eliminating the benefit of a dedicated MAC operation.

The second picture is a high-level overview of a new non-CPU architecture. That makes sense: you were discussing AI hardware, not general-purpose hardware such as CPUs.

MSalters
  • I was indeed confused because the slide above mentioned ALU and I thought that it is integrated. Thank you for solving my confusion! – Aleksandar Kostovic Aug 29 '18 at 13:20
  • Why don't we have ALUs in modern CPU architectures anymore? – Kindred Dec 09 '18 at 00:47
  • We do. Especially in specialized processors like the stream processors in GPGPUs, there are specific units for integer operations. – Marcus Müller Dec 09 '18 at 00:51
  • @ptr_user7813604: The ALU concept dates back to the 1970s. It closely ties a single register to the transistors that perform the math operations. In fact, that register is often named the "A" register, for instance in Intel designs. We now have a 64-bit RAX register whose lineage dates back to the A register in the 4004. However, the RAX register is no longer special; every x64 chip can perform math equally well on RBX, or R15 for that matter. – MSalters Dec 10 '18 at 09:14
  • @MSalters: Thank you sir, thanks for your kindness. I will do some searching to understand your words. – Kindred Dec 10 '18 at 15:24

Symbolic AI needs lots of pointer-chasing.

Deep-learning neural systems need lots of multiply-accumulate math, plus pointer chasing to access the various synapse strengths and their connections.

analogsystemsrf
  • Actually, deep nets don't need pointer chasing. If you implemented it like that, you did it wrong. A good programmer will organize all connections in the order that they're needed for the calculations. That means the only pointer operation needed is a simple increment by a fixed size, typically 4 bytes. This sequential access is heavily optimized in hardware, to the point that many CPUs will auto-detect the pattern and prefetch the connection weights before they are needed. – MSalters Dec 10 '18 at 16:03
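The layout described in that last comment can be sketched as follows. This is an illustrative example of mine, not from the presentation: the weights sit in one contiguous array, in exactly the order the MACs consume them, so the only "pointer chasing" is a fixed-stride advance of `sizeof(float)` (4 bytes) that a hardware prefetcher detects trivially.

```c
/* Hypothetical sketch: weights stored contiguously, in consumption order.
   The weight pointer only ever advances by a fixed 4-byte stride, so
   sequential prefetch keeps the MAC units fed. */
static double dot_sequential(const float *w, const float *x, int n)
{
    double acc = 0.0;
    for (int i = 0; i < n; ++i)
        acc += (double)w[i] * x[i];  /* w effectively advances by sizeof(float) == 4 */
    return acc;
}
```

Compare this with a linked-list-of-synapses representation, where every weight access would be a dependent load that defeats the prefetcher.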