Yeah, I couldn't think of the right word to describe it, as all I kept picturing was the big black monolith from 2001.
It wasn't so much the scale side of the equation I was thinking of; as you say, that's well understood in graphics and compute cards. It's more the Zen-style approach of taking a single core that handles the x86-64 instruction set and scaling that design up to many more cores, all working as efficiently as possible.
If I understand correctly, with GPUs and compute-only workloads you're dealing with fairly well-defined timings, i.e. a single MAC unit does a fixed number of multiply-accumulate operations per clock, whereas a core designed to handle x86-64 instructions can be much more 'random' depending on what it's running.
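To illustrate the fixed-throughput point, here's a rough back-of-the-envelope sketch (the unit count and clock speed are made-up numbers for illustration, not any real part): because each MAC unit retires a fixed number of operations every cycle, peak compute throughput is just a multiplication, which is exactly why it's so predictable compared to a general-purpose x86 core.

```python
# Hypothetical GPU figures, chosen purely for illustration:
mac_units = 2560      # assumed number of MAC/FMA units on the chip
clock_hz = 1.5e9      # assumed 1.5 GHz clock
ops_per_mac = 2       # one multiply + one add per MAC per clock

# Peak throughput is deterministic: units x clock x ops-per-clock,
# independent of what the workload actually is.
peak_flops = mac_units * clock_hz * ops_per_mac
print(f"{peak_flops / 1e12:.2f} TFLOPS")  # prints "7.68 TFLOPS"
```

An x86 core has no equivalent one-line formula: its throughput depends on branch prediction, cache hits, instruction mix, and so on, which is the 'random' behaviour described above.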
Basically, I think they hired Keller to do something similar to what he did with Zen: design an architecture that can be used in everything from low-power single-core parts all the way up to monster multi-core server chips. Intel's current philosophy of stitching together increasing numbers of ring buses will only take them so far before it starts to cause problems, so they need another way of doing things.