This contradicts the third example of this infographic (I'm guessing that each one frames its subject the same way):
That would be too easy.

Sharpening is forbidden: it introduces noise, and because the noise pattern is different on each sharpened picture, it creates "snow" around the reconstruction and makes the reconstruction look like it's made of sand.
As a matter of fact, ANY post-processing is forbidden.
And yes, your camera does a bit of post-processing, but I can't do anything about that besides not adding another layer of post-processing on top of it.
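To illustrate why sharpening hurts, here is a minimal sketch in plain Python. It is not what any camera or photogrammetry package actually does: the 3-tap box blur, the `amount` parameter and the function names `noisy_flat_row` and `unsharp_mask` are assumptions I picked for the example. It sharpens one row of a flat gray patch that carries independent random noise and compares the noise level before and after.

```python
import random
import statistics

def noisy_flat_row(n, level=0.5, sigma=0.02, seed=0):
    """One row of a flat gray patch with independent, sensor-like noise."""
    rng = random.Random(seed)
    return [level + rng.gauss(0.0, sigma) for _ in range(n)]

def unsharp_mask(row, amount=1.0):
    """Sharpen with a 3-tap box blur: out = x + amount * (x - blur(x))."""
    n = len(row)
    out = []
    for i, x in enumerate(row):
        left = row[max(i - 1, 0)]        # clamp at the borders
        right = row[min(i + 1, n - 1)]
        blur = (left + x + right) / 3.0
        out.append(x + amount * (x - blur))
    return out

shot = noisy_flat_row(10000, seed=1)
sharpened = unsharp_mask(shot)

# Ratio of noise standard deviation after vs. before sharpening.
ratio = statistics.pstdev(sharpened) / statistics.pstdev(shot)
print(f"noise std grew by a factor of about {ratio:.2f}")
```

With these assumed settings the theoretical factor for independent noise is the square root of 3, roughly 1.73, and every picture gets its own amplified noise pattern (change `seed` to simulate another shot), which is exactly what the matcher cannot reconcile between views.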
1280x1280 is not the resolution of the pictures but the area used in the pyramidal matching process:
Look at page 6.
In photogrammetry, the sharper the pictures and the higher their resolution, the better; reducing your pictures to 1280px when your camera gives you 4000+px pictures would be madness.
Again, I won't be able to do perfect shots like those studios that use 64 DSLRs at $1500 each.
I'm just trying to get the best possible shots out of my camera.
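For anyone unfamiliar with the idea, a coarse-to-fine pyramid can be sketched roughly like this. This is a generic illustration and not the exact process from the document referenced above: the halving factor, the `min_long_side` cutoff and the function name `pyramid_sizes` are all assumptions made for the example.

```python
def pyramid_sizes(width, height, min_long_side=500):
    """List the image sizes of a pyramid that halves each level.

    Level 0 is the full-resolution picture; halving stops before the
    longer side would drop below min_long_side (an assumed cutoff).
    """
    sizes = [(width, height)]
    while max(width, height) // 2 >= min_long_side:
        width, height = width // 2, height // 2
        sizes.append((width, height))
    return sizes

# Example: a hypothetical 4000x3000 picture.
for level, (w, h) in enumerate(pyramid_sizes(4000, 3000)):
    print(f"level {level}: {w}x{h}")
```

In a scheme like this the small levels are only used to find rough matches quickly; the full-resolution level is still there for the final refinement, so throwing pixels away before you start gains nothing.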
Sadly, no.
If you just want a neutral head, you could actually make a pretty good scan with only one DSLR.
But since I want to be able to capture emotions, the multi-cam setup is the only way to go.
It would be easier if I had a whole empty room.
I could put the cameras 3 meters away and use various lights under white umbrellas.
Here, since the wooden supports are very close, if I put the lights behind them they will cast shadows.