The problem with algorithms faster than conventional matrix multiplication is that they are defined over exact mathematical numbers.
E.g. Strassen's algorithm assumes something like:
(A + B)C - BC = AC
With the conventional floating-point representation this becomes a problem if B >> A, because adding A to B rounds away most of A's significant digits.
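To make the cancellation concrete, here is a minimal numpy sketch (my own illustration, not from the original argument; the scale 1e12 for B is an arbitrary choice). Both lines below are mathematically equal to AC, but the second is computed the way a Strassen-style recursion would compute it:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))          # entries on the order of 1
B = 1e12 * rng.standard_normal((n, n))   # entries on the order of 1e12, so B >> A
C = rng.standard_normal((n, n))

direct = A @ C
cancelled = (A + B) @ C - B @ C  # adding A to B discards ~12 of A's ~16 digits

print(np.linalg.norm(cancelled - direct) / np.linalg.norm(direct))
# relative error around 1e-4 instead of near machine precision (~1e-16)
```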
Most algorithms faster than Strassen assume:
((fA + B)C - BC)/f = AC
and
(fA + B)C = BC + fAC
f is then chosen in such a way that fAC is negligible. But if that is the case, fA is much smaller than B and we automatically get numerical problems.
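A short sketch of that tradeoff (again my own illustration, with random test matrices): ((fA + B)C - BC)/f equals AC exactly in real arithmetic, but in float64 the recovered product loses roughly eps/f of its accuracy as f shrinks:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A, B, C = (rng.standard_normal((n, n)) for _ in range(3))
exact = A @ C

for f in (1e-2, 1e-6, 1e-10, 1e-14):
    recovered = ((f * A + B) @ C - B @ C) / f
    err = np.linalg.norm(recovered - exact) / np.linalg.norm(exact)
    print(f"f = {f:.0e}   relative error ~ {err:.1e}")
# the error grows from ~1e-14 to ~1e-2 as f shrinks: the smaller fA is
# next to B, the fewer of its digits survive the addition
```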
There is a procedure to remove the fAC terms (which would allow f to be chosen close to 1), but it only amortizes over enough recursion steps. For Coppersmith-Winograd (and derivatives like Le Gall's algorithm), each recursion step needs to divide the matrix into enough submatrices that the indices can be treated heuristically (let's say each index is split into 100 subindices per step). This implies the sides of the matrix must be at least of size 100^n_recursions. We will never need to multiply matrices of that size in practice.
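As a back-of-the-envelope check (using the illustrative figure of 100 subindices from above; the byte counts assume dense float64 storage):

```python
# Side length must be at least 100**depth before 'depth' recursion steps
# are even possible; the storage for a single matrix explodes accordingly.
for depth in range(1, 5):
    side = 100 ** depth
    gib = side ** 2 * 8 / 2 ** 30  # one dense float64 matrix of that side
    print(f"depth {depth}: side >= {side:,}  (~{gib:.1e} GiB per matrix)")
# already at depth 3 (side 1,000,000) a single matrix needs ~7.5e+03 GiB
```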