
you guys are wrong and spreading blatant misinformation - there is no magic number whose square is 0 but which is itself not zero anywhere in pytorch or tensorflow or any other real DNN framework that i'm familiar with. it's all fun and games to participate in math woo but you shouldn't be proclaiming things you don't actually know on a public forum.



The dual numbers exist just as surely as the real numbers and have been used for well over 100 years.

https://en.m.wikipedia.org/wiki/Dual_number

PyTorch has had them for many years.

https://pytorch.org/docs/stable/generated/torch.autograd.for...

JAX implements them and uses them exactly as stated in this thread.

https://github.com/google/jax/discussions/10157#discussionco...

Many other frameworks use them also, for many reasons.

As you so eloquently stated, "you shouldn't be proclaiming things you don't actually know on a public forum," and doubly so when your claimed "corrections" are so demonstrably and totally incorrect.


They exist by definition. Your claim makes no more sense than confidently proclaiming that there is no x such that x^2 < 0. We invented the imaginary numbers and made it so. So too the dual numbers.


I don't know why you keep calling it "magic". Whether or not pytorch uses them, they aren't magic, neither in the derogatory sense nor in the praise sense.


> I don't know why you keep calling it "magic".

because they have all of the gee-whiz factor of a freshman calc proof of the chain rule that divides and multiplies infinitesimals and absolutely not enough of the substance necessary to prove much more than that. they are absolutely, in the research literature, at best an anachronism (harkening back to leibniz) and at worst a parlor trick.

in literally my first response i provided the most trivial counter-example to the magic of non-standard analysis. no answers (crickets). i surmise this is because the people in here talking it up aren't really serious.


As I use the terms, the dual numbers are a different thing from non-standard analysis. Non-standard analysis, as I understand the term, uses non-standard models of the real numbers, and its infinitesimals do not satisfy h^2 = 0. In non-standard analysis, f'(x) is the standard part of (f(x+h)-f(x))/h, for a non-zero infinitesimal h (i.e. for a non-zero non-standard real whose absolute value is smaller than every positive standard rational number). (In order to apply this definition, f should be defined in a way which does not use anything requiring determining if a number is standard, or taking the standard part of something, etc.)
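As a worked instance of that definition (my own illustration with f(x) = x^2, where st denotes the standard part; the example is not taken from the thread):

```latex
f'(x) = \operatorname{st}\!\left(\frac{(x+h)^2 - x^2}{h}\right)
      = \operatorname{st}\!\left(\frac{2xh + h^2}{h}\right)
      = \operatorname{st}(2x + h)
      = 2x
```

Note that h^2 is not zero here. It is divided by h, and the leftover infinitesimal h is what the standard part discards.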

The dual numbers, on the other hand, are the ring R[h]/(h^2). This is not a field (h is a non-zero element with no multiplicative inverse), while non-standard models of the real numbers do form fields.

The dual numbers suffice to define differentiation of polynomials (which may not be sufficient for some purposes! [a]), and something like dual numbers is used in algebraic geometry to define the Zariski tangent spaces for points of algebraic varieties (whether in characteristic 0 or in characteristic p. In characteristic p, one certainly can't use an epsilon-delta definition!).
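To make the polynomial case concrete, here is a minimal sketch of R[h]/(h^2) in plain Python. The `Dual` class, the `lift` helper, and the `poly` function are names I made up for illustration; this is not how PyTorch or JAX implement forward-mode differentiation internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """An element a + b*h of R[h]/(h^2) (hypothetical illustration class)."""
    a: float        # real part
    b: float = 0.0  # coefficient of h

    @staticmethod
    def lift(x):
        # Treat a plain number c as c + 0*h.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # (a + b h)(c + d h) = ac + (ad + bc) h, because h^2 = 0.
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__


def poly(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) by Horner's rule; x may be a Dual."""
    result = Dual(0.0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


# p(x) = 1 + 2x + 3x^2, so p'(x) = 2 + 6x.
h = Dual(0.0, 1.0)
value = poly([1.0, 2.0, 3.0], Dual(2.0) + h)
print(value)  # real part is p(2), h-coefficient is p'(2)
```

Evaluating p at 2 + h gives p(2) + p'(2) h, so the derivative falls out of ordinary ring arithmetic with no limit taken.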

I really don't see your point about the "gee-whiz factor". While different things, both non-standard analysis and dual numbers can be handled rigorously, and have their use-cases, even though I certainly would at least default to thinking of differentiation in terms of the limits definition (assuming I'm thinking of any specific definition at all).

I assume that the counter-example you refer to is the (dx/dy)(dy/dz)(dz/dx) thing. That indeed doesn't seem like the kind of thing that using the dual numbers would be especially suited for. Though, also, not the sort of thing that should really come up in auto-diff I would think?

If there is a common parameterization of x, y, z by some variable t (on some interval I) in a neighborhood of the point under consideration, such that {(x(t),y(t)) | t in I}, {(y(t),z(t)) | t in I}, and {(z(t),x(t)) | t in I} each define a differentiable function, the parameterization satisfies the relationships between x, y, z from the larger context, and all three of (dx/dy), (dy/dz), (dz/dx) exist at the point in question, then it seems that the product should be 1.
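Spelled out under the extra assumption (mine) that x'(t), y'(t), z'(t) are all non-zero at the point, so each ratio of derivatives makes sense:

```latex
\frac{dx}{dy}\cdot\frac{dy}{dz}\cdot\frac{dz}{dx}
  = \frac{x'(t)}{y'(t)}\cdot\frac{y'(t)}{z'(t)}\cdot\frac{z'(t)}{x'(t)}
  = 1
```

This is a different situation from the triple product rule for a surface F(x,y,z) = 0, where the three factors are partial derivatives taken with a different variable held fixed each time, and the product is -1. I am assuming that is the case the counter-example had in mind.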

[a] Though, if the function is analytic (not just smooth), unless I'm missing something, it should also give the right answer (but still not a good definition of course, because one should define differentiation before defining what it is for a function to be analytic). Of course, just because the function is analytic doesn't make its domain include the dual numbers. One has to take a power series for it, and apply this power series to the element of the ring of dual numbers, not apply the original function to it.
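A rough sketch of what "apply the power series to the dual number" could look like, using exp as the analytic function and a truncated series. The pair representation and the function names `dual_mul` and `exp_series` are my own choices for illustration.

```python
import math


def dual_mul(x, y):
    """Multiply pairs (a, b) standing for a + b*h, using h^2 = 0."""
    return (x[0] * y[0], x[0] * y[1] + x[1] * y[0])


def exp_series(x, terms=30):
    """Sum x^n / n! for n < terms, where x is a pair (a, b)."""
    total = (0.0, 0.0)
    power = (1.0, 0.0)  # x^0
    for n in range(terms):
        scale = math.factorial(n)
        total = (total[0] + power[0] / scale, total[1] + power[1] / scale)
        power = dual_mul(power, x)
    return total


# Feed in 1.5 + h: the real part approximates exp(1.5),
# and the h-coefficient approximates the derivative of exp at 1.5.
value, derivative = exp_series((1.5, 1.0))
print(value, derivative, math.exp(1.5))
```

The series is only ever asked to add and multiply, so it makes sense on the ring even though math.exp itself would reject a dual number.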


The grandparent I'm responding to sure uses a very sloppy presentation of things. Not everyone here is a trained mathematician though, so you may want to cut people some slack.

Obviously, if h² = 0, then h = 0, so this statement made no sense. What the author probably tried to convey is that one can reason with infinitely small values as symbols, and perform automatic differentiation with that.


No, abstract algebra lets you extend the real numbers with an extra symbol h such that h^2=0. This h is not a real number, so you cannot apply the argument that h^2=0 implies h=0, much like complex numbers don’t obey all properties of real numbers.

(For example, for real numbers, x!=0 implies x^2>0, but i^2=-1.)
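If the symbol feels too abstract, one concrete stand-in (my choice of model, not something from the linked page) is a 2x2 matrix: a + b h corresponds to the matrix with rows (a, b) and (0, a), so h itself has rows (0, 1) and (0, 0). A small sketch with a hand-rolled `matmul` helper:

```python
def matmul(p, q):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


ZERO = ((0, 0), (0, 0))
H = ((0, 1), (0, 0))  # matrix standing in for h

h_squared = matmul(H, H)
print(H != ZERO, h_squared == ZERO)

# 2 + 3h squared should be 4 + 12h, i.e. rows (4, 12) and (0, 4).
X = ((2, 3), (0, 2))
x_squared = matmul(X, X)
print(x_squared)
```

So a non-zero object whose square is zero is nothing exotic; it just cannot live inside the real numbers.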

https://en.m.wikipedia.org/wiki/Grassmann_number


  a^2 = 1, first basis vector is a regular one
  b^2 = -1, second basis vector is "imaginary"
  a·b = 0, basis vectors are orthogonal (so ab = -ba)

  (a+b)^2 = a^2 + ab + ba + b^2 = 1 + 0 + (-1) = 0

Trick is taken from conformal geometric algebra [1].

[1] https://en.wikipedia.org/wiki/Conformal_geometric_algebra
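A quick way to check this numerically is to pick 2x2 real matrices that behave like the two basis vectors. The particular matrices A and B below are one possible choice I am assuming for illustration, and `matmul` and `matadd` are hand-rolled helpers.

```python
def matmul(p, q):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def matadd(p, q):
    """Add two 2x2 matrices entry by entry."""
    return tuple(
        tuple(p[i][j] + q[i][j] for j in range(2)) for i in range(2)
    )


IDENTITY = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))

A = ((0, 1), (1, 0))   # squares to +identity
B = ((0, 1), (-1, 0))  # squares to -identity

anticommutator = matadd(matmul(A, B), matmul(B, A))  # ab + ba
N = matadd(A, B)                                     # a + b
n_squared = matmul(N, N)
print(matmul(A, A), matmul(B, B), anticommutator, n_squared)
```

The product ab is not itself zero in this model; it is the sum ab + ba that vanishes, which is what orthogonality means for the geometric product and is all the computation needs.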



