
Statistical hypothesis tests: Commonly a calculation that predicts something has two ways to be wrong: (A) predict it will happen when it doesn't, and (B) predict it won't happen when it does. In the context of a statistical hypothesis test, we get to address the probabilities of A and B, adjust the test to get the combination of A and B we like best, or find a better test that gives better combinations. If we know enough about the data, i.e., the distributions under both hypotheses, then the classic Neyman-Pearson result says how to get the best test. The proof is like investing in real estate: first buy the property with the highest ROI, then the next highest, etc., until out of money. That's crude but not really wrong. I have a fancier proof based on the Hahn decomposition from the Radon-Nikodym theorem. Well, statistical hypothesis tests are being seriously neglected.
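
As a minimal illustration of the Neyman-Pearson idea (an example added here, not part of the original claim): test H0: N(0,1) against H1: N(1,1). The best level-alpha test thresholds the likelihood ratio, and for this Gaussian mean shift that reduces to thresholding the observation itself. The distributions and the 5% level are assumptions chosen just for the sketch:

    # Sketch of a Neyman-Pearson most-powerful test, H0: N(0,1) vs H1: N(1,1).
    # Illustrative assumptions only: the two Gaussians and alpha = 0.05.
    from scipy.stats import norm

    alpha = 0.05                        # chosen probability of error (A), the false alarm rate
    # For this Gaussian mean shift the likelihood ratio is monotone in x,
    # so thresholding the likelihood ratio is the same as thresholding x.
    threshold = norm.ppf(1 - alpha)     # reject H0 when the observation x > threshold

    # probability of error (B), a missed detection, is norm.cdf(threshold - 1)
    power = 1 - norm.cdf(threshold - 1)
    print(f"reject H0 when x > {threshold:.3f}; power = {power:.3f}")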

E.g., some tests are distribution-free. And for other tests, we want to make good use of multi-dimensional data, e.g., not just, say, blood pressure or blood sugar level alone but both jointly. Well, I'm the inventor of the first, and a large, collection of statistical hypothesis tests that are both distribution-free and multidimensional. That work is published, powerful, and valuable, but neglected. I did the work for better zero-day detection of anomalies in high-end server farms and networks. So, I got real statistical hypothesis tests, e.g., know the false alarm rate, get to adjust it, and get that rate exactly in practice. IMHO, my work totally knocked the socks off the work our group had been doing on that problem with expert systems using thresholds on the data. Also, the core math is nothing like what is most popular in AI/ML now and, as far as I know, nothing like anything even in small niches of AI/ML.
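
The published tests themselves aren't shown here. As a generic stand-in, here is a sketch of a two-sample permutation test, which is distribution-free, works on multidimensional data, and has an exactly controllable false alarm rate; the synthetic data, the 200/20 split, and the 5% level are all illustrative assumptions:

    # Not the author's published tests; a generic two-sample permutation test,
    # shown only to illustrate a distribution-free test with an exactly known
    # false alarm rate on 2-D data (e.g., blood pressure and blood sugar jointly).
    import numpy as np

    rng = np.random.default_rng(0)
    baseline = rng.normal(size=(200, 2))          # historical 2-D observations
    recent   = rng.normal(size=(20, 2))           # window being tested for an anomaly

    def stat(a, b):
        # distance between the two multivariate sample means
        return np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))

    observed = stat(baseline, recent)
    pooled = np.vstack([baseline, recent])
    count, n_perm = 0, 9999
    for _ in range(n_perm):
        rng.shuffle(pooled)                       # under H0 the labels are exchangeable
        if stat(pooled[:200], pooled[200:]) >= observed:
            count += 1
    p_value = (count + 1) / (n_perm + 1)          # Monte Carlo p-value, exact level
    print(f"p = {p_value:.4f}; alarm at the 5% level: {p_value <= 0.05}")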

Once I was asked to predict revenue. We knew the present revenue, and from our planned capacity we knew our maximum, target revenue. So, roughly, we had to interpolate between those two. How might that go? Well, assume that the growth is mostly from current happy customers talking to people who are target customers but not customers yet. Let t denote time, in, say, days. At time t, let y(t) be the revenue, in, say, dollars. Let b be the revenue at full capacity. Let the present be time t = 0, so the present revenue is y(0). Then the rate of growth should be, first-cut, ballpark, proportional to both the number of customers talking, i.e., y(t), and the number of target customers listening, i.e., (b - y(t)). Of course the rate of growth is the calculus first derivative of y(t), or

d/dt y(t) = y'(t)

Then for some constant of proportionality k, we must have

y'(t) = k y(t) (b - y(t))

Yes, just from freshman calculus, there is a closed-form solution: the logistic curve. So, the growth starts slowly, climbs quickly, roughly as an exponential, and then slows again as it approaches b asymptotically from below. So, we get a lazy S curve. So, it's a model of viral growth. We get the whole curve with minimal data: just y(0), b, and a guess for k. The curve looks a lot like the growth of several important products, e.g., TV sets. I derived this and used it to save FedEx. For all the interest in viral growth, there should be more interest in that little derivation.
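
For the record, separation of variables gives the closed form y(t) = b / (1 + ((b - y(0))/y(0)) e^(-bkt)). A tiny sketch of that solution follows; the numbers y0, b, and k below are illustrative guesses, not FedEx data:

    # Closed-form solution of y'(t) = k y(t) (b - y(t)), the logistic curve.
    # y0, b, k are illustrative guesses: present revenue, capacity, virality.
    import numpy as np

    y0, b, k = 1.0e5, 1.0e7, 2.0e-9

    def y(t):
        # y(t) = b / (1 + ((b - y0)/y0) * exp(-b k t)); y(0) = y0, y -> b as t -> infinity
        return b / (1.0 + ((b - y0) / y0) * np.exp(-b * k * t))

    for t in (0, 100, 200, 300, 400):             # days; shows the lazy S shape
        print(f"day {t:3d}: revenue ~ {y(t):12,.0f}")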

There is the huge field of optimization -- linear, integer linear, network integer linear (gorgeous stuff, especially with Cunningham's strongly feasible ideas), multi-objective linear, quadratic, non-linear via the Kuhn-Tucker necessary conditions, convex, dynamic, optimal control, etc. It is a well-developed field with a lot known. I've made good attacks on at least three important problems in optimization, via stochastic optimal control, network integer linear programming, and 0-1 integer linear programming via Lagrangian relaxation, and attempted several more where I ran into too much politics. Sadly, the great work in optimization is neglected in practice.

The world is awash in stochastic processes, but they are neglected in practice. E.g., once, for the US Navy, I dug into Blackman and Tukey, got smart on power spectral estimation, IIRC important for some cases of filtering, explained to the Navy the facts of life, helped their project, and got a sole-source development contract for my company.
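
For readers who want to try power spectral estimation, here is a small sketch. It uses SciPy's Welch estimator, a related averaged-periodogram method, not the Blackman-Tukey procedure or the Navy work itself; the 60 Hz test signal and sample rate are assumptions for the demo:

    # Sketch of power spectral estimation on a noisy sinusoid.
    # Welch's method (averaged windowed periodograms) stands in here for
    # the Blackman-Tukey approach mentioned above.
    import numpy as np
    from scipy.signal import welch

    fs = 1000.0                                   # sample rate, Hz (assumed)
    t = np.arange(0, 10, 1 / fs)
    x = np.sin(2 * np.pi * 60 * t) + np.random.default_rng(0).normal(size=t.size)

    f, Pxx = welch(x, fs=fs, nperseg=1024)        # frequencies and spectral density
    print(f"peak near {f[np.argmax(Pxx)]:.1f} Hz")  # should be close to 60 Hz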

The crucial core of my startup is some applied math I derived based on some advanced pure/applied math prerequisites.

And there is a huge body of brilliant work with beautifully done theorems and proofs that can be used to get powerful, valuable new results for particular problems.

Computers are now really good at doing what we tell them to do. Well, IMHO, nearly all of what we should tell them to do that isn't just obvious comes from applied math.




Can you point me to some paper I might read? If it's not too much trouble.

I'm a CS grad student, and sometimes it's hard to filter out the hype and find promising but underrated ideas among all the noise.


"Paper"? There are lots of pure/applied math journals packed with papers. I touched on the fields of statistics, probability, optimization, and stochastic processes, and each of these fields has their own journals.

Usually a better start than papers in journals is books. A first list of books would be those for a good undergrad pure math major. There, one gets to concentrate on analysis, algebra, and geometry, with some concentration on topology or foundations.

For grad school, one might want to do well with measure theory, functional analysis, probability based on measure theory, statistics based on that probability, optimization, stochastic processes, numerical analysis, pure/applied algebra (applied algebra -- coding theory), etc.

Then, sure, work with some promising applications and then dig deeper into relevant fields as needed by the applications.

One key to success is good "problem selection". So, with good problem selection, some good background, and maybe some original work, one might do really well on a good problem, publish some papers, do a good startup, make some big bucks, etc. That's what I'm working on -- I picked my problem, did some original applied math derivations for the first good, maybe excellent, solution, and have my production code in alpha test: 24,000 programming-language statements in 100,000 lines of typing.

It's applied math; hopefully it's valuable; but I wouldn't call it either AI or ML.

In case my view is not obvious, it is that the best help for the future of computing is pure/applied math, not much of current computer science. Computer science could help -- just learn and do more pure/applied math.



