I'm one of the founders of SigOpt. Let me know if you have any questions about what we do, how we do it, or the math behind SigOpt. We'll be in the thread all night and are happy to answer any questions.
> Even more intriguingly, SigOpt helps you optimize physical experiments. For example, Clark said one of his initial customers is using SigOpt to test different chemical combinations in creating shaving creams. It shouldn’t matter what you’re testing — Clark said SigOpt’s technology just uses your data and “builds up this model of what it thinks the world looks like. … It’s designed to be a very general system.”
This is really vague. Can you give some more technical details on what's actually happening?
Great question. We build off of research in Design of Experiments [1] in general and Optimal Learning [2] in particular. Our algorithms attempt to make the tradeoff between exploration (learning more about the space we are optimizing in) and exploitation (using the information we have to achieve the best values) to find optimal parameter configurations for experiments as quickly and efficiently as possible.
This has been an active field in academia for the last few decades, and the seminal paper behind some of our algorithms was published in 1998 [3]. There have been many successful applications in different fields from drug discovery [4] to nanotechnology experiments [5].
We wrap these powerful techniques behind an API and web interface to let anyone start running optimal experiments and leverage this research right away for any experiment they are trying to optimize. We have some more examples and a use case on our site [6].
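To make the exploration/exploitation tradeoff concrete, here is a minimal sketch of expected improvement, a common acquisition function in the Bayesian optimization literature. This is an illustration only, not a statement of what SigOpt runs internally: the candidate names, the predicted means and standard deviations, and the `best_so_far` value are all made up for the example, and a real system would get those predictions from a fitted model.

```python
import math


def expected_improvement(mean: float, std: float, best_so_far: float) -> float:
    """Expected improvement over best_so_far (maximization), assuming the
    model's prediction at a point is Gaussian with the given mean and std."""
    if std <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # First term rewards a high predicted mean (exploitation),
    # second term rewards high uncertainty (exploration).
    return (mean - best_so_far) * cdf + std * pdf


# Hypothetical (mean, std) predictions for three candidate configurations.
candidates = {
    "near_best_certain": (1.02, 0.01),
    "unexplored_uncertain": (0.90, 0.30),
    "known_bad": (0.50, 0.01),
}
best_so_far = 1.00

scores = {
    name: expected_improvement(mean, std, best_so_far)
    for name, (mean, std) in candidates.items()
}
next_to_try = max(scores, key=scores.get)
print(next_to_try, scores)
```

With these made-up numbers the uncertain candidate scores highest even though its predicted mean is below the current best, which is the exploration side of the tradeoff showing up in the arithmetic.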
* how many parameters can be optimized at once? Peter Frazier's slides [1] seem to imply that a Gaussian-process-based approach is suited for d<20 parameters
* can multiple objectives be handled?
* can the system produce experimental designs for n>1 points designed to be sampled in a batch, or is it just n=1 point at a time?
* in some scenarios, it might only be cost-effective to perform experiments in batches, where there are additional constraints upon which set of samples can be selected per batch (these might be dictated by the manufacturing process). Would it be possible to support this somehow?
1. In general, the more dimensions in your parameter space, the longer it takes to find an optimum. Something like grid search is exponential in the number of dimensions, while we find that in practice our methods are closer to linear (something around 2D to 10D samples, where D is the number of dimensions). While there is no explicit "limit" on the number of parameters (in the code or algorithm), we tend to see the best results when there are fewer than 20, as Peter Frazier suggests, since that lets us find an optimum much more quickly (in the number of samples).
2. You provide a single metric (Overall Evaluation Criterion) to SigOpt; this can be a combination of many sub-objectives. We talked a little about this in the MOE docs [1] and are happy to help people brainstorm on the best objectives. This is a very important part of the process, as Microsoft points out in a recent paper (section 3.1) [2].
3. We can give you back experimental designs with n>1 points. We condition on any outstanding experiments being run too. Some of this work was developed in my thesis [3]. We have some docs on this too [4].
4. We do support the notion of "untunable" parameters in beta (releasing soon). In a given experiment if you know you need to use a specific chemical we can hold that constant while optimizing the other free parameters so that you can run a single batch with that specific constraint (or a set of constraints). I think this covers what you are asking, but I am happy to dive deeper if not. I'd be happy to help you set up one of these experiments if you are curious.
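Points 1 and 2 can be made concrete with a little arithmetic. The sketch below compares a grid search budget against a budget of roughly 2 to 10 samples per dimension, and shows one possible way to fold sub-objectives into a single metric. The function names, the 10 points per axis, and the weighted sum are assumptions made for illustration; the weighted sum in particular is just one simple way to combine sub-objectives, not something prescribed by SigOpt or MOE.

```python
from typing import Dict, Tuple


def grid_search_evaluations(points_per_axis: int, dims: int) -> int:
    """Number of evaluations for a full grid: exponential in dims."""
    return points_per_axis ** dims


def linear_budget(dims: int, low: int = 2, high: int = 10) -> Tuple[int, int]:
    """Rough (low, high) sample budget at 2 to 10 samples per dimension."""
    return low * dims, high * dims


def overall_evaluation_criterion(
    sub_objectives: Dict[str, float], weights: Dict[str, float]
) -> float:
    """Combine sub-objectives into one number via a weighted sum
    (an assumed combination rule, chosen only for illustration)."""
    return sum(weights[name] * value for name, value in sub_objectives.items())


for dims in (2, 5, 10, 20):
    print(dims, grid_search_evaluations(10, dims), linear_budget(dims))

# Hypothetical sub-objectives: reward revenue, penalize latency.
metric = overall_evaluation_criterion(
    {"revenue": 2.0, "latency": 0.5},
    {"revenue": 1.0, "latency": -2.0},
)
print(metric)
```

The choice of weights encodes how much of one sub-objective you would trade for another, which is why picking the metric deserves real thought before starting an experiment.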
At Yelp we used MOE to tune the advertising system, optimizing thresholds for when to display certain types of ads, and how to rank the advertisements. More info can be found in this talk [1] and slides [2]. Tuning these thresholds across hundreds of categories, device types, and related dimensions would be impossible to do by hand, but is exactly what this system is built for.
Netflix has used MOE to optimize their deep learning systems (talk at MLconf [3], slides from NIPS [4]). Yelp was also able to leverage it to tune the hyperparameters of various machine learning systems in advertising and search.
SigOpt takes this work and related research and puts it behind an easy-to-use API and web interface. We want to bring these powerful tools to everyone and optimize everything.