have you considered offering it as a cloud platform? we're doing something along these lines, niche scientific software (biological modeling, bioinformatics) as a paid hosted service. still at the prototype stage! so I can't comment on how well the business model actually works yet lol
but the idea is our mathematician will be able to publish whatever novel math she develops, and we may eventually open source the math core as a reference impl, but we'll keep all the cluster management and other supporting infrastructure code proprietary. sort of a "if you want to run it on your desktop, go ahead! if you want to actually scale this up for big jobs, we've done all the legwork already so it's really in your best interests to just pay us." I think open source ideals are good and worthy but from a business perspective, you capture value by providing value that can't be got without you. relying on customer goodwill is particularly difficult because in any large org, the people who will feel goodwill toward you and the people who can authorize purchases are in two different departments
also fwiw I think if you wanted to do the model you described in the paper unchanged, gpl is a much better choice than mit. copyleft actually serves as a wonderful poison pill: you can try us out for free, but if you want to ship us, you need to pay for a proprietary license or legal will nail you to the wall. whereas mit, there's no stick. I've seen affero used by several projects for this express purpose: you have to buy a proprietary license because agpl is so onerous you just can't use the code for commercial purposes at all
interesting project btw, I love seeing stuff like this!
Yes, I've considered a cloud platform. There are several big difficulties with that.
First, data. It's easy to grab public data from PubChem, ChEMBL, and a few other projects, and make a service. But why would anyone pay for it given that PubChem, ChEMBL, ChemSpider, and others already provide free search services of that data?
There's search-as-improved-sales, like how Sigma-Aldrich lets people do a substructure search to find chemicals available for sale.
There's value-add data. eMolecules includes data from multiple vendors, to help those who want to purchase compounds more cheaply.
Or there's ZINC, which already provides search for their data.
So you can see there's plenty of competition for no-cost search. I don't have the ability to add significantly new abilities that people are willing to pay for.
Note also there's a non-trivial maintenance cost to keep the data sets up-to-date.
Second, the queries themselves may be proprietary. I talked with one of the eMolecules people. Pharmaceutical companies will block network access to public services to reduce the temptation for internal users to run a query using a potential $1 billion molecular structure (or potential $0 structure). eMolecules instead has NDAs with many pharmas which legally bind them. Managing these negotiations takes experience I don't have, and neither do I have the right contacts at those pharmas.
Sequences don't have quite the same connection between sequence and profit as molecules do.
BTW, part of the conclusion of my work is that people don't need a cluster for search - they can handle nearly all data sets on their laptop, so there shouldn't be a need to scale up any more. And small molecule data has a much smaller growth curve than sequence data, so Moore's Law is keeping up.
My first customer, who continues to be a customer, said outright that they would not buy if it were under GPL.
Since my paying customers are pharmaceutical companies who, as a near-rule, don't redistribute software, it doesn't really matter if they don't redistribute under MIT or don't redistribute under GPL.
I came into the project in part to see if FOSS could be self-supporting on its own. AGPL is often used as a stick to try to get people to use a commercial license - the implicit view of the two-license model is that FOSS is not sustainable. Which is now my conclusion, for this project and field.
not really into industry, but a) the pharma-companies using it are probably reluctant to give you their data and b) uni researchers are not overly fond of high-fee services and labor is cheap there.