Ask HN: Huge enterprise customer wants to see our source code
513 points by throwawaysource on Dec 6, 2017 | 277 comments
We are a small 5-person enterprise software startup, operating in the data analytics/ML space. We are working on starting a proof-of-concept with a huge potential customer ($50+bn in revenue). We recently identified our first use case, and are ready to get into contract negotiations.

But then they got back to us with an odd request: they want to see our source code (likely upon completion of the PoC). Given that our core IP is our models and algorithms, we are reluctant to agree. Their justification is: "we want to see how your algorithms made their decisions."

We know that they have lots of resources and are building up an internal data science team. And yet it was pointed out to me that their goal might not necessarily be to outright steal our IP, but rather to cover their bases. But we are still worried they might be "inspired" by the parts they see and get their internal teams to replicate it across other sites or use cases. And we don't have the resources to litigate, nor any way of knowing whether they do this.

My questions: 1) Has anybody run into a request like this? How would you respond? 2) How likely do you think their goal is to genuinely "see what happens under the hood" as opposed to replicate in the future? 3) Are there any legal protections we can put in place to prevent them from not just copy-pasting our code, but also from "learning from it" or so?




Well, having worked for a small software startup that did just that, I can tell you what we did.

We agreed to letting them audit the code with conditions.

1. The audit happened on our computers with someone from our team in control (me). I locked the computer when I wasn't physically there to watch what they did.

2. We removed the most sensitive part of the code and told them what it did. We kept the method signature.

3. All of the source code remained on my laptop and my laptop was never attached to their network.

4. We could tell them that we would not answer any question they asked.

5. They paid for expenses and time.

It led to a decent sized contract.

It lasted about 2 days for a medium sized Java application. They asked one or two questions I wasn't allowed to answer and took it well when I told them so.

I am not a lawyer but you can also ask for a non-compete and or a non-disclosure from their individual employees that will conduct the review. Get your lawyer on that before you do it.

It is up to you or your company to decide what to do. No one can tell you if it is a good or bad decision for you. At least for me and my company, it worked out. As far as I know the auditing company never developed a competitor and was a customer of my company for a long time to come.


I've done this too. The question is usually a matter of compliance more than anything else. They want to check the licenses of any included packages, make sure there's no encryption stuff that can't leave the USA, etc.

Doing what OP described is great: it lets their folks do the audit with no risk of you losing "ownership". It shows you are both a good partner and value what you do.


License compliance is easily verified using 3rd party services like Black Duck.

You basically install a scanner on your machine, feed your sources into it, it hashes them line by line and sends hashes to the mothership for analysis. It then spits out the report that file X, line Y matches something in the open source package Z. At least that's how it worked a decade ago, when we had to do a pre-acquisition source code audit.
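To make the line-by-line hashing described above concrete, here is a minimal sketch in Python. It is not Black Duck's actual implementation or API; the function names, the whitespace normalization, the `min_length` filter, and the sample package `libfoo` are all assumptions made for illustration.

```python
import hashlib


def line_hashes(text, min_length=8):
    """Yield (line_number, sha256_hex) for each line of at least min_length characters.

    Lines are stripped of surrounding whitespace first, so indentation changes
    do not hide a match. Short lines such as a lone "}" are skipped because they
    would match almost everything.
    """
    for number, line in enumerate(text.splitlines(), start=1):
        normalized = line.strip()
        if len(normalized) >= min_length:
            digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
            yield number, digest


def build_index(packages):
    """Map each line hash to the first known package (hypothetical corpus) containing it."""
    index = {}
    for name, source in packages.items():
        for _, digest in line_hashes(source):
            index.setdefault(digest, name)
    return index


def scan(filename, text, index):
    """Return (filename, line_number, package) for every line whose hash is in the index."""
    return [(filename, n, index[d]) for n, d in line_hashes(text) if d in index]


# Hypothetical data: one "open source" package and one proprietary file.
known = {"libfoo": "int add(int a, int b) {\n    return a + b;\n}\n"}
ours = "void main() {\n    return a + b;\n}\n"

report = scan("ours.c", ours, build_index(known))
for filename, line_number, package in report:
    print(f"{filename}, line {line_number} matches something in {package}")
```

Only hashes would need to leave the machine in a setup like this, which is presumably why the approach is acceptable for closed source audits. A real scanner would need far more robust normalization and matching than this sketch.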


BD still works that way. The lawyers love making us use it.

https://www.blackducksoftware.com/solutions/mergers-and-acqu...


GPL in my source? You'd be surprised.


License compliance is not verified in closed source stuff, period.

When was the last time some user of Windows verified whether Microsoft Windows contains some piece of code that Microsoft shouldn't be redistributing?


Microsoft has provided Windows source code for auditing for a long time, especially to governments. Here's an article about it from 2003:

https://www.computerworld.com/article/2580563/microsoft-wind...

And again in 2015:

https://www.petri.com/no-back-doors-microsoft-opens-windows-...


License compliance is incredibly important and unfortunately overlooked by many smaller firms. The potential liability from a GPL or other violation is just not worth it.

Anecdote: We have released code under the Apache 2 License (our biggest project by far is https://github.com/sheetjs/js-xlsx) and we've been roped into negotiations because some companies tried to take shortcuts by copying our code without proper attribution.


What exactly is the potential liability for a GPL violation?

I've gone 12 rounds with IP lawyers over these theoretical violations (static vs dynamic links). But I found it odd that I could never find a single case of significant liability due to infringement. The nature of damages is unclear and the landscape of counter-parties (with an incentive to sue) is amorphous. It seemed like worst-case, a proven infringer just had to re-write the offending module and make a $10k donation to an open source foundation. I've never knowingly infringed GPL and am not advocating that anyone should; it's just as I said, I found the legal community's focus on this area out of step with their otherwise well-measured calculations of risk and reward.


Not trying to troll here... The biggest violation might simply be the size of the attorney bills related to using GPL licensed dependencies in your code. There's simply no great way that I've found to get attorneys to give you a checklist of how to comply with licenses that aren't on a short list of reasonably well understood licenses. Saying "no GPL" is easier in many cases for practical reasons that kind of stink. To say it differently, how much time would you like to dedicate to getting GPL blessed when doing so is competing with all of your other priorities?


Fair point but doesn't this run directly counter to the trend of startups leveraging an increasing amount of open source code?


In my experience, having a good ratio of responsible, experienced developers on the team (one of those may do it, for a small startup) is often enough to keep an eye on it, and a lot of startups get quite far while avoiding the most risky (for them) licenses. If this is reasonably gated while starting a project, or whenever you want to add new dependencies, it's well possible to stay on course.


I have no idea what lawyers actually think but the fears I hear about generally don't come from lawyers. The most common fear I think is "we'll be forced to release our closed code!" But no, my intuition is that even with willful violation of something licensed under e.g. the AGPL which is supposed to force such a thing on a SaaS company, the worst case is something like: the company is hit with the maximum penalty for copyright infringement and will have to shell out a few hundred thousand dollars and someone goes to jail for a few years, and the company has to immediately stop distributing/serving products with the software on them until the software is either legally licensed, legally distributed, or replaced.


Jail for copyright violation? I guess people who've distributed pirated goods, like movies or music, have gone to jail, but I've never heard of it happening to a software developer. Fines are plausible, but I'm having a hard time imagining a scenario that leads to jail time.


The plausible situation is no criminal case, just a civil case, and probably settling for what amounts to a fine / delayed purchase, e.g. the "$10k donation" alluded to. But the worst case possibility, even if very unlikely, should include jail time (somehow -- I too don't know exactly how that would work out or who exactly would go to jail, manager or dev).


Just curious here... It's a license violation, then, not a copyright (law) violation. So isn't it more like breach of contract, i.e. a civil matter between parties, rather than a criminal case? And one does not go to jail for civil cases, do they?


GPL is a license to use copyrighted code, not a contract. If you violate the terms, your license is revoked. If you then continue to distribute the code in question without a license, then you may be liable for copyright infringement.

The first consequence of this is that only the copyright holder has standing to sue you, not anyone else who thinks they have a right to your code because you violated the GPL terms. They’re just wrong.

The second consequence is that the only remedies for copyright infringement are an injunction and damages related to the copyrighted code. So a court could force you to stop distributing the copyrighted code, requiring you to either (1) remove the GPL code from your product and use a substitute, or (2) pay for a commercial license from the owner. But you would not be forced to release your proprietary code under a GPL license.

Regarding jail for civil cases, no. In the US, we eliminated debtors’ prisons many years ago.


Developers have gone to jail for stealing code from financial companies.

See: https://en.wikipedia.org/wiki/Sergey_Aleynikov

Most GPL projects however don't have expensive, hateful, blood seeking lawyers behind them.


That is completely different. You don't go to jail for a license violation.


Honest question:

Is a license violation not equivalent to unlicensed use of code?

I mean, if you don't meet the terms of a license, you're not eligible to use the code under that license. That means you have no license to the code, and therefore are in breach of copyright, no?

How is that different from just using someone else's code outright - you're breaching copyright and you're not licensed, no?


Well, the difference in that case is that it isn't license violation at all. This case is a trade secret violation. Since Goldman Sachs never released the source code, it is by definition a trade secret which Sergey Aleynikov was not allowed to release or utilize outside of his job at Goldman Sachs.

If Goldman Sachs ever officially released the software it could be a copyright or license violation but this isn't the case.


I remember that in some situations copyright infringement is cheaper than a license violation. With the former you can argue ignorance and pay a fine; with the latter you've knowingly acknowledged that what you've done is a violation of another person's license but continued your behavior.


Aleynikov was charged with stealing trade secrets, which does not apply to GPL software.


> I found the legal community's focus on this area out of step with their otherwise well-measured calculations of risk and reward.

That seems true with pretty much any legal area. But, it's not their job to calculate risk, it's their job to tell you what is legal. It's management's job to decide if they want to take the risk or take the legal advice.


I think it's 100% their job to calculate risks and inform about legal customs/expected outcomes around something in question.


You're going to be peddling that idea for a long time before you convince lawyers that it's in their interest to generalize on expected outcomes.


> What exactly is the potential liability for a GPL violation?

An anecdote from 2001:

http://web.archive.org/web/20061106184219/http://docs.gnu-de...


Usually, when doing due diligence, anything that might conceivably be a lawsuit or ongoing lawsuits results in a lower price or funds in escrow or similar agreements.

So, even if there never was a big payoff, the mere potential is a big red flag for the due diligence team (and leverage for negotiations as well). They can add indemnification clauses on contracts (they won't do that for a small company like the OP I guess).


Most cases are settled out of court or aren't publicized.

https://sfconservancy.org/docs/2010-07-27_dj-opinion.pdf interesting read


> otherwise well-measured calculations of risk and reward

Where do you work that attorneys generally make well-measured calculations of risk? And are you hiring?


They are just that: theoretical violations, and even those are touted only by the FSF: no one else agrees that linking against source code taints your own code.


It really depends what you mean by "linking" and what you mean by "taints".

Linking GPL code to incompatible code is against the license terms and since you can't copy/distribute the code without a license you violate copyright if you distribute that code. Nobody, not even the FSF, believes that this means that your code must be under the GPL at that point -- it's just the remedy that the GPL explicitly allows. Other remedies are usually possible from the copyright holder (usually just "stop infringing our copyright"), or imposed by a court.

IANAL, but the act of a user linking GPL code to incompatible code without distribution is not copyright infringement IFAICT. I believe the FSF also agrees with that. In fact the GPL specifically states that you may run the code for any purpose. This leaves a kind of grey area where you could write code that can link to GPL code, but that you leave the end user to do the final linking. In the extreme these are the so-called "binary blobs" in things like the Linux kernel. My understanding is that the FSF thinks that these are an infringement but that other people disagree. I don't think it's ever been tested in court.

Edit: I should point out that the reasoning for it being an infringement is that by intentionally writing code that can link to the GPL code, you are creating a derived work of the GPL code. Whether this argument holds water is anybody's guess and I would love to see it tested in court.


> Linking GPL code to incompatible code is against the license terms and since you can't copy/distribute the code without a license you violate copyright if you distribute that code. Nobody, not even the FSF, believes that this means that your code must be under the GPL at that point

Except as I understand it, this is the position pushed by WordPress and other GPL CMS's (Drupal?).

They say that because a theme or a plugin uses WordPress's functions and these are GPL licensed, then you're 'linking' with it and your PHP code has to be GPL licensed too.

To me this is crazy, but no one has challenged it yet.


There are 2 issues here and it's important to keep them separate.

1: If you are in violation of the license, must you put your code under the GPL? The answer is definitely no. If you are in violation of the GPL you may license your code under the GPL to become compliant (though in V2 and below you also have to be forgiven by the copyright holder(s)). This is an option that you have in addition to the options you have for any other copyright infringement (i.e. going to court). You never have to relicense your code. It's just a way to avoid court. There are definitely people who misunderstand this point, but I don't think you'll find anybody in an official capacity at the FSF who doesn't agree with the above.

2: If you write code that complies with an API in a GPL project, do you need a license? Like I said near the bottom of my other post, that's the grey area. Some people will say yes and some will say no. The idea is that complying with the API necessarily means that you are creating a derived work and are therefore infringing copyright if you don't have a license.

Like you, I initially found this idea to be very strange. However, the Oracle vs. Google court case lends considerable credence to it. In that case Google was found to have infringed on Oracle's copyright by implementing a published API. The judge ruled that it was fair use, though. I suspect (IANAL, remember) that if this was ever ruled upon you would probably get a similar judgement. The situation is fairly different, though, so you never know.

Now, the reason you don't find people challenging this stuff is (I think) mainly because the copyright holders are clear how they stand on the issue. I don't know anything about it, but I'll take your word that this applies to WordPress plugins/themes. You can just go against their wishes, but it's kind of a jerk move. Will anybody sue you if you do it? Will you win a court case? Maybe, maybe not, but you will certainly upset a lot of people so why do it? With the Linux kernel and binary blobs, it's all good as long as you stick with certain APIs. People do it knowing that nobody is going to sue them. NVidia steps over the boundaries, gets yelled at by Linus, but still nobody sues them.

I mean, you can also look at the really weird interpretation of the AGPL that projects like Mongo DB have. Their attitude is way more liberal than the license seems to state (to the point where I frequently wonder if they have actually read the license). But who is going to complain? It's their code -- they can do what they like. They don't have to sue anybody if they don't want to.


IFAICT ?


Typo. As far as I can tell. Sorry!


Thank you for the clarification

(TYFTC)


Cease and desist selling your core product until you replace the GPL bits, or else make your core product open source.


For companies interested in addressing licensing concerns before they balloon into real problems, FOSSA builds a platform which analyzes code and automates license disclosures and legal boilerplate: https://fossa.io/

Full disclosure: founder is a friend and all around great person.


I don't own a company, but I'd like to try this out myself. Is there any possibility of having a "personal" trial mode instead of having to enter a company name?


From team FOSSA here, you can put in anything for the company name -- or leave it blank!


I don't understand how license compliance matters here. Why does the big customer care about whether the seller has violated licenses? The seller would be the one in trouble, not the customer.


If the seller is in violation of one or more licenses, they might get sued and be forced to discontinue or significantly alter their product. If the big customer depends heavily on the product and cannot easily replace it, they are also screwed in this scenario, even if they aren't in violation of any laws.


> The question is usually a matter of compliance more than anything else.

What would Microsoft say if a customer said "we need to see the source code for Windows, Outlook, Exchange and Office applications before we use them, just as a matter of license compliance"?


One thing I would note is that the individual employee might not be authorized to sign a legal document, such as "a non-compete and or a non-disclosure". My company tells us to actively refuse to sign any legal documents (even at visitor check-in) -- we have a Legal team for that. All documents should be signed by Legal before I step on site.


Out of curiosity, are you barred from installing software on company computers, because you're either explicitly or implicitly accepting an EULA?


My company will scan our computers every once in a while. I've gotten alerts about having software installed that I should not - in both cases we had an enterprise license and I was using a trial. I didn't want to go through a burdensome request process to try something.


This is a common thing in many larger corporations. Some even lock down machines quite a lot. I have seen multiple cases where this led to laptops for dev work which weren't connected to the company's network.


There is a list of pre-approved apps, and a method to request approval if it is not on the list. Also, there is a list of approved licenses for open source software.
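A gate like this is simple to automate. The following is a rough sketch in Python, assuming dependencies are already available as a mapping from package name to SPDX-style license identifier; the approved set, the package names, and the function name are hypothetical and not taken from any particular company's policy.

```python
# Hypothetical policy: licenses that need no further sign-off.
APPROVED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}


def review_dependencies(dependencies, approved=APPROVED_LICENSES):
    """Split {package: license_id} into pre-approved packages and ones needing a request.

    Returns two alphabetically sorted lists: (ok, needs_approval).
    """
    ok, needs_approval = [], []
    for package, license_id in sorted(dependencies.items()):
        if license_id in approved:
            ok.append(package)
        else:
            needs_approval.append(package)
    return ok, needs_approval


# Hypothetical dependency list for illustration only.
deps = {"pkg-a": "MIT", "pkg-b": "GPL-3.0-only", "pkg-c": "Apache-2.0"}
ok, needs_approval = review_dependencies(deps)
print("pre-approved:", ok)
print("needs a request:", needs_approval)
```

Running a check like this whenever a dependency is added keeps the approval request as the exception rather than a surprise found during an audit.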


Slightly tangential, but working for a small software company (<20 employees) who writes Line Of Business code for very large enterprise customers in the health care industry, we at times offer source code licenses, but at the very least (and this was my original point), our license as a vendor also includes source code escrow in the event of our collapse. We haven't really had anyone request a pre-sale source code review.


> our license as a vendor also includes source code escrow in the event of our collapse

Wow, I've never heard of this before - it sounds great!

Do you advertise this on your website, or is it just buried in the terms and conditions? I was just wondering how you might get across this info without worrying potential customers with the mere mention of collapse?


You don't advertise it. You say nothing until a customer expresses concern about your long term survival then you say "Oh yeah this comes up from time to time and the solution is code escrow, we've done it for customer Y and X so we know the drill".


Given the nature of the product we sell (where implementations are in the six-seven digit range, with sometimes over a year of prep work), it tends to come up in discussion most times, and I think may even be in our boilerplate contract (which is signed by both parties, it's not 'order online, enter credit card' purchasing).


Even big vendors do source code escrow for very big contracts (e.g. government).


In our case it has always been something our large clients request and expect as part of any project we work on, rather than something we advertise as a feature of working with us.

Most companies will dictate that you use an escrow provider from their "preferred suppliers" list.


Any suggestions for that “preferred suppliers” list?


If there is a preferred suppliers list in operation, you don't pick those suppliers; your client does. And it will often be relatively local, at least country local, companies in my experience, presumably for legal jurisdiction reasons.

So what our clients dictate is probably irrelevant to your needs.


Thanks. I misunderstood. I thought you had a list of preferred escrow services.


Seconding the “source code in escrow” solution as a mitigation for concerns about working with a startup. IMO enterprise customers are happy to pay for it, and it usually never gets asked about again.


How does "code escrow" work in practice? Are you actively syncing to a private Git repo?

Or is it more a case of, if we're collapsing, we'll put this in place?


> Or is it more a case of, if we're collapsing, we'll put this in place?

No-one would trust that. If you are collapsing you aren't going to care enough, or the people who do/did both care and know what to include (source, build toolchain setup, documentation) have already gone.

It is surprising, given how important it could be, how little the clients bother to check that escrow updates are happening - so even if the contract says otherwise it could often come down to this!

> Are you actively syncing to a private Git repo?

Some do that, but in my experience it is more common to provide a snapshot (a full copy of the relevant parts of your repo(s) with supporting documents), with a new snapshot uploaded with each major or minor release. Sometimes it isn't release-based but instead the client dictates escrow is updated "at least once per year or X months" which to my mind shows that part of the contract was written by a legal/admin person without a lot of technical experience.
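As an illustration of the per-release snapshot approach, here is a minimal Python sketch that packs a source tree into an archive and writes a checksum manifest alongside it. The function names, the file naming scheme, and the idea of shipping a SHA-256 manifest are assumptions for the example, not a description of any escrow provider's requirements; uploading to the provider is left out entirely.

```python
import hashlib
import tarfile
from pathlib import Path


def build_manifest(root):
    """Return {relative_posix_path: sha256_hex} for every file under root."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def create_snapshot(root, release, out_dir):
    """Write <release>.tar.gz and <release>.sha256 into out_dir and return both paths.

    out_dir is assumed to live outside root so the archive does not include itself.
    """
    root, out_dir = Path(root), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archive = out_dir / f"{release}.tar.gz"
    manifest_path = out_dir / f"{release}.sha256"

    manifest = build_manifest(root)
    with tarfile.open(archive, "w:gz") as tar:
        # Store everything under a top-level directory named after the release.
        tar.add(root, arcname=release)
    manifest_path.write_text(
        "".join(f"{digest}  {name}\n" for name, digest in manifest.items())
    )
    return archive, manifest_path
```

Calling something like `create_snapshot("path/to/repo", "v1.2.0", "path/to/escrow-out")` from the release process would cover source, build instructions, and documentation in one step, provided they all live under the snapshotted directory. The manifest gives the client a cheap way to check that the deposit they hold matches what was shipped.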


Correct, there's some (not necessarily daily or weekly) amount of sync frequency that you negotiate.


Haha, was thinking that wouldn't be too trusted alright! Scheduled syncing makes sense. Cheers for the replies, have never heard of the concept of code escrow before, seems like a neat solution


You mean to say that you reserved the right to not answer certain questions or that you refused any questions?


We reserved the right to not answer certain questions. I answered most of their questions, only 1 or 2 were in areas where I wasn't supposed to give them answers.


There is also a very fine difference in the exact wording. You might even have the choice to decide the location where the physical copy of the code is located (e.g. your office, not theirs / on the north pole) and they might never actually bother to show up. Does the source code even need to be humanly readable or are they actually happier with a schematic diagram?


For many reasons, always choose your office as the location. Even if they're paying expenses, it makes your employees available for other things that may come up at your site. All the security benefits are obvious, but lastly... from a sales standpoint, it allows you to treat them like a welcomed guest and win them over on your turf.

Either way, get the review done fast. As they say, time kills all deals.


one thing about non-compete and non-disclosures is that if you are a small shop going up against big shops it can be costly for you to try to enforce it against a larger corp. not impossible but not cheap and without any guarantee of success.


I think this is a signal that you're probably underpriced by a factor of 2X to 10X.

So this is a sales objection. With regards to that justification, I'd want to know a) who in the business is generating it and b) what they expect to feed your answer into.

Is this just somebody who wanted to sound smart in a meeting? Then they don't need your source code; they need ~5 nice PowerPoint slides and you're done. I'd be polite but firm in this case.

Is this e.g. a risk management officer? Then OK, that is a totally reasonable ask. It's going to cost you $20k and some legal wrangling regarding non-disclosure and non-competes, but we will give up to 2 team members of yours up to one business day of access to our code in a clean room on our hardware, under supervision by a technical leader, with appropriate rules of engagement regarding questions.

Is this a business unit head sniffing for new things to build? Then I will not automatically eject out of this conversation, but my price just had another zero added to it, at least.


I’m a security engineer at a large financial organization and we audit all 3rd party products; most of these audits include a code review.

This is a compliance process and it’s controlled by the SEC.

All of your suggestions would make the company that I at least work for to simply walk away from table.

This has happened more than once and all of our contracts contain a clause that if the application does not get a pass from our application security team the procurement would not continue.


Thank you for your comment. Can you add, for the benefit of the peanut gallery, what the typical cost of a one year software license you pay for is? I have a good guess but I think HNers need to hear people-other-than-me say it to internalize it.


It really depends; usually in the millions, especially if this is a product related to trading.

It’s also important to note that as far as I am aware contracts on this scale have multiple stages and we are paying for PoC/pilot programmes during which we usually perform our review. This usually isn’t software that a pre-sale engineer would come and setup in a few days so we are paying effectively even for a demo.

Overall we audit everything from Xerox software to custom balance sheet management products.

The process is dependent on the level of trust/confidence we have in the vendor and the risk profile of the product.

But in all cases there will be some sort of review.

For vendors that would not give us code access we request permission to pentest the product. The pentest would include reverse engineering in most cases, and as the vast majority of products in the financial sector are in Java, the source code isn’t hard to get through reversing the byte code.


Do you use SaaS products at all? If so, do you audit source for them too, or just products that run on your hardware / network?


SaaS is usually just a pentest; for some we get/request code access.

Overall it depends on the risk profile of the product. SaaS products tend to have a lower risk profile because they don’t deal with trades/contracts directly.

It’s all about managing risks. One of the products we use is Salesforce. Pentesting Salesforce would be a waste of time for us because Salesforce has a good application security team and is a trusted vendor. However we did perform a review of apps/plug-ins that run on the Salesforce platform which we use but have much less confidence in.


Not OP, but we were in the same situation and required source review. We negotiated escrow, the whole shebang. License costs for our enterprise were multi-year and ran over $15 million for the project for this one app. It's not an odd or rare request.


Anecdotal - for a large publicly traded financial services firm that wants to manage their book of record software (i.e. the software that holds the truth of "dear client you hold this many assets"), you're looking at $3-5M/year in license costs.

Depending on the business context of a ML solution for a company like the OP describes, you're likely looking at $500k+, but this is pure speculation and requires more information about the customer situation (e.g. is the ML solution going to save the company money or help them sell more).


Why would a financial product have a compliance procedure by the FCC?

> All of your suggestions would make the company that I at least work for to simply walk away from table.

Unless the company is already a customer and is paying for the development of the software, please do walk away from the table, I'm sure they have better things to do and better customers to sell to


Typo, too many acronyms.

As for the second part, I won't really comment; I’m not in a place to make decisions on procurement, I just break things. I would just say that size/volume wise, as far as clearings go, there isn’t a better/bigger client. And everyone is the same.


Ah your other comment makes the procedure and why it's needed clearer


FCC is a typo I'm sure.

>Unless the company is already a customer and is paying for the development of the software, please do walk away from the table, I'm sure they have better things to do and better customers to sell to

This sentiment is similar to things I hear all of the time working in health care / biotech. The fact is you have no domain knowledge and not a clue as to how things work in the real world. If your company just ignores the way things are done and the requirements of the domain you're simply going to go out of business.

When the stakes are high (e.g. death/harm, massive financial losses, etc.) the business process changes accordingly.


I'm thinking it's a typo for SEC.


That’s correct.


Seems like FUD to me ! Maybe Sarbanes Oxley.


There is no compliance rule put out by the SEC that requires (or even suggests) source code review.


> controlled by the FCC

I'm guessing this is a typo?


Or perhaps bad voice-recognition for SEC?


Brain typo :)


I’ve had a financial product audited by an independent firm the client and I agreed on. In fact they still audit it on a regular basis despite the product no longer being maintained.


This also can be acceptable based on the risk; sometimes we get a full pentest report from an independent security vendor.


How specifically did the request for source code signal to you that the vendor's charging too little? You suspect the customer is worried that the product is really nothing because it's priced so low? High prices from a vendor signal they mean business and ward off fiddly objections? (I'd expect the opposite, higher price => higher stakes => more interest in knowing just what you're purchasing.) That an ML product with source that BigCo would want to read probably has some special sauce that deserves a high price? Just that most small vendors looking at selling to very large customers tend to underestimate the optimal price?

I guess I risk inviting you to reply "Yes." or "mostly yes" with that laundry list, because most are theoretically plausible signals and sort of statements of the same idea, but what was your thought process leading to the conclusion they're underpriced? As someone who works for a vendor this is relevant to my interests :)


> How specifically did it signal to you that they're underpriced?

Because the poster is surprised by this requirement and doesn't have standardized answers to it yet.

In poker, there is this thing called "assigning a range" to someone based on their actions. You can't see their cards, but their actions might give you signal where you could say "Hmm, playing like they have a middle pocket pair and not totally air nor a monster." If sophistication with regards to enterprise sales is a spectrum like poker hands, surprise about certain parts of the process causes me to adjust my range accordingly, and that plus experience working with software people colors my estimate of pricing sophistication.


At the flea market they call it "looking at the customer's shoes."


Priceless!


Ah, that's very helpful. As you could see from my reply I was reaching for reasons you might think their pricing was low without knowing their pricing, and this makes more sense than the ones I came up with.


No strat talk on hn bro ... ;)


Another possibility is they're worried you'll go out of business. In that case, they don't need to see the code right now. You can set up a code escrow agreement in case you cease to exist.


no, every big enterprise I've worked for we do source audits on everything that is allowed to run. We check for phoning home, we check for license issues, we check for massive security vulnerabilities. It's just a cover your ass kinda thing when 100k people are going to run an app. There's a whole department for that.


Yes, the calculus is completely different when you are a 5k midsize company vs a 50k+ behemoth. There are strict, multi-layered processes for everything in larger companies, especially software procurement (including yearly upgrades!), and legal attends every meeting.


This advice is generally very sound but misses the mark slightly.

> underpriced by a factor of 2X to 10X.

The OP never asked about price, so this is pure speculation. Why is the conclusion that this is a sales objection?

> It's going to cost you $20k and some legal wrangling regarding non-disclosure and non-competes

It sounds like they haven't even gotten through procurement yet, which means they can't even suggest charging $20k for NDAs etc.

> Is this just somebody who wanted to sound smart in a meeting?

This absolutely does happen, but it doesn't happen when the company is discussing POC plans.

> We know that they have lots of resources and are building up internal data science team. And yet it was pointed out to me that their goal might not necessarily be to outright steal our IP, but rather to cover their bases. But we are still worried they might be "inspired" by the parts they see and get their internal teams to replicate across other sites or use cases. And we don't have the resources to litigate, nor any way of knowing they do this.

> Is this a business unit head sniffing for new things to build?

These are both hypotheses based on observations, but they are not conclusions to the situation.

The OP's company needs someone who has experience in this realm (a "grey haired enterprise software sales guy") to help them come to a conclusion before they make decisions to move this forward. There are way too many variables and open questions to offer any real sound advice.

TL;DR - for situations this complex and high stakes, relying on HN comments will not bode well.

For the record since someone asked in another comment, I have navigated a procurement process with a financial services company with $1T in AUM, working on a $50M/year software project, as an example of experience.


I'll offer a different pov from many other comments.

I work for a Fortune 50 basically doing web server stuff. Right now our security team would like to run some startup's code synchronously as a module in our web server.

Their code could easily cost us millions of dollars (if the outage was small). I need to make sure their SDK is free of race conditions, and has proper timeouts and throttling and has proper metrics.

If your product can interrupt billions of dollars in revenue, I'm going to need some assurances and "we're really careful" doesn't count.

Of course they are free to say "no", and we can go our separate ways.

Because we're public we can't always tell vendors how much money they've cost us, so when they accidentally point their prod to Dev and cost us millions of dollars all we can do is try and get a credit on this month's bill.

I've described a pretty specific use case, but there are other more generic possibilities. It's great that you have magic data science sauce for scoring customers, but when I get sued for racial profiling "I didn't know they used race as a factor" might not save my ass in court.


Just run it in a separate process with a seccomp sandbox, using pipe/socket IPC with a timeout (if it's some anti-DoS/intrusion thing, just fail allowing the request, since it's going to have false negatives anyway).

The IPC cost should not be significant compared to the rest of the web server code.

Can also ask them to do the work and provide a small open source in-process shim that sets up and talks to the sandboxed process.
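
A minimal sketch of the pattern described above, in Python rather than whatever the web server is actually written in. It skips the seccomp sandbox entirely and only shows the separate process, the timeout, and the fail-open behavior. The name `check_request`, the convention that the vendor process prints `deny` to block a request, and the toy vendor commands are all made up for illustration.

```python
import subprocess
import sys
from typing import Sequence


def check_request(payload: str, vendor_cmd: Sequence[str], timeout_s: float = 0.5) -> bool:
    """Ask a vendor check running in a separate process whether to allow a request.

    Returns True (allow) unless the vendor process answers "deny" in time.
    Any timeout, crash, or launch failure fails open, i.e. allows the request.
    """
    try:
        result = subprocess.run(
            list(vendor_cmd),
            input=payload,          # request data goes in over a pipe (stdin)
            capture_output=True,    # the verdict comes back over a pipe (stdout)
            text=True,
            timeout=timeout_s,      # the child is killed if it takes longer
        )
    except (subprocess.TimeoutExpired, OSError):
        return True  # fail open: a slow or missing check must not block traffic
    if result.returncode != 0:
        return True  # fail open on a vendor crash as well
    return result.stdout.strip() != "deny"


if __name__ == "__main__":
    # Hypothetical stand-ins for the vendor binary.
    hanging_vendor = [sys.executable, "-c", "import time; time.sleep(5)"]
    denying_vendor = [sys.executable, "-c", "print('deny')"]
    print(check_request("GET /", hanging_vendor, timeout_s=0.2))
    print(check_request("GET /", denying_vendor, timeout_s=10))
```

Spawning a process per request is only for brevity here; a real setup would more likely keep one long-lived sandboxed process and talk to it over a socket, with the same timeout and fail-open rule.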


I'm surprised your comment is showing up grey (downvoted). It's a perfectly reasonable solution. I'm considering adding a new "remote" plugin type to our server for running hooks in sidecar processes. It wouldn't be my first or second choice but it makes lots of sense if you have to talk to something that might have bad p99 response times.

Please feel free to reach out to me (info in profile) if you ever want to have a chat or are looking for some fun work.


It's not only the idea of "running this code might be dangerous". It's also "depending on this code to do something we need, it might suddenly drop dead and we'll still need it".

It's a question of devops- if this thing dies, or changes in negative ways, do I have the power to fix it internally? Or am I calling an outside company during business hours only to request that they please look at my problem?

That's the danger large companies face when deciding if they should depend on an external vendor vs build it internally.


After my first skim of your post, I thought you might have had a huge misunderstanding of what a "race condition" was.


Are you going to audit all of their code changes from now into perpetuity as well? It seems like you kinda alluded to the thing that would actually be effective, which is not synchronously calling external services in high-uptime-requirement applications...


Sure, why not?

In such environments you anyway wouldn't deploy a new version from a vendor without extensive testing, so it's not something you do frequently and you can re-audit at the time. If you're paying a few million per year for a licence, then it's not an issue to spend $10k worth of manpower per upgrade to ensure that it goes properly.


I'm not sure what you mean here by external service, the solution you describe is just "build everything in house" which is possibly not reasonable.


The solution is to build in-house or to acquire the vendor for any systems you need to run synchronously inside your multi-billion-dollar revenue stream. Which doesn't seem unreasonable, and seems like a good motivation to architect that stream so third-party products don't have to be run synchronously.


If I read the comment right (a big if, I'm terrible at interpreting sentences that end in the middle with a storm of dots and an implication), then he/she is implying that obviously we never considered running code asynchronously.

If so, this implication was inaccurate.


I’m not saying that it wasn’t considered, I just don’t know that a one time code audit offers sufficient downtime protection for me. If you’re worried about an external service being in the critical path then taking it out of the critical path seems like the only foolproof approach.


Not if it really is responsible for keeping billions of dollars flowing.


fantastic question. Just because the software complied with the requirements at one point doesn't mean it will in the future.


True, but I'm thinking if you can tell it was written by people who know what they're doing, odds are the company will continue to have people who know what they're doing.


In this situation at least it’s a 5 person company so the flux that they’re going to have to go through both with people and tech is gonna be pretty large. If you’re trying to gauge the future you could probably do the same / better job figuring that information out by having technical deep dives and researching their key employees and their past track record. Personally the source code is not the way I would want to gauge that information in the first place anyways, it’s a logistical nightmare and overkill.

Also to be more helpful it’s not unreasonable to ask to speak with current customers, seek proof of claims on stuff like SLAs and uptime, etc. There’s literally a million ways people do this in the real world without obtaining source code.


It's funny seeing this. I work in one of those big corporates and maybe I can share a little about how they work. From what I've seen this could be completely normal or a huge red flag.

We recently looked at partnering with an analytics company for a credit scoring solution. The idea was to use their SaaS and have a contract in place regarding use, pricing, up-time etc. (corporates will want that). When we kicked off the procurement process internally an army of people got involved. There was a project manager and a number of lawyers. I believe there were audit and compliance people there too. They ended up creating a project just to figure out what needed to be figured out. Here's where it gets helpful. We couldn't even share what we needed to know with the SaaS team or product owner because that could compromise the investigation. In this case, the corporate's concerns were genuine, numerous and required rigorous answers.

In another case I watched an exec try to figure out how a KYC solution provider had achieved the level of accuracy in identifying ad-hoc documents in the hope of solving the problem with internal resources.

If I were you I would not send them your code. Rather, say you would not be closed to them asking questions and you answering by showing code, as long as it's not the secret sauce.

Just know that for our scoring problem we actually had to see how the decision was being made because we're subject to a ton of legislation around that.


I ran a SAAS company for 8 years, going from zero to double-digit million rev per year.

The answer is a clear no. They can PAY YOU to make custom plots/charts/reporting or run queries if they want to understand what it does better. There is almost always a way to achieve any business goal without requiring source code.

The only case I can think of source code needing formal verification by a third party is if you're targeting drones to kill people or government jailing people. That doesn't sound like a commercial company in any event.


Have you actually dealt with enterprise sales? Source code is a fairly typical request, and there are software escrow and audit companies specifically set up to address concerns.

Microsoft, for example, gives source code access to paying enterprises and governments under the Shared Source Initiative specifically for security vetting and other auditing purposes.

OP: Consult a lawyer who specializes in these matters.


> Have you actually dealt with enterprise sales? Source code is a fairly typical request

Not the parent, but I have done deals like this and I feel the need to counter your sentiment.

Sure, 3rd party security review, escrow, etc etc is normal.

However, this is a massive red flag for me:

>Their justification is: "we want to see how your algorithms made their decisions."

I mean this is straight out of an episode of "Silicon Valley". The OP's entire product value is in "how your algorithms made their decisions" and this is not something you want to expose to anyone unless perhaps they are about to acquire your company.


I've seen my company ask countless times for source code and your financials. If your financials are iffy, you might not even get past the RFP process.


Exactly, software escrow is for this type of need. Your legal team should know how to handle.


I agree, although if the company asking is not engaging with an aforementioned auditing company (Mitre is the one I've dealt with the most), then that would be a red flag to me.


Yes, including accounts > $1M.


You could also give them the source code on printed paper so they can't outright steal it, and also make them sign an NDA for a fee, with a much larger fee if they break it.


> targeting drones to kill people or government jailing people. That doesn't sound like a commercial company in any event.

https://www.lockheedmartin.com/us/what-we-do/aerospace-defen...

http://www.corecivic.com/


Isn't security a potential concern? If they're selling analytics software ostensibly they have access to a lot of the customer's data.


I have run into requests like these twice.

The first time, the big gorilla company liked our product and they wanted it for a core process of their business. They knew we were a small startup and they wanted to be sure we were doing things properly. So they asked us for a full audit of the code by a third party company. This external company was a big consultancy and auditing company and they ran something like a 'due diligence' process on us and our code. Extremely professional and clear. They never had access to the full source code, only access to pieces they requested and 'only for their eyes' during a short period of time and always supervised by us. Since we wanted to close the deal, everybody worked hard to pass this test. Later on, we signed a source code escrow agreement.

Several years later, I was a working partner for a different company and same situation with a big Telco company. Since I had the experience I was involved in the deal, but this time this company did not want to play fair and it was clear that they wanted to copy the 'magic' of our solution. Their R&D division asked for the source code and they wanted it all the time they needed to study it by themselves. It was crystal clear from the early beginning they wanted to copy us, so we did not sign the deal. Then they tried to hire the core members of our development team for their R&D division.

So the lesson here is any company can screw you if they want and you don't protect yourself. You have to take risks when you are a small company, but your code and core team are sacred. I think most companies prefer to be nice guys rather than being a gang of bastards, but it's good to be cautious.

Find a good lawyer, or better, get a good advisor on your company's board to advise you on how to deal with this kind of negotiation. And never say 'no'; say 'yes, but we would like to do it under our conditions'.

Good luck!


> this company did not want to play fair and it was clear that they wanted to copy

Were there other signs they wanted to copy besides "asked for the source code and they wanted it all the time they needed to study it by themselves"? I'm curious about the red flags to look out for when negotiating such deals


The first company was very clear about its concerns and came with a list of what they wanted to know. One thing they wanted to see with their own eyes was that there was no violation of any license, especially the GPL.

The second company (Telco) was the opposite. The requirements were vague and its R&D team felt attacked because a local division of the company in the UK chose us over them. So I listed what they could see of our code and how (white room, our computers, no network, partnering with us all the time...) and I asked for a scoring method based on their requirements so we would know whether or not we passed the test. They rejected all our conditions. They wanted the code to give us the “blessing”.

The funny thing about this Telco is the UK branch signed the deal no matter what R&D said. As I said R&D tried to hire our Team. They failed. Then they spent 2 years developing a copycat of our solution to replace us in UK. And when they finally were ready, the cost and impact of transition to their solution was too high, and the copycat solution was abandoned.

How to know if they are good or bad guys? It’s hard. When both parties want to sign a deal everybody gets involved and pushes towards the goal. Also, connect with companies that did that before so they can tell you their story. For example, the Telco had an “extractive” reputation and people warned me.


This is just politics.

Find a way to say "yes", which satisfies their need to hear you say "yes", but your "yes" conditions mean they need to spend money (which they won't want to do), and further conditions, even if they do, as other commenters have suggested, make the process dysfunctional.

Watch our politicians in government handle any issue. They are masters of saying "yes" and delivering "no", which makes people feel like they got "yes".


Slightly off topic, but....

> Find a way to say "yes", which satisfies their need to hear you say "yes"

This is the best skill to train if you're going to be working with enterprise clients.


Winning government work (which is probably similar to large enterprise) is all about ensuring all of the government tender requirements get a "yes" tick when considering your product, but then ensuring that your fine print gets you out of all the unsavory things you had to agree to for those "yes" ticks. You can afford at this stage of the process to underprice and beat your competition. Note that the thing you underprice is the fixed price commitments that the government will change anyway, invalidating your fixed price commitment and instead switching over to the overpriced fine print pricing (see below).

You might wonder how to do that without them noticing that you are backing out of all your commitments. You in effect force them to agree that the fixed price commitments that you make are only valid if the government does their bit of the project, and you lock down exactly what their bit is. Anything outside this clearly defined scope is a "variation" or "change".

You do it by identifying the grey areas and putting a condition on each grey area.

For example there might be a requirement in the tender that you'll deliver some report in a week, or build some software function in 3 months at a fixed price. You would have some fine print to say "if all client staff are available to contribute, if all systems are fully available for review and if all approvals are gained. 1 week project $2,000 additional hours $220 per hour". You know it's going to take far more than one week, now you are going to get paid for it, but you charged only $2,000 for the initial week which was likely cheaper than the other tender submitting companies who you are competing with. You just got them to agree to pay you $220/hour, and they didn't realize that most of the project will actually be carried out under this bit of fine print because they didn't meet their commitment to their side of the project - bingo!
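
To put rough numbers on that fine print, here is a small sketch. The $2,000 fixed week and the $220/hour rate come from the example above; the 120 hours of overrun is a purely hypothetical figure, and `billed_total` is just an illustrative name.

```python
def billed_total(extra_hours: float, fixed_fee: float = 2000.0, hourly_rate: float = 220.0) -> float:
    """Total billed: the fixed-price week plus any overrun billed hourly."""
    return fixed_fee + extra_hours * hourly_rate


# Hypothetical overrun: the client's delays add 120 billable hours.
overrun_hours = 120
print(billed_total(overrun_hours))  # 28400.0
```

Under that assumed overrun, the hourly fine print accounts for $26,400 of the $28,400 total, which is the point being made: the headline price is a small part of what ends up billed.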

Your client can't disagree with the grey area conditions because it's only reasonable that if they require the report in a week that they do their side of the project work and if they don't then you need to be paid for the extra time it is taking.

Your client will be optimistic about what they can do when it comes to what they need to contribute to the project. Take advantage of that optimism and get them to be paying you when it takes them longer than they expected to do their bit.

The real money is made in these additional terms because in many cases all the initial conditions of the contract become irrelevant for various reasons and thus the terms are dictated by the fine print that you defined, that the client is not paying close attention to during the tender assessment process because all they care about is ticking off their predefined requirements, which you already made sure you get a big "yes" tick on. This is how big companies make scads of money from government consulting contracts. They know to make their initial tender extremely appealing to the client and their fine print extremely lucrative.

Make sure you have a good project manager whose job is mainly to track actual against contract commitments and get signoff when moving to hourly instead of the fixed price commitments.


You're exactly right and that makes me feel sick about how ugly the world works.


All business is a variation on this. If you're wanting to be a business person in a "nice" world where it's not all about manipulation and sleight of hand then you can do that but you won't be mega successful.

Read up on Bill Gates Windows license contracts that the hardware manufacturers had to sign .... genius.

Truth is all in how you tell the story.


I remember the day a friend explained to me that MS Word, Internet Explorer, and MS Frontpage were the same DLL just with different entry points.


What? That seems both wrong and meaningless. You can compile as many programs as you want into a single DLL. Doesn't mean a thing.


It wasn't N programs packed into a DLL, it was one program (in a DLL) that acted like three different programs depending on how it was "started".

They were selling the same software as three different products.


You consider MS Word and Internet Explorer in the same DLL to be "the same software" rather than 3 different pieces of software packed in the same DLL?


I'm just telling HN what my friend told me. It's not as if I've ever examined the file, y'know?


Your friend was lying to you.

Source: I'm a former MSFT employee, and I've spent WAY too much time working in the Office codebase.


Fully agree!

I used to run an enterprise software business for many years and we learned quickly to always just answer “Yes”.

Especially with requirement lists clients would send us...much of that stuff didn’t make any practical sense and it was clear that whoever wrote them had no clue what they were doing...answering “Yes” got us into the door and we always ended up delivering what they actually needed at the end.

Just make sure to charge for every Yes they make you deliver :)


This is a complete no-no. There really is no justification for this whatsoever.

What does "cover their bases" mean?

Ask them to explain what they are trying to achieve and find other ways to assuage their concerns.

The only legitimate thing is to have something in case you fail and they have "banked" on you. There is a legit way to solve that. Basically, if they want that, tell them they should pay for an escrow service that will hold your code, and they would receive it if you ever cease to exist. But it's important that they would need to pay those fees (they are significant).

That should make them back down.

It's entirely unreasonable for them to demand access to source code.


> It's entirely unreasonable for them to demand access to source code.

The government and large corporations apparently disagree, or Microsoft wouldn't have their Shared Source Initiative. And for any nay-sayers in the crowd, that should be all that need be known: Microsoft thinks it's okay, and they have a lot more to lose than you do.

> The only legitimate thing is to have something in case you fail and they have "banked" on you.

Which might be the exact thing they're trying to avoid. If your "machine learning" algorithm amounts to a bunch of nested if statements, they'd probably rather not "bank on you" in the first place.


Bringing up Microsoft is neither here nor there.

Microsoft made the decision to create Shared Source Initiative only after they were very successful and only after a very, VERY detailed cost vs. benefit analysis.

Furthermore, they are handsomely paid for it (e.g in order to be eligible, you need to pay for at least 10k Windows licenses as per https://www.microsoft.com/en-us/sharedsource/enterprise-sour...) and it's ultimately up to Microsoft to grant/deny access.

And since we're talking about Microsoft, in the early days they were infamous for pumping competition for technical information under the guise of due diligence and then crushing said competition by developing competing products.

The person who asked the question is clearly not at the "successful monopoly" stage as Microsoft but more in the "there are legitimate concerns someone might steal our core ip" stage.


Microsoft also has enough legal resources to assure recompense for any license or ip violations...


Well, I presume that MS does not provide access to code to direct competitors. Which is OP’s concern: the client is building an ML team and so on, and they ask for access to the code.

National agencies such as the FBI are not a threat to become a competitor to MS, unlike OP’s client.


I work for an enterprise company and we demand the source code for machine learning models from vendors all the time. This isn't like asking for the code to Excel. It's a model derived from our data that has likely no use for anyone but us and is highly susceptible to misunderstandings of the data. We absolutely need to verify your work.

Models are the output of the process. It's like going to graphic designer for work and them not giving you the PSD file. It's just not acceptable.


> You clearly have no idea what you're talking about.

I have no idea what I'm talking about in this specific niche either... but does it normally go over well when you start a conversation that way?


The parent comment's author has this in the profile:

>about: CEO at http://www.3scale.net

I'd venture to say there is at least some qualification to answer here. </sarcasm>


I have removed the comment.

It's just frustrating that this poor startup is going to make a monumentally bad decision based on people in here who have zero clue what they are talking about.


I don't think people who start conversations like that are able to effectively gauge other humans' reactions, they're kinda inter-related lol.


I think you make an interesting point, ml models and algorithms are two different things.

I also think it's reasonable to ask for a model so you can test and validate it yourself.


By this logic the said designer should also get Photoshop source code from Adobe.


I want to plus one everyone saying no. All of their reasons make sense to me.

In addition, I'd love to help your confidence with getting used to the idea that saying no is ok. I remember being in this situation many times and feeling like if we said no we'd lose the client and maybe go out of business.

But truly, clients often ask for things that aren't very important to them and they will not mind being told that the answer is no.

The mistake you can make here is to blow up your no answer into a big deal.

The term I was given for how you should respond is "the principle of the simple explanation." People are totally ready to believe you, so just say no in the simplest way possible.

In the middle of some other response just include a line like:

"Re: source code access. No, our source code is proprietary and we don't share it."

Or even more simply "No, we don't share access to our source code."


Your "no" answers remind me of an experiment I read about where they determined people were more likely to accept a request if you included pretty much any justification, even when totally self evident.

E.g., people are more likely to let you cut in line at the copier if you say 'can I cut in line, because I need to make a copy'. Even with an inane excuse like that (everyone else there also needs to make a copy) people are more likely to accept.

So your first example is likely the better one, even though the reason given is basically void of explanatory content.


To be honest I might be at the copy machine to make a copy but I might not actually need to make it then and there. I might be able to re-print it, or do it later. When someone tells me he needs to do that then I think maybe it's more urgent than mine. I wonder if there are subtleties like this that explain these in reality.


But that is the case for almost anything. Exceedingly few things are things you need to do right then, right there, or even at all.

For some it is more important, and they won't yield, but some proportion will do something because the cost of doing it is low enough that there's no reason not to.

In that case giving a reason creates a cost in persisting in that you need to consciously think through whether or not it's worth arguing over, and if so what to respond. And sometimes that is enough that you'll just shift your focus to something else and not bother.

I have no doubt the effectiveness will vary greatly depending on type of action, and that it will vary greatly depending on other factors tied to why people wanted to carry out a given action in the first case, of course, but I also think people ask for or do a lot of things where they have not really made a conscious decision that they need it, just decided on a whim that there's no reason not to ask.

Especially with group dynamics involved, it might have been as simple as someone asking a question in a meeting ("do we need the source?" for example) and someone deciding "might as well ask for it, just in case", with nobody actually caring enough to defend it if you stand up to the demand. In that case the need is low enough that even a totally inane excuse for saying no might stand up even if it on the surface looks like a big, important question.


This. The request may be a big deal to you, but it isn't necessarily a big deal to the client. Don't go big on the ultimatums and alternative plans until they indicate that it is a deal-breaker for them.


If they are trying not to end up in the situation where you go out of business and they are screwed, offer to escrow the code for a certain amount of money. If you go out of business, they get the code. If you do not, they don't.


If the out of business scenario includes the possibility of acquisition or asset sale, code escrow is a significant factor lessening the value of assets or killing the deal entirely. Stated differently, it's not what a buyer wants to find in due diligence.

The $50+bn company might even be the potential buyer, but under code escrow they just get it for free (less whatever was negotiated up front.)


Typically these are worded as part of support & maintenance and are triggered by failure to meet that obligation. Acquisition, merger, spin-out, JV or bankruptcy would not trigger it provided support is maintained. Also, the force majeure clause typically allows some reasonable timeline to reinstate support before it is triggered.

I would think the positive of a large client greatly outweighs the liability of the code escrow.


Rubbish.

Structure it as 1 year, 2 year, 10 year whatever the hell deal you want. If you are a $50BN company ( you aren't ) you simply fire that customer.


Even source code escrow should be resisted. "Out of Business" is not always clear and if you are trying to get acquired your value drops dramatically if major customers have your source.


True, the code escrow needs to be written correctly. Done correctly it only kicks in if the code goes unsupported. You can be acquired so long as the company buying you continues to support it at a "reasonable price". If they don't support the code for a "reasonable price" they didn't buy you for the code anyway so it won't matter. If they do support it, then the escrow doesn't come into play.


I had a customer ask (we have an escrow clause), "What's to stop us not paying your bills (for support and licensing) until you go out of business and then getting the source code?"

We pointed to (one of many) conditions that said that escrow would be only released to customers who were in good standing with us -before- the escrow event (and that wasn't transferable, as in they couldn't settle accounts with trustees, acquirers or the like - similar to 'not being able to buy retroactive insurance').


Ask your attorney.


I guess reasonable people can disagree. It's a business risk calculation not a point of law.


If you think "out of business" can't be defined adequately, it's a point of law. That's what lawyers get paid to do for you.


Hi, I'm the founder of https://bitbank.nz, a trading prediction and stats platform that uses machine learning to predict cryptocurrency prices. This seems very similar to problems we have been facing.

After a quick call with a massive customer and walking them through our forecasting strategy and code we saw an abrupt end of communication after that!

Brain rape like something straight out of a silicon valley TV show https://www.youtube.com/watch?v=JlwwVuSUUfc

After giving away our secret sauce they simply cut all communication and one can only assume they are implementing their own version of what we have now...

If they are such a huge customer, they should be prepared to pay like everyone else, provided you can prove from your predictions/charts that your algorithm's performance is solid.

Give them a short free trial but be careful not to give them too much for free.

We now only offer a 1 day free trial, and the value should be obvious after that. Start with a crazy price and slowly drip feed discounts, product features and trial extensions like you would market to a normal customer. If they are going to invest time doing any custom integration with your APIs etc., then why can't they invest money upfront too?

It's easy as a scientist not to take a strong sales stance, but you're worth more than you think!


There are more likely reasons they might have cut you off other than "brain rape", specifically 1) They didn't like your code or your forecasting strategy. 2) They didn't like you, didn't trust that you could get it done. 3) They chose a competitor's product. 4) They canceled the project.

I work for a 50,000+ person company. I do software evals all the time, not source code but still. When their sales people call me, I ignore them. I'm very busy, if I want to buy I'll call you.

My group (200+ devs) starts and cancels new projects all the time. No one has time to call every losing vendor and tell them they didn't make the cut, or that we decided not to continue the project.


This is a simple case of "Big clients expect to own their little consultants."

What do you do if they ask for customizations? What do you do if they request features that are not on your roadmap? What do you do if they ask for a large discount?

How you decide those cases determines how much your business is like a startup moving fast and not negotiating over source code disclosure and how much your business is like an enterprise consultancy that prices out each request for one off work by their clients. There's no right answer. But clarity about how a particular client affects your business model and whether or not that client is worth having right now is important. Are there other potential customers who are easier to service?

It is not clear from the question whether this request is part of ongoing contract negotiations or presented as a prelude to negotiation. If the former, put a price on it. If the latter, it smells a bit like a "no" in the form of a "maybe" or window shopping without much intent or someone higher up with purchasing authority putting on the brakes. Figure out an expected value of the client by assigning probabilities to various size contracts closing at different points in time.
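A rough sketch of that expected-value calculation. Every figure here (the probabilities, contract sizes, closing times and the 10% discount rate) is a made-up placeholder, not a number from this thread, and `Scenario` and `expected_value` are hypothetical names:

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """One hypothetical outcome for the deal (all numbers are illustrative)."""
    name: str
    probability: float      # chance this outcome happens
    contract_value: float   # total contract value in dollars
    months_to_close: int    # how long until the contract closes


def expected_value(scenarios, annual_discount_rate=0.10):
    """Probability-weighted contract value, discounted for time to close."""
    total_probability = sum(s.probability for s in scenarios)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    value = 0.0
    for s in scenarios:
        # Money that arrives later is worth less today.
        discount = (1 + annual_discount_rate) ** (s.months_to_close / 12)
        value += s.probability * s.contract_value / discount
    return value


scenarios = [
    Scenario("no deal", 0.60, 0, 0),
    Scenario("small pilot", 0.30, 100_000, 6),
    Scenario("full rollout", 0.10, 1_000_000, 18),
]
ev = expected_value(scenarios)
print(f"expected value: ${ev:,.0f}")
```

Comparing that number against the cost of the sales effort (and against the same estimate for easier customers) gives a crude but explicit basis for the decision.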

Good luck.


No way in hell would I show my code if that was my gold. But if you don't get the business otherwise, you have a choice to make. In my experience, people and corps are bastards and will say anything to get what they want. Don't be a sucker, be wise.

Provide a counter offer. If they want to see your code, counter with "we will show our code when you provide a 10 year contract with X dollars per month/year". Then I would do it.

They are covering their ass, so cover yours. People and businesses who are strong, respect others that are strong.

All my clients will rip me off if I let them. ALL OF THEM. They can't help it. If you asked your cable company "can I have free cable for a year?" And they said "yes", wouldn't you take it? That is what is happening to you. They are testing you.

If you are desperate and foolish, they get your code for free. If you are brave and wise, they get a good partner.

Take a leap of wisdom.


Most of the people in this thread clearly don't understand the situation as Analytics/ML is quite different to normal IT.

1) The company is not going to acquire you nor will they sue you.

2) They absolutely want to replicate what you've done. It's not so that they can build their own startup and compete with you. It's because they genuinely want to understand how you got the results. Companies are increasingly basing their decisions on ML models and it's not acceptable for them to just "trust your black box". Especially since your model could be based off a misunderstanding of the data.

3) If you decide to say no then chances are they will walk away. Especially in the ML space companies are not going to let their core intellectual property be locked away in someone else's vault. They will simply not work with you.

4) The best way to handle this is simply to get them to sign a NDA or something equivalent that protects you in the case that a rogue employee decides to build a startup him/herself.

I am someone who works in ML for an enterprise company. For god sake give them the code and be grateful they are even doing a PoC with you.


Best answer here. I am in the exact same situation BUT I actually share the code from the get go because of the need to understand decisions (and GDPR is going to make that even more important). If the OP company is doing something that is not possible elsewhere then fine, but if they are just allowing them to go to market faster, then the effort to re-engineer and get working (particularly in the AI/ML space) is pretty time consuming. EDIT: I think that proprietary algorithms as moats are an old way of thinking. Things are moving SO fast now that your business is going to be outdated with the next release of TensorFlow or whatever is over the horizon.


This is a massively important point (if you're selling to the EU) - as part of GDPR companies need to have a way to audit the decision making process if a decision is being made by a computer.


What happens if they have their own scientists take a look at the model and decide to replicate it internally without paying a cent?

Knowing the sorts of decisions that were made around what types of algorithms and features to use (obvious after a cursory examination of the code) would give them a huge leg-up in developing a similar model.


Everyone in here is generally giving you the right advice.

If you want someone to help specifically advise, guide you through the process, and potentially represent you in negotiations, feel free to get in touch and I can probably help.

By way of background, I primarily work with investors who are selling middle-market software companies to larger companies (over 80 or so personally and oversaw another 80 or so, over the last 3 years). I have a lot of insight and experience into contract negotiation from both sides of the table here. I've also worked for/with the big enterprise software players (SAP, Oracle, Microsoft, etc).


I like your “everyone in here is generally giving you the right advice”. At the same time there is lots of contradictory advice.

To me it’s clear that there are lots of possible scenarios with lots of possible “right” decisions. Without more info, I doubt anyone can give better advice than what’s on this forum.

Certainly a professional should be able to give better advice but only after they get more info on the specifics.

Good luck!


If their justification is "we want to see how your algorithm made its decision", you should perhaps have your model output more stats and info about that.

As a user of data tools myself, I am generally suspicious of any black box model, and would like to understand the model well before using it. For instance, if your model is a deep neural network, I'd like to know the structure of the network and the activation of the layers when I run my data through it.

If they're really interested about the guts of the model they will agree to a solution like that. Having the source code will certainly not help them understand this as it is highly unlikely that anyone will dig into this.

Note that asking for the source code is fairly common practice in finance because people generally distrust black boxes (at a firm I worked for they specifically chose MySQL over some other tools because they had the ability to look through the code if they needed to).

There is actually no point for them to look at your code...

You can agree to something where if you go bust, then you'll give them the source code.


That is the very problem with ML, it comes up with answers but provides no insight as to how. For example, you teach a vision ML to detect balls of various colours in pictures, then give it pictures of battleships - it might see the bubbles in the wake as balls, but won't tell you there is a great big hulking grey thing in the picture.

Escrow arrangements are common to protect clients from software vendors going bust.


Options:

1. Ask for a giant pile of money for the privilege.

2. Come up with visualization that answers their question without giving them access to source code. Presumably "how do we know we can trust results" is a common problem.

3. Walk away.


And a possible 1a: Tell them you'll be happy to give them your source code, and everything else to boot, if they'll buy your company.

This way you get to jump straight to your exit, and can maybe also structure the deal so that you and your employees get jobs at the BigCo (if you want them). Feel free to set the selling price as high as you like -- they would want the deal more than you would, so you have a lot of leverage in the negotiation.

And if they balk? Hey, they asked for terms to see the source code, you gave them terms! It's not your fault if they don't like them :-D


Good answer. No, they don't get to see your source code without buying the rights to it. Otherwise you're going to end up in trade secret litigation after they steal something.


maybe what they want is just white box explanation on your models?


This is all it is. I ask for explanations every week from vendors.

And everyone here is going on about them wanting to buy the company or acquire licenses to the code. It's just bizarre.


my .02 from having been CTO at a SaaS analytics company for a long time: "we want to see how your algorithms made their decisions."

that's great product feedback. and a problem you need to solve. You shouldn't solve it by handing out your source code.


In ML, that's a problem lots of people want to solve!


I've read most of the comments here and I think many of them are off track.

You need to take this approach when dealing with a large potential customer : let's not get into the weeds of what you are asking for, but rather tell me what the underlying need is. That might, for example be : "we'll rely on your code to secure our customers' sensitive data and so we need to take steps to ensure you are practicing industry best processes for security". That need could be addressed by having a third party review your code and processes, protecting your IP.

In your case they said :

"we want to see how your algorithms made their decisions."

For me this would be a "Hell no", perhaps put more politely. You're building a product that has value embodied in those algorithms. The customer is paying you for that service. Therefore you should take absolutely no steps to tell them how it works. There is only downside.


I used to work for a smallish (<50 headcount) company that sold high-value software with a LOT of secret sauce in it (image processing algorithms) mostly to telco, wireless, and MSOs. The primary reason those folks wanted the source code was as insurance that if, as a smallish company, you went out of business, they would be jolly-well-rogered if they had deployed your stuff in the middle of their mission-critical revenue-generating operations. Our solution was always to use a mutually agreed third-party source code escrow service such as Iron Mountain. The only escrow release triggers were company insolvency/bankruptcy, or refusal to meet the SLA of support requests for an extended period of time (the -Off Clause).


You could also deposit the source code with a third-party notary, to be sent to them in case you go bankrupt, and also release your product as open source when you go bust.


Unless downtime of your application results in money lost for them (as opposed to bad decision making) then I would probably say no to this. If it comes down to "Is the data science good enough?" that is basically a sales objection. If you can, talk about success with other companies and your track record. There are legitimate reasons for a customer to need to see your source code; verifying that you can do your job is not one of them. Treat it like a sales objection.


> 1) Has anybody run into a request like this? How would you respond?

Yes. Satisfy their justification without showing code. It is possible to show how algorithms made their decisions. Make this part of your product. You already noticed there is a demand for it and that delivering a black box can be a deal-breaker. So read up on LIME, decision paths, interpretable models on black box output, etc. and give them the capability to see how an algo made its decision.
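A minimal sketch of explaining a prediction from the outside, in the spirit of perturbation-based explainers such as LIME. To be clear, this is not the LIME library or its algorithm; `black_box_score`, the feature names and the baseline values are all hypothetical stand-ins:

```python
def black_box_score(features):
    """Stand-in for the proprietary model; the customer never sees this body."""
    return (0.5 * features["usage_hours"]
            - 2.0 * features["support_tickets"]
            + (10.0 if features["on_annual_plan"] else 0.0))


def explain(model, instance, baseline):
    """Score change when each feature is reset to its baseline value.

    A large absolute change means the feature mattered for this prediction.
    """
    original = model(instance)
    contributions = {}
    for name in instance:
        perturbed = dict(instance)
        perturbed[name] = baseline[name]
        contributions[name] = original - model(perturbed)
    return original, contributions


instance = {"usage_hours": 40, "support_tickets": 3, "on_annual_plan": True}
baseline = {"usage_hours": 0, "support_tickets": 0, "on_annual_plan": False}

score, contributions = explain(black_box_score, instance, baseline)
print(f"score = {score}")
for name, delta in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>16}: {delta:+.1f}")
```

The explainer only ever calls the model, so it can ship as a product feature without revealing how the score is computed. For models with strong feature interactions, one-at-a-time perturbation like this is only a rough guide.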

> 2) How likely do you think their goal is to genuinely "see what happens under the hood" as opposed to replicate in the future?

Unlikely they'll replicate. It would set them up for legal problems. Depending on how deep your moat is (training data, novel optimization techniques, encoded domain expertise), they probably wouldn't even need to see source to replicate in-house. It may be more about not being hood-winked, paying top dollar for a product that does a few imports from open source libraries.

> 3) Are there any legal protections we can put in place to prevent them from not just copy-pasting our code, but also from "learning from it" or so?

Not that I know of. Perhaps you could charge extra for the code review, so in the case of "learning from it" they'll at least pay for it. A thought exercise: did you learn from open source/open research/commercial solutions before building your Proof-of-Concept? If no, then they don't need to either (provided they can hire the talent), if yes, you are like a thief who is worried they will steal from you :).


You can find templates for proprietary information agreements at EveryNDA:

Defeat the Confusion: Confidentiality v. Non-Disclosure

https://everynda.com/blog/confidentiality-v-non-disclosure/

Examples of Microsoft's Shared Source licensing can be found as well. These contracts are typically reserved for heavy hitters with enhanced security or performance requirements: FBI, JP Morgan, etc. And of course Microsoft has open sourced large portions of its own dev tools and SDKs.

Microsoft Shared Source Initiative

https://www.microsoft.com/en-us/sharedsource/

I think what you may begin to realize is that it's their alternative data that represents the mother lode, and that it's not your algorithms but your level of service that will differentiate you. The insights mined from that alternative data may be so valuable as to outweigh your other concerns. And gaining access to it might be the paramount mission for your startup. As the executive, ultimately it's your call. Good luck!

https://blog.quandl.com/category/alternative-data


Are you rolling out your product as a SaaS offering, or is this something you're planning to license to people to run on-premises? If the latter, I expect that you'll get lots of requests for source, either for inspection or for escrow.

Personally, I wouldn't be scared of BigCo ripping you off. For the most part, large companies care a lot about staying on the right side of their contracts, and also it's generally really hard for a large company to out-innovate a startup. So I would be pretty surprised to see them steal your source (assuming you put in place an appropriate NDA etc.). EDIT: I'd be even more surprised to see them try to compete; the worst likely case is that they steal the source and stop paying you, not that they steal the source and get into the data analytics / ML business themselves.

However, I think that in the ML world, this "what the hell is the algorithm doing" question is a really common one, and it'd be super-worthwhile to invest in some sort of tooling to peel back the cover of the algorithm a bit. Validation of appropriate responses against future data is a real quagmire right now, to the point that some people are using ML to help find a solution, but then trying to re-implement the logic more traditionally once the ML algorithms figure out what to design for. I think there's something there, at least for a good subset of use cases.

Also, it's common for a large enterprise to require some sort of source code escrow if they do a big deal with a startup. Sounds like this is different than what they're asking for, since the source in escrow won't be available to them until the escrow conditions are triggered. Again, I wouldn't be concerned about signing escrow agreements, but I would make it a negotiation point, rather than a standard term.


Can you elaborate a little on this? The reason I ask is that my experience has been different. I have seen a lot of requests for code escrow (source is released to big company if and only if small company goes out of business) when a large company was dependent on a third-party product, but I have almost never seen a request to review source code (the one exception being selling into classified government environments where security considerations required a source code review).


I wouldn't say it's been the norm in my experience, but I've definitely been on the receiving end of requests for source (both when working at a large enterprise and at a startup). In my experience, this is much more common when selling something that will run on-prem or in process, however.


I'm willing to bet that the same company wouldn't ask Microsoft or IBM to see the source code of the software they buy from them. So I'd start by wondering why they can't treat your software, too, as the proprietary black box it is.

Is it because they're aware that you can't convincingly threaten them with litigation? Do they think you're too small to protect yourself effectively from the danger of IP theft?

If that is the case, then the answer is clear.

If the client actually has legitimate concerns- couldn't they ask you to run some specific tests, or make some experiments, and report the results to them? The amount of time spent to think of such tests should not be more than the amount of time needed to review your code and you could argue that examining the behaviour of your system can be more informative than looking at the source code.
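Such experiments can be as simple as property checks run against the deployed model. A hypothetical sketch: `predict_price` stands in for the vendor's black box, and the monotonicity property is just an invented example of something a customer might ask to have verified:

```python
def predict_price(square_meters, rooms):
    """Placeholder for the vendor's opaque model (invented for this example)."""
    return 1500.0 * square_meters + 8000.0 * rooms


def check_monotonic(model, rooms, sizes):
    """Report adjacent size pairs where a larger input gave a lower prediction."""
    violations = []
    predictions = [model(size, rooms) for size in sizes]
    for i in range(1, len(sizes)):
        if predictions[i] < predictions[i - 1]:
            violations.append((sizes[i - 1], sizes[i]))
    return violations


sizes = [40, 60, 80, 100, 120]
violations = check_monotonic(predict_price, rooms=3, sizes=sizes)
print("monotonic in size" if not violations else f"violations: {violations}")
```

The client writes the properties, the vendor runs them and reports the results, and nobody has to read the source.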


They likely won't treat your software the same way unless they can hold you liable for damages to the same extent.

With a startup I was working on, we managed to get meetings with the CEO of the world's largest fashion company and several others with $10b+ revenue. A universal issue, however, was that we were bootstrapped and lacked the funding to insure against their losses, which for them meant unchecked liability.


I have not run into issues like this, but I would at least make an attempt to meet them in the middle. NDAs and the like are not something I would be confident in, particularly given that this customer can throw a lot more money and effort into legal.

Regarding meeting them in the middle, I would put together a presentation that describes how your algorithm works at a high level. I'd do your best to split the balance between being transparent and focused on the customer, and not divulging what you consider to be differentiating parts of your implementation.

If you get push back for the above, I think you're dealing with either brain rape, or some pretty unprofessional contacts in your customer's organization. If it's the latter, you should do your best to navigate around those contacts.


Another data point, but I worked for a company that made an expensive DSS for a very lucrative industry. We showed a potential "partner" our code and how it worked and everything. They took our ideas and made their own product as a direct competitor. Reminded me of Apple and Xerox.

Definitely get legal counsel involved.

A possible way to protect yourself is print it out and put it in old fashioned binders and let them see the binder while you are watching. Not sure if that will fly, but it would be hard(er) for them to steal it. Tell the company your concerns (which are valid) and what methods they would accept. I don't think it's unreasonable for you to bring it up with them.


In the Apple & Xerox case, legal counsel was involved. Xerox actually got paid to allow Apple to see their PARC demos. And no source code was shared.

And in the end, the Macintosh was a huge leap beyond the Alto, that really didn't work much like it at all.


>the Macintosh was a huge leap beyond the Alto

I would say it was a significant iteration, not a leap. iPhone was a leap.


lavabit them


I used to work for a company that did model risk management consulting for large banks, and source code reviews were a standard part of what we did. What sounds different in the OP's situation is that it is the customer who would be conducting the review and not a third party. Take everything you read here with a grain of salt, but it would be best to consult a lawyer. Even if you hold the patents for what your software is doing under the hood, it may be difficult and expensive to sue in the event that your customer does simply copy your secret sauce.


Good point. If the OP company is doing something they can't, then how are they going to know it's fit for purpose? From my experience the big guys don't really have those skills in house OR the ability to organise said skills in a timely fashion.


My partner and I met with Google ATAP after emails and conversations.

From our experience, if I were you I would be wary. Google just wanted to see our secret sauce and, once it was revealed, kicked us to the curb.

They need to pay you or you walk away!


I remember reading something about that on HN; digging it up I saw that it was indeed you who posted that:

https://news.ycombinator.com/item?id=15422475


Yup just sharing my experience to help others who are so desperate to prove a market and or have the amazing opportunity to make a living building awesome tech by selling their tech to a big behemoth.

The OP says they are trying to show a market for their tech by winning this contract. After meeting with Google... all the tech companies came out of the woodwork, and one, Black & Decker, wanted to become a customer. They refused to sign any type of NDA or IP agreement, though, yet were willing to pay a very small fee. My mentor (the guy who created Xbox headphones) at the time said if they don't sign anything, forget it. Ultimately I did, yet it was the crushing blow to the SpeakerBlast team and me personally for a while.

I think the guy I met with is still the head of Chrome Audio, yet his boss at the time Regina Dugan was just in the news. She was heading up Facebook's secretive research lab, Building 8.


Edit: Pay you and sign a rock solid IP agreement, and/or just pay you tons of money!


Sam Ritchie from Stripe was a guest on This-Week-in-Machine-Learning/AI (TWiMLAI) (podcast) talking about explaining black box predictions.

https://twimlai.com/twiml-talk-73-exploring-black-box-predic...

TL; DR is basically you keep a decision tree in parallel to your model that carries with it long/short-form text that "explains" why the model does what it does.
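A toy sketch of that idea, assuming nothing about Stripe's actual implementation: a small hand-written tree sits beside the real model, and each node carries a human-readable reason. The `Node` structure, thresholds, field names and wording are all invented for illustration:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    """A node in the explanation tree (hypothetical structure)."""
    explanation: str                                # text shown to the user
    test: Optional[Callable[[dict], bool]] = None   # None marks a leaf
    if_true: Optional["Node"] = None
    if_false: Optional["Node"] = None


def explain(node, record):
    """Walk the tree for one record and collect the text along the path."""
    path = []
    while node is not None:
        path.append(node.explanation)
        if node.test is None:
            break
        node = node.if_true if node.test(record) else node.if_false
    return path


tree = Node(
    explanation="Checked the transaction amount first.",
    test=lambda r: r["amount"] > 1000,
    if_true=Node(
        explanation="Amount is above 1000, so the card's country was checked.",
        test=lambda r: r["card_country"] != r["ip_country"],
        if_true=Node("Card and IP countries differ: flagged as risky."),
        if_false=Node("Card and IP countries match: allowed."),
    ),
    if_false=Node("Amount is 1000 or less: allowed."),
)

record = {"amount": 2500, "card_country": "US", "ip_country": "FR"}
for line in explain(tree, record):
    print(line)
```

In practice the tree would be fitted to approximate the real model's outputs rather than written by hand, so the explanations track what the model actually does.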


Here's a somewhat contrarian view - if what you are making is so good (or unique/special/non-trivial) that it is valuable to more than just one huge customer, then just because they are huge and you are seeing dollar signs, doesn't mean you should agree to their request without due consideration.

They may attempt to compete with you or re-implement what you do, but again - if what you did is so good and perhaps non-trivial to re-implement (an assumption), then you should consider the value of what you have above their immediate demands. They might not have to be your "only" customer.

If you are trying to build a business to last on your own, you must ask yourself, is this the kind of customer I want?

Of course, you should consider who the customer is and their mission/vision/values/actions, what your and their goals might be in this instance, how to accommodate their request (like the suggestions for code escrow, etc.) while protecting yourself (surely there are ways through a competent IP lawyer).

Sometimes the big customer can break (or make) your company. I think your course of action depends a lot on what you have and what you want.

That they are interested in you so seriously is probably a good sign, as much as it is something to be concerned about. Consider the value of your use case and what it might be to others as well.

This might be me being naive, but, if they will be good to work with, and a good customer to have, they will be willing to work with you fairly. Otherwise, be cautious and consider the calls for legal advice.


According to their stated motivation, they're not actually interested in the source code: they want to know how your software makes decisions, presumably important ones on behalf of their business. The people who articulate such a request (management), as a rule, are not qualified to answer it by looking at the source code. If that is indeed their concern, they've probably filtered it through their internal development shop, who's come up with the idea that they could review the source code and answer. Management thought that was a splendid way to do it - no need to bother you guys with such tedious busywork (OK, probably slightly rose-tinted, but the general outline of the narrative is plausible).

If this understanding is what they're really after, then that's what you need to think about answering. Worrying about ML as an opaque black box is a bit of a thing these days, so it will probably come up with future clients as well.

If you answering this is not satisfying to them, and they keep insisting on the source code, and they can't articulate why, then they are not being honest, and you should walk away (or at least clearly state that if they don't withdraw that requirement, there will be no agreement).


Ask them what their real needs are. Odds are they really want one of the two following. Protection in case you go out of business; Protection in case you have a legal violation (ie you use GPL code and they link to you - suddenly their code is GPL).

If it is the first, code escrow is very common. You should probably set this up as a gesture of goodwill even if they don't ask for it.

If it is the second, there are tools that you can run to ensure you don't (you should anyway - though the tools tend to be "enterprise software" and thus expensive for what they do). Once you are sure you are free from that type of problem, a lawyer can draw up legal indemnification documents.

If it is anything else, this is done - for an additional fee. 20 years ago a company sold us an OS, as I recall the price for source code was $100,000 on top of all other costs. My company refused to pay even though it would have saved far more money if we had been able to understand what the code was doing, and thus been able to integrate our code better.

I'm not sure what legal requirements were in place, but you should definitely have a lawyer who knows this area of law create the agreement. Not just any lawyer, one with experience is worth paying for - find the lawyer first and pay him $200 to give a high end estimate of his costs to draw up the agreement - this is your minimum price for seeing the code. (Which is to say you expect to make nothing after the lawyer is paid unless a second customer also wants source code.)

Unlike most I wouldn't reject it. However it should be an additional expense, and it should be covered by some strong legal language.


I don't think this is that extraordinary a request. I work in FinTech, and we have routinely had large financial institutions hire third-party companies to do code audits and reviews. The key is that these third-party companies do this as their business.

I think the key for you is that it needs to be a third-party company that specializes in these types of code audits. I would NOT just hand over source code to the actual company.


If they require proof that your models are working, you should be able to show them simulations to that effect.

If it is a matter of due diligence, then it should be something that is discussed between your lawyers and the company's lawyers.

If this ask is coming from the engineering side of the company then that is a red flag.

You should also think of the impact in terms of acquiring other customers once you have opened a bit too much to this customer.


Yep - third party code reviews for compliance or security purposes are not unheard of.

The "we want to see how your algorithms made their decisions" justification is a little weird, since that is effectively your secret sauce and just math equations.

Are they worried you're joining in illegally- or unethically-obtained data in making your recommendations? Large enterprise companies have brand reputations that they factor into vendor decisions, and they may not trust you as a startup just yet.


(Customer has $50+bn in revenue) tells us nothing. How good is it going to be for YOU? Because that should be the main data point for this decision. You might want to charge them 10X-50X the usual price if their request involves trade secret disclosure.

Insofar as their motives are concerned, you should assume the worst: that they will steal your IP and force you into endless litigation. From a game theoretic min/max perspective this is a sound way to think about it. It's also reasonably likely. There is a good chance they want you to get them started on a problem they don't know how to solve and then they will iterate off your solution. Happens all the time. Imo, if you are building a business in this space, you shouldn't let them do this to you.

Other options: Offer to add some explainable AI visualization stuff into the app. Hide the best parts behind a web service and only agree to give them the source code for the gui. It sounds crazy but people will agree to compromises like this all the time.


The answer is no. I had a huge enterprise customer do the same thing, asking lots of questions. We didn't tell them much but they were asking very detailed questions as to the how. They didn't follow through with the deal and released a competing product about a year later.

They were simply fishing us for answers on how to solve problems.


Nope, never let them see it. Those algorithms are key to your success.

Unless they are buying your company and are doing due diligence.


This is a legal issue that falls under compliance. Enterprises that are looking to acquire a small company will often request a code audit, usually after the agreement of sale. For legal reasons, this provides assurance that your code doesn't introduce any compliance issues moving forward. The process is just playing it safe so there won't be any legal issues later. WhiteSource easily verifies license compliance on premises or through their SaaS program. They provide a very timely report that will ease the process. https://www.whitesourcesoftware.com/open-source-license-comp...


I've seen huge enterprise customers ask similar things in the past. Especially with startups, they know that if they decide to steal your algorithm and replicate it, you will never be able to afford to sue them. I've seen a number of enterprises where they had a team large enough that they could throw whatever resources it took to build your solution and once they had some ideas how it worked, then they were off to the races.

It could also be a bluff to see how far they can push you. Enterprises love to ask for ridiculous things to see if they can get it. I once worked for a fairly large retailer ($10b/yr) that wanted to put into a contract with Microsoft that they could have access to the Windows and SQL Server source code.


Clarify if the concern is Chapter 11 motivated.

We've put source code in escrow to address concerns of going out of business.

But if they really just want to see IP (which is fluid and changes every 90 days), then I'd only provide abstract diagrams and maybe decision tree outputs from the ML.


In a past life I was at a SaaS company that was asked this from every single customer (and they were BIG customers). We always said no. No, no, no.

You know why? It was not secret sauce at all, just boolean logic. Amazing how the wool was pulled over a pile of rubbish ;)


Answer: No.

No one gets the secret sauce. They pay for your results.


And in the ML space we laugh at you and go to the next vendor.

Because we paid for that model, it belongs to us. Just like how if I pay for an illustration I expect the PSDs.


> Because we paid for that model, it belongs to us. Just like how if I pay for an illustration I expect the PSDs.

What? They want "the PSD" before buying.


I used to work at a large investment bank, and it was not uncommon for our clients and auditors to ask about how we handled security, financial transactions, etc. Our rules were pretty much as follows.

1. Everyone involved signs an NDA. 2. No property of the bank is allowed to leave the premises at any point in time. Any supporting documentation was printed out and they were allowed to review it in certain designated rooms. The documents were not allowed to leave that room and they were reviewed to make sure nothing was missing at the end. 3. Any questions must be provided in writing so there is a record of the question and response.


Gut answer is No.

Honestly, they are no experts, or they wouldn't need you, so them reading and interpreting your code is a patently ridiculous request.

However, there is a huge opportunity here, based on the fact that this is new ground for you and them. If they are truly worried about later justifying the decisions being made, then you CAN agree to design a report, based on whatever your engine is doing, that shows additional useful data (i.e., not just the correct decision in each situation, but the likelihood of it being correct or the expected return). Invent a middle-ground solution.

Then make them agree TO PAY FOR THIS REPORT AS PART OF THE CONTRACT.
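
A rough sketch of what one row of such a report could look like, in Python. Everything here is an assumption for illustration: the class probabilities are presumed to come out of your engine already, and `report_row`, `gain_if_right` and `loss_if_wrong` are made-up names, not anything from the original post.

```python
from typing import Dict

def report_row(case_id: str,
               class_probs: Dict[str, float],
               gain_if_right: float,
               loss_if_wrong: float) -> Dict[str, object]:
    """Build one report line: the decision, how likely it is to be correct,
    and a simple expected return. All inputs are hypothetical placeholders."""
    # The decision is the class the engine considers most likely.
    decision = max(class_probs, key=class_probs.get)
    p = class_probs[decision]
    # Expected return: win `gain_if_right` with probability p,
    # lose `loss_if_wrong` with probability (1 - p).
    expected = p * gain_if_right - (1.0 - p) * loss_if_wrong
    return {
        "case": case_id,
        "decision": decision,
        "likelihood_correct": round(p, 3),
        "expected_return": round(expected, 2),
    }

row = report_row("case-001", {"approve": 0.8, "reject": 0.2},
                 gain_if_right=100.0, loss_if_wrong=300.0)
print(row)
```

The point is that the customer sees the decision plus the numbers behind it, without seeing how the probabilities were produced.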


By the way, there is a saying in negotiating that you may want to repeat to yourself a couple times: "If you're not willing to say 'no', then you are not really negotiating."


Reply back with “is this an acquisition offer?”


Never worked for an enterprise company before I take it.

Because nobody when they are dealing with a vendor is thinking about how to acquire them. They just want your product.


Hi,

Would be good to know if it's SaaS or something that goes into their datacenters.

If you are SaaS, I would not share source code. Ever. I often get questions from potential enterprise customers, and while pushing back is not always easy, the reasons are respected. One argument is that you are protecting other customers/tenants by not allowing it, and you will do the same for them when they are onboarded.

If it's in their datacenters, there are many reasons they might want to see source code (licensing, security, scalability, etc.). But I would still argue you could keep your core algorithm IP out of that.

Other thoughts: Are they vetting you for potential acquisition?


1) Yes - We said no because the source code is an embodiment of our trade secrets and the basis of our company but we'd be happy to answer your questions.

2) Likely they aren't trying to "steal" your IP. They stated their goal as wanting to see how your algorithms made their decisions. So just answer as best as you can without giving away your trade secrets.

3) Yes, there is a way to do this with legal protection: follow a clean room process. However, this is pretty expensive for a small company to do, so I wouldn't offer it.


Give them a test: if they are reasonable they will respond to anything reasonable in a reasonable way. So ask them in detail why they want to do this, how they want to do it, are there alternatives. If they answer in a fair and honest fashion, then propose some way to solve their problem in a way that is fair to you. There are lots of good suggestions here: third party, at your site, hiding the secret sauce, etc.

If they are bullies, irrational, arrogant or silly, then save yourself lots of headaches and very nicely say it's not something you can do.


They stated their intent was "to see how your algorithms made decisions"; that sounds like it's your secret sauce. They didn't ask for escrow of source code should you go bankrupt.

Have your people interacting with them be confident enough to tactfully state they will walk away from the PoC if that is a condition.

Coca-cola hasn't revealed theirs, why should you?

Ask more questions about if they are seeing things in the analytics results that don't look correct... that might be their concern if they are basing business decisions on your software.


> They stated their intent was "to see how your algorithms made decisions"; that sounds like it's your secret sauce.

It's not the secret sauce though. The models are the output.

The secret sauce is the engine that builds it and companies aren't asking for the source for that.


It is so funny to read very decisive answers that give no explanation and don't mention whether the author has any real-world experience with these kinds of deals or with machine learning.


1) Were they stakeholders or from legal?

2) They are likely looking for concrete methods they can leverage as the platform grows.

3) Don't give them access to the entire kitchen if they're only asking for a recipe.


Is this the business side asking? They're implicitly giving you product guidance here.

I'm also in the DS/ML space, and there is an absolute dearth of explainability in our models. Given the decisions that are based on these models, it's atrocious not to be able to explain why they come up with what they did.

Get back to them and tell them you'll add explainability to your models. Even if it's just something simple like LIME (vary each input to a model, measure the change it produces - works for any model).
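
The parenthetical above boils down to: nudge one input at a time and see how far the output moves. A minimal Python sketch of that idea follows. It is a simplification and not the LIME library itself (LIME proper fits a local surrogate model over many perturbed samples); `toy_model`, the feature names, and the step size `delta` are hypothetical stand-ins for whatever black box you actually have.

```python
from typing import Callable, Dict

def sensitivity(model: Callable[[Dict[str, float]], float],
                x: Dict[str, float],
                delta: float = 0.01) -> Dict[str, float]:
    """Estimate how much the model output changes per unit change in each input,
    treating the model purely as a black box."""
    base = model(x)
    effects = {}
    for name, value in x.items():
        nudged = dict(x)              # copy so only one input changes at a time
        nudged[name] = value + delta
        effects[name] = (model(nudged) - base) / delta
    return effects

def toy_model(x: Dict[str, float]) -> float:
    # Hypothetical stand-in for the vendor's real model.
    return 3.0 * x["income"] - 2.0 * x["debt"] + 0.0 * x["age"]

effects = sensitivity(toy_model, {"income": 1.0, "debt": 0.5, "age": 40.0})
# Rank inputs by how strongly they move the output for this one case.
ranked = sorted(effects, key=lambda k: abs(effects[k]), reverse=True)
print(ranked, effects)
```

Because it only calls the model, an explanation like this can be handed to the customer without exposing any source code.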


I'd show my source code if I were you -- for one simple reason. Most data science teams in large enterprises suck. Even for teams that have technical chops, I can tell you from experience that they won't be able to achieve anything even if they have access to your code. Having a solution in hand is only a minor part of what an internal team needs to push one out. You as a startup have more chances of success than an internal team. I've been on both sides, multiple times, and it's not even close.


When I worked for a very large networking company, we wanted to bring in a small company's only product which would have been central to the day-to-day running of engineering. Because they were so small with so little track record, we wanted to make sure that we could get their source code should they ever go out of business.

I can't think of a good reason to require code inspection prior to purchase. Testing, measuring, and generally trying to break the code during evaluation seems fair game though.


If you were selling, you would be obliged to do this, under enforceable contract terms which penalize theft of the IPR.

I think the onus would be similar for the single customer model. In effect, if you are coding to deliver to them, and they are bankrolling your deployment, they're buying you in all but name. So, the conditionality on their checking of your IPR, should be the same as selling your IPR.


> How would you respond?

In these situations it's best to ask yourself a simple question: what would Coca-Cola do if they were asked to see the formula?


Bad analogy. Better one is:

If you hired a photographer would you expect the RAW files ?

I would.


No, it would be "would you expect the camera settings"


A point to make:

The company asking to see your code has deep pockets. That means they have a whole lot to lose if they breach a contract and steal your intellectual property.

So, if you put together a well-crafted non-disclosure and non-compete contract their risk is high if they mess with you. It'll cost you something to get a solid contract, but it may be worth it if you also get the business.


Is there such a thing as an "independent code review"? I would be perfectly happy to give an independent auditor access to my source code.

They could verify it works, that it will perform at scale, and that it doesn't have security vulnerabilities. But I wouldn't want to show that source code to a client who would potentially build the same product themselves.


I am a little surprised by many of the answers here. I thought the conventional wisdom here was, essentially, "Your source code is not as valuable as you think it is". For what it's worth (literally 2c probably), my $0.02 is that I agree with that conventional wisdom. Don't be anal about this, nobody wants to steal your source code.


What would you sell this technology outright for if they just asked you to cash out and leave? Double or triple that amount and ask for it to be put in escrow as security, subject to release on the decision of an arbitration panel consisting of 3 academic computer scientists. They'll refuse of course, but it gives you something to negotiate with.


I'd show my source code if I were you -- for one simple reason. Most data science teams in large enterprises suck. Even if such teams have intellectual brain power, I can tell you from experience that they won't be able to achieve anything even if they have access to your code. Having a solution is a minor thing for an internal team.


There's a lot of technical focus here. For me, it's about the money. If a client is going to pay me £3m to use my software, you're damn right they can review our source code. As the amount of revenue/profit declines, so too does my desire to expose my business to the risk of a source code review.


Surely this is what trade secrets laws are for.

Probably if you get them to sign some kind of non-compete, non-copy contract and hand over the source, then if they do decide to cut you out, you can just sue them and make more money than you would have as a startup anyway.

Get it legalled and then just hope they overstep.


I'm not experienced with these types of situations but my common sense is telling me that you could, and should, make a high level design presentation.

That could include snippets of code where appropriate to show exactly how decisions are being made.

This presentation will likely benefit you in the future.


Yes, if forced, ask for an escrow source code agreement where you release source code only in the instance of your company's bankruptcy/insolvency and they are stuck with an unfinished product. Otherwise, this is too great a risk of your sauce being out there.


I work at a large corp and we occasionally run our own POCs or engage externally to learn about significant aspects of a technology or process. Sometimes learning about what matters can make us better at selecting good vendors.

We are generally not in the business of the services our vendors provide and don’t have the staff, expertise, incentives or institutional fortitude to compete.

That said, I’d never ask for a vendor’s source code (especially on a POC) unless I felt like an escrow situation was warranted if they went out of business.

If I were you I’d ask for

1. Something like a non-compete or exclusivity agreement. They will not replicate the functionality internally or work with another vendor.

2. A lot of money to see the source code. If they are trying to learn from what you have, then you are providing them with a material benefit you should be compensated for. Offer to throw in consulting services if their objective is to learn.

Bottom line, cover the risk that they steal your magic or otherwise benefit without you being compensated.


Isn't this one of the big research areas in the ML space, i.e. explaining how a model made a decision? This is hard even if you have the source code. Personally, I'd look into how to explain your model without giving away the secret sauce.


Isn't the logical question to ask "Why do you need to see how our algorithms make their decisions?" Surely if this is due to legal issues then they can point you to the relevant law?


I've had customers ask to see certain parts of the source, and we've never done it – it has also never been a dealbreaker. If it is, I'd be very suspicious of that deal.


Do not show. Enterprise customers are known to string startups along, and then not buy anything. They don't even think it's wrong -- to them, a couple of months is nothing.


It sounds like they want to understand how an algorithm came to its decision. If I were in your position, I would ask if rule extraction would satisfy their needs.


If you agree, this may mean no investor will be serious with you after that (you've shown your internals to lots of other people). Use that as an excuse to say no.


Depending on what their reason is, it may be satisfactory to have the code looked at by a qualified 3rd party (with proper NDAs all-ways).


Can you write a methodological doc in LaTeX, detailing general algorithmic choices, without giving away your secret sauce?


There are a lot of interesting posts in this thread, but if your Spidey-senses are tingling, pay attention to them.


If my memory serves me right, this is exactly how Microsoft stole Apple's code in the early days.


I'm not sure it does.

Apple borrowed ideas from Xerox. Microsoft were given source code for Mac in order to keep producing Office for it. Both Apple and Microsoft have copied UX elements off of each other.

But I don't recall Microsoft stealing Apple's code. Maybe copying some GUI elements at most...


Easy: Make them pay for the privilege, based on how much you think it’s worth to you to NOT show them.


Sure, access to the source is acceptable if they're offering to buy you out.


My 2c as someone who works on the enterprise side of these requests:

There are a couple of reasons we might ask to look at your code:

1. First, a non-reason: if we don't have a valid reason to look, or don't have access to technical resources (either internal or via external consultants) who we are fairly confident could build whatever software we are buying given time and resources, then we DO NOT WANT TO SEE YOUR IP. This goes as far as shell scripts vendors use for stuff that we don't particularly care about. If they leave them on our boxes we make sure we destroy the data. If the company is worth 50bn then there is a very small chance they're about to make a huge pivot to your particular niche and therefore need your code to solve a problem. The reason companies buy software is that they don't want to pay people to maintain it, and in addition they DEFINITELY don't want to get sued for looking at your code. So for no other reason than legal repercussions you can probably trust them not to do anything sketchy. (Disclaimer: small business units do sometimes go rogue. Make sure you're talking to someone who understands the company-wide impacts of fucking this up.)

2. If the code is going to be used in a sensitive environment (e.g. air-gapped networks) we may want to scan for either destructive malware dependencies or just bad code that intentionally or unintentionally might damage systems. Also, you would be amazed how many vendors build hooks to call out to the internet in standalone software packages that they "certify" for offline use.

3. If we need to build a bunch of integrations ourselves (i.e. you would be useless to us insofar as the work requires understanding legacy core banking systems and the like, so your knowledge of your own code base doesn't help; we need someone with knowledge of both code bases at a fairly low level), then depending on the size of the code base we might ask for all of it or just all the external interface implementations. Not the definitions. The actual code.

4. If you are a small company it is not unlikely that we will negotiate a clause which says that if you disappear or all your developers die or whatever, then we are allowed to internally use your code base to build our own stuff, since we will end up with dependencies on it and will want to make sure we can still function without you. (This is obviously not ideal; we would rather throw money at you to make problems go away, but if you aren't a business any more then we just have to hire people to do it.) A colleague working at a competing bank in Australia told me that their agreement with HashiCorp gives them ownership of the Consul Enterprise code base for internal use if HashiCorp disappears. You just need to make sure your lawyers are on this properly, so that you clearly define the circumstances in which the large company's expectations of you maintaining the code are no longer met and therefore they can do it themselves if they need to.

5. If we just don't trust you to not be hiding some black magic bullshit behind the scenes. This is usually the result of particularly uninformed sales people making claims that cannot technically be true, and thus our due diligence requires that we handle it ourselves. It's also much more likely that we will recommend a bunch of software auditing companies we have used and trust to audit the code base for us, just so we don't have the liability of your IP in our heads.

6. If we have government financial institution regulations which apply to the thing we want to use your software for and we are required to check and sign off on the risk. As an example, for an Australian bank running things on cloud platforms that hook back into traditional on-prem systems, it is mandatory without exception that all data at rest or in flight be encrypted. We trusted a large software company on this, and only when we had auditors sniffing traffic over the network did we discover that major data-intensive operations relating to backup integrity decrypted everything and then pumped it over the wire between instances using HTTP. At that point we were $6m deep in licensing fees, so we had a few very difficult conversations about "fix it or fuck off and pay us substantial reparations", because we suddenly needed a lot of technical lawyers (who are as rare as hen's teeth) to explain what had happened to avoid fines that could have cost literally billions.

Summary: there are a bunch of reasons a company might want to see your code. If the person you are talking to is speaking on behalf of the whole organisation (i.e. they understand the broader business implications of doing anything shady) then you're almost definitely safe. If you're a bit on the fence about the whole thing, get a third party auditor in, but the request itself is pretty reasonable.


I just remembered a specific use case where we were told no re: looking at code and it completely screwed us.

One of the banks owned by our parent company was using a system built by HP to determine loan rates. Now, being a bank that's kind of their core competency so letting someone else build it just seems silly.

But two years later I was on a project trying to hook a bunch of stuff up to this system and we noticed that we couldn't get a proper test run because it kept giving us different figures for the same inputs.

After asking again for the code, or even pseudo code that would explain the behaviour, we were again told no.

A bunch of lawyers got in a room and they explicitly threatened to sue us if we tried to decompile their code or monitor the system to learn the logic. My response to this was immediate alarm bells, because that meant we were not allowed to actually test the system, which we had paid for most of the development of and hosted internally.

A discussion with just our lawyer revealed that, because it was related to home loan rates, our liability was actually significantly reduced if it were technically possible to decompile the code and verify what it was doing, because the cost of not being able to explain it to auditors would be so high, whereas the amount we would be sued for might max out in the tens of millions.

After a bit of very low level analysis we found that their algorithm for determining a loan rate included a random number generator to determine an arbitrary discount in the case that very similar data was input repeatedly.

We later discovered that a developer was trying to be clever and get more home loans sold, and had entirely missed the point of risk profiling properly. I.e., we don't want to sell you a home loan at a low interest rate if the risk of you defaulting makes it less profitable.

Eventually the resolution we came to was that HP would fix this shit up and they would provide us with all of the original source they had access to.

But for the next 20 years or so we will be carrying the liability of potentially hundreds of loans (this system wasn't used by most business units at the time, but we were thinking of expanding its use, hence the project) that have an interest rate that is lower than it needs to be (like 1-1.5% lower!) to cover the risk profile of the person who got the loan.

And all we needed was the code and that wouldn't have happened.
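
For what it's worth, the symptom described above (different figures for the same inputs) is something you can check from the outside without any source code at all. A minimal Python sketch of such a repeatability check, assuming you can call the system as a black box; `deterministic_rate` and `flaky_rate` are invented stand-ins, not the vendor's actual logic:

```python
import random
from typing import Any, Callable, Tuple

def is_repeatable(system: Callable[..., Any],
                  inputs: Tuple[Any, ...],
                  runs: int = 20) -> bool:
    """Return True if the system gives the same output for the same inputs on every run."""
    first = system(*inputs)
    return all(system(*inputs) == first for _ in range(runs - 1))

def deterministic_rate(credit_score: int, loan_amount: float) -> float:
    # Hypothetical pricing rule: better credit score, lower rate.
    return round(8.0 - credit_score / 200.0, 2)

def flaky_rate(credit_score: int, loan_amount: float) -> float:
    # Hypothetical rule that loosely mimics the story: a random discount
    # of up to 1.5 points is subtracted from the base rate.
    discount = random.uniform(0.0, 1.5)
    return round(deterministic_rate(credit_score, loan_amount) - discount, 2)

case = (700, 250000.0)
print(is_repeatable(deterministic_rate, case))
print(is_repeatable(flaky_rate, case))
```

A check like this only tells you that something is non-deterministic, not why, which is where the argument for seeing the code (or at least pseudo code) comes in.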


You should consider hiring the services of a Software Escrow firm.


Do you ask to see the Microsoft Windows source code? Well, then, that.


"If you want to see our source code then acquire us."


1) Yes. I would say "no". 2) Not very. 3) Yes, but they are not likely to be effective or easy to enforce.

If they want to look at it badly enough they can buy it.


If it is amazon, take a pass


This is the untold story about Nothing Real (the original Shake developers), Apple, Steve Jobs, Disney, and the shake[1] source code...

NR developed shake, and when bigger studios started using it, they wanted access to the source code. Disney was one customer who paid to put shake's source code into escrow, with a stipulation that if NR ever went under or got acquired, they could pull it out of escrow and build shake (on Linux) themselves.

Enter Steve Jobs and Apple, who viewed shake as an asset that could help push studios towards Mac OS X, and away from SGI and upstart Linux which was steadily taking SGI's high-end CG market. When Apple acquired Nothing Real, Steve wasn't at all happy with the animation studio's CTO who decided to exercise their contract and pull the shake code out of escrow.

Disney ported shake to amd64 Linux and continued to use and extend it all the way until Tangled (2008), and while it's not their main workhorse compositor these days, shake still runs today, despite Apple's best efforts to kill it ;-). To Apple's credit, they eventually realized that shake's source code was not really that big of an asset, so they offered a deal at one point where studios could pay (50k+?) to get access to the source code. Many studios paid.

Having compared the NR sources to the Apple sources, quite a bit of work was done to put in PPC-specific assembly and performance optimizations (e.g. optimizing for G5's cache sizes) into Darwin-specific #ifdefs.

One takeaway from this story is that the "big" customer (Disney) in this case was not at all interested in stealing IP. They were an animation studio and their core product was something not-software. If the customer in your case is in a closer space, it's a different calculus.

The customer's viewpoint was more of being able to decide their own fate in the event of the company going away. And it turns out, those things did happen, so having the deal in place was a good thing for the customer. Likewise, Nothing Real made out very well in all of this, as they got paid by Disney (among others), and later acquired by Apple who really didn't care enough about the pre-existing contracts to not acquire them.

One protection in the contract was that it only allowed for the studio to produce binaries for themselves, but not distribute them. A practical consequence of this is that Disney could not share shake binaries with Pixar, and while Pixar had also purchased a source license from Apple, Disney could only share (git format-patch) patches with Disney's changes. I really wish we could have open sourced the shake source code, even for historical purposes[2]. Apple still can.

Make sure the deal is reasonable and there shouldn't be any problems. If you don't trust the company then maybe you shouldn't take them on as a client, but otherwise a reputable company will be very willing to keep things as friendly as your lawyers can get them to be. Having a good lawyer is key.

Regarding copy-pasting your code -- in shake's case, having the source code was really helpful for plugin development. Maybe consider ways to help customers leverage actually having the source code. Making the system plugin-able is a great way to do that, as customers mostly want to customize stuff, and plugins are a great way to get customers "deeply integrated" (aka locked) into your ecosystem.

[1] https://en.wikipedia.org/wiki/Shake_(software)#History

[2] https://www.flickr.com/photos/ronbrinkmann/sets/721576179728...


If I were you and I decided that this is worth it, I would try to arrange the deal so that it only occurs after all of their other due diligence has been completed, and the source code audit is the last remaining obstacle to doing the deal. And I would insist that they first define, at least roughly, what is acceptable and unacceptable. So, after the audit is completed, unless they can point to something in the code that is unacceptable, they are compelled to go through with the deal.

You want to avoid a situation where you accommodate them on this, and they come up with some other hoop you have to jump through. Or where they give you some vague excuse like, "thanks, but we decided to go in a different direction" and walk away.


We wanted to see our supplier's source code to use their API, as their documentation seemed lacking. They asked us to sign an NDA that, if broken, would cost us a lot of money. All of a sudden we found their API sufficient to do what we needed to do.



