That is, are you relying entirely on trust and/or contractual obligations, or do you have some means of enforcing that the user of your SDK isn't cheating?
> Consent is calculated at the time each report is sent back. This means that your users can grant and revoke their consent on a per-report basis, which is the only respectful way to do things.
Correct. I like how you think about this. I assume the SDK user will be ultimately responsible for prompting the end-user for consent; I wonder if you have any "best practices" documents for the software authors, so that they don't have to reinvent respectful consent flow UX from scratch?
> We are also building programs which will deidentify reports on the client side, before any data is even sent back to our servers.
I don't see any code in that Kaggle notebook you linked (I'm not very familiar with Kaggle, I might be clicking wrong). Should I assume your approach is based on training a black-box ML model? Or do you use some heuristics to identify what data to cut?
Thanks for looking at the code, and for your feedback!
Here is a recipe for adding error reporting (reporting of all uncaught exceptions) in a Python project. The highlighted line shows that, when you instantiate a reporter, you have to pass a consent mechanism:
https://github.com/bugout-dev/humbug/blob/main/python/recipe...
We allow you to create a consent mechanism that always returns true:
> consent = HumbugConsent(True)
Of course, someone can always create their own subclass of HumbugConsent which overrides that check. We don't have a good way to prevent this, nor would we want to restrict anyone's freedom to modify code.
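To make the per-report consent check concrete, here is a minimal sketch of the pattern (illustrative only — the class and method names below are my own, not the actual Humbug API):

```python
# Illustrative sketch, NOT the real Humbug API: a reporter that
# re-evaluates its consent mechanism every time a report is sent,
# so consent can be granted or revoked on a per-report basis.
class Consent:
    def __init__(self, mechanism):
        # mechanism is either a bool or a zero-argument callable -> bool
        self._mechanism = mechanism

    def check(self) -> bool:
        if callable(self._mechanism):
            return self._mechanism()
        return self._mechanism


class Reporter:
    def __init__(self, consent: Consent):
        self.consent = consent
        self.sent = []  # stand-in for an HTTP call to a reporting server

    def report(self, payload: str) -> None:
        # Consent is checked at send time, not at setup time.
        if self.consent.check():
            self.sent.append(payload)


granted = True
reporter = Reporter(Consent(lambda: granted))
reporter.report("uncaught exception: ValueError")  # consent granted: sent
granted = False
reporter.report("uncaught exception: KeyError")    # consent revoked: dropped
```

Because the mechanism is re-run on every report, a user who flips the flag mid-session stops all further reporting immediately.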
Our emphasis is on building simple programs that can run on any reasonable client without consuming an exorbitant amount of CPU or memory. For this reason, we aren't using black-box ML models. Instead, we analyzed the data and came up with simple regex-based rules for deidentifying stack traces in our v1 implementation.
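As a flavor of the regex-rule approach (these two rules are my own sketch for illustration, not the actual rules from the linked notebook):

```python
import re

# Sketch of regex-based stack trace deidentification. Two illustrative rules:
# (1) replace the username component of home-directory paths,
# (2) redact the message after the exception type on the final line.
USERNAME = re.compile(r'((?:/home/|/Users/|C:\\Users\\))[^/\\",]+')
EXC_MESSAGE = re.compile(r'^(\w+(?:Error|Exception)):.*$', re.MULTILINE)


def deidentify(traceback_text: str) -> str:
    redacted = USERNAME.sub(r'\1<user>', traceback_text)
    return EXC_MESSAGE.sub(r'\1: <redacted>', redacted)


trace = (
    'Traceback (most recent call last):\n'
    '  File "/home/alice/project/app.py", line 12, in main\n'
    "KeyError: 'alice@example.com'"
)
print(deidentify(trace))
```

The appeal of this style is that every rule is auditable: a user can read the regexes and know exactly what leaves their machine, which is much harder to guarantee with a learned model.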
We are in the process of doing this for more languages and building it into a proper deidentification library that can be imported into any runtime: Python, JavaScript, Go, etc.
You cannot set up Bugout telemetry in your codebase without first defining your consent flow.
We have a library of consent mechanisms that you can chain together like lego blocks to build these flows. For example, our Python consent library is here: https://github.com/bugout-dev/humbug/blob/main/python/humbug...
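For a sense of what chaining might look like (the names here are hypothetical, not the humbug library's actual building blocks), consent can be the conjunction of several small checks:

```python
import os

# Hypothetical sketch of chainable consent mechanisms. Each mechanism is a
# zero-argument callable returning bool, and the chain grants consent only
# if every link in it does.
def from_env_opt_out(var_name: str):
    """Consent unless the end-user sets var_name=1 to opt out."""
    return lambda: os.environ.get(var_name) != "1"


def chain(*mechanisms):
    return lambda: all(m() for m in mechanisms)


consent = chain(
    lambda: True,                         # the author's default: opted in
    from_env_opt_out("MYAPP_NO_REPORT"),  # end-user escape hatch (hypothetical var)
)

os.environ.pop("MYAPP_NO_REPORT", None)
print(consent())  # True: no opt-out set
os.environ["MYAPP_NO_REPORT"] = "1"
print(consent())  # False: user opted out
```

Composing small checks like this lets the software author express "opted in by default, unless the user sets an environment variable" without writing any consent logic themselves.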
Consent is calculated at the time each report is sent back. This means that your users can grant and revoke their consent on a per-report basis, which is the only respectful way to do things.
We are also building programs which will deidentify reports on the client side, before any data is even sent back to our servers. This work is still in the early stages, but here's v0.0.1 of the Python stack trace deidentifier: https://www.kaggle.com/simiotic/python-tracebacks-redactor/e...
Besides Python, we also support JavaScript and Go, and we added Java support last week.
I would really love to hear any feedback you have.