Yeah, there is a lot of breadth in what YC puts out, as we found out ourselves. Categorization for specific domains is hard since we don't have labelled data. Our approach is a mix of unsupervised token extraction and clustering to detect significant concepts, along with a human layer to tune out false positives and curate a few topics manually, since the process will never get everything right on its own.
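For anyone curious what that kind of pipeline can look like, here is a toy sketch in Python. Everything in it is an assumption for illustration and not necessarily the actual implementation: TF-IDF stands in for token extraction, a greedy Jaccard pass stands in for clustering, and the names `top_tokens`, `cluster`, `apply_human_review`, `blocked_tokens` and `manual_topics` are all hypothetical.

```python
import math
import re
from collections import Counter

# Assumed, tiny stopword list; a real one would be much larger.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "for", "in", "on", "is", "how", "your", "with"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def top_tokens(docs, k=3):
    """Unsupervised token extraction: the k highest TF-IDF tokens per document."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(tok for toks in tokenized for tok in set(toks))
    n = len(docs)
    result = []
    for toks in tokenized:
        tf = Counter(toks)
        # Sort by descending TF-IDF score, then alphabetically to break ties.
        ranked = sorted(tf, key=lambda t: (-tf[t] * math.log((1 + n) / (1 + df[t])), t))
        result.append(set(ranked[:k]))
    return result

def cluster(token_sets, threshold=0.2):
    """Greedy clustering: join the most similar cluster (Jaccard) or start a new one."""
    clusters = []
    for i, toks in enumerate(token_sets):
        best, best_sim = None, 0.0
        for c in clusters:
            union = toks | c["tokens"]
            sim = len(toks & c["tokens"]) / len(union) if union else 0.0
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            best["tokens"] |= toks
        else:
            clusters.append({"tokens": set(toks), "members": [i]})
    return clusters

def apply_human_review(clusters, blocked_tokens, manual_topics):
    """Human layer: strip tokens flagged as false positives, add hand-curated topics."""
    reviewed = []
    for c in clusters:
        tokens = c["tokens"] - blocked_tokens
        if tokens:  # a cluster with nothing left is dropped entirely
            reviewed.append({"tokens": tokens, "members": list(c["members"])})
    for name, members in manual_topics.items():
        reviewed.append({"tokens": {name}, "members": list(members)})
    return reviewed
```

You would call `top_tokens` on the raw texts, pass the result to `cluster`, and then run `apply_human_review` with whatever a reviewer has flagged or added by hand. The threshold and `k` are arbitrary here and would need tuning on real data.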
We are in the process of scaling it further so we can help create many more resources for users and their own data.