Ingestion is pretty straightforward: you can call R2R directly, or use the client-server interface to pass the HTML files directly to the ingest_files endpoint (https://r2r-docs.sciphi.ai/api-reference/endpoint/ingest_fil...).
The data parsers are all fairly simple and easy to customize. Right now we use bs4 for handling HTML, but we have been considering other approaches.
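To give a rough idea of how small a custom HTML parser can be, here is a minimal sketch that uses only Python's standard library. It is not R2R's parser (that one uses bs4, as mentioned), and `TextExtractor` and `html_to_text` are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips <script> and <style> contents."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        text = data.strip()
        if text and self._skip_depth == 0:
            self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html: str) -> str:
    """Hypothetical helper: turn an HTML string into newline-separated text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
    print(html_to_text(sample))
```

The plain text that comes out is the sort of thing you would then hand to ingestion; swapping in a different extraction strategy only means changing the handler methods.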
What specific features around ingestion have you found lacking?