How is this loading in PDF page content? I think it's just ripping text off the document?
It looks like it works well on PDFs with a digital text layer. I tried it on an image of the Declaration of Independence and it told me "I'm Sorry but the webpage is empty..."
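That behaviour is what you would expect if the tool only reads the PDF's embedded text layer: a scanned image has no such layer, so the extractor comes back empty. A minimal sketch of how a pipeline could detect that case and send those pages to OCR, assuming some PDF library has already returned one string per page. The function name and the 20-character threshold are made up for illustration and are not taken from this service:

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return indices of pages whose extracted text is too short to be a real text layer.

    page_texts: one entry per page from whatever PDF text extractor is in use
                (hypothetical input; None means nothing was extracted).
    min_chars:  arbitrary cut-off, an assumption to tune per corpus.
    """
    flagged = []
    for index, text in enumerate(page_texts):
        visible = "".join((text or "").split())  # drop all whitespace
        if len(visible) < min_chars:
            flagged.append(index)
    return flagged


if __name__ == "__main__":
    pages = [
        "When in the Course of human events, it becomes necessary...",
        "",        # scanned page: no text layer
        "  \n ",   # whitespace only
        None,      # extractor returned nothing
    ]
    print(pages_needing_ocr(pages))
```

Pages flagged this way would need an OCR pass before any question answering can work on them.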
PDF chatbots are the new equivalent of todo apps... https://custombot.ai/. No doubt cool but loses the lustre after you've seen the umpteenth one that passes on the token cost to you and still hallucinates.
Yes - I write cyber security contracts. I feed it proposals I find from solicitation aggregators and it's surprisingly accurate at answering the questions below about 150-page PDFs.
*Granted, I don't use this service. I use LlamaIndex and AWS to store the vectorized prompt files.
What's the contracting officer's email address where the proposal needs to be submitted?
What is the Proposal Due Date?
What's the full Agency Name?
What's a shorter way to say the agency name?
What is the RFP number?
What type of project are we responding to? Is it a proposal or just a Request for Information / Sources Sought Notice? If it is a proposal, simply say proposal. If it is a Request for Information or Sources Sought, say RFI.
Are there any upcoming vendor meetings? What is the CALENDAR OF EVENTS? Project timelines, or deliverables?
Can the response be submitted via email? Or is it a sealed bid?
What is a description of the Scope of Work, Scope of Services or Performance Work Statement? Is the requirement for Penetration Testing or Audit Software?
How many targets are there? Targets consist of Enterprise Systems (databases, cloud environments, applications, servers, IP addresses, firewalls, network attached storage, and wireless networks) and Enterprise Infrastructure (switchgear, routers, modems, hubs). List the total number of targets by each category.
Does it require Social Engineering, Physical Security, Wireless Scanning or Perimeter Security?
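For anyone wondering what running a fixed question list like this looks like in code, here is a rough sketch. It is not the LlamaIndex API and not my actual setup: `answer_questions` and the `ask` callable are hypothetical names, and `ask` stands in for whatever model call you use. For a 150-page PDF you would pass retrieved excerpts rather than the whole document text.

```python
from typing import Callable, Dict, List


def answer_questions(
    document_text: str,
    questions: List[str],
    ask: Callable[[str], str],
) -> Dict[str, str]:
    """Ask each question against the same document text and collect the answers.

    ask: hypothetical function that sends a prompt to an LLM and returns its reply.
    """
    answers = {}
    for question in questions:
        prompt = (
            "Answer using only the document below. "
            "If the answer is not in the document, say 'not found'.\n\n"
            f"Document:\n{document_text}\n\nQuestion: {question}"
        )
        answers[question] = ask(prompt).strip()
    return answers


if __name__ == "__main__":
    def fake_ask(prompt: str) -> str:
        # Stand-in for a real model call: just echoes the question back.
        return "stub answer for: " + prompt.rsplit("Question: ", 1)[1]

    questions = ["What is the Proposal Due Date?", "What is the RFP number?"]
    for q, a in answer_questions("...document text...", questions, fake_ask).items():
        print(q, "->", a)
```

Keeping the model call behind a plain function makes it easy to swap a hosted model for a local one without touching the question list.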
Is there something that will allow me to run this locally?! This is exactly what I want to do, but I have no clue how to pipe my data into llama. Any pointers will be highly appreciated!
Terms of service for various companies and other long and boring documents.
My real estate agent wanted me to sign a document that is 10 pages long. I would prefer to use the bot to answer my questions, and possibly verify the answers against other legal sources.
I tried the document with the service (after removing personal info), and it worked so-so. It could specify which paragraphs mention the commission, but couldn't extract how high the commission is.
Perhaps it's because the document is in Polish. But GPT-3.5 or 4 shouldn't have a problem with such queries.
> Terms of service for various companies and other long and boring documents.
OK, fine. Do you have a working example of this? E.g. here's a contract, please find me the unfavorable and/or non-standard terms. People have tried this before with no success, and it would be great if someone finally made some headway here. Even more points if the GPT thing finds onerous terms, but says, "hey don't worry about this non-compete bit, it's not enforceable."
It's a low-stakes scenario, and the alternative was reading it myself - since paying legal fees in this specific case would be prohibitively expensive.
Understandable, but my thinking is that with real estate you are paying for expertise in navigating what is typically a large transaction, and most people wouldn't want to rely on an LLM trained on a dataset that might not include the latest laws.
Yep. I've been using a similar service (chatpdf.com) and uploading amateur radio mathematics PDFs and portions of the ARRL Handbook. I can't quite upload the entire thing due to size constraints, but I've been able to do quite a bit of Q&A from the book chapters.
This Ask Your PDF allows up to 20MB for free vs chatpdf's 10MB free, though chatpdf has a 32MB allowance on the paid plan. Not sure how Ask Your PDF plans to monetize this.
I'm personally looking into setting up my own self-hosted "chatpdf/askyourpdf" clone so I can put a whole bunch of my reference material in there. I can't actually open it up as a service because of the copyrighted works, but I would really love to have a ham Q&A site based on the ARRL Handbook and other resources. Even expand that out to an electronics Q&A.
There's a site called llamahub.ai that lets you load lots of your own resources into an LLM index so you can train GPT (or potentially the opensourceish gpt4all variant) on your own resources.
Also want to jump in with https://libraria.dev/ for PDFs too, I just recently updated this with large PDF uploads, you can share your chatbots, have it in different views, and more
PDFs sure are annoying when you want to quickly jump through the docs… but really I wonder how this will do once the GPT-4 API supports images … maybe then it can help me understand electronics datasheets… cause I’m still trying to figure out whether pin 0 was the SCL or SDA pin… and whether VCC was 3.3 or 1.8 volts…
Almost all of them do the exact same thing, and the space is completely saturated with websites that look so similar they could be copy-and-paste jobs.
There is nothing new or unique about any of them. It's just new AI snake oil, a grift that gets users to upload sensitive PDFs so they can 'chat' with their document as 'the future'.
Another race to the bottom until Microsoft Word or Google Docs releases the exact same thing for free with unlimited tokens.
Interesting, but how many people are going to upload things they really shouldn't?
...
You retain ownership of any PDF documents you upload to AskYourPdf. By uploading PDF documents to AskYourPdf, you grant AskYourPdf a non-exclusive, worldwide, royalty-free license to use, modify, reproduce, and distribute the PDF documents for the purpose of providing the AskYourPdf web application
Since you mention it, I have heard several of my friends and colleagues saying that they think ChatGPT could be their lawyer, doctor, and tax prep advisor if only they could send it documents for review.
It is legal jargon that gives them the right to do that, but it also gives them a lot of other rights. If they only wanted to display the PDF back to you, they could convey that meaning very easily.
Not using phrasing that has already been tested in court is easy, but fraught. If someone sues you because of a reasonable thing you did to display a document and you have this phrasing, it's open and shut, because someone else has already litigated it and so there's legal precedent. If you use different phrasing and someone sues you, there's a greater chance you'll have an actual drawn-out court case to convince a judge that your phrasing means what you wanted it to mean. Remember, the meaning of words and phrases in a legal context can differ almost arbitrarily from what they mean in a conversational one.
As a business owner that just wants to get on and provide a service that displays a pdf you got sent, which do you go with, the one that lets your resources go to providing the service you intend to provide, or the one where there's a greater chance your resources will get tied up in a legal battle for the sake of making the terms almost no-one reads anyway a little nicer?
bad actors are not gonna be stopped by their own "legal jargon"... the terms look like copy pasta or AI generated themselves. can't imagine the operators spent much time reading them.
though maybe true bad actors would try harder to pretend being a company with some humans involved, rather than this openly anonymous site.
Tangentially, I haven't been able to find any software which has reliable OCR for music scores; they tend to be just bad enough as to be useless. Was curious if any recent AI developments could be applied to this, but don't have the expertise to look into this myself. If anyone has any thoughts or wants to look into this, please feel free to email me! (link to my website in profile, which has my email)
It’ll accept images and PDFs of music, extracting the notes, rhythms, etc., so you can play it back and edit it with our built-in editor.
It uses machine learning and works significantly better than the other products on the market. There’s a bunch it doesn’t do yet, but it’s useful enough already that we launched in public beta.
My take on this space is that it'll eventually be built into the operating system or PDF viewers, so you're going to have to do more than just "chat with a PDF" -- but that chatting with PDFs is a great place to get started!
Getting ChatGPT (or any good-enough LLM) to generate/manipulate/edit/find discrepancies with PDFs would be great too. Probably best done with a plugin so it can execute code though. Dumping the PDF to HTML, telling ChatGPT to edit the HTML, then converting it back to a PDF is probably a non-starter although that does work on a basic level.
Anything ChatGPT<->PDF is probably a good business idea IMO. That stuff comes down to developers so often that it's almost a career specialization and PDF code can be unfun and tedious to write and maintain.
How does this work? Do you first scrape the PDF or do you have gpt4 multimodal access? The privacy policy link is broken at the moment so I can’t tell for sure
I think the GP was interested in technical details. Specifically, do you first scrape the pdf (using another tool presumably), or do you have gpt4 multimodal access?
This is a great application. My example use case was to upload a REIT quarterly report (in Spanish) and ask questions that I am interested in about it.
I felt that at this point, it was more the "potential" than what it actually did. I asked some questions that it just couldn't answer. Also, at some point it started answering in English even though all my conversation with it had been in Spanish.
I think this could be very useful! But I guess there are still some hallucinations to deal with: I uploaded an arxiv paper and asked who the authors were, and it hallucinated 4/4 (and stopped working when I asked it to please use the paper as a reference to answer my question).
I had an idea to do something like this combined with something like Zeal or DevDocs so you can have a kind of chatGPT localised just to your specific language or framework. But I guess this does just that job, but in a far more general way
I just had the misfortune of having to wade through pages and pages of byzantine building code where the relevant parts were scattered through the document. This would be very helpful for that kind of thing.
I'd be interested in how people parse text off a PDF. I'm making a TTS tool to convert documents (mainly HTML docs at the moment) to speech, and PDFs would be a great addition.
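One part of this that matters for TTS in particular: text pulled out of a PDF usually keeps the page's hard line breaks and end-of-line hyphenation, which sound wrong when read aloud. A minimal cleanup sketch, assuming the raw text has already been extracted by some PDF library. `clean_extracted_text` is a hypothetical helper, not part of any existing tool:

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Turn line-wrapped PDF text into flowing paragraphs for a TTS engine."""
    # Rejoin words split across a line break: "implemen-\ntation" -> "implementation".
    # Caveat: this also merges genuine compounds such as "well-\nknown".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Blank lines are treated as paragraph boundaries.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside a paragraph, collapse line breaks and runs of spaces to one space.
    cleaned = [" ".join(paragraph.split()) for paragraph in paragraphs]
    return "\n\n".join(paragraph for paragraph in cleaned if paragraph)


if __name__ == "__main__":
    raw = "The implemen-\ntation reads\nline by line.\n\nNext  paragraph."
    print(clean_extracted_text(raw))
```

Headers, footers, page numbers and multi-column layouts need separate handling and are not covered here.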
I have it answering a list of questions in an Excel file from three different 'frames of mind' by passing three different directories, with different content in each, to get three different responses I can craft together.
It’s a great mystery why people rave about the Langchain library. I found it really bloated, and the verbose syntax makes it hard to understand what it’s doing unless you dig into the code base. Poorly documented, too.
What alternatives exist today? Honestly, I think langchain just fills a void for having a more streamlined "api" for LLM-driven app workflows which is why it's hyped. Plus the fact that they have bindings for both JS and Python makes it easy to get up and running and building custom "agents" for different tasks.
Think of anything ever written, your personal library, the tax code, every up-to-date doc for your project and its dependencies, your code base, your university texts...
Index them, store them in LLM format.
Then ask a question: you first semantically search all the relevant sources you've indexed and get back a tight set of results that fits under the token limit, which you then pass on to your favorite LLM. Chat4all, ChatGPT etc. then read those parts of your library and answer your question.
My head burst with ideas when I first found out I could vectorize directories and answer questions on them. I can run entire logic loops using an LLM as the input... How does that not blow your mind?
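The search-then-ask flow above can be sketched in a few lines. To stay self-contained, this uses bag-of-words cosine similarity as a crude stand-in for real embeddings and a word count as a stand-in for a real tokenizer. All function names here are hypothetical, and the prompt layout is just one possible choice:

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word split; a rough proxy for a model's tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, token_budget):
    """Pick the best-matching chunks that fit within the token budget."""
    query_vec = Counter(tokenize(question))
    scored = [(cosine(query_vec, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    selected, used = [], 0
    for score, chunk in scored:
        cost = len(tokenize(chunk))  # crude word count, not real tokens
        if score > 0 and used + cost <= token_budget:
            selected.append(chunk)
            used += cost
    return selected


def build_prompt(question, chunks, token_budget=200):
    """Assemble the text that would be sent to the LLM of your choice."""
    context = "\n---\n".join(retrieve(question, chunks, token_budget))
    return f"Context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    library = [
        "The proposal due date is June 5.",
        "Pin 0 is the SDA line.",
        "Submit the proposal by email.",
    ]
    print(build_prompt("When is the proposal due date?", library, token_budget=12))
```

In a real setup the `Counter` vectors would be replaced by embeddings from a model and the list scan by a vector index, but the budget-packing step stays the same.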
Imagine scouring financial reports in real time?
Imagine being able to analyze thousand page regulations for self interest?
Imagine being able to interact with old newspapers, articles, and media lost to time.
Ask questions on an entire class of books, and any information you want to aggregate...
Since I can't delete on Hacker News, this comment was one of several for an April Fool's joke related to AGI "taking over my account." I didn't realize there was a hidden meaning associated with triple parentheses. I apologize!