Obsidian-Copilot: A Prototype Assistant for Writing and Thinking (eugeneyan.com)
232 points by alexmolas on June 13, 2023 | 69 comments



A tool for Obsidian which no one has yet made: AI memory palace illustrations

There would be a simple system for rooms, and the AI / program would edit them and add things to them which, when clicked on, could lead to new "places"


Can you elaborate? Rooms + exits can already be done by creating links.


The AI would illustrate these rooms and allow their manipulation.


> We start by parsing documents into chunks. A sensible default is to chunk documents by token length, typically 1,500 to 3,000 tokens per chunk. However, I found that this didn’t work very well. A better approach might be to chunk by paragraphs (e.g., split on \n\n).

Hmm, good insight there. I've done some experimenting with chunking by token length and it's been pretty troublesome due to missing context.
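For what it's worth, splitting on paragraphs is nearly a one-liner; a minimal sketch (a hypothetical helper, not the article's actual code):

    def chunk_by_paragraphs(text: str, min_chars: int = 40) -> list[str]:
        # Split on blank lines and drop tiny fragments (stray headers, etc.).
        paras = [p.strip() for p in text.split("\n\n")]
        return [p for p in paras if len(p) >= min_chars]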


You don't do a sliding window? That seems like the logical way to maintain context but allow look up by 'chunks'. Embed it, say, 3 paragraphs at a time, advancing 1 paragraph per embedding.
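Roughly like this (a sketch; the window and stride sizes are just examples):

    def sliding_chunks(paragraphs: list[str], window: int = 3, stride: int = 1) -> list[str]:
        # Each chunk covers `window` paragraphs; consecutive chunks overlap by window - stride.
        return [
            "\n\n".join(paragraphs[i:i + window])
            for i in range(0, max(len(paragraphs) - window + 1, 1), stride)
        ]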


This is only a good idea if you are *specifically not* using OpenAI.

If you use local models then it's a fantastic idea.


If you're concatenating after chunking, then the overlapping windows add quite a lot of repetition. Also, if it cuts off mid-JSON / mid-structured output, then overlapping windows once again cause issues.

Define a custom recursive text splitter in langchain, and do chunking heuristically. It works a lot better.
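A minimal sketch with langchain's built-in RecursiveCharacterTextSplitter (the separators, sizes, and file name are just illustrative):

    from langchain.text_splitter import RecursiveCharacterTextSplitter

    # Fall back from paragraphs to lines to sentences to words as needed.
    splitter = RecursiveCharacterTextSplitter(
        separators=["\n\n", "\n", ". ", " "],
        chunk_size=1000,
        chunk_overlap=0,  # no overlapping windows
    )
    chunks = splitter.split_text(open("note.md").read())  # hypothetical file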

That being said, it is useful to maintain some global and local context. But, I wouldn't use overlapping windows.


In place of simply concatenating after chunking, a more effective approach might be to retrieve and return the corresponding segments from the original documents that are relevant to the context. For instance, if we're dealing with short pieces of text such as Hacker News comments, it's fairly straightforward. Any partial match can prompt the return of the entire comment as it is.

When working with more extensive documents, the process gets a bit more intricate. In this case, your embedding database might need to hold more information per entry. Ideally, for each document, the database should store identifiers like the document ID, the starting token number, and the ending token number. This way, even if a document appears more than once among the top results from a query, it's possible to piece together the full relevant excerpt accurately.
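Concretely, each embedding entry might carry enough metadata to reassemble the source span; a hypothetical schema:

    from dataclasses import dataclass

    @dataclass
    class ChunkRecord:
        doc_id: str       # which document the chunk came from
        start_token: int  # offset of the chunk's first token in the doc
        end_token: int    # offset one past its last token

    def merge_spans(hits: list[ChunkRecord]) -> dict[str, tuple[int, int]]:
        # If a document shows up in several top hits, stitch the spans together.
        spans: dict[str, tuple[int, int]] = {}
        for h in hits:
            lo, hi = spans.get(h.doc_id, (h.start_token, h.end_token))
            spans[h.doc_id] = (min(lo, h.start_token), max(hi, h.end_token))
        return spans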


I don't think the repetition is a problem. He's using a local model for human-assisted writing with pre-generated embeddings - he can use essentially an arbitrary number of embedding calls, as long as it's more useful for the human. So it's just a question of whether that improves the quality or not. (Not that the cost would be more than a rounding error to embed your typical personal wiki with something like the OA API, especially since they just dropped the prices of embeddings again.)


I've thought about doing this as well, but I haven't tried it yet. Are there any resources/blogs/information on various strategies on how to best chunk & embed arbitrary text?


I’ve been experimenting with sliding window chunking using SRT files. They’re the subtitle format for television and have sequence numbers from 1 to _n_ for each chunk, along with timestamps for when the chunk should appear on the screen. Traditionally it’s two lines of text per chunk, but you can make chunks of other line counts and sizes. Much of my work with this has been with SRT files that are transcriptions exported from Otter.ai; GPT-3.5 & 4 natively understand the SRT format and the concepts of the sequence numbers and timestamps, so you can refer to them or ask for confirmation of them in a prompt.
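A rough sketch of what I do (hand-rolled parsing; real SRT has a few more edge cases):

    def parse_srt(path: str) -> list[str]:
        # Cues are separated by blank lines: sequence number, timestamps, then text.
        cues = []
        for block in open(path, encoding="utf-8").read().strip().split("\n\n"):
            lines = block.splitlines()
            if len(lines) >= 3:
                cues.append(" ".join(lines[2:]))
        return cues

    def windows(cues: list[str], size: int = 4, stride: int = 2) -> list[str]:
        # Overlapping windows of cues; the tail windows may be shorter.
        return [" ".join(cues[i:i + size]) for i in range(0, len(cues), stride)]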


The unstructured package works well to partition text, markdown, HTML, even PDF on structural boundaries like paragraphs, headings, horizontal rules, etc.

https://unstructured-io.github.io/unstructured/bricks.html#p...
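e.g. (a minimal sketch; the file name is hypothetical):

    from unstructured.partition.auto import partition

    # partition() dispatches on file type and yields structural elements
    # (Title, NarrativeText, ListItem, ...), one per boundary.
    elements = partition(filename="note.md")
    chunks = [str(el) for el in elements]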


This looks great! I was about to start learning and diving into Obsidian about a month ago, finally driven to begin building a personal knowledgebase...

And then I found Mem.ai and dove into that instead, and I've been extremely happy with it. It accomplishes what's being offered here (using your knowledgebase to assist in your writing). However, it's also got built-in chat with your knowledgebase, and helps with auto-sorting and all of that.

For those that want their data on their computer, I totally see why Obsidian is the most desirable. So this sort of addition would be the best of both worlds for them.


I'm not sure an offline-first document editor is comparable to a hosted AI SaaS. This plugin is one of many, while mem.ai is a non-customizable tool where someone else owns your data and which seems to offer no data portability.


I was excited about Mem a few years ago, but I was disappointed that they did not support math/MathJax notation ($…$) despite the millions raised.


How does Notion compare?


I haven't used Notion, but from my research its AI feature is only the Smart Write/Edit that Mem has, though I'm unsure how well it uses the rest of the content you have inside Notion, as their sales page doesn't really make that clear.

Mem.ai has integrated many aspects into it - I love that I am now unconcerned about tags or folders or categories.


So what is the sample size at which this becomes useful? A notes vault doesn't seem large enough.

Obsidian/Logseq are already great for thinking by way of their "show a random note" feature. Usually pulling up unfinished thoughts from past days gives me an idea for extending them.


I use vim with the Copilot plugin. It's pretty astounding what it spits out. I was writing a handbook for a credit union board of directors and it was quite helpful at times.


This is a great idea. I'll have to try opening my Obsidian vault with the new Copilot chat.


I thought Copilot only worked for code. How did you use it to write a handbook?


1. Install the vim Copilot plugin. 2. Name your file with a .txt extension. 3. Write stuff.

https://github.com/github/copilot.vim


Copilot is designed for code, but it can still be used for whatever you want. I found it useful when writing LaTeX files, even for the text explanation portions as opposed to the text markup 'code'. There was a Hacker News thread on this earlier.

https://news.ycombinator.com/item?id=29920035


Another option is to use llama-index and index the Obsidian vault. I use a Gradio-based web interface to query my Obsidian vault via GPT-3.5. It's pretty awesome.
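The core of it is only a few lines (a sketch against the 2023-era llama-index API; the vault path and query are illustrative):

    from llama_index import SimpleDirectoryReader, VectorStoreIndex

    # Load every markdown note in the vault and build a vector index over it.
    docs = SimpleDirectoryReader(
        "/home/me/vault", recursive=True, required_exts=[".md"]
    ).load_data()
    index = VectorStoreIndex.from_documents(docs)
    print(index.as_query_engine().query("What have I been working on lately?"))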


Do you have code for this (or pointers to what you used to set this up)? I'd love to set up something similar!


This: https://github.com/hwchase17/chroma-langchain/blob/master/pe...

I turned that into a CLI tool, but it isn't ready to release. Mine works like this:

    qwoo index .
    qwoo qa "What have I been up to?"


Does your data stay local by doing this?


Mostly, but it does upload some of the vectorized data to insert into the prompt for context. When you do a query, llama-index tries to discover content related to your prompt and injects it for context, so it's not entirely local.


> Mostly, but it does upload some of the vectorized data to insert into the prompt for context. When you do a query, llama-index tries to discover content related to your prompt and injects it for context, so it's not entirely local.

When you say "upload some of the vectorized data", do you mean in a numerical embedding form, or that it will embed the original text from similar-seeming notes directly into the prompt? I've only ever done the latter; is there a way to build denser prompts instead? I can't find examples on Google.


The numerical vector representing the embedding is only useful for finding documents that are similar to your search query.

Those documents are then injected into your prompt and sent to some kind of LLM completion system such as GPT.

So yes, you will be sending chunks of your actual notes over the wire.
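Schematically, the usual retrieve-then-inject pattern looks like this (a sketch, not llama-index's internals):

    import numpy as np

    def top_k(query_vec: np.ndarray, chunk_vecs: np.ndarray,
              chunks: list[str], k: int = 3) -> list[str]:
        # Cosine similarity between the query and every stored chunk vector.
        sims = chunk_vecs @ query_vec / (
            np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec)
        )
        return [chunks[i] for i in np.argsort(-sims)[:k]]

    def build_prompt(question: str, context: list[str]) -> str:
        # The raw text of the retrieved chunks is what goes over the wire.
        return "Context:\n" + "\n---\n".join(context) + f"\n\nQuestion: {question}"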


Are you calculating your embeddings locally, and using OpenAI's APIs only for text generation?


We really need a better search engine for Obsidian notes; the default search is horrendous.


In the past I have used Omnisearch, which I have found to be an improvement.

https://github.com/scambier/obsidian-omnisearch


This is why I don't use Obsidian. Without giving root to an aggregate of random git repos, the most important features are very subpar. I have no idea what I would even do with a fancy star graph or whatever it is, and VS Code has much better search.

Edit: Also, why is the search bar for searching all notes so buried, requiring so much effort to open? Is that because it works so poorly?


Smart Connections allows you to search using embeddings[1].

[1] https://github.com/brianpetro/obsidian-smart-connections


What would you like to see? (I work on Obsidian)


He said he’d like to see a better search engine


I was hoping for something more descriptive than “better”


Semantic search. The current search feels like it is barely doing something smarter than substring matching / basic regex. The search should be more like Google and less like matching substrings


Fair enough. The built-in search is terrible; it doesn't respect any of the regex filters you set up to exclude files and folders. I fought with it for almost an hour trying to get it to exclude PNG images, and anything in a 'media' or 'attachment' folder, and it never worked completely. Then I installed Omnisearch and it just worked, instantly.


I've been using the Text Generator Obsidian plugin for a while now:

https://text-gen.com/

It uses GPT-3.5 and requires an OpenAI API key.

It has prompt templates.

For bulk processing of files I still use my own Python scripts.


Um... can someone explain what this actually does?

In the video the user chooses the 'Copilot: Draft' action, and wow, it generates code...

...but, the 'draft' action [1] calls `/get_chunks` and then runs 'queryLLM' [2] which then just invokes 'https://api.openai.com/v1/chat/completions' directly.

So, generating text this way is 100% not interesting or relevant.

What's interesting here is how it's building the prompt to send to the openai-api.

So... can anyone shed some light on what the actual code [3] in get_chunks() does, and why you would... hm... I guess, do a lookup and pass the results to the openai api, instead of just the raw text?

The repo says: "You write a section header and the copilot retrieves relevant notes & docs to draft that section for you.", and you can see in the linked post [4], this is basically what the OP is trying to implement here; you write 'I want X', and the plugin (a bit like copilot) does a lookup of related documents, crafts a meta-prompt and passes the prompt to the openai api.
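Schematically, this is the flow I expected (a hypothetical sketch; the endpoint URL, port, and response shape are assumptions, not the plugin's actual code):

    import requests

    header = "Benefits of paragraph chunking"  # the section header you typed
    API_KEY = "sk-..."                         # your OpenAI key

    # Fetch related note chunks, then ask the model to draft from them.
    chunks = requests.get("http://localhost:8000/get_chunks",
                          params={"query": header}).json()
    prompt = (f"Draft a section titled '{header}' using these notes:\n"
              + "\n".join(chunks))
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": "gpt-3.5-turbo",
              "messages": [{"role": "user", "content": prompt}]},
    )
    print(resp.json()["choices"][0]["message"]["content"])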

...but, it doesn't seem to do that. It seems to ignore your actual prompt, lookup related documents by embedding similarity... and then... pass those documents in as the prompt?

I'm pretty confused as to why you would want that.

It basically requires that you write your prompt separately beforehand, so you can invoke it magically with a one-line prompt later. Did I misunderstand how this works?

[1] - https://github.com/eugeneyan/obsidian-copilot/blob/bdabdc422...

[2] - https://github.com/eugeneyan/obsidian-copilot/blob/bdabdc422...

[3] - https://github.com/eugeneyan/obsidian-copilot/blob/main/src/...

[4] - https://eugeneyan.com/writing/llm-experiments/#shortcomings-...


That looks really helpful. I'll give it a try.

I like the "open" aspect and using it to pull from my docs rather than being a cloud based thing.


I love obsidian but the first plugin I tried (and paid for) led to subtle data loss and resulted in many hours of checking and merging a month of backups. Not going to risk that again.


Wow, that's awful -- and an extreme outlier. Most plugins are free. And given that, under the hood, Obsidian notes are just markdown files on the local filesystem, backups can be managed by git or Time Machine or rsync or whatever else you might use on other directories. That's not to discredit your experience, just speaking up for the sake of others who might be unduly scared off.


Sure, this one had a free version and it was great. Except after a few weeks Obsidian started behaving weirdly, and it was 100% reproducible by enabling or disabling the plugin in question. And then I noticed that some of the markdown files had been modified (sections deleted) - but I hadn't noticed immediately, so they'd had edits after they were modified - hence the tedious manual merge.

So it may be "just this one plugin", but Obsidian is so important that I'm just not willing to risk it.


[flagged]


This will sound harsh, but I won’t sugarcoat it: OpenAI isn’t a charity.


But at this point it should be (maybe not OpenAI, but some other organization). OpenAI is close to becoming a necessity and a human right, just like education, so it should be 100% free and accessible at some point.

EDIT: I meant AI, not specifically OpenAI.


"OpenAI is close to becoming a necessity and a human right" is the wildest claim I have heard yet about AI. (Though it's possible that maybe someday I will agree with this).


and some still say that AI isn't being overhyped.


“It’s my human right to delegate my thinking to a higher power” works for religions and cults so why not machine learning?


I say this calmly: it's wild for you because you're probably part of the privileged group of people who can sustain themselves with a steady job and have no problem paying for it.


Like having access to HN via some form of Internet access?

Envy doesn't create rights for oneself nor does it impute privilege to others, and people who read and write on the internet about privilege seem blinkered, to me, about how they'd sound to someone who walks two miles for water polluted by the mining of rare earth elements.


> Envy doesn't create rights for oneself nor does it impute privilege to other

I'm saving this for posterity!


Question: why is this company a necessity and a human right?


See my edit. I meant AI generally (such as AI-aided/enhanced learning, communication, teaching, etc.), not OpenAI the company per se. For disabled people (like me) first and foremost, but right after that for the general populace as well.


What is your definition of a human right, and why does AI access meet it?


What are your skills? Do you want to work?

Contact me to discuss. Charles@turnsys.com


>OpenAI is close to becoming a necessity

I disagree. What gives you the idea that any software is a necessity? We (modern humans) have been around for 200,000+ years, and software has existed for ~0.04% of that time (and LLMs for less than a tenth of the time that software has been around).

Oxygen (in its molecular, O2 form) is a necessity. Water is a necessity. Nutrition of some sort is a necessity.

Pretty much everything else is a nice-to-have (with some things like money and shelter being important, but as we see from the poverty and homelessness around the world, definitely not a necessity).

I'd posit that LLMs are helpful and sometimes even useful. But necessary? I think not.

I'd note that I'm not dismissing LLMs, nor am I trying to dump on you. But the idea that any software is necessary is ridiculous on its face.

cf. https://www.merriam-webster.com/dictionary/necessary


The right to AI is different than the right to free hosted inference.

Anyone can download a model and use it completely offline, or host it on their own server. And there's a lot of effort to make these models work on devices that don't have much computing power (even phones [0]), which increases access even more.

Since this is the world we currently live in, what are you suggesting should change?

[0] https://mlc.ai/mlc-llm/


If you use the Microsoft Edge browser and go to bing.com, you can use GPT-4 for free. Bard is also free, AFAIK, albeit not as high-quality (yet).


ChatGPT has a free tier that anyone can sign up for. It’s eminently usable.


Not OpenAI. But AI in principle? Perhaps.

This would have to be solved by actual nonprofits, though.


There are smaller open-source options that you can use; however, they don't quite live up to using a large open-source model (which likely won't fit on your machine) or commercial models like those of OpenAI.


This is great to hear. Thanks. I'll see if I can find them, if only for the sake of curiosity.


You can use https://localai.io if you have a GPU or Apple Silicon CPU to serve up local models with an OpenAI-compatible API.
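Since it speaks the OpenAI wire format, you can just repoint the client (a sketch with the 2023-era openai-python library; the port and model name depend on your setup):

    import openai

    # Point the OpenAI client at a LocalAI server instead of api.openai.com.
    openai.api_base = "http://localhost:8080/v1"  # wherever LocalAI is listening
    openai.api_key = "not-needed-locally"

    resp = openai.ChatCompletion.create(
        model="ggml-gpt4all-j",  # whatever model your LocalAI instance serves
        messages=[{"role": "user", "content": "Hello from my own hardware"}],
    )
    print(resp["choices"][0]["message"]["content"])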


Do you know of a list of hardware recommendations to run this?


That's a loaded question, because there are different approaches you can take to run these models. Basically, you want lots of memory (RAM or VRAM), and the more you have, the larger the models you can run.

I'd recommend shooting for at least 13B models.

Use "oobabooga/text-generation-webui", which can also serve an OpenAI-compatible API as well as provide a chat interface. It can serve most models, using most methods.

Check out their system requirements page[0], and join some of the communities to learn more about what hardware will work best for you.

This person[1] is providing models of all sorts, in pretty much every optimized format. They also post the minimum RAM requirements for each of the GGML models, which are best if you want to host using CPU/RAM (no video card).

[0] https://github.com/oobabooga/text-generation-webui/blob/main...

[1] https://huggingface.co/TheBloke


I have money for the API, but I wouldn't ever pay for it. If you want to learn things, read. If you want to create things, create. No need for some language model assistance. You'll end up relying on it, knowing nothing yourself and having no abilities. At most it's useful for people writing marketing texts...



