Ask HN: What have you built with LLMs?
372 points by break_the_bank 8 months ago | 334 comments
Curious what people have been building with LLMs.

I worked on a Chrome extension a few weeks ago that skips sponsorship sections in YouTube videos by reading through the transcript. I also experimented with using an LLM to explain a function call chain across languages (in this case Makefile, Python, and Bash). I've tried running a few Telegram bots that are pre-prompted to do certain things, like help you with taxes.

What are you building?

What does the stack look like? How do you deploy it?




I don't like selling. I wanted a way to practice cold calling in a realistic way. I set up a phone number you can call and talk to an AI that simulates sales calls.

I ended up using it for more general purpose things because being able to have a hands-free phone call with an AI turned out to be pretty useful.

It's offline now, but here's the code with all the stack and deployment info: https://github.com/kevingduck/ChatGPT-phone/

Edit: forgot to mention this was all running off a $35 raspberry pi.


So the AI tries to sell to you, or you try to sell to the AI? This sounds very intriguing but I can tell by your README that you're an engineer and not a sales guy - there are no distinct value propositions.

But it sounds damn creative as a project.


The AI answers the call and acts as a potential customer. They take on personas to simulate behaviors like difficult or reluctant customers. You then do your pitch, handle objections, etc. At the end you get a transcript that's 'graded' to show you where you could improve your sales approach.

And you're right, I'm not a sales guy. This project is for people like me who want a risk-free place to learn the basics of sales so that when I do talk to an actual human, I won't panic and freeze up like I always do.


I absolutely love this idea.

Most high-level salespeople rely on role-play partners, but that requires a pretty big commitment. This would make a great product, imo.

Also (tip): Study, memorize and internalize a sales script for your product/service...along with the objection handlers and closing questions. Practice every single day. You'll gain massive confidence because you know exactly what you are going to say, every time.


> a risk-free place to learn

That's turning out to be a valuable feature of LLMs in many areas. You can practice complex interactions with them without worrying about boring or annoying them. Even the most patient human teacher gets tired eventually. LLMs don't.


I'd buy that. I'd buy that for interview preparation as well. Maybe $5 per hour, up to $15. I wouldn't buy a subscription, only pay for actual consumption of the service.

Please consider putting it back online.


Love the idea of AI grading the answers, hopefully this can be extended to marking/evaluating/grading subjective manuscripts.

For me, nothing is more boring than marking/evaluating/grading manuscripts. I'd rather do hard labor like gardening or farming than that, although I think I'm quite good at evaluating things.

Can you please elaborate on how you do this, and what metrics/scheme/etc. the answers are evaluated against?


This, for some reason, reminds me of Nathan Fielder rehearsal skits.


This would be amazing to do practice code/tech interviews for software engineering roles

It could work both for practice and for automating interviews as well


do you have any reason to believe the phone calls are realistic?


This could be a product. AI sales training.


Now you can turn this into an AI sales cold caller based on the data you could collect from how the AI reacts to your selling. That is to say, the entire system becomes a generative adversarial network.


Like exposure therapy for people afraid of sales. Very nice idea.


Nice term, this "exposure therapy".


Yes exactly!


I like the idea very much! Using an LLM as a "sparring partner" for training in various areas. LLMs tend to hallucinate, so I find it harder to use them reliably in the context of decision making. Training however is a nice idea indeed: mistakes are not as critical, just as in real life any peer can make a mistake.


Very cool, sounds like a saleable product. I feel like there's already half a dozen landing pages with people trying to sell what you just made in the 18 hours since you've shared it here. That should however be a red flag to those same people, a demonstration in just how easily commoditized LLM products are.


Are you finding response time to be an issue? I can imagine some very long pauses might kill the flow of conversation.


It's not perfect, but it's tolerable, and not unlike some real-world calls where there's a slight delay. There are some "Hmm ..." and "Well ..." fillers scripted in as well to make it feel natural when there's a long response time.


I love the scripted filler words, that’s smart


To that point, I would love to hear an audio file of it in action since I see from GitHub the phone number is down.


That's cool. Thanks for sharing the source. What else has it been good at for you?


The cold call sales part can be replaced to suit any need. I had another version that was just a generic AI (no sales stuff). I found myself on walks frequently ringing up the chatbot ("Hey Siri, call ChatGPT") and just asking it whatever was on my mind. "Tell me about Genghis Khan" or "where's a good place to catch trout in north Georgia" or "how do I make baked ziti". Makes the walks go by super quickly.


Would you be willing to provide a live demo (via web interface) as a prelude to providing a similar training bot as a consultant?


Now do it for dating practice - great for nerds ;)


I helped "write" a cookbook from my grandmother's recipes. For her 100th birthday, my dad rescued more than 250 pages of recipes that my grandma had collected over the years. Some were typewritten, others handwritten by her. So my dad scanned (photographed) all the typed recipes and "dictated" all the handwritten ones.

For the dictated recipes, I told him to dictate just the "flat" words and numbers, so that I ended up with plain paragraphs of recipe text.

For the scanned recipes, I used Google OCR (I found out it was the best one quality wise).

For both sets of recipes, I then used GPT4 to "format" the unformatted recipes into well formatted Markdown. It successfully fixed typos and bad OCR from Google.

We then pasted all that well-formatted text into a big Google Doc and added images. Using OpenAI image generation, I generated images for each of the 250+ recipes. For some of them I had to manually curate the prompts, given that some of the recipes are for typical Mexican food: for example, there's a (delicious) recipe called "PibiPollo" that to the uninitiated may look like a stew, so I had to describe it as something like "large corn tamale with thick hard crust".

In the end, the book was pretty nice! We distributed digital copies within the family and everybody was amazed :) . I loved spending time doing that.


This is absolutely awesome. I really want to do the same for my mom's recipes before it's too late. Though I wonder what would have happened if you had gone with GPT-4V or LLaVA and the like. I have a hunch you might have been able to skip the OCR part and go straight from picture to Markdown? Would be awesome if you could try and compare!


Would you mind sharing the cookbook or excerpts from it? I'd love to see it.


I cannot share the full book because a) I don't own the copyright and b) my dad (who ultimately owns it) still has plans to sell it. And I feel it still requires some editing. But I can share a couple of sample pages:

https://drive.google.com/file/d/1OGE-zfNHHDnALbhgmf3lykBjcSg...

It is in Spanish though.


Very cool, thanks for sharing! The pictures also look a lot better than I imagined.


That's great!


My "stack" is just Apple Shortcuts making HTTP POST API calls to OpenAI, which does stuff in macOS via BetterTouchTool. I trigger each by hotkey or by typing a few letters into Spotlight (with Alfred).

* One transcribes and summarizes whatever YouTube URL is highlighted.

* One does grammar and style correction of whatever is highlighted (and replaces it).

* One simply replaces the Dictate key with OpenAI Whisper but otherwise works exactly like voice typing. It's just way more accurate.

* One replaces the magnifying glass key to have a voice conversation with ChatGPT (using Microsoft voice synthesis). The built-in prompt keeps its answers short and conversational. It's like asking Siri something, but much better.

* One reduces the highlighted text by ~50% by rewriting it shorter, for when I have typed too much.

* One gives the key points of whatever article is in the foreground tab, so I know what I'm about to read.

* One outputs purely code; for example, I use my voice to say "javascript alert saying blah" and alert("blah"); will appear at my cursor. Of course, it's usually more complex boilerplate stuff, but it helps speed up my coding.

Every time I find myself using an LLM repeatedly for something, I make it into a little Apple Shortcut to streamline it into my workflow, as if it were a built-in macOS feature.


Could you please share the prompt for the grammar and style correction shortcut? I've just started using it for the same purpose, but I haven't been able to find a prompt that yields consistent results. Sometimes, ChatGPT completely changes the style of my text.


I use role: system, temperature 0.7, prompt: "Fix the spelling, grammar, punctuation, order, and sentence structure." It does sometimes change the style too much, but not often enough to annoy me into fiddling with it.
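To replicate that shortcut outside of Apple Shortcuts, the call is just a plain HTTP POST. A stdlib-only Python sketch (the model name is my placeholder; the parent comment doesn't say which one they use):

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"

def build_payload(text, model="gpt-3.5-turbo", temperature=0.7):
    """Mirror the Shortcut: a system prompt that fixes the highlighted text."""
    return {
        "model": model,
        "temperature": temperature,
        "messages": [
            {"role": "system",
             "content": "Fix the spelling, grammar, punctuation, order, "
                        "and sentence structure. Return only the corrected text."},
            {"role": "user", "content": text},
        ],
    }

def fix_text(text, api_key):
    """POST the payload and return the model's corrected text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(text)).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

A tool like BetterTouchTool or a shell alias can then pipe the clipboard through `fix_text` and paste the result back.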


Have you tried Raycast? It has all the AI scripts you mentioned and many more. And many things done better, like showing diff before inserting grammar-corrected text.


Looks expensive at $20/month if you want GPT-4


I have been looking for a way to do "push to record audio" (instead of Mac's Dictate) for ages, thanks for the push to look at Shortcuts!

Are you using the "Record Audio" action or something else? Ideally the shortcut would stop listening after a pause, like the native Dictate feature does. At a minimum, Record Audio seems to require hitting the spacebar to stop - not great, but not terrible.


Yes, "Record audio". BetterTouchTool launches the shortcut on keydown, then clicks the Stop button on keyup.


I really love that, super good ideas. I also generally love creating workflow optimizations. I'll probably recreate some of your stuff for myself (I especially like the dictation replacement - it could be super useful to me).

Wondering: how big is your monthly OpenAI bill when using all these tools? Only a few $$$, or is it higher?


Only a few dollars a month


You beast! They all sound awesome!


These sound amazing, if you don’t mind sharing somehow, I’d love to see how these work. I’ve never used shortcuts, but I think you’ve inspired me to try.


I put a couple of screenshots here https://news.ycombinator.com/item?id=39283515 to show the API call part. The rest is just whatever you want it to feed into in Shortcuts. For launching a shortcut on keydown, and clicking out of it on keyup, I used BetterTouchTool like this https://i.imgur.com/sqJ7cOc.png


Would love to learn more details about your setup! I use BetterTouchTool, too and wonder how to make use of it + shortcuts + the API


You might want BetterTouchTool too, since it adds things like cut and paste to Apple Shortcuts' All Actions list. I also usually use it as the initial trigger, to make a hotkey launch a Shortcut. Whisper looks like this https://i.imgur.com/ApAwf2E.png and ChatGPT looks like this https://i.imgur.com/g9f9ZDH.png .


Nobody heard of Raycast?


I have not. It looks like it just does some things Alfred was already doing 10 years earlier, and some ChatGPT. Seems like everything has ChatGPT integration these days. I don't think it's compelling enough for me to try tho personally, but I only skimmed the home page. The $8/mo subscription model turned me off a bit.


What are the settings and prompt you use for the youtube one?


I built an Interactive Resume AI chatbot where anyone can ask questions about my experience and skills: https://www.jon-olson.com/resume_ai/

The backend is a Python FastAPI app that uses ChromaDB to store my resume and Q&A pairs, OpenAI, and Airtable to log requests and responses. The UI is SvelteKit.
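Not the author's code, but the answer step in a stack like this usually boils down to stuffing the chunks ChromaDB retrieves into a system prompt. A minimal sketch with invented chunk text:

```python
def build_rag_prompt(question, retrieved_chunks):
    """Stuff retrieved resume chunks into a system prompt - a common RAG pattern."""
    context = "\n---\n".join(retrieved_chunks)
    return [
        {"role": "system",
         "content": "Answer questions about the candidate using ONLY the "
                    "context below. If the answer is not in the context, "
                    "say you don't know.\n\nContext:\n" + context},
        {"role": "user", "content": question},
    ]

# Hypothetical chunks standing in for whatever the vector store returns:
messages = build_rag_prompt(
    "What languages does the candidate know?",
    ["Skills: Python, TypeScript, SQL", "Experience: 10 years backend"],
)
```

The "ONLY the context" instruction is what keeps the bot from inventing experience that isn't on the resume.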

I'm currently building a different tool and will apply some learnings to my Interactive Resume AI. Instead of Airtable, I am going to use LangSmith for observability.

I started writing and my Substack articles are also linked to via my website. I'm currently working on applying sentence window retrieval and that article will be out shortly. This is part of a #buildinpublic effort to help build my brand as well.

I've been unemployed since Sept as a Senior Software Engineer. The market is tough so I'm focusing on the above to help get employment or a contract.


Nicely done Jon. I really like the UI - I wanted to have buttons as well but didn't find how to do it in Streamlit.

I also built a Resume Chatbot, but using a slightly different stack: Python, Langchain, Faiss as the vector store, MongoDB to store chat logs, and Streamlit for the UI. Here is a link: https://www.artkreimer.com/resume/ or you can try it on Streamlit https://art-career-bot.streamlit.app/. Code is available here https://github.com/kredar/data_analytics/tree/master/career_.... Great thread, and I got some ideas for my next project. Thanks a lot everyone.


Hey Jon - I'm Jon too - working on an AI startup in the recruiting space and will be hiring remotely. I like your resume app and can definitely see its utility. I'd be happy to connect and see if there's a way we could work together! I'll find you on LinkedIn and send you an invite request.


Resume AI is cool, really nicely done, mate !


Thanks. This is the first real test apart from a couple dozen test users. I've received hundreds of prompts in the past 24 hours from Hacker News users, mainly from my suggested questions buttons.

The actual questions I got did not always produce a response to my liking. Most of that is because I'm using GPT-3.5, since GPT-4-turbo is a lot more expensive, and I can learn a lot more by using an inferior LLM.

For example, using an LLM router to analyze the query and route it to a specific helper function with a specific prompt would help. Sometimes a user starts with a greeting, but the response is a pre-written "Sorry, an answer cannot be found." Questions typically fall into categories such as skills, experience, projects, personal (i.e., where are you located?), preferences (i.e., favorite language?), and general interview questions (i.e., why should I hire you?). Questions in each category can be better answered with a different prompt and/or RAG technique.
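A minimal version of that router idea, with the categories from the comment above; the LLM classification call is optional and stubbed with keywords here, so treat this as a sketch of the dispatch logic, not the site's implementation:

```python
CATEGORY_PROMPTS = {
    "greeting": "Reply with a short, friendly greeting and offer to answer resume questions.",
    "skills": "Answer using the skills section of the resume context.",
    "personal": "Answer using the personal-details section of the resume context.",
    "interview": "Answer as the candidate would in an interview, citing the resume.",
}

def classify(question, llm=None):
    """Ask an LLM to pick a category; fall back to keywords when no LLM is wired up."""
    if llm is not None:
        return llm(f"Classify into exactly one of {sorted(CATEGORY_PROMPTS)}: {question}").strip()
    words = question.lower().replace("?", "").replace("!", "").split()
    if words and words[0] in ("hi", "hello", "hey"):
        return "greeting"
    if any(w in words for w in ("skill", "skills", "language", "languages", "stack")):
        return "skills"
    if any(w in words for w in ("located", "location", "live")):
        return "personal"
    return "interview"

def route(question, llm=None):
    """Pick a category, then the system prompt that category should use."""
    category = classify(question, llm)
    return category, CATEGORY_PROMPTS.get(category, CATEGORY_PROMPTS["interview"])
```

With a real LLM plugged in, `classify` would make one cheap call, and only the routed prompt plus relevant RAG context would go to the expensive answering call.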


I’m sorry it’s been tough. The job market for seniors and leads is still quite strong in Australia if you can move here


Thanks. I'm starting to realize that part of the problem is job search and matching.

I was contacted by a company recruiter for a small healthcare SaaS in California and had 3 interviews recently. When I looked up the job, only 7 people had applied in 2 weeks on LinkedIn. They are a very real company with very real people, but their job post is not getting seen (it's not a promoted post).

My next AI project will be to scrape LinkedIn jobs, analyze it for repost/promoted behavior, group it by consulting/headhunters vs company job post, eliminate duplicates, and filter based on my skillset and hard-no qualities (such as can't work if I live in California, must be in EST but I'm in PST timezone, requires Java experience, etc).


... btw, are you sure that only 7 people applied for that job? There are a lot of job postings on LinkedIn that just won't show the number of applicants correctly when there's an application link outside of LinkedIn, meaning the application doesn't take place within LinkedIn. In that case, LinkedIn asks you whether you applied for the job, which most people just won't click. I'm seeing this all the time.

But still a good point that there might be promoted jobs and non-promoted ones; maybe it's worth creating your own job scraper.


That's a good point about the applicant number. I don't think anyone knows exactly how it works, but it was the first time I saw a job posting older than a few hours with <10 applicants and such straightforward required skills as Python.


Just played with your app - I think it's super cool! I especially liked the way you can just click the next question within an answer; makes it super convenient and fun to use.

I'm currently also looking for a dev job. So you have 15 years of experience, live in California and struggle to find something? That sounds a bit demotivating to me lol, because I'm kinda half of all of that or a bit less.

I also like your LinkedIn analysis idea, should try that maybe, too.


I'm looking for remote dev job, which is much harder, since I am in Sacramento.


I had a similar idea: scraping LinkedIn and some other job boards, then analyzing and highlighting the jobs that best fit specific criteria.


I've done a handful of fun hardware + LLM projects...

* I built a real life Pokedex to recognize Pokemon [video] https://www.youtube.com/watch?v=wVcerPofkE0

* I used ChatGPT to filter nice comments and print them in my office [video] https://www.youtube.com/watch?v=AonMzGUN9gQ

* I built a general purpose chat assistant into an old intercom [video] https://www.youtube.com/watch?v=-zDdpeTdv84

Again, nothing terribly useful, but all fun.


Oh hey I just watched that pokedex video. It was so impressive! Deserves way more attention


Indeed! Such a beautiful project!


Great job on that Pokedex and the video entirely. So freakin cool!


We've made a lot of data tooling things based on LLMs, and are in the process of rebranding and launching our main product.

1. sketch (in notebook, ai for pandas) https://github.com/approximatelabs/sketch

2. datadm (open source "chat with data", with support for open source LLMs): https://github.com/approximatelabs/datadm

3. Our main product: julyp. https://julyp.com/ (currently under very active rebrand and cleanup) -- but a "chat with data" style app, with a lot of specialized features. I'm also streaming me using it (and sometimes building it) every weekday on twitch to solve misc data problems (https://www.twitch.tv/bluecoconut)

For your next question, about the stack and deploy: We're using all sorts of different stacks and tooling. We made our own tooling at one point (https://github.com/approximatelabs/lambdaprompt/), but have more recently switched to just using the raw requests ourselves and writing out the logic ourselves in the product. For our main product, the code just lives in our next app, and deploys on vercel.


Having a play with datadm. It's really good and intuitive to use - good job! I'm getting errors now, but was having a lot of fun before.


This is cool. Thank you for sharing.


I've built several things! These include bots for code generation that you can tag onto issues, q&a on text etc.

The thing I'm working on now is AI mock interviewing. It's basically scratching my own itch, since I hate leetcode prep, and have found I can learn better through interaction. To paste a blurb from an earlier comment of mine:

I'm building https://comp.lol. It's AI-powered mock coding interviews, FAANG style. Looking for alpha testers when I release - sign up if you wanna try it out or just wanna try some mock coding. If it's slow to load, sorry, everything runs on free tiers right now.

I really dislike doing leetcode prep, and I can't intuitively understand the solutions by just reading them. I've found the best way for me to learn is to seriously try the problem (timed, interview like conditions), and be able to 'discuss' with the interviewer without just jumping to reading the solution. Been using and building this as an experiment to try prepping in a manner I like.

It's not a replacement for real mock interviews - I think those are still the best, but they're expensive and time consuming. I'm hoping to get 80% of the benefit in an easier package.

I just put up a waitlist in case anyone wants to try it out and give me feedback when I get it out.

Gonna apologize in advance about the copywriting. Was more messing around for my own amusement, will probably change later


Very cool, I signed up. I agree that practicing a coding interview is better under pressure. It's a much different skill to solve a coding problem under time pressure while also speaking your thoughts aloud for the interviewer. Only practice can improve that skill.


Yeah, I agree, the scenario is totally different in an actual pressure situation; I've fumbled so many easy questions. I don't necessarily like leetcode-style questions as the industry standard for interviewing, but it's still a reality and, from what I'm noticing, expectations are only getting higher.

Thanks for signing up, will send out an email once it's ready to take for a spin!


A Twitter filter to take back control of your social media feed from recommendation engines. Put in natural language instructions like "Only show tweets about machine learning, artificial intelligence, and large language models. Hide everything else" and it will filter out all the tweets that you tell it to.

Runs on a local LLM, because even using GPT3 costs would have added up quickly.

Currently requires CUDA and uses a 10.7B model but if anyone wants to try a smaller one and report results let me know on github and I can give some help.

https://github.com/thomasj02/AiFilter
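The heart of a filter like this is a per-tweet yes/no classification. A rough sketch of the prompt-and-parse step (the instruction wording is mine, not the repo's, and the local-model call itself is left out):

```python
def build_filter_prompt(instruction, tweet):
    """Combine the user's natural-language rule with one tweet."""
    return (
        "You are a strict content filter.\n"
        f"User rule: {instruction}\n"
        f"Tweet: {tweet}\n"
        "Answer with exactly one word: SHOW or HIDE."
    )

def parse_verdict(model_output, default="SHOW"):
    """Be forgiving about model chatter; hide only on a clear HIDE verdict."""
    words = model_output.strip().split()
    token = words[0].upper().strip(".,!") if words else ""
    return token if token in ("SHOW", "HIDE") else default
```

Defaulting to SHOW on unparseable output is a deliberate choice: a flaky local model should fail open rather than silently hide your whole feed.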


That could actually be a universal ad-whacker for similarly stubborn sites (reddit)


I've been thinking the same thing. It'll be interesting to see if we end up with prompt-injecting ads


I didn’t know you could interact with pages like that so easily with Chrome extensions


I built an AI Hiring Assistant that performs an initial screening, collects candidate information, answers questions about the role, and also asks several behavioral interview questions: https://hiring.gracekelly.dev/

Built entirely on Vercel & OpenAI. Took about a day; the hardest part was configuring Sign In With Google. Had several dozen candidates use it - saved a lot of time and helped prioritize conversations.

I just did a brief writeup about it yesterday: https://www.linkedin.com/pulse/i-built-ai-hiringscreening-as...


> A few people emailed their resumes directly rather than using the chat

How did they fare compared to candidates that went through the chat process?


Small dataset for those that emailed, n=~3, but none of them were standout resumes. Best few candidates actually went through chat and also followed up via email with additional information a few days later.


I wrote a script that takes my credit card statement line by line and categorizes the transactions into a custom set of categories I care about, as well as generating a human-readable description of each transaction.
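Not the poster's script, but the usual shape of this kind of tool is: build a prompt per statement line, ask for JSON, and parse defensively. A sketch with an invented category list:

```python
import json

CATEGORIES = ["groceries", "dining", "transport", "subscriptions", "other"]

def build_prompt(statement_line):
    """One prompt per statement line, asking for machine-readable output."""
    return (
        f"Categorize this credit card transaction into one of {CATEGORIES} "
        "and write a short human-readable description.\n"
        f"Transaction: {statement_line}\n"
        'Reply with JSON only: {"category": ..., "description": ...}'
    )

def parse_response(raw):
    """Parse the model's JSON, falling back to 'other' on any malformed output."""
    try:
        data = json.loads(raw)
        category = data.get("category", "other")
        if category not in CATEGORIES:
            category = "other"
        return {"category": category, "description": data.get("description", "")}
    except (json.JSONDecodeError, AttributeError):
        return {"category": "other", "description": ""}
```

Clamping unknown categories to "other" matters in practice - models happily invent new category names if you let them.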


Tell me more, that is interesting. Even my bank (a big one) is unable to categorize the transactions correctly.


I'd love to see the script but especially the prompt you're using here.


Was thinking about this the other day too!


I used an LLM connected to a messaging service to defeat romance scammers. I was able to get these romance scammers to speak to my program for hours without knowing they were talking to a machine. Essentially, it's a DDOS for scammers. The scammers can only talk to a few dozen victims at a time, while the "people" in my programs can be spun up by the millions. It will essentially eliminate messaging scams from whatever messaging platform it's on.

I believe a large company like Meta, or any of the other companies with messaging platforms, would find this valuable. Especially because they will be fined by the UK for fraud that takes place on their messaging services.


At what point is it just two AIs talking to each other, back-and-forth?


Great idea! Defense through a trap, where the criminals will be used against themselves, I like it! Do you happen to have an address for your project so I can forward it? I may know people who would be interested in purchasing or supporting it.


LLM agents to forecast geopolitical and economic events.

- Site: https://emergingtrajectories.com/

- GitHub repo: https://github.com/wgryc/emerging-trajectories

I've helped a number of companies build various sorts of LLM-powered apps (chatbots mainly) and found it interesting but not incredibly inspiring. The above is my attempt to build something no one else is working on.

It's been a lot of fun. Not sure if it'll be a "thing" ever, but I enjoy it.


Fascinating. I've done this on a tiny, micro scale -- giving the GPT scenarios (eg, conversations, situations) and asking how it would play out. In early 2023 it seemed to work really well, now that they've nerfed it so much, it's a bit too generic and proper.


Have you tried GPT-4 with the update from the past few days? (When Sam mentioned it should be less lazy.) I notice it’s gotten much better and more willing to make forecasts since then.


No but I'll check it out. Thanks!


Very interesting, have you attempted to backtest to see if the LLM forecasts are accurate?


Thanks for asking! Not yet as I’ve been focusing on building agents that can properly and regularly log predictions.

Ideally, I’d like the agents to then participate in prediction markets or “superforecasting” groups to use actual human predictions as baselines.


If your project makes you rich and you need some engineering help, call me ;)


I built a couple of things, but the most useful is probably allalt[1], which describes images and generates alt tags for visually impaired users using GPT-4V. Next I want to add the option to use local LLMs via ollama[2], but I'm still trying to decide on the UX for that.

There's also Moss[3], a GPT that acts as a senior, inquisitive, and clever Go pair programmer. I use it almost daily to help me code and it has been a huge help productivity-wise.

[1] https://git.sr.ht/~jamesponddotco/allalt

[2] https://ollama.ai/

[3] https://git.sr.ht/~jamesponddotco/moss


A “YouTube video subtitles generator” script for Estonian content.

Powered by whisper-timestamped [1] using a model trained by the local tech university TTÜ [2]

And it just… works! (with some tweaks and corrections)

[1] https://github.com/linto-ai/whisper-timestamped

[2] https://huggingface.co/TalTechNLP/whisper-large-et
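whisper-timestamped gives you segment-level timestamps; producing subtitles from them is then mostly formatting. A sketch of the SRT-writing half (the segment dict shape follows Whisper's typical output; check the library's README for the exact schema):

```python
def fmt_time(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments):
    """segments: iterable of {'start': float, 'end': float, 'text': str}."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{fmt_time(seg['start'])} --> {fmt_time(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)
```

The output can be written straight to a `.srt` file and uploaded alongside the video.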


I've created just-tell-me [1] that summarizes youtube videos with ChatGPT. It's built with Deno, uses TypeScript and is deployed with Deno Deploy. It's open source, you can run it from CLI as well [2]

[1] https://just-tell-me.deno.dev/

[2] https://github.com/franekmagiera/just-tell-me


This is great! I ignore so many videos from friends and family because I suck at watching videos!


This is really neat!


I used FlowWise[1], LM Studio[2], the llama2[3] model, and Ollama[4] (for embeddings) to create a local-only RAG chatbot so I could chat directly with Tristram Shandy, Gentleman[5]. For the context document I used the text of the novel of the same name, downloaded from Project Gutenberg.

Primarily it was a PoC to see if a document based chatbot could work without crossing trust boundaries by calling out to untrusted APIs. It only makes calls to localhost.

If you’re familiar with the novel you will be pleased to know that the chatbot ended a recent answer with, “I must go now as I have an appointment with my chamber pot and I wouldn’t want to keep it waiting.”

[1]https://github.com/FlowiseAI/Flowise

[2]https://lmstudio.ai/

[3]https://llama.meta.com/

[4]https://ollama.ai/

[5]https://www.gutenberg.org/ebooks/1079

Everything runs on a Mac Mini with the M2 Pro CPU/GPU and Mac OS Sonoma.
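The retrieval core of a local-only RAG like this is tiny. A stdlib-only sketch of cosine-similarity retrieval, with hand-made toy vectors standing in for real Ollama embedding output:

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, chunks, k=2):
    """chunks: list of (text, embedding). Return the k most similar texts."""
    scored = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in scored[:k]]
```

In the real setup the embeddings would come from the local embedding model and the winning chunks of the novel would be stuffed into the chatbot's prompt - all without any call leaving localhost.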


Seriously!? I love the idea, but I perhaps love more that Tristram Shandy was your choice of character to chat with. You have good taste!


I'm building https://www.brief.news, an AI powered newsletter that condenses tens of thousands of news articles into a daily briefing of the top stories, we support 30 topics today and are adding the ability to add your own!

Stack is a combination of TypeScript (Next / Node) + Python with a pretty simple deployment setup right now (GHA -> Container -> Cloud Run).


How much do you need to spend per day on the OpenAI API?


A lot! We're actively reducing that, though, by training our own specialized models. We're seeing equal or better performance with curated datasets at more than a 10x cost reduction.


This is pretty cool. Just a heads up: there's a French newsletter company I was subscribed to that uses brief.eco, brief.me, and brief.science. Ironically, their main selling point is news summarized by humans.


There’s definitely a need here, love the category specific TLDs.


This looks awesome - might I suggest splitting the headlines on the homepage into a punchy title and subtitle? The wordiness of them makes it difficult for me to parse them for the topic quickly


Thanks! We got a couple different formats available, check out the top stories format which is close to what you suggest. Would love to hear your thoughts on it! We’re considering making that the default.


Have you considered a weekly version as well - I personally don't like receiving daily e-mails


Great stuff! For general news synthesis I use https://www.newsminimalist.com/ - definitely check it out!


This is fairly well done, good job!

The layout is clean, and it's fast. The summaries are solid as well.


Thanks!


I’ve always found podcast discovery to be lacking, so I’m building the ultimate solution to that.

We’re processing the top podcasts in many genres every day (currently thousands of daily episodes) and running them through our pipeline.

From this we’ve made a semantic search engine, for example: https://www.podengine.ai/podcasts/search?search_term=Should+...

We’re soon going to improve and summarise the responses from the raw embeddings in a few ways. Would love some feedback on the experience.

We have also opened up a keyword alerting feature to alert folks when they’ve been talked about in an episode.


I really like the idea of using embeddings in this way. I'm sure scaling out to get "most" of the podcasts is no joke. But some bigger podcasts like Smartless didn't seem to be in your database.

Have you considered using embeddings to show similar podcasts?


I wanted to automate the process of creating self-guided tours and online treasure hunts around towns and cities.

Ultimately I wanted a whole marketplace where anybody can create a tour and then sell it.

But the process of creating the tours was quite laborious.

So to speed this up I fed GPT-4 information about local points of information and had it write the questions and the multi choice answers. It also wrote some narrative bits as various personas. For example, there was a Christmas hunt where GPT4 played the part of an elf and came up with a theme about Santa needing to recruit you to be a new elf, once you’d answered all the various clues etc.

Front end is React Typescript, backend is Net Core Web API on Linux with MySQL under EF Core and also integrations with GPT4 and Stripe.

It’s hosted at treasuretours.org

Only superusers can access the AI tools right now because of cost, but you can try out some of the pre-made hunts, which were partially AI-generated.


I built an iOS and macOS offline LLM app called Private LLM[1]. I don't have any visibility into what the users do with it, but from what I hear on the app's discord, people love to use it in their Apple Shortcuts workflows for text manipulation.

I initially built it using llama.cpp for offline LLM inference, but soon discovered mlc-llm and moved to using it, because the latter is way faster and flexible.

[1]: https://apps.apple.com/us/app/private-llm/id6448106860


https://www.rivadata.com/

I have been hacking together a poor-man's crunchbase that's fueled by GPT.

React / Python / Supabase. The most interesting piece thus far has been the success of self-correcting loops through GPT: at each turn, basically feeding the results back to another GPT-3.5 prompt that is only about reviewing quality. I found that with these loops you can get solid results without having to use the more expensive GPT-4 API.
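A minimal sketch of that kind of generate-review-revise loop, with the model call stubbed out so the control flow is visible (`refine` and `fake_llm` are hypothetical names, not from the project):

```python
# Sketch of a generate -> review -> revise loop. `call_llm` stands in for a
# real chat-completion call (e.g. to gpt-3.5-turbo); it's injectable here so
# the loop itself can be run without an API key.

def refine(task_prompt, call_llm, max_rounds=3):
    """Generate a draft, then repeatedly ask a reviewer prompt to check it.

    The reviewer is expected to answer either "OK" or a critique; on a
    critique we regenerate with the feedback appended.
    """
    draft = call_llm(task_prompt)
    for _ in range(max_rounds):
        verdict = call_llm(
            "You are a strict reviewer. Reply 'OK' if the answer below is "
            f"accurate and complete, otherwise list the problems.\n\n{draft}"
        )
        if verdict.strip().upper().startswith("OK"):
            return draft
        draft = call_llm(f"{task_prompt}\n\nFix these issues:\n{verdict}")
    return draft


if __name__ == "__main__":
    # Toy stand-in model: rejects the first draft once, then approves.
    state = {"round": 0}

    def fake_llm(prompt):
        if "strict reviewer" in prompt:
            state["round"] += 1
            return "OK" if state["round"] > 1 else "Too vague."
        return "draft v2" if "Fix these issues" in prompt else "draft v1"

    print(refine("Summarize company X", fake_llm))  # -> draft v2
```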

(Also loving all the projects in this thread)


It's funny bc that's exactly how fine tuning is done rn. I find it amusing how we can leverage the same techniques pre and post. Wild wild west


This is awesome. I've been looking for this exactly in the last few days.


I have a somewhat unique answer for that: I started by building a product, and ended up building a dev platform for LLM-based products (more specifically, a dev platform for JSON-outputting LLM structured tasks).

Here's the story:

At first I was building a tool for stock analysis: the user writes in free language what companies they want to compare, along with a time period, and the requested stocks show up on a graph. They can then iterate on it further, adding companies and changing the range, all in free language (I had many more analysis functions planned). After running into some unusual dev challenges, I ended up not releasing the product (possibly will sometime in the future..), and switched to working on a dev platform to help with those challenges.

I was using what I called an 'LLM structured task': basically instructing the LLM to perform some task on the user input and output a JSON that my backend can work with (in the described case: finding the mentioned companies and an optional time range, and returning stock symbols and string-formatted dates). The prompting turned out to be non-trivial and kind of fragile: things broke with even minor iterations on the prompt or model configuration. So I developed a platform to help with that: testing (templated) prompt versions, as well as model configurations, on whole collections of inputs at once, making sure nothing breaks during development (or after). If you're interested, you're welcome to check it out at https://www.promptotype.io
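For anyone curious what an 'LLM structured task' looks like in code, here's a minimal sketch with validation and one retry. The prompt wording, field names, and `call_llm`/`run_task` are all illustrative stand-ins, not the actual product:

```python
import json

# Sketch of an "LLM structured task": the model must return JSON with a fixed
# shape, and we validate (and retry once) before handing it to the backend.
# `call_llm` is an injected stand-in for the real chat API call.

PROMPT = (
    "Extract the companies and optional date range from the request below. "
    'Respond ONLY with JSON like {"symbols": ["AAPL"], "start": "2020-01-01", '
    '"end": null}.\n\nRequest: '
)

def run_task(request, call_llm, retries=1):
    prompt = PROMPT + request
    for _ in range(retries + 1):
        raw = call_llm(prompt)
        try:
            data = json.loads(raw)
            if isinstance(data.get("symbols"), list):
                return data  # shape the backend can work with
        except json.JSONDecodeError:
            pass
        # This is exactly where fragile prompts show up, hence testing prompt
        # versions against whole collections of inputs.
        prompt += f"\n\nYour previous output was invalid: {raw!r}. JSON only."
    raise ValueError("model never produced valid JSON")


if __name__ == "__main__":
    outputs = iter([
        "Sure! Here are the stocks...",  # a typical failure mode
        '{"symbols": ["AAPL", "MSFT"], "start": "2021-01-01", "end": null}',
    ])
    result = run_task("compare Apple and Microsoft since 2021",
                      lambda p: next(outputs))
    print(result["symbols"])  # -> ['AAPL', 'MSFT']
```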


I've made a couple of games, though I'm still having a hard time finding the soul of the game in the LLM and haven't released them: there's a historical roleplay game (that I plan to release soon), a storytelling game (the player tells stories to the LLM), a wander-a-world-aimlessly-and-chat game, and I never get further than 50% of the way through murder mystery games, though murder mysteries seem like an excellent structure.

I've built some abstract content development tools, generally focused on building larger content somewhat top-down (defining vibes, then details).

I'm working on a general project helper using GPT-Vision, voice, and regular GPT. You set up the camera above your workspace, work on paper, and chat with the LLM while you do it. I think there's a lot of potential, but the voice stuff is quite hard to deal with... there's just a ton of stuff happening in parallel, and I find it very hard to code something reliable.

The stack I use is all in the browser, generally Next.js, Preact Signals, and my own code to call into GPT, Whisper, etc. I like having everything available for inspection, and I generally keep all the working bits visible somewhere. (This can be overwhelming when other people see it.)

But I haven't gotten over the deployment hump... the cost and complexity is a challenge. I've used Openrouter.ai recently in a project, and I think if I leaned on that more completely I'd find the release process easier.


https://www.askaway.bot/

AI concierge for my parents’ vacation rental. Mostly just pulling info from the guest binder, but I’ve also started using some local guides to give better suggestions. Built with NextJs and deployed on Vercel (was really easy and they have a generous free tier).


How well does it work, regarding accuracy?


I built a summarizer for drilling reports. Anytime you drill boreholes, whether it's on a drilling platform in the ocean or in the middle of the desert or wherever, there's a geologist watching what comes out and writing notes about it. They likely do this multiple times, both in the field and in a laboratory setting. These notes are paired with logging software, which also asks the geologist more quantitative questions (e.g., on a scale of 1 to 5, how many fractures are there). Typically these are written for at least every meter of extracted core/rock/etc., and typically you are drilling hundreds or thousands of meters, or more. So you end up with a highly unstructured dataset that occasionally someone glances through to find tidbits. Using ChatGPT we converted this data into keywords that could then be used to look at depth dependencies of various geological or petrological features of the region.
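The core of that kind of pipeline is small; here's a toy sketch of turning per-meter notes into keyword rows that can be plotted against depth. `call_llm` stands in for the ChatGPT call, and the prompt and field names are made up for illustration:

```python
import json

# Sketch: per-meter geologist notes -> (depth, keywords) rows via an LLM.
# `call_llm` is an injected stand-in for the real API call.

def keywords_by_depth(notes, call_llm):
    """notes: list of (depth_m, free_text) -> list of (depth_m, [keywords])."""
    rows = []
    for depth, text in notes:
        raw = call_llm(
            "Return a JSON list of geological keywords (lithology, fractures, "
            f"alteration) mentioned in this core description:\n{text}"
        )
        rows.append((depth, json.loads(raw)))
    return rows


if __name__ == "__main__":
    # Toy stand-in model: just matches a tiny vocabulary against the prompt.
    def fake_llm(prompt):
        vocab = ["basalt", "fractured", "clay"]
        return json.dumps([w for w in vocab if w in prompt])

    notes = [(10, "highly fractured basalt"), (11, "clay-rich gouge zone")]
    print(keywords_by_depth(notes, fake_llm))
    # -> [(10, ['basalt', 'fractured']), (11, ['clay'])]
```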


https://www.mealbymeal.com

It's macro + calorie tracking over text message. You just text what you eat and it matches against a food database to estimate your food intake. It's basically an easier alternative to MyFitnessPal.

My stack is OpenAI on Azure, Vercel, Convoy, FatSecret API, Postmark, NextJS.


I was tired of the need to scroll through dozens of blogs and RSS feeds to learn about technologies and industry news, so I’ve built a service that helps you learn and stay updated about any topic by sending a single fully personalized weekly email digest, making relevant information come to you, instead of you chasing it (push vs pull):

https://peekly.ai

It’s basically an LLM-based RAG that works over the best blogs and websites covering any topic you provide during onboarding.


I'm unable to submit my email/interests.

This is in firefox with and without UBlock Origin.

Errors: https://i.imgur.com/N28wnVY.png


Ironically, when trying to view your picture of errors, I myself get an error on Imgur itself:

{"data":{"error":"Imgur is temporarily over capacity. Please try again later."},"success":false,"status":403}


Wow, thanks for letting me know! I've just fixed the problem thanks to your bug report.


I read the paper "Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling" that was published last week and started building a tool for people to collectively generate synthetic training data.

The tool still needs a trust mechanism and a coherent incremental publishing strategy to be able to operate in a public fashion. Right now, running one node using my RTX 3060 it would take 1.2 years to do one split of the C4 dataset.

https://arxiv.org/abs/2401.16380

https://www.emergentmind.com/papers/2401.16380

https://github.com/gardner/gsd


I got fed up sending cover letters so I made a tool that writes them for me. It scrapes the company website and summarizes it to get relevant background info, and takes my resume + arbitrary info I provide as input, plus the job posting (it can also work without one for unsolicited applications). I then fine-tuned a GPT-3 model on actual cover letters I had written to make it sound like me, and voilà! It actually landed me a job.


Summarisation for calls, emails. Lots of extraction tasks & closed domain chatbots.

Deployment is usually FastAPI for business logic, LangChain or Microsoft's Guidance library, with the LLM hosted via an HF TGI server.


Lots of small stuff like bots and scripts to automatically rename files that I use locally every single day

Then things like:

“Fix My Japanese” - uses LLM to correct Japanese grammar (built with Elixir LiveView): https://fixmyjapanese.com

It has different “Senseis” that are effectively different LLMs, each with slightly different style. One is Claude, one is ChatGPT.

Or a slack bot that summarizes long threads:

https://github.com/dvcrn/slack-thread-summarizer


Love this, and love that you're using Elixir LiveView as well as Elixir with the Slack bot and lastly that you're based in Tokyo.

Followed you on Github, I'm looking at moving (back) to Japan in two years, likely to start a bootstrapped startup business and always good to have dev friends!


Thanks :)


I use a local LLM too for my Japanese grammar and sentence improvements, to make them sound more native.

Sadly, every time I tried the Fix My Japanese page it just said the text I inputted wasn't Japanese. Maybe next time it'll work for me.


> Lots of small stuff like bots and scripts to automatically rename files that I use locally every single day

What kinds of prompts are using for the file renaming scripts?


I will check this out. My native Japanese wife rolls her eyes at any AI Japanese bot. "We would never say it that way."


I just built a tool that uses Whisper.cpp compiled to WASM in conjunction with SQLite WASM for a fully client-side book writing tool.

Basically, I want to write a book without having to type out the whole thing. I got the dictation idea from an episode of Columbo.

It is very much a work in progress and a proof of concept for another writing tool I want to make.

https://orderly.cmgriffing.com/

https://github.com/cmgriffing/orderly


This is awesome! What is the performance like, particularly around the WASM-compiled Whisper?


Thanks! It really depends on your machine.

My mac mini is heavily constrained due to only having 2 proper cores and not much RAM. So the smallest models run best. The quantized tiny runs better than the regular tiny simply due to memory pressure.

So, in my testing on my mac mini it tends to take about 30% more time to process than the audio clip that was recorded. I added a specific warning that lets the user know to reduce their thread count if the processing time takes longer than a specific threshold.

Some of my stream viewers report it being pretty fast though. (much faster than what they see on my mac mini).


LLMs have been game-changing productivity-wise for me.

But I found that LLMs are often wrong and hallucinate, so I have to double-check with Google or other resources.

So I built a Google and ChatGPT alternative that can answer any question and make hallucinations more obvious. I do this by using multiple LLMs, including search-enabled ones: GPT-4, Gemini, Claude, Perplexity, Mistral, and Llama.
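The disagreement idea can be sketched in a few lines: ask several models the same question and flag low agreement as a hallucination warning. The similarity metric below (word-set Jaccard) is deliberately crude and purely illustrative; a real system would do something smarter:

```python
# Sketch: query multiple models and flag answers that barely agree.
# The model callables here are stand-ins, not real API clients.

def jaccard(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 1.0

def cross_check(question, models, threshold=0.5):
    """Return (answers, agree); agree=False suggests double-checking."""
    answers = {name: ask(question) for name, ask in models.items()}
    names = list(answers)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    score = sum(jaccard(answers[a], answers[b]) for a, b in pairs) / len(pairs)
    return answers, score >= threshold


if __name__ == "__main__":
    models = {
        "model_a": lambda q: "Paris is the capital of France",
        "model_b": lambda q: "The capital of France is Paris",
        "model_c": lambda q: "It is Lyon",
    }
    answers, agree = cross_check("Capital of France?", models)
    print(agree)  # -> False: one model disagrees strongly, so this flags
```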

It's been growing healthily https://labophase.com


A search engine that saves me time by detecting SEO spam, downranking results containing ads, and summarizing away click-bait descriptions.

I made it available to the public aisearch.vip


I like how a few comments in this thread are the cause of the problem you're fighting.


I'm building a way to automate creation of software video lessons and courses, putting it all under the name 'CodeVideo'. One tool leverages OpenAI's whisper, as well as GPT3.5 or GPT4 for help with generating the steps that ultimately produce the video (this part is not yet in the repo; everything is a work in progress). The tool is here:

https://github.com/codevideo/codevideo-ai

My goal is definitely NOT to generate the course content itself, but just to take the effort out of recording and editing these courses: you provide (or get help generating) the narration and the code to write, and the video is deterministically generated. The eventual vision is to convert book- or article-style text into the steps that generate the video, as close to one-shot as possible.

I also leverage ElevenLabs' voice cloning (technically not an LLM, but an impressive ML model nonetheless).

For anyone more curious, I'm wondering if what I'm trying to do is in general a closed problem - to be able to generate step by step instructions to write functional code (including modifications, refactoring, or whatever you might do in an actual software course) or if this truly is something that can't be automated... any resources on the characteristics of coding itself would be awesome! What I'm trying to say is, at the end of the day code in an editor is a state machine - certain characters in a certain order produce certain results. Would love if anyone had more information about the meta of programming itself - abstract syntax trees and work there comes to mind, but I'm not even sure of the question I'm asking yet or trying to clarify at this point.
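For what it's worth, the state-machine framing can be made concrete: a lesson is just a list of editor actions, and replaying them deterministically yields the frame for each step. A toy sketch (the action format here is invented for illustration, not CodeVideo's):

```python
# Sketch: the editor buffer as a state machine. Each action maps one buffer
# state to the next; replaying the action list gives every video frame.

def apply(state, action):
    kind = action[0]
    if kind == "insert":          # ("insert", position, text)
        _, pos, text = action
        return state[:pos] + text + state[pos:]
    if kind == "delete":          # ("delete", start, end)
        _, start, end = action
        return state[:start] + state[end:]
    raise ValueError(f"unknown action {kind!r}")

def replay(actions, state=""):
    """Return every intermediate state, starting from the empty buffer."""
    frames = [state]
    for a in actions:
        state = apply(state, a)
        frames.append(state)
    return frames


if __name__ == "__main__":
    # Type a function, then "edit" the return value from 1 to 2.
    steps = [("insert", 0, "def f():\n    return 1\n"),
             ("delete", 20, 21),
             ("insert", 20, "2")]
    print(replay(steps)[-1])
```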


This is super interesting. From what I gathered, generating a video autonomously given the content or instructions is sort of difficult. Curious to hear if you (or others) have any leads on how you can build this? (I'm assuming you intend to generate an instructional course video given the content) If this works well, could be used for teaching students via auto-generated video.


We built https://gptforwork.com, a set of add-ons for Excel, Word, Google Sheets, and Docs. It brings custom GPT functions to Excel and Sheets so you can prompt directly from cells, a chat in Word to interact with documents, and a simple prompt box in Docs. We offer OpenAI and Azure providers (as well as Anthropic on Sheets).


5M installations, wow!


I'm building a spaced-repetition flashcards language learning app, that generates sentences and explanations for a given word.

Unfortunately only for German, but I plan on expanding the languages soon.

https://vokabeln.io

Tech stack:

- App: Flutter
- Backend: Node.js (TypeScript)
- GPT-4 for generating sentences and explanations
- GCP text-to-speech for audio


Nice! I'm building something similar for French


Built this little tool to summarize Hacker News articles using HuggingFace. https://gophersignal.com

It doesn't do a ton, but it's kinda cool. Feel free to fix/add anything https://github.com/k-zehnder/gophersignal


I have built a webapp for translating srt files: https://www.subsgpt.com

GPT-4 excels as a translator, but it often encounters issues with content warnings and formatting errors when translating entire subtitle files via ChatGPT. The solution is straightforward: divide the subtitle file into sections, focusing solely on translating the text and disregarding the timestamps. While it's feasible to have ChatGPT maintain the correct format, I've observed a decline in translation quality when attempting this in a single pass. My preferred approach is a two-phase method: first, translate the text, and then, if necessary, request ChatGPT to adjust the formatting.

The webapp splits the srt file into batches of 20 phrases and translates each batch. It also allows for manual correction of the final translation.

Ah and it's also serverless: you input your OpenAI token & select the model of your choice and the webapp makes the requests to OpenAI directly.
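The batching idea is simple enough to sketch: split the srt into entries, translate the text in batches while leaving timestamps untouched, then reassemble. `translate_batch` stands in for the GPT call, and this parser is a toy (real srt files have more edge cases):

```python
import re

# Sketch: translate only the text of an .srt, preserving indices/timestamps.
# `translate_batch` is an injected stand-in for the GPT call.

def parse_srt(srt):
    """Split an srt string into (index, timestamp, text) entries."""
    blocks = re.split(r"\n\s*\n", srt.strip())
    entries = []
    for b in blocks:
        idx, ts, *text = b.splitlines()
        entries.append((idx, ts, "\n".join(text)))
    return entries

def translate_srt(srt, translate_batch, batch_size=20):
    entries = parse_srt(srt)
    texts = [t for _, _, t in entries]
    out = []
    for i in range(0, len(texts), batch_size):
        out.extend(translate_batch(texts[i:i + batch_size]))
    # Reassemble with the original indices and timestamps.
    return "\n\n".join(f"{idx}\n{ts}\n{txt}"
                       for (idx, ts, _), txt in zip(entries, out))


if __name__ == "__main__":
    srt = ("1\n00:00:01,000 --> 00:00:02,000\nhello\n\n"
           "2\n00:00:03,000 --> 00:00:04,000\nworld")
    # Uppercasing stands in for translation.
    print(translate_srt(srt, lambda batch: [t.upper() for t in batch]))
```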


This is great, how well does it do with informal/slang Portuguese, Russian or Spanish?


GPT-4 is incredible! We watched https://en.wikipedia.org/wiki/The_Boy%27s_Word:_Blood_on_the... which is full of slang, and the translation was perfectly comprehensible. I love how it translates informal speech as informal.


Some little projects I've been playing around with:

- https://github.com/iloveitaly/sql-ai-prompt-generator generate a ChatGPT prompt with example data for a sqlite or postgres DB

- https://github.com/iloveitaly/conventional-notes-summarizati... summarize notes (originally for summarizing raw user interview notes)

- https://mikebian.co/using-chatgpt-to-convert-labcorp-pdfs-in... convert labcorp documents into a google sheet

- https://github.com/iloveitaly/openbook scrape VC websites with AI


I've recently built https://SharpAPI.com/ on top of LLMs.

SharpAPI is an AI-powered Swiss Army Knife API for every software developer.

Talking to Large Language Models directly as a programmer can get unpredictable and error-prone. It's hard to fully automate communication with them.

AI sometimes hallucinates, GPT endpoints sometimes simply time out, and it's hard to force GPT to give you the same response format or to replay a failed GPT job.

All of the issues mentioned above are the main reason I'm building SharpAPI.

What can it do for developers and their apps?

With an easy-to-use RESTful format, extensive documentation, and support for 80 languages and SDKs, it lets you quickly add the usual AI capabilities for common data-processing scenarios, e.g.:

- product categorization for a well-organized product catalog
- generating engaging product descriptions and tagging
- filtering out spam content
- understanding and analyzing sentiment in product reviews, comments, and user feedback for data-driven decision-making
- generating complex job descriptions and extracting information from resume files for easy processing
- and much, much more

Some ideas for use cases I've added in this article: https://dev.to/makowskid/when-laravel-e-commerce-app-meets-a...


"Widjosumarajzer" = video summarizer

It's just a hodgepodge of prototype scripts, but one that I've actually used on a few occasions already. Most of the work is manual, but it seems it could easily be run as "fire and forget", with maybe some way to make corrections afterwards.

First, I'm using pyannote for speech recognition: it converts audio to text while being able to discern speakers: SPEAKER_01, _02, etc. The diarization provides nice timestamps, with resolution down to parts of words, which I later use in the minimal UI to quickly skip around when a text is selected.

Next, I'm running an LLM prompt to identify speakers; so if SPEAKER_02 said "Hey Greg" to SPEAKER_05, it will identify SPEAKER_05 = Greg. I think it was my first time using Mistral 7B, and I went "wow" out loud once it got it right.
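That speaker-naming step is basically this (a toy sketch; `call_llm` stands in for the Mistral call, and the prompt wording is mine, not the real one):

```python
import json

# Sketch: ask an LLM to infer real names for diarized speaker labels, then
# relabel the segments. Unresolved speakers stay as-is for the manual pass.

def relabel(segments, call_llm):
    transcript = "\n".join(f"{spk}: {text}" for spk, text in segments)
    mapping = json.loads(call_llm(
        "From this transcript, infer real names for as many SPEAKER_xx "
        f"labels as possible. Reply with a JSON object.\n\n{transcript}"
    ))
    return [(mapping.get(spk, spk), text) for spk, text in segments]


if __name__ == "__main__":
    segs = [("SPEAKER_02", "Hey Greg, can you share the doc?"),
            ("SPEAKER_05", "Sure, one sec.")]
    # Stand-in model: only SPEAKER_05 is identifiable from context.
    print(relabel(segs, lambda p: '{"SPEAKER_05": "Greg"}'))
    # SPEAKER_05 becomes Greg; SPEAKER_02 stays for manual naming.
```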

After that, I manually fill in the holes in speaker names and move on to grouping chunks of text in order to summarize them. That doesn't seem interesting at a glance, but removing the filler words, of which there are a ton in any presentation or meeting, is a huge help. I do it chunk by chunk. Here I lean on the best LLM available and often pick the Dolphin finetune of Mixtral.

Last, I summarize those summarizations and slap that on the front of the google doc.

I also insert some relevant screenshots in between chunks (might go with some ffmpeg automatic scene change detection in the future).

aaand that's it. A doc that is easily searchable. Previously, I had a bunch of 30- to 90-minute meeting recordings, and any attempt at searching required a linear scan of the files. Now, with a lot of additional prompt massaging, I was able to:

- create meeting notes, with especially worthwhile "what did I promise to send later" points

- this is huge: TALK with the transcript. I paste the whole transcript into Mistral 7B with 32k context and simply ask questions and follow-ups. No more watching or skimming an hour-long video; just ask the transcript whether there was another round of layoffs or whether the parking-space rules changed.

- draw a Mermaid sequence diagram of a request flowing across services. It wasn't perfect, but it got me super excited about future possibilities for creating or updating service documentation based on ad-hoc meetings.

I guess everybody is actually trying to build the same thing; it seems like a no-brainer given current tools' capabilities.


Very interested in this. I have been contemplating building something similar, but am unaware of any existing services that do this. I haven't played with pyannote; how does it compare to Whisper? I also thought it might be useful to OCR screenshots and use the text to inform the summarization and transcription, especially for things like code snippets and domain-specific terms.


I remember Whisper v3 large blowing my mind: it was able to properly transcribe a two-language monstrosity ("przescreenować", which is the English "to screen a candidate" conjugated according to standard Polish rules). Once I saw that, I thought: "it's finally time: truly good transcription has finally arrived".

So I view Whisper as SOTA, with excellent accuracy.

Now, for the type of transcription I need, discerning speakers is much more valuable than word-for-word accuracy: it will all be summarized anyway, and that tends to gloss over some of the errors.

That said, pyannote has also caught me off guard: it correctly annotated a lazily spoken "DP8" in a non-native speaker's accent.

It looks really good


Is pyannote the best diarization library you found? What's SOTA? I've been using a SaaS product (Gladia) and I'm getting close to my 10-hour mark.


It was the first one I tried, and good enough for me not to look further.


At https://openadapt.ai/ we are using LLMs to automate repetitive tasks in GUI interfaces. Think robotic process automation, but via learning from demonstration rather than no-code scripting.

The stack is mostly python running locally, and calling the OpenAI API (although we have plans to support offline models).

For better visual understanding, we use a custom fork of Set-of-Mark prompting (https://github.com/microsoft/SoM) deployed to EC2 (see https://github.com/OpenAdaptAI/SoM/pull/3).


We're building a GPT for managing your finances.

https://candle.fi/gpt

Our backend stack:

- AWS
- SST
- TypeScript

Our clients:

- Next.js (web)
- Vanilla React Native (mobile)

OpenAI's App Store announcement is what got us interested in building w/ LLMs.


Why not show names and faces of the founders? Explain the backstory. Using your service requires users put absolutely enormous trust in you. But there is currently nothing on the site to engender that trust. I would work on that as a priority.


I appreciate the feedback; I'm working on adding an about page!

We have an /updates page & blog, as well as links to GitHub. I figured finding out more about me and my co-founder was pretty easy.


Even more ominous that it's free


link seems broken to me.


We've been deploying changes all day, so that could be related; thanks for the report. It should work now.


I'm building a weight-loss app that leverages an LLM to do 2 things:

1. Analyze calories/macronutrients from a text description or photo

2. Provide onboarding/feedback/conversations like you'd get from a nutritionist

https://www.fatgpt.ai/

My stack is Ruby on Rails, PostgreSQL, OpenAI APIs. I chose Rails because I'm very fast in it, but I've found the combination of Rails+Sidekiq+ActionCable is really nice for building conversational experiences on the web. If I stick with this, I'll probably need a native iOS app though.

Vendor stack is: GitHub, Heroku (compute), Neon (DB), Loops.so (email), PostHog (analytics), Honeybadger (errors), and Linear.


> 1. Analyze calories/macronutrients from a text description or photo

Step 1: Is it a hot dog or not hot dog? https://www.youtube.com/watch?v=ACmydtFDTGs

I'm glad someone is keeping the dream alive!


Jokes aside, GPT-4 Vision is surprisingly good at noticing facts from food images. For example:

- In my chipotle bowl, it can tell if I had brown rice vs white rice

- In my In-n-out, it can tell if I got it protein style

It struggles with accurate weights/volumes but I'm excited about where this is going.


fatGPT... the LLM that helps you be more model, less large.


our large language model is large so you don't have to be.


Transformers that transform your body


I was holding a free screening of a short film I made, and as an alternative to Eventbrite and the like, I built a simple SMS-based ticket reservation system that used GPT-4 to read and respond to messages. People interested in attending would text a number, and their messages were routed by Twilio to my Node.js app, which in turn sent them to GPT to generate a response. The LLM was instructed to provide a structured JSON of each reservation once the person gave their name and the number of seats they wanted. Worked very smoothly and only took an afternoon to build. It would've been infinitely more tedious if I'd had to parse the messages with my own code.
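The trick is that the LLM chats freely but embeds a JSON object once the reservation is complete, so the app just scans each reply for it. A sketch of that scanning step, in Python rather than the author's Node.js (the field names are illustrative):

```python
import json
import re

# Sketch: detect and extract a reservation JSON embedded in a chatty LLM reply.

def extract_reservation(reply):
    """Return the reservation dict if the model emitted one, else None."""
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if not match:
        return None
    try:
        data = json.loads(match.group())
    except json.JSONDecodeError:
        return None
    # Only accept it once the required fields are present.
    if {"name", "seats"} <= data.keys():
        return data
    return None


if __name__ == "__main__":
    print(extract_reservation('You are all set, Ana! '
                              '{"name": "Ana", "seats": 2}'))
    # -> {'name': 'Ana', 'seats': 2}
    print(extract_reservation("How many seats would you like?"))  # -> None
```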


I have two main projects that are public ATM with LLMs.

The more notable one was experimenting with LLMs as high level task planners for robots (https://hlfshell.ai/posts/llm-task-planner/).

The other is a golang based AI assistant, like everyone else is building. Worked over text, had some neat memory features. This was more of a "first pass" learning about LLM applications. (https://github.com/hlfshell/coppermind).

I plan to revisit LLMs as context enriched planners for robot task planning soon.


I made some LLM-powered text-adventure games: https://cosmictrip.space/gameannouncement

And I'm working on a webapp that is a kanban board where an LLM and a human collaborate to build features in code. I just got a cool thing working there: like everyone has found, having an LLM generate new code is easy, but modifying code is hard. So my attempt at modifying code with an LLM starts with HTML: I have GPT-4 write BeautifulSoup code that then makes the desired modification to the HTML file. Will do the same with JS, and with Python via the ast module, etc. No link for this one yet :) still in development.


I didn't make text-adventures with LLMs. I try to solve them [0].

So far, none of the 7 tested models was able to win even one of the easiest text adventures. I tried many prompting techniques, but only GPT-4 was able to play through the first half of the game.

[0] https://github.com/s-macke/AdventureAI


Fun, I tried to do this back with GPT-3: https://llm.ianbicking.org/interactive-fiction/

But Zork wouldn't be a very accurate measure of skill because GPT definitely knows Zork. Unfortunately the emulator (https://github.com/DLehenbauer/jszm) doesn't work with most games newer than Zork. I haven't revisited the code with newer GPT models either.


GPT-3 doesn't even manage the first few steps of the tested text adventure. And GPT-4 is not good at playing these adventures either.

However, my code runs a newer version of the Z-machine, so Zork and many other text adventures will work. I haven't tried many other games, though.


I was surprised how high your costs were. I assume you are putting the entire transcript into each prompt, but even then that seems high. Is GPT's planning also taking up a lot of room?

I did find giving GPT some hints about the known commands helped a lot, and I put in some detection of error messages and kept a running log of commands that wouldn't work. Getting it to navigate the parser is kind of half of the skill of playing one of these games. It would be interesting to have it play some, then step back and have it reflect and enumerate things about how the play itself works.


The costs have dropped significantly in the months since I created the cost image. Now I use GPT-4 Turbo. This GPT-4 model understands how text adventures work, and there is no need to give it known commands.

Of course, you tried even more sophisticated techniques than mine. I tried the ReAct pattern and virtual discussions. So far, it always stumbles at the same place, on a critical understanding of the text. And I tried exactly this critical step dozens of times.

You will understand the issue yourself once you play the game. It just takes 20 minutes and is very easy:

https://adamcadre.ac/if/905.html


You mean at the very end of the game? The game seems like it's only designed to trick you into that very ending :) Are you hoping it will figure out the game based on the context clues? I'm not sure I can find them myself...

A long time ago I did some exercises in "classical planning algorithms", which all feel very like the early part of this game. I.e., how do you get ready to leave if you have to shower, and can't do that with clothes on, etc. A similar planning example involved changing a tire (opening the trunk, removing lug nuts, etc). It was surprisingly difficult to make an algorithm that could figure it out! You could search the state space given the transitions, but it exploded with what was effectively lots of dead ends; obvious to me as a human, but not to the algorithm. Which is to say that this is a harder problem than it might seem.


Yes, that is the first "bad" ending. After that, just follow the one relevant context clue and look under the bed. That might already be enough.

I chose this game because, at every step, it tells you what you have to do next. There's not much to try out; just the narrative changes. One time you have to go to work, and one time you have to flee.

Other text adventures are even more problematic. I watched GPT-4 try for dozens of steps in "The Hitchhiker's Guide to the Galaxy" adventure just to turn on the lights. And that's just the first command you have to get right in the game.


I built a diagram generator in PlantUML format: https://chatuml.com

Also, hello HN! If you are interested, use this promo code for 50% off your first purchase ;)

  HELLOHACKERNEWS


Project 1 — Source code: https://github.com/bingdai/summaryfeeds. The code is for Summary Feeds (https://www.summaryfeeds.com). It shows summaries of AI-related YouTube Channels.

****

Project 2 - I also built a YouTube summarizer for individual videos called Summary Cat (https://www.summarycat.com). It is not open source for now. The stack is very similar to project 1.

****

And yes, I like summarizing YouTube videos :)


I'm working on Invoker Network.

A Decentralised AI App store with cross border micro transactions.

You will be able to sell your LLM output (which could be multimodal) for dollars, or whatever you decide. (The LLMs run on your infra; you can keep the weights to yourself forever.)

https://dev.invoker.network/share/9/0 (Dev environment is ready).

https://dev.invoker.network/share/9/1


For my expense sharing app [1], I added receipt scanning [2] in a few minutes and a few lines of code by using GPT 4 with Vision. I am aware that LLMs often are a solution looking for a problem, but there are some situations where a bit of magic is just great :)

It is a Next.js application, calling OpenAI’s API using a plain API route.

[1] https://spliit.app

[2] https://spliit.app/blog/announcing-receipt-scanning-using-ai



I was working on this stuff before it was cool, so in the sense of the precursor to LLMs (and sometimes supporting LLMs still) I've built many things:

1. Games you can play with word2vec or related models (could be drop in replaced with sentence transformer). It's crazy that this is 5 years old now: https://github.com/Hellisotherpeople/Language-games

2. "Constrained Text Generation Studio" - A research project I wrote when I was trying to solve LLM's inability to follow syntactic, phonetic, or semantic constraints: https://github.com/Hellisotherpeople/Constrained-Text-Genera...

3. DebateKG - A bunch of "Semantic Knowledge Graphs" built on my pet debate evidence dataset (LLM backed embeddings indexes synchronized with a graphDB and a sqlDB via txtai). Can create compelling policy debate cases https://github.com/Hellisotherpeople/DebateKG

4. My failed attempt at a good extractive summarizer. My life work is dedicated to one day solving the problems I tried to fix with this project: https://github.com/Hellisotherpeople/CX_DB8


1) https://imaginanki.com - auto generating flashcards (Anki decks) for language learning with accompanying images and speech audio. Flutter web (JS) with backend on Cloudflare Pages Functions, connected to SDXL, Azure TTS and Claude.

2) https://amiki.app - practise speaking French, Spanish, German or Italian with a 3D partner. Flutter web with Whisper and my own rendering package.


I've been learning about RAG using LlamaIndex, and wrote a small CLI tool to ingest folders of my documents and run RAG queries through a gauntlet of models (CodeLlama 70B, Phind, Mixtral, Gemini, GPT-4, etc.) as a batch process, then consolidate the responses. It is mostly boilerplate, but comparing the available models is fun, and the RAG part kind of works.

https://github.com/StuartRiffle/ragtag-tiger


I know chat is lame and overdone but here's my open source local AI chat app for macOS :). I wanted something simple enough for the non-technical people in my life who were using ChatGPT. For better or worse, those people are mostly not using chat AI much anymore. Seems like the initial awe wore off.

https://github.com/psugihara/FreeChat

I'm also working on a little text adventure game that I hope to release soon.


I’ve always wanted a tool to help me track my online orders. However, it wasn’t practical to make integrations with every merchant. Even scraping the order emails was way too much work to do for an unproven product.

Now with LLMs it’s simple to extract structured data from emails.

I built [Orderling](https://orderl.ing) that is basically a CRM for your orders. It uses OpenAI api to extract the order information and automatically adds it.


We built a social media platform for chatbots... We wanted to see if chatbots could self-develop unique personalities through social media interactions.

The results were actually hilarious... but I wanted to share a bit about our process and see if anyone had any comments or insights.

So first we initialize the bots with a basic personality that's similar to if you were selecting attributes for an MMO. Things like intelligence, toxicity, charisma and the like. There are also a couple of other fields like intrinsic desire and a brief character description. These are fed to the model as a system prompt with each inference.

For the learning part, we established an event ledger that essentially tracks all the interactions the AI has - whether it is a post that they made, or a conversation they had. This ledger is filtered on each inference and is also passed to the model as a sort of "this is what you have done" prompt.

Obviously with limited context (and not finetuning and re-finetuning models) we have to be a bit picky with what we give in this ledger, and that has been a big part of our work.

Our next question is: how do you determine which events are the most important to the AI in determining how it behaves and acts? It's been interesting!
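A minimal sketch of the ledger-filtering step, assuming a simple recency-discounted importance score; the `Event` fields, decay constant, and character budget are all invented for illustration, not how anotherlife.ai actually does it:

```python
from dataclasses import dataclass

@dataclass
class Event:
    text: str          # e.g. "Posted about hiking"
    importance: float  # 0..1, perhaps from a cheap classifier
    age_hours: float

def render_ledger(events, budget_chars=200, decay=0.01):
    """Pick the highest-scoring events that fit a character budget,
    then render them as a 'this is what you have done' prompt block."""
    score = lambda e: e.importance / (1 + decay * e.age_hours)
    picked, used = [], 0
    for e in sorted(events, key=score, reverse=True):
        if used + len(e.text) <= budget_chars:
            picked.append(e.text)
            used += len(e.text)
    return "This is what you have done:\n" + "\n".join(f"- {t}" for t in picked)
```

The interesting design space is entirely inside the scoring function: recency decay, emotional salience, novelty, and so on.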

The platform is anotherlife.ai for those curious!


I generate and sell books that summarize historical events. I was actually ready to launch last month until I realized I could generate extremely realistic photographs in Midjourney and splice them between paragraphs using a simple python script, so I went back and did another pass.

My process involves generating chapters as markdown, using a script to join chapters together, and then finally converting the markdown to ebooks using Gitbook.
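The join-and-splice step can be sketched roughly like this (a guess at the shape of that "simple python script"; the `image_for` mapping from chapter index to a generated image path is a made-up detail):

```python
import pathlib

def join_chapters(chapter_dir, image_for):
    """Concatenate numbered markdown chapter files, splicing a generated
    image between the first two paragraphs of mapped chapters."""
    parts = []
    for i, path in enumerate(sorted(pathlib.Path(chapter_dir).glob("*.md"))):
        paragraphs = path.read_text().split("\n\n")
        img = image_for.get(i)
        if img and len(paragraphs) > 1:
            paragraphs.insert(1, f"![illustration]({img})")
        parts.append("\n\n".join(paragraphs))
    return "\n\n".join(parts)
```

The combined markdown can then be handed to Gitbook (or pandoc) for ebook conversion.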


You really need to tell people that the images are AI-generated. Anyone who remotely cares about history will feel very upset otherwise. Even using real images in the wrong context is a big no-no.

Honestly, for anything non-fiction, I would strongly advise against using fake images.


Hey, had a similar idea, would be great to chat - can reach me at my username on google’s mail.


I am currently building an automatic book generator for Rust source code, in which an LLM writes descriptions of the code of a whole Rust project. It will be a bot that connects to the website, generates descriptions, downloads them, and creates the book. It is very early in the project, 3 days in, but it's going well.

https://github.com/pramatias/documentdf


Nice idea, but a README is required. Also, it can be generated by GPT :)


It is generated in its entirety by GPT. Well, 98% is more like it. By the time it's ready, it will have a README. I will announce it on Reddit /r/rust if you are interested.

Something I want to test is how much documentation is needed for the machine to infer the rest of it. Something like: given one sentence of human documentation plus the code, how accurately can the LLM infer and describe the code? Does it need two sentences? Three? We'll see.


I built https://eternalsouls.ai/ for a client recently.

You just export and upload a WhatsApp conversation and it will learn the personality AND voice of your conversation partner. You can send/receive text or voice messages; It was pretty damn spooky to actually have a voice conversation back and forth with an AI standing in for my "friend"


I've seen this episode of Black Mirror.


Yeah, the pricing tiers make it all the more morbid and disconcerting. Scary future for sure.


Our CEO believes LLMs are a fad, so there's nothing really strategic about them in the company's roadmap, but I was able to assemble a skunkworks team of enthusiasts who integrated ChatGPT into one of our eLearning products. It allows a course author to improve writing, it makes suggestions about content, etc. Technologically, it's nothing special, just a bunch of pre-made prompts.

The reception was kind of lukewarm because we were too late with it (due to decision makers not caring much about it and delaying the release for no reason) - by the time we rolled it out, you could no longer impress anyone with it. Plus, there's almost no marketing about it. Currently, the main users of the integration are our own marketing and sales teams.

It was my first experience of this sort (assemble a team, introduce a new feature from scratch - I was just an ordinary engineer before) but the ending was kind of... anticlimactic.


So was your CEO correct? Was this a fad? It seems interest has certainly cooled, at least for your market segment, according to your experiment?


What I originally envisioned was that we would first dip our toes in the water by quickly adding a very simple, easy-to-implement feature (the "improve my writing" feature mentioned above). If we received positive feedback from customers and management, we would be able to secure a proper budget for adding more complex and cool features based on RAG/function calling. For example, a scenario where a course learner asks a question and the LLM looks for the answer in the company's knowledge base/course library and provides a tailored, precise response. I had many other cool, useful scenarios in mind.

In fact, when we showcased our prototype of the "improve my writing" feature that we quickly put together in one week in early 2023 to a few select clients, their feedback was very enthusiastic. However, it took several months to bring it into production due to several bureaucratic hurdles: clearance from the legal department, the product owner delaying the release because of other priorities, and so on.

Now that the first feature has received a lukewarm reception because of the delayed release, we have neither a dedicated team nor a budget for adding more LLM-based features. Implementing proper and useful RAG is more complex and requires a certain level of expertise (vector DB integration, chunking strategy/indexing, tricks like HyDE, reranking, etc.), compared to just using the command "hey ChatGPT, improve this: %s." It is now unlikely that we'll have anything cool anytime soon without the support of the CEO or product owner (and they probably believe I proved it was indeed a fad). Most teams are currently busy with ordinary features from our backlog that do not require LLMs, and no one cares anymore.
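At its simplest, the knowledge-base RAG scenario is just retrieval plus prompt assembly. A toy sketch, where bag-of-words cosine similarity stands in for a real embedding model and vector DB, and the prompt wording is invented:

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(query, docs, k=2):
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs):
    context = "\n---\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The expertise mentioned above (chunking strategy, HyDE, reranking) all lives in making `embed` and `retrieve` far better than this toy.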


I'm currently working on an interface for Google Calendar at https://calendarcompanion.io. My next feature is integrating the functionality with Telegram. It's hard to predict the value of these features in the moment, but I do think this could be an extremely interesting "iPhone" moment for technology. Just like how the iPhone reduced everything to a single button press, we can now squeeze the functionality of some pretty complicated apps into natural language through text - and as the response time of LLMs improves, it will become a short conversation for things that used to dazzle new users! Exciting times!

As for the stack, I have Supabase and TypeScript on the frontend, Python on the backend, and k3s as a cluster for my apps (can recommend this if you want to get devops-y on a budget). Next time, I'll just go pure TypeScript since Python really doesn't add much working this far away from the base models.


I'm interested in RAG, so I made a benchmarking & optimization tool for RAG systems that use LLMs. AutoRAG: https://github.com/Marker-Inc-Korea/AutoRAG

Since it is a Python library, we deploy it to PyPI. But for my own use, I run it on an H100 Linux server in a Torch Docker container with CUDA. Running it needs only vim and bash. Plus, for running local models I love vLLM. I made my own vLLM Dockerfile and use it to deploy a local model in 5 minutes.

FYI: renting a whole H100 instance is really expensive, but in my hometown the government provides us the instance for AI research.


We built Jumprun. You can use it to research and analyze data sources, and it'll produce beautiful canvases with tables, charts, videos, maps, etc. We're working on automations so you can setup natural language trigger conditions that execute actions.

We built it in Kotlin with Ktor server, htmx and tailwind. It uses a mixture of models, including gpt4-turbo, gpt4-vision and gemini-pro-vision. It's deployed using Kamal on bare metal.

Example canvas that provides a roundup of Apple Vision Pro reviews: https://jumprun.ai/share/canvas/01HNXB2K3GM7KPRP45Y2CVVJSC

Our learn more page with some screenshots to show creating a canvas: https://jumprun.ai/learn-more

It's a free closed beta at the moment to control costs, but let me know if you'd like an invite.


Cool! I like the "intelligent canvas" concept. Not exactly the same but my brother and I have been building a side project that also is all about making the most of a set of information using different views like maps, calendars, tables, etc. We have been looking into adding AI to make it easier to import data without having to manually tag all the data. https://visible.page


txtai (https://github.com/neuml/txtai), an embeddings database for semantic search, graph networks and RAG


I built out a few utilities as experiments. One app linked to Salesforce to query/analyze sales data. Another that reads our help documentation and gives instructions via chat.

The last app, the only one that was deployed anywhere, is https://catchingkillers.com This app is a simple murder mystery game where the witnesses and the killer are ChatGPT bots. The first two stories are complete and active, the third is not complete yet. The first story of the working two is taken from another murder mystery group game https://www.whodunitmysteries.com/sour.html. The second story was highly influenced by ChatGPT.

It's a bit rough because I didn't spend too much time on it, but if anyone does signup to play, I'd love to hear feedback.


Is the salesforce data in a structured (SQL-like) format? Or uploaded documents?


I am working on a RAG-based chatbot to answer queries based on the contents of my main website and blog, which are fintech related.

I would also like to make it generic in the future, so that it can crawl any website and store new content in vector databases. Responses to user queries can then be returned by combining vector search and the LLM.
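One concrete piece of that pipeline: before crawled pages go into a vector database, they need to be split into chunks. A minimal sliding-window chunker (the sizes are arbitrary, and real systems usually split on sentence or heading boundaries instead):

```python
def chunk_text(text, max_chars=500, overlap=100):
    """Split text into overlapping fixed-size windows, so that context
    straddling a boundary still appears whole in at least one chunk."""
    chunks, start = [], 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap
    return chunks
```

Each chunk is then embedded and stored alongside its source URL for retrieval.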


AI Assisted Open Source Communication App for Autism - https://github.com/RonanOD/OpenAAC

It's a flutter app (in beta on Google play store currently) that uses OpenAI embeddings with Postgres pg_vector DB hosted in Supabase. Any poor matches go to Dalle3 for generation.

Our charity (I am vice-chair on the board) is hoping to use it as part of our program: https://learningo.org/app/


A BERT-based summarization system for financial earnings calls. It can take a 60-minute transcript of such a meeting and compress the contents down into 5 bullet points.

https://link.springer.com/chapter/10.1007/978-3-031-28238-6_...

Financial earnings calls are important events in investment management: CEOs and CFOs present the results of the recent quarter, and a few invited analysts ask them questions at the end in a Q&A block.

Because this is very different prose from news, traditional summarization methods fail. So we pre-trained a transformer from scratch with a ton of high-quality (REUTERS only) finance news and then fine-tuned with a large (100k sentences) self-curated corpus of expert-created summaries.

We also implemented a range of other systems for comparison.


Oh funny, I've been working on a similar project, analyzing earnings call transcripts using LLMs. My first attempt was with BERTopic; the results were awful. My second attempt was with a finetuned 7B version of Mistral with heavy prompt engineering - the results were actually super good in my opinion... plus it runs on a single 3090.


We've built https://agentgold.ai/chat, which is an interface to chat with youtube creators about their content.

It looks through past transcripts, topics, view counts, and other metadata so users can quickly learn what a Youtuber is all about.


As I was building LLM projects, I found I was re-implementing a new vector database for each one. So I built RagTag (https://ragtag.weaveapi.com), a vectordb/RAG as a service to make the process faster. This provides a CRUD interface to push and retrieve documents, which are automatically chunked and converted to embeddings.

AgentX (https://theagentx.com), an LLM chat support app is one of the projects I built on this framework. It is a self-updating customer support agent that is trained on your support docs. Not only does this answer your customer questions, it provides summaries of the queries so you get a sense of where your product and/or documentation is deficient.


A text to slide based online course video with images workflow.

I’m working for an edTech company. Some students prefer video. So I built a Django app that takes a block of text and formats it into a set of slides, each with a title, some bullet points, a DALL-E 3 generated image, and a voiceover.

It then compiles that all into a video.


An Extensible Conversational UI for Interactive Components[1][2]; the current use case is a Personal Productivity Assistant for structured data.

The stack is simple: Preact on the frontend with a custom framework on top, and Bun on the backend calling OpenAI. I may port it to Rust in the future.

I plan to try local LLMs when I have some free time.

For now each user runs the application locally with their own keys[3].

[1] https://www.youtube.com/watch?v=nS1wsif3y94

[2] https://www.youtube.com/watch?v=f-txlMDLfng

[3] Alpha software, check the readme: https://gloodata.com/download/


request - i want an LLM tool that can process raw text or email and update or create salesforce records.

Example 1: i get an email from a potential customer that says they want [product A]. I can forward that email (or call notes) to salesforce (or somewhere) and it will understand the preference and the relevant customer and update that customer's profile.

Example 2: In a B2B context, lets say my customer is a company, and there is a news article about them. I could forward a link to the article to the LLM and it would understand that the article is about a customer, and append that article and key info about it to my saleforce record for that customer. The news item becomes an object that is linked to that customer (for call context, better sales targeting, profiling, etc).

Can someone help me build that?
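For what it's worth, the extraction half of example 1 is a fairly standard pattern now. A hedged sketch - the field names and prompt wording are placeholders, and `parse_extraction` guards against models wrapping the JSON in prose or code fences:

```python
import json

EXTRACTION_PROMPT = (
    "Extract these fields from the email below and reply with JSON only:\n"
    '{"customer_name": string|null, "company": string|null, '
    '"product_interest": string|null}\n\nEmail:\n'
)

def parse_extraction(llm_output):
    """Pull the first JSON object out of a model reply, tolerating extra prose."""
    start, end = llm_output.find("{"), llm_output.rfind("}")
    if start == -1 or end == -1:
        return None
    try:
        return json.loads(llm_output[start:end + 1])
    except json.JSONDecodeError:
        return None
```

The resulting dict could then be matched against existing contacts and pushed to the record via the Salesforce REST API; that matching step is where most of the real work would be.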


I'm working on something like that. Feel free to email me (address in my bio).


I've been using a combo of LLMs + live transcription to build a passive assistant that keeps track of talking points and can pull out data/tasks from a conversation you're having (https://sightglass.ai or here's a demo of me using it: https://www.loom.com/share/0220ca03bce341669d314d4254872226)

So far this is being used for:

- Sales -> guiding new recruits during more complex client calls

- HR -> Capturing responses during screening interviews

If you'd like to try this out feel free to DM me or email me at andrew at sightglass.ai, we're looking for more testers!


I am building a no code solution. Use case is simple: Write complete programs/software for the browser from natural language input.

https://domsy.io

Currently running on my little digital ocean droplet. Stack is javascript/python.


I like it! One of the examples is the most straightforward QR code generator I have ever seen, too.

https://domsy.io/share/ddf54149-5de9-4f3a-b936-f007a451c0b5


We built https://aichat.realtimex.co, a customer support AI working alongside human agents. It's a RAG system with embeddings built from crawling pages of the website and user-uploaded documents, including dynamic databases (such as products and pricing). The key difference from other LLM customer support products is the collaboration between the AI agents and human agents. We are inspired by the collaboration between aircraft pilots and autopilots. In this case, the AI and human agents silently collaborate to bring the best support to customers.


I've built an open-source ChatGPT UI designed for team collaboration.

Github Link: https://github.com/joiahq/joia

Benefits vs the original:

- Easy to invite entire teams and centralize billing
- Talks to any Large Language Model (eg: Llama 2, Mixtral, Gemini)
- Collaborative workspace to easily share GPTs within the team, similar to how Notion pages work
- Savings of 50%-70% vs ChatGPT's monthly subscription

Tech stack: NextJS, Trpc and Postgres. All wonderful technologies that have helped me develop at the speed of thought.


I am working on building out a better voice interface for LLMs.

It is still a work in progress (early beta), but you can check it out at https://www.bonamiko.com

Currently I have mainly been using it as a tandem conversation partner for a language I'm learning, but it can be used for many more things. As it is right now, you can use it to bounce ideas off, practice interviews, and help answer quick general questions. You just need to tell it what you want.

The stack is a Next.js application hosted on Vercel using Supabase for the backend. (There is also some plumbing in AWS for email and DNS.) It is automatically deployed via GitHub actions.


Very cool, just signed up. What advantages does this have over the one built into the ChatGPT app? Also, it would be great if I could see the text output in addition to the voice.


The main differences fundamentally come down to OpenAI treating it more like a party trick demo, rather than a core functionality. I think it has a lot of potential if I can just fine tune a couple rough edges. (When you chat with someone in person, you don't pull out notebooks and write messages to each other. I see writing as a fallback medium.)

To answer your question more specifically,

Pro Bonamiko:

  - Faster average first response latency (but higher first audio latency since OpenAI uses a ding). This is the main focus currently, reducing latency as much as I can. I'd like to be able to avoid the ding, but we'll see how low I can get it.
  - Can be used anywhere with a browser, OpenAI requires a mobile app installed. (I.E. Desktop support)
  - In the future we can support deeper customization since we are focused on the audio medium. As soon as you have to run a function in the ChatGPT app there is a long response latency, which could easily be fixed by something as simple as the AI saying "Let me perform a search to get the details"
Pro ChatGPT:

  - Nice animation
  - Already has built in tool support such as web search
  - Supports language switching automatically between messages, Bonamiko requires manually changing the language


I built a RAG implementation over 35k books/articles/wiki pages/web pages I collected over the years (generating the embeddings took about 6 weeks on a 3070 Ti at 100% constant usage). I query it with various steps of data extraction/narrative building/refining etc. over LLMs. Almost daily I figure out new steps to add to the pipeline, and honestly, I could not imagine learning about niche topic X from so many perspectives/periods in such a short time (including from the original sources). I did not yet figure out how to package this, but I spend at least 2h of my free time daily with it. Ideas and feedback are welcome.


I’m also building a RAG app and I’m finding so many different ways to do it.

I’m curious: was there one method that improved the accuracy/relevance of the answers the most?

Also, are you using Langchain, Llamaindex, or something else?


I used LangChain before, but for this implementation I used LlamaIndex. bsturza@duck.com


I am doing something very similar I would love to trade notes.


bsturza@duck.com


can you share more about this? when you say "it took about 6 weeks on 3070ti 100% constant usage" is that 6 weeks generating embeddings?


yes, generating embeddings. i can share more: bsturza@duck.com


A friend and I built Mysterian. It allows you to draft AI replies, summarize emails, and chat with your inbox.

We used Plasmo to build the chrome extension, React for the frontend, and currently OpenAI as the LLM provider.

Currently it only works with Gmail but we plan on adding other email providers as well.

Feel free to check it out: https://chromewebstore.google.com/detail/mysterian-ai-for-gm...


I started working on a Rust based AI agent host with the goal of running locally. It has Rhai scripting built in which is what the agent function calling is based on. Very rough at the moment. Also on hold for me because I need to do more dirt cheap Upwork projects to scrape by this month.

I think what will be really powerful is to have a registry for plugins and agents that can be easily installed in the system. Sort of like WordPress in that way. Also similar to an open source GPT store.

https://github.com/runvnc/agenthost

I believe there are several variations of this type of idea out there.


https://github.com/russellballestrini/flask-socketio-llm-com...

This project is a chatroom application that allows users to join different chat rooms, send messages, and interact with multiple language models in real-time. The backend is built with Flask and Flask-SocketIO for real-time web communication, while the frontend uses HTML, CSS, and JavaScript to provide an interactive user interface.

demo here supports communication with `vllm/openchat`:

* http://home.foxhop.net:5001


I run a survey platform[0] and I use an LLM to generate insights from open-ended response data. Using it for open-ended response classification as well.

[0]https://www.zigpoll.com


I'm building https://www.getmosaic.io that helps GTM teams enrich lead data and power personalization at scale, by integrating 30+ data providers and web scraping.

I've built this by using AI as the foundation for everything. I am using LLMs to classify information and extract structured data points for any webpage, or RAG for finding data.

Tech stack:

- Mistral 8x7b and Perplexity API for data processing and GPT-4 input
- GPT-4 for content output
- pgvector in Supabase
- LangChain for the pipeline and RAG stuff


An automatic video editor.

It should be cheap enough to deploy that it can be applied to relatively low-value content like video meeting recordings, so it can’t spend a lot of expensive GPU time analyzing video frames.

It also needs to be easily customizable for various content verticals and visual styling like branding and graphics overlays.

And everything is meant to be open sourced, so that’s fun!

I wrote about it on my employer’s blog here:

https://www.daily.co/blog/automatic-short-form-video-highlig...


I wish someone hooked up a chat interface to a CAD program. I find CAD very hard to get into. It would be really nice to be able to ask it how to do stuff or to modify parts. Would be very "Star Trek in the Holodeck" :)


A turing test disguised as a game:

https://humanornot.so/

Heavily inspired by https://humanornot.ai/ (which was a limited-time research project by AI21 Labs); now the project is on its own path to be more than just a test.

My work is to make AI chats sound like real humans, and it's shocking how good the AIs sometimes are.

Even I, as the creator, knowing everything (prompts, fine-tuning data, design, backend, etc.), often can't tell if I'm speaking to a human or to one of the AIs I designed.


Also a Chrome extension [0]! The concept is to use the browser's context menu to run commands on the LLM, so it stays out of your way most of the time but feels like a somewhat native experience.

The stack is: 1. TypeScript/Node/tRPC/Postgres/Redis/OpenAI on the backend 2. SolidJS/Crxjs/tRPC on the front end 3. Astro for the docs/marketing site

And deployment is currently through render.com for the databases and servers, and manually via a zip file to the Chrome webstore for the extension itself.

[0] https://smudge.ai


Open source alternative that does the same but in vanilla JS and completely in user's control https://github.com/SMUsamaShah/LookupChatGPT


Ooooh, not something I have built - I do want to, but suspect someone else has done it better than I could.

A tool to RAG a GitHub repo, so I can ask questions about how a certain library or tool works? Even better if it pulls in issues.


This is very easy to do and a great idea!

LangChain and LlamaIndex both have classes to read a directory, if you git clone, and perform RAG.

If you want it to scrape a github URL, there is a module for that too!

This starter tutorial will do RAG on a directory of files: https://docs.llamaindex.ai/en/latest/getting_started/starter...


I use sponsor block and it's really good, I like that it's community-driven but sometimes it's not available for videos so your solution sounds great.

I consult for a law firm as their founder-in-residence. For fun, I trained Llama 2 on all the non-client data of the firm so that people could ask it questions like "Who are the lawyers in Montreal who litigate American securities laws, what are their email addresses, and what time is it where they are?" It's a njs app running on Linode.

It's extremely simple, but people seem to find it useful.


I was frustrated with ChatGPT's inability to answer questions about popular-but-not-that-popular open-source projects. So I helped build a ChatGPT-like tool that can answer questions about any open-source project, and you can add your own (public) GitHub repositories to it. The tool is meant to be used by sales engineers, but can be used by anyone.

Check it out here: https://app.commonbase.ai/

It has been a huge help for me when working with certain open-source libraries.


I wrote gait, an LLM-powered CLI that sits on top of git and translates natural language commands into git commands. It's open-source: https://github.com/jordanful/gait

I also wrote PromptPrompt, which is a free and extremely light-weight prompt management system that hosts + serves prompts on CDNs for rapid retrieval (plus version history): https://promptprompt.io


I am working on an app to make it even easier to run local LLMs, with support for multiple chats, RAG, and STT. I did it mostly for learning about the different tasks that are possible using local LLMs, and specifically for my wife, who was overwhelmed with those things (and for some reason was overwhelmed setting up Ollama). The tech stack is Electron + NuxtJS. Currently it's only for Mac, but I have already started tinkering with Windows support.

https://msty.app


I am building textool [1], an app that lets you create endpoints using GPT-4. The idea is to make it so you can create "actions" for GPT-4 assistants easily.

  - Nextjs
  - Deno Deploy for hosting the apis 
  - Supabase - postgres / auth
  - Shadcn
I want to use the t3 app stack [2] for v2.

It's really MVP, but I want to see if anyone is interested at all before I work on v2: creating gpts that come with databases!

  [1] https://textool.dev
  [2] https://create.t3.gg/


IMO the Grimoire GPT's success is proof that there is a market for something like this.


Thanks for saying this! Really appreciate it :)


I created a Chrome extension which shows cryptocurrency prices & insights when you hover cash tags on Twitter. I'm a product manager with solid CS understanding, but haven't had the time to learn React or glue frontend stuff together - so about 80% of the code is generated by GPT4. I've mainly architected the code and deployed on Vercel. I feel like AI + Vercel has given me that final push to actually deploy products instead of just building stuff and leave it lying around.


I built the copilot for flux.ai, which allows LLM-driven interaction with circuit schematics and datasheets.

The stack is react / cloud run / job queue / LLMs (several) / vector db.


I’m working on some tools to help GMs of tabletop games make content for their players.

Little demo is up at npcquick.app.

Doesn’t look like much rn, but there’s no openai involved. Currently it doesn’t even use a gpu.


I am working on a part search engine for company maintenance teams. We built a search engine that searches parts in real time across a dozen or so vendors (Amazon, eBay, McMaster, etc). We then leverage ChatGPT to extract data from product titles. The part number is one of the key elements we extract. Since part numbers vary greatly across manufacturers, it's difficult to throw something like a regex at it. It has done a really good job so far for data extraction.
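A sketch of how a few-shot prompt for that extraction might be assembled; the example titles and part numbers below are invented for illustration, not from the actual system:

```python
# Hypothetical few-shot examples: (product title, expected part number).
FEW_SHOT = [
    ("SKF 6205-2RS Deep Groove Ball Bearing, 25mm Bore x 52mm OD", "6205-2RS"),
    ("Parker Hannifin O-Ring Kit, Nitrile, 30 Sizes, ORKIT-1", "ORKIT-1"),
]

def build_part_number_prompt(title):
    """Assemble a few-shot extraction prompt for an LLM to complete."""
    lines = ["Extract the manufacturer part number from each product title.",
             "Answer with the part number only."]
    for t, p in FEW_SHOT:
        lines.append(f"Title: {t}\nPart number: {p}")
    lines.append(f"Title: {title}\nPart number:")
    return "\n\n".join(lines)
```

The few-shot examples do the work that a regex can't: they demonstrate the many shapes a part number can take without having to enumerate them.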



1. An infinite crafting game: https://foodformer.com

2. An embeddings-based job search engine: https://searchflora.com

3. I used LLMs to caption a training set of 1 million Minecraft skins, then finetuned Stable Diffusion to generate minecraft skins from a prompt: https://multi.skin


I love the skin generator


https://github.com/christianhellsten/ollama-html-ui

I'm building a minimal, cross-browser, and cross-platform UI for Ollama.

Stack: HTML, CSS, JavaScript, in other words, no dependency on React, Bootstrap, etc. Deployment: web server, browser extension, desktop, mobile


I built a tool that uses LLMs to write a literature review on any research topic. (https://www.epsilon-ai.com).

It gives back ChatGPT-styled answers, but they contain citations to underlying academic articles so that you know the claims are valid. Clicking on the reference actually takes you directly to the paragraph in the source material where the claim was found.


Absurd news article generator using local LLMs. I wanted to create a static website from the articles, but ultimately didn't think anyone would give a damn. In the same vein I created a person + CV generator, and a group chat between simulated crazy people.

I made a private Discord bot for me and my friends to talk to, that also generates images using SD 1.5 LCM.

The self-hosted backend uses the ComfyUI Python API directly for images, and the LLM part uses oobabooga's web API.


I'm making two LLMs negotiate the exchange of a product. Price is the main issue, but I'm trying to make them negotiate other issues too in order to avoid the pure "bargaining" case.

I've tried several models and GPT-4 is currently the one that performs best, but open-source LLMs like Mixtral and Mixtral-Nous are quite capable too.

https://github.com/mfalcon/negotia


Created an AI explainer app that helps you understand a topic, kind of like Perplexity.

It's currently free to use. It's built using Next.js + Tailwind and is powered by Vercel + Brave + Gemini Pro. https://xplained.vercel.app

There are other projects that I worked on as part of my job, mostly around bots, search, classification, and analytics.


I built https://HackYourNews.com to summarize the top HN stories and their comments.


This is a cool project! Love the fact that some comment summaries are added too. I hope you continue building it, and also make the text easier to read - like by making the page fixed-width.


I'm building a platform where product managers and engineers can build interaction automation with users using small models. The goal is to help people build LLMs without deep expertise in DS/ML, and to train and host the models in their own infrastructure, where no data needs to be submitted.

Still in progress at https://www.chathip.com/


I built an app to make dealing with Jira less painful. It caches Jira tickets in a SQLite database, then uses GPT-3.5 to translate natural language queries into SQL that it then executes. It also uses Ollama/Mixtral to summarize Jira tickets and GitHub PRs. It can generate a summary of a single Jira ticket with its associated GitHub PRs or a whole sprint. It's written in Python and runs in the terminal.
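A rough sketch of the translate-and-execute loop described above, with the actual GPT-3.5 call left out; the important detail is refusing anything but a SELECT, since model-generated SQL is untrusted:

```python
import sqlite3

def build_sql_prompt(schema, question):
    """Assemble a natural-language-to-SQL prompt over the cached schema."""
    return (f"Given this SQLite schema:\n{schema}\n\n"
            f"Write one SELECT statement that answers: {question}\nSQL:")

def run_readonly(conn, sql):
    """Execute model-generated SQL, but only if it is a plain SELECT."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are executed")
    return conn.execute(sql).fetchall()
```

In the real app the SQL string would come back from the model given `build_sql_prompt(...)`; the guard keeps a hallucinated `DROP` or `UPDATE` from touching the cache.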


I built Joke-Understander bot, a Mastodon bot that responds to a joke setup before the punchline is revealed. It's not very popular but I think it's hilarious.

https://botsin.space/@jokeunderstander

It's just a bash script that calls ollama on my desktop PC every morning and schedules a handful of posts on the Mastodon server.


I think this is the best thing in this whole thread. It's hilarious! I really love it. Thanks for sharing!


My project team in university built a meme generator that uses GPT and Dall-E to generate image macros using Impact font. It was pretty entertaining.


A colleague and I are working on a language learning app: https://poli.xyz. It integrates into your favorite messenger and offers a wide variety of languages. You can either do freestyle conversations or play through certain scenarios. The bot corrects your grammar, translates and explains words and sentences, and supports TTS and STT.


I built this demo of using LLMs to query databases, knowledge bases, and most interestingly create PDFs. It’s targeted at financial services but similar could be achieved in many industries.

Very pleased with how it turned out as it really brings the potential of LLMs to life IMO.

https://www.youtube.com/watch?v=r8MyAxyPJsA


We've been training custom LLMs using indices pulled from domains. For example, we demoed the NFL with a Chicago Bears custom ChatGPT site search. We trained it on over 900 pages from their site, and then used human-feedback training to really polish it up. https://sapien.ai


I built a tool that computes the "average LLM" probability of code, to check how closely a piece of code matches what an LLM would output. I'm working on adding project context, to check how the style of a section aligns with the style, content, and domain of the project.

The idea is to use it to identify code that sticks out, because that's usually what's interesting or bad.
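The scoring idea can be sketched in a few lines. To keep this self-contained, a character unigram model stands in for the LLM's token probabilities (my simplification, not the tool's approach), but the logic is the same: lines with high average surprisal under the background model "stick out".

```python
import math
from collections import Counter

# Character unigram model as a stand-in for LLM token probabilities.
def train(corpus: str) -> dict:
    counts = Counter(corpus)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def surprisal(line: str, model: dict) -> float:
    # Average negative log-probability per character; unseen characters
    # get a small floor probability instead of zero.
    probs = [model.get(c, 1e-6) for c in line]
    return -sum(math.log(p) for p in probs) / max(len(probs), 1)

model = train("def add(a, b):\n    return a + b\n" * 50)
normal = surprisal("def sub(a, b):", model)
weird = surprisal("xXx_$$HACK$$_xXx", model)
print(normal < weird)  # the unusual line scores higher
```

With a real LLM you would read per-token logprobs from the model instead of a unigram table, but the ranking step is identical.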


I built a platform for homeschooling families with structured courses that are taught and graded by an llm (chatgpt 4 API).

Homeschoolmate.com


A chrome extension to ask about selected text with a right click. https://github.com/SMUsamaShah/LookupChatGPT

A chrome extension to show processed video overlay on YouTube to highlight motion.

A script to show stories going up and down on the HN front page. This one took just one prompt.


I've built a sales bot that would go through a predefined sales scenario like a real human would, being able to jump between steps and handle any complications a real conversation would throw at it. It would appear fully human to whoever conversed with it. Unfortunately, it was never deployed to production due to business reasons.


I made a platform that helps you create and execute multi-party workflows, currently focused on health but later looking to expand to other verticals. The LLM acts as an assistant when building the protocol for the workflow.

https://codifyhq.com


I built a language learning tool [1] that uses LLMs to get word definitions in the context of a sentence, among other features I'm planning to release.

I'm using modal.com as the backend for the AI related micro services.

[1] https://www.langturbo.com


I'm not sure if this is the category of "build" that you had in mind, but I used 3.5 to make a pay-as-you-go chat interface for the OpenAI API: https://benwheatley.github.io/YetAnotherChatUI/


I am stunned at how few people are sharing projects using LLMs for real estate.

I own the domain homestocompare and I am working on a project that will use AI to help compare homes. Unfortunately I don't have a working demo yet but please reach out to me if you would be interested in finding out more.


I've been building data projects for real estate for a while and applying LLMs to conversational products, but I've never mixed the two. I think something like a conversational buying/renting assistant could be feasible, but I don't think it's something interesting to monetize.


Hey, would love to hear more! I've built a bunch of projects in real estate and am happy to collaborate.


I can't find any contact info for you.

You can reach me via my twitter (in my profile) or reddit:

https://www.reddit.com/user/klavado


I use GPT-4/4-vision and other models as part of a pipeline for automatically translating comics(French/European stuff as well as Manga, Webtoons etc)

https://github.com/ogkalu2/comic-translate


Wrote an application to find myself a flat in Berlin. It scans some rental websites every minute, uses the Google Maps API to calculate the distance to my office, summarizes the rental description with the GPT-4 API, and sends it to me via Telegram.

I have no time to read all that generic "vibrant neighborhood" stuff :D
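A rough sketch of that pipeline's shape (all names here are mine, not the author's, and stubs stand in for the Google Maps, GPT-4, and Telegram calls):

```python
# Stubs stand in for Google Maps (distance), GPT-4 (summary), Telegram (send).
MAX_COMMUTE_KM = 8

def process(listings, commute_km, summarize, notify):
    for listing in listings:
        if commute_km(listing["address"]) > MAX_COMMUTE_KM:
            continue  # too far from the office, skip it
        notify(f"{listing['title']}\n{summarize(listing['description'])}")

sent = []
process(
    [{"title": "2BR Kreuzberg", "address": "A", "description": "Vibrant area..."},
     {"title": "1BR Spandau", "address": "B", "description": "Quiet area..."}],
    commute_km=lambda addr: {"A": 4.2, "B": 15.0}[addr],  # Maps API stand-in
    summarize=lambda text: text[:40],                     # GPT-4 stand-in
    notify=sent.append,                                   # Telegram stand-in
)
print(sent)  # only the Kreuzberg flat survives the distance filter
```

Keeping the three external services behind plain callables like this also makes the scrape-filter-summarize-notify loop trivial to test without hitting any API.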


That's hilarious! I always wanted to build something like that and host it on All About Berlin. Unfortunately, ImmobilienScout is working really hard to block bots. How did you get around that?

Fun fact by the way: listings on ImmobilienScout are usually gone after 5 minutes and hundreds of messages.


Do you mind sharing a github link if it is public.


It isn't.


Built a tool to summarize certification and licensing costs associated with jobs that require State credentialing.


I built https://listingstory.com as a way to learn about and play with LLMs. It's unlikely to ever be a commercial success, but it served its purpose in allowing me to learn much more about how an LLM-powered app works.


I wrote a flash card app that uses GPT-4 and Whisper speech-to-text to help me memorize Korean phrases. I'm 1,800 sentences in and have used it every day since October.

https://github.com/RickCarlino/KoalaSRS


An AI agent to answer questions about any github/gitlab repository. www.useadrenaline.com

It does the work of understanding questions in the context of a repo, a code snippet, or any programming question in general, and pulls in extra context from the internet via its own reasoning + web searches.


I built a completely useless ai phone bot. You call it and ask it a question, and it responds with an answer that always involves sandwiches.

It adds no value beyond entertainment, but I suppose it does do that.


Link? (I mean: Phone number?)


484.854.1582


> I worked on a chrome extension a few weeks ago that skips sponsorship sections in YouTube videos by reading through the transcript

You might want to connect that to SponsorBlock

https://sponsor.ajay.app/


I built https://tailgate.dev/ a few months ago. It can help with deployment of simple, client-facing generative web apps. There are a few simple demos on the home page!


Perhaps a follow-on question, since I presume a lot of people reading the comments are looking for inspiration to build things (and those building might not want to reveal yet): what would you like to see built with the capabilities provided by LLMs?


AI/ML that reads my zsh history and suggests automations or other time-savers when asked.

Something that reads my Teams and Outlook, and listens to meetings, and takes notes / remembers stuff for me.


> Something that reads my Teams and Outlook, and listens to meetings, and takes notes / remembers stuff for me.

I know of a few startups doing this, but have used Grain and have enjoyed the experience quite a bit.


We've built a prompting, synthetic data generation, and training library called DataDreamer: https://github.com/datadreamer-dev/DataDreamer


Found myself needing to find emojis relevant to a specific theme and wanted to play with OpenAI's API. So, I built https://emojisearch.fun
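One plausible core for theme-to-emoji search is embedding similarity (my guess at the approach, not necessarily what the site does): embed each emoji's description once, embed the query, and rank by cosine similarity. Tiny hand-made vectors stand in here for real vectors from OpenAI's embeddings endpoint.

```python
import math

# Cosine similarity between two vectors.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy 3-d "embeddings" of emoji descriptions; real ones would come from
# an embeddings API and have hundreds of dimensions.
emoji_vecs = {
    "🌊": [0.9, 0.1, 0.0],   # "ocean wave"
    "🔥": [0.0, 0.9, 0.1],   # "fire"
    "🏄": [0.8, 0.0, 0.3],   # "surfer"
}

def search(query_vec, k=2):
    ranked = sorted(emoji_vecs,
                    key=lambda e: cosine(query_vec, emoji_vecs[e]),
                    reverse=True)
    return ranked[:k]

print(search([1.0, 0.0, 0.1]))  # a beach-themed query vector
```

The nice property of this design is that the emoji vectors are computed once up front, so each query costs one embedding call plus a cheap ranking pass.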


I built a blog with stupid & overengineered technical solutions. Also has an audio interview for every blog post.

https://shitops.de/posts/


Can the smart folks on HN point to a good resource, or a collection of resources, for a software engineer to get up to speed with LLM and Gen AI concepts, and understand basic deployments and use cases?



Built an LLM interface to control my browser. Used it to generate playwright tests for me https://github.com/mayt/BrowserGPT


I'm building AI Construx (https://aiconstrux.com): build things with AI. I'm planning to launch the private beta by end of Feb.


I built https://QexAI.com.

I also use LLMs in some other web apps, but mainly as incidental writing aids, rather than the central feature of the app.


Upload product photos, get detailed, SEO-optimized product descriptions. https://producks.ai/


An open source retrieval augmented generation (RAG) framework:

https://www.github.com/jerpint/buster


I've built a tool to help students in the note-taking process.

It is https://cmaps.io


I'm building a chatbot (API + frontend) to translate natural language questions into SQL queries for a Snowflake database.


I built autosuggestions / catch-all prompt responses on https://aesthetic.computer, and you can also talk to characters like boyfriend, girlfriend, husband, and wife. Characters are great for kids and older users who really wouldn't experience the tech otherwise.


An app for making children’s stories.

https://schrodi.co/


An app that aggregates news from websites, blogs, YouTube channels, and podcasts, and generates easily digestible summaries, along with a small podcast version so you can stay informed in an easy, stress-free way.

Right now I’m working on including automatic fact checking and insights on how each source might be opinionated vs. reporting just the facts.

https://usetailor.com


A bookmarking extension; not much traction though: https://chromewebstore.google.com/detail/autolicious/jbmpoml...


I built a simple RAG chatbot, and my "stack" is plain openai python client at this point.


I'm making a Magic Card generator


I like this. I tried something similar ~10 years ago, but it didn't go very well. I'm sure an LLM can do much better than the nonsense I hacked together.


I wrote an autonomous AI space opera tv show generator. It takes a short topic phrase on one end and spits out a 10-15 minute 3D animated and AI voiced video suitable for upload to YouTube on the other end.

Super interesting learning exercise since it intersects with many enterprise topics, but the output is of course more fun.

In some ways it is more challenging - a summary is still useful if it misses a point or is a little scrambled, whereas when a story drops a thread it’s much more immediately problematic.

I’m working on a blog post as well as getting a dozen episodes uploaded for “season 1”.


Semi-automated transcriptions for my favourite podcast, via OpenAI Whisper.
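In my experience the fiddly part of a pipeline like this is usually not the transcription call itself but turning Whisper's output into a subtitle file. A sketch of that step, assuming the segment dicts (`start`, `end`, `text`) that openai-whisper returns from `model.transcribe(path)["segments"]`:

```python
# Assumed input: openai-whisper segment dicts with start/end/text keys.
def fmt(t: float) -> str:
    # SRT timestamps look like 00:01:23,456
    h, rem = divmod(int(t), 3600)
    m, s = divmod(rem, 60)
    ms = int((t - int(t)) * 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def to_srt(segments) -> str:
    blocks = []
    for i, seg in enumerate(segments, 1):
        blocks.append(f"{i}\n{fmt(seg['start'])} --> {fmt(seg['end'])}\n"
                      f"{seg['text'].strip()}\n")
    return "\n".join(blocks)

segments = [{"start": 0.0, "end": 2.5, "text": " Welcome to the show."}]
print(to_srt(segments))
```

The "semi-automated" part is then mostly proofreading the generated SRT before publishing it.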


The range of creativity and ingenuity in these answers is mind-boggling!


A little AI domain name generator: https://namebrewery.com/

Used SvelteKit and Supabase. Deployed to Cloudflare Pages.


I built a Flask-based Chrome extension that takes content from the DOM and sends it to ChatGPT for summarization. I also configured it to work on YouTube videos and PDFs, which helps when you want to share the tl;dr of a site or video with a friend. Next, I'm thinking of adding some more specific summary functionality, like listing out a recipe's ingredients and cooking steps.

https://chromewebstore.google.com/detail/news-article-summar...


My brother built a security scanner with an LLM


Just a personal project - I got a deep interest in the CIA's Stargate program and the declassified documents in the "reading room." I wrote a script to scrape all of the readable or OCR'd text from the documents, and fed it into GPT-3.5 to get summaries. It definitely makes reading through the documents easier.

I have all of the docs with summaries on a small webserver here: https://ayylmao.info

Simple Flask site with SQLite as the database.
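OCR'd documents frequently exceed GPT-3.5's context window, so a summarization script like this usually has to chunk first and then summarize the summaries. A hedged sketch (the chunk size, prompt wording, and map-reduce structure are my assumptions, not the site's code):

```python
# Chunk size and prompt wording are assumptions, not the author's code.
def chunk(text: str, max_chars: int = 12000):
    # Roughly 4 characters per token, so 12k chars stays inside the window.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def summarize_doc(text: str, llm) -> str:
    # Summarize each chunk, then summarize the summaries (map-reduce style).
    parts = [llm(f"Summarize:\n{c}") for c in chunk(text)]
    return parts[0] if len(parts) == 1 else llm("Combine:\n" + "\n".join(parts))

fake_llm = lambda prompt: prompt[:20]  # stand-in for the GPT-3.5 call
print(len(chunk("x" * 30000)))  # a 30k-char document becomes 3 chunks
```

Storing each document's summary alongside its id in SQLite then makes the Flask site a simple lookup.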


Feels like this needs way more discoverability. Otherwise, what's the point? My ideas:

- A better URL
- Some CSS
- More ways into the content
- Most importantly: put the whole thing on GitHub (the summaries and index matter most). The SQLite file will be too big, but you could easily have 10k or 100k small summary files.

With a proper readme and these changes (and more, of course) you might create a resource that far outlives you and helps researchers long into the future.


It’s just a scratch that needed itching, but I wrote a command-line utility for translating “SRT” format subtitles into other languages.

I hit some interesting challenges, overcoming which was a valuable set of lessons learnt:

1. GPT-4 Turbo slowed down to molasses in some Azure regions recently. Microsoft is not admitting this and is telling people to use GPT-3.5 instead. The lesson learned is that using a regional API exposes you to slowdowns and queuing caused by local spikes in demand, such as back-to-school or end-of-year exams.

2. JSON mode won’t robustly stick to higher level schemas. It’s close enough, but parsing and retries are required.

3. The 128K context in GPT-4 Turbo is only for the input tokens! The output is limited to 4K.

4. Most Asian languages use as many as one token per character. Translating 1 KB of English can blow through the 4K output-token limit all too easily.

5. You can ask GPT to “continue”, but then you have to detect if you received a partial or a complete JSON response, and stitch things together yourself… and validate across message boundaries.

6. The whole process above is so slow that it hits timeouts all over the place. Microsoft didn't bother to adjust any of their default Azure SDK timeouts for HTTP calls. You have to do this yourself. It's easy, just figure out which of the three different documented methods are still valid. (Answer: none are.)

7. You’ll need a persistent cache. Just trust me on this. I simply hashed the input and used that as a file name to store responses that passed the checks.

8. A subtitle file is about 30–100 KB so it needs many small blocks. This makes the AI lose the context. So it’s important to have several passes so it can double check and stitch things together. This is very hard with automatic parsing of outputs.

9. Last but not least: the default mode of Azure is to turn the content policy up to “puritan priest censoring books”. Movies contain swearing, violence, and sex. The delicate mind of the machine can’t handle this, and it will refuse to do as it is asked. You have to dial it down to get it to do anything. There is no “zero censorship” setting. Microsoft says that I can’t feed text to an API that I can watch on Netflix with graphic visuals.

10. The missus says that the AI-translated subtitles are “perfect”, which is a big step up from some fan translated subtitles that have many small errors. Success!

I wrote this as a C# PowerShell module because that makes it easy to integrate the utility as a part of a pipeline. E.g.: I can feed it a directory listing and it’ll translate all of the subtitles.

The performance issues meant I had to process 8x chunks in parallel. Conveniently I already had code lying around to do this in PowerShell with callbacks to the main thread to report progress, etc…
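Point 7 (the persistent cache) is worth sketching, in Python rather than the author's C#/PowerShell: hash the request payload and use the digest as the cache filename, so a re-run only re-sends blocks that never produced a valid response.

```python
import hashlib
import json
import os
import tempfile

# Hash the request payload; the digest becomes the cache filename.
def cached_call(payload: dict, call, cache_dir: str) -> str:
    key = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()).hexdigest()
    path = os.path.join(cache_dir, key + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)["response"]
    response = call(payload)      # the real (slow, flaky) API call
    with open(path, "w") as f:    # only store responses that came back
        json.dump({"response": response}, f)
    return response

calls = []
def fake_api(payload):
    calls.append(payload)
    return "translated block"

with tempfile.TemporaryDirectory() as d:
    cached_call({"block": 1}, fake_api, d)
    cached_call({"block": 1}, fake_api, d)  # second call hits the cache
print(len(calls))  # the API was only invoked once
```

In the real tool you would only write the cache entry after the response passes the JSON-schema checks from point 2, so a partial or refused response gets retried instead of stuck in the cache.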


I built a tool to repeat a chat discussion against a set of data.

Let's say you have a row with 4 fields: you chat with that row, then apply the same conversation to all the other rows!

https://www.youtube.com/watch?v=e550X6R89W4 https://bulkninja.com/



