seligman99's comments

I suspect this is where Windows backwards compatibility bites them a bit. I've got a very old tool [1] that uses WriteProcessMemory and CreateRemoteThread to create a thread in the command process that launched it, which then changes the current directory of that process.

It works to this day, despite looking exactly like what malware would do. My tool is nothing in the grand scheme, but I suspect I'm not the only one doing this sort of shenanigans, and no doubt some big important app is doing it and can't be bothered to fix itself, so MS is stuck supporting it.

[1] https://github.com/seligman/ccd


This is the same idea I used for my Cloud IP lookup tool [1]. It lets everything work in the browser, with only a small file to search against.

[1] https://cloud-ips.s3-us-west-2.amazonaws.com/index.html


Oh this is nice, and a cool use of HTTP range requests!
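
For anyone unfamiliar with the trick, here is a minimal sketch of how a client can binary-search a sorted file using only byte-range reads. The record layout (two big-endian 32-bit integers per record) and the `build_table`, `read_range` and `lookup` helpers are assumptions for illustration, not the format the linked tool uses. Over HTTP, `read_range` would be a GET with a `Range: bytes=first-last` header; here it slices an in-memory buffer so the sketch runs offline.

```python
import ipaddress
import struct

# Hypothetical record layout: range start, range end (inclusive), big-endian.
RECORD = struct.Struct(">II")

def build_table(ranges):
    """Pack sorted, non-overlapping (start, end) integer pairs into bytes."""
    return b"".join(RECORD.pack(s, e) for s, e in sorted(ranges))

def read_range(blob, first, last):
    """Stand-in for an HTTP request with 'Range: bytes=first-last' (inclusive)."""
    return blob[first:last + 1]

def lookup(blob, ip):
    """Binary search the table, fetching one fixed-size record per probe."""
    target = int(ipaddress.IPv4Address(ip))
    lo, hi = 0, len(blob) // RECORD.size - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        offset = mid * RECORD.size
        chunk = read_range(blob, offset, offset + RECORD.size - 1)
        start, end = RECORD.unpack(chunk)
        if target < start:
            hi = mid - 1
        elif target > end:
            lo = mid + 1
        else:
            return (start, end)
    return None
```

Each probe costs one small request, so a table of a million records needs about 20 round trips instead of a full download.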


Oh, then you'll love: Hosting SQLite databases on GitHub Pages or any static file hoster (2021), https://news.ycombinator.com/item?id=27016630


cool


I ended up using the same basic layout for the database behind a little IP lookup tool I wrote to make lookups somewhat responsive from JavaScript [1]. It ends up working out pretty well.

[1] https://cloud-ips.s3-us-west-2.amazonaws.com/index.html


My internship project involved a tool to look up individual IP addresses as well as checking a range of address blocks. I ended up implementing the trie data structure, which performed well in terms of storage and run-time efficiency.

My initial approach was to store IP addresses in a hashmap, but if the average query is to check address blocks like a /24, that turns out to be a very poor data structure to search through.
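
A rough sketch of the trie approach, assuming IPv4 only and a plain dict-of-dicts node representation. The `PrefixTrie` class and its method names are hypothetical, not the internship code. It shows why a prefix such as a /24 is a natural key: a block is just a path of `prefixlen` bits from the root, and a lookup walks the address bits while remembering the deepest block it passed.

```python
import ipaddress

class PrefixTrie:
    """Binary trie keyed on the leading bits of an IPv4 network prefix."""

    def __init__(self):
        self.root = {}

    def insert(self, cidr, value):
        """Store value at the node reached by the network's prefix bits."""
        net = ipaddress.ip_network(cidr)
        bits = format(int(net.network_address), "032b")[:net.prefixlen]
        node = self.root
        for bit in bits:
            node = node.setdefault(bit, {})
        node["value"] = value

    def longest_match(self, ip):
        """Return the value of the most specific block containing ip, or None."""
        bits = format(int(ipaddress.IPv4Address(ip)), "032b")
        node = self.root
        best = node.get("value")
        for bit in bits:
            node = node.get(bit)
            if node is None:
                break
            best = node.get("value", best)
        return best
```

Inserting "10.0.0.0/8" and "10.1.2.0/24" and then querying "10.1.2.3" returns the /24's value, while "10.9.9.9" falls back to the /8. A lookup touches at most 32 nodes regardless of how many blocks are stored.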


WhisperX along with whisper-diarization runs at around 20x real time on audio with a modern GPU, so for that part you're looking at around $1 per twenty hours of content on a g5.xlarge, not counting the time to build up a node (or around half that at Spot prices, assuming you're much luckier than I am at getting stable Spot instances these days).

You can short circuit that time to build up a node a bit with a prebaked AMI on AWS, but there's still some amount of time before a new node can start running at speed, around 10 minutes in my experience.
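
The back-of-the-envelope math above can be sketched as follows. The $1 per instance-hour rate is only what the "$1 per twenty hours at 20x" figure implies, not a quoted price, and the `transcription_cost` function and its parameter names are made up for illustration.

```python
def transcription_cost(audio_hours, speedup=20.0, hourly_rate=1.0, startup_minutes=10.0):
    """Estimate compute cost in dollars for one node.

    speedup: how many times faster than real time the pipeline runs.
    hourly_rate: assumed instance price per hour (implied by the comment).
    startup_minutes: time billed before the node is running at speed.
    """
    gpu_hours = audio_hours / speedup
    startup_hours = startup_minutes / 60.0
    return (gpu_hours + startup_hours) * hourly_rate
```

With these defaults the startup time adds about 17 cents per node, which matters for short jobs and is noise for long ones.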

I haven't looked at this particular solution yet, but I really find the LLMs to be hit or miss at summarizing transcripts. Sometimes it's impressive, sometimes it's literally "informal conversation between multiple people about various topics"


For $5 for 20 hours of audio you can try https://deepgram.com.

They give $200 of credit.


Along the same lines, in the Windows world:

The current directory is managed with SetCurrentDirectory/GetCurrentDirectory; however, the cmd.exe command-line shell also stores the current directory for each drive in an environment variable like "=C:", and the CRT and shell hide all environment variables that start with a "=".

It gets mightily confused if these two concepts of current directory ever diverge.


Who is still using cmd.exe? I understand that there are system processes that still need it, but if you see a human using cmd in the year of our lord 2023, that’s a cry for help!


What should I be using instead?

I don’t mind cmd.exe and it launches instantly (same reason I frequently use notepad.exe for quick edits). That latter quality is very hard to find :)

Edit: but if you meant for scripting, yeah, batch files are terrible.


On my Windows 10 with no profile it takes 1-2 seconds (Ryzen 3600/M2/32GB RAM). Like, what is it doing? I get annoyed if bash on Linux takes like 250ms.


Opening cmd.exe or PowerShell (or something else)?


PowerShell


Personally I use the bash that comes with git for Windows. I only need to use cmd.exe for creating symlinks since mklink is a built-in.


It's installed everywhere on any version of Windows and works fine for interactive tasks (personally I wouldn't write anything but the simplest scripts for it; anything with for loops is a big no-no).


PowerShell is installed everywhere on any version of Windows that still receives security updates.


I do. I don't like PowerShell (and it took me years to realize it wasn't a diagnostic tool for power management), and I find bash for Windows to be ill fitting. I don't do a lot of stuff in the command line on Windows, so working like it has for decades is a plus.


My similar take on the idea: https://cloud-ips.s3-us-west-2.amazonaws.com/index.html

I wanted to see if I could do the lookup work client-side, and also include some more metadata about cloud providers' IPs (region, service, etc). Not that it's really better, just a toy idea I had.


$0.005 per hour per IP.

Assuming AWS has 50% utilization on IPs they've assigned for EC2, this is a $1.28 billion/yr fee they created.
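
A quick check of what that figure implies, working backwards from the two numbers given. The address count is not stated above, so it is derived here rather than assumed, and the function names are just for illustration.

```python
HOURLY_FEE = 0.005          # dollars per IP per hour, from the comment
HOURS_PER_YEAR = 24 * 365   # ignoring leap years

def annual_fee_per_ip(hourly_fee=HOURLY_FEE):
    """Yearly charge for one address billed every hour."""
    return hourly_fee * HOURS_PER_YEAR

def implied_billed_ips(total_per_year, hourly_fee=HOURLY_FEE):
    """Number of continuously billed addresses needed to reach the total."""
    return total_per_year / annual_fee_per_ip(hourly_fee)
```

One address costs about $43.80 a year, so $1.28 billion/yr corresponds to roughly 29 million billed addresses, or about 58 million assigned at 50% utilization.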

Scale is fun.


Extra scandalous too since stuff like load balancers will use a bunch of IPv4 addresses that they can now charge extra for, with no way to avoid the fees since you can't turn off IPv4.


That’s also assuming that adding the fee with 6 months of notice won’t make people reduce their IP address usage to avoid the fee.


Jeff will be able to commission another sculpture for his yacht.


I asked it to summarize the transcript for a podcast about an episode of Star Trek. One minor issue in grammar, but otherwise, it does a remarkable job of making a summary:

> In the Star Trek episode "The Cage", Captain Pike is held captive by a race of advanced televisions. He is offered a life of luxury in a cage, but he realizes that this is not the life he wants and manages to escape. The televisions watch everything on TV and have become too voyeuristic and passive. At the end of the episode, Pike makes the right choice and is able to see the difference between the falsehood of the fantasy and the difficult but more acceptable reality. This episode highlights the importance of exploration and being better than oneself, and is a reminder of the dangers of becoming too passive and voyeuristic. It also shows how technology can be used to create illusions and how these illusions can be used to manipulate people.


I wrote MicroKeys[1] out of a similar frustration. Granted, I never got past the POC stage, so it's not as feature rich as AHK, but it solved a very specific itch I had. I debate if I should flesh it out further, or try using AHK again.

[1] https://github.com/seligman/microkeys


Long ago I had my podcast downloader keep all files it downloads and recently I've been using OpenAI's Whisper to go through and create transcripts of the 8000 or so hours of data I have downloaded over the years.

It's very cool to be able to search through and remind myself of something I heard once. Not exactly life changing, but still, nice to be able to quickly drill down and find audio for something when a curiosity strikes me.


What kind of hardware do you have that makes it feasible to process thousands of hours of podcasts? I want to do the same but I’ve heard that Whisper requires some serious GPU might for decent accuracy (Linux Unplugged podcast specifically).


Yep, it takes a bit of GPU RAM. I'm using 3 machines with NVidia 3080 or better. I let them go for a few weeks over the winter break when I was mostly disconnected from the tech world. The workers prioritized podcasts I'm personally likely to want to search, and got through almost a third of my archive.

Now it's down to 1 or 2 machines depending on what's going on, so it'll take much longer to finish up, but I'm in no rush.


8000 hours? Napkin math time, that's 20 years of 10+ hours daily.

I call BS.


It's about an hour or two a day.

This includes data from 1995 on. The early data is backfill of radio shows that transitioned to podcasts and dumped old episodes in their feed at some point. My reader itself started in 2012, and since then I've downloaded around 7000 hours of new podcasts, which works out to 1.7 hours per day. So, around 2 hours per day, since I don't listen every day, and to be fair, I haven't listened to every podcast I've downloaded; some don't interest me. But 1-2 hours of listening a day is the sweet spot for me.


My math says 365 days x 10 hours/day = 3650 hours. 8000 hours is just over 2 years, not 20.


You need to resize your napkins.

