Do you work from the command line? Butterfish is a project I wrote for myself to use AI prompting seamlessly, directly from the shell. I hope it's useful to others; give it a try and send feedback!
> Within Butterfish Shell you can send a ChatGPT prompt by just starting a command with a capital letter, for example:
This is a dangerous assumption. Not all commands are lowercase. Interaction with an external service should be a deliberate, discrete action on the user's part.
I like that a lot! It would be awesome if the client running in goal mode could query a search engine API and do some crawling. Imagine getting the info out of up-to-date GitHub issues or directly from the AWS docs.
I've experimented with it; the reason I haven't added it yet is that I want deployment to be seamless, and it's not trivial to ship a binary that (without extra fuss or configuration) efficiently supports Metal and CUDA, plus downloads the models gracefully. This is of course possible, but still hard, and it's not clear it's the right place to spend energy. I'm curious how you think about it - is your primary desire to work offline, to avoid sending data to OpenAI, or both?
The latter mostly. It's also free, uncensored, and can never disappear from under me.
FWIW, from my understanding llama.cpp is pretty easy to integrate and is reasonably fast for being API agnostic. Ollama embeds it, for example. No pressure, just pointing it out :)
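To make that concrete, here's roughly what talking to a locally running llama.cpp server looks like in Python (a minimal sketch, assuming a server started with something like `./server -m model.gguf --port 8080` from the llama.cpp repo; the `/completion` endpoint and field names match the bundled server example at the time of writing, but check your build's docs):

```python
import requests  # assumes `pip install requests`

def local_complete(prompt: str, n_predict: int = 128) -> str:
    # POST to the llama.cpp server example's /completion endpoint
    resp = requests.post(
        "http://localhost:8080/completion",
        json={"prompt": prompt, "n_predict": n_predict},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["content"]

print(local_complete("Explain what `tar -xzf archive.tar.gz` does:"))
```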
Really like the design of these tools so that you can easily pipe between them; this is a good way to make things composable. Also really cool to see all the other CLI tools folks have posted here - lots I wasn't aware of.
I've been experimenting with CLI/LLM tools and found my favorite approach is to make the LLM constantly accessible in my shell. The way I do this is to add a transparent wrapper around whatever shell you use (bash, zsh, etc.), send commands that start with capital letters to ChatGPT, and manage a history of local commands and GPT responses. This means you can ask questions about a command's output, autocomplete based on ChatGPT suggestions, etc.
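For the curious, here's a drastically simplified sketch of that routing idea (this is not how Butterfish is actually implemented - a real version wraps the shell in a PTY to stay transparent, and the model name and history format below are just placeholders):

```python
import os
import subprocess

import requests  # assumes `pip install requests` and OPENAI_API_KEY set

history = []  # rolling log of commands, output, and LLM answers

def ask_llm(prompt: str) -> str:
    """Send the prompt plus recent history to the OpenAI chat completions API."""
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "gpt-3.5-turbo",  # placeholder model name
            "messages": [{"role": "user", "content": "\n".join(history + [prompt])}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

while True:
    try:
        line = input("$ ")
    except EOFError:
        break
    if line and line[0].isupper():  # capitalized -> treat as an LLM prompt
        answer = ask_llm(line)
        print(answer)
        history.append(f"Q: {line}\nA: {answer}")
    elif line:  # everything else -> run as a normal shell command
        result = subprocess.run(line, shell=True, capture_output=True, text=True)
        print(result.stdout, end="")
        history.append(f"$ {line}\n{result.stdout}")
    del history[:-20]  # keep a bounded history window
```

Because both command output and LLM answers land in the same history, follow-up questions about the last command's output come for free.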
- An unmentioned alternative to this pricing is that GCP has a deal with Cloudflare that gives you a 50% discount to what is now called Premium pricing for traffic that egresses GCP through Cloudflare. This is cheaper for Google because GCP and Cloudflare have a peering arrangement. Of course, you also have to pay Cloudflare for bandwidth.
- This announcement is actually a small price cut compared to existing network egress prices for the 1-10 TiB/month and 150+ TiB/month buckets.
- The biggest advantage of using private networks is often client latency, since packets avoid points of congestion on the open internet. They don't really highlight this, instead showing a chart of throughput to a single client, which only matters for a subset of GCP customers. The throughput chart is also a little bit deceptive because of the y-axis they've chosen.
- Other important things to consider if you're optimizing a website for latency are CDN placement and where SSL negotiation takes place. For a single small HTTPS request, doing SSL negotiation at the network edge can make a pretty big latency difference (see the back-of-the-envelope sketch after this list).
- Interesting number: Google capex (excluding other Alphabet capex) in both 2015 and 2016 was around $10B, at least part of that going to the networking tech discussed in the post. I expect they're continuing to invest in this space.
- A common trend with GCP products is moving away from flat-rate pricing models to models which incentivize users in ways that reflect underlying costs. For example, BigQuery users are priced per-query, which is uncommon for analytical databases. It's possible that network pricing could reflect that in the future. For example, there is probably more slack network capacity at 3am than 8am.
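To put rough numbers on the SSL negotiation point above (the RTTs here are illustrative assumptions, not measurements):

```python
rtt_edge_ms = 10        # client <-> nearby edge PoP (assumed)
rtt_origin_ms = 80      # client <-> distant origin (assumed)
tls_handshake_rtts = 2  # classic TLS 1.2 full handshake (TLS 1.3 needs 1)

def first_byte_ms(rtt_ms: float, handshake_rtts: int) -> float:
    # TCP handshake (1 RTT) + TLS handshake + request/response (1 RTT)
    return rtt_ms * (1 + handshake_rtts + 1)

# Everything negotiated at the distant origin:
print(first_byte_ms(rtt_origin_ms, tls_handshake_rtts))  # 320 ms
# TLS terminated at the edge, then proxied to the origin over a
# pre-warmed connection (one extra origin round trip):
print(first_byte_ms(rtt_edge_ms, tls_handshake_rtts) + rtt_origin_ms)  # 120 ms
```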
I like your thinking, but one minor clarification: BigQuery actually introduced Flat Rate [0] (a year ago) and Committed Use Discounts [1] (Amazon RIs are similar), since that's kind of what some enterprises want. These are optional and flexible.
I personally still hold that pay-per-use pricing is the cloud native approach [2], the most cost-efficient, and the most customer-friendly. However, it's unfamiliar and hard to predict, so starting out on Flat Rate pricing as a first step makes sense.
(I work at Google and was part of the team that introduced BQ Flat Rate.)
The problem with bundling is that it stops reflecting underlying costs and creates incentives that skew the customer population.
Contrived example: since most HDD workloads are IOPS-bound, you decide to sell IOPS bundles and give away the space for free. Before long, all your customers are backup companies with low IOPS and high space usage. Your service runs at a loss while customers do a nice price arbitrage on top of it.
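Putting hypothetical numbers on that (all prices and costs below are made up for illustration):

```python
iops_price = 0.05    # $/provisioned IOPS/month - what you charge (made up)
iops_cost = 0.03     # $/IOPS/month - what the hardware costs you (made up)
storage_cost = 0.02  # $/GB/month - given away "free" in the bundle (made up)

def monthly_margin(iops: int, gb: int) -> float:
    revenue = iops * iops_price
    cost = iops * iops_cost + gb * storage_cost
    return revenue - cost

print(monthly_margin(iops=1000, gb=500))   # mixed workload: +$10/month
print(monthly_margin(iops=50, gb=10000))   # backup workload: -$199/month
```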
Same goes for all aspects of computing platforms for sale: CPUs, RAM, Networking, HDDs, SSDs, GPUs.
Two additional problems are bin packing and provisioning: you need to sell things in quantities and ratios that let you actually utilize your hardware configurations effectively, and you need to order and design your hardware flexibly enough to adapt to changing ratios of component needs as customer demand shifts.
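A toy illustration of the bin-packing point (the machine and VM shapes here are made up):

```python
machine = {"cpus": 64, "ram_gb": 256}  # a balanced 1 CPU : 4 GB machine

def vms_per_machine(vm_cpus: int, vm_ram_gb: int) -> int:
    # A VM fits only if both its CPU and RAM demands can be met
    return min(machine["cpus"] // vm_cpus, machine["ram_gb"] // vm_ram_gb)

# A balanced VM shape (2 CPU : 8 GB) packs the machine perfectly:
print(vms_per_machine(2, 8))   # 32 VMs, 100% of both resources sold

# A RAM-heavy VM shape (2 CPU : 32 GB) strands most of the CPUs:
n = vms_per_machine(2, 32)     # only 8 VMs fit
print(f"{n * 2}/{machine['cpus']} CPUs used")  # 16/64 - 75% of CPU idle
```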
So it's easier to run "pay for what you use plus profit" pricing, but some customers don't like it due to perceived complexity and potential unpredictability.
We've been running it for about a year to power Quizlet - overall things have been good and we're happy. AWS and GCP are complicated enough that they're tough to compare holistically, but on most of the things we care about we find GCP to be equivalent or better (sometimes significantly) than AWS. It really does have better networking and disk technology, and the pricing is much better. Here's the analysis we did: https://quizlet.com/blog/whats-the-best-cloud-probably-gcp.
To be more explicit on this point - I think Azure is a good product, and the growth Microsoft is seeing speaks for itself. However, for our use case, which is at least somewhat representative of a rapidly growing Linux-based startup, we didn't see any compelling advantage in the Azure compute product (we may use one of their ML products in the future). Hence, it made sense to narrow the focus to what we thought were the best options. We eliminated Azure because our preliminary analysis didn't uncover any big advantages over the other clouds, we wanted a cloud more focused on Linux, and we don't currently use any products in the Microsoft ecosystem.
Yes, I know Azure runs Linux; let me unpack that point: we had previously run on a cloud whose flagship OS wasn't Linux. The effect we observed was that Linux was a second-class citizen in terms of features and performance. Perhaps it's unfair to project that onto Azure, but I think it's true that AWS and GCP think about Linux first and Azure doesn't. Running a company on the cloud means relying on the compute product (GCE/EC2) as the foundation of your infrastructure, so we think this makes a difference.
It would be valuable for a lot of people to see more comprehensive stats across all clouds - I would love to see this personally and I think it would help people make better decisions about cloud infrastructure.
Sure - this isn't really a dimension of comparison, just something I found interesting / surprising. It seems like SDN is probably the future, and this is an illustration of how it's different.
This also enables shell-aware LLM prompting!