Happy to help and thanks! Do let me know how the material could be better. Always looking to improve it.
To answer your question:
EECS Major in college; 20 years of engineering and product management experience. I have given a few technical talks at conferences and I do enjoy the process of explaining things though it takes a surprising amount of work.
When I went to school ironically neural nets were the one thing they didn't cover in the intro to AI courses. I've basically learned modern AI from just filling my own curiosity over the years through online resources on nights and weekends. Learned a lot from Jeremy Howard's Fast.ai and Andrej Karpathy's stuff just like everyone else. I really wanted to know how every step of GPT worked, kind of like how you learn Computer Architecture in college: you learn how CPUs work in principle starting with circuits. Then I got a crazy idea the whole model could fit in a spreadsheet because well I just really like spreadsheets. Went down a 2-3 month rabbit hole in my non-existent sparetime to make it work.
The passion and curiosity you have for the topic are palpable in the videos. Thank you again for also sharing with us your exploration into this learning endeavor.
This work is another classic in the "neural nets meet spreadsheets" genre [0]. Really helps visualize what is going on in (at least some) latent spaces.
This is so awesome!! I'm going to have to show this in the embeddings video I'm working on when I discuss non-text embeddings and CLIP.
While I created spreadsheet-are-all-you-need.ai as teaching tool, as I've been playing with it I've been having a growing suspicion the spreadsheet interface for AI might be useful beyond teaching, either as a power user control interface or for interpretability. For example, making simple changes to the architecture of GPT and observing how it changes the model behavior can be as simple as cloning a tab and a few spreadsheet functions. Of course, you can do the same in python as well so it remains to be seen.
It's actually available in beta. It was announced while I working on this project but I kept going with pure Excel functions because I wanted to illustrate the transformer without abstractions getting in the way. It would make many aspects easier but also make it easier to hide a lot.
That being said, Python+Excel makes a ton of sense in general. And in this project, it would help in the tutorials. For example, in the embeddings tutorial I'm working on I wanted use PCA plots and SVD to illustrate the workings of embeddings but neither are natively supported in Excel without paid plug-ins. But both are easy in Python.
Not 100% sure this will work, but there is a Mailchimp signup, and Mailchimp newsletters have RSS feeds. Based on the newsletter URL the RSS feed should be https://us21.campaign-archive.com/feed?u=96d33fd949860389358.... There's nothing in there yet but I would expect posts to show up if they start sending out emails.
To be notified of updates. Why in preference to email? Web feeds are typically presented as information streams while email is presented as interpersonal dialogue. Organizing your information streams into an aggregator is clarifying.
Cannot answer for parent post, but can share personal perspective. I love Atom / RSS because it allows me to aggregate all the information that I care about in one place. It greatly improves the signal to noise ratio and spares my communication channels for direct human interactions.
Author here. Yes, Jeremy Howard and fast.ai was one of the inspirations for this! I'd actually be curious what he thinks of the project if he ever sees it.
I'm not a big Excel user, but I see errors that I get when I type in English function names while using a non-English version of Excel. Is it correct that functions (and thus this xlsb file) are not portable to other language versions of Excel?
It's been my experience that Excel spreadsheets are not transferable from one locale to another. Maybe there is a "culture-invariant" version of a spreadsheet but I haven't found it.
Even the CSV export of Excel sets the separator based on locale, rendering them hard to use in international setting. I work at a European statistics office, and although there's SDMX, it's not under Save-as in Excel.
> Aside from the minuscule context length, it also lacks the instruction tuning and reinforcement learning from human feedback (RLHF) that turn a large language model into a chatbot.
Strictly necessary? Maybe not. I wrote that before URIAL [1][2]. I actually haven't tried URIAL in GPT2 small but I need to give it a whirl. Might be too small a model to work?
Even if URIAL works with GPT2 small, the really small context length in the Excel file as currently implemented will make it hard to leverage. I've considered a more flexible implementation to support a longer context length (e.g. using Macros to build the layout of the sheet) but have prioritized the teaching videos first.
By default it's just going to be a text completion model, you want an additional round of training to make it behave like a chatbot. I guess you could probably get away with just fine-tuning on chatbot discussions, but everybody uses RLHF so I guess it must be much more efficient for that.
I find this visualization is useful for showcasing the difference between the different model generations: https://bbycroft.net/llm . The difference in scale is massive.
I try to look at it as motivation to stay committed.
Like spreadsheets, domain names are a bit of an obsession for me. One of the other AI side projects I'm working on is a CustomGPT to help come up with domain names (https://niftynamer.com) so I don't have to come-up-with-another-unwieldy-long-domain-name.again.
Classic scenario in legacy finance tech: the hotshot trader or investment banker shakes his fist furiously demanding to be given back his Excel spreadsheets and the entire tech department be fired.
It's GPT-2 Small, which is only ~125 million parameters. That's big, but it's not out of reach for modern versions of Excel. The final spreadsheet is a 1.25GB file.
It’s often true though. So many things I think “I could make an app for that” I wind up just using a spreadsheet for. At least it helps me explore the use cases more deeply for when I’d want to actually take it to the next level.
I hung out with a friend while they solved Advent of Code challenges in Excel, that was a trip to watch.
Check out "Google AppSheets". I've only scratched the surface while investigating other stuff but it's basically "drag-and-drop mobile GUI builder w/ sheets as a backend". If it were 2005 it would SLAY so much code. As it is, it seems really useful but the outcome seems a bit generic for modern tastes.
This seems quite hidden. Googling "google appsheets" results in "did you mean 'google sheets'?" and showing only results for google sheets unless you specifically then request AppSheets.
It sounds cool, but I'd hate to rely on it since Google will probably shut it down.
I wonder at what point we stop calling them "spreadsheets" though.
You mention Excel, and a bunch of us do it Google Sheets, but at this point it's not about sheets of data anymore and more about the interface and runtime, and we have full applications running in it.
I remember a colleague running API tests inside his Excel sheet to more easily check for the different parameter combinations, but telling everyone he was still using Postman just to avoid discussing it.
A spreadsheet is really just an easily accessible, visual, functional programming environment. I think the question is not how to make spreadsheets more programmable, but how to make programming IDEs as simple as spreadsheets.
From the discussion of Brad Myers' classic 1990 paper (originally published by the ACM CHI conference in 1986, then updated in 1990 in the Journal of Visual Languages & Computing), "Taxonomies of Visual Programming and Program Visualization" (where Brad dropped by to answer questions):
>Brad Myers' paper answers the age-old argument about whether or not spreadsheets are visual programming languages!
>Google sheets (and other google docs) can be programmed in "serverless" JavaScript that runs in the cloud somewhere. It's hellishly slow making sheets API calls, though. Feels like some kind of remote procedure call. (Slower than driving Excel via OLE Automation even, and that's saying something!) Then it times out on a wall clock (not cpu time) limit, and breaks if you take too long.
>A CS grad student friend of mine was in a programming language class, and the instructor was lecturing about visual programming languages, and claimed that there weren't any widely used visual programming languages. (This was in the late 80's, but some people are still under the same impression.)
>He raised his hand and pointed out that spreadsheets qualified as visual programming languages, and were pretty darn common.
>They're quite visual and popular because of their 2D spatial nature, relative and absolute 2D addressing modes, declarative functions and constraints, visual presentation of live directly manipulatable data, fonts, text attributes, background and foreground colors, lines, patterns, etc. Some even support procedural scripting languages whose statements are written in columns of cells.
>Maybe "real programmers" would have accepted spreadsheets more readily had Lotus named their product "Lotus 012"? (But then normal people would have hated it!)
Excerpt from "Taxonomies of Visual Programming and Program Visualization", by Brad A Myers, 1990/3/1, Journal of Visual Languages & Computing, Volume 1, Issue 1, pages 97-123:
Spreadsheets, such as those in VisiCalc or Lotus 1-2-3, were designed to help nonprogrammers manage finances. Spreadsheets incorporate programming features and can be made to do general purpose calculations [71] and therefore qualify as a very-high level Visual Programming Language. Some of the reasons that spreadsheets are so popular are (from [43] and [1]):
1. the graphics on the screen use familiar, concrete, and visible representation which directly maps to the user's natural model of the data,
2. they are nonmodal and interpretive and therefore provide immediate feedback,
3. they supply aggregate and high-level operations,
4. they avoid the notion of variables (all data is visible),
5. the inner world of computation is suppressed,
6. each cell typically has a single value throughout the computation,
7. they are nondeclarative and typeless,
8. consistency is automatically maintained, and
9. the order of evaluation (flow of control) is entirely derived from the declared cell dependencies.
The first point differentiates spreadsheets from many other Visual Programming Languages including flowcharts which are graphical representations derived from textual (linear) languages. With spreadsheets, the original representation in graphical and there is no natural textual language.
Action Graphics [41] uses ideas from spreadsheets to try to make it easier to program graphical animations. The 'Forms' system [43] uses a more conventional spreadsheet format, but adds sub-sheets (to provide procedural abstraction) which can have an unbounded size (to handle arbitrary parameters).
A different style of system is SIL-ICON [49], which allows the user to construct 'iconic sentences' consisting of graphics arranged in a meaningful two-dimensional fashion, as shown in Figure 5. The SIL-ICON interpreter then parses the picture to determine what it means. The interpreter itself is generated from a description of the legal pictures, in the same way that conventional compilers can be generated from BNF descriptions of the grammar.
10. Conclusions
Visual Programming and Program Visualization are interesting areas that show promise for improving the programming process, especially for non-programmers, but more work needs to be done. The success of spreadsheets demonstrates that if we find the appropriate paradigms, graphical techniques can revolutionize the way people interact with computers.
>By the way, something I always meant to ask you, Brad:
How does "C32" fit into your acronym theme of gemstones and rocks? Is it a teeny tiny 4x4x2 carbon atom block of diamond? How many carats would that be?
>Brad Myers wrote several articles in that book about his work on PERIDOT and GARNET, and he also developed C32:
>C32: CMU's Clever and Compelling Contribution to Computer Science in CommonLisp which is Customizable and Characterized by a Complete Coverage of Code and Contains a Cornucopia of Creative Constructs, because it Can Create Complex, Correct Constraints that are Constructed Clearly and Concretely, and Communicated using Columns of Cells, that are Constantly Calculated so they Change Continuously, and Cancel Confusion
>Spreadsheet-like system that allows constraints on objects to be specified by demonstration. Intelligent cut and paste. Implemented using Garnet.
> The dynamic spreadsheet is a good example of such a tissuelike superobject. It is a simulation kit, and it provides a remarkable degree of direct leverage. Spreadsheets at their best combine the genres established in the 1970's (objects, windows, what-you-see-is-what-you-get editing and goal-seeking retrieval) into a "better old thing" that is likely to be one of the "almost new things" for the mainstream designs of the next few years.
The "Visual" in "Visual Programming Language" is about the graphical, interactive method of creating and understanding programs, rather than merely the visibility of textual or graphical code.
Spreadsheets typically show the entire formula of the selected cell at the top of the window, at the full width of the window.
Visual programming languages based on outliners and notebooks, like UserLand Frontier, Jupyter, or Mathematica, let you hide the code by closing the outlines or code editor views.
Ivan Sutherland's pioneering PhD thesis "Sketchpad" didn't show code or formulas or constraints on the screen all the time either, focusing on the graphical content itself that you were creating and editing and programming with direct manipulation, demonstration, and constraints, instead of just the code.
>This video is a TV show made about the software Ivan Sutherland developed in his 1963 thesis at MIT's Lincoln Labs, "Sketchpad, A Man-Machine Graphical Communication System", described as one of the most influential computer programs ever written. This work was seminal in Human-Computer Interaction, Graphics and Graphical User Interfaces (GUIs), Computer Aided Design (CAD), and contraint/object-oriented programming. While watching this video, remember that the TX-2 computer (built circa 1958) on which the software ran was built from discrete transistors (not integrated circuits -it was room-sized) and contained just 64K of 36-bit words (~272k bytes).
Many visual programming languages don't necessarily display a connected graph of nodes, or explicit visual code beyond the data you're acting on.
What many people don't realize about visual programming languages is that there are so many of them that look and behave extremely differently than the few recent and popular ones they might have actually seen and used, like Unreal Blueprints.
People have been inventing wildly diverse VPLs for a long time, and there is no one standard visual design (like graphs with boxes and arrows, interlocking blocks, text, images, etc) or execution model (like data flow, control flow, cellular automata, constraints) or interface style (like keyboard, mouse, dialog panels, notebook, direct manipulation, demonstration or example) that defines the genre.
Your narrow definition of VPL excludes not only spreadsheets but also Visual Programming by Example (VPBE), Programming by Demonstration (PBD), and Visual Constraint Programming (VCP), topics that Brad Myers and others have studied, researched, and written code and papers about for decades.
It also excludes groundbreaking influential work like Ivan Sutherland's 1963 "Sketchpad" PhD thesis that pioneered Visual Constraint Programming and many other advanced interaction techniques.
VPBE and PBD enable users to teach the computer new behaviors by demonstrating actions on the interface, rather than by writing code explicitly. This approach is inherently visual and interactive, focusing on the outcomes of actions to infer the underlying logic or procedures.
Brad Myers' 1987 PhD thesis "Creating User Interfaces by Demonstration" is one of his early and influential works in the field of PBD. It discusses the design, implementation, and evaluation of Peridot, a system that allows users to create user interfaces by demonstrating actions instead of writing code.
>Peridot was a system for creating User Interfaces that was created between 1985 - 1987. This video was previously published as: Brad A. Myers. Creating User Interfaces by Demonstration: The Peridot UIMS. Technical Video Program of the SIGCHI'88 Conference, Washington, D.C., May 15-19, 1988. and IFIP Interact '87 Conference on Human-Computer Interaction. Stuttgart, West Germany. Sept. 1-4, 1987. SIGGRAPH Video Review, Issue 59, no. 2.
"Demonstrational Interfaces: Sometimes You Need a Little Intelligence, Sometimes You Need a Lot" (1998): In this article, Myers explores the concept of demonstrational interfaces, which are a subset of intelligent user interfaces that allow users to demonstrate actions to achieve goals. The paper discusses the spectrum of intelligence required in such systems, from minimal to substantial, depending on the task complexity.
"Garnet: Comprehensive Support for Graphical, Highly Interactive User Interfaces" (1990, IEEE Computer): Myers and his colleagues introduced Garnet, a toolkit for creating graphical, interactive user interfaces. Garnet supports PBD in the context of UI development, making it easier for developers to create and manipulate UI elements.
"Programming by Example: Intelligence in Demonstrational Interfaces" (1992, Communications of the ACM): This paper discusses the role of intelligence in demonstrational interfaces, emphasizing how such systems can infer user intentions from examples to automate tasks and create programs.
"Past, Present, and Future of User Interface Software Tools" (2000, ACM Transactions on Computer-Human Interaction): Although not solely focused on PBD, this paper, co-authored with Scott E. Hudson and Randy Pausch, provides an extensive overview of user interface software tools, including those that incorporate principles of programming by example and demonstration.
Myers has contributed to several books either as an author or editor, covering topics related to user interface design, software development, and PBD. One example is "Watch What I Do: Programming by Demonstration," which Myers edited. This book is a comprehensive overview of the field of programming by demonstration, featuring chapters by various authors on different PBD systems and theories.
Ivan Sutherland developed "Sketchpad" as his PhD thesis at MIT in 1963, which is often considered one of the earliest examples of visual programming. Sketchpad was revolutionary for its time, introducing concepts that laid the groundwork for interactive computer graphics, graphical user interfaces (GUIs), computer-aided design (CAD) systems, and indeed, visual programming itself.
>This video is a TV show made about the software Ivan Sutherland developed in his 1963 thesis at MIT's Lincoln Labs, "Sketchpad, A Man-Machine Graphical Communication System", described as one of the most influential computer programs ever written. This work was seminal in Human-Computer Interaction, Graphics and Graphical User Interfaces (GUIs), Computer Aided Design (CAD), and contraint/object-oriented programming. While watching this video, remember that the TX-2 computer (built circa 1958) on which the software ran was built from discrete transistors (not integrated circuits -it was room-sized) and contained just 64K of 36-bit words (~272k bytes).
Nicholas Jackiw's "The Geometer's Sketchpad" (GSP) is considered a Visual Programming Language (VPL) within the context of educational technology and mathematics due to its innovative approach to geometry, algebra, and calculus through direct manipulation and visualization. While not a VPL in the traditional sense of software development or general-purpose programming, GSP embodies key aspects of visual programming that make it a powerful tool for learning and exploration in mathematics.
Macro Recorders (i.e. Emacs keyboard macros, Photoshop actions, etc): Many applications allow users to record a series of actions (like formatting text or organizing data) as macros. You demonstrate the desired actions, and it records your actions as executable steps. The "code" behind these macros is often generated automatically and can be edited textually, but the initial programming is done visually through demonstration.
Interactive Data Analysis Tools (like Tableau, Microsoft Power BI, etc): Some data analysis tools let users manipulate data sets visually (e.g., through drag-and-drop interfaces for selecting data ranges or specifying graph types), and then automatically generate the scripts or queries that represent these actions. This approach allows users to "program" data transformations and visualizations without writing code directly.
Graphical User Interface Builders (like XCode, etc): GUI builders, often found in integrated development environments (IDEs), allow developers to design interfaces by arranging components visually. Properties and event handlers can be assigned through demonstrations or interactions with the interface, such as by selecting elements and setting their characteristics through property windows. The underlying code for the GUI is generated by the tool, based on the visual design.
End-User Development Tools: Platforms designed to creating custom applications or automations through demonstration. For instance, tools like IFTTT (If This Then That) or Zapier allow users to create "applets" or "zaps" by choosing triggers and actions from a list of services and configuring them without seeing any traditional programming code. The logic is entirely defined through the visual assembly of components.
AgentSheets is a visual programming environment that allows users, including those with little to no programming experience, to create their own simulations, games, and interactive stories. It is particularly known for its use in educational settings to teach computational thinking and programming concepts.
Programming by Example in AgentSheets: The environment supports a PBE approach through its "Visual AgenTalk" programming language. Users can specify the behavior of agents (the active components in their simulations or games) by defining rules in a visual manner. These rules are often created by demonstrating actions or setting up conditions and outcomes using a graphical interface, which the system then translates into executable logic.
David Ackley's "SPLAT" programming language for the Moveable Feast Machine, and Lu Wilson's "Sandpond" visual cellular automata programming language, let you define rules by example, which I recently wrote about on HN here:
I think it still becomes about the data. If using a sheet as an app particularly with sharepoint, data integrity and merging multiple users often becomes a problem.
I run one for a financial services firm and often get "excel couldnt merge changes, want to save a copy or discard". We tested this out, 2 users make editing different sheets on a single excel workbook hosted on sharepoint and excel can't figure out how to merge. If someone is on VPN and connection drops this also often occurs.
This is why for all its faults and limitations, Google Sheets is astronomically better than Excel for collaboration. There is one single source of truth, will full change history and undo-redo of the entire spreadsheet or individual elements, plus comments and chat.
Once you've decided to go with Excel it becomes pretty different from just feeding an array of values. You can autogenerate the combinations applying rules on what to avoid, fuzzy the values, get the result of an API transform it and feed it to another API etc.
I'm not recommending any of this, but it can go as far as you want...
One of my biggest weakness as a developer is that I can barely use excel. It's really embarrassing, especially since I've moved to a financial firm.
Do you have any recommendations for becoming semi competent with it?
Just an idea but how would you advise someone learning programming/a new programming language? Probably you'd say: build something with it. So same goes for excel. Try to hook up a spreadsheet to some database, have it update automatically, have drop down lists that populate automatically from the database (e.g. have a "country" drop down and automatically populate a "region" drop down based on the choice of country) and so on. You need some BASIC and SQL for this but not much
Oh wow. Different poster same problem. Your comment made me realize that it's the "but I don't wanna!" attitude that I already know how to push through when it comes to language learning, just disguised differently. I'm still not sure that I wanna, tho...
- INDEX/MATCH. For all given purposes, the last parameter in MATCH() is always 0 unless you want to find the nearest match
- understanding that formulas can return arrays, not just single elements (easier in more recent versions of Excel which have made this more consistent for every formula), so you can e.g. AVERAGE(IF(A1:A100>100,A1:A100,FALSE)) get the average of the values between A1:A100 which are greater than 100. the FALSE parameter can be omitted there but I left it in for clarity. Interestingly this means AVERAGEIF() is just syntactic sugar, so I prefer to avoid it. it makes it easier to, say, change to MEDIAN(IF(...)) later, since MEDIANIF() doesn't exist
- if you combine the first two bullets above, you'll enter the fifth dimension
- don't ever hardcode a value if you can refer to it somewhere else. want to use INDEX(MATCH()) and AVERAGE to, say, take some average value over some time period? put the start and end dates into their own separate cells with no formulas, and then refer to those cells in your formula. if you later need to change the time period, you won't have to modify all your formulas, just those values
- LET() is strange at first but super powerful. most people still don't use it
- Separate data from presentation. This point can't be stressed enough. I care about it so much I'm literally building an Excel competitor to enforce this. If possible, separate raw data, data transformation and data presentation.
- most people know you can name cell ranges and refer to them in your formulas. most people don't know you can also name formulas and refer to them elsewhere. your "average value over time" calc doesn't even need to be in a cell anywhere, it can just exist as a defined value in a named range. now named ranges are hard to see (only visible if you open the name manager), harder to debug (you basically get just a #VALUE error most of the time, forcing you to copy-and-paste the named range into a cell to debug it) and they get copied to other workbooks when you copy-and-paste across workbooks, which makes them super messy.... but for short formulas they can be pretty nifty
- LAMBDA() is even more recent than LET() and basically makes named ranges more useful. even fewer people use it
The MS documentation for Excel is a pretty good resource to learn from these days, with text descriptions for just about every function, and videos for most too. This page [1] is an especially great resource. It's got a "top-10" list, which is a good place to start, as it covers the majority of things you'll see in a normal business document.
After those 10, they break them down by category, and have one for Financial Functions, which is going to be useful for you. Similarly, the Logical, Math & Trig, and Stats functions, will all be useful for looking at a Finance firm's spreadsheets.
For excel/spreadsheets etc chat gpt or google gemeni are a lot more useful than for normal programming as a lot of stuff about spread sheets is explained in easy steps for non technical people to implement so that has become a treasure trove of information for llm models.
For a best practices introduction I would recommend a course geared toward investment bankers. Some things that should be covered are:
* Avoiding hardcoding numbers, making input cells a certain color, etc.
* Knowing the all powerful F4 key that alternates between A1, $A$1, $A1, and A$1 (and knowing what each of these mean)
* Inserting blank lines above and below a summed range and including those rows inside of the sum formula (prevents formula from breaking if you move rows around)
Other than that you can do a lot of productive things with a combination of index/match and dynamic named ranges via offset formula.
All solid advice. I learned many of these by: (1) reverse engineering other people's great sheets (formulas and VBA), and (2) watching highly skilled Excel users.
All the spreadsheets have map/reduce stuff! It's the programming you are used to but you type into a box and reference variables by sheet location instead of by name.
I find I learn best from (good) books because they actually explain things coherently, and you can leaf through them to discover features and things that look interesting.
Try stepping into a management role for a while, ideally one in which you have lots of dealings with less technical parts of a business. Even if it's not for you, at least you'll have gained some insights into that side of things, and more importantly - lots of real world exposure to spreadsheets!
I'm confused... are you just commenting on the name of the website itself or the content?
The content itself is about demonstrating how an LLM/Neural net works using a spreadsheet and is a play on the title "Attention is All You Need". It has nothing to do with using a spreadsheet for most of your use cases.
And the author's comments on the difficulty of doing matrix multiplication in Excel suggest that he doesn't _actually_ believe that spreadsheets are all you need.
Our company used to run completely on Google spreadsheets (a lot of it was written by the CEO). It worked, but at some point, it became a convoluted unmaintainable mess.
First, we partially switched to Airtable, but soon abandoned it in favour of our own internal node/python tools. The company is now a lot larger and the tools are more robust/capable/clean now, but at the same time, they are much less flexible than the old spreadsheets.
“ they are much less flexible than the old spreadsheets.”
I have seen that several times while I was a consultant. People run their stuff on spreadsheets. It’s a big mess but things get done . IT comes in and starts producing “professional” systems. Months of requirements gathering, then they deliver something. It’s not 100% right so people need to write tickets. IT may or may not make the changes. If they make a change it takes forever. People still need to do their job so they go back to spreadsheets.
Why is this downvoted? Many Wall Street fixed income trading desks were the same before 2010. What you wrote is true for many different types of businesses. A huge amount of sales (customer management) tools are written in Excel/VBA. They work well. It is a great platform.
Thanks! There's a truth to the name beyond just the play on the transformers paper. Definitely have thought about how many SaaS apps could be a spreadsheet and vice versa and often use them to create mini-apps (often via apps script).
Could we summarize that spreadsheets are one of the best practical representations (UX/UI wise) and integration tech for databases? From the technological architecture point of view, don't we should separate Spreadsheets = UX/UI + DAG (directed acyclic graph) + Database + Integrations ?
I had a coworker who had the idea of having a system that automatically compiling xlsx files to Apache Spark, so that you could have the easy interface of Excel while having the processing power of Spark to crunch bigger data sets.
He actually quit the company to build it; I should find out what came of that.
Well, there are spreadsheet backends using Apache Arrow for storage. You could use something like pola.rs or Arrow DataFusion + Arrow Ballista for distributed processing of the dataset.
Funny thing is that excel can do such thing but cannot replace dates within the sheet. For example on Windows in Office 2021 if I want to replace text cell which has value 2023-10-10 with new value 2023-10-11 I have only two options as the result: 10/11/2023 or 45210.
> What about Google Sheets?
> This project actually started on Google Sheets but the full 124M model was too big and switched to Excel. I’m still exploring ways to make this work in Google Sheets but it is unlikely to fit into a single file as it can with Excel.
Haha, I'm one of the authors, although most of the credit goes to the other author (David Fouhey). Despite being in the form of a joke, this sheet is actually functional, for anyone wondering. Like the one in this post, it's inference only, but for a CNN instead of a transformer. I think someone actually made an excel CNN with backpropagation as well.
Jokes on you, my Excel regularily thinks that two cells are not equal, despite both of them having exactly the same value and type (also no trailing spaces or these things). Copy and pasting it around sometimes solves the problem, but it never is apparent to me why that happens.
Shot in the dark, but could this be some kind of IEEE floating point rounding shenanigans? Never had that issue in excel specifically, but I’ve certainly had that issue in C when trying to compare two floats that should be equal.
another shot in the dark but it could be different data types. some numbers can get stored as text when you import them. you can try wrapping your operands in VALUE() or TEXT() formulas accordingly
other than that, I agree with the other poster and the only issue I can think of is due to having a different number of digits after the decimal point causing some rounding error when comparing floats
I occasionally need to loan people money, and usually they need to pay me back in multiple payments, sometimes over several months.
Being a proper geek, my first inclination was to build a web app to keep track of that and so the people I am loaning cash to can also view their balance, but I realized that would involve maintaining state and login creds and a bunch of other variables I haven’t even considered, and maintaining this app would become a second, unpaid job.
Then I thought “I could just use a Google sheet and share it”, which is what I ended up doing. It’s easy to set permissions, I don’t have to worry about hosting, I can have it do any amount of arithmetic and light programming that I might need, as well as being automatic and reactive as I add information. It will be slower than something I would write in C obviously, but realistically for something involving budgeting the time savings will be on the order of milliseconds.
I love spreadsheets. They provide a gateway to programming for non-programmers, and they provide a low-effort means of playing with data.
I think you told your experience based on the "spreadsheets are all you need" title. I don't know if you know this, but it's a play on the "Attention Is All You Need" 2017 paper by Google which introduces transformers, the building block of LLMs.
Tangential but I've been thinking about building a real-estate calculator after my first homebuying experience, something that will provide you with a total cost of ownership, points bydowns and other useful metrics about a home purchase. Rather than jumping into a web app to start building, I've been prototyping it in Excel. I found it really cool for prototyping this type of app, I built my entire "MVP" in about an hour and now just evaluating all the "features" I've added before committing to transplanting this into a webapp.
Yeah, at this point for a lot of the more "number heavy" things I do, my first version is generally in Google Sheets to get a rough prototype and to make sure my math is basically correct, and then a port to Julia.
The hyper-interactive nature of spreadsheets makes them kind of genuinely fun to tinker with. You can see results immediately, and you have enough logic (even in just the vanilla cells, not even counting JS or VBA) to implement a lot of algo stuff.
Well said. This can go even further by defining JavaScript code as functions in a sheet. For example, if someone pays you back, you click a checkbox in the corresponding cell, and it runs your code to send them a thank you email.
I do think that there's sometimes value in reinventing the wheel. Sometimes your goal is to figure out how a wheel works, or you think you have a better design for a wheel, etc.
In this case, I don't think anything I'd do is going to be revolutionary, and I don't think I would have learned at that much, so Google Sheets was great.
Mentioned in a sibling reply, but the money I loan is to either close friends or family members or in-laws. I know them well enough to know that "not paying me back" really isn't on the table.
In regards to "not paying on time", these are friends and family, so it's not like I'm going to go break their kneecaps or anything; if they're a bit short on the payment date we just push it back.
Its not an unpaid job. Presumably you are charging interest. You just dont have the scale of a bank to make money changing anything close to market rates.
Next video will be on embeddings and hopefully done soon-ish.