Show HN: Dtab – Spreadsheet for Data Science (dtab.io)
229 points by dnprock on Oct 2, 2015 | 67 comments



I like!

One thing that I am hoping for however is some kind of notebook-like interface for doing small, ad-hoc data analysis projects. This looks fairly close to that.

Mathematica pioneered "the computational notebook" concept, and now we have similar stuff with R/Rstudio, IPython/Jupyter, matlab, and jmp. Those are all really nice and powerful, but they're somewhat large in scope and very ambitious.

It would be soooo cool to be able to just make notebooks that contain spreadsheets, related manipulation and analysis code and also text that can render to well-formatted html/pdf documents with really nice tables.

This tool looks like it can get there... if it can develop the ability to render documents sort of like Knitr for R.


Have you seen AirFlow before? I hooked a data scientist friend up with it dockerized, and he raves about it: http://nerds.airbnb.com/airflow/


container image please


Thanks, Google (first result for "docker airflow"):

https://github.com/puckel/docker-airflow


Not mine, but one that is maintained: https://github.com/puckel/docker-airflow


I have been really curious about getting first-class interactive spreadsheet functionality in Jupyter (possibly by building a widget). I think it would be the ideal analysis platform.


Org mode[1] has some spreadsheet functionality which works fine as long as you don't use very large tables. Otherwise it gets a bit too slow.

You can also use those spreadsheets as input for pieces of code so you perform some analysis on their data.

--

[1] - http://orgmode.org/


If you're looking for something like an IPython notebook, check out tonicdev.com


Quick idea:

How about providing numeric cell referencing, jQuery-style:

  $(1, 2) // row 1, col 2
Like it's done here: https://github.com/hliyan/magpie/blob/master/lib/gasp.gs
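
A rough sketch of what numeric cell referencing could look like. It is written in Python (where `$` is not a valid name), and the `Sheet` class, the `cell` method, and the 1-based indexing are all assumptions made for illustration; none of this is Dtab's or gasp.gs's actual API.

```python
class Sheet:
    """Hypothetical grid wrapper for illustration; not part of Dtab."""

    def __init__(self, rows):
        # Copy the rows so the sheet owns its own data.
        self.rows = [list(r) for r in rows]

    def cell(self, row, col):
        """Return the value at a 1-based (row, col) position (assumed convention)."""
        if row < 1 or col < 1:
            raise IndexError("row and col are 1-based")
        return self.rows[row - 1][col - 1]


sheet = Sheet([
    ["name", "price"],
    ["apple", 1.5],
])
print(sheet.cell(1, 2))  # row 1, col 2
```

Whether rows and columns should start at 0 or 1 is a design choice the original suggestion leaves open.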


While I don't have much of an opinion of JS in a spreadsheet except "neat!", the website design and text are spot-on: I understood in an instant what the project does! Kudos.


I'd love to see Jupyter (IPython Notebooks) capable of this kind of thing. It'd be awesome to be able to select a column of a Pandas DataFrame and apply a function to it.


Tongue in cheek? Alternatively, you should know about: http://pandas.pydata.org/pandas-docs/stable/generated/pandas...


Well, capable? Yes, Pandas can do that, but I think what you're really asking for is a GUI tool?

Everything about Pandas, IPython (now Jupyter), R, and the other data languages is 100% about applying functions.


Yeah, I mean more as a GUI enhancement. It might be outside the scope of Jupyter, but I think it'd be neat to be able to treat DataFrames like a spreadsheet.

Personally, I'm comfortable writing Python, but I bet you'd see more adoption for non-programmer academics who still need to work with data.


I doubt we will ever see the abandonment of spreadsheets LIKE WE SHOULD.

Look at OpenRefine; you might really like it.

http://lemire.me/blog/archives/2014/05/23/you-shouldnt-use-a...


I'm of the same mind about Excel, which is why I don't use it any more. I don't like anything where there are hidden layers of meaning that are easy to overlook. I like Jupyter because everything is right there in front of you.

Another shortcoming of spreadsheets is that they don't solve what I consider to be the Fundamental Problem of Programming: How to keep your sanity when your project gets bigger than one screen or one page. Of course programmers have lots of ways to deal with this, by creating named abstractions, subroutines, and so forth. It's why we can write million line programs today. But how it's solved in Excel seems incredibly clumsy by comparison. Of course you can adopt the conventions of programming by using Visual Basic, but that seems to occur quite rarely.


I use Jupyter as a lab notebook, and it's OK to enter data by hand in an array literal, paste it from a spreadsheet (it comes in as tab-delimited text, which is easy to munge) or read it from a text file.

But an editable tabular data array (that was also accessible programmatically) would be really super.
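
For illustration, munging pasted tab-delimited text might look roughly like this with Python's standard `csv` module. The pasted string, the column names, and the assumption that the second column is numeric are all made up for the sketch.

```python
import csv
import io

# Hypothetical text as it might arrive when pasting cells from a spreadsheet.
pasted = "sample\tmass_g\nA\t1.25\nB\t2.50\n"

# DictReader uses the first line as the header; the delimiter is a tab.
reader = csv.DictReader(io.StringIO(pasted), delimiter="\t")

# Convert the (assumed numeric) mass column from text to float.
rows = [{"sample": r["sample"], "mass_g": float(r["mass_g"])} for r in reader]
print(rows)
```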


+1 for good spreadsheet functionality in Jupyter!


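If I understand you correctly, you can, like this:

  import pandas as pd

  # Stand-in data; load your own DataFrame here.
  df = pd.DataFrame({'column_name': [1, 2, 3]})

  def modify_column(x):
      """Modify a single value x."""
      return x * 2  # placeholder transformation

  # apply() returns a new Series, so assign it back to keep the result.
  df['column_name'] = df['column_name'].apply(modify_column)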


This is already possible with Google Spreadsheets, albeit this is a much nicer interface.


> albeit this is a much nicer interface

And, if I understand correctly, this could be rolled into something like Jupyter. Which would be awesome!


Looks neat, but currently getting a 503 error on one of the scripts in all the examples: https://rawgit.com/replit/jq-console/master/jqconsole.min.js


It should be on cdn.rawgit.com in production: https://rawgit.com/


Thanks. Forgot to change to cdn version. Fixed now.


Nice work!

I noticed that you're using a web worker. Does all the heavy lifting happen there, with the main thread used purely for rendering?

I'm noticing some lag when I just press enter into a cell twice (once to get into edit, and then again to get out). Any idea what's going on there (I'm interested technically)?


It's nice to see handsontable put to another good use! What other open source libraries did you use?


A couple of things I noticed:

No editbox for the contents of the currently selected cell? (I have to double click to see the formula)

One of the examples suggested I use dl.stdev(D2:D65). I think it actually meant D5, but in any case I noticed that it computed the stdev as if empty cells had zeroes in them, rather than ignore them which would be more the norm with spreadsheets.
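
To make the difference concrete, here is a small sketch in Python using the standard `statistics` module. The column values are invented, empty cells are modelled as `None`, and the sample standard deviation is assumed; it does not reproduce how `dl.stdev` is implemented.

```python
import statistics

# Hypothetical column with one empty cell, represented as None.
column = [2, 4, None, 6]

# Usual spreadsheet behaviour: skip empty cells entirely.
ignoring_empty = statistics.stdev([v for v in column if v is not None])

# Behaviour described above: empty cells counted as zeroes.
empty_as_zero = statistics.stdev([0 if v is None else v for v in column])

print(ignoring_empty)
print(empty_as_zero)
```

Counting the empty cell as zero both adds a data point and pulls the mean down, so the two results differ.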

I do like the idea of being able to import random javascript, could be very handy.

edit: also the drop down menu headings seem to stay highlighted when you click off them, but you probably knew that. 2nd edit: oh I see, it's if I click off them onto a blank part of the window


When I log in and view source, I get back only a little bit of JSON:

{"_id":"560ee464ba6b14a61c74de66","name":"Untitled","data":null,"ownerId":"560ee461ba6b14a61c74de65","javascript":"","__v":0,"opened":"2015-10-02T20:09:09.119Z","modified":"2015-10-02T20:09:08.569Z","created":"2015-10-02T20:09:08.569Z","externalLibraries":[]}

How are you doing that?


Try another refresh. What you're seeing is the sheet data only.


While I appreciate the idea....

Use Javascript to replace vba in excel

http://blogs.msdn.com/b/officeapps/archive/2013/03/18/excel-...

Use python to replace vba in excel

http://xlwings.org/


Yeah it's not really "new". Lotus Improv was doing this 25 years ago, for those of us old fogeys who remember. 1-2-3 was for the secretaries, Improv for the power users.

Also, I know of plenty of financial institutions that have mod'd Excel with add-ins to the point where the worksheet integrity is better than 95% of the RDBMSs I've worked with. Version controlled, full audit trails, access control lists, type validation...

Hell, out of the box you can combine Excel + SQL Server BI to essentially make an off-the-cuff Line of Business app, completely sync'd, with a Silverlight front end similar to this: http://www.infragistics.com/uploadedImages/Content/PRODUCTS/...


True, but having a feature isn't necessarily the same as having it front-and-center as part of the experience. Also, that JS API you linked looks pretty verbose and clunky compared to OP's.


This is great! A lot of useful spreadsheet functions are really esoteric and unintuitive. Would love to see this integrated with Google Sheets


Great UI. I wish it had some sort of import function that could take in the Excel formulas and not just CSV data though.


I know a project doing something similar at MIT for finance called AlphaSheets.

https://founder.org/company/alphasheets


This reminds me a lot of pyspread - http://manns.github.io/pyspread/


The "About", "Documentation" etc. links at the bottom (footer) don't work for me unfortunately; is it just me, or are they so for others too?


Sorry, those pages are not there yet. We're working on putting them in. We got excited and decided to show the meat first. We'll have them ready soon. If you'd like to contact us, go ahead and shoot us an email: contact at dtab dot io


bug?

- (click) Clean text using regular expression

- Load external lib: https://vega.github.io/datalib/datalib.min.js

  - Impossible (because there is no 'load external lib' functionality)


You can go to Settings tab and load this file externally.


Can I drop in any .js file I want that defines appropriate functions?


Yes, you can load any JS library to call. Check out this example with moment.js: http://dtab.io/sheets/560e138927b2d2da2c44fbce


Yes, you can. You can either paste the code in the Javascript tab, or add the hosted .js file in the Settings.


Are there any Angular 1.2 bindings?


Every single spreadsheet software has been able to do this for years, but using Javascript makes it better amirite?

What makes this better than Excel (VBA, Python, probably half of the languages on the planet.), Libre Office, Google Spreadsheets, etc., aside from Javascript ? The only point I could see would be integrating this into a page, which is of dubious utility.


Please don't be mean and dismissive in response to new work.

https://news.ycombinator.com/showhn.html


I'm having trouble finding anything useful in a new, nonstandard spreadsheet software whose only interest is that you can use Javascript.

But it's branded as a "new kind" of software, so it must be good :^)


As the Show HN guidelines say: "When something isn't good, you needn't pretend that it is." But you also needn't trample over someone's freshly planted garden.

Helpful criticism is fine. Asking a question is fine (a real question, not a gotcha one designed to make someone look bad); for example, one could ask "what do you mean when you say this is a new kind of spreadsheet?" Pointing out similarities to other work is fine. All we want to avoid is being dismissive. Since the line isn't obvious (especially to a million readers), simply err on the side of not being.

On HN, new work doesn't need to be useful, just alive. One can't always predict what will turn out to be useful anyhow.

Everybody showing new work deserves a minimal baseline of respect here. If you know more and have done more than someone else has, that's great. That's an opportunity to teach and to share, not put them down.

Edit: since "new kind of spreadsheet" is arguably a bit baity we've changed to the HTML doc title.


I am uncomfortable with the line you're taking here. The sentence "javascript makes it better amirite" was a bit juvenile, but otherwise the comment was perfectly valid. We do not want to be forced to be positive about things which have serious flaws; this sort of kumbaya HN thought police makes this seem more like some sort of Christian summer camp than an environment for rational discussion.


Nobody wants to be "forced to be positive", including us. If you try to find a different interpretation of what I wrote, I don't think you'll find it very hard.


The Kumbaya Thought Police are deployed specifically in "Show HN" threads; they have their own rules.


Being similar to other things that already exist is usually a benefit, not a "serious flaw".


In a typical office environment, everybody knows how to work with spreadsheets. They're a common interface and a great way to present and explore tabular data.

But as soon as you put that data into an Excel document and hit Save, it's outdated. Somewhere, somebody in the office has a .xlsx file on their C: drive with the data you need, but that doesn't help you much, does it? Or somebody puts the data into a spreadsheet and emails it to everybody in the office, then a few people make some changes and email again, and now you've got 15 different versions of the same spreadsheet floating around and nobody knows which one is the latest and most truthful.

Or you put that interface on a web page connected to a shared data source and all of those problems disappear.


Yes, I agree.

There's another nuanced advantage to this kind of tool, however: the ability to perform "reproducible" analysis.

With an Excel spreadsheet, the person doing the analysis typically pulls in some data (usually by copy/paste), and then performs a series of manipulations on it. These manipulations are NOT recorded. At the end of the analysis the author sends the spreadsheet in an email, or maybe copies the spreadsheet into the body of an email. Recipients, if they're going to evaluate the analysis, have a lot of work to redo and aren't necessarily able to determine what, exactly, was done to the data.

Something like this puts everything into a console, JavaScript libraries, and tables. As long as the author manipulates data from the console and the functions, it should be possible for someone to follow what was done to the data, reproduce results, and go further with the analysis.
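
As a loose illustration of the idea (not how Dtab works internally), here is a Python sketch in which every manipulation is a named function, so the list of steps doubles as the record of what was done. The step names, the sample data, and the `run` helper are all hypothetical.

```python
# Each manipulation is a named function, so the step list itself is the record.
def drop_missing(values):
    """Remove empty entries (modelled as None)."""
    return [v for v in values if v is not None]


def to_celsius(values):
    """Convert Fahrenheit readings to Celsius, rounded to one decimal."""
    return [round((f - 32) * 5 / 9, 1) for f in values]


STEPS = [drop_missing, to_celsius]  # hypothetical analysis steps


def run(raw, steps):
    """Apply each step in order; return the result plus a log of step names."""
    data, log = raw, []
    for step in steps:
        data = step(data)
        log.append(step.__name__)
    return data, log


raw = [32, None, 212]
result, log = run(raw, STEPS)
print(result, log)
```

Anyone holding the raw data and the step list can rerun the analysis and get the same result, which is the property a hand-edited spreadsheet lacks.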


Spreadsheets cause many problems as the team sharing them gets bigger.

I have written a summary comparing spreadsheets with a web application backed by an online database at the following link. The major items are security, multi-user experience, encryption/decryption of sensitive data, auditing/change history, workflow requirements, etc. For simple work in a smaller team environment, spreadsheets work fine.

https://mydataorganizer.com/MyDataOrganizerBlog/Employees_HR...


I'm sorry, what?

What does this product have that fixes the problem you're talking about here? How does working in a spreadsheet with javascript net you any better benefit than working with Excel with respect to obsolete sheets?


I'm afraid your experience of workflow is stuck in the 90s.

Excel works just fine with dynamic data via SQL and JSON via URL.

You can also embed it as a runtime object in HTML.


My experience is stuck in the 90s, as are the coworkers that drive that experience. Most of them refer to the Chrome browser as "the Google"; I haven't a snowball's chance of teaching them how to connect an Excel document directly to a central data source.

But what I can do is provide them with an easy interface for interacting with that data directly, and give them the option to "save" their results within the context of those interactions so that they can share a simple link instead of a static file.

As for embedding runtimes in HTML, ActiveX is dead and I do not mourn it. Our office has finally decided to standardize on a modern web browser, and I couldn't be happier about that.


Just because your place of work doesn't follow best practice doesn't mean it is not possible.


I didn't say it wasn't possible, but in my environment it is not as feasible as the alternative.

Creating a better solution is a lot easier than teaching everyone what a database is and why it's useful.


Collaborative spreadsheets using JavaScript are not the answer.


It's not your answer, and that's okay.


If you put this product in front of your office colleagues, how much traction would it get ?


But those people could be using Google Spreadsheet already for example, which would have already solved the problem. And people in the office are not going to be writing Javascript in their spreadsheet.

It literally solves nothing.


Storing proprietary data in Google may not be viable. I know it isn't in my office, for instance. Plus, you lose the flexibility to integrate it into a more comprehensive application interface which runs on your own hardware/software stack, which the Javascript in this case provides.

I can already imagine several good uses for this software in my workplace, so saying it "literally solves nothing" is "literally false".


Indeed, anyone who thinks Google Spreadsheets is the end of all spreadsheets and nobody would ever need anything else, has never tried to embed Google Spreadsheets into a larger application.


I might be wrong but I think you can use JS in LibreOffice already?

Either way calling it a "new kind" seems a stretch unless it's not as well described as it appears to be.



