Learn data science in your browser (dataquest.io)
244 points by emre on Jan 23, 2015 | 45 comments



Hi everyone:

Very exciting to see this posted here.

I'm the maker of dataquest. I'm a self-taught data scientist/coder, and I wanted an easier way for people to get into the field.

I've been working on it for the past three months, and I'm really excited to see people learning with it.

I chose to teach python because it's one of the best first languages to learn, it's useful outside of data science, and a lot of production data science work is now done in python.

It's missing advanced content, but I'm working on it. Let me know if I can help, or answer any questions!

Vik


Been having a play and it looks good, although the way that variable assignment is referred to in some of the problems feels inconsistent and is sometimes confusing. In some problems you talk about the variable first and the value second, and in others the value first and variable second.

Also it really pains me to see you recommending using a for loop to count list members when there's a perfectly good len function there to do it for you. I can understand the desire to do it from a fundamentals point of view, but it feels overcomplicated in the crime mission.

Edit: Is it ok to like the video bits but hate the stylus? It feels like an actual whiteboard, or something that drew less artificial (and noisy) lines, would be friendlier. The narration is good :)


It's really hard to approach problems from the perspective of a beginner. I did some testing around using "magic" functions like len vs building intuition, and it's really important to understand how things work. Otherwise, you can't really generalize the concept easily.
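
To make the len-versus-loop point concrete, here is a minimal sketch. The `crimes` list and its values are made up for illustration and are not the data from the actual mission. It shows the hand-written counting loop, the equivalent len call, and one way the loop generalizes where len alone does not.

```python
# Hypothetical data: each entry is a crime category (not the real mission data).
crimes = ["theft", "assault", "theft", "burglary", "theft"]

# Counting by hand with a for loop.
count = 0
for crime in crimes:
    count += 1

# The built-in len gives the same answer in one call.
assert count == len(crimes)

# The same loop pattern extends to counting only the items that match a condition.
theft_count = 0
for crime in crimes:
    if crime == "theft":
        theft_count += 1

print(count, theft_count)  # prints: 5 3
```

Once the loop is understood, len reads as a shortcut for the first case rather than as magic.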

I'll look into the variable assignments more. Making content has been way harder than I imagined it would be.

I think I need a better graphics tablet -- I got a cheap refurbished one, and it lags a lot.


Feedback: the text in the left-hand side panel (instruction area) is light gray and too thin to be readable on my 14'' screen; I had to play with the console to fix this.

You may benefit from staying away from fashionable design trends and focusing on usability.

Otherwise thanks for setting this source up, looks great (and very usable besides the problem noted above).


This looks absolutely beautiful. Will there be a way for users to create and add missions of their own?


Thanks! That's a great idea, and would be good to build out in the medium term.

Right now, content is the main bottleneck, and I'd love some help. If anyone is interested in talking, shoot me an email (vik@dataquest.io).


It's great that you're doing this. I'm really thankful.

In the ...tooltip there's this text

> Hey there, Welcome to DataQuest! If you’ve got any questions or feedba…

I have no idea how to expand that. I can't scroll within it and clicking on it doesn't expand it.


Strange, you should be able to click on it to expand. I'll look into it.


Thanks for building a great resource.


This looks great vik! Good work!


Would you kindly share what resources you used to teach yourself data science?


Too long of a list to post here, but I did write a blog post about it a while ago. I need to update it, but these are still relevant: http://www.vikparuchuri.com/blog/resources-for-learning-stat... .


This looks great, vik.


Small suggestion, you're still using the Yeoman favicon, you should add your own ;).


I'm curious to know what made you decide to teach Python 3 instead of Python 2? (I'm only asking because most courses these days seem to favor teaching Python 2).


A few reasons:

- Python3 makes some concepts, like unicode, division, the print function, etc, simpler to understand.

- Almost all major packages (only scrapy doesn't that I can think of), including the scientific stack, are compatible with 3.

- The trend towards 3 seems to be accelerating. 2 isn't going away anytime soon, but I'd like this content to be relevant for some time.

Ultimately, they aren't that different, and I may have a section listing out what you need to do to switch between them.
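
As a rough illustration of the first point, here is a small Python 3 sketch of division, unicode strings, and the print function. The variable names and the sample word are my own and are not taken from the dataquest lessons.

```python
# In Python 3, / is always true division and // is floor division.
half = 1 / 2       # 0.5 (Python 2 would give 0 for two ints)
floored = 1 // 2   # 0

# str holds unicode text by default; converting to bytes is an explicit step.
word = "caf\u00e9"                      # "café"
char_count = len(word)                  # 4 characters
byte_count = len(word.encode("utf-8"))  # 5 bytes, since é takes two bytes in UTF-8

# print is an ordinary function, so it accepts keyword arguments such as sep.
print(half, floored, char_count, byte_count, sep=", ")  # prints: 0.5, 0, 4, 5
```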


Because Python3 is the current version of python? If you're starting new projects, there are very few valid reasons not to use it.


A bunch of data science material, in particular, is written in Python 2.x, and it wouldn't be bad to teach Python 2 either. You could always use the __future__ module for newer features as well.


That's a beautiful website you have there. I find it easier to motivate myself to learn something when it's pleasing to the eye :)


I couldn't help but think of how much it reminded me of this parody of flashy, boilerplate sites: (http://jonhendren.com/). My eyes kinda glaze over when I see Bootstrappy style (or something similar).


The colours complement each other and it's simple. If I had to be harsh, it could do without the gliding animations.

The parody site was instantly painful though, I think it was to do with the grey/green on white; amusing afterwards.


I have thought of something like that for teaching programming languages, and you did it. This is very good! You should make your tool a generic platform and sell it to teaching organizations.


Thanks! Do you know of any organizations that want such a platform? There are a lot of players in the LMS space, like edX, Canvas, etc.


It'll be difficult for you to sell the platform.

Follow your own gut, but investors/buyers don't care how great your platform is. You have no IP so they can pay someone to build another version. Investors and buyers purchase users.

Get a lot of users on your platform and then consider selling.

That being said, codecademy and code school are both model templates that you could use to turn this into a business if you'd like that route.

You can also find hacker schools teaching data science (such as iron yard - http://theironyard.com/academy/python-engineering/) and see if they'll be willing to add your curriculum to their pre-course requirements. You might be able to get them to have students help build out your curriculum as part of their course projects.

Any kind of media coverage (such as fast co, popular science, etc) will help garner attention and users.

If you are able to get access to current sports data (nba, nfl, baseball, etc) and are able to help teach data science around those data sets you could probably get a lot of motivated but non-cs educated users and media coverage. That is also a feature you could probably charge a subscription for. I'm shooting ideas from the hip, so take them and turn them into something that is more familiar with your background.

Whatever direction you choose to take dataQuest, talk with your potential users and get their feel. Make a decision to move in a direction that'll have minimum push-back with maximum achievement of your desired goals.

Focus on user growth and you'll garner the attention of affluent individuals.

Good luck, and let me know if you need any help or someone to bounce ideas off of!


This website looks great. Hopefully sites like these will help newcomers to data science jump-start their education.


Agreed. There are many different components to learning the basics of exploring, analyzing, visualizing, and interpreting real data.

Check out Tuva's incredibly easy-to-use tools to get an idea of how these concepts can be brought to life for data novices and young learners.

https://tuvalabs.com/datasets/us_cities__part_i/#/


I somehow missed the "no sign-up required" on the front page, because of the login link.


Excellent website. What did you use for the front-end, is it based off a template? It looks great!


Thanks! The frontend is written in javascript, and uses angular. Design is based on bootstrap. Frontpage is a modified template, rest is custom.


Great work Vik! Nice to see you getting a lot of great feedback and attention.


Looks very good and definitely more affordable than datascience@berkeley.

Thanks!

Looking forward to using it.


That's the first course I actually like. And it's free too!


Perhaps the thing I find most impressive about this is that it runs quite usably in my phone browser (WP8.1). It's the first one of these I've found that does.


I just finished the first two missions. Amazing!!!


Is there a javascript version a la freeCodeCamp?


I had not actually heard of freeCodeCamp. That looks pretty cool, and will probably shoot to the top of my list for entry points to JS. Thanks for the mention!


It is very similar in concept to http://coderscrowd.com. Good job


This is really incredible. Thank you.


This is really incredible!


This looked intriguing, but all the lessons seem to revolve around learning Python, which doesn't interest me (since I'm already a software developer). I guess I expected something more science-y like R or maybe Julia.


I actually learned a lot of coding with R. It's hard to learn, but once you do, it's great for hacking up models and trying stuff out.

But, because of R's quirks, it's generally easier to write and deploy consistent, good-performing code in python. Python code is also more readable, which makes it easier to collaborate. All of my data science work, including plotting, is now done in python.

I'm working on more advanced content for dataquest, and thanks for the feedback.


I find it funny how anti-R the Python community can be. Why was the original poster downvoted for asking a question about the choice of Python? It was a good question. The answer is that Python is a good language for data science, and people like the language.

R is also a great choice for a language and Revolution Analytics just got acquired by Microsoft.

I don't understand why the Python community says it is hard to write well-performing programs and deploy them with R. It is by far the most used language in data science and statistics (open source and possibly everything else).

If you look at Revolution Analytics and RStudio's Shiny, it is very easy to deploy efficient and amazing apps right now.


I'm curious what resources you used to learn R. I've had it on my list of to-learns for a while.


Code School has a free introductory course that'll get your feet wet and get you used to a lot of the concepts. It's a nice intro to the O'Reilly book.


The whole reason they built so many data tools in Python though is because R can be a pain to program in.



