I've used R and Python. I stick with Python whenever possible because IMO it supports the non-modeling parts of data science more effectively: ETL scripts, API creation, Flask for hosting simple websites, etc. Yhat makes a Python port of ggplot and Rodeo, an IDE similar to RStudio. I explore and develop algorithms in Jupyter notebooks, documenting along the way, and run "hardened" code from the command line, often nohup'ing it on a Linux box for services that run perpetually, keep me updated via Slack/SMS/email, and so on.
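
As a rough sketch of that "hardened script" pattern (the webhook URL and check_for_new_data are placeholders, not anything from a real setup), something like this can be launched once and left running:

    # monitor.py -- minimal long-running worker that reports via a Slack incoming webhook
    # launch with: nohup python monitor.py > monitor.log 2>&1 &
    import time
    import requests

    SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder

    def check_for_new_data():
        """Stand-in for whatever ETL/model step the service actually performs."""
        return "nightly job finished"

    def notify(message):
        # Slack incoming webhooks accept a JSON payload with a "text" field
        requests.post(SLACK_WEBHOOK_URL, json={"text": message}, timeout=10)

    if __name__ == "__main__":
        while True:
            try:
                message = check_for_new_data()
            except Exception as exc:
                message = "worker error: %s" % exc
            try:
                notify(message)
            except requests.RequestException:
                pass  # a failed notification shouldn't kill the service
            time.sleep(3600)  # run once an hour, forever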

For visualization, almost everything I do is in D3, p5.js, or Processing (Java), which also has a Python Mode, for those interested. There are some great Processing books, and Daniel Shiffman is the Hadley of that world; tons of engaging resources from him. There are also plenty of good D3 books and online resources, and bl.ocks.org and Mike Bostock's other online articles are wonderful.
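
For anyone curious what Processing's Python Mode looks like, here's a trivial, generic sketch (a mouse-following circle, not taken from any of those books):

    # A Processing Python Mode sketch: setup() runs once, draw() loops ~60 times/sec
    def setup():
        size(600, 400)

    def draw():
        background(255)          # clear to white each frame
        fill(0, 102, 153)
        ellipse(mouseX, mouseY, 40, 40)  # circle that tracks the mouse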

Every organization with data scientists defines "data science" differently. People with a modeling and stats focus should probably stick with R. If you find yourself in a position with a wider scope, you simply must have more tools in your tool belt, and in my opinion R, Python, and JavaScript are all part of that package. For me, personally, Processing is too. Have a look at Ben Fry's work to understand why. I also use openFrameworks when the volume of data to visualize and performance requirements demand it.
