
Wickham was really on to something with ggplot and its graphical grammar concept for plots. That and Tidyverse in general, I believe, has saved R from irrelevance.

I wish more projects would use that way of thinking, but it seems that in the Jupyter/Julia/Python world there are too many choices and they all attempt a "kitchen-sink" approach to visualization.



Honestly, R will be fine, even apart from tidyverse.

There's a huge shadow universe of scientists running particular R packages for these analyses, which never gets seen from HN/programmers.

While Python is definitely (sadly, IMO) a better language for building ML systems (because more people know it, and it's harder to write impenetrable code), R is definitely a better DSL for statistical analysis, modelling and graphics.

It's a shame that R takes its lineage from a language developed in the 1970s (S), as that's the cause of many of the inconsistencies in the core language.


I agree R would have been "fine" without Tidyverse, in the sense that it would keep a persistent, survivable niche user base.

I suspect, however, that it would not be anywhere near the top 10 of the TIOBE index like it is now, and that its user base would consist mostly of statistics practitioners rather than huge swathes of people who need to perform any basic data analysis.


> Tidyverse in general, I believe, has saved R from irrelevance.

It’s interesting that you say this. I’ve been using R daily for almost 3 years now, and I was originally taught the Tidyverse.

However, I also tutored in biostats and have collaborated with many different faculty and students.

On one hand, the Tidyverse created an opening for R learners especially, but it has led to some controversy as well.

Trust me when I say there are plenty of R users who have never heard of or used the Tidyverse.

It leads to some difficulty in teaching/collaborating because do I use grepl or str_detect? lapply or map? Do they know the magrittr syntax (%>%)?
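
For anyone who hasn't seen both dialects side by side, here is a rough sketch of the pairs mentioned above. It assumes the stringr, purrr and magrittr packages are installed; the vector `x` is made up purely for illustration.

```r
library(stringr)   # str_detect()
library(purrr)     # map()
library(magrittr)  # %>%

x <- c("apple", "banana", "cherry")  # made-up example data

# Pattern matching: same logical result, but the argument order is flipped
base_hits <- grepl("an", x)          # base R: pattern first, then the strings
tidy_hits <- str_detect(x, "an")     # stringr: strings first, then the pattern

# Applying a function over a vector: both return a list
base_lens <- lapply(x, nchar)
tidy_lens <- map(x, nchar)

# The pipe passes its left-hand side as the first argument of the call
piped_hits <- x %>% str_detect("an")
```

None of this is hard in isolation. The friction comes from having to guess which spelling your collaborator already knows.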

Then there’s non-standard evaluation, which often forces some arbitrary meta-programming on novice programmers.

R is foremost the stats language, and tidyverse has attempted to make it more general purpose, for better or worse depending on who you ask.


Tidyverse is much, much harder to write functions for, which is pretty bad for beginners. The API is lovely, but the non-standard evaluation causes real problems for people.

True story: writing tidyverse functions is so hard that I once worked for a company with business-critical code running in incredibly long R scripts with almost no functions, and 100-line pipes.

While that may be an extreme example, base-R is much, much easier to get started writing functions for (like I've been using R for a decade now, and the "idiomatic" way to use NSE with ggplot and dplyr has changed multiple times over that time).

Meanwhile, base-R is fugly, but it's solid and backwards-compatible to a fault.


As a huge fan of tidyverse, I have to agree. Tidy eval is anything but.

However, once they get that right, tidyverse will be close to perfection.

I think the newish “interpolation” syntax with double braces {{ }} is getting there.
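
For reference, a minimal sketch of what the double-brace style looks like inside a user-written function. It assumes dplyr 1.0 or later (with an rlang version that supports `{{ }}`); `mean_by` and its argument names are hypothetical, and `mtcars` is just the built-in example data set.

```r
library(dplyr)

# Hypothetical helper: group by one column and average another.
# The caller passes bare column names; {{ }} forwards them to dplyr
# so they are looked up inside `data` rather than in the calling environment.
mean_by <- function(data, group_var, value_var) {
  data %>%
    group_by({{ group_var }}) %>%
    summarise(avg = mean({{ value_var }}), .groups = "drop")
}

# Usage: one row per distinct value of cyl, with the mean mpg in `avg`
result <- mean_by(mtcars, cyl, mpg)
```

Compared with the older quote-then-unquote idioms, the function body reads almost like the interactive code it wraps.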


See, it's changed again!

Damnit Hadley, why must you do this to me?

(Really I'm bitter because I know that I'll end up maintaining a bunch of important code in version N-2 of Hadley's NSE adventures at some point in the future).


Changed, yes, and finally in the right direction I would say!


... and the following is what I mean. I think this is as close to clean and understandable as it can get, for most use cases (variables in functions, including creating new variables):

https://www.tidyverse.org/blog/2020/02/glue-strings-and-tidy...


The biggest difficulty of NSE is not the syntax, but the requirement to hold parallel mental models while writing the code: symbolic expressions and value manipulation. Juggling will always be difficult.
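
A small base-R sketch of those two models, with no tidyverse involved. The function name `show_both` is made up for illustration.

```r
# Hypothetical helper that exposes both views of the same argument.
show_both <- function(arg) {
  expr  <- substitute(arg)             # symbolic view: the code the caller typed
  value <- eval(expr, parent.frame())  # value view: that code evaluated in the caller's environment
  list(expression = deparse(expr), value = value)
}

a <- 2
res <- show_both(a + 1)
# res$expression is the string "a + 1"; res$value is 3
```

Every NSE function has to decide, for each argument, which of the two views it wants, and so does anyone wrapping that function.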


R has always been relevant and isn't going away any time soon. The language is ugly and flawed and has a bunch of quirks and gotchas but there are just too many libs.

Also, for a lot of people R is going to be the first and often only language they learn, because they have no use for programming outside of statistical analyses and it's damn practical for that single use. Right now generations of students are being taught in R, and right now Bioconductor is booming and being added to every day. That stupid arrow assignment operator has plenty of good years ahead of it.



