> R has two types of empty string: character(0) and "". I understand it's frustr...

probably_wrong · on March 22, 2022

I'm going to side with the author here: if he read "Advanced R", "R for Data Science", "The R Inferno", "Rtips. Revival 2014!", the official "An Introduction to R", "R Language Definition", and "R FAQ", and yet he still has problems with the language, then maybe the language is to blame.

And even if the author is the problem, I wouldn't accuse them of not reading enough.

em500 · on March 22, 2022

Ok, but if someone claims to have read all the Python manuals and wrote something like

> Python has two types of empty string, array('u',) and ""

you'd probably conclude that he hasn't really understood what he read.

kafkaIncarnate · on March 22, 2022

For Python (at least Python 3) a better example might have been b"" and "". Both are empty, but they are not equal: you have to decode or encode one of them before comparing, and different functions return one or the other. On top of that, different functions might return False, None, (), {}, etc.
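A minimal sketch of that point, assuming a stock Python 3 interpreter; the variable names are made up for illustration:

```python
empty_bytes = b""
empty_text = ""

# Both values are empty, yet they are different types and do not compare equal.
both_empty = len(empty_bytes) == 0 and len(empty_text) == 0
are_equal = empty_bytes == empty_text

# Decoding the bytes object first makes the comparison succeed.
decoded_equal = empty_bytes.decode("utf-8") == empty_text

# Several distinct "nothing" values: all falsy, none of the same type.
empties = [False, None, (), {}, "", b""]
all_falsy = not any(empties)
distinct_types = len({type(e) for e in empties})
```

Truthiness checks such as `if not value:` treat all of these alike, while `==` and `is None` do not, so which test is appropriate depends on what the called function can actually return.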

This OP complaint seems like weird nitpicking about R. Many languages have different empty/null-types for different variable-types. Also, don't get me started on "nulls" in C, C-strings, C++ strings, or memory allocation.

All languages are complicated.

mellavora · on March 22, 2022

Does f"" also count as an empty string? Because oddly enough, I don't see anything in that syntax which suggests that it is really a function which returns a formatted version of whatever is in the "".
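For what it's worth, a quick check suggests the answer is yes. This is a sketch assuming Python 3.6 or later, where f-strings exist; the names are illustrative only:

```python
empty_f = f""
plain = ""

# An f-string is an expression evaluated at runtime; its result is an ordinary str.
same_value = empty_f == plain
result_type = type(empty_f)

# With a placeholder, the expression inside the braces is evaluated and formatted.
name = "R"
formatted = f"{name} has {len(plain)} characters in its empty string"
```

So unlike b"", which is a different type, f"" produces a value that is indistinguishable from "" once evaluated.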

stewbrew · on March 22, 2022

Or maybe the language just isn't for everyone and for every use case. I would be hesitant to write something customer-facing in R. But it's great for doing statistics. The main problem with R is that people underestimate how different it is and thus don't care to learn practices for writing robust R code.

pmyteh · on March 22, 2022

The issue isn't so much that character(0) is a zero-length vector and "" is a length-1 vector containing the empty string, it's that you can't necessarily rely on other people's code returning one or the other: things that 'nearly' always return a one-element vector (which may contain the empty string for 'nothing') can unexpectedly return something else if they fail on an edge case. And, unless you catch it correctly, this can cause downstream failures with little in the way of warning (because a zero-length vector is obviously a 'sensible' thing to return from a function in the usual case).

In that sense, it's similar to the problem many languages have with NULL, but on steroids: you can have NULL, NA, character(0) (or anythingelse(0)), or '' as your null result and each of them is tested for in a different way.

Obviously this won't be a problem for the various battle-tested standard libraries, but a lot of my work in R at least is assembling somewhat-novel analysis pipelines based on quite new statistics code.

stewbrew · on March 23, 2022

To some extent, this is rather a general problem with quality of code in dynamically typed languages. Public functions should return predictable results. This is a matter of testing. In my experience, packages on CRAN are well tested.

With respect to dealing with return values, you can circumvent some pain points by using identical(), isTRUE(), isFALSE() in if conditions instead of, e.g., `==` which many people use because this is what they know from other languages. The assertive package is also nice.

em500 · on March 22, 2022

That could have been expressed more diplomatically, but I think you're right. IMHO, what people should try to understand first, and most thoroughly, about a new language is its native data types. This is more fundamental than the syntactical constructs.

R's data types are one of its most alien parts, and that's why I think if you're coming from another language, chapter 20 of Hadley Wickham's book[1] is the most important one.

[1] https://r4ds.had.co.nz/vectors.html

usgroup · on March 22, 2022

> I understand it's frustrating trying to use a language you don't understand. And instead of reading the language manual you go on rambling.

I'd agree with this assessment. If you start doing R and it feels weird to you then -- in my opinion -- you're probably in the wrong place. Meanwhile, for the cognoscenti -- the researcher, the statistician -- R behaves just as you'd expect. That is the draw -- a language developed around statistics.

R is not a great computing environment for computer science, e.g. writing iterative algorithms. Almost everything worth a damn in R is written in C++ and then FFI'd in. Those who do not want to use C++ can write their algorithms in Python or Julia -- and they often do. Arguably the de facto standard for computing-oriented machine learning is Python, not R.