
I'm pushing for our actuarial team to transition to more R + Git. After 3 years of preaching, most of the actuaries now use RStudio + git as their primary work tool. It is happening.

What we did:

1) Provide documentation on everything from install to using internal R libraries for ETL.

2) Provide mostly problem-free, always up-to-date VMs with RStudio Server / Shiny Server.

3) Establish a hotline channel for instant help on R or git.

4) A couple of members of the team developed a really close working relationship with IT, and we have great respect for each other's work.

What we provide is way better, and by being active we built users' trust in the tools.

We are phasing out SAS and proprietary modeling tools. Python never took hold even though we bought Anaconda Enterprise. Excel is here to stay for sure, but since actuarial students learn R in school, it is easier to onboard new hires.

If you want to go down this path and have a chat, hit me up. I'm in P&C. We use R both in development and production environments. We use it for pricing, spatial contractual obligation, claims assignment and a couple more models.




The current top comment (sibling to the one I’m replying to) argues that keeping Python environments up to date across actuaries'/users' computers is too difficult.

This is nicely solved by using R server.

I’ve worked in an R server shop, and the experience is really nice. You log on to the server in Chrome or Firefox and the browser window basically becomes RStudio. All calculations are done on the server, and all code and data also live on the server, which is a huge bonus in terms of data protection. No copies are floating around on people's laptops, and if Johnny is sick and forgot to push his code to git, no worries, it’s all on the RStudio server.

I don’t know of a Python solution that is nearly as good. I think Conda suggests using Jupyter Lab, and while that is a great environment, it’s not great if it’s all you can use.


The big problem with notebooks is that you don't have a real REPL. This prevents one from single step debugging and tracing. This is one area where RStudio is much, much better.

The trouble is that so many of the younger DS people are focused on Python that it makes financial sense to just deal with all its problems. There are also a lot more programming tools (though fewer statistical modelling tools).


You do have access to a REPL when using Jupyter notebooks.

You can hook a notebook or a REPL to an existing kernel. I always have a command line attached to my notebooks. When using Jupyter Lab I attach the built-in terminal and place it at the bottom. When using notebooks I attach it from my terminal.

The experience in RStudio is still better imho. It’s also a more mature text editor and IDE than Jupyter.


Ok, fair enough, I only use notebooks when I can't avoid them. I'm pretty sure you don't get a REPL by default though. Is there an involved setup in Jupyter?


    jupyter console --existing
should start ipython in your terminal and connect to the last started kernel (e.g., the one in the notebook you just started)

https://stackoverflow.com/questions/22447572/connect-termina...

For jupyter lab, you just choose to start a repl from the gui and choose an existing kernel.


Thank you! (clearly I didn't spend a lot of time doing this, as I have an Emacs addiction ;) )


I'm an actuary with a strong interest in this area - would be very interested to hear more especially on your R vs Python experience.


It came down to IDE, workflow and data.table.

RStudio is an absolute killer solution from the get go. Package management in R is simple and robust. Shiny is the new Excel pivot table on performance enhancing code.

Python has more contributors and more users. It also creates a lot more noise. Business people may feel like it is a programmer's tool. R feels more approachable.

In the end, both are great solutions, but we decided on R because we believe in the people contributing to the ecosystem, mostly RStudio. Somewhere down the line, there might be a transition to Julia.


Thanks - really interesting, especially on the RStudio point.


I've used R (3 years) and Python (8+ years) in data science and much prefer Python, because it can do things that aren't just pure data analysis, and because pandas is so amazingly good compared to R's data matrix solutions, in my opinion. I believe that the algorithmic trading industry has gone fully into Python and away from R for these reasons.


R has data.table. It is the game changer; I agree that base R data.frames do not cut it for performance. tibble will come close once they incorporate more of the data.table performance tricks.

https://h2oai.github.io/db-benchmark/


Does R have robust CSV parsing? I remember using the default and it'd be extremely finicky about getting the header and index flags right and wouldn't typecast numeric columns properly (instead they'd end up as factors and not play nice)


The Python version of data.table has very fast CSV parsing (compared to pandas), and it didn't have issues like those you mention. Even if data.table had issues with CSV parsing, you could probably use Apache Arrow to parse the CSV into an Arrow table and then convert it to a data.table (but that is probably suboptimal).



Personally have never had a problem with R csv parsing


It happens, but mostly because other formats don't produce usable CSVs. The biggest problem is when there are free-entry text fields (common for customer/business names) and there isn't full quoting around these fields; then base R will break.

I believe both fread and readr::read_csv do the right thing here, but the base-R perspective on data manipulation before read.csv is to use Perl (the R-core team are pretty old-school, to be fair).
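
To make the quoting problem concrete, here is a minimal sketch in Python (standard library only, not R). The sample rows and column names are made up for illustration; the point is only that an unquoted comma inside a free-entry field shifts the columns, while full quoting keeps the field intact.

```python
import csv
import io

# Hypothetical export: the business name is a free-entry field containing a comma.
unquoted = "id,name,premium\n1,Smith, Jones & Co,1200\n"
quoted = 'id,name,premium\n1,"Smith, Jones & Co",1200\n'


def parse(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text)))


bad_rows = parse(unquoted)
good_rows = parse(quoted)

# Without quoting, the embedded comma splits the name into two fields,
# so the data row ends up with more fields than the header.
header_len = len(bad_rows[0])
ragged = [row for row in bad_rows[1:] if len(row) != header_len]

# Writing with QUOTE_ALL is one way to get "full quoting" on export.
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_ALL).writerow(["1", "Smith, Jones & Co", "1200"])
fully_quoted_line = buf.getvalue()

print(len(bad_rows[1]), len(good_rows[1]), len(ragged))
```

Checking for ragged rows like this before loading is a cheap way to spot an export that was not quoted properly, whichever reader you use afterwards.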


h2o's data.table clone is fine

https://github.com/h2oai/datatable


I've been a heavy user of all 3, and pandas syntax is a nightmare compared to dplyr or data.table in R. That being said, I still use pandas because I prefer Python for non-analysis.


I'm a CPA. When I started learning to code, I looked for whatever was most like a spreadsheet. R fit the bill, with built-in data frames.


Oh.. similar line for me, accounting/tax law. Excel is bread and butter because all year-end financials are prepared and finalised in Excel. I have also used LibreOffice on my personal machine, and it kinda works too.

For a couple of years I have tried to build myself an Excel macro balance sheet template which does most of the copy-pasting from previous years, the bank interest calculations and all.

It would be interesting to know how a US CPA works, because it's all accounting package > Excel > e-file.


I'm on mobile, but do also consider https://JuliaActuary.org (something that I personally have contributed to).


Looks really interesting thanks. I've seen some interesting insurance projects using Julia e.g.

https://www.youtube.com/watch?v=__gMirBBNXY


R is better if your raw data is already tabular. I prefer Python if the raw data is unstructured / semi-structured. You can make the case that once Python has converted the data to tabular then move to R, but at that point I like the soup to nuts to be in one language.
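
As a rough illustration of that "convert to tabular first" step, here is a minimal Python sketch using only the standard library. The claim records and field names are hypothetical; the idea is just to flatten nested records into one row per record, which could then be analysed in Python or handed to R as CSV.

```python
import csv
import io
import json

# Hypothetical semi-structured claim records; field names are illustrative only.
raw = """
[
  {"claim_id": 1, "insured": {"name": "Acme", "region": "QC"}, "payments": [100.0, 250.5]},
  {"claim_id": 2, "insured": {"name": "Bolt"}, "payments": []}
]
"""


def flatten(record):
    """Turn one nested record into a flat dict with a fixed set of columns."""
    insured = record.get("insured", {})
    payments = record.get("payments", [])
    return {
        "claim_id": record["claim_id"],
        "insured_name": insured.get("name"),
        "insured_region": insured.get("region"),  # None when the key is missing
        "n_payments": len(payments),
        "total_paid": sum(payments),
    }


rows = [flatten(r) for r in json.loads(raw)]

# Write the flat rows out as CSV text; missing values become empty cells.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
table_csv = out.getvalue()

print(table_csv)
```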


I’ll second klelatti’s question about R vs Python. From my perspective Python is just as practical for actuarial calcs and better for building general purpose tools. Is there a reason Anaconda didn’t click?


GPU integration was broken for a long time. Managing VMs / environments. The absolutely horrible integration with git/GitHub.

Having to rebuild your environment from scratch when your workspace crashed. Imagine starting a notebook with a 45-minute compile time. No go.

One click deploy, let's just forget about it.


Thanks! That totally makes sense. If I had to pick a pain point for getting people started with Python tools it would be environments. Comments here make me think my team is working with a lot less data too.
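
On the environment pain point, one small thing that can help is recording exactly what is installed so an environment can be recreated later. The following is only a sketch, assuming a plain pip-style setup rather than anything Anaconda-specific; `pinned_requirements` is a made-up helper name, and it does roughly what `pip freeze` does, using the standard library.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for every installed distribution."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


lines = pinned_requirements()
requirements_text = "\n".join(lines)
print(f"{len(lines)} packages pinned")
```

Saving `requirements_text` next to the project (for example in git) gives the next person a starting point for rebuilding the same set of packages.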


I am putting together a course aimed at Python beginners in enterprise; I too have experience in finance. If anyone is interested, I would love some early feedback. You can contact me by email; it is in my profile bio.


Would love to have a chat about how you’re making R + git more accessible. How do I best reach out?



