trthatcher's comments

trthatcher · on Jan 23, 2022

While I'm not clear why you would want to go back to an undergrad given your education, I would look for programs with names like "Computational Mathematics" or "Computational Science". I earned a Bachelor of Mathematics in Computational Mathematics. The first two or three years were heavily focused on the mathematical foundations (eg. non-linear and discrete optimization, combinatorics and CS fundamentals, statistics and applied mathematics). The latter half of third year and all of fourth year were basically a "create your own adventure" where you could focus on subject areas that interested you (ex. econometrics, computational biology, industrial engineering, computational statistics + machine learning, etc). We had a breadth requirement t that often pushed people to obtain a minor from a different faculty (eg. Arts or Sciences). The program overview for my school might give you an indication of what content to look for [1].

You might also find graduate level programs that are similar in aim; my school also offers an MMath in Computational Mathematics [2].

1. http://ugradcalendar.uwaterloo.ca/page/MATH-Computational-Ma...

2. https://uwaterloo.ca/computational-mathematics/future-master...

mettamage · on Jan 23, 2022

A couple of reasons:

* Building CRUD apps lack depth, when they don't lack depth, they lack timeless knowledge (in many cases). A well picked bachelor/master does not have that. There are programming jobs out there that don't have that but they are in the minority and I am bad at getting to the interview stage, despite having a master in CS (my cv is ragged, I had a terrible start and don't know how to recover from it other than just by moving on and hoping for the best).

* A university provides fellow students and TAs. In some cases I prefer heavily to study with them rather than doing a MOOC on my own. A study group is harder to setup. Getting a tutor is definitely an alternative for the TA part.

trthatcher · on Nov 22, 2021

I quit my job in May 2019... before the pandemic started.

I was working as a senior data analyst at an insurance company in Canada. The work was boring, thankless and pretty exhausting but I was paid well (for a data analyst, anyway). There were no opportunities on the data science team since I didn't have an MSc so I didn't have a clear next step. At the same time, I had started the OMSCS program that January and that was eating up my off time.

I was too busy just trying to stay on top of school and work that I just couldn't see myself having the time to prep for tech interviews or do some side projects to jump-start the tech side of my resume. I decided I just wanted a breather.

I resigned and cited wanting to get ahead on my degree. I figured this would be a reasonable narrative if the gap in my employment were to ever come up during an interview.

I spent 10 months focusing on school, enjoying my time and improving my development skills.

When I started looking, I applied to about 25 data scientist and a handful of ML engineer roles. I received no callbacks except for one ML Eng role which didn't go anywhere. Luckily an old manager was starting a data science team at another insurance company at the same time as I was looking. He basically handed me a data engineer role with a bump in compensation in early 2020.

If I reflect:

- It worked out extremely well for me. I left a non-technical job, had a nice 10 month break and ended up getting a development job where I get to write code all day. I actually enjoy my work now and I have learned so much since then. Zero regrets for me.

- At the same time, I underestimated how little my experience as a data analyst meant to data science teams. I would have had to apply to many more jobs to get something via the standard online application approach. I think that would have been really stressful.

- I ate through about 25% of my cash which was a little painful to watch.

I think if you have a strong skill set and experience profile, its probably just fine. If you were like me and trying to make a big switch (eg. data analyst to data scientist or some kind of eng), it was a risky move and I wouldn't recommend it without a plan. I lucked out. YMMV.

disgruntledphd2 · on Nov 22, 2021

> There were no opportunities on the data science team since I didn't have an MSc

As an aside, this is total nonsense. For gods sake, if someone has the skills to do data science (and if you can code and do analysis then you definitely do) arbitrary gatekeeping on a credential makes no goddamn sense.

EDIT: > At the same time, I underestimated how little my experience as a data analyst meant to data science teams.

Not all data science teams, I've been doing this for a while and absolutely adore getting people with analytics experience, as it's critical to success in a lot of DS teams and is hard to train.

trthatcher · on May 1, 2020

Data Scientist | Equitable Life of Canada (https://www.equitable.ca) | Waterloo, ON | Full-time | Onsite or Remote (within South-Western Ontario)

Equitable Life is a small mutual life insurance company (~700 people) based in Waterloo, Ontario. We're hiring our first data scientist to help found the data science team. As of right now, the team consists of my manager and myself (data/ml engineer).

This data scientist role is a foundational one; you'll need to help define our methodology, tooling and data strategy. The data scientist will act as an internal consultant within the organization and will help various teams optimize their processes through the application of predictive models. This is a great opportunity for someone with a couple years of experience under their belt.

We are primarily a Windows shop with all infrastructure managed on-premise and most development is waterfall. However, the company is actively working towards being cloud-friendly (Azure) and rethinking its development processes (eg. embracing devops tech, agile).

As this is a foundational role, we're looking for someone with either a masters or PhD in a quantitative discipline and a minimum of a couple years of experience in developing predictive models (preference for supervised learning).

Apply here: https://canr57.dayforcehcm.com/CandidatePortal/en-US/equitab...

trthatcher · on March 2, 2020

Location: Toronto, Canada

Remote: Yes

Willing to Relocate: Depends, open to discussion

Technologies: Python, Julia, some JavaScript, SQL

Résumé/CV: https://thatcher.dev/resume.pdf

Email: tim @ my domain above

I'm seeking opportunities in the data science field, especially ML Ops if you don't mind someone who is just getting started in that space.

I'm currently a part time student in the OMSCS program. I left my corporate job ~10 months ago to focus on self development and to focus on finding work that is a good fit. I'm highly motivated, independent and I love tech; I know I'll perform given the opportunity.

trthatcher · on Nov 25, 2019

Excel is a piece of software that I would absolutely not hesitate to pay for... but I'm unfortunately stuck with LibreOffice.

I used to work in an insurance company on an actuarial pricing team where the preferred tool was Excel (the modelling was pretty simple). Needless to say, I became very accustomed to and adept with using Excel. We definitely pushed the limits... but what was nice was being able to grab 600K records (maybe 20 cols) from a DB, throw it in Excel and get some results in a matter of minutes. You might have to wait a few seconds based on what you were trying to do, but Excel could handle it.

At home I run Linux... where there is no Excel, so I use LibreOffice instead. Just a few days ago I was poking around the Himalayan Database[1] and one of the tables has about 50K records. LibreOffice absolutely chokes when I try to do any filtering or calculations. As well, the pivot tables in Excel are in a totally different league than LibreOffice in terms performance and flexibility. It's unfortunate because I try to support OSS as much as possible... but LibreOffice is just so painful.

You could argue that I'm using the wrong tool for the job. Ultimately, I do throw it in SQLite or Pandas, but Excel is just so nice for ad hocs if it fits in memory.

[1] https://www.himalayandatabase.com/

nullspace · on Nov 25, 2019

> You could argue that I'm using the wrong tool for the job.

Without comparing Excel vs LibreOffice vs etc, I've found the category of spreadsheet software to be immensely helpful for analyzing, aggregating, visualizing and reducing small datasets.

You do have to end up falling back to python scripts using pandas or whatever if you need to run jobs that need to take data from one DB and put it into another DB or something like that.

But if my output is basically a reduced set of tables, series or graphs for presentation, spreadsheets are an immensely useful tool.

What I've specifically done is have one sheet of the spreadsheet represent the "raw, unreduced data", and write up formulae for aggregation in other sheets, based on the raw data and intermediate aggregations. Starts becoming less ideal when you have 10 or more sheets, but for simple cases, it's very much underrated by us engineering folks.

gpresot · on Nov 25, 2019

I always found that excel can be as simple and as complex as you need it to be. You can use it to write your shopping list (ok, not ideal), or a small budget, and you can ramp it up for really complex simulations. Once you know vlookup, indirect/match, sumif/countif and very few other formulas (for example text manipulation to clean and harmonise data inputs), you can do really interesting stuff even with datasets of thousands of lines. I found that often in excel the most difficult stuff is not the back-end (data analysis and simulation) but the front end (i.e. giving the user a clear interface for inputing data and conditions in the model, and visualising outputs, for example with buttons, constrained inputs, conditional formatting etc.).

surfsvammel · on Nov 25, 2019

Why not ideal for shopping list? When we go skiing with a couple of friends, we setup a long shopping list in excel, there each row has the name, category, amount and amount unit.

Before heading into the store we have a Pivot table sum it all up, group everything by category sorted by the order those categories are encountered when walking though the store.

This way it takes 2-3 people less than an hour to shop for a 4 day trip for 10-20 people. One holds the list, and then 1 or two runners go fetch stuff.

It’s been perfected over the years:)

gpresot · on Nov 25, 2019

Hats off for this. I was thinking more about as a substitute of a simple paper list (or if on mobile, just a list on whatever default note app is you have in iOS or Android)

MiracleUser · on Nov 25, 2019

>What I've specifically done is have one sheet of the spreadsheet represent the "raw, unreduced data", and write up formulae for aggregation in other sheets, based on the raw data and intermediate aggregations

This is exactly what I do. When it works, it is just way too fast for ad-hoc reporting to consider doing anything else. Pretty much all of my first draft reporting is built this way, so i can cut through the back-and-forth of management change requests before going through the trouble of setting up a full thing.

It is amazing how much you can accomplish with appending a couple columns to your raw data for categorization and then pivoting.

when im in a rush and need another layer (pivottable of a pivot table), I even set up formulas to reference a produced pivottable and then pivot on the formulas. You can extend the formulas far beyond expected use-cases and filter out the irrelevant rows later

meithecatte · on Nov 25, 2019

> LibreOffice absolutely chokes when I try to do any filtering or calculations.

File a bug. https://bugs.documentfoundation.org/

It's easy to complain in an HN comment, but a clear description of the test case would certainly help with getting the issue resolved.

IshKebab · on Nov 25, 2019

I wish this comment was banned from HN. Every time someone says that some free software is in any way flawed there's always the completely useless reply "well did you file a bug?" or even worse "it's open source - why didn't you fix it?"

I'd hope it's obvious why these comments are bad, but I guess not so:

* Not everyone has time to fix a bug or even file one.

* The ability to fix or file bugs doesn't mean that the bugs don't exist!

* The ability to fix or file bugs doesn't mean that one shouldn't talk about the bugs.

* The possibility of fixing a bug doesn't help if you don't have the time or ability to fix it and actually want to use the software.

* Not everyone is able to fix all bugs

* It's not nearly as easy for people outside a project to fix bugs as it is for people familiar with the code.

* Everyone already knows it is possible to file and fix bugs in open source software so you're not adding anything to the conversation.

xtracto · on Nov 25, 2019

Specially because you usually have to jump through a lot of hoops to file a bug. I remember some time ago trying to file a bug for some major open source software. It was so painful that I gave up before finishing.

boomboomsubban · on Nov 25, 2019

Without context, it was likely painful as they need that information to have any chance of fixing the bug.

ubercow13 · on Nov 26, 2019

I don’t mind collecting useful information, it’s the having to create accounts on some hard-to-navigate project-owned bug tracking thing that kills it for me.

bityard · on Nov 25, 2019

There's a word for people who expect to get things for free without giving anything back to the authors or community that made it possible.

Yhippa · on Nov 26, 2019

There's also a phrase for people who give things away for free and expect something in return.

Dayshine · on Nov 26, 2019

Could you point me at the paid for equivalent for excel on Linux? If he donated even a thousand dollars he wouldn't get the product he can get for almost free on windows.

SyneRyder · on Nov 26, 2019

Doesn't Excel run under Wine? That might actually be the better option.

(LibreOffice user here on Windows, donated in the past, but I find LibreOffice Calc really unpolished. It puzzles me because NeoOffice is really quite good on the Mac. I should probably just switch to MS Office.)

stjohnswarts · on Nov 26, 2019

Certain versions yep :) . Although I'm not sure it's flawless. Crossover office is best if you're going down that route, or just put windows in a VM and run it there if it's crucial to your work. That's what I do.

esyir · on Nov 26, 2019

This just turns Foss into the inferior but free option, and I don't think that was the intention.

mariushn · on Nov 25, 2019

And since you mentioned you'd pay for Excel, please contribute a one-time life-time $50-$100 bounty to help fix that performance issue.

ken · on Nov 26, 2019

Would this hypothetical fix include a performance suite, and a regression testing system, to ensure this task stays fast forever?

IME, systems that don't focus on performance as a primary metric tend to slow down over time as they accumulate features. Someone who is willing to buy Excel ($130) may not be willing to pay a $100 bounty every time some other programmer makes LibreOffice slower again.

I googled for LibreOffice performance regressions and found several, going back many years, and they tend to include a one-line fix. For all its faults, I'm sure there's a team at Microsoft that knows every time Excel performance regresses.

mariushn · on Nov 26, 2019

Good point. How can one prevent this though, for any free/open source software? Introducing issues (intentional or careless) to keep some funding going.

noja · on Nov 25, 2019

Why is that? Nobody ready to improve LibreOffice Calc? Calc is not good enough to start improving on?

Do we need a list of missing features? I would also add:

* Copy paste from Writer does not work

* Summing a column (not a range) isn't possible

* Type detection is not as good

* Auto-update doesn't work at all

gerdesj · on Nov 25, 2019

I've just pasted some data from Writer into Calc.

Could you elaborate on your column summation problem please. =sum(E:E) works as does the summation button.

Don't understand your third point but if it is text vs numbers and dates etc then I've had plenty of fun with this in Excel as well over the decades.

Auto-update - please elaborate where it fails

noja · on Nov 25, 2019

I've just pasted some data from Writer into Calc.

Try to paste a table from Writer into Calc so that the structure is converted to rows and columns (like Word and Excel does).

Could you elaborate on your column summation problem please. =sum(E:E) works as does the summation button.

~I get Err:522 when I run =SUM(E:E)~

It works!

Don't understand your third point but if it is text vs numbers and dates etc then I've had plenty of fun with this in Excel as well over the decades.

Me too. It's annoying in Excel, but worse in Calc.

Auto-update - please elaborate where it fails

The built-in auto-updater doesn't so much fail, as hasn't ever worked at all (Mac, but I think Windows as well). iirc I get a %%PERCENT%% placeholder sometimes, and the rest of the time I have to download the update from the website instead. Same on several machines.

gerdesj · on Nov 25, 2019

I can paste a table into Calc OK. It was a simple effort with a,b,c,d in row 1 and 1,2,3,4 in row 2. The numbers came in as numbers OK. Paste as unformatted text is rubbish (single column of data) and I think I might file a bug for that.

Sorry, I get you now about auto updater - software updater not [Google: Excel auto-updater]. That's not really a thing on Linux because when you update the system, the whole thing updates - apps and all.

The pace of development in LO is absolutely amazing but there are some horrendous bugs with near immortality status. If you can put up with them then use it. No software is perfect but LO is moving at increasing speed towards the light.

stjohnswarts · on Nov 26, 2019

Isn't that the job of brew or whatever package manager you're using on apple?

IronWolve · on Nov 25, 2019

Copy and paste on linux when using multiple apps and remote vms/vnc/rdp etc, is a pain. Depending on what app you use, its looking for which clipboard buffer and what commands to copy/paste. Mixing Os's and apps, is a clusterfuck on a multiheaded desktop with excel in a windows rdp session and libre office in another, then throw google sheet in because work decided to switch to google office at 1/2 the price of office 365 per user.

Sigh.

chopin · on Nov 25, 2019

I absolutely hate type detection in Excel. It's at least comparable to type coercion in Javascript, maybe even more capricious. I know the type thank you very much.

tinza123 · on Nov 25, 2019

I have been using https://openrefine.org/ for interacting with large csv files, and it has been amazing. It supports the normal filtering / transformation and other import / export actions. Plus it's open source.

gerdesj · on Nov 25, 2019

I run LO on this laptop under Linux. I've just opened the member's table which has 74,853 records in it. It's a 54.8MiB FoxPro database and took about 10 seconds to import. I've put an auto filter on the whole thing and a search for France took blink. I created a pivot table in seconds showing counts of sex by citizenship (roughly 10% of UK climbers are women.)

This is a fairly beefy machine but over two years old so nothing particularly special nowadays.

dgudkov · on Nov 25, 2019

>You could argue that I'm using the wrong tool for the job.

For ad hoc filtering, calculations, merging, analyzing tabular datasets a dedicated tool such as our EasyMorph (https://easymorph.com) would do a better job than Excel in the vast majority of cases. It can handle millions of rows, keeps data in memory, has a wide range of available transforms and built-in charting. We target heavy Excel users and it works quite well. Disclaimer: I work for EasyMorph.

205guy · on Nov 26, 2019

I think both Excel and Libre Office have their place. Excel has the power and refinement for professional users such as accounting, actuarial, and the commenter's data analysis.

Libre Office is for home use: tax prep, budgeting, graphing science projects. That's what I use it for, and it's perfect and free. That might be unfair to all the hard work that's been put into it, but if you don't push its limits, it works great.

So I guess I am arguing it's the wrong tool for data heavy jobs.

e12e · on Nov 25, 2019

Always felt it was a shame that scheme in a grid was abandoned (and last I tried getting it up and running I gave up;I believe the only viable option is to change the gui library - and that didn't seem trivial at the time. I suppose these days one might rustle up an ancient version of Slackware in a docker container...).

http://siag.nu/siag/

Anyone know of something similar, but more up to date?

bityard · on Nov 25, 2019

> Excel is a piece of software that I would absolutely not hesitate to pay for... but I'm unfortunately stuck with LibreOffice.

But... according to the rest of your post, you _did_ hesitate to pay for it and are finding that LibreOffice doesn't fit your needs. So why not buy Excel?

sjy · on Nov 25, 2019

From the parent:

> At home I run Linux... where there is no Excel

You can run it in a VM, but at that point you're already settling for a sub-optimal experience, so why not try a alternative that's free and packaged for your OS?

_euac · on Nov 25, 2019

I had the same issue with LibreOffice choking with big datasets so I've started using SoftMaker PlanMaker for loading anything with more than a few hundred records. It's works just fine with documents with tens of thousands of rows and it allows me to do the - admittedly very simple - calculations and charting I need.

The whole SoftMaker Office suite is pretty nice. Not sure where it's on feature parity with Microsoft Office, but the price isn't terrible (something like $20 a year for the paid version, or free if you don't care about the paid features) and it does a great job of reading Office files.

monoideism · on Nov 25, 2019

have you tried gnumeric? I prefer gnumeric to open office's spreadsheet when it comes to open source.

isostatic · on Nov 25, 2019

gnumeric is great for simple tabulated data as far as simple charts etc, but the advanced features that spreadsheet lovers use aren't available.

Shorel · on Nov 25, 2019

Have you tried WPS Office? It kind of works for the things I do, but I am not an advanced Excel user.

I would be very curious about your experience as you certainly tax the software much more than I do.

DrScump · on Nov 27, 2019

  LibreOffice is just so painful

You were running it on a different OS and different hardware. LO vs Excel was not the only variable.

flowerlad · on Nov 26, 2019

* You could argue that I'm using the wrong tool for the job.*

You are. You could try this reporting tool instead: http://pebblereports.com it has an Office like interface, works really well with databases and has performant crosstabs.

ken · on Nov 26, 2019

They said they're using LibreOffice because they're running Linux. That tool you're suggesting is Windows-only.

If they were using Windows, they could save a couple hundred bucks and just use Excel.

trthatcher · on Jan 2, 2014

You have the right idea.

You're assuming an underlying model for the data. You have a test statistic( that estimates a model parameter) and you have a hypothesis regarding a parameter. The p-value is the probability that you get a test statistic more extreme than the one observed assuming that your hypothesis is true.

Ex. You have a sample of 1000 men's heights. You compute the sample average height as 5'9 and a sample standard deviation of 3 inches.

(Unlikely) hypothesis: the average height is 4 feet. Your p-value is the probability of getting an sample average more extreme than 5'9 given that your 4 ft height hypothesis is true. Given that the sample standard deviation is 3 inches and 5'9 is 7 standard deviations from 4ft... the p-value is going to be small, so you'll reject that.

Note: I'm leaving out details and assumptions

dmd · on Jan 2, 2014

Yeah, that's what I was taught. So what are people answering instead?

tel · on Jan 2, 2014

People can be easily lead to misinterpret p-values even if they can define them. Most often people assume that p values indicate something about the correctness of a model or an inference. This is the classic p(d|h) v p(h|d) debate.

trthatcher · on Dec 10, 2013

I don't think it will be a problem to do both.

It doesn't take long to learn Julia well enough to write a Julia package that others will use. For me, that was about 3 weeks @ 10-15 hours a week, and I'm no programmer. It's a small language and it's quite readable. I'd reccomend going through the Julia manual; it doesn't take long and only requires some diligence.

The problem is the lack of packages/libraries and lots of documentation is still missing. That, and the packages that do exist with multiple contributors are still in disagreement about standards and consistency.