ASpring's comments

Aside from the (very valid) methodological concerns, this is also published in a journal from a publisher with a controversial reputation: https://en.wikipedia.org/wiki/MDPI


I'm not sure I'm fully understanding your point. Is it that constructing confidence intervals using t-statistics is inappropriate for a lot of real data that isn't distributed somewhat normally?


It's their point, and it's a good one, but I think they're somewhat overstating how common power-law data is; it probably varies a lot by field of study. And at least the logarithm of a power-law variable can help bring it back closer to the world of sanity. Plus, there are plenty of fields where nonparametric tests of medians are accepted standard practice.
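A quick simulation makes the contrast concrete. This is only a sketch (Python with numpy/scipy; the Pareto shape of 1.5 and n = 50 are arbitrary choices of mine), comparing coverage of a nominal 95% t-interval for the mean on raw power-law draws versus on the log scale, where the data becomes exponential and well-behaved:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    shape, n, reps = 1.5, 50, 5000       # Pareto tail index, sample size, trials
    true_mean = shape / (shape - 1)      # mean of Pareto(x_m=1), finite for shape > 1
    true_log_mean = 1 / shape            # log of a Pareto(x_m=1) is Exponential(rate=shape)

    def t_interval_covers(sample, truth):
        lo, hi = stats.t.interval(0.95, len(sample) - 1,
                                  loc=sample.mean(), scale=stats.sem(sample))
        return lo <= truth <= hi

    raw_hits = log_hits = 0
    for _ in range(reps):
        x = 1 + rng.pareto(shape, n)     # classical Pareto samples with x_m = 1
        raw_hits += t_interval_covers(x, true_mean)
        log_hits += t_interval_covers(np.log(x), true_log_mean)

    print(f"raw-scale coverage: {raw_hits / reps:.3f}")
    print(f"log-scale coverage: {log_hits / reps:.3f}")

With infinite variance at shape 1.5, the raw-scale intervals should cover noticeably below the nominal 95%, while the log-scale intervals land near it; that's the "back closer to sanity" effect.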


You can turn most problems into power laws by recursing a reasonable risk distribution over them.

So suppose we ask: what is our confidence in X? (rather than asking about X itself); and then, what is our confidence in the model by which we assign confidences in X (i.e., the model risk); and so on...

In practice, what we want to model is the appropriate confidence, not an actual prediction (bunk). So we are very often screwed.
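A rough simulation of that recursion (my own toy construction, not a formal argument): start with a well-behaved normal, then at each level treat the previous level's scale as itself uncertain. The result isn't literally a power law, but each layer of model risk fattens the tails dramatically:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 1_000_000

    def with_model_risk(levels, spread=0.75):
        # Each level multiplies the scale by an uncertain (lognormal) factor,
        # i.e., our confidence about the previous level's model.
        scale = np.ones(n)
        for _ in range(levels):
            scale *= rng.lognormal(0, spread, n)
        return rng.normal(0, scale, n)

    for k in range(4):
        x = with_model_risk(k)
        z = (x - x.mean()) / x.std()
        print(f"levels of model risk: {k}   excess kurtosis: {(z ** 4).mean() - 3:,.0f}")

Excess kurtosis is a crude tail gauge, but the pattern is the point: every extra layer of "confidence in our confidence" blows the tails out further.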

Statistics is an illusion.


It is still the Linux of statistical software. Rstanarm is getting quite good though for R users.


It's pretty common in large tech companies to see this but agreed it is not a super intuitive abbreviation.


I did not know that. First time seeing these abbreviations. Thx.


It was great for us at UC Santa Cruz. The reason I had healthcare during grad school was because the union won it right before I joined. The reason they have a housing stipend now (in the most expensive rental market in the US[1]) is because the union fought for a cost-of-living allowance. We at UCSC didn't always agree with the course of the larger UAW 2865 but they did a lot for us.

I'm not sure what the system was like in the union in Wisconsin but I'm surprised that more STEM students didn't join and change the course of the union if they were that negatively affected. Our union was democratic almost to a fault but maybe the structure in Wisconsin was different.

[1]https://www.sfchronicle.com/realestate/article/most-expensiv...


When was this? Just before the pandemic, the grad students protested at UAW meetings and went on a wildcat strike after UAW ratified a contract the campus voted against.

Later, Janet Napolitano released police drones and set up barricades to try to shut down the picket lines. Eventually COVID ended the drama, but only after some students were deported (I assume; the plan was to deport them, but the story stopped making news once the 2020 lockdowns hit).

Anyway, the UAW was a similar disaster at UC Berkeley a while back. There weren’t widespread protests, but there were salary caps for grad students, and the union eliminated health care coverage for a number of female problems (over student objections).


The wildcat strike is exactly what I was referring to as "We at UCSC didn't always agree with the course of the larger UAW 2865..."

The wildcat strike was led by the local union leadership after they abdicated their official positions iirc. Having that previous level of organization and identified leadership certainly made organizing wildcat actions easier.

Unions are more than just the highest level of leadership.


Do you have sources for those claims? I don't have any knowledge here; just cursory googling indicates the issue was a lot more complex than UAW being the bad guy.


Female problems? Are you serious? That is a pretty negative way to describe health issues that might apply only to women. Why would you put it that way? It's just kind of dismissive.


How would you have put it, rnk?


Health concerns?


> The reason I had healthcare during grad school was because the union won it right before I joined.

I’m sorry if this is weird, but as someone who also went to UCSC for grad school I found this a bit confusing. So I looked it up, and you started at UCSC in 2014, yeah?

UCSC grad students had GSHIP coverage for years before that time. I myself was on it when I joined in 2009, and there’s plenty of documentation of fights folks had over trying to get better rates and coverage on GSHIP well before both our times: https://www.indybay.org/newsitems/2007/05/13/18415831.php (Which, personally, I thought was pretty good, especially after the expansion of airlift coverage; needing an airlift was an unfortunately common problem given UCSC’s location “over the hill” from many tier 1 emergency rooms.)

Maybe I missed something when I was there 2009-2015. But what did the union representation and bargaining bring to the table there?

As of a couple years ago, it doesn’t seem to have resulted in anything close to a reasonable or even livable stipend for a researcher. It was bad when I was in grad school, but I was pretty appalled to hear during the wildcat strikes ten years later that, despite the increase in costs, there didn’t seem to be much change in the stipend amounts for graduate researchers. The students who were wildcatting out of frustration seemed to have a pretty good reason, IMO.

I think that fits a pretty similar pattern of unions focusing on fighting about healthcare while leaving wages to stagnate over years of price increases, which I guess also applies in many unrepresented UC roles and in dynamics elsewhere. I personally didn’t see much difference between UAW’s representation and not when I was there, but I guess I didn’t have a huge point of comparison.

I hope this new swell of support, whatever it is, provides livable stipends for young researchers though. So I hope I’m either wrong, or that grad student unions are able to win more in bargaining in the future. :)


Nice to see a fellow slug! I think you are correct on the timeline being further back. The narrative I recalled was that there was a major victory around health care fee remission before I joined but it looks as if that was part of the original contract the union negotiated [1].

I spent my final years at UCSC working through the systems they had set up internally (administration meetings with GSA, getting on committees of administrators as a grad student voice, working with on campus housing developers[2]) in order to improve housing availability and cost. We had marginal wins if anything. The strike the next year won everyone thousands of dollars toward housing every year. I understand the nuance of it being a wildcat strike but the entire organizing infrastructure there was from the union.

I agree with your final points and hope stipends will follow upwards in the near future.

[1] https://livinghistory.as.ucsb.edu/tag/uaw-local-2865/
[2] https://payusmoreucsc.com/history-of-students-attempts-to-en...


Likewise! Those housing stipends seemed sorely needed. Nice work on organizing around them and thanks for sharing your perspective on it! :)


I'm confused. The article that you linked says San Francisco is the most expensive rental market. However, you speak about UC Santa Cruz. The campus is about 120 km south. It's a totally different rental market. Do I misunderstand?


First line: “Santa Cruz County has vaulted over the San Francisco area as the most expensive market in the country for renters”


It’s an hour and 14 minute drive. It’s not unheard of to commute from there, and it’s far more beautiful than any of the closer beaches, so yes the Bay Area rental prices still apply, in addition to beach community rental prices. As far as calling it SF, some people shorten “The San Francisco Bay Area,” which includes San Jose, to “San Francisco,” and people still know what they mean, though I see why it can be confusing.


I wrote about this exact topic a few years back: "Algorithmic Bias is Not Just Data Bias" (https://aaronlspringer.com/not-just-data-bias/).

I think the author is generally correct but there is a lot of focus on algorithmic design and not on how we collectively decide what is fair and ethical for these algorithms to do. Right now it is totally up to the algorithm developer to articulate their version of "fair" and implement it however they see fit. I'm not convinced that is a responsibility that belongs to private corporations.


> I'm not convinced that is a responsibility that belongs to private corporations.

Private corporations are, by and large, the entities which execute their business using these algorithms, which their employees write.

They are already responsible for business decisions, whether made using computers or otherwise. Indeed, who else would possibly manage such a thing? This is tantamount to saying that private corporations should have no business deciding how to execute their business: definitely an opinion you can have, it's just an incredibly statist, central-planning opinion in the end.


> Indeed, who else would possibly manage such a thing? This is tantamount to saying that private corporations should have no business deciding how to execute their business

No business is allowed to discriminate against protected groups. That's arguably a third-party standard for fairness, but I don't think this qualifies as central planning.

I see no reason why other types of third-party standards would be impossible or infeasible for machine learning applications.


One of the first papers I read in this area was very interesting in this regard (https://crim.sas.upenn.edu/sites/default/files/2017-1.0-Berk...). I think the challenge is that a business (e.g., the maker of COMPAS) can certainly take a position on which definition of algorithmic fairness it wants to enforce, but the paper mentions six different definitions of fairness, which are impossible to satisfy simultaneously unless base rates are the same across all groups (the "data problem"). Even the measurement of these base rates can itself be biased, e.g., by over- or under-reporting of certain crimes. And even if you implement one definition, there's no guarantee that it is the kind of algorithmic fairness that the government/society/case law ends up interpreting as the formal mathematical instantiation of the written law. Moreover, this interpretation can shift, since laws, and for that matter moral thinking, change over time.
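A toy simulation (my own construction, not from the paper; the Beta risk distributions are arbitrary) shows the base-rate conflict directly: a score that is perfectly calibrated within each group, thresholded identically, still produces very different error rates when base rates differ.

    import numpy as np

    rng = np.random.default_rng(2)

    def rates(alpha, beta, n=500_000, thresh=0.5):
        s = rng.beta(alpha, beta, n)   # each individual's true risk score
        y = rng.random(n) < s          # outcome drawn from that risk, so the
                                       # score is calibrated by construction
        pred = s > thresh
        fpr = (pred & ~y).sum() / (~y).sum()
        fnr = (~pred & y).sum() / y.sum()
        return s.mean(), fpr, fnr

    for name, (a, b) in {"group A": (2, 5), "group B": (5, 2)}.items():
        base, fpr, fnr = rates(a, b)
        print(f"{name}: base rate {base:.2f}   FPR {fpr:.2f}   FNR {fnr:.2f}")

Satisfying one definition (calibration) here mechanically rules out another (error-rate balance), which is exactly the incompatibility the paper formalizes.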

I think the upshot to me is that businesses, whether operating in criminal judicial risk assessment or advertising or whatever, don't really make obvious which definition (if any) of fairness they are enforcing, and thus it becomes difficult to determine whether they are doing a good job at it.


Maybe I wasn't very clear: I don't think every single machine learning model should be subject to regulation.

Rather, I view it more along the lines of how the US currently regulates accessibility standards for the web or enforces mortgage non-discrimination in protected categories. The role of government here is to identify a class of tangible harms that can result from unfair models deployed in various contexts and to legislate in a way that ensures those harms are avoided.


I wonder what the countermeasures to this are.

If you trained a model to predict the outcome purely from the protected class and it was successful (in terms of predictive power), does that mean fairness is effectively impossible?

e.g. if you trained an educational performance predictor on wealth of parents, then I'd guess it would do reasonably well. And there is the argument that your parents are rich because they're smart and you are genetically connected to them.

But there's obvious counterexamples, like children adopted by rich families or children of refugees (who may have been professors or surgeons in their home country).

So if we can't avoid the bias in that extreme example, then adding extra data is only going to bury that truth under confusion.

I'm not sure we're ready to admit to ourselves that we disadvantage the children of the poor, which will make this whole AI bias thing a tricky conversation to have.
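To illustrate (a hypothetical synthetic setup of my own, not real data): even if you drop parental wealth from the feature set, any correlated proxy, say zip code, hands most of the signal right back to the model.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(3)
    n = 50_000
    wealth = rng.normal(0, 1, n)                # the sensitive attribute
    zipcode = wealth + rng.normal(0, 0.5, n)    # proxy correlated with wealth
    score = 0.8 * wealth + rng.normal(0, 1, n)  # outcome actually driven by wealth

    for label, feat in [("wealth itself", wealth), ("the zip-code proxy", zipcode)]:
        X = feat.reshape(-1, 1)
        r2 = LinearRegression().fit(X, score).score(X, score)
        print(f"R^2 using {label}: {r2:.2f}")

Excluding the protected variable doesn't remove the bias; it just launders it through whatever correlates with it.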


The article talks about this directly. The service purports to know the difference between prerecorded and live voices.


As a PhD candidate in my last years I was paid 2500 gross per month (for 9 months of the year), take-home was 2250 or so.

70% of net is 1575. 1575 is 63% of my gross pay.

Remember, these workers are paid so little they are in the bottom tax brackets at 10-12%.


Can you update the article by holding out a portion of the training set and then using it as an unseen test set?

Otherwise it's impossible to make the comparisons to other results at the end of the article.


Yeah, was the model trained using train/test splits? Otherwise, the model has likely been severely overfit.

I wonder how this performance would have compared to a simple random forest or MLP model.


I was also curious. Using an out-of-the-box weighted random forest, I got an f-score of ~.85 using a 75:25 stratified train-test split.
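Roughly what that looked like, as a sketch (sklearn defaults, nothing tuned; make_classification is a synthetic stand-in so the snippet runs on its own, so the printed number won't match the ~.85 I got on the article's actual data):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split

    # Synthetic imbalanced stand-in for the article's dataset.
    X, y = make_classification(n_samples=5_000, weights=[0.9, 0.1], random_state=0)

    # 75:25 stratified split so the class balance is preserved in the test set.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0)

    # "Weighted" random forest: class_weight="balanced" upweights the rare class.
    clf = RandomForestClassifier(class_weight="balanced", random_state=0)
    clf.fit(X_tr, y_tr)
    print("F1 on the held-out 25%:", round(f1_score(y_te, clf.predict(X_te)), 2))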


Do measurements of 'screentime' normally include talking on the phone or passively listening to music? Or are you implying something different?


Screen time might encompass video chatting with a relative, reading a book on the Kindle app, watching a 2 hour long documentary, watching a 10 minute long tutorial on YouTube, writing an email, editing pictures of one’s vacation, managing one’s finances through a webpage or spreadsheet, etc etc etc.


Sure, there are always things you can do on a screen that are productive. But you need to weigh the difficulty of managing these activities against the potential upsides that only a screen can provide.

