
In many cases, I'd agree with you. But in the context of what happened here, the fact that the company was Chinese (or simply foreign, for that matter) is absolutely relevant. A few salient points:

1. The company was "unaccustomed to the rigor of American construction rules" and "lacked basic quality control", leading Caltrans to "relax U.S. standards when the firm couldn’t finish fast enough."

2. "ZPMC violated the job contract by delivering key documents in Chinese instead of English. ABF lacked sufficient quality-assurance staff to speak directly to its own subcontractor – also a contract violation."

3. "Counting the money spent on travel and living costs for Caltrans and its contractors, the suspension span consumed much more than the $250 million in ZPMC’s assumed efficiencies that made the Chinese steel so cost-effective." On living costs alone, Caltran "paid about $50,000 annually per person to rent more than a dozen well-appointed rooms."

Asked for justification, "Caltrans described the accommodations as 'reasonable and appropriate' in a written statement. 'The hotel provided a government rate that was comparable to rates at other western hotels,' and followed bargaining agreements, based in part on providing adequate 'safety and support for employees far from home.'"


If you want to know how to do something, or how something works, you'll use a different, non-Google indexer that searches, say, all of the "how to" sites: e.g., Wikipedia, Stack Exchange, About.com, etc.

I've gradually started to do this using DuckDuckGo's bangs [1]. It actually works pretty well if I know exactly which site I want to search. I do miss Google's ability to filter by time, though.
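For instance, with two of the standard bangs:

    !w bayes theorem        (searches Wikipedia)
    !so python decorators   (searches Stack Overflow)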

[1] https://duckduckgo.com/bang.html


Hail there, fellow out-of-liner in sub-zero downvote land! I guess by not toeing the party line about Google, we're getting downvoted.

Nevertheless, I wanted to take the time to thank you for your information about DDG's bangs. I've been using DDG since Snowden, but didn't know about bangs! They look very useful and are shorter than typing "site:" in Google. It kind of reminds me of multireddits.


The FDA has more than likely directly prevented hundreds of millions of deaths worldwide. It's literally one of the very few safeguards you can rely on to know that you're not eating poison.

Amusingly, the top comment on a front-page HN post [1] right now concerns one of the first court cases [2] that gave the FDA some real power. Before this case, companies could sell a poison like emetic tartar and market it as a hangover drug. Selling stuff like this still happens today in countries without an FDA equivalent. [3] It even happens in countries with a drug regulatory body, but one that's weak. [4]

As with all regulation, there's a balance between benefit and risk. As you point out, regulation carries a financial and time burden. But warts and all, the work that the FDA has done and currently does far outweighs these burdens.

> The only good thing I can find to say about the bureaucrats at the FDA is that they respond to regional regulatory arbitrage - to medical tourism - by loosening their grip and letting things through. Or at least they do when said treatments are so widely available elsewhere in the world that the risk of looking bad by continuing to block them in the US is worse than the risk of looking bad because one of the outcomes of a normal, low error rate in medicine gets picked up by the press. Look at simple stem cell transplants, now available in the US these last couple of years, for a good example of this in practice.

I'm not quite sure how to respond to this rant. But I'll point out findings from a 2012 NEJM study comparing the FDA to the EU's EMA and Canada's Health Canada: [5]

"RESULTS: There were 510 applications for novel therapeutic agents approved from 2001 through 2010 — 225 by the FDA, 186 by the EMA, and 99 by Health Canada; among the applications, there were 289 unique agents. The median length of time for completion of the first review was 303 days (interquartile range, 185 to 372) for applications approved by the FDA, 366 days (interquartile range, 310 to 445) for those approved by the EMA, and 352 days (interquartile range, 255 to 420) for those approved by Health Canada (P<0.001 for the comparison across the three agencies). The median total review time was also shorter at the FDA than at the EMA or Health Canada (P=0.002). Among the 289 unique novel therapeutic agents, 190 were approved in both the United States and Europe (either by the EMA or through the mutual recognition process), of which 121 (63.7%) were first approved in the United States; similarly, 154 were approved in both the United States and Canada, of which 132 (85.7%) were first approved in the United States."

[1] https://news.ycombinator.com/item?id=7758508

[2] https://en.wikipedia.org/wiki/United_States_v._11_1/4_Dozen_...

[3] http://ijp-online.com/article.asp?issn=0253-7613;year=2007;v...

[4] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1151014/

[5] http://www.nejm.org/doi/full/10.1056/NEJMsa1200223


I'm seriously thinking about applying (actually to the Data Science program). I'm currently looking to start in the SF Bay Area as a data scientist (or data analyst, if need be) in late May to June.

But I have a question -- I've only advanced to the master's level at this point; I recently graduated with an MS in biostatistics. "PhD" and "postdoc" are written all over the site. Should I even consider applying?

Finally -- what's the best way to contact you? Should I email the email address under "Contact"? Or is there a preferred alternative?


The Insight Data Science Fellows Program is currently for PhDs only. If you have enough engineering experience, I would suggest applying to the Data Engineering program, which is open to anyone. If not, then drop me a line at jake@insightdatascience.com and I'll see what I can do to help.


Thanks for the response; I'm definitely applying to the Data Engineering program. My engineering background has been more on-the-job than from formal coursework, so I was a bit discouraged by the list of engineering disciplines in the "Accepting Applicants From" section (though I now see "Scientific Research" as one of the fields, yay!). I really hope I can participate.

eggoa: Sorry for hijacking your reply-thread!


Great to hear. We love skills learned on the job. That list is just meant to cast a wide net, so people from various backgrounds feel welcome to apply. The main takeaway is that we want people who have the right fundamental skill set; we're not too concerned about which formal discipline they learned it under.


I agree with your general sentiment. However, it only takes one member of the underrepresented group (with endorsing capability), or any member of a centrist or neutral party, to approve the comment. (And so far, it doesn't seem like pending comments can be disapproved, e.g. by a hostile party.)

Granted, it's true that by the nature of being underrepresented, the probability of that one member showing up to approve the comment may be unacceptably low. But I doubt it will be. In other words, to use your example, as long as the probability is high enough that at least one woman will show up to approve another woman's comment, there shouldn't be a problem.

Personally, I think this probability is pretty high, though, as pg mentions, this will have to be determined empirically. For example, it's entirely possible (as discussed in previous threads) that the population of active endorsers is skewed toward certain groups. Or that even a slightly <100% probability of legitimate comments being seen can ultimately compound systemic biases in the long run.
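To make that concrete, here's a toy calculation (all numbers invented) of the two effects at play:

    # Chance that at least one qualified endorser sees a given comment,
    # assuming each of n endorsers independently sees it with probability p.
    p_view = 0.05       # hypothetical per-endorser chance of seeing the comment
    n_endorsers = 60    # hypothetical number of active endorsers from the group
    p_endorsed = 1 - (1 - p_view) ** n_endorsers
    print(round(p_endorsed, 3))         # ~0.954: someone shows up

    # ...but even ~95% per comment compounds over a posting history:
    print(round(p_endorsed ** 20, 3))   # ~0.39: all of 20 comments get through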


I think it is very important not to discount the emotional complexity of forcing someone into a position where their comments must be "endorsed" by a group of people they perceive as "hostile" to them.

Even if there are people who would endorse the comment, that is not what the person leaving the comment is likely going to be "feeling" in that moment: they are reading a bunch of comments that they are bothered by--ones which none of the other people around seem to be taking serious issue with (which is a key part of the original problem statement)--and those are the people they are going to perceive as being the ones who must endorse their comment: the ones to whom they are effectively submitting themselves for "endorsement".

(Which, frankly, I think is a separate reason why this endorsement system doesn't make sense as a solution to an endemic problem: most of those problematic threads are filled with users--many of whom have high karma--who are reinforcing the negative comments; it is unclear to me why the negative comments wouldn't have an easy time getting endorsements, given that they currently have no difficulty getting upvotes.)

I thereby feel like even if this feature "worked" (and again, it isn't really clear to me why this would help, given that it isn't like these users are currently being downvoted or flagged out of the conversation) it still might not be an "appropriate" way to solve these underlying problems (which I would claim are inherently messy and emotional).


I agree with you and your parent. If this system goes into place, I'll probably spend far more time on HN than I usually do these days -- at least at first -- specifically trying to approve comments made by other women, for fear that others with enough karma will dismiss their point of view.

However, that won't stop many of the negative comments you mention, posted by and/or endorsed by high-karma members, also showing up in the thread. It only takes one high-karma approver to allow the negative stuff through and we're back to square one, but possibly excluding some underrepresented groups.

I like the idea in general; that is, I like the drive to improve the quality of HN comments. I'm not sure that flipping 'flag' around into 'approve' is the solution, but sadly I don't have any better ideas. Well, other than sentiment analysis of posts and multi-tiered karma systems (+1 Insightful, anyone?).


Abstract: http://www.jneurosci.org/content/34/12/4418.short

Full-text PDF: http://www.jneurosci.org/content/34/12/4418.full.pdf

I'm on an academic network, so I'm not sure if the PDF's paywalled...

From a cursory glance at the paper, this is what they had to say about how they kept the mice awake:

Mice were housed in a light/dark environment with lights on from 6:00 A.M. to 6:00 P.M. ... A previously validated, enriched, novel environment (Gompf et al., 2010) was used to promote spontaneous exploratory wakefulness. The short-term wakefulness (Sh Wake) period selected was 3 h during the lights-on period. With zeitgeber time 0 h (ZT0) referenced to the onset of the lights-on period, Sh Wake occurred at ZT8–ZT11, while extended sleep loss (Ext Wake) consisted of 8 h of continuous wake time at ZT3–ZT11 with 16 h intervals in the home cages after the first and second days of 8 h wake time.


Yes, the PDF is paywalled. Would you mind posting a Dropbox link to it or emailing it to me? Thanks!

This is a very interesting paper, but I need to analyze the specific method they used to keep mice awake. All that paragraph says is "we used a previously validated technique," with no details as to how specifically they woke the mice up.


I believe this is the cited paper: http://www.jneurosci.org/content/30/43/14543.full.pdf

They describe the "novel environment" starting on page 3.


When I saw "6x4", my immediate thought was a "six by four" matrix. So, by convention, this means 6 rows and 4 columns, corresponding to e.g. 6 observations of 4 variables each (from a statistician's perspective). The closest answer I could find here was "six lots of four."
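As a minimal illustration of that rows-by-columns convention (using numpy):

    import numpy as np

    X = np.zeros((6, 4))  # a "6x4" matrix: 6 observations (rows), 4 variables (columns)
    print(X.shape)        # (6, 4)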


Definitely a matrix... * is for multiplication ;)


You might enjoy this story/thread from three months ago: https://news.ycombinator.com/item?id=6815282


This is brilliant, and in large part supports my argument :) Thanks!


> Hospitals don't let uninsured people die and insuring people doesn't magically save their lives.

Not sure where you're getting this. A quick Google Scholar or PubMed search shows a consensus that mortality rate is significantly higher for uninsured than for insured. [1, 2, 3]

[1] e.g. http://jpubhealth.oxfordjournals.org/content/32/2/236.short -- On multivariate analysis, uninsured compared with insured patients had an increased mortality risk (odds ratio: 1.60, 95% CI: 1.45–1.76). The excess mortality in uninsured children in the US was 37.8%, or 16,787, of the 38,649 deaths over the 18-year period of the study. Children who were hospitalized without insurance have significantly increased all-cause in-hospital mortality as compared with children who present with insurance.

[2] e.g. http://journals.lww.com/jtrauma/Abstract/2012/11000/Undiagno... -- Undiagnosed preexisting comorbidities play a crucial role in determining outcomes following trauma. Diagnosis of medical comorbidities may be a marker of access to health care and may be associated with treatment, which may explain the gap in mortality rates between insured and uninsured trauma patients.

[3] e.g. https://www.sciencedirect.com/science/article/pii/S000296101... -- A total of 1,203,243 patients were analyzed, with a mortality rate of 3.7%. The death rate was significantly higher in penetrating trauma patients versus blunt trauma patients (7.9% vs 3.0%; P < .001), and higher in the uninsured (5.3% vs 3.2%; P < .001). On multivariate analysis, uninsured patients had increased odds of death compared with insured patients, in both penetrating and blunt trauma.


>mortality rate is significantly higher for uninsured than for insured.

correlation doesn't mean causation.

>to help make our government work better

i somehow doubt that throwing a team of "rockstars" at the mess to clean it up makes the government work better. If anything, it enables the typical government behavior we saw in the case of healthcare.gov.


> correlation doesn't mean causation.

As a statistician, I guess I should be happy that more people are aware of this. But I also think too many people take "correlation != causation" superficially. I mean, almost all of science is based on significant correlational findings, especially when the traditional way to prove causation (i.e. a randomized trial) is unethical (e.g. we can't randomly assign people to be insured vs. uninsured).

Along these lines, I often find that people who say "correlation != causation" don't stop and wonder "so how _can_ we establish causation (in a non-randomized study)?" I guess many of them can be partially excused, since the answer is non-trivial. But generally, here are a few rules of thumb for making a stronger case for causality from correlation (a rough numerical sketch of the first one follows the list):

* the effect size is relatively large (e.g. uninsured children die at 60% higher odds than insured children)

* the cause comes before the effect (e.g. people are uninsured before they go to the hospital and/or die)

* the association is reversible (e.g. the risk of dying in a hospital changes when people gain insurance)

* consistency / consensus across multiple studies (e.g. many studies showing that a difference in insurance status is associated with a significant difference in hospital mortality)

* a dose-response relationship (e.g. I didn't link examples previously, but a few studies show that different levels of insurance, from none to Medicaid to private, are associated with different rates of hospital mortality)

* plausibility (e.g. even from a qualitative point of view, it's quite believable that people who are unable to pay a hospital bill might get worse service)
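To make the first rule of thumb concrete, here's a minimal sketch (all counts invented for illustration) of how an odds ratio and its 95% CI fall out of a 2x2 table:

    import math

    # Hypothetical 2x2 table:        died    survived
    #              uninsured       a = 160    b = 840
    #              insured         c = 100    d = 900
    a, b, c, d = 160, 840, 100, 900

    odds_ratio = (a * d) / (b * c)               # ~1.71
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)        # standard error of log(OR)
    lo = math.exp(math.log(odds_ratio) - 1.96 * se)
    hi = math.exp(math.log(odds_ratio) + 1.96 * se)
    print(odds_ratio, (lo, hi))                  # CI ~(1.31, 2.24), excludes 1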


Please turn this into a blog post and submit it to HN. I'd love to see this comment about correlation get more attention ;)


I just submitted this: https://news.ycombinator.com/item?id=7317254 which is the CrossValidated (StackOverflow for stats) discussion of whether or not causation implies correlation.


All good points, but you should also consider the plausibility of the association being a mere correlation. By this I mean that there seem to be clear candidates for a common cause of both lacking insurance and high mortality, for example: income.

Once you control for this and other potential common causes, your case for causality becomes much stronger (or falls apart).


If you had read the paper linked above, you'd note they controlled for income.


Many people believe that the Oregon Health Study is the best chance we have of teasing causation out of the data. It is still in progress but it is being watched closely. http://oregonhealthstudy.org/


To clarify, this is because they actually did what would normally be unethical: they randomly assigned citizens to be eligible for the insurance plan. It was a lottery.


so are you saying not having health insurance is better for people? I'll take the common sense angle that having health insurance increases the chances of medical care being applied to a health problem, versus not having any health care insurance.

there's no such thing as a monolithic 'government' being that can either work better or be enabled to act a certain way. there are people, and groups of people, that do certain things. some of those things are worthless, some of those things are worthwhile. i'd argue the cleanup the 'rock stars' did is a generally positive activity.


>so are you saying not having health insurance is better for people?

i'm just saying that being from a poor or damaged family, being uneducated, having a mental illness or substance abuse problem, etc. usually leads to higher mortality and also to not having health insurance. Giving them health insurance [i'm all for it; i think a modern civilized society should provide a basic level of free health insurance to everybody] would be inconsequential in many cases, as not having insurance isn't the cause, just a manifestation.

>there's no such thing as a monolithic 'government' being that can either work better or be enabled to act a certain way.

http://en.wikipedia.org/wiki/Systems_theory

and if we specifically consider a closed loop system of "government + society" then this may be of interest too:

http://en.wikipedia.org/wiki/Cybernetics


I completely support the conclusion that we should be fixing our third variables.


Contrary to the current HN title, the article points out:

Evidence presented during Private Manning’s court-martial for his role as the source for large archives of military and diplomatic files given to WikiLeaks revealed that he had used a program called “wget” to download the batches of files. That program automates the retrieval of large numbers of files, but it is considered less powerful than the tool Mr. Snowden used.

So the tool wasn't wget. curl, perhaps?


Having done this type of work before for a legitimate purpose, it is almost certainly a python or perl script with a nice library in front of it that makes it easy to follow links.

wget is too brittle, not extensible enough, and not as maintainable as a nice python script.
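For illustration only (this is emphatically not the actual tool), a minimal link-following downloader along those lines, assuming the third-party requests and beautifulsoup4 packages:

    from collections import deque
    from urllib.parse import urljoin, urlparse

    import requests
    from bs4 import BeautifulSoup

    def crawl(start_url, max_pages=100):
        """Breadth-first fetch of same-host pages, following <a href> links."""
        host = urlparse(start_url).netloc
        seen, queue, pages = {start_url}, deque([start_url]), {}
        while queue and len(pages) < max_pages:
            url = queue.popleft()
            resp = requests.get(url, timeout=10)
            pages[url] = resp.text  # "download" the page
            for a in BeautifulSoup(resp.text, "html.parser").find_all("a", href=True):
                link = urljoin(url, a["href"])
                if urlparse(link).netloc == host and link not in seen:
                    seen.add(link)  # stay on one host, never revisit
                    queue.append(link)
        return pages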


I believe Manning actually used Windows batch scripting to automate wget, or so the government alleged from forensics at the trial. (I observed a couple days of the trial).

Manning did not have the tech skills of Snowden, though; she wasn't necessarily doing things in the most effective or elegant ways, but it worked.


Probably; but it could be something like lftp. Its name belies its capabilities.

Or maybe Kermit? Half-smiley; only if he's a masochist. http://www.kermitproject.org/ckscripts.html


Wget is also single-threaded, which makes it a slow way to download lots of pages.


that's what xargs is for: e.g., "xargs -n 1 -P 8 wget < urls.txt" runs up to eight wget processes in parallel.


Can you elaborate?


The other day I had the task of batch-downloading product pictures from a website. Every picture had a session ID in the URI, so I couldn't do a simple wget of the images. I wrote a simple Python script that generated a shell script containing a long series of "wget -E -H -k -p <url>" and "sleep 30" lines, and ran it through a cloud server for a couple of days. After that, some simple scripts for renaming the pictures, some regular expressions here and there, and voila: 250k perfectly named pictures for my product catalog. (It's for an intranet, so I guess I won't have copyright problems.)
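For the curious, a minimal sketch of that generate-a-shell-script approach (the product URLs here are made up):

    # Write a throttled download script: one wget per picture, 30s apart.
    urls = ["http://example.com/products/%d.jpg" % i for i in range(1, 251)]
    with open("fetch.sh", "w") as f:
        f.write("#!/bin/sh\n")
        for url in urls:
            f.write("wget -E -H -k -p '%s'\nsleep 30\n" % url)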


FYI, you have exactly the same copyright issues on an intranet. You're just less likely to get caught, I guess.


curl is just a library with a slim command-line interface. It can't scrape pages by itself. Perhaps you're thinking of curlmirror? Even then, I doubt it can be considered more powerful than a good wget configuration.


Nutch/Solr could provide a way to do a crawl, refine parameters, and then feed into a tool to download the actual resources.

