logancg · on Aug 25, 2020

And, consequently, this may have serious implications for language models that use the Scots Wikipedia corpus. [0]

[0] https://twitter.com/r_speer/status/1298297872228786176

logancg · on Feb 15, 2019

We do. If you’re interested in the architecture particulars (i.e. which layer types, layer shapes, layer parameters, activation functions, etc.) you’ll commonly see these in deep learning papers as a directed acyclic graph with layers represented by a collection of vertices or rectangles. Other architecture specifics are sometimes noted with text alongside layers. We also have more general representations of deep models which look like, and sometimes are exactly, Probabilistic Graphical Models. PGMs have their own formal language which makes it quite easy to write complex models describing a joint distribution very simply.
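To make the graph picture concrete, here is a minimal sketch of a toy architecture written down as a directed acyclic graph. The layer names, shapes, and the skip connection are hypothetical, made up for illustration and not taken from any particular paper. It uses only the standard-library `graphlib` module (Python 3.9+) to recover an order in which the layers could be evaluated.

```python
from graphlib import TopologicalSorter

# Hypothetical toy architecture. Each key is a layer (a vertex); each value is
# the set of layers whose output it consumes (its incoming edges).
layers = {
    "input": set(),
    "conv1": {"input"},
    "relu1": {"conv1"},
    "conv2": {"relu1"},
    "add": {"relu1", "conv2"},  # skip connection: two incoming edges
    "dense": {"add"},
}

# The "text alongside layers" part of a diagram: made-up annotations.
specs = {
    "input": "28x28x1 image",
    "conv1": "3x3 conv, 16 filters",
    "relu1": "ReLU activation",
    "conv2": "3x3 conv, 16 filters",
    "add": "elementwise sum",
    "dense": "fully connected, 10 units",
}


def execution_order(graph):
    """Return the layers in an order where every layer follows its inputs.

    TopologicalSorter raises graphlib.CycleError if the graph has a cycle,
    i.e. if it is not actually a DAG.
    """
    return list(TopologicalSorter(graph).static_order())


order = execution_order(layers)
for name in order:
    print(f"{name}: {specs[name]}")
```

This only captures the architecture, i.e. which layers exist and how they connect. It says nothing about what the trained network has learned.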

However, this is different from interpreting how a deep model works. “Interpreting” is an overloaded, poorly defined term and an area of active research. The sense I get from my friends in the interpretability/explainability research world is that despite the buzz, there is no common definition that lends itself to a clear set of requirements for an acceptable interpretation of a neural network.

logancg · on Feb 6, 2019

I'm econ undergrad -> DS -> Machine learning. Econ is very useful for data science if you focus on the right subjects: statistics, math, and experimental design. You get all the hard skills you need to interact with data that a statistician or computer scientist gets, with the (significant, unique) benefit of learning how to ask the right question or design the right experiment given what is likely a messy, weird, social scientific question.

On the other hand, if you don't do any quantitative, empirical, or experimental economics -- i.e. you only do theory or political econ -- then you won't pick up these skills (as much).

logancg · on Jan 8, 2019

How do you come to the conclusion that ML is a buzzword here? It's natural to publicize interesting new research that grounds a field -- as statistical learning theory does for machine learning. Historically, Ben-David and co-authors have produced foundational work on the implications of SLT in ML.

dooglius · on Jan 8, 2019

The problem is that the model was invented in this paper itself, and doesn't actually seem to relate to any actual ML algorithm. Admittedly, I didn't read the paper itself, so this may be overreach on the article's part.

zwaps · on Jan 9, 2019

I don't think that can be true. If it were, it would go like this:

Hey guys, here's an algorithm. Oh by the way, we cannot prove that this algorithm does something useful, and neither can you.

gets published in Nature

Talk about bang for the buck

logancg · on Jan 8, 2019

Deep Graph Library (DGL) [1] came out in November and I've heard good things. It looks easy and intuitive.

[1] https://github.com/dmlc/dgl

logancg · on Jan 8, 2019

For more, I highly recommend looking at the accepted papers from the NeurIPS 2018 Relational Representation Learning workshop. [1] I really enjoyed the workshop and I hear workshops tend to represent a (rough) frontier of the subfield.

[1] https://r2learning.github.io/

logancg · on Jan 8, 2019

There are a lot of talented ML researchers in China. This is a product of (a) the government and major companies (i.e. BAT) investing heavily in fundamental ML research (b) the population size (c) a long tradition of STEM-focused education in China. So, it's not surprising that would be the case.

The interesting questions are whether China is uniquely focused on deep learning over other ML techniques, and how Chinese research compares in terms of quality. Anecdotally (speaking as a researcher in the field) papers from Chinese institutions seem disproportionately focused on deep learning (whereas, for example, the UK does great work in Bayesian ML and the US does disproportionately well in NLP). I'm not a deep learning researcher so I can't judge the technical merit, but I was just at NeurIPS in Montreal, and I saw about equal representation of Chinese institutions as South Korean ones. South Korea, with ~1/25 the population, punches way above its weight per capita.

taneq · on Jan 8, 2019

I thought the question was more "why arXiv and not other journals" in which case maybe 'prestigious' Western journal publications just aren't valued as much in China?

nl · on Jan 8, 2019

Almost all ML research is published on arXiv.

In ML (as in most of Comp Sci) conference proceedings (NeurIPS, ICML etc) are where the prestige publishing is.

HuShifang · on Jan 8, 2019

I can't speak specifically to this topical domain, but generally speaking (assuming the research topic isn't politically sensitive) there are professional incentives to do transnational work -- whether it's publishing in Anglophone journals, organizing international conferences, etc. So probably, it's not that foreign journals are valued less. (Probably, in fact, the contrary.)

lettergram · on Jan 8, 2019

Perhaps their papers are disproportionately not accepted to conferences so we see many more on arXiv

logancg · on Oct 1, 2018

I'm curious: if you think so much about climate change and its effects, and even plan your actions and location around it, have you considered directing your energy to combatting climate change with your skills on a systemic level? (For example, building sustainability companies, being part of climate activism, earning-to-donate?)

I'm in a similar spot but optimistic about systemic action.

madaxe_again · on Oct 1, 2018

I’m doing consultancy at a CTO level for a number of medium/large environmental technology companies while I travel, comprising everything from supply chain management around hazardous materials to sustainable living systems to pipeline management - and I’ve been contributing photography and footage to documentarians as I go.

I’m not optimistic about systemic action - the people taking it seriously don’t have the power to overcome a far more dominant system: our consumption-driven way of living, and the economies built around it. Mindshare may grow sufficiently in coming decades, but by then it’ll largely be too late, and that growth will likely come from more people witnessing and enduring the effects of climate change - and if we’re at that point, it’s already too late, as systemic collapse beckons as I laid out in my previous.

The climate is a chaotic system - it finds a metastable mode for a given set of inputs, and once moved by changing those inputs (e.g. CO2 ppm) from that point of stability can wander drastically before finding that point again. As forests die, as methane is belched from the thawing permafrost, as marine ecosystems collapse (everywhere is overrun by jellies - even Antarctica), as ice caps and glaciers melt, as coastal plains flood, as grassland turns to desert, core components of the current system of stability are yet further altered, thus pushing the current mode yet further from its stable point and making that stable point harder to reattain.

Like I said, get a climatologist talking in private, and it’s grim listening. They see the bigger picture in their work, but only publish studies on specific observations. Longitudinal studies end up being conservative; otherwise they sound radically alarmist and don’t get published.

Anyhow. We should keep on playing while the ship sinks.

bjourne · on Oct 1, 2018

There were large scale demonstrations held on September 8. Hundreds of thousands of people marched and demanded change around the world. You didn't hear about it? Not strange. Zero media coverage.

CalRobert · on Oct 1, 2018

tl;dr I got tired of people saying I was "lucky" that I could ride a bike to work and made a tool to help them do the same thing. It shows flats for rent on a map with transport and bike routes instead of roads.

https://www.gaffologist.com/ (Ireland-only)

Sorry if it sounds like a plug but it's the best answer I have to your question. I studied physics in college and hoped it would help me work on energy efficiency or renewable energy generation technology, but it turns out that when I graduated (into the worst recession in a century) nobody had any interest at all in what I was able to offer. I'd still love to work on saving the planet if I could do so while paying rent or a mortgage.

jacquesm · on Oct 1, 2018

Super neat, and cool to see you act.

logancg · on June 18, 2018

Sounds interesting. The two big barriers to this system are:

1. Political buy-in to set a cost of carbon emission

2. Economic viability of removal (often referred to as carbon capture and storage, or CCS)

1 is non-trivial, but happening. Canada has a provincial carbon tax measure which requires setting a price. A price is also defined in the European cap-and-trade system.

2 is also non-trivial, as the economics are pretty bad right now. However, CCS costs are projected to decrease rapidly. [0]

Right now, integrating CCS is on average more expensive than purchasing carbon credits. Carbon offsetting companies are, to your point, an example of private enterprise filling a gap.

[0] http://www.ccsassociation.org/why-ccs/affordability/

logancg · on June 14, 2018

This is a great idea Ben, and I appreciate the work you do. Do you see Kaggle datasets as a tool to encourage better data formatting, or are you also thinking about building tools for automatically visualizing, cleaning, and organising data?

benhamner · on June 14, 2018

All of the above, and more! One thing I'm really excited about that we're about to release is a much better explorer for tabular data (automated histograms, sorting/filtering/showing the data, and the like).

We also encourage sharing analytics code and visualizations that users create on the data back to the community. For example, see all these visualizations and insights in StackOverflow's developer survey data linked from https://www.kaggle.com/stackoverflow/stack-overflow-2018-dev...