If this truly is an attempt at censorship, it seems that shadow banning could have been just as effective and harder to detect. Maybe they could even secretly add a generic has_negative_sentiment_towards_an_elon_company modifier to their ranking algorithm and dial the reach of such tweets way, way down, in a similar vein to author_is_elon.
Indeed, but Elon's Twitter is a case study in low-effort trolling.
For example, Twitter manually overrode the blue check system to make it look like critics had purchased it. Then someone realized changing their name would clear the checkmark. After dozens of iterations of Dril changing his name and having the checkmark manually re-added, someone at Twitter thought to disable name changes for his account.
At every step of the way they used the shittiest possible solution. I'm unsurprised the pattern continues.
There's no reason to think Musk doesn't doomscroll Twitter looking for negative stuff about him to nuke. There doesn't have to be a galaxy brain plan here.
I really like the idea of automated code review tools that point out unusual or suspicious solutions and code patterns. Kind of like an advanced linter that looks deeper into the code structure. With emerging AI tools like GitHub Copilot, it seems like the inevitable future. Programming is very pattern-oriented, and even though these kinds of tools might not necessarily be able to point out architectural flaws in a codebase, there is probably a lot of low-hanging fruit in this area and plenty of opportunities to add automated value.
Consider that you may be describing a compiler. Typos are not generally a problem in statically typed languages, with notable exceptions such as dictionary key lookups, etc.
Even without static typing, argument count verification etc. can be done with a suitable compiler. In Python, we are left chasing 100% code coverage in unit tests, as it's the only way to be certain the code doesn't include a silly mistake.
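For illustration, here is a minimal sketch (with made-up function names) of the kind of silly mistake that nothing flags in Python until the offending line actually runs:

```python
# Illustrative only: the call below is missing an argument, yet the module
# imports fine; the TypeError only appears when this code path executes.

def send_invoice(customer_id, amount, currency):
    return f"Invoiced {customer_id}: {amount} {currency}"


def rarely_used_code_path():
    # Bug: 'currency' is missing. Only running this line raises TypeError,
    # which is why coverage (or a linter/type checker) is the safety net.
    return send_invoice("cust-42", 99.0)


if __name__ == "__main__":
    try:
        rarely_used_code_path()
    except TypeError as exc:
        print(f"Caught only at runtime: {exc}")
```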
I think 100% code coverage is folly. Spreading tests so widely near-inevitably means they're also going to be thin. In any codebase I'm working on, I would focus my attention on testing functions which are either (a) crucially important or (b) significantly complex (and I mean real complexity, not just the cyclomatic complexity of the control flow inside the function itself).
Fully agree, but I never want to see a missed function argument programming error in customer-facing code. In Python you really do need code coverage to achieve this goal; statically typed languages give you some additional flexibility here.
Or a rich suite of linters religiously applied. Never save a file with red lines in flymake or the equivalent. Edit: actually, I am unsure whether my current suite would catch a missing required parameter. I tend to have defaults for all but the first parameter or two, so it's not a big issue for me, I guess. I do like a compile-time check on stuff though, which is one of the reasons I am doing more and more tools in Go.
I actually recently joined a startup working on this problem!
One of our products is a universal linter, which wraps the standard open-source tools available for different ecosystems, simplifies the setup/installation process for all of them, and a bunch of other usability things (suppressing existing issues so that you can introduce new linters with minimal pain, CI integration, and more): you can read more about it at http://trunk.io/products/check or try out the VSCode extension[0] :)
Cool product :) Is it just linting, or do any of the tools do code transformation to offer a fix for the lint failure? (Code Review Doctor also offers the fix if you add the GitHub PR integration.)
This is basically linting, i.e. code analysis. The techniques used might be more current (they have been evolving, as you say, for pattern matching), but linting is just that: a code review tool for finding the usual bugs. (That is what happened in this blog post: it wasn't looking for unusual solutions but for usual mistakes.) The packaging and form of the feedback also seem different, and that in itself can make a lot of difference in ease of use and thus adoption.
Admittedly, the difference here is that codereview.doctor spent time tuning a custom lint on a variety of repos. In an org with a sufficiently large monorepo (or enough repos, but I don't really know how the tooling scales there) it's possible to justify spending time doing that, but for most companies it's one of those "one day we'll get around to it" issues.
Or people could just write it correctly in the first place! Controversial I know! Seems like people would rather half-ass things and then let some AI autocorrect fix it up for whatever reason rather than doing it properly.
Almost exactly 10 years ago I picked up climbing/bouldering and I cannot recommend it enough. It's a fascinating sport combining elements of pure strength, coordination, technique and problem solving. Every route is unique, and it's not uncommon to see vastly different approaches depending on your height, finger strength, flexibility, and many other parameters. Finding a solution that works for you can sometimes be more difficult than the climb itself. It's definitely not as boring and repetitive as weightlifting. I highly encourage everyone to try it if they haven't yet.
Totally agree. I picked up climbing several months ago, so I'm still green, but it's a night and day difference from other forms of exercise for me. I've mostly gravitated toward solo sports because I enjoy competing against myself more than competing with others, but running, cycling, weightlifting, etc. always felt like a chore and I quickly became bored with the repetition. In addition to being physically dynamic, I've found it works well as a social form of exercise too. There are built-in incentives to befriend a wide pool of fellow belayers, and we all climb at different levels, so there doesn't seem to be much in the way of interpersonal rivalry, just love of the sport.
I think I found one more minor typo – in the Control Flow Graph section I believe "The possible number of calls to c is zero to infinity;" should be "zero to one" instead, as the flow terminates as soon as it reaches "c", so there is no chance for it to be greater than one.
I imagine we have tools in that direction, but nothing complete. Unlike math and computers, biological systems don’t really go from a uniform set of simple rules to emergent complexity - there is a whole lot of sideways complexity thrown in.
Something that might fit the computational vision of your comment: the various ontologies for bioinformatics. The Gene Ontology is probably the most complete, although it lags many years behind the literature.
There's also a tutorial [0] which goes through these examples one by one explaining in detail how each concept works. I highly recommend it to anyone interested in learning this framework.
> Writing isn't so bad really when you get through the worry. Forget about the worry, just press on. Don't be embarrassed about the bad bits. Don't strain at them. Give yourself time you can come back and do it again in the light of what you discover about the story later on. It's better to have pages and pages of material to work with and sift and maybe find an unexpected shape in that you can then craft and put to good use, rather than one manically reworked paragraph or sentence.
> But writing can be good. You attack it, don't let it attack you. You can get pleasure out of it. You can certainly do very well for yourself with it...!
Yeah, I'm in the middle of editing a novel I wrote late last year. A big thing I figured out was that it's okay to leave out some of what a final draft needs, because I can do it later. Right now I'm adding richer descriptions to a lot of my scenes because I just didn't bother originally; simply not worrying about it let me focus on the things I knew I needed, and now I'm fixing it.
Not trying to do everything at once makes writing prose way easier.
The following values were inspected:
1
2
4
8
16
32
64
32
48
40
44
42
43
The most surprising part for me is that in the integer search 32 is inspected twice. From my brief testing it seems to only happen with infinite ranges. Is that a bug in bsearch or am I missing something?
With a finite range, you can bisect directly by splitting in the middle of the range.
With infinite ranges you can't do that, so the usual way is to start with a small number and increase exponentially until you find a number that is too large, which is what is done here. Once you have that number, it becomes the upper bound of a finite interval.
So that's a two step process, which we can see here. The first 32 is in the exponential growth step (so is 64), and the second one is in the bisect step.
This will always happen exactly once (unless the expected result is 0) and only for the first pivot in the bisect, so it's not that bad; but indeed, they could get rid of it by bisecting on [n/2; n] instead of [0; n], as they already know that n/2 (and numbers lower than n/2) isn't a valid candidate from the first step.
> they could get rid of it by bisecting on [n/2; n] instead of [0; n], as they already know that n/2 (and numbers lower than n/2) isn't a valid candidate from the first step.
It shouldn't be a problem, but technically you are right that Ruby does one more comparison than needed. My guess is that it would mean keeping the previous value of mid (as in `bsearch_integer_range(prev_mid, mid, 0)` instead of the current `bsearch_integer_range(beg, mid, 0)`), which might be annoying to do in C.
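For anyone curious, here is a Python sketch of the two-phase scheme described above (not Ruby's actual C implementation): exponential growth to find an upper bound, then bisection. With `pred = lambda n: n >= 43` it inspects the same sequence as the list above, including 32 twice.

```python
def unbounded_search(pred):
    """Smallest positive integer n with pred(n) true, assuming pred is
    monotone (false ... false, true ... true). Sketch only."""
    # Phase 1: exponential growth until the predicate holds.
    hi = 1
    while not pred(hi):          # inspects 1, 2, 4, ... until true
        hi *= 2
    # Phase 2: bisect. Starting at lo=0 re-tests hi//2 (the "32 twice"
    # effect); starting at lo=hi//2 would skip that redundant check.
    lo = 0
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid
        else:
            lo = mid
    return hi

print(unbounded_search(lambda n: n >= 43))  # -> 43
```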
I love the article, but I don't agree with the premise that machine learning equals neural nets. In my understanding, machine learning is a very broad term that could just as well be applied to the polynomial model if the constants were optimized algorithmically. I feel like the presented argument is more about transparent vs. opaque models than about machine learning vs. something else. Also, one could argue that the polynomial model is just a perceptron[0].
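As a minimal illustration of that broad sense (a sketch using numpy's stock polynomial fitting on synthetic data, nothing from the article itself): the moment the constants are chosen by an optimizer instead of by hand, the polynomial model is "learned" from data.

```python
import numpy as np

# Fit polynomial coefficients from (synthetic) data: once the constants
# come from a least-squares fit rather than hand-tuning, this is already
# "machine learning" in the broad sense.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200)
y = 2.0 * x**3 - 0.5 * x + rng.normal(scale=0.1, size=x.shape)

coeffs = np.polyfit(x, y, deg=3)   # least-squares fit of the constants
model = np.poly1d(coeffs)

print("fitted coefficients:", coeffs)
print("prediction at x=0.5:", model(0.5))
```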
The machine learning course at my university starts out with polynomial regression, estimators, statistics of classification, etc. Neural networks are only one tool in a large toolbox.
But they are all the rage and it is no surprise that a lot of people want to play with them.
Cynically, neural networks are easier as you don't really have to think about your model. Give some examples with some classes and you're done. Or give examples of one class and let the neural net generate new ones. Doing away with the abstraction beforehand is an enticing prospect.
> Cynically, neural networks are easier as you don't really have to think about your model. Give some examples with some classes and you're done.
This way of thinking about it leads directly to things like statistical redlining.
It's also not specific to neural networks. I take a similar approach with logistic regression. Except that I like to replace the "and you're done" step with, "and you're ready to analyze the parameters to double check that the model is doing what you hope it is." Even when linear models need some help, and I need to do a little feature engineering first, I find that the feature transformations needed to get a good result are generally obvious enough if I actually understand what data I'm using. (Which, if you're doing this at work, is a precondition of getting started, anyway. IMNSHO, doing data science in the absence of domain expertise is professional malpractice.)
There is no, "and you're done" step, outside of Kaggle competitions or school homework. Because machine learning models in production need ongoing maintenance to ensure they're still doing what you think they're doing. See, for example, https://research.google/pubs/pub43146/
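For concreteness, here is a toy sketch of that "fit, then inspect the parameters" habit, assuming scikit-learn is available and using one of its bundled datasets as a stand-in for real data:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Fit a plain logistic regression, then look at the coefficients to check
# that the influential features make sense given domain knowledge.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X, y)

coefs = pipe.named_steps["logisticregression"].coef_[0]
for name, weight in sorted(zip(X.columns, coefs), key=lambda t: -abs(t[1]))[:5]:
    print(f"{name:30s} {weight:+.3f}")
```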
I had always thought of neural nets in terms of the massive connected graph that, in my head, somehow behaved like a machine.
Then I realized that in the end it's just a representation of a massive function, f: R^m -> R^n, which needs to be fitted to match inputs and outputs.
I know this is not precisely correct and glosses over many, many details - but this change in viewpoint is what finally allowed me to increase the depth of my understanding.
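For what it's worth, here's a tiny numpy sketch of that viewpoint: a two-layer net written out as what it is, a function from R^m to R^n built from affine maps and a nonlinearity (weights are random here; training would just adjust them to fit data).

```python
import numpy as np

rng = np.random.default_rng(0)
m, hidden, n = 4, 8, 2
W1, b1 = rng.normal(size=(hidden, m)), np.zeros(hidden)
W2, b2 = rng.normal(size=(n, hidden)), np.zeros(n)

def f(x):                      # f: R^m -> R^n
    h = np.tanh(W1 @ x + b1)   # affine map + nonlinearity
    return W2 @ h + b2         # another affine map

print(f(np.ones(m)))           # a point in R^n
```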
It's unclear that there is such a thing as an NN, and in any case, that it is graph-like.
What are the nodes and edges?
There is a computational graph which corresponds to any mathematical function -- but it is not the NN diagram -- and it is not very interesting (e.g., addition would be a node).
> Cynically, neural networks are easier as you don't really have to think about your model. Give some examples with some classes and you're done. Or give examples of one class and let the neural net generate new ones. Doing away with the abstraction beforehand is an enticing prospect.
If you're trying to solve a well understood business problem sure but my issue with this is that you pigeonhole yourself and your solution. I'm much more interested in understanding the model than doing the implementation because that allows you to build on top of what you get out of the box in a framework for example. It's like learning React before learning Javascript. It might be a good short term solution but long term it certainly isn't.
Oh, I was not defending neural networks. This was the cynical sales pitch for the case where you don't want to employ mathematicians or computer scientists, but just throw code and computational resources at the problem.
But isn't that an important part of the value of neural networks? Mathematicians are expensive so we'd like a computer to make a model for us, just like drivers are expensive so we want self-driving cars.
The issue with that is that NNs fail in some really interesting ways, so you still need a lot of effort to get a robust solution. Remember, after some serious investment by many organizations, self-driving cars are still in development. At the same time, a few people have demonstrated a basic system that seems close without nearly that much investment. Unfortunately, the difference between a demo and a working solution can be several orders of magnitude.
Could a person with ML experience come up with this solution? Yes! Would his ML experience help him come up with this solution compared to someone who just learned numerical methods and automatic control theory? No. This isn't an ML solution.
Just because something is taught in an ML course doesn't mean that it is ML. It is pretty common for physics classes to teach maths and for chemistry classes to teach physics for example.
So if something is taught in ML class but also in statistics class then it is statistics and not ML. If something is taught in ML class but also in a numerical methods class then it is numerical methods and not ML.
Well... I guess most people equate ML with AI and use these terms interchangeably.
If you just replace ML with AI everywhere in this article it is going to make sense.
The article has other problems, one being the main premise.
The problem isn't to drive a car around a track (which is what the polynomials did) but rather to write a program that can figure out how to drive a car without you knowing how to solve it.
Well that depends on your definition of AI. Which isn't well defined. We call AI what we perceive as "magic". Black box algorithms have a higher chance of being perceived that way (e.g. neural nets). When you get some insight into how an algorithm works (easier for transparent box algos, but same holds for black box algorithms), you start to see it less and less as "magic", and, consequently, you're less likely to refer to it as an (artificial) intelligence. Because ultimately, that's what we mean by intelligence -- magic. When we say that something is intelligent, we liken it to ourselves: it evokes a sense of identification. It all comes back to a sense of humans being fundamentally separate from "the other" (computers in this case). If we saw the mathematical models and algorithms as just that, we wouldn't call them AI. Also, if we didn't think of our intelligence as more than the behaviour of our biological computer, we wouldn't be enchanted by the concept of non-biological systems mimicking some of our behaviour.
We don't find these systems intelligent because, on inspection, they aren't.
We are intelligent. Not "magically", but actually nevertheless.
Our intelligence, and that of dogs (mice, etc.), consists in the ability to operate on partial models of environments, to respond to them dynamically, and to skilfully respond to changes in them.
This sort of intelligence requires the environment to physically reconstitute the animal in order to non-cognitively develop skills.
It is skillful action we are interested in, and precisely what is missing in naive rule-based models of cognition.
You provided an illustration of "magic". It's important to realise that you don't need a complex algorithm to produce complex behaviour (see Stephen Wolfram and his work on cellular automata).
In my understanding AI is an even broader term and means "any solution that imitates intelligent behavior". E.g. expert systems which are pretty much a bunch of if-then rules are also considered AI.
It's my understanding as well; many things that a modern programmer thinks of in terms of "computation" were once considered to be "AI". Lisp and Prolog were "AI", and even the A* algorithm is still considered a rudimentary form of "AI" in textbooks just because it uses heuristics. There's a joke that says "every time AI researchers figure out a piece of it, it stops being AI" [0].
It's why I use "AI" and "ML" interchangeably although I know it's technically incorrect - the formal definition doesn't match what people are currently thinking.
There have traditionally been different approaches and definitions for AI. Some emphasize behaviour while others emphasize the logic behind the behaviour. (In some sense, while expert systems of course were an attempt at getting practical results, they might also have been an attempt to implement what was seen as human reasoning, while e.g. black box machine learning could be more about just getting the behaviour we want.) Some approaches view agents as intelligent if their action resembles humans or other beings that we consider intelligent, while other approaches are merely interested in whether they perform well at a specified task, perhaps more so than humans.
So yes, "any solution that imitates intelligent behaviour" is probably right, but with nuances with regard to what that actually means.
That's not symbolic AI though. That's only statistical methods. The statistical methods are all the rage now, but explainable AI that can reason is an important area of computer science (and research) and uses formal methods.
Edit: yeah, you can downvote this, but current AI research splits right along this line, whether it's symbolic or statistical. Some AI courses will use NNs, others will use Prolog and ASP. You can't just dismiss a whole field of research by reducing AI to statistical methods.
"Expert systems" were the hot research area in AI prior to machine learning (data driven methods, basically). Old methods and problems from that era like automated reasoning still have some research and applications going on, but aren't remotely as big an area as machine learning.
When I see "symbolic AI" I immediately think of Gary Marcus and immediately feel disdain towards the topic because of his behaviour on Twitter and other places.
I don't know the dude. I "only" know that my field of research is deductive reasoning in interactive applications and that this area falls under "Logic Programming" and LP is an area of AI.
I know that AI researchers are usually a bit dismissive about the other area. I don't like statistics either. Reducing the whole of AI research to statistical approaches (and NNs are one of those) is disingenuous and dismisses hundreds of researchers doing important work.
You may not want to have rule-based image recognition, but if your car decides to run over somebody, I feel we better have an explanation for this behaviour based on reasoning and logic.
I don’t think anyone is dismissing symbolic AI. As far as I can see, it’s just not beating current SOTA results of NNs? It’s not really about ideology, it’s about what currently has superior performance. Model interpretability is not always a requirement.
The author may have implemented ML when they optimized their polynomial constants:
> If I was developing a racing game using this as the AI, I’d not just pick constants that successfully complete the track, but the ones that do it quickly.
If they wrote code that automatically picked constants that successfully completed the track quickly (even something as simple as sorting the results by completion time), then that's reinforcement learning.
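A rough sketch of what that could look like, with a hypothetical `completion_time(constants)` standing in for running the track simulation (here it's a dummy quadratic so the snippet runs on its own):

```python
import random

def completion_time(constants):
    """Hypothetical stand-in for running the track simulation with the
    given polynomial constants and returning the lap time. The quadratic
    below is a dummy so this sketch is self-contained."""
    target = (1.0, -2.0, 0.5)
    return sum((c - t) ** 2 for c, t in zip(constants, target))

def pick_constants(trials=1000, seed=0):
    rng = random.Random(seed)
    candidates = [tuple(rng.uniform(-3, 3) for _ in range(3)) for _ in range(trials)]
    # "Even something as simple as sorting the results by completion time":
    return sorted(candidates, key=completion_time)[0]

print(pick_constants())
```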
I agree; machine learning can certainly be done with transparent models, and classic models can certainly be non-transparent. I tend to think of machine learning as any method which optimizes not only the model parameters but also the model structure in a single step. Though then again, the latter are just parameters of a more abstract model. So it's all always optimization in the end.
I am not sure when we changed the terms, but back in the day this would happily fall under machine learning. As he mentioned, if you want a good driver you would execute thousands of experiments to pick a good set of parameters.
As soon as we recognize plain old regression as machine learning, then we start to see "averages" as models of systems and how practically useful could that be?
I think you're being facetious, but on the off-chance you're not, and for the benefit of others: averages are incredibly practically useful for modeling systems. Parameter estimation (which generalizes averages and applies to other distribution features like variance) is a foundational modeling methodology. It's useful for both understanding and forecasting data. Measures of central tendency are nearly always good (if obviously imperfect) models of systems.
Here is a trivial example: one of the best ways of modeling timeseries data, both in and out of sample, is to naively take the moving average. This is a rolling mean parameter estimate on n lagged values from the current timestep. Not only is this an excellent way of understanding the data (by decomposing it into seasonality, trend and residuals), it's a competitive benchmark for future values. The first step in timeseries analysis shouldn't be to reach for a neural network or even ARIMA. It should be to naively forecast forward using the mean.
You might be surprised at how difficult it is to beat that benchmark with cross-validation and no overfitting or look-ahead bias.
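A quick sketch of that naive benchmark, assuming pandas and numpy are available, with synthetic data in place of a real series and an arbitrary window of 7:

```python
import numpy as np
import pandas as pd

# Naive benchmark: forecast each value with the rolling mean of the
# previous n observations, on a synthetic weekly-seasonal series.
rng = np.random.default_rng(0)
idx = pd.date_range("2021-01-01", periods=365, freq="D")
y = pd.Series(10 + 0.02 * np.arange(365)
              + 2 * np.sin(2 * np.pi * np.arange(365) / 7)
              + rng.normal(scale=0.5, size=365), index=idx)

n = 7
forecast = y.rolling(n).mean().shift(1)   # mean of the n lagged values only
mae = (y - forecast).abs().mean()
print(f"naive rolling-mean MAE: {mae:.3f}")
```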
Thank you for your fabulous response. I hope my provocative comment wasn't in bad humor, disrespectful or trolling. I too love averages and regressions. Thank you for proudly defending these marvelously simple and powerful tools.
Well, actually working with "averages" as baselines before you start experimenting with more complex ML models is a good habit.
Sure, they are dummy regressors [1], but they can be so useful for proving that whatever ML model you choose is at least better than a dummy baseline. If your model can't beat it, then you need to develop a better one.
They can even be used as a placeholder model so you can develop your whole architecture around it, while another teammate iterates on more complex experiments.
You could also settle for a moving average process as a first model for a time series [2], because it is easy to implement and simple to reason about.
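A minimal sketch of that baseline habit, assuming scikit-learn (whose DummyRegressor is exactly this kind of dummy model), using a bundled dataset for illustration:

```python
from sklearn.datasets import load_diabetes
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Whatever model you try should at least beat predicting the training mean.
X, y = load_diabetes(return_X_y=True)

estimators = [("dummy (mean)", DummyRegressor(strategy="mean")),
              ("ridge", Ridge(alpha=1.0))]

for name, est in estimators:
    score = cross_val_score(est, X, y, cv=5, scoring="r2").mean()
    print(f"{name:12s} R^2 = {score:.3f}")
```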