I only skimmed this, so perhaps there is more to it than meets the eye. But after reading I'm left with the question: why does this need to be "neural"?
Given that there is a history of using classical NLP methods for sequence alignment / noisy channel decoding, I would've expected a more extensive discussion of how NNs might be able to overcome limitations of simpler methods.
But it seems the opposite is true: here they're using classical approaches to overcome the limitations of their neural approach. The paper concludes by observing the "utmost importance of injecting prior linguistic knowledge" into the model. This "linguistic knowledge" is outlined in Section 3, and basically appears in the model as a regularization term based on a classical noisy channel / word alignment model. The regularization terms just encourage the neural network to behave like the classical models. And the "neural" approach only performs marginally better than the (Berg-Kirkpatrick & Klein 2011) paper they're comparing to, which takes a more classical combinatorial approach.
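To make that concrete, the "prior knowledge as regularization" pattern usually looks something like the sketch below. This is not the paper's actual model; the function, tensor shapes, and the reg_weight knob are made up for illustration, and the "prior" here stands in for whatever a classical alignment / noisy-channel model would produce.

    import torch
    import torch.nn.functional as F

    def decipherment_loss(pred_logits, target_ids, prior_dist, reg_weight=0.1):
        # pred_logits: (N, V) neural scores over known-language symbols
        # target_ids:  (N,)   noisy targets from the current matching
        # prior_dist:  (N, V) distribution from a classical alignment /
        #              noisy-channel model over the same symbols
        data_loss = F.cross_entropy(pred_logits, target_ids)
        # "Linguistic knowledge" as a penalty: KL between the neural
        # distribution and the classical prior, pushing the network
        # to behave like the classical model.
        prior_loss = F.kl_div(F.log_softmax(pred_logits, dim=-1),
                              prior_dist, reduction="batchmean")
        return data_loss + reg_weight * prior_loss

    # Toy call with random tensors, just to show the shapes (not real data).
    logits = torch.randn(32, 50, requires_grad=True)
    targets = torch.randint(0, 50, (32,))
    prior = torch.softmax(torch.randn(32, 50), dim=-1)
    decipherment_loss(logits, targets, prior).backward()

Written that way, it's easier to see how much of the work the classical prior is doing.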
This only gets 67% accuracy, which is obviously an improvement, but still nowhere near accurate enough to be useful. Historically, a good test was how many "gods" or whatnot were invented by a translation.
E.g. at 67% accuracy, a translation of Linear A would probably be meaningless.
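A quick back-of-the-envelope (assuming per-word errors are roughly independent, which is generous) shows how fast the 67% figure degrades over a whole passage:

    # Chance that an n-word line comes out with every word correct,
    # assuming ~independent errors at 67% per-word accuracy.
    acc = 0.67
    for n in (5, 10, 20):
        print(f"{n:2d} words fully correct: {acc ** n:.2%}")
    #  5 words fully correct: 13.50%
    # 10 words fully correct: 1.82%
    # 20 words fully correct: 0.03%

So even before asking which words are wrong, whole-line readings are mostly unrecoverable.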
Strong AI is a shifting goalpost, isn't it? Just whatever we can do now that machines can't. As soon as someone works out how to turn a human problem into a mathematical optimization problem, it turns into "just mechanical maths".
Generally “strong AI” means general intelligence, such as a human’s ability to figure out solutions to problems in general, as opposed to the single-purpose AI tools that we have today. So solving Linear A (by a method like this one) wouldn’t be “Strong AI”, merely profoundly powerful “weak AI.”
(“Strong AI” is also sometimes used to refer to conscious AI, but that’s a different issue again.)