This model was trained by asking people to rate internet comments on a scale from "Very toxic" to "Very healthy" contribution. Toxic is defined as "a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion."
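As I understand it, per-comment scores from this kind of rating task are usually derived by aggregating the raters' votes, e.g. the fraction of raters who chose a toxic label. Here's a minimal sketch of that aggregation; the five-point label scale and the function name are my assumptions, not the documented pipeline:

```python
def toxicity_score(ratings: list[str]) -> float:
    """Fraction of raters who judged the comment toxic.

    Assumes each rating is one point on a hypothetical five-point scale:
    "Very toxic", "Toxic", "Neither", "Healthy", "Very healthy".
    """
    toxic_votes = sum(1 for r in ratings if r in ("Toxic", "Very toxic"))
    return toxic_votes / len(ratings)

# Example: 10 raters, 3 of whom found the comment toxic -> score 0.3,
# which could then serve as a soft training target for the model.
ratings = ["Toxic", "Very toxic", "Toxic"] + ["Neither"] * 5 + ["Healthy", "Very healthy"]
print(toxicity_score(ratings))  # 0.3
```

One nice property of a fractional target like this is that it captures rater disagreement: a borderline (or sarcastic) comment that splits the raters lands near the middle instead of being forced to 0 or 1.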
I'm not sure it will ever be possible for algorithms or sentiment analysis to identify sarcasm reliably, as it is extremely situational and often misread or misunderstood even by humans.
Seems to me that sarcasm will keep human moderators employed for some time to come.