That's a great question. Optimizing sentence selection is important for teaching. For now, I have a simple check that filters out sentences longer than 160 characters.
I also believe this is something humans can do better, so I plan to add upvote and downvote buttons for rating the quality of sentences.
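For anyone following along, the length check described above could look roughly like this. It is only a minimal sketch: the function name `filter_short` and the constant name are made up for illustration, and the real check may differ.

```python
# Hypothetical names; only the 160-character limit comes from the comment above.
MAX_SENTENCE_CHARS = 160


def filter_short(sentences, max_chars=MAX_SENTENCE_CHARS):
    """Drop sentences longer than max_chars; keep the rest in order."""
    return [s for s in sentences if len(s) <= max_chars]
```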
I wonder if you might get a bit of a head start by combining the short-sentence filter with selection based on n-gram counts. For instance, if the keyword plus the words on either side of it match a common n-gram, you could expect that sentence to be reasonably representative and boost it in the initial rankings over a sentence whose n-gram has a much lower count.
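To make the suggestion concrete, here is a rough sketch of one way it could work. It assumes trigrams (the keyword plus one word on each side), a naive regex tokenizer, and counts built from whatever reference corpus is available. All function names are hypothetical, and a real system would probably use a larger n-gram dataset.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Naive lowercase word tokenizer (an assumption, not a real NLP tokenizer)."""
    return re.findall(r"[a-z']+", sentence.lower())


def trigram_counts(corpus):
    """Count every word trigram across the sentences of a reference corpus."""
    counts = Counter()
    for sentence in corpus:
        tokens = tokenize(sentence)
        counts.update(zip(tokens, tokens[1:], tokens[2:]))
    return counts


def keyword_score(sentence, keyword, counts):
    """Highest corpus count of a trigram centered on the keyword, or 0 if none."""
    tokens = tokenize(sentence)
    best = 0
    for i in range(1, len(tokens) - 1):
        if tokens[i] == keyword:
            best = max(best, counts[(tokens[i - 1], tokens[i], tokens[i + 1])])
    return best


def rank_sentences(sentences, keyword, counts, max_chars=160):
    """Apply the length filter, then sort by keyword trigram count, highest first."""
    keyword = keyword.lower()
    candidates = [s for s in sentences if len(s) <= max_chars]
    return sorted(
        candidates,
        key=lambda s: keyword_score(s, keyword, counts),
        reverse=True,
    )
```

Because the sort is stable, sentences with equal scores keep their original order, so this only sets the initial ranking and votes could reorder it later.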