
In C#, I ended up with:

   public static List<string> Tokenizer(string source)
   {
      // Initialise the token list.
      List<string> tokens = new List<string>();

      // Define a regex pattern whose groups match the MAL syntax.
      string pattern = @"[\s ,]*(~@|[\[\]{}()'`~@]|""(?:[\\].|[^\\""])*""|;.*|[^\s \[\]{}()'""`~@,;]*)";
      //                 ws/,   ~@ | specials     |   double quotes      |;  | non-specials

      // Break the input string into its constituent tokens.
      string[] result = Regex.Split(source, pattern);

This took a while to understand and get working, but it really improved my understanding of regex.
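
For anyone who wants to try the pattern, here is a minimal, self-contained sketch. It is not the code above: it swaps `Regex.Split` for `Regex.Matches` and reads capture group 1 of each match, which avoids having to pick tokens out of the separator strings that `Split` returns. The class name `MalReader`, the method name `Tokenize`, and the decision to drop comment tokens are my own assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class; the name is an assumption, not from the original snippet.
public static class MalReader
{
    // Same pattern as in the snippet above; group 1 captures a single token.
    private const string Pattern =
        @"[\s ,]*(~@|[\[\]{}()'`~@]|""(?:[\\].|[^\\""])*""|;.*|[^\s \[\]{}()'""`~@,;]*)";

    public static List<string> Tokenize(string source)
    {
        var tokens = new List<string>();

        // Walk the successive matches instead of splitting the input.
        foreach (Match match in Regex.Matches(source, Pattern))
        {
            string token = match.Groups[1].Value;

            // The pattern can match the empty string (for example at end of input),
            // so skip empty captures. Dropping ';' comments is an assumption here.
            if (token.Length == 0 || token[0] == ';')
            {
                continue;
            }

            tokens.Add(token);
        }

        return tokens;
    }

    public static void Main()
    {
        // prints: ( | + | 1 | ( | * | 2 | 3 | ) | )
        Console.WriteLine(string.Join(" | ", Tokenize("(+ 1 (* 2 3)) ; sum")));
    }
}
```

Note that this sketch does no error reporting, so an unterminated string literal is not flagged; a real reader would probably want to check for that separately.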


