Anyway, here's the code:
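parseCSV := method(str,
    // r collects one inner list per CSV line
    r := list
    str split("\n") foreach(line,
        // wrap the line in parentheses so it can be converted to a message
        m := line asMutable prependSeq("(") appendSeq(")") asMessage
        // l holds the values for this line
        l := list
        m arguments foreach(arg,
            // only number and string literals are accepted as values
            tokens := Compiler tokensForString(arg name)
            if(tokens at(0) type != "Number" and tokens at(0) type != "MonoQuote",
                Exception raise("Element not a number or a string.")
            )
            // evaluate the literal and store the resulting value
            l append(doMessage(arg))
        )
        r append(l)
    )
    r
)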
Now then, this parseCSV method is pretty straightforward: it takes a string as input and creates an empty list to hold the return values. Then it splits the string into lines and iterates over each one. Each line has to be wrapped in grouping characters before it is converted to a message, otherwise Io's parser will throw a fit, so we call prependSeq and appendSeq to add the parentheses before converting the string to a message with asMessage.
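To see what that wrapping step produces, here is a minimal sketch. The sample line is made up for illustration, and it uses only calls that already appear in parseCSV.

```io
// Sample CSV line (made up for this example).
line := "1, \"two\", 3"

// Same wrapping as in parseCSV: add parentheses, then convert to a message.
m := line asMutable prependSeq("(") appendSeq(")") asMessage

// Each comma-separated value is now one argument of the message;
// print the name of each argument message, one per line.
m arguments foreach(arg, arg name println)
```

Each argument is itself a message, which is why the loop in parseCSV can both inspect it (through its name) and evaluate it (through doMessage).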
Next we create a new inner list for the current line (each element representing one comma-separated value) and iterate over the arguments in our message. We need to know what kind of token each value is; I'm not sure that numbers and strings are the only valid kinds of CSV data, but for the sake of this example that's all I'm assuming. So we look at the first token of each argument (note that this doesn't check that it's the only token in the argument, but it easily could with a tiny bit of extra code). Then we evaluate the message, add the result to the inner list, and repeat for each argument. Once we've gone through every element in the CSV line, we append the inner list to the list pointed at by "r" and move on to the next line. When we run out of lines, we just return "r".
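The extra check mentioned above might look something like the sketch below. The isSingleValue helper is hypothetical (it is not part of the code in this post), and it assumes that a lone literal parses to a message with no arguments of its own and nothing chained after it, so that its next slot is nil.

```io
// Hypothetical helper: true when arg is a single message with
// no chained message after it and no arguments of its own.
isSingleValue := method(arg,
    arg next isNil and arg arguments size == 0
)

// Possible use inside the inner loop of parseCSV, before doMessage(arg):
// if(isSingleValue(arg) not,
//     Exception raise("Element contains more than one token.")
// )
```

Under that assumption, a value like 1 2 or foo(3) would be rejected before it ever gets evaluated, rather than slipping through because its first token happens to look valid.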
There's nothing super magical about this, but I thought it would be interesting to demonstrate how to write something like this in about a dozen lines.

1 comment:
Hi, this is the shortest parser I've seen yet.
But I think it only handles native line endings ('CRLF' on Windows/DOS, 'LF' on Linux, and 'CR' on Mac OS 9 and earlier). And I don't think it handles fields that contain newlines or commas (usually the entire field is wrapped in quotes, and quotes within the field are doubled).
In any case, I've written a tiny Lisp-based parser for CSV http://formlis.wordpress.com/2011/02/14/parsing-csv-files-with-lisp/ that might interest you. It's not quite as short as yours, but it has no library dependencies and does handle those cases. The idea behind it is also widely applicable, so you could port it to Io.