
I got round to writing my first Factor module tonight: a CSV parser.

Actually there’s already a CSV parser written in Factor by Daniel Ehrenberg, but it’s been removed from the latest Factor releases. I found a copy of that code here, but I don’t know how long it’ll stay there.

Unfortunately I had two problems with this existing module. The first was that it ran pretty slowly (1 MB of CSV took ~5 seconds to parse on my laptop), mainly because my copy of Factor wouldn’t compile the state-parser module that it depends on, so it ran unoptimized. The second was that I needed a parser that could parse a row at a time, for reading huge CSV files in chunks. I took that as an opportunity to write my own.

The code performs OK (~500 ms for 1 MB of CSV) and parses all the examples on the Wikipedia CSV page, but I can’t help feeling that I’ve written it in much the same style as I would have if I were using Scheme. If anybody has any hints on ways to make the code smaller, faster or more elegant, I’d be delighted to hear them.

USING: kernel sequences io namespaces combinators ;
IN: csvparser

DEFER: quoted-field

: not-quoted-field ( -- endchar )
  ",\"\n\s\t" read-until   #! "
  dup
  { { CHAR: \s  [ drop % not-quoted-field ] } ! skip whitespace
    { CHAR: \t  [ drop % not-quoted-field ] }
    { CHAR: ,   [ swap % ] }
    { CHAR: "   [ drop drop quoted-field ] }  ! "
    { CHAR: \n  [ swap % ] }
    { f         [ swap % ] }       ! eof
  } case ;

: maybe-escaped-quote ( -- endchar )
  read1
  dup
  { { CHAR: "   [ , quoted-field ] }     ! " is an escaped quote
    { CHAR: \s  [ drop not-quoted-field ] }
    { CHAR: \t  [ drop not-quoted-field ] }
    [ drop ]
  } case ;

: quoted-field ( -- endchar )
  "\"" read-until                                 ! "
  drop % maybe-escaped-quote ;

: field ( -- string sep )
  [ not-quoted-field ] "" make swap ;

: (row) ( -- sep )
  field swap ,
  dup CHAR: , = [ drop (row) ] when ;

: row ( -- array[string] eof? )
  [ (row) ] { } make swap ;

: (csv) ( -- )
  row swap , [ (csv) ] when ;

: csv-row ( stream -- row )
  [ row drop ] with-stream ;

: csv ( stream -- rows )
  [ [ (csv) ] { } make ] with-stream ;
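
As a rough usage sketch (not part of the module itself), the csv word can be tried on an in-memory string. This assumes the code above is loaded as the csvparser vocabulary, and that <string-reader> from io.streams.string and . from prettyprint are available in your Factor version. The sample data is made up.

```factor
USING: io.streams.string prettyprint csvparser ;

! Wrap a string in a stream, parse every row, and print the result.
! The sample input is made up for illustration; the second row has a
! quoted field containing a comma.
"name,qty\n\"Smith, J\",3" <string-reader> csv .
```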

If anybody’s interested, the module (including tests and docs) is here.
