Edit: the single command written out across multiple (continued) lines in the PowerShell "long" form - i.e. no aliases, and all parameter names spelled out explicitly.
Replying to self because I learned something new about PowerShell today :)
On the challenge GitHub page dfinke used irm (Invoke-RestMethod) instead of iwr (Invoke-WebRequest).
It turns out that irm actually looks at the content type of the response and implicitly converts from JSON to objects if the content type indicates JSON.
Thanks to dfinke.
The above PowerShell one-liner can thus be written shorter (omitting the explicit ConvertFrom-Json invocation):
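Something along these lines, for example (a sketch, not the exact command from the repo - the URL and the date-stamped output file name are placeholders):

# irm parses the JSON body itself and emits one PowerShell object per record
irm 'https://example.com/data.json' | ? { $_.creditcard } | select name, creditcard | epcsv -Path ("{0:yyyyMMdd}.csv" -f (Get-Date)) -NoTypeInformation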
Yeah. And unlike many of the Unixy solutions, it will actually produce correct results even if the input data contains quotes ("), apostrophes ('), commas (,), backslashes (\) and other characters that could throw parsing and text synthesis off.
* The iwr (Invoke-WebRequest) cmdlet reads the text file from the URI.
* The ConvertFrom-Json cmdlet reads the content as JSON and parses it according to JSON rules. The output is a sequence of objects (not JSON objects, but PowerShell objects) with properties corresponding to the JSON fields. This will respect specially encoded characters.
* The ? (Where-Object) cmdlet can filter on complex expressions, but in this case it just filters on the presence of a creditcard property.
* The select (Select-Object) cmdlet selects just the properties we want to export, out of all of the properties for each object.
* The epcsv (Export-Csv) cmdlet exports objects from the pipeline and outputs them to a file with proper escaping for e.g. quotes, apostrophes etc.
By not relying on home-cooked text munging you can actually produce a robust solution that is also readable.
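For reference, a sketch of what that pipeline looks like in the long form described above (the URL, the date-stamped output name and the intermediate variable are my own placeholders; the variable is only there so the array coming out of ConvertFrom-Json gets enumerated on older PowerShell versions):

# Download the JSON text and parse it into PowerShell objects
$people = Invoke-WebRequest -Uri 'https://example.com/data.json' |
    Select-Object -ExpandProperty Content |
    ConvertFrom-Json

# Keep only records with a credit card, pick the two columns, write proper CSV
$people |
    Where-Object -FilterScript { $_.creditcard } |
    Select-Object -Property name, creditcard |
    Export-Csv -Path ("{0:yyyyMMdd}.csv" -f (Get-Date)) -NoTypeInformation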
I recently found out about Microsoft Virtual Academy. There are some seriously good (free) online courses there, featuring Jeffrey Snover himself (author of the Monad Manifesto and inventor of PowerShell). For example this one: http://www.microsoftvirtualacademy.com/training-courses/gett...
The downside is that you do not control the pace yourself as much as with written resources.
There's a very friendly community at http://powershell.org where you can ask questions, get advice etc.
If this was a list of cat pictures, I would not consider posting to a third party cheating. But seeing as how it's credit card data, even though it is already "in the wild," it is still worth not posting to another random website.
Considering these are credit card numbers and you are trying to protect the account holders, is it really a smart idea to upload the names and numbers to some third party website?
Notepad++ search and replace using the regular expression
^\{"name":("[^"]+").+"creditcard":("[^"]+")\},?$
and replace it with
\1,\2
then sort the file using the TextFX plugin and delete all the lines without a credit card number at the bottom. Type the header line and done. Five to ten minutes, including some manual sanity checking. Of course, don't forget to look at the calendar and save with the correct file name.
Lowest effort solution: Post to stackexchange/github/reddit with the question "I've got a programming challenge. Can you find the shortest code to convert this JSON to CSV?" Then sit back and watch the answers roll in.
It's been years since I wrote perl. Any kind soul care to comment some of this code? It looks like the logic starts at the bottom, but I must admit defeat in truly understanding how it works!
Perl is water in the cupped hands of our minds. I have yet to meet someone who can keep it from slipping between their fingers without consistent effort.
* <> reads from file(s) specified as command-line arguments (data.json in this case)
* Option -0777 tells Perl to slurp the whole file
* Option -MJSON loads the JSON module
There are actually two dates, the 24th and the 25th, in the data sample.
You can do this in a manner that's both fairly comprehensible and succinct, for an arbitrary number of dates, using the JSON type provider in F#.
#r @"../Fsharp.Data.dll"
open FSharp.Data
open System
type PersonsData = JsonProvider<"../data.sample.json">
let dateTriple (d : PersonsData.Root) = d.Timestamp.Year, d.Timestamp.Month, d.Timestamp.Day
let info = PersonsData.Load ("./data.json")
let uniqueDates = info |> Array.map dateTriple |> set
let createCsv (d : PersonsData.Root seq) =
    d
    |> Seq.filter (fun p -> Option.isSome p.Creditcard)
    |> Seq.map (fun p -> sprintf "%s,%s" p.Name p.Creditcard.Value)
    |> String.concat "\n"

info
|> Seq.groupBy dateTriple
|> Seq.iter (fun ((y, m, d), data) ->
    IO.File.WriteAllText (sprintf "%d%02d%02d.csv" y m d, createCsv data))
With the below as a sample (though the dataset itself could have been used, since it's not so large):
I think it's usually better to do this with a reproducible script instead of manual editing in a text editor. There are too many times where the requirements change after the fact for "simple one-off" transformations.
I was going to say the same thing. I think of everything as black boxes, so choice of language is not as important as approach. I would use PHP because it's informal - it's a hybrid between shell and C, so expressiveness and speed come basically for free and there isn't much friction. Being able to iterate quickly often makes relationships apparent at a meta level. For example, a quick lookup of a parsing error might reveal that the data was generated by a standard tool, so I could drop what I'm doing and grab that instead. It's not so much about individual choices, but a way of attacking problems that, compounded over time, leads to a great deal of leverage.
They shouldn't have put each record on a single line; that makes it way too easy for text editors. Assume instead that this is all on a single line, or that there are arbitrary newlines inserted, and then use a proper parser/filter, e.g. jq in the shell as mentioned in the comments.
Replace },{ with },\r\n{ and done. Using more sophisticated tools than a text editor only makes sense if you know them well, the task is more complex or the file larger.
Sure, which is why I suggested making the task complex enough so that the "usual" *nix text tools aren't enough, or are significantly more complex to use than a real parser. I mean, it's basically a rehash of "how do I parse HTML with regexes".
I have to admit that I have only skimmed the answers, but I didn't notice much in the way of testing that the entire file conforms to the assumptions about how it is formatted that are implicitly built into each solution's design.
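For what it's worth, that kind of check is only a few lines in PowerShell - a sketch, where the input path is assumed and "name"/"creditcard" are the only field names taken from the thread:

# Flag any record that doesn't have the fields the one-liners silently assume.
$records = Get-Content -Path '.\data.json' -Raw | ConvertFrom-Json
$suspect = $records | Where-Object {
    $_.PSObject.Properties.Name -notcontains 'name' -or
    $_.PSObject.Properties.Name -notcontains 'creditcard'
}
if ($suspect) { Write-Warning ('{0} record(s) do not match the assumed shape' -f @($suspect).Count) }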
The first way I did it was to just read the file, parse the JSON, check for nulls, and print. Then I piped this to a file. That is too boring though.
The next way I did it was with Vim replaces. There might be sexier ways of doing this?
1) A big replace to get rid of all the junk.
%s/\%Vemail.*creditcard/creditcard/g
2) A big replace to fix nulls in the credit card
%s/\%V.*creditcard.*null.*//g
Then I could either change my code to not check for nulls or keep cleaning up this data until I have a CSV.
This Vim flow is the exact thing I do every week or two at work when building the Chrome HSTS preload list into our product. Anyone know the sexier ways to make this really really fun?
I should note, for those not super familiar with Vim who may try this: my Vim commands are applied over a visual selection (the JSON string), since this was in the same file as my code.
The one thing that was not fully specified: should the filename in YYYYMMDD.csv be the date that the program is run, or should a file of that spec be created for each date, grouping the lines in the .json file by their date field? If it is the latter, none of the GitHub solutions handle it correctly.
My PostgreSQL 9.3 solution using json_populate_recordset and @cjauvin's Python solution are the only two which group by date. It wasn't much harder, but it nudges it out of shell one-liner territory.
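For comparison, the grouping interpretation is not much extra work in PowerShell either - a sketch, assuming a local data.json and a "timestamp" field in a format that [datetime] can parse:

$records = Get-Content -Path '.\data.json' -Raw | ConvertFrom-Json
$records |
    Where-Object { $_.creditcard } |
    Group-Object -Property { ([datetime]$_.timestamp).ToString('yyyyMMdd') } |
    ForEach-Object {
        # one output file per date actually present in the data
        $file = '{0}.csv' -f $_.Name
        $_.Group | Select-Object -Property name, creditcard | Export-Csv -Path $file -NoTypeInformation
    }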
The R example posted on the GitHub thread is very nice. [1]
I use R every day, often parsing JSON and writing CSV files. The solution posted by mrdwab is much more succinct than my go-to method, especially the use of pipes. I've read about pipes in the dplyr/magrittr packages before, but this is the first time I've seen them used and thought they made complete sense.
I agree, the solution in R seems by far the most simple and intelligible. But then R is specifically designed to do this kind of thing, so that's not too surprising.
I am disappointed in this challenge on two levels: (1) A great many solutions fail to actually produce proper CSV: what would you do if any of the names or arbitrary credit card number inputs had quotes or commas in them? It's a big waste of time to roll your own, and you would likely have failed to prevent fraud without even realizing it. (2) Our JSON-dumping hackers didn't put quotes or commas in the strings to foil do-gooders.
Looks like you completely missed the point of the "challenge". This is reproducing a real-world situation to figure out what solutions different devs will think of first.
Though I guess "complain about the situation" is a valid answer to that question.
In 2015, we continue to develop incorrect CSV parsing and production when there are ready solutions in the wild for the spec (http://www.rfc-editor.org/rfc/rfc4180.txt - 2005), such as Ruby's csv package, Perl's Text::CSV, CL's CL-CSV, and so forth. It's a solved problem. I quite understand this challenge, but this issue is important to me because this same print "%s,%s\n" stuff shows up in the wild from professional developers who have the time on their hands to use the right tool, some of which I have personally worked with. Perhaps, like in this challenge, they are under pressure to get their feature finished, and this is the first thing they think of because, after all, it's just "comma-separated values".
This challenge is especially amusing in that, were it a real-world situation, it would involve people's financial welfare, and developers in a rush could very well have screwed it up. Isn't that cause for worry?
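To illustrate the point in PowerShell with a made-up record (nothing from the actual data set): a real CSV emitter quotes the field and doubles the embedded quote, which the printf-style approach silently gets wrong.

[pscustomobject]@{ name = 'O"Brien, Pat'; creditcard = '1234-5678-9012-3456' } |
    ConvertTo-Csv -NoTypeInformation
# "name","creditcard"
# "O""Brien, Pat","1234-5678-9012-3456"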
That did cross my mind - my solution used a full json parser (jq), and I did a quick grep for commas in name or creditcard fields before using a "%s,%s"-style solution to generate the CSV - if there had been any present, I would have fallen back to a slightly longer form of python -c 'import csv...'
Single line, no assumptions about data previously stored, no invalid CSV with quotes getting the entire line, and no useless spaces at the beginning of each line:
My first thought was to just manipulate it in Sublime Text, the file isn't too large to work on in memory. I also noticed the data seemed to be very consistently formatted.
Search "name":".+?".*?"creditcard":".+?" to grab every line with non-empty Name and CC. Alt+Enter to edit all lines at once. Remove all the other data and json syntax. This was pretty easy because "name" is always the first field on the line and "creditcard" always the final field.
When I see CSV I think RFC4180. It's not a real "standardised standard", but good enough to be one, and probably what a lot of tools will produce and consume if asked to process CSV.
In my experience, CSV is very fuzzy. Do you include headers? What character is used for the separator? Comma? Semicolon? Tab? Are fields enclosed in quotes? How are quotes within quotes escaped? What encoding do you use for non-ASCII symbols? &c &c. Without a sample or reference, these are all very common variables.
What character is used for separator? Comma? Semi colon? Tab?
It isn't Comma-Separated-Values if fields are not separated by commas, but the others are valid points. Failure to handle quoting correctly is the most common form of broken CSV support I've seen. Encoding is usually specified out-of-band.
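A tiny made-up example (PowerShell here, but any RFC 4180-ish CSV library behaves the same way) - a quoted field containing the delimiter, parsed with a non-default separator:

@(
    'name;creditcard'
    '"Smith; Jane";1111-2222-3333-4444'
) | ConvertFrom-Csv -Delimiter ';'
# yields one object: name = 'Smith; Jane', creditcard = '1111-2222-3333-4444'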