Double quotes should be escaped in data.gov catalog csv

From Data-gov Wiki

Infobox (Issue Report)
  • name: Double quotes should be escaped in data.gov catalog csv

  • description: found wrong "Data.gov Data Category Type" in dataset 92
  • creator(s): Li Ding
  • created: 2009-08-18
  • modified: 2010-5-16


Symptom

Field values were placed in the wrong columns in http://www.data.gov/data_gov_catalog.csv, for example in Dataset 92 (Data.gov Catalog, Executive Office of the President). This has been observed in over 20 rows. To see the problem, download the CSV file and open it in Microsoft Excel.

  • Check the "Data.gov Data Category Type" column, e.g. for Dataset 92.
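
The misplaced columns can also be spotted without Excel. The following is a minimal sketch, not part of the original report: it flags rows whose field count differs from the header's. The function name rows_with_wrong_field_count and the three-column sample are hypothetical; the real catalog has different columns. An unescaped double quote does not always change the field count (for instance when the quoted text contains no comma), so this check can miss some bad rows.

```python
import csv
import io


def rows_with_wrong_field_count(csv_text):
    """Return (line_number, field_count) for rows that do not match the header width."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    bad_rows = []
    for row in reader:
        if len(row) != len(header):
            # reader.line_num is the number of source lines read so far
            bad_rows.append((reader.line_num, len(row)))
    return bad_rows


# Hypothetical sample: the last row contains unescaped double quotes around
# a phrase with a comma, so the parser splits the description in two.
sample = (
    'id,description,category\n'
    '1,"a plain description",Geography\n'
    '2,"a "quoted, phrase" inside",Geography\n'
)

for line_number, field_count in rows_with_wrong_field_count(sample):
    print(f"line {line_number}: {field_count} fields")
```

To run it against the real file, read the downloaded CSV into a string (or pass an open file to csv.reader directly) instead of using the sample.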

Diagnosis

  • Because text fields are wrapped in double quotes, any double quote inside a field value must be escaped by writing it as two double quotes. The following row is escaped correctly:
11982,"note: ""this is, a test"" for double quote","<a href=""http://foo.com"">http://foo.com</a>",123

This CSV row carries the following four field values:

11982
note: "this is, a test" for double quote
<a href="http://foo.com">http://foo.com</a>
123
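
The decoding above can be checked with any standards-following CSV parser. As a small sketch, assuming Python's standard csv module (the report itself does not name a tool), the correctly escaped row parses back into exactly these four values:

```python
import csv
import io

# The correctly escaped example row from this report.
escaped_row = (
    '11982,"note: ""this is, a test"" for double quote",'
    '"<a href=""http://foo.com"">http://foo.com</a>",123\n'
)

# csv.reader treats "" inside a quoted field as one literal double quote.
parsed_fields = next(csv.reader(io.StringIO(escaped_row)))

for value in parsed_fields:
    print(value)
```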

Typically, the unescaped double quotes were found in fields such as "Description" and "Citation", which contain either text descriptions or HTTP links.

How to Fix
  • Before writing to CSV, escape each text field by replacing every " (one double quote) with "" (two double quotes).
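
The fix can be sketched in two ways. This is an illustration only: how the data.gov catalog is actually exported is not known here, and escape_field is a hypothetical helper name. The first variant applies the replacement rule by hand; the second lets Python's csv.writer do the quoting, which applies the same doubling rule automatically.

```python
import csv
import io


def escape_field(text):
    """Wrap a text field in double quotes, doubling any double quote inside it."""
    return '"' + text.replace('"', '""') + '"'


note = 'note: "this is, a test" for double quote'
link = '<a href="http://foo.com">http://foo.com</a>'

# Variant 1: escape by hand and join the fields with commas.
manual_row = ",".join(["11982", escape_field(note), escape_field(link), "123"])

# Variant 2: let csv.writer quote and escape fields where needed.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_MINIMAL).writerow([11982, note, link, 123])
writer_row = buffer.getvalue().rstrip("\r\n")

print(manual_row)
print(writer_row)
```

Using a CSV library rather than string concatenation is usually the safer choice, since it also handles commas and line breaks inside field values.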